What a Broken POC Taught Me About Data Platform Encapsulation

Lessons from a proof of concept on encapsulation and risk management

The Situation

After returning to a client environment, I attempted to run dbt and it failed immediately. The database schema had changed: some columns had been renamed, others removed. Initially, this looked like a delivery failure.

The failure was not due to logic, tooling, or deployment. It was caused by upstream schema changes that invalidated the experiment’s assumptions.

A hidden assumption in this POC was that the source data would remain roughly stable while we experimented. In real client environments, that rarely holds.


Why the Decision Was Reasonable

The POC environment was mutable, assumptions about upstream stability were reasonable at the time, and rapid experimentation was the priority. There was no negligence, only a focus on learning quickly.


What Changed

The mutable environment evolved:

  • Columns renamed or removed
  • Dependencies shifted without notice
  • Original assumptions no longer held

Even short-term experiments are vulnerable to hidden dependencies.
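The drift described above can be caught at the boundary rather than mid-run. A minimal sketch, assuming a recorded snapshot of the columns the experiment depends on (the table and column names here are illustrative, not from the actual POC):

```python
# Hypothetical drift check: compare the live table's columns against a
# snapshot of the columns the POC assumed would remain stable.

EXPECTED_COLUMNS = {"order_id", "customer_id", "order_date", "amount"}

def check_schema(live_columns: set[str], expected: set[str] = EXPECTED_COLUMNS) -> list[str]:
    """Return human-readable drift messages; an empty list means no drift."""
    problems = []
    for col in sorted(expected - live_columns):
        problems.append(f"missing column: {col}")
    for col in sorted(live_columns - expected):
        problems.append(f"unexpected column: {col}")
    return problems

# Simulate the failure mode above: one column renamed, one removed.
live = {"order_id", "client_id", "order_date"}
for problem in check_schema(live):
    print(problem)
```

Run before each dbt invocation, a check like this turns a cryptic mid-run failure into an explicit statement of which assumption broke.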


The Cost That Arrived Late

The cost appeared as:

  • Failed runs
  • Loss of learning velocity
  • Frustration and rework

How I Think About This Now

The failure was not a tooling issue, but a delivery decision about how much instability the experiment could tolerate.

I apply encapsulation as risk management, even in POCs:

  • Protect assumptions explicitly
  • Clearly separate environments, data, and dependencies
  • Ensure learning continues uninterrupted despite external changes
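One way to sketch "protect assumptions explicitly" is an explicit contract at the boundary of each upstream dependency, so a violated assumption fails loudly at the edge rather than deep inside a transformation. This is a hypothetical illustration; the names and structure are mine, not the client's:

```python
# Hypothetical boundary contract: declare the columns a source must provide,
# validate on entry, and pass through only what the contract declares.

from dataclasses import dataclass

@dataclass(frozen=True)
class SourceContract:
    name: str
    required_columns: frozenset

    def validate(self, row: dict) -> dict:
        missing = self.required_columns - row.keys()
        if missing:
            raise ValueError(
                f"{self.name}: contract violated, missing {sorted(missing)}"
            )
        # Forward only declared columns, so downstream code never silently
        # grows a dependency the contract does not protect.
        return {col: row[col] for col in self.required_columns}

orders = SourceContract("raw.orders", frozenset({"order_id", "amount"}))
clean = orders.validate({"order_id": 1, "amount": 9.99, "extra": "ignored"})
```

The point is less the mechanism than the discipline: every assumption the experiment depends on is written down in one place, where a change upstream produces an immediate, named failure.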

A Closing Reflection

Good design in data platforms begins with managing assumptions and dependencies explicitly. Even temporary work benefits from clear boundaries to protect learning velocity.