Under the Hood
A series of deep dives into how data platforms really work, from hidden configurations to architecture choices that keep everything running.
- Stop Scattering Metadata. Start Shipping Contracts
Why I advocated for data contracts in the style of the Open Data Contract Standard (ODCS): fewer scattered configs, clearer ownership, and a cleaner path to data quality.
- CDC Isn’t Streaming. It’s Correctness
I used to think CDC was a tool choice. Turns out it’s a design choice, with consequences. Lessons learned from building ingestion pipelines at scale.
- Auto Loader: The End of File Listing Hell
How Databricks Auto Loader replaced manual file tracking and changed ingestion pipelines for good.
- Databricks Under the Hood: The Platform Magic Behind Your Workspace
Every notebook you run depends on invisible rules and settings that protect your data and budget. Let’s lift the curtain on the Databricks admin world.