December 25, 2025

That Time I Tried to Make my own Database From Scratch

Suvan Gowri Shanker
It all started when I was trying to build collaborative, local-first applications. I wanted my apps to respond instantly, work offline, and resolve conflicts automatically when devices reconnected. The core problem was building a system that supports optimistic local writes (no locks, no central leader) while guaranteeing that replicas converge deterministically under partitions, message loss, and device churn. This convergence property is formally known as Strong Eventual Consistency (SEC): replicas that have received the same set of updates end up in the same state, regardless of the order in which the updates arrived.
Beyond the basic question of "does it converge?", I realized a practical CRDT database had to handle real-world challenges:
I spent a lot of time reading papers and evaluating different CRDT approaches. Here's a quick taxonomy of what I explored:
| Approach | Core Idea | Trade-off |
| --- | --- | --- |
| CvRDTs (state-based) | Join-semilattice merge | Full-state shipping is expensive |
| CmRDTs (operation-based) | Commutative concurrent operations | Requires stronger dissemination guarantees |
| δ-CRDTs (delta-state) | Delta-mutators return minimal state | Bandwidth-efficient, preserves causal consistency |
| Merkle-CRDTs | Merkle-DAG clocks | Decoupled from membership size, requires DAG management |
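
To make the "join-semilattice merge" row concrete, here is a minimal sketch of a state-based grow-only counter. It is an illustration only, not code from MDCS; the names `merge` and `value` and the replica IDs are assumptions. The point is that the merge is commutative, associative, and idempotent, which is what lets replicas converge no matter how states are exchanged.

```python
# Illustrative G-Counter (state-based CRDT). Not taken from MDCS.
# State: replica ID -> number of increments observed from that replica.

def merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Join of two states: the per-replica maximum."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def value(state: dict[str, int]) -> int:
    """The counter's value is the sum over all replicas."""
    return sum(state.values())

x = {"A": 2, "B": 1}
y = {"B": 3, "C": 1}
z = {"A": 1, "C": 4}

# The three semilattice laws that make merge order irrelevant.
assert merge(x, y) == merge(y, x)                      # commutative
assert merge(merge(x, y), z) == merge(x, merge(y, z))  # associative
assert merge(x, x) == x                                # idempotent

converged = merge(merge(x, y), z)
```

The cost noted in the table shows up here too: every exchange ships the whole dictionary, even if only one entry changed.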
After evaluating the pros and cons, I designed the Merkle-Delta CRDT Store (MDCS). I structured it into four tightly integrated layers:

1. δ-CRDT Core

Efficient incremental dissemination using a document-oriented composition of maps, sets, and registers.
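
A rough sketch of the delta-mutator idea, under heavy simplification: the document is modeled as a map of grow-only sets, whereas a real store would need removable sets and registers with causal metadata. The names `Doc`, `join`, and `add_delta` are hypothetical and not part of MDCS.

```python
# Simplified delta-state document: field name -> grow-only set.
# Assumption: grow-only sets stand in for richer set/register types.
Doc = dict[str, frozenset[str]]

def join(a: Doc, b: Doc) -> Doc:
    """Merge two documents (or a document and a delta) field by field."""
    empty: frozenset[str] = frozenset()
    return {k: a.get(k, empty) | b.get(k, empty) for k in a.keys() | b.keys()}

def add_delta(doc: Doc, field: str, elem: str) -> Doc:
    """Delta-mutator: return only the part of the state this add changes."""
    if elem in doc.get(field, frozenset()):
        return {}  # nothing new, so there is nothing to ship
    return {field: frozenset({elem})}

replica_a: Doc = {"tags": frozenset({"crdt"}), "authors": frozenset({"alice"})}
replica_b: Doc = dict(replica_a)

delta = add_delta(replica_a, "tags", "local-first")
replica_a = join(replica_a, delta)   # apply locally
replica_b = join(replica_b, delta)   # ship only the delta to the peer
replica_b = join(replica_b, delta)   # a duplicate delivery changes nothing
```

Only the one-element delta crosses the network, yet it is merged with the same join used for full states.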

2. Merkle-Clock Sync

Open membership and recovery via content-addressed DAG synchronization.
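
One way to picture content-addressed DAG synchronization is the sketch below. It is not the MDCS implementation: the helpers `put` and `missing`, the SHA-256 over JSON encoding, and the in-memory dictionaries standing in for a network fetch are all assumptions. It also assumes that holding a node implies holding all of its ancestors.

```python
import hashlib
import json

def put(store: dict[str, dict], payload: str, parents: list[str]) -> str:
    """Store a node under the hash of its content and return that hash."""
    node = {"payload": payload, "parents": sorted(parents)}
    cid = hashlib.sha256(json.dumps(node, sort_keys=True).encode()).hexdigest()
    store[cid] = node
    return cid

def missing(local: dict[str, dict], remote: dict[str, dict],
            remote_heads: list[str]) -> list[str]:
    """Walk back from the remote heads and collect the hashes we lack.

    The walk stops at any node we already hold, on the assumption that
    we then hold its ancestors as well.
    """
    todo, found, seen = list(remote_heads), [], set()
    while todo:
        cid = todo.pop()
        if cid in local or cid in seen:
            continue
        seen.add(cid)
        found.append(cid)
        todo.extend(remote[cid]["parents"])  # stands in for a network fetch
    return found

a: dict[str, dict] = {}
b: dict[str, dict] = {}
root = put(a, "delta-0", [])
root_b = put(b, "delta-0", [])   # same content, so same hash
n1 = put(a, "delta-1", [root])
n2 = put(a, "delta-2", [n1])

gap = missing(b, a, [n2])        # b learns a's head and finds what it lacks
for cid in gap:
    b[cid] = a[cid]
```

Because a head hash summarizes its whole ancestry, a replica needs no membership list to find out what it is missing, which fits the open-membership goal.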

3. Stability-Guided Compaction

Bounded metadata growth through principled pruning when intervals are acknowledged.
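
A minimal sketch of acknowledgement-driven pruning, assuming each replica keeps a buffer of deltas keyed by sequence number and tracks, per known peer, the next sequence number that peer still needs. The names `stable_seq` and `compact` are hypothetical, and MDCS may define stability differently.

```python
def stable_seq(acks: dict[str, int]) -> int:
    """Lowest acknowledged position across peers.

    acks[peer] = n means the peer has acknowledged every delta with seq < n.
    With no known peers nothing is considered stable.
    """
    return min(acks.values(), default=0)

def compact(buffer: dict[int, str], acks: dict[str, int]) -> dict[int, str]:
    """Drop deltas that every known peer has acknowledged."""
    stable = stable_seq(acks)
    return {seq: delta for seq, delta in buffer.items() if seq >= stable}

buffer = {0: "d0", 1: "d1", 2: "d2", 3: "d3"}
acks = {"B": 3, "C": 2}          # B needs seq 3 next, C still needs seq 2

pruned = compact(buffer, acks)   # d0 and d1 are safe to drop
```

The slowest peer sets the pace here, so a peer that joins later would have to catch up some other way, for example through the Merkle-Clock layer.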

4. Reactivity Guardrails

Exposing buffered operations to avoid waiting on unrelated updates.
One of the most interesting parts of the build was the synchronization architecture. It runs on two complementary layers:
  1. Layer A (Delta-interval anti-entropy): Uses delta-state anti-entropy with acknowledgements for causal merging and garbage collection.
  2. Layer B (Merkle-Clock summaries): Uses Merkle-Clock roots as compact frontier identifiers for discovery and gap repair.
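
Layer A can be sketched roughly as follows, with a grow-only set as the replicated state. This is a simplification under stated assumptions, not MDCS code: the `Replica` class and its methods are hypothetical, the receiver does not re-buffer deltas for forwarding, and the causal check a real delta-interval protocol performs before merging is left out. Layer B is not modeled.

```python
from dataclasses import dataclass, field

@dataclass
class Replica:
    state: set = field(default_factory=set)
    deltas: list = field(default_factory=list)  # deltas[i] has sequence number i
    acked: dict = field(default_factory=dict)   # peer -> next seq that peer needs

    def add(self, elem: str) -> None:
        """Local write: apply it at once and buffer the resulting delta."""
        delta = {elem} - self.state
        if delta:
            self.state |= delta
            self.deltas.append(delta)

    def interval_for(self, peer: str) -> tuple[set, int]:
        """Join every delta the peer has not acknowledged into one interval."""
        start = self.acked.get(peer, 0)
        return set().union(*self.deltas[start:]), len(self.deltas)

    def receive(self, interval: set, upto: int) -> int:
        """Merge an interval (an idempotent join) and return the ack."""
        self.state |= interval
        return upto

    def on_ack(self, peer: str, upto: int) -> None:
        """Record the ack; older or duplicate acks never move it backwards."""
        self.acked[peer] = max(self.acked.get(peer, 0), upto)

a, b = Replica(), Replica()
a.add("x")
a.add("y")
interval, upto = a.interval_for("b")
ack = b.receive(interval, upto)
b.receive(interval, upto)        # duplicated message: harmless
a.on_ack("b", ack)

a.add("z")
interval2, upto2 = a.interval_for("b")   # only the unacknowledged part
```

The acknowledgements do double duty: they shrink what gets sent next time and they tell the sender which buffered deltas are candidates for garbage collection.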
Building Carnelia was a massive learning experience. On the plus side, my approach avoids strict dissemination assumptions: deltas are merged with an idempotent join, so duplicated messages are handled gracefully. The open membership model works well for dynamic replica sets, and the metadata compaction is principled.
However, the trade-offs are real. Multiple layers mean careful integration and deep testing. Without snapshots, the Merkle history keeps growing, and building explicit structures over a primarily non-serializable base demands careful invariant design. In the end, while it isn't perfect, it's a solid foundation for the resilient, local-first apps I set out to build!