Blockchain · Dec 2025

Indexing the chain: a read model auditors trust

Diana Munteanu · Backend engineering · 6 min read

An auditor asks a simple question: show me every transfer this account received in March, and prove none are missing. If your answer is to loop over blocks by calling a node, you have already lost. Nodes are built to reach consensus, not to answer product questions. The moment a feature or an audit depends on the chain, you need a read model between the two: something queryable, stable, and reconcilable. Building that read model well is most of the real work in a serious on-chain product.

Why you never query a node for product features

A node answers questions about blocks and state, not about your business. Ask it for one account's twelve-month history and you get slow, rate-limited pagination that changes shape under load and disappears when the node falls behind or gets replaced. There is no join, no filter you actually want, and no stable ordering across a reorg. Point a dashboard or a compliance export straight at a node and you have coupled your product's uptime to an infrastructure component that was never designed to serve reads. Every mature system puts an indexer in between.

The indexer is a projection, not a cache

An indexer subscribes to the chain, decodes the events you care about, and writes them into a database shaped like your product. Think of it as an append-driven projection rather than a cache. You do not store what a screen happens to need today; you store the decoded event stream and derive read models from it. The single rule that keeps this sane is idempotency. Every event has a natural key (block hash, transaction hash, log index), and every projection must apply that event exactly once no matter how many times it is delivered. Get idempotency right and you can replay the entire chain into a fresh database and land on byte-identical state, which is the property every later guarantee rests on.
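A minimal sketch of that idempotency rule, assuming a SQLite table and a decoded transfer event whose field names (`recipient`, `amount`) are hypothetical. The schema and the helpers `open_read_model`, `apply_event`, and `snapshot` are illustrative, not taken from any real indexer. The natural key becomes the primary key, so a redelivered event is a no-op.

```python
import sqlite3

# Hypothetical schema: one row per decoded transfer, keyed by its natural key.
SCHEMA = """
CREATE TABLE IF NOT EXISTS transfers (
    block_hash TEXT    NOT NULL,
    tx_hash    TEXT    NOT NULL,
    log_index  INTEGER NOT NULL,
    recipient  TEXT    NOT NULL,
    amount     INTEGER NOT NULL,
    PRIMARY KEY (block_hash, tx_hash, log_index)
)
"""


def open_read_model():
    """Create a fresh in-memory read model (a real system would use a durable store)."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    return db


def apply_event(db, event):
    """Apply one decoded event. Returns True if it was new, False on redelivery."""
    cur = db.execute(
        "INSERT OR IGNORE INTO transfers "
        "(block_hash, tx_hash, log_index, recipient, amount) "
        "VALUES (:block_hash, :tx_hash, :log_index, :recipient, :amount)",
        event,
    )
    return cur.rowcount == 1


def snapshot(db):
    """Return the projected rows in a deterministic order, for comparing replays."""
    return db.execute(
        "SELECT block_hash, tx_hash, log_index, recipient, amount "
        "FROM transfers ORDER BY block_hash, tx_hash, log_index"
    ).fetchall()


# Usage: delivering the same event twice leaves exactly one row.
event = {"block_hash": "0xb1", "tx_hash": "0xt1", "log_index": 0,
         "recipient": "alice", "amount": 10}
db = open_read_model()
apply_event(db, event)
apply_event(db, event)  # redelivery, ignored by the primary key
```

Comparing `snapshot` output from two independent replays of the same stream is one simple way to check the replay property described above.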

Reorgs are not an edge case

On any probabilistic chain the tip is a rumor until it is buried. Blocks you already indexed can be orphaned, and a naive indexer that only ever appends will happily serve transactions that no longer exist. So reorgs are a first-class case, not an exception handler. We tag every projected row with the block hash that produced it, track finality depth, and when a reorg arrives we roll back the affected blocks and reapply the new canonical chain. Because the projections are idempotent, rollback and replay are ordinary operations rather than a panic. What the product sees is a read model that always matches the canonical chain, with unconfirmed data clearly marked as such.
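The rollback-and-reapply step can be sketched in memory as follows. This is an illustration under assumptions, not the production design: the `Block` and `Indexer` types, the `on_branch` method, and the default finality depth of 12 are all hypothetical. A real indexer would persist rows and would likely refuse reorgs deeper than its finality depth.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Block:
    number: int
    hash: str
    parent_hash: str
    transfers: tuple = ()  # already-decoded (recipient, amount) pairs


@dataclass
class Indexer:
    finality_depth: int = 12                    # assumed value, chain-specific
    chain: list = field(default_factory=list)   # indexed blocks, oldest first
    rows: list = field(default_factory=list)    # projected rows, tagged by block

    def on_branch(self, branch):
        """Adopt canonical blocks (oldest first) that attach to an indexed block."""
        fork_parent = branch[0].parent_hash
        if self.chain and all(b.hash != fork_parent for b in self.chain):
            raise ValueError("branch does not attach to any indexed block")
        # Roll back: drop orphaned blocks and every row they produced.
        while self.chain and self.chain[-1].hash != fork_parent:
            orphan = self.chain.pop()
            self.rows = [r for r in self.rows if r["block_hash"] != orphan.hash]
        # Replay: apply the new canonical blocks in order.
        for block in branch:
            self.chain.append(block)
            for recipient, amount in block.transfers:
                self.rows.append({
                    "block_hash": block.hash,
                    "block_number": block.number,
                    "recipient": recipient,
                    "amount": amount,
                })

    def view(self):
        """Rows as the product sees them, with unconfirmed data marked."""
        tip = self.chain[-1].number if self.chain else -1
        return [
            {**r, "confirmed": tip - r["block_number"] >= self.finality_depth}
            for r in self.rows
        ]


# Usage: block 2 is orphaned and replaced by a different block 2.
idx = Indexer(finality_depth=2)
idx.on_branch([Block(1, "a1", "g", (("alice", 5),)),
               Block(2, "a2", "a1", (("bob", 7),))])
idx.on_branch([Block(2, "b2", "a1", (("carol", 9),)),
               Block(3, "b3", "b2")])
```

After the second call the row for `bob` is gone, because the block that produced it is no longer canonical, and `view()` marks only rows buried at least `finality_depth` blocks below the tip as confirmed.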

Idempotency and reorg handling buy you correctness in theory. Reconciliation proves it in practice. A background job re-derives balances and counts straight from the chain and compares them against the read model, and any drift raises an alarm before a user or an auditor ever sees it. On top of that verified model we build plain-language audit views: not raw logs, but readable statements such as "this account received this amount in this transaction at this time," each one traceable back to a block and a log index. For one regional bank tokenization rail we ran exactly this stack and moved from pilot to production in four weeks with full audit-trail coverage, because the read model could always be proven against the chain.
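A reconciliation pass and a plain-language audit line might look roughly like this. It is a sketch under assumptions: the event fields (`recipient`, `amount`, `tx_hash`, `block_hash`, `log_index`, `timestamp`) and the helpers `derive_balances`, `reconcile`, and `audit_statement` are hypothetical, and the events are assumed to have been fetched from the canonical chain already.

```python
from collections import defaultdict


def derive_balances(chain_events):
    """Re-derive received totals per account directly from canonical chain events."""
    totals = defaultdict(int)
    for e in chain_events:
        totals[e["recipient"]] += e["amount"]
    return dict(totals)


def reconcile(chain_events, read_model_balances):
    """Compare chain-derived totals with the read model.

    Returns a list of drift records; an empty list means the two agree.
    """
    expected = derive_balances(chain_events)
    drift = []
    for account in sorted(expected.keys() | read_model_balances.keys()):
        want = expected.get(account, 0)
        have = read_model_balances.get(account, 0)
        if want != have:
            drift.append({"account": account, "chain": want, "read_model": have})
    return drift


def audit_statement(e):
    """Render one event as a readable line that stays traceable to block and log index."""
    return (
        f"Account {e['recipient']} received {e['amount']} "
        f"in transaction {e['tx_hash']} at {e['timestamp']} "
        f"(block {e['block_hash']}, log index {e['log_index']})."
    )


# Usage: the read model is missing 5 units for alice, so drift is reported.
events = [
    {"recipient": "alice", "amount": 10, "tx_hash": "0xt1",
     "block_hash": "0xb1", "log_index": 0, "timestamp": "2025-03-04T10:00:00Z"},
    {"recipient": "alice", "amount": 5, "tx_hash": "0xt2",
     "block_hash": "0xb2", "log_index": 3, "timestamp": "2025-03-09T16:30:00Z"},
]
drift = reconcile(events, {"alice": 10})
if drift:
    print("ALERT: read model drift", drift)
```

In a running system the alert would feed whatever paging or monitoring channel the team already uses; `print` stands in for that here.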

If you cannot reconcile your read model against the chain on demand, you do not have a read model, you have a guess.
— Protocore · Backend engineering

Raw events are not a product and a node is not a database. The value is in the projection that sits between them, one that is idempotent, reorg-aware, and continuously reconciled, turning an adversarial event stream into something a product can query and an auditor can trust. Build that layer deliberately and the hard questions ("show me everything," "prove nothing is missing") become one query instead of a fire drill.
