
Substrate Processor Architecture

The processor service is a Node.js process responsible for ingesting data, transforming it and persisting it into the target database. By convention, the processor entry point is at src/main.ts. After compilation, it is run as:
node lib/main.js
For local runs, environment variables are typically loaded from .env with dotenv:
node -r dotenv/config lib/main.js
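For reference, a minimal .env file might look like the following. The DB_* names follow the usual TypeormDatabase conventions; RPC_ENDPOINT is a hypothetical name that your own processor configuration would read via process.env:

```shell
DB_HOST=localhost
DB_PORT=5432
DB_NAME=squid
DB_USER=postgres
DB_PASS=postgres
RPC_ENDPOINT=wss://kusama-rpc.polkadot.io
```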

SubstrateBatchProcessor

The Squid SDK provides the SubstrateBatchProcessor class for indexing Substrate-based blockchains. By convention, the processor object is defined at src/processor.ts. The SubstrateBatchProcessor is specifically designed for:
  • Polkadot, Kusama, and other Substrate-based networks
  • Processing events, calls, and extrinsics
  • Runtime version-aware data decoding
  • Efficient batch processing of blockchain data
See the general settings reference for detailed configuration options.

Configuration

A processor instance should be configured to define the block range to be indexed and the selectors of data to be fetched from SQD Network and/or a node RPC endpoint. Key configuration methods include setGateway(), setRpcEndpoint(), setBlockRange(), setFields() and the data request methods such as addEvent() and addCall().
On Substrate, calling setRpcEndpoint() is a hard requirement: chain RPC is used to retrieve chain metadata for proper data decoding.
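As an illustration, a minimal configuration sketch might look like this. The gateway URL and RPC endpoint are example values for Kusama; substitute the ones for your network:

```typescript
import {SubstrateBatchProcessor} from '@subsquid/substrate-processor'

export const processor = new SubstrateBatchProcessor()
  // SQD Network gateway for the chain (example value for Kusama)
  .setGateway('https://v2.archive.subsquid.io/network/kusama')
  // chain RPC is required on Substrate for metadata-based decoding
  .setRpcEndpoint('wss://kusama-rpc.polkadot.io')
  // index from genesis; an open-ended range follows the chain head
  .setBlockRange({from: 0})
  // request Balances.Transfer events together with their parent extrinsics
  .addEvent({name: ['Balances.Transfer'], extrinsic: true})
  // select the fields to be fetched for each data item
  .setFields({event: {args: true}, block: {timestamp: true}})
```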

processor.run()

The actual data processing is done by the run() method called on a processor instance (typically at src/main.ts). The method has the following signature:
run<Store>(
  db: Database<Store>,
  batchHandler: (
    context: DataHandlerContext<Store, F>
  ) => Promise<void>
): void
The db parameter defines the target data sink, and batchHandler is an async function that defines the data transformation and persistence logic. The handler is invoked repeatedly with batches of SQD Network data exposed via context.blocks; it transforms them and persists the results to the target database using the context.store interface.
To see the processor in action, check out the Substrate indexing tutorial.

Batch context

The batch handler takes a single argument of the DataHandlerContext type:
export interface DataHandlerContext<Store, F extends FieldSelection> {
  _chain: Chain;
  log: Logger;
  store: Store;
  blocks: BlockData<F>[];
  isHead: boolean;
}
Here, F is the type of the argument of the setFields() processor configuration method, and the Store type is inferred from the Database instance passed to run().

ctx._chain

An internal handle for direct access to the underlying chain state via RPC calls. It is rarely used directly; instead, it is used by the facade access classes generated by the typegen tools.

ctx.log

The native logger handle. See Logger reference.

ctx.store

Interface for the target data sink. See Persisting data.

ctx.blocks

On-chain data items are grouped into blocks, with each block containing a header and iterables for all supported data item types. Boundary blocks are always included into the ctx.blocks iterable with valid headers, even when they do not contain any requested data. It follows that batch context always contains at least one block. For Substrate, each block provides the following iterables:
  • block.events - Runtime events emitted during block execution
  • block.calls - Runtime calls (dispatchables) executed in the block
  • block.extrinsics - Signed or unsigned extrinsics included in the block
See the context interfaces reference for detailed type information. Depending on the data item type, items within the iterables may be canonically ordered, following how the data is recorded on-chain. The shape of the item objects is determined by the processor configuration done via the .setFields() method. An idiomatic use of the context API is to iterate first over the blocks and then over each iterable of each block:
processor.run(new TypeormDatabase(), async (ctx) => {
  for (let block of ctx.blocks) {
    for (let event of block.events) {
      // filter and process events
    }
    for (let call of block.calls) {
      // filter and process calls
    }
    for (let extrinsic of block.extrinsics) {
      // filter and process extrinsics
    }
  }
});
The canonical ordering of ctx.blocks enables efficient in-memory data processing. For example, multiple updates of the same entity can be compressed into a single database transaction.
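To sketch what such compression can look like, here is a simplified, self-contained helper. The TransferLike shape is hypothetical; real items come from block.events after decoding. It folds all transfers in a batch into one balance delta per account, so each affected account entity needs to be written out only once:

```typescript
// Hypothetical, simplified transfer shape used only for this sketch.
interface TransferLike {
  from: string
  to: string
  amount: bigint
}

// Fold a whole batch of transfers into a single balance delta per account.
// The caller can then persist each affected account exactly once.
function aggregateBalanceDeltas(transfers: TransferLike[]): Map<string, bigint> {
  let deltas = new Map<string, bigint>()
  for (let t of transfers) {
    deltas.set(t.from, (deltas.get(t.from) ?? 0n) - t.amount)
    deltas.set(t.to, (deltas.get(t.to) ?? 0n) + t.amount)
  }
  return deltas
}
```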
The processor does not guarantee that data outside its filters is excluded from the iterables; it only guarantees that all data matching the filters is included. The batch handler must therefore filter the data again before processing it.
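A minimal, self-contained sketch of such in-handler filtering. The EventItem shape is simplified for illustration; event names follow the usual Pallet.EventName convention:

```typescript
// Simplified event shape; real event items carry the fields requested
// via setFields().
interface EventItem {
  name: string
}

// Keep only the events this handler is actually interested in: even though
// the processor was configured to fetch them, unrequested items may still
// appear in the iterables.
function selectByName<T extends EventItem>(events: T[], name: string): T[] {
  return events.filter((e) => e.name === name)
}
```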

ctx.isHead

Set to true once the processor has reached the chain head. The last block in ctx.blocks is then the current chain tip.

Substrate-specific features

Runtime versions

Substrate chains can upgrade their runtime, changing the structure of events, calls, and storage. The SubstrateBatchProcessor is aware of runtime versions and uses the typegen tools to generate runtime version-aware decoders.

Types bundles

For chains with custom types, you may need to provide a types bundle to help the processor decode the data correctly.

Storage queries

Substrate processors can query historical chain storage using state queries. This is useful for fetching additional context not available in events or calls.

Next steps