
Substrate Processor Architecture

The processor service is a Node.js process responsible for ingesting data, transforming it and persisting it into the target database. By convention, the processor entry point is at src/main.ts. After compilation, it is run as:
node lib/main.js
For local runs, environment variables are typically loaded from .env with dotenv:
node -r dotenv/config lib/main.js
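For reference, a minimal .env file might look like the following. The DB_* names follow the usual TypeormDatabase conventions; RPC_ENDPOINT is a hypothetical name that your own processor configuration would read via process.env:

```shell
DB_HOST=localhost
DB_PORT=5432
DB_NAME=squid
DB_USER=postgres
DB_PASS=postgres
RPC_ENDPOINT=wss://kusama-rpc.polkadot.io
```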

SubstrateBatchProcessor

The Squid SDK provides the SubstrateBatchProcessor class for indexing Substrate-based blockchains. By convention, the processor object is defined at src/processor.ts. The SubstrateBatchProcessor is specifically designed for:
  • Polkadot, Kusama, and other Substrate-based networks
  • Processing events, calls, and extrinsics
  • Runtime version-aware data decoding
  • Efficient batch processing of blockchain data
See the general settings reference for detailed configuration options.

Configuration

A processor instance should be configured to define the block range to be indexed and the selectors of data to be fetched from SQD Network and/or a node RPC endpoint. Key configuration methods include setGateway(), setRpcEndpoint(), setBlockRange(), setFields() and the data request methods such as addEvent() and addCall().
On Substrate, calling setRpcEndpoint() is a hard requirement: chain RPC is used to retrieve chain metadata for proper data decoding.
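As an illustration, a minimal configuration sketch might look like this. The gateway URL and RPC endpoint are example values for Kusama; substitute the ones for your network:

```typescript
import {SubstrateBatchProcessor} from '@subsquid/substrate-processor'

export const processor = new SubstrateBatchProcessor()
  // SQD Network gateway for the chain (example value for Kusama)
  .setGateway('https://v2.archive.subsquid.io/network/kusama')
  // chain RPC is required on Substrate for metadata-based decoding
  .setRpcEndpoint('wss://kusama-rpc.polkadot.io')
  // index from genesis; an open-ended range follows the chain head
  .setBlockRange({from: 0})
  // request Balances.Transfer events together with their parent extrinsics
  .addEvent({name: ['Balances.Transfer'], extrinsic: true})
  // select the fields to be fetched for each data item
  .setFields({event: {args: true}, block: {timestamp: true}})
```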

processor.run()

The actual data processing is done by the run() method called on a processor instance (typically at src/main.ts). The method has the following signature:
run<Store>(
  db: Database<Store>,
  batchHandler: (
    context: DataHandlerContext<Store, F>
  ) => Promise<void>
): void
The db parameter defines the target data sink, and batchHandler is an async function that defines the data transformation and persistence logic. The handler is invoked repeatedly with batches of SQD Network data exposed via context.blocks; it transforms them and persists the results to the target database using the context.store interface.
To see the processor in action, check out the Substrate indexing tutorial.

Batch context

The batch handler takes a single argument of the DataHandlerContext type:
export interface DataHandlerContext<Store, F extends FieldSelection> {
  _chain: Chain;
  log: Logger;
  store: Store;
  blocks: BlockData<F>[];
  isHead: boolean;
}
Here, F is the type of the argument of the setFields() processor configuration method, and the Store type is inferred from the Database instance passed to run().

ctx._chain

An internal handle for direct access to the underlying chain state via RPC calls. It is rarely used directly; instead, it is used by the facade access classes generated by the typegen tools.

ctx.log

The native logger handle. See Logger reference.

ctx.store

Interface for the target data sink. See Persisting data.

ctx.blocks

On-chain data items are grouped into blocks, with each block containing a header and iterables for all supported data item types. Boundary blocks are always included into the ctx.blocks iterable with valid headers, even when they do not contain any requested data. It follows that batch context always contains at least one block. For Substrate, each block provides the following iterables:
  • block.events - Runtime events emitted during block execution
  • block.calls - Runtime calls (dispatchables) executed in the block
  • block.extrinsics - Signed or unsigned extrinsics included in the block
See the context interfaces reference for detailed type information. Depending on the data item type, items within the iterables may be canonically ordered, following how the data is recorded on-chain. The shape of the item objects is determined by the processor configuration done via the .setFields() method. An idiomatic use of the context API is to iterate first over the blocks and then over each iterable of each block:
processor.run(new TypeormDatabase(), async (ctx) => {
  for (let block of ctx.blocks) {
    for (let event of block.events) {
      // filter and process events
    }
    for (let call of block.calls) {
      // filter and process calls
    }
    for (let extrinsic of block.extrinsics) {
      // filter and process extrinsics
    }
  }
});
The canonical ordering of ctx.blocks enables efficient in-memory data processing. For example, multiple updates of the same entity can be compressed into a single database transaction.
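To sketch what such compression can look like, here is a simplified, self-contained helper. The TransferLike shape is hypothetical; real items come from block.events after decoding. It folds all transfers in a batch into one balance delta per account, so each affected account entity needs to be written out only once:

```typescript
// Hypothetical, simplified transfer shape used only for this sketch.
interface TransferLike {
  from: string
  to: string
  amount: bigint
}

// Fold a whole batch of transfers into a single balance delta per account.
// The caller can then persist each affected account exactly once.
function aggregateBalanceDeltas(transfers: TransferLike[]): Map<string, bigint> {
  let deltas = new Map<string, bigint>()
  for (let t of transfers) {
    deltas.set(t.from, (deltas.get(t.from) ?? 0n) - t.amount)
    deltas.set(t.to, (deltas.get(t.to) ?? 0n) + t.amount)
  }
  return deltas
}
```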
The processor does not guarantee that data outside its filters is excluded from the iterables; it only guarantees that all data matching the filters is included. The batch handler must therefore filter the data again before processing it.
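A minimal, self-contained sketch of such in-handler filtering. The EventItem shape is simplified for illustration; event names follow the usual Pallet.EventName convention:

```typescript
// Simplified event shape; real event items carry the fields requested
// via setFields().
interface EventItem {
  name: string
}

// Keep only the events this handler is actually interested in: even though
// the processor was configured to fetch them, unrequested items may still
// appear in the iterables.
function selectByName<T extends EventItem>(events: T[], name: string): T[] {
  return events.filter((e) => e.name === name)
}
```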

ctx.isHead

Set to true once the processor has reached the chain head. The last block in ctx.blocks is then the current chain tip.

Substrate-specific features

Runtime versions

Substrate chains can upgrade their runtime, changing the structure of events, calls, and storage. The SubstrateBatchProcessor is aware of runtime versions and uses the typegen tools to generate runtime version-aware decoders.

Types bundles

For chains with custom types, you may need to provide a types bundle to help the processor decode the data correctly.

Storage queries

Substrate processors can query historical chain storage using state queries. This is useful for fetching additional context not available in events or calls.

Next steps