This page is a definitive end-to-end guide to practical squid development. It uses templates to simplify the process. Check out Squid from scratch for a more educational, bare-bones approach.
Feel free to also use the template-specific sqd scripts defined in commands.json to simplify your workflow. See the sqd CLI cheatsheet for a short intro.

Prepare the environment

  • Node v20.x or newer
  • Git
  • Squid CLI
  • Docker (if your squid will store its data in PostgreSQL)
See also the Environment set up page.

Understand your technical requirements

Consider your business requirements and find out
  1. How the data should be delivered (e.g. as a GraphQL API over PostgreSQL, as local or S3-hosted files, or as a BigQuery dataset)
  2. What data should be delivered
  3. What technologies power the blockchain(s) in question. Note that you can use SQD via RPC ingestion even if your network is not listed among the supported ones.
  4. What exact data should be retrieved from blockchain(s)
  5. Whether you need to mix in any off-chain data


Start from a template {#templates}

Although it is possible to compose a squid from individual packages, in practice it is usually easier to start from a template.
  • A minimal template intended for developing EVM squids. Indexes ETH burns.
sqd init my-squid-name -t evm
  • A starter squid for indexing ERC20 transfers.
sqd init my-squid-name -t https://github.com/subsquid-labs/squid-erc20-template
  • A Gravatar indexer template.
sqd init my-squid-name -t gravatar
  • A template for indexing data from multiple chains.
sqd init my-squid-name -t multichain
  • USDC transfers -> local CSV
sqd init my-squid-name -t https://github.com/subsquid-labs/file-store-csv-example
  • USDC transfers -> local Parquet
sqd init my-squid-name -t https://github.com/subsquid-labs/file-store-parquet-example
  • USDC transfers -> CSV on S3
sqd init my-squid-name -t https://github.com/subsquid-labs/file-store-s3-example
  • USDC transfers -> BigQuery dataset
sqd init my-squid-name -t https://github.com/subsquid-labs/squid-bigquery-example
After retrieving the template of choice, install its dependencies:
cd my-squid-name
npm i
Test the template locally. The procedure varies depending on the data sink; for the PostgreSQL-powered templates it goes as follows:
1. Launch a PostgreSQL container

docker compose up -d
2. Build the squid

npm run build
3. Apply the DB migrations

npx squid-typeorm-migration apply
4. Start the squid processor

node -r dotenv/config lib/main.js
You should see output containing lines like these:
04:11:24 INFO  sqd:processor processing blocks from 6000000
04:11:24 INFO  sqd:processor using archive data source
04:11:24 INFO  sqd:processor prometheus metrics are served at port 45829
04:11:27 INFO  sqd:processor 6051219 / 18079056, rate: 16781 blocks/sec, mapping: 770 blocks/sec, 544 items/sec, eta: 12m
5. Start the GraphQL server

Run the following command in a separate terminal:
npx squid-graphql-server
Then visit the GraphiQL console at http://localhost:4350/graphql to verify that the GraphQL API is up.
When done, shut down and erase your database with docker compose down.

The bottom-up development cycle {#bottom-up-development}

The advantage of this approach is that the code remains buildable at all times, making it easier to catch issues early.

I. Regenerate the task-specific utilities {#typegen}

Retrieve JSON ABIs for all contracts of interest (e.g. from Etherscan), taking care to get ABIs for implementation contracts and not proxies where appropriate. Assuming that you saved the ABI files to ./abi, you can then regenerate the utilities with
npx squid-evm-typegen ./src/abi ./abi/*.json --multicall
Or if you would like the tool to retrieve the ABI from Etherscan in your stead, you can run e.g.
npx squid-evm-typegen \
  src/abi \
  0xdAC17F958D2ee523a2206206994597C13D831ec7#usdt
The utility classes will become available at src/abi. See also EVM typegen code generation.
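For orientation, here is a sketch of what the generated module for an ERC20 ABI exposes, assuming the ABI was saved as abi/erc20.json so that the module lands at src/abi/erc20.ts:

import * as erc20abi from "./abi/erc20";

// topic0 hash of the Transfer event - handy for filtering logs
const transferTopic = erc20abi.events.Transfer.topic;
// four-byte selector of the transfer() function - handy for filtering transactions
const transferSighash = erc20abi.functions.transfer.sighash;
// erc20abi.Contract wraps typed state calls (used in the optional step IV below)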

II. Configure the data requests {#processor-config}

Data requests are customarily defined at src/processor.ts. Edit the definition of const processor to:
  1. Use a data source appropriate for your chain and task.
    • It is possible to use RPC as the only data source, but adding a SQD Network data source will make your squid sync much faster.
    • RPC is a hard requirement if you’re building a real-time API.
    • If you’re using RPC as one of your data sources, make sure to set the number of finality confirmations so that hot blocks ingestion works properly.
    • On networks with low block times and high data rates (e.g. Arbitrum), use WSS endpoints if latency is critical.
  2. Request all event logs, transactions, execution traces and state diffs that your task requires, with any necessary related data (e.g. parent transactions for event logs).
  3. Select all data fields necessary for your task (e.g. gasUsed for transactions).
See reference documentation for more info and processor configuration showcase for a representative set of examples.
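For orientation, here is a minimal configuration sketch for an Ethereum mainnet squid tracking the ERC20 Transfer events of a single contract. The gateway and RPC URLs and the contract address (USDC) are examples - substitute values appropriate for your own network and contracts:

import { EvmBatchProcessor } from "@subsquid/evm-processor";
import * as erc20abi from "./abi/erc20";

// example contract address (USDC on Ethereum); addresses must be lowercase
const CONTRACT_ADDRESS = "0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48";

export const processor = new EvmBatchProcessor()
  // SQD Network gateway: speeds up syncing of historical blocks
  .setGateway("https://v2.archive.subsquid.io/network/ethereum-mainnet")
  // RPC endpoint: required for real-time ingestion of unfinalized blocks
  .setRpcEndpoint("https://rpc.ankr.com/eth")
  // number of confirmations after which blocks are considered final
  .setFinalityConfirmation(75)
  // request Transfer logs of the contract plus their parent transactions
  .addLog({
    address: [CONTRACT_ADDRESS],
    topic0: [erc20abi.events.Transfer.topic],
    transaction: true,
  })
  // select the extra data fields needed downstream
  .setFields({
    transaction: { gasUsed: true },
  });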

III. Decode and normalize the data {#batch-handler-decoding}

Next, change the batch handler to decode and normalize your data. In templates, the batch handler is defined at the processor.run() call in src/main.ts as an inline function. Its sole argument ctx contains:
  • at ctx.blocks: all the requested data for a batch of blocks
  • at ctx.store: the means to save the processed data
  • at ctx.log: a Logger
  • at ctx.isHead: a boolean indicating whether the batch is at the current chain head
  • at ctx._chain: the means to access RPC for state calls
This structure (reference) is common for all processors. Each item in ctx.blocks contains the data for the requested logs, transactions, traces and state diffs for a particular block, plus some info on the block itself. See EVM batch context reference. Use the .decode methods from the contract ABI utilities to decode events and transactions, e.g.
import * as erc20abi from "./abi/erc20";

processor.run(db, async (ctx) => {
  for (let block of ctx.blocks) {
    for (let log of block.logs) {
      if (log.topics[0] === erc20abi.events.Transfer.topic) {
        let { from, to, value } = erc20abi.events.Transfer.decode(log);
      }
    }
  }
});
See also the EVM data decoding page.

(Optional) IV. Mix in external data and chain state calls output {#external-data}

If you need external (i.e. non-blockchain) data in your transformation, take a look at the External APIs and IPFS page. If any of the on-chain data you need is unavailable from the processor or inconvenient to retrieve with it, you can get it via direct chain queries.
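As an illustration, here is a sketch of a direct state call made with the Contract class that the EVM typegen generates; the erc20 ABI module and the USDC address are assumptions carried over from the earlier examples, and querying the chain on every block is done here purely for brevity:

import { TypeormDatabase } from "@subsquid/typeorm-store";
import * as erc20abi from "./abi/erc20";
import { processor } from "./processor";

const CONTRACT_ADDRESS = "0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48"; // USDC, example only

processor.run(new TypeormDatabase(), async (ctx) => {
  for (let block of ctx.blocks) {
    // bind the typed wrapper to this block height; the call goes through the processor's RPC endpoint
    let contract = new erc20abi.Contract(ctx, block.header, CONTRACT_ADDRESS);
    let totalSupply = await contract.totalSupply();
    ctx.log.info(`USDC total supply at block ${block.header.height}: ${totalSupply}`);
  }
});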

V. Prepare the store {#store}

At src/main.ts, change the Database object definition to accept your output data. The methods for saving data will be exposed by ctx.store within the batch handler.
1. Define the database schema

Define the schema of the database (and the core schema of the OpenReader GraphQL API if it is used) at schema.graphql.
2. Regenerate the TypeORM model classes

npx squid-typeorm-codegen
The classes will become available at src/model.
3. Compile the models code

npm run build
4. Ensure access to a blank database

The easiest way to do so is to start PostgreSQL in a Docker container with:
docker compose up -d
If the container is already running, stop it and erase the old database with:
docker compose down
before issuing a docker compose up -d.
The alternative is to connect to an external database. See this section to learn how to specify the connection parameters.
5. Regenerate a migration

rm -r db/migrations
npx squid-typeorm-migration generate
You can now use the async functions ctx.store.upsert() and ctx.store.insert(), as well as various TypeORM lookup methods, to access the database. See the typeorm-store guide and reference for more info.
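For the PostgreSQL sink, the Database object at src/main.ts is typically a TypeormDatabase instance. A minimal sketch (the supportHotBlocks flag is only needed when unfinalized blocks are ingested):

import { TypeormDatabase } from "@subsquid/typeorm-store";
import { processor } from "./processor";

const db = new TypeormDatabase({ supportHotBlocks: true });

processor.run(db, async (ctx) => {
  // ctx.store now exposes insert(), upsert() and TypeORM-style lookup methods
});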

VI. Persist the transformed data to your data sink {#batch-handler-persistence}

Once your data is decoded, optionally enriched with external data and transformed the way you need it to be, it is time to save it.
For each batch, create all the instances of all TypeORM model classes at once, then save them with the minimal number of calls to upsert() or insert(), e.g.:
import { TypeormDatabase } from "@subsquid/typeorm-store";
import { EntityA, EntityB } from "./model";

processor.run(new TypeormDatabase(), async (ctx) => {
  const aEntities: Map<string, EntityA> = new Map(); // id -> entity instance
  const bEntities: EntityB[] = [];

  for (let block of ctx.blocks) {
    // fill the containers aEntities and bEntities
  }

  await ctx.store.upsert([...aEntities.values()]);
  await ctx.store.insert(bEntities);
});
It will often make sense to keep the entity instances in maps rather than arrays, so that they are easier to reuse when defining instances of other entities related to them. The process is described in more detail in step 2 of the BAYC tutorial. If you perform any database lookups, try to do so in batches and make sure that the entity fields you’re searching over are indexed. See also the patterns and anti-patterns sections of the Batch processing guide.
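A sketch of such a batched lookup, assuming purely for illustration that EntityA instances are keyed by contract address: collect the ids touched by the batch first, then fetch all the matching rows with a single findBy() call using an In filter.

import { In } from "typeorm";
import { TypeormDatabase } from "@subsquid/typeorm-store";
import { EntityA } from "./model";
import { processor } from "./processor";

processor.run(new TypeormDatabase(), async (ctx) => {
  // collect the ids touched by this batch of blocks...
  let ids = new Set<string>();
  for (let block of ctx.blocks) {
    for (let log of block.logs) {
      ids.add(log.address);
    }
  }
  // ...then retrieve all the matching entities with a single database query
  let entities: Map<string, EntityA> = new Map(
    (await ctx.store.findBy(EntityA, { id: In([...ids]) })).map((e) => [e.id, e])
  );
  // entities can now be looked up in memory while processing the batch
});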

The top-down development cycle

The bottom-up development cycle described above is convenient for initial squid development and for trying out new things, but it has a drawback: the means of saving the data are not yet available when you first write the decoding/transformation code. That makes it necessary to come back to that code later, which is somewhat inconvenient, e.g. when adding new squid features incrementally. The alternative is to perform the same steps in a different order:
  1. Update the store
  2. If necessary, regenerate the utility classes
  3. Update the processor configuration
  4. Decode and normalize the added data
  5. Retrieve any external data if necessary
  6. Add the persistence code for the transformed data

GraphQL options

Store your data in PostgreSQL, then consult Serving GraphQL for options.

Scaling up

If you’re developing a large squid, make sure to use batch processing throughout your code. A common mistake is to write handlers for individual event logs or transactions; for updates that require data retrieval, this leads to lots of small database lookups and ultimately to poor syncing performance. Instead, collect all the relevant data for a batch and process it at once. A simple architecture of that type is discussed in the BAYC tutorial.
You should also check the Cloud best practices page even if you’re not planning to deploy to SQD Cloud - it contains valuable performance-related tips.
Many issues that commonly arise when developing larger squids are addressed by the third-party @belopash/typeorm-store package. Consider using it.
For complete examples of complex squids, take a look at the Giant Squid Explorer and Thena Squid repos.

Next steps