BigQuery support
Squids can store their data to BigQuery datasets using the `@subsquid/bigquery-store` package. Define and use the `Database` object as follows:
src/main.ts
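The following is a minimal sketch assuming a simple transfers table; the dataset path, table name, column names and types are illustrative and should be replaced with your own.
```typescript
import {BigQuery} from '@google-cloud/bigquery'
import {Column, Table, Types, Database} from '@subsquid/bigquery-store'

const db = new Database({
  // BigQuery client; with no arguments it reads GOOGLE_APPLICATION_CREDENTIALS
  bq: new BigQuery(),
  // path to the target dataset, including the project ID
  dataset: 'my-project.my_dataset',
  // tables to be created and populated within the dataset
  tables: {
    Transfers: new Table('transfers', {
      from: Column(Types.String()),
      to: Column(Types.String()),
      value: Column(Types.BigNumeric(38))
    })
  }
})
```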
- `bq` is a `BigQuery` instance. When created without arguments like this, it looks at the `GOOGLE_APPLICATION_CREDENTIALS` environment variable for a path to a JSON file with authentication details.
- `dataset` is the path to the target dataset.
- `tables` lists the tables that will be created and populated within the dataset. For every field of the `tables` object an eponymous field of the `ctx.store` object will be created; calling `insert()` or `insertMany()` on such a field queues data for writing to the corresponding dataset table (see the usage sketch below). The actual writing happens at the end of the batch in a single transaction, ensuring dataset integrity.
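For instance, with the `Transfers` table from the sketch above, the batch handler could queue rows roughly like this (the event decoding is omitted and the field values are illustrative):
```typescript
// `processor` is your existing batch processor instance defined elsewhere in the squid
processor.run(db, async ctx => {
  for (let block of ctx.blocks) {
    for (let log of block.logs) {
      // decode the log, then queue a row for the transfers table;
      // the actual write happens once, at the end of the batch
      ctx.store.Transfers.insert({
        from: '0x0000000000000000000000000000000000000000',
        to: '0x0000000000000000000000000000000000000000',
        value: '0'
      })
    }
  }
})
```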
Deploying to SQD Cloud
We discourage uploading any sensitive data with squid code when deploying to SQD Cloud. To pass your credentials JSON to your squid, create a Cloud secret variable populated with its contents (e.g. via the Cloud application or the `sqd secrets set` command). Then, in `src/main.ts`, write the contents to a file:
src/main.ts
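A minimal sketch, assuming the secret is called `GOOGLE_CREDENTIALS_JSON` (the name is arbitrary as long as it matches the manifest below):
```typescript
import fs from 'fs'

// write the credentials JSON supplied via the Cloud secret to a local file
// that GOOGLE_APPLICATION_CREDENTIALS will point to
fs.writeFileSync(
  'google_credentials.json',
  process.env.GOOGLE_CREDENTIALS_JSON || ''
)
```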
Finally, set the `GOOGLE_APPLICATION_CREDENTIALS` variable to the path of that file and request the secret in the deployment manifest:
squid.yaml
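A sketch of the relevant manifest section, assuming the secret name used above and the `${{ secrets.* }}` interpolation syntax of recent manifest versions:
```yaml
deploy:
  env:
    # path of the file written out in src/main.ts
    GOOGLE_APPLICATION_CREDENTIALS: google_credentials.json
    # expose the Cloud secret to the squid
    GOOGLE_CREDENTIALS_JSON: ${{ secrets.GOOGLE_CREDENTIALS_JSON }}
```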
Examples
An end-to-end example geared towards local runs can be found in this repo. Look at this branch for an example of a squid made for deployment to SQD Cloud.
Troubleshooting
Transaction is aborted due to concurrent update
This means that your project has an open session that is updating some of the tables used by the squid. Most commonly, the session is left by the squid itself after an unclean termination. You have two options:
- If you are not sure whether your squid is the only app that uses sessions to access your BigQuery project, find the faulty session manually and terminate it. See Get a list of your active sessions and Terminate a session by ID.
- DANGEROUS: If you are absolutely certain that the squid is the only app that uses sessions to access your BigQuery project, you can terminate all the dangling sessions by running a query like the one sketched below.
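A sketch following BigQuery's documented pattern for terminating all of a project's active sessions with `INFORMATION_SCHEMA.SESSIONS_BY_PROJECT` and the `BQ.ABORT_SESSION` procedure:
```sql
-- terminates ALL active sessions in the project
FOR session IN (
  SELECT session_id
  FROM `region-us`.INFORMATION_SCHEMA.SESSIONS_BY_PROJECT
  WHERE is_active
)
DO
  CALL BQ.ABORT_SESSION(session.session_id);
END FOR;
```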
Replace `region-us` with your dataset's region in the code above. You can also enable `abortAllProjectSessionsOnStartup` and supply `datasetRegion` in your database config to perform this operation at startup; a sketch follows the warning below.
Warning: this method will cause data loss if, at the moment when the squid starts, some other app happens to be writing data anywhere in the project using the sessions mechanism.
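A sketch of the database config with these options enabled; the option names come from this section, while the rest of the config mirrors the setup example above:
```typescript
const db = new Database({
  bq: new BigQuery(),
  dataset: 'my-project.my_dataset',
  tables: { /* ... */ },
  // terminate all of the project's dangling sessions at startup
  abortAllProjectSessionsOnStartup: true,
  // region of the dataset, needed to locate the sessions to abort
  datasetRegion: 'region-us'
})
```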
Error 413 (Request Entity Too Large)
The squid produced too much data per batch for one of the tables and BigQuery refused to handle the request. Begin by finding out which table causes the issue (e.g. by counting `insert()` calls), then enable pagination for that table:
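The exact per-table pagination setting should be taken from the `@subsquid/bigquery-store` reference; the option name below (`pageSize`) is hypothetical and only illustrates where such a setting would go in the table definition:
```typescript
Transfers: new Table(
  'transfers',
  {
    from: Column(Types.String()),
    to: Column(Types.String()),
    value: Column(Types.BigNumeric(38))
  },
  // hypothetical option name: split writes to this table into pages
  // of at most 5000 rows; consult the package docs for the real field
  {pageSize: 5000}
)
```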

