Download the Dataset
The previous Parquet export format (v1) is now deprecated; see the Legacy Format (v1) section below. Please follow the instructions for the new v2 format.
The entire Verifier Alliance dataset is exported continuously as Parquet files, a modern columnar data format. Parquet files are compressed, efficient to query, and widely supported by data tools. (Quick tutorial).
The export is hosted on Google Cloud Storage and accessible via an S3-compatible API at export.verifieralliance.org.
Export v2 Format
The export format was redesigned to be more efficient and easier to use. The v2 format follows these principles:
- New data is uploaded daily.
- Each database table is stored as a set of Parquet files.
- Files are partitioned by row ranges and ordered by created_at timestamps.
- Append-only pattern: New data is added to new files; existing files are not modified. Only the most recent file for each table may be updated while it is not full yet. This design is possible because in the underlying database verified contracts are only inserted and rows are never updated.
- File metadata (checksums, sizes, timestamps) is provided directly by the Google Cloud Storage API.
- Files use zstd compression built into the Parquet format.
The dataset is available at export.verifieralliance.org/?prefix=v2/. All files of the v2 format are stored under the v2/ prefix.
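The example URLs on this page suggest that each file lives under v2/<table>/ and is named <table>_<start>_<end>.parquet, where the two numbers describe the row range. The Python sketch below parses such a key into its parts, which can help when ordering files or finding the most recent one per table. The naming pattern is inferred from the examples here rather than from a documented guarantee, and parse_export_key and ExportFile are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed key pattern, inferred from example keys such as
# "v2/verified_contracts/verified_contracts_0_1000000.parquet".
KEY_PATTERN = re.compile(
    r"^v2/(?P<table>[a-z_]+)/(?P=table)_(?P<start>\d+)_(?P<end>\d+)\.parquet$"
)


class ExportFile(NamedTuple):
    table: str
    start: int  # first number in the file name
    end: int    # second number in the file name


def parse_export_key(key: str) -> Optional[ExportFile]:
    """Split an object key into table name and row range, or return None."""
    match = KEY_PATTERN.match(key)
    if match is None:
        return None
    return ExportFile(match["table"], int(match["start"]), int(match["end"]))
```

Sorting the parsed entries of one table by start would then give the files in row order; the entry with the largest start would be the one that may still change under the append-only pattern.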
Downloading and Syncing the Dataset
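To download the entire dataset, you can run this command. It relies on GNU grep (the -P option) and reads a single listing response, so it may miss files if the listing is truncated (see API Parameters below):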
curl -s 'https://export.verifieralliance.org/?prefix=v2/' | \
grep -oP '(?<=<Key>)[^<]+' | \
xargs -I {} curl -L -O https://export.verifieralliance.org/{}
Alternatively, the AWS CLI makes it easy to download and keep the dataset in sync. The following command downloads the entire dataset on the first run, and on subsequent runs only downloads new or modified files:
aws s3 sync s3://verifier-alliance-parquet-export/v2/ ./verifier-alliance-dataset --endpoint-url https://storage.googleapis.com --no-sign-request
Working with Parquet Files
Once downloaded, you can query and analyze Parquet files using various tools and libraries. Here are some popular options to give you a head start:
- Pandas: Read data from Parquet files in Python
- DuckDB: SQL queries on Parquet files
- pg_parquet: PostgreSQL extension for copying Parquet data into a Postgres database
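As a starting point, here is a minimal Python sketch that counts the rows of one table with DuckDB by reading all of its Parquet files at once. It assumes the dataset was downloaded with the aws s3 sync command above, so that each table sits in its own subdirectory of ./verifier-alliance-dataset, and that the third-party duckdb package is installed. The helper names are hypothetical.

```python
from pathlib import Path


def table_glob(dataset_dir: str, table: str) -> str:
    """Glob pattern matching every Parquet file of one table."""
    return (Path(dataset_dir) / table / "*.parquet").as_posix()


def count_rows_sql(dataset_dir: str, table: str) -> str:
    """Build a DuckDB query that counts rows across all files of a table."""
    return f"SELECT count(*) FROM read_parquet('{table_glob(dataset_dir, table)}')"


def count_rows(dataset_dir: str, table: str) -> int:
    """Run the count query; requires the third-party duckdb package."""
    import duckdb  # imported lazily so the helpers above work without it

    return duckdb.sql(count_rows_sql(dataset_dir, table)).fetchone()[0]
```

For example, count_rows("./verifier-alliance-dataset", "verified_contracts") would scan every file of the verified_contracts table, assuming that directory layout.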
API
For more fine-grained control, you can browse and download files directly using the S3-compatible Google Cloud Storage API:
List all v2 files:
https://export.verifieralliance.org/?prefix=v2/
List files for a specific table:
https://export.verifieralliance.org/?prefix=v2/verified_contracts/
Download a specific file:
https://export.verifieralliance.org/v2/verified_contracts/verified_contracts_0_1000000.parquet
The API returns XML responses following the Google Cloud Storage XML API specification.
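The URL shapes above can be assembled programmatically. The following Python sketch builds the listing URL for a table and downloads a single file using only the standard library. The function names are hypothetical, and the sketch assumes that object keys map directly onto URL paths as in the examples above.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import quote, urlencode

BASE_URL = "https://export.verifieralliance.org"


def table_listing_url(table: str) -> str:
    """URL that lists the v2 files of one table."""
    query = urlencode({"prefix": f"v2/{table}/"}, safe="/")
    return f"{BASE_URL}/?{query}"


def download_url(key: str) -> str:
    """URL of a single file, given its object key."""
    return f"{BASE_URL}/{quote(key)}"


def download(key: str, dest_dir: str = ".") -> Path:
    """Stream one file to dest_dir, recreating the key's directory structure."""
    target = Path(dest_dir) / key
    target.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(download_url(key)) as resp, open(target, "wb") as out:
        shutil.copyfileobj(resp, out)
    return target
```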
Available Tables
The Parquet export is available for all Verifier Alliance (VerA) database tables: verified_contracts, sources, compiled_contracts_sources, compiled_contracts, contract_deployments, contracts, and code.
API Parameters
The most important parameters of the listing API are the following:
- prefix: Filter results to objects whose names begin with this prefix (e.g., ?prefix=v2/verified_contracts/)
- marker: Start listing after this object name (for pagination)
- max-keys: Maximum number of objects to return in one response
The response from the listing API might be truncated, which is indicated by the IsTruncated field of the result. The marker parameter can be used to paginate through results by setting it to the NextMarker of the previous response.
Example with pagination:
https://export.verifieralliance.org/?prefix=v2/verified_contracts/&max-keys=2&marker=v2/verified_contracts/verified_contracts_1000000_2000000.parquet
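The pagination loop described above could look roughly like the following Python sketch, which keeps requesting pages until IsTruncated is no longer true. It uses only the standard library. The function names are hypothetical, and falling back to the last returned key when a truncated response carries no NextMarker is an assumption of this sketch, not documented behaviour of the export.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, Iterator, List, Optional, Tuple
from urllib.parse import urlencode

BASE_URL = "https://export.verifieralliance.org"


def _local(tag: str) -> str:
    """Drop the XML namespace so lookups do not depend on the xmlns value."""
    return tag.rsplit("}", 1)[-1]


def parse_page(xml_text: str) -> Tuple[List[str], bool, Optional[str]]:
    """Return (keys, is_truncated, next_marker) for one listing response."""
    keys: List[str] = []
    truncated, next_marker = False, None
    for child in ET.fromstring(xml_text):
        name = _local(child.tag)
        if name == "Contents":
            keys.extend(f.text for f in child if _local(f.tag) == "Key")
        elif name == "IsTruncated":
            truncated = (child.text or "").strip().lower() == "true"
        elif name == "NextMarker":
            next_marker = child.text
    return keys, truncated, next_marker


def fetch(url: str) -> str:
    """Fetch one listing page over HTTP."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def iter_keys(prefix: str, max_keys: Optional[int] = None,
              fetch_page: Callable[[str], str] = fetch) -> Iterator[str]:
    """Yield every object key under prefix, following pagination markers."""
    marker = None
    while True:
        params = {"prefix": prefix}
        if marker is not None:
            params["marker"] = marker
        if max_keys is not None:
            params["max-keys"] = str(max_keys)
        page = fetch_page(f"{BASE_URL}/?{urlencode(params, safe='/')}")
        keys, truncated, next_marker = parse_page(page)
        yield from keys
        if not truncated:
            return
        # Assumption: without a NextMarker, continue after the last key seen.
        marker = next_marker or (keys[-1] if keys else None)
        if marker is None:
            return
```

Passing fetch_page as a parameter keeps the loop testable without network access; list(iter_keys("v2/verified_contracts/")) would collect all keys of that table.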
Metadata
The listing API provides detailed metadata for each of the Parquet files:
<ListBucketResult xmlns="http://doc.s3.amazonaws.com/2006-03-01">
<Name>verifier-alliance-parquet-export</Name>
<Prefix>v2/</Prefix>
<Marker/>
<IsTruncated>false</IsTruncated>
<Contents>
<Key>v2/code/code_0_100000.parquet</Key>
<Generation>1766065018286394</Generation>
<MetaGeneration>1</MetaGeneration>
<LastModified>2025-12-18T13:36:58.292Z</LastModified>
<ETag>"ba687acd0afab85ed203a593479f0ce3"</ETag>
<Size>101591414</Size>
</Contents>
<!-- More entries... -->
</ListBucketResult>
Most important fields:
- Key: The file path (download at https://export.verifieralliance.org/{Key})
- LastModified: When the file was last uploaded/modified
- ETag: MD5 hash of the file contents (use this to detect changes)
- Size: File size in bytes
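These fields are enough to decide whether a local copy is still current. The Python sketch below compares a local file against the Size and ETag values from a listing entry, taking the ETag to be the MD5 hash of the file contents as described above. The function names are hypothetical, and the quotes around the ETag are stripped because the listing returns it as a quoted string.

```python
import hashlib
from pathlib import Path


def md5_hex(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file without loading it fully into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(local_path: str, etag: str, size: int) -> bool:
    """Return True if the local file is missing or differs from the listed object."""
    path = Path(local_path)
    if not path.is_file():
        return True
    # Cheap check first: a size mismatch means the file changed.
    if path.stat().st_size != size:
        return True
    # Assumes the ETag is the MD5 of the contents, returned in double quotes.
    return md5_hex(path) != etag.strip('"').lower()
```

Checking the size before hashing avoids reading large files when a mismatch is already obvious.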
Legacy Format (v1)
The v1 Parquet export format is no longer updated. All new data is only available in the v2 format. Please migrate to v2 for access to current data.
The legacy v1 format files can still be accessed via non-prefixed paths in the bucket (e.g., https://export.verifieralliance.org/verified_contracts/verified_contracts_0_1000000.parquet).
The v1 format used a JSON manifest file at https://export.verifieralliance.org/manifest.json listing all available Parquet files. However, this format was not append-only. Each daily export regenerated all files, requiring users to download the entire dataset again after every update. The manifest also did not include checksums or modification timestamps, making it difficult to determine what changed between exports.
Export Script
The source code of the export script is available at https://github.com/verifier-alliance/parquet-export.