Benefits
Object Storage Advantages
- Large Historical Data – Access large datasets in analytics-ready formats.
- Research Flexibility – Run proprietary analyses and test strategies without API rate constraints.
- Cost Efficiency – Avoid repeated API calls when retrieving extensive historical data.
- Pipeline Integration – Easily connect to existing ETL, ELT, and analytics workflows.
Cloud Data Warehouse Advantages
- High-Performance Queries – Optimized for complex, large-scale analytical workloads.
- Seamless Data Integration – Integrate with diverse data sources and workflows.
- Cloud-Native Scalability – Scale storage and compute as your data needs grow.
- Enterprise-Grade Security – Advanced compliance and security features.
- User-Friendly SQL Access – Intuitive querying for analysts and engineers.
- Built-In Transformation – Native tools for processing, cleansing, and enriching data.
Delivery Methods
Amazon S3 — Parquet
Retrieve large historical datasets from Amazon S3 in Apache Parquet format, optimized for performance and compatibility with analytics tools. Apache Parquet format offers several key advantages:
- Columnar Storage – Stores data by column instead of row, enabling highly efficient compression and encoding.
- High Performance – Delivers faster processing for large datasets and complex analytical queries.
- Efficient Compression – Typically achieves better compression ratios than row-oriented text formats such as CSV or JSON.
- Analytics-Optimized – Designed for fast querying and analytical workloads.
- Seamless Integration – Fits easily into existing data pipelines and big data ecosystems.
- Broad Compatibility – Supported across major data warehousing, analytics, and machine learning platforms.
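To make the columnar idea concrete, here is a toy sketch using only the Python standard library. It is not Parquet's actual encoding: it simply pivots made-up rows into per-column lists, reads one column for an aggregate, and prints the zlib-compressed size of each layout. The field names (`symbol`, `day`, `close`) and helper names are illustrative assumptions, and the sketch assumes every row has the same fields.

```python
import json
import zlib


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def compressed_size(obj):
    """Size in bytes of the zlib-compressed JSON encoding of obj."""
    return len(zlib.compress(json.dumps(obj).encode("utf-8")))


# Made-up sample rows; the field names are illustrative only.
rows = [
    {"symbol": "ABC", "day": i, "close": 100.0 + (i % 7)}
    for i in range(1000)
]
columns = to_columns(rows)

# An aggregate over one field only needs that column's values.
average_close = sum(columns["close"]) / len(columns["close"])

print("row layout, compressed bytes:   ", compressed_size(rows))
print("column layout, compressed bytes:", compressed_size(columns))
print("average close:", average_close)
```

Real Parquet readers apply the same principle at the file level, so a query that needs two fields does not have to scan the rest.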
Snowflake Data Warehouse
Most datasets are available in Snowflake, providing scalable, cloud-native data warehousing with efficient storage, fast retrieval, and powerful SQL-based analysis.
Getting Started
Working with Parquet Files
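As a rough sketch of inspecting a Parquet file with pandas, the snippet below loads a file into a DataFrame and lists each field with its dtype. The path `sample.parquet` and the helper names `load_sample` and `field_summary` are placeholders, not part of any official tooling, and `pandas.read_parquet` needs a Parquet engine such as pyarrow or fastparquet installed.

```python
def load_sample(path):
    """Load a Parquet file into a pandas DataFrame."""
    # Imported here so the module loads even where pandas is absent.
    import pandas as pd  # third-party; needs pyarrow or fastparquet

    return pd.read_parquet(path)


def field_summary(df):
    """Return (field name, dtype) pairs for a DataFrame-like object."""
    return [(name, str(dtype)) for name, dtype in df.dtypes.items()]


# Example usage (placeholder path, assumed to be a downloaded sample file):
# df = load_sample("sample.parquet")
# for name, dtype in field_summary(df):
#     print(f"{name}: {dtype}")
# print(df.head())
```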
If you only want to see the available fields, download a sample Parquet file and load it as a pandas DataFrame.
Access and Provisioning
Amazon S3 Access
Customers need their own AWS credentials for S3 access provisioning. Contact your Account Executive if you’re interested in downloading data via S3.
Important Access Requirements
Note: Our S3 data buckets are configured as Requester Pays buckets, meaning your company is responsible for any Amazon data transfer fees incurred during downloads. To access the data, you must include one of the following in every request:
- Header: x-amz-request-payer: requester
- Parameter: --request-payer requester (for CLI requests)
Ensure this setting is included in all requests to avoid access issues.
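For Python users, the same Requester Pays setting can be passed through boto3, which sends the `x-amz-request-payer` header when `RequestPayer` is set. The following is a minimal sketch, assuming boto3 is installed and your own AWS credentials are configured. The bucket name, object key, and helper names are placeholders, so substitute the values provisioned for your account.

```python
def requester_pays_args():
    """Extra arguments that mark a download as Requester Pays."""
    return {"RequestPayer": "requester"}


def download_requester_pays(bucket, key, destination):
    """Download one object from a Requester Pays bucket to a local file."""
    # Imported here so the module loads even where boto3 is absent.
    import boto3  # third-party AWS SDK; uses your own AWS credentials

    s3 = boto3.client("s3")
    s3.download_file(bucket, key, destination, ExtraArgs=requester_pays_args())


# Example usage (placeholder bucket and key, not real locations):
# download_requester_pays("example-bucket", "path/to/file.parquet", "file.parquet")
```

Requests that omit the setting are rejected by S3, so it is worth routing every call through a single helper like this.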
Snowflake Access
Customers need their own Snowflake account for data sharing access. Visit Snowflake’s Marketplace to access sample files, or contact us for full access.
Next Steps
Choose the delivery method that best fits your infrastructure and analytical needs:
- Amazon S3: Ideal for downloading and storing large historical datasets for offline analysis
- Snowflake: Perfect for real-time querying and advanced analytics with SQL