Note: 1Password SaaS Manager was previously named Trelica. Some commands and integrations still use or refer to Trelica. Follow directions as written to avoid errors.
Query your Parquet exports directly from S3 without downloading files.
Prerequisites
# Install DuckDB (one-time)
# macOS
brew install duckdb
# or Linux:
# curl -L https://github.com/duckdb/duckdb/releases/latest/download/duckdb_cli-linux-amd64.zip -o duckdb.zip && unzip duckdb.zipSetup AWS credentials
DuckDB reads AWS credentials from standard locations:
# Option 1: AWS CLI (recommended)
aws sso login
eval $(aws configure export-credentials --format env)
aws sts get-caller-identity
# Option 2: Environment variables
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_DEFAULT_REGION="us-east-1"Recommended setup – automated initialization
The export generates initialization scripts to help you query the data. Choose based on your use case:
Local files
init_duckdb_local.sql - Creates views that query local Parquet files directly (no data duplication)
S3 files
For BI tools: Use init_duckdb_s3.sql (TABLES)
- Downloads data once from S3 into DuckDB for fast queries
- Best for: Dashboards, reporting, BI tools in containers (e.g. Metabase)
- Performance: Fast (data cached in DuckDB)
- Trade-off: Requires manual refresh to see new data
For ad-hoc queries: Use init_duckdb_s3_views.sql (VIEWS)
- Queries S3 Parquet files directly on each SELECT
- Best for: DuckDB CLI exploration, always-fresh data
- Performance: Slower (reads from S3 on each query)
- Trade-off: Always shows latest data without refresh
Example: BI tool setup (tables)
# Download the init script (tables version)
aws s3 cp s3://<YOUR-BUCKET>/trelica/2025-10-22/init_duckdb_s3.sql .
# Initialize the database
duckdb trelica_dw.db < init_duckdb_s3.sqlExample: ad-hoc queries (views)
# Download the init script (views version)
aws s3 cp s3://<YOUR-BUCKET>/trelica/2025-10-22/init_duckdb_s3_views.sql .
# Initialize the database
duckdb trelica_dw.db < init_duckdb_s3_views.sqlNow you can query all tables using simple names:
-- Start DuckDB
duckdb trelica_dw.db -ui
-- Query tables directly
SELECT * FROM person WHERE email LIKE '%@example.com';
-- Join across tables
SELECT
p.name,
p.email,
t.name as team_name
FROM person p
JOIN person_team pt ON p.person_id = pt.person_id
JOIN team t ON pt.team_id = t.team_id;Query multiple dates
Track changes over time by querying multiple export dates:
-- Count persons in each export
SELECT
regexp_extract(filename, '(\d{4}-\d{2}-\d{2})', 1) as export_date,
COUNT(*) as person_count
FROM read_parquet(
's3://<YOUR-BUCKET>/trelica/*/person.parquet',
filename=true
)
GROUP BY export_date
ORDER BY export_date;
-- Query all data across dates
SELECT * FROM read_parquet('s3://<YOUR-BUCKET>/trelica/*/person.parquet');
-- Include export date in results
SELECT
*,
regexp_extract(filename, '(\d{4}-\d{2}-\d{2})', 1) as export_date
FROM read_parquet(
's3://<YOUR-BUCKET>/trelica/*/person.parquet',
filename=true
);Common queries
Using the automated setup, query with simple table names:
-- Count records per table
SELECT COUNT(*) FROM person;
SELECT COUNT(*) FROM app_user;
-- Check schema
DESCRIBE SELECT * FROM person;
-- Find people by email domain
SELECT name, email, department
FROM person
WHERE email LIKE '%@example.com';
-- License usage by team
SELECT
t.name as team,
COUNT(DISTINCT p.person_id) as person_count,
COUNT(DISTINCT au.app_user_id) as app_user_count
FROM person p
LEFT JOIN person_team pt ON p.person_id = pt.person_id
LEFT JOIN team t ON pt.team_id = t.team_id
LEFT JOIN app_user au ON p.person_id = au.person_id
GROUP BY t.name
ORDER BY person_count DESC;Export query results
-- Export to CSV
COPY (
SELECT * FROM person
) TO 'person_export.csv' (HEADER, DELIMITER ',');
-- Export filtered results
COPY (
SELECT * FROM person WHERE department = 'Engineering'
) TO 'engineering_team.csv' (HEADER, DELIMITER ',');
-- Export joined data to Parquet
COPY (
SELECT
p.name,
p.email,
t.name as team
FROM person p
JOIN person_team pt ON p.person_id = pt.person_id
JOIN team t ON pt.team_id = t.team_id
) TO 'person_teams.parquet' (FORMAT PARQUET);Refreshing data
Tables (init_duckdb_s3.sql)
When using tables, you need to manually refresh to see new data:
# Re-run the init script to reload data from S3
duckdb trelica_dw.db < init_duckdb_s3.sqlThis is ideal for BI tools where you want controlled refreshes (e.g., nightly updates).
Views (init_duckdb_s3_views.sql)
When using views, data is always fresh - no refresh needed! Each query reads directly from S3.
Advanced: IAM role assumption
If using IAM role instead of access keys:
-- Set role ARN in DuckDB
SET s3_role_arn='arn:aws:iam::1234567890:role/TrelicaDWRole';
-- Query as normal - DuckDB will assume the role automatically
SELECT * FROM person LIMIT 10;Troubleshooting
Access denied
# Verify AWS credentials work
aws s3 ls s3://<YOUR-BUCKET>/trelica/
# If SSO, ensure session is active
aws sso login-- In DuckDB, check region is set correctly
SELECT current_setting('s3_region');
-- Update if needed (must match bucket region)
SET s3_region='us-east-1';httpfs extension not found
-- Install and load the extension
INSTALL httpfs;
LOAD httpfs;IO Error: No files found
Check that:
- You used an S3 init script (
init_duckdb_s3.sqlorinit_duckdb_s3_views.sql), not the local one (init_duckdb_local.sql) - The export date exists in S3
- httpfs is installed and loaded
# Verify the date exists
aws s3 ls s3://<YOUR-BUCKET>/trelica/2025-10-22/
# Verify init scripts are uploaded
aws s3 ls s3://<YOUR-BUCKET>/trelica/2025-10-22/init_duckdb_s3.sql
aws s3 ls s3://<YOUR-BUCKET>/trelica/2025-10-22/init_duckdb_s3_views.sqlPerformance tips
1. Filter early - DuckDB pushes filters down to Parquet
-- Good: Filter pushed to Parquet
SELECT * FROM person WHERE email LIKE '%@example.com';
-- Bad: Reads entire file first
SELECT * FROM (SELECT * FROM person) WHERE email LIKE '%@example.com';2. Select only needed columns - Parquet is columnar
-- Only reads name and email columns from Parquet
SELECT name, email FROM person;3. Use date partitioning - Read only relevant dates
-- Read only October 2025 exports
SELECT * FROM read_parquet('s3://<YOUR-BUCKET>/trelica/2025-10-*/person.parquet');Integration with other tools
Once tables or views are created, connect any tool that supports DuckDB:
-
Python:
duckdb.connect('trelica_dw.db') - Tableau: DuckDB JDBC driver
- Superset: DuckDB database connection
-
Jupyter:
%load_ext sql+%sql duckdb:///trelica_dw.db
Example Python:
import duckdb
con = duckdb.connect('trelica_dw.db')
df = con.execute("""
SELECT * FROM person
WHERE email LIKE '%@example.com'
""").df()
print(df)By accessing or using 1Password Developer Tools, you agree to the API and SDK Terms of Service.
Comments
0 comments
Please sign in to leave a comment.