dbt seed: You're Wasting Time Manually Loading CSV Files
You need a lookup table in your warehouse. Simple, right?
Wrong. You're opening your database GUI, clicking through import wizards, mapping columns, fixing encoding errors, and 20 minutes later you still haven't started your actual work.
Here's what this costs you: 20 minutes per CSV import × 3 imports per week = 1 hour per week. At $75/hour, that's $75 a week, or $3,900 a year spent on manual CSV imports.
dbt seed loads CSVs in 3 seconds with one command.
The Difference
Without dbt seed:
- Open database GUI, click through import wizards
- Fix encoding errors and column type mismatches
- Repeat every time the data changes
- Hope everyone on your team did it the same way
With dbt seed:
- Drop your CSV in the seeds/ folder
- Run dbt seed
- Everyone gets identical data, instantly
Three CSV Files, Three Seconds
Place your CSVs in the seeds/ directory:
seeds/
raw_customers.csv
raw_orders.csv
raw_payments.csv
Run one command:
dbt seed --profiles-dir .
Done. Three tables created, all data loaded. No GUI, no clicking, no errors.
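If dbt guesses a column type wrong, you can pin the types in dbt_project.yml instead of fighting an import wizard. A minimal sketch; the project name (jaffle_shop) and the columns shown are placeholders, not from the example files above:

```yaml
# dbt_project.yml (excerpt)
seeds:
  jaffle_shop:            # must match the `name:` of your dbt project
    raw_customers:
      +column_types:      # override dbt's inferred types for this seed
        id: integer
        first_name: varchar(100)
```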
Use Them Like Any Other Table
Reference seeds in your models:
select * from {{ ref('raw_customers') }}
dbt handles dependencies automatically. Your seeds load before your models run.
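For example, a downstream model can join two seeds exactly like warehouse tables. This is a hypothetical model; the column names (id, customer_id) are assumed for illustration:

```sql
-- models/customer_order_counts.sql
select
    c.id as customer_id,
    count(o.id) as order_count
from {{ ref('raw_customers') }} as c
left join {{ ref('raw_orders') }} as o
    on o.customer_id = c.id
group by 1
```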
When to Use Seeds
Perfect for:
- Lookup tables (country codes, status lists)
- Demo data for tutorials
- Test fixtures for validating logic
Critical rule: Seeds are for small files only (under 1MB). Larger data? Use proper ETL tools.
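For illustration, a country-code lookup seed can be as small as this (the file name and columns are made up for the example):

```csv
country_code,country_name
US,United States
GB,United Kingdom
DE,Germany
```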
Reload When Data Changes
Updated your CSV? Reload in 3 seconds:
dbt seed --full-refresh
Tables drop and recreate with new data. Everyone on your team runs the same command, gets identical results.
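If only one CSV changed, recent dbt versions let you target it by name with --select; raw_payments here is just the example seed from above:

```sh
dbt seed --select raw_payments --full-refresh
```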
What You Gain
- Load CSVs in 3 seconds, not 20 minutes
- Save $3,900 yearly in wasted import time
- Version control your data (CSVs live in git)
- Guarantee everyone has identical reference data
- Never click through import wizards again
What it costs: Dropping a CSV file in a folder.
Stop manually importing CSVs. Use dbt seed.