Section 12

Build Your First Data System

Seven concrete steps from raw CSV to an Excel output.

Step 1: Define Sources

Example: CSV file, API, database

  • claims.csv
  • mvr.csv
  • training.csv

Step 2: Create Folder Structure

bash
/project
  /extract
  /load
  /transform
  /output
  main.py

Step 3: Extract Data

python
import pandas as pd

df = pd.read_csv("claims.csv")
print(df.head())

Step 4: Load to Bronze

sql
CREATE TABLE bronze.raw_claims AS
SELECT * FROM read_csv_auto('claims.csv');

Step 5: Transform to Silver

sql
CREATE TABLE silver.claims_clean AS
SELECT DISTINCT claim_id, status, amount
FROM bronze.raw_claims;

Step 6: Build Gold Layer

sql
CREATE TABLE gold.claims_summary AS
SELECT status, COUNT(*) as total_claims
FROM silver.claims_clean
GROUP BY status;

Step 7: Export

python
df = con.execute("SELECT * FROM gold.claims_summary").df()
df.to_excel("output.xlsx")