Section 27

New Project Template

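Use this structure as the starting point for any new data project. The SQL is split into bronze, silver, and gold files, which main.py runs in that order against a single DuckDB database file.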

Folder Structure

txt
/project
  /data
    /input
    /output
  /sql
    bronze.sql
    silver.sql
    gold.sql
  main.py
  requirements.txt
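
If you would rather generate this layout than create it by hand, the following is a minimal sketch using only the standard library. The function name scaffold and the DIRS and FILES constants are illustrative and not part of the template; the files it creates are empty placeholders that you still need to fill in.

```python
from pathlib import Path

# Layout copied from the folder structure above.
DIRS = ["data/input", "data/output", "sql"]
FILES = [
    "sql/bronze.sql",
    "sql/silver.sql",
    "sql/gold.sql",
    "main.py",
    "requirements.txt",
]


def scaffold(root):
    """Create the template folders and empty placeholder files under root."""
    root = Path(root)
    for d in DIRS:
        (root / d).mkdir(parents=True, exist_ok=True)
    for f in FILES:
        # touch() creates the file if needed and never changes existing content
        (root / f).touch(exist_ok=True)
    return root
```

Calling scaffold("project") from the directory that should contain the project creates the tree shown above. It is safe to run again on an existing project, because folders and files that already exist are left as they are.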

requirements.txt

txt
duckdb
pandas
openpyxl

main.py (Base Template)

python
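import duckdb

# Paths are relative to the project root, so run this script from there.
DB_PATH = "project.duckdb"

# Layers run in this order: bronze, then silver, then gold.
SQL_FILES = ["sql/bronze.sql", "sql/silver.sql", "sql/gold.sql"]


def run_sql_file(con, path):
    """Read one SQL file and execute its contents on the connection."""
    with open(path, "r", encoding="utf-8") as file:
        con.execute(file.read())


def main():
    # Creates project.duckdb on the first run and reuses it afterwards.
    con = duckdb.connect(DB_PATH)
    try:
        for path in SQL_FILES:
            run_sql_file(con, path)
    finally:
        con.close()  # release the database file even if a step fails

    print("Pipeline completed")


if __name__ == "__main__":
    main()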
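
Because the three SQL files are opened one after another, a missing gold.sql is only discovered after the bronze and silver steps have already run. One possible safeguard, sketched here with the standard library only, is to check all the files before connecting. The names SQL_STEPS and missing_sql_files are hypothetical helpers and are not part of the template.

```python
from pathlib import Path

# Same files, in the same order, that main.py runs.
SQL_STEPS = ["sql/bronze.sql", "sql/silver.sql", "sql/gold.sql"]


def missing_sql_files(project_root=".", steps=SQL_STEPS):
    """Return the step files that are missing or empty, in pipeline order."""
    root = Path(project_root)
    problems = []
    for step in steps:
        path = root / step
        # Treat a whitespace-only file the same as a missing one.
        if not path.is_file() or not path.read_text(encoding="utf-8").strip():
            problems.append(step)
    return problems
```

A call such as missing_sql_files() at the top of main() could raise an error or print the returned list and exit when it is not empty, so the pipeline either runs all three layers or none.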

How to Use

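  1. Copy this template into a new project folder.
  2. Add your input data to data/input.
  3. Update the three SQL files for your dataset.
  4. From the project folder, install the dependencies and run the pipeline: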
bash
pip install -r requirements.txt
python main.py