DBT + Singdata Lakehouse Quickstart
This guide uses the jaffle-shop-clickzetta sample project to walk you through the complete workflow for connecting dbt to Singdata Lakehouse: installing dependencies, configuring the connection, loading data, building models, and running tests.
jaffle-shop is the standard sample project maintained by dbt, simulating a coffee shop's order data. It includes 6 raw data tables, 6 staging views, 7 mart tables, and 27 data quality tests.
Prerequisites
- Python 3.10 or higher (3.12 recommended)
- A working Singdata Lakehouse instance (requires a workspace, vcluster, username, and password)
Step 1: Clone the Project
Step 2: Install dbt-clickzetta
Create a virtual environment and install the adapter:
Verify the installation:
Example output:
Step 3: Configure the Connection
Copy the connection template and fill in your instance information:
Edit profiles.yml and replace the placeholders in angle brackets:
Verify the connection:
All checks showing OK means the connection is successful.
Step 4: Install Dependency Packages
This project depends on dbt_utils (1.3.3) and dbt_date (0.17.1). After installation:
Step 5: Load Raw Data
Loads the 6 CSV files under seeds/jaffle-data/ into the dbt_jaffle.raw schema, approximately 150,000 rows total. This normally completes in about 1 minute:
Step 6: Build Models
Builds 13 models in dependency order (6 staging views + 7 mart tables):
After the build completes, the following objects will appear in the dbt_jaffle schema of Singdata Lakehouse:
| Object | Type | Description |
|---|---|---|
stg_customers | View | Cleaned customer data |
stg_locations | View | Cleaned store data |
stg_order_items | View | Cleaned order line item data |
stg_orders | View | Cleaned order data |
stg_products | View | Cleaned product data |
stg_supplies | View | Cleaned supply data |
customers | Table | Customer wide table (with spending summary) |
orders | Table | Order wide table (with product and amount details) |
order_items | Table | Order line item wide table |
locations | Table | Store dimension table |
products | Table | Product dimension table |
supplies | Table | Supply dimension table |
metricflow_time_spine | Table | MetricFlow time axis helper table |
Step 7: Run Data Quality Tests
Runs 27 data quality tests covering uniqueness, non-null, referential integrity, and business logic validation:
Step 8: View Project Documentation (Optional)
Open http://localhost:8080 in your browser to view the complete data lineage graph and field descriptions.
Project Structure
models/staging/: Raw data cleaning layer, materialized as viewsmodels/marts/: Business wide table layer, materialized as tablesseeds/jaffle-data/: 6 CSV raw data filesmacros/clickzetta_seed_batch_size.sql: Singdata Lakehouse adaptation macro, controls batch write sizeprofiles.yml.example: Connection configuration template
Frequently Asked Questions
Q: What should I do if dbt debug reports a connection failure?
Check that service, instance, and workspace in profiles.yml are filled in correctly. service is the API endpoint domain name, not an HTTP URL — do not add the https:// prefix.
Q: Why are unit tests disabled?
The original jaffle-shop project includes 3 dbt unit tests. The SQL they generate uses cast(null as string not null) syntax, which Singdata Lakehouse does not currently support in cast expressions with not null constraints. Therefore, unit tests are uniformly disabled in dbt_project.yml. All 27 data tests run normally.
Command Reference
| Command | Purpose |
|---|---|
dbt deps | Install dependency packages declared in packages.yml |
dbt seed | Load CSV files as database tables |
dbt run | Compile and execute all models |
dbt test | Run data quality tests |
dbt docs generate | Generate project documentation |
dbt docs serve | View documentation in a local browser |
