Snowflake Dynamic Tables Migration in Practice: Bronze–Silver–Gold Pipeline
If you have built a Medallion data pipeline on Snowflake using Dynamic Tables, migrating to Singdata Lakehouse involves mainly 4 syntax differences. The core SQL query logic does not need to change.
This article demonstrates the complete migration process with a real project: migrating a Bronze–Silver–Gold three-layer pipeline based on Snowflake Dynamic Tables to Singdata Lakehouse, using the TPC-H standard dataset available on both platforms. All SQL has been validated with cz-cli.
Full code: snowflake2lakehouse-dynamic-tables
Original Project
snowflake2lakehouse-dynamic-tables is adapted from Techy-Malay/snowflake-bsg-dynamic-tables, demonstrating how to implement a Bronze–Silver–Gold architecture on Snowflake using Dynamic Tables. The project uses the TPC-H ORDERS table as the data source and outputs a daily sales summary after three layers of Dynamic Table processing.
The migrated code is in the 03_lakehouse/ directory; the original Snowflake SQL is preserved in 01_snowflake/ for comparison.
Conclusion First
The core SQL query logic does not need to change. All 4 changes are platform configuration replacements: TARGET_LAG → REFRESH INTERVAL, WAREHOUSE → VCLUSTER, DATA_RETENTION_TIME_IN_DAYS moved to a separate post-creation statement, and DOWNSTREAM cascade refresh replaced with independent per-layer intervals.
| Change | Effort | Notes |
|---|---|---|
| Dynamic Table refresh parameters | Very low | TARGET_LAG → REFRESH INTERVAL, WAREHOUSE → VCLUSTER |
| DOWNSTREAM cascade refresh | Low | Singdata Lakehouse has no such concept; each layer sets its own REFRESH INTERVAL independently |
| Time Travel retention | Very low | Moved from inline CREATE TABLE option to post-creation ALTER TABLE SET PROPERTIES |
| Data source reference | Very low | SNOWFLAKE_SAMPLE_DATA.TPCH_SF1 → clickzetta_sample_data.tpch_100g |
Cleaning, deduplication (QUALIFY), aggregation, date truncation — the core SQL logic syntax is identical and requires no changes.
Technology Stack Comparison
| Original (Snowflake) | After Migration (Lakehouse) | |
|---|---|---|
| Compute resource | WAREHOUSE = compute_wh | VCLUSTER default |
| Refresh strategy | TARGET_LAG = '5 minutes' | REFRESH INTERVAL '5' MINUTE |
| Dependency propagation | TARGET_LAG = 'DOWNSTREAM' (auto cascade) | No such concept; each layer refreshes independently |
| Manual refresh | ALTER DYNAMIC TABLE ... REFRESH | REFRESH DYNAMIC TABLE ... |
| Time Travel retention | DATA_RETENTION_TIME_IN_DAYS = 1 (inline CREATE TABLE option) | ALTER TABLE ... SET PROPERTIES ('data_retention_days' = '1') (separate post-creation statement) |
| Sample dataset | SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS (1GB) | clickzetta_sample_data.tpch_100g.orders (100GB) |
| Schema reference | USE SCHEMA + unqualified name, or fully qualified name | Same; both forms supported |
| Deduplication syntax | QUALIFY ROW_NUMBER() OVER (...) = 1 | Same syntax, fully supported |
| Date truncation | DATE_TRUNC('day', ts) | Same syntax, fully supported |
The main changes are platform configuration — replacing Snowflake Virtual Warehouse with Lakehouse VCluster, and TARGET_LAG with REFRESH INTERVAL. The core SQL logic for data processing is completely unchanged: cleaning, deduplication, aggregation — these are written identically on Lakehouse as on Snowflake.
Architecture Overview
Migration Steps
Step 1: Replace Sample Dataset Reference
Snowflake's built-in sample data is accessed via the SNOWFLAKE_SAMPLE_DATA database; Lakehouse's shared dataset is accessed via clickzetta_sample_data.
Snowflake:
Lakehouse:
Two changes:
SNOWFLAKE_SAMPLE_DATA.TPCH_SF1→clickzetta_sample_data.tpch_100g(different dataset name, same column names)DATA_RETENTION_TIME_IN_DAYS = 1moved from inline DDL option to a separate post-creationALTER TABLE ... SET PROPERTIES
Step 2: Replace Dynamic Table Syntax
This is the core part of the migration, involving 3 syntax changes.
TARGET_LAG → REFRESH INTERVAL
Snowflake uses TARGET_LAG to declare acceptable data latency (the platform automatically determines refresh frequency); Lakehouse uses REFRESH INTERVAL to set a fixed refresh cycle.
Snowflake:
Lakehouse:
WAREHOUSE → VCLUSTER
Snowflake Virtual Warehouse corresponds to Lakehouse VCluster. Use your VCluster name (most instances default to default).
Handling TARGET_LAG = 'DOWNSTREAM'
Snowflake supports TARGET_LAG = 'DOWNSTREAM', which lets upstream table refreshes automatically trigger downstream table refreshes in a dependency cascade. Lakehouse does not have this concept — each Dynamic Table refreshes independently according to its own REFRESH INTERVAL.
Practical recommendation: set Tier 1 (Bronze/Silver) refresh intervals shorter than Tier 2 (Gold) to approximate cascade behavior. For example, Bronze/Silver at 5 minutes and Gold at 10 minutes ensures Bronze/Silver data is up to date when Gold refreshes.
Step 3: Replace Manual Refresh Command
Snowflake:
Lakehouse:
Note that refreshes must be executed manually in dependency order — Lakehouse does not auto-cascade:
Step 4: Fully Compatible Parts (No Changes Needed)
The following syntax is identical between Lakehouse and Snowflake:
QUALIFY ROW_NUMBER() OVER (...) = 1 (Silver layer deduplication)
DATE_TRUNC('day', ts) (Gold layer date aggregation)
CAST(order_ts AS TIMESTAMP) (Silver layer type conversion)
Validation Results
All SQL has been validated by running on a Lakehouse instance via cz-cli:
| Table | Row Count | Notes |
|---|---|---|
orders_stg | 100,000 | Sampled from 150M-row TPC-H dataset |
bronze_orders | 100,000 | Added ingestion_ts and source_system columns |
silver_orders | 100,000 | After QUALIFY deduplication (TPC-H source data has no duplicates) |
gold_sales_summary | 103 | 103 distinct order dates, total sales $15 billion |
After running, clean up all Lakehouse objects with:
Migration Conclusion
Snowflake Dynamic Tables and Lakehouse Dynamic Tables have highly compatible SQL query logic. This project validates the following conclusions:
Fully compatible (no changes needed):
QUALIFY ROW_NUMBER() OVER (...) = 1deduplicationDATE_TRUNC('day', ts)date truncationCAST(col AS TYPE)type conversion- Standard aggregation functions:
COUNT,SUM,AVG CURRENT_TIMESTAMP()system function
4 changes required:
| Difference | Snowflake | Lakehouse |
|---|---|---|
| Compute resource | WAREHOUSE = wh_name | VCLUSTER vcluster_name |
| Refresh strategy | TARGET_LAG = 'N minutes' | REFRESH INTERVAL 'N' MINUTE |
| Dependency cascade | TARGET_LAG = 'DOWNSTREAM' | No such concept; each layer sets its own interval |
| Manual refresh | ALTER DYNAMIC TABLE ... REFRESH | REFRESH DYNAMIC TABLE ... |
| Time Travel retention | DATA_RETENTION_TIME_IN_DAYS = N (inline CREATE TABLE option) | ALTER TABLE ... SET PROPERTIES ('data_retention_days' = 'N') (separate post-creation statement) |
