Lakehouse Batch Insert Guide
Overview
Inserting data into tables is a fundamental operation in data warehouse construction. Singdata Lakehouse supports multiple data insertion methods, including single-row inserts, multi-row inserts, query result inserts, and overwrite inserts. This guide categorizes insertion methods by business scenario to help you quickly master efficient data loading techniques.
Quick Navigation
- Single-Row and Multi-Row Insert -- Insert small amounts of data using VALUES
- Query Result Insert -- Batch import using INSERT INTO ... SELECT
- Overwrite Insert -- Refresh partitions or entire tables using INSERT OVERWRITE
- Specified Column Insert -- Insert only certain columns; others use defaults or NULL
SQL Commands Covered
| Command | Purpose | Applicable Scenario |
|---|---|---|
INSERT INTO ... VALUES | Insert literal data | Test data, configuration tables, small record counts |
INSERT INTO ... SELECT | Insert query results | ETL data transfer, batch import |
INSERT OVERWRITE | Overwrite write | Refresh partition data, rebuild table data |
Prerequisites
The following examples use a simulated product table products:
Single-Row and Multi-Row Insert
Use the VALUES clause to insert one or more rows. Suitable for test data or small batches of configuration data.
Verify Results:
| product_id | product_name | category | price | stock |
|---|---|---|---|---|
| 1 | iPhone 15 | Phone | 8000 | 100 |
| 2 | MacBook Pro | Laptop | 15000 | 50 |
| 3 | AirPods | Audio | 1200 | 200 |
Query Result Insert
Use INSERT INTO ... SELECT to batch insert query results into the target table, the most common data writing method in ETL pipelines.
Verify Results:
| product_id | product_name | price |
|---|---|---|
| 1 | iPhone 15 | 8000 |
Overwrite Insert
Use INSERT OVERWRITE to overwrite existing data in the target table (or partition). Commonly used for daily data refreshes or rebuilding a specific partition.
Specified Column Insert
When only certain columns need to be inserted, explicitly specify column names. Unspecified columns will use default values or NULL.
Verify Results:
| product_id | product_name | category | price | stock |
|---|---|---|---|---|
| 4 | iPad Air | Tablet | NULL | NULL |
Clean Up Test Data
After completing insert verification, it is recommended to clean up test tables:
Important Notes
- Large Batch Import: For imports exceeding ten thousand rows, use
COPY INTOfrom a Volume or use data sync tasks. - Type Matching: Data types in
VALUESmust match the table definition. Types like DATE, TIME, JSON require literal syntax (e.g.,DATE '2024-06-01'). - Dynamic Table Limitation: Dynamic Tables do not support direct
INSERT INTO; data is refreshed automatically by upstream table changes. - Transactionality: A single
INSERToperation is atomic -- either all rows succeed or all fail.
