November 12, 2024 — 0.9 Lakehouse Platform Product Update Release Notes
This release (Release 2024.11.12) introduces new features, enhancements, and bug fixes. The updates will be rolled out in phases to the following regions, and the rollout is expected to complete within one to two weeks of the release date, depending on your region.
- Alibaba Cloud Shanghai
- Tencent Cloud Shanghai
- Tencent Cloud Beijing
- Tencent Cloud Guangzhou
- AWS Beijing
- International — Alibaba Cloud Singapore
- International — AWS Singapore
Data Lake Usability Enhancements
- Automatic Schema Inference: Supports automatic schema inference for structured file formats (such as CSV, Parquet, and ORC files) stored in Volumes, so column names and data types no longer need to be known in advance.
- Federated Query Expansion:
  - Added federated query support for Databricks Unity Catalog
  - Hive federated query: Hive tables that use the Iceberg format can now be read
Intelligence
- Auto Index: Automatically recommends cluster keys and sort keys. Recommended columns can be used as sort keys, with preference given to columns that frequently appear in filter predicates. Setting these columns as table sort keys can accelerate query execution.
Incremental Computation
- Dynamic Table DML Support: DML commands are now supported for direct data correction. After data is modified with DML, the next refresh will be a full refresh. INSERT, DELETE, and TRUNCATE are currently supported; MERGE INTO and UPDATE are not yet supported. By default, DML against a dynamic table (DT) raises an error to prevent accidental operations. To enable DML, run `set cz.sql.dt.allow.dml = true;`.
- New Partitioned Dynamic Table: Partitioned dynamic tables are defined via `SESSION_CONFIGS()['dt.arg.xx']` and refresh incrementally. The refresh command must explicitly specify partitions: `REFRESH DYNAMIC TABLE dt PARTITION partition_spec;`. If these parameters are used on a regular table, the lakehouse does not reject the syntax, but the regular table will undergo a full refresh using `REFRESH DYNAMIC TABLE dt;`.
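The DML and refresh behavior described above might look like the following in practice. This is a minimal sketch, not documented usage: the dynamic tables `dt` and `dt_part`, their columns, and the partition column `ds` are hypothetical, and the literal form of `partition_spec` shown here is an assumption.

```sql
-- Hypothetical dynamic table `dt` with columns (id, amount); names are illustrative only.

-- DML on a dynamic table raises an error by default, so opt in first.
set cz.sql.dt.allow.dml = true;

-- Corrections supported in this release: INSERT, DELETE, and TRUNCATE.
INSERT INTO dt VALUES (1001, 25.00);
DELETE FROM dt WHERE id = 1002;

-- MERGE INTO and UPDATE are not yet supported on dynamic tables.

-- Because the data was modified with DML, the next refresh is a full refresh.
REFRESH DYNAMIC TABLE dt;

-- For a partitioned dynamic table (hypothetical `dt_part`), the partition
-- must be named explicitly. The (ds = '...') form is an assumption.
REFRESH DYNAMIC TABLE dt_part PARTITION (ds = '2024-11-12');
```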
SQL Capability Updates
Syntax Support
- CTE syntax now supports `INSERT INTO` at the beginning of the statement.
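A sketch of what a leading `INSERT INTO` combined with a CTE might look like, assuming the `INSERT INTO` target precedes the `WITH` clause. The tables `orders` and `daily_totals` and their columns are hypothetical.

```sql
-- Hypothetical tables: orders(order_date, amount) and daily_totals(order_date, total).
INSERT INTO daily_totals
WITH totals AS (
    -- Aggregate once in the CTE, then insert the result.
    SELECT order_date, SUM(amount) AS total
    FROM orders
    GROUP BY order_date
)
SELECT order_date, total
FROM totals;
```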
Function Support
| Function Name | Description |
|---|---|
| unnest | Expands elements of an array into multiple rows |
SDK Interfaces
- JDBC interface now supports the vector type.
Behavioral Changes
- Incremental Computation Dynamic Table: In the new version, when a change goes beyond simply dropping or adding columns or modifying the SELECT definition statement, a newly added column definition must only be passed through by the SELECT, without participating in any computation that affects other columns, in order to trigger an incremental refresh. If a newly added column participates in computation, the REFRESH task will degrade to a full refresh after `CREATE OR REPLACE` is executed.
- Quota Constraints:
  - Trial Account Limitation: The total number of data objects per instance is limited to 1,000
- Data Ingestion:
  - Kafka Pipe Adjustment: Minimum interval adjusted from 1 second to 10 seconds
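
To make the dynamic table behavioral change above more concrete, the sketch below contrasts a pass-through column with a computed one. The `CREATE OR REPLACE DYNAMIC TABLE ... AS SELECT` form is a simplified assumption (refresh options are omitted), and `src`, `dt`, and the column names are hypothetical.

```sql
-- Original definition (hypothetical).
CREATE OR REPLACE DYNAMIC TABLE dt AS
SELECT id, amount
FROM src;

-- Case 1: the new column `region` is only passed through by the SELECT,
-- so the next refresh is expected to remain incremental.
CREATE OR REPLACE DYNAMIC TABLE dt AS
SELECT id, amount, region
FROM src;

-- Case 2: the new column `discount` takes part in computing `amount`,
-- so the REFRESH task is expected to degrade to a full refresh.
CREATE OR REPLACE DYNAMIC TABLE dt AS
SELECT id, amount - discount AS amount, discount
FROM src;
```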
