In this release (Release 2025.04.22), we have introduced a series of new features, enhancements, and fixes. These updates will be rolled out in phases to the following regions and are expected to be completed within one to two weeks from the release date, depending on your specific region.
Domestic Regions
- Alibaba Cloud (Shanghai)
- Tencent Cloud (Shanghai/Beijing/Guangzhou)
- AWS (Beijing)
International Regions
- Alibaba Cloud (Singapore)
- AWS (Singapore)
New Features and Enhancements
Federated Query Enhancement
- **【Preview Release】** ORC Format Support: External Catalog/External Schema now supports read and write operations for ORC-formatted tables from HMS.
Data Import and Export Optimization
Import Command COPY INTO <table>
- Smart line break recognition: Supports both `\r\n` and `\n` formats.
- `on_error=abort|continue` strategy:
  - `continue` mode: Skips compression format errors and continues execution.
  - `abort` mode: Terminates immediately upon encountering an error.
- After execution, the command can display the list of imported files.
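
The difference between the two `on_error` modes, together with the `\r\n`/`\n` handling, can be illustrated with a minimal Python sketch. This is not the Lakehouse implementation: the function names `split_rows` and `load_files` and the in-memory `files` mapping are hypothetical, and gzip is assumed as the compression format purely for illustration.

```python
import gzip
import zlib


def split_rows(text):
    """Split text into rows, treating both \\r\\n and \\n as line breaks."""
    rows = text.replace("\r\n", "\n").split("\n")
    # A trailing line break produces an empty final element; drop it.
    if rows and rows[-1] == "":
        rows.pop()
    return rows


def load_files(files, on_error="abort"):
    """Hypothetical loader: `files` maps file name -> raw gzip bytes.

    Returns (imported, skipped), where imported is a list of
    (file name, row count) pairs and skipped is a list of file names.
    """
    if on_error not in ("abort", "continue"):
        raise ValueError("on_error must be 'abort' or 'continue'")
    imported, skipped = [], []
    for name, raw in files.items():
        try:
            text = gzip.decompress(raw).decode("utf-8")
        except (OSError, EOFError, zlib.error):
            if on_error == "abort":
                raise  # abort: stop at the first compression error
            skipped.append(name)  # continue: skip the file and carry on
            continue
        imported.append((name, len(split_rows(text))))
    return imported, skipped
```

With `on_error="continue"`, a corrupt archive ends up in the skipped list while the remaining files are still loaded; with `on_error="abort"`, the same input raises on the first bad file.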
Export Command COPY INTO <location>
- `writebom=true` parameter: Writes a BOM header when exporting CSV files, which resolves garbled Chinese characters when the file is opened in Excel and improves cross-platform compatibility.
- `overwrite=true` parameter: Empties the target folder (including subdirectories) before exporting.
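
To see what a BOM header changes at the byte level, here is a small Python sketch using only the standard library. The helper `export_csv` and its `write_bom` flag are hypothetical stand-ins for the export option, not part of any Lakehouse API.

```python
import codecs
import csv
import io


def export_csv(rows, write_bom=False):
    """Serialize rows to CSV bytes, optionally prefixed with a UTF-8 BOM."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    # The "utf-8-sig" codec prepends the three BOM bytes EF BB BF on encode.
    encoding = "utf-8-sig" if write_bom else "utf-8"
    return buf.getvalue().encode(encoding)


rows = [["名称", "值"], ["张三", "1"]]
with_bom = export_csv(rows, write_bom=True)
without_bom = export_csv(rows, write_bom=False)

# The payload is identical; only the 3-byte prefix differs.
assert with_bom == codecs.BOM_UTF8 + without_bom
```

Excel relies on this prefix to detect UTF-8; without it, the file is typically interpreted in the system's legacy code page, which is what garbles Chinese text.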
Pipe Function Enhancement
- Continuous Pipe file import: Supports ignoring errors (skipping compression format errors) and continuing execution via the `on_error=continue` parameter.
SQL Features
- INTERVAL Type Extension: Allows expressions in `INTERVAL expr unit`, such as `interval 1+2 year`.
- Metadata Display: The `DESC`/`SHOW` commands now display schema information of the share type.
- Data Sampling: Added TABLESAMPLE sampling syntax for efficient data sampling analysis.
- Vector Search: Supports the `ef` parameter, which can be configured before executing a query: `set cz.vector.index.search.ef=64;`
- Inverted Index: When creating an inverted index, supports the `'mode' = 'max_word'` parameter to enable a more fine-grained tokenization mode: `properties ('analyzer' = 'chinese', 'mode' = 'max_word');`
Functions
Built-in Functions
- Improved Performance of GET_JSON_OBJECT: The implementation of GET_JSON_OBJECT has been optimized to enhance JSON parsing efficiency.
Custom SQL Functions
- SQL Function Enhancement: Supports creating user-defined SQL functions of the RETURNS TABLE type.
UDF (User-Defined Functions)
External Function now supports referencing resource files via VOLUME addresses.
- User Volume Address Format: `volume:user://~/upper.jar`
  - `user` indicates the use of the User Volume protocol.
  - `~` represents the current user and is a fixed value.
  - `upper.jar` is the name of the target file.
- Table Volume Address Format: `volume:table://table_name/upper.jar`
  - `table` indicates the use of the Table Volume protocol.
  - `table_name` is the name of the table; replace it with the actual table name.
  - `upper.jar` is the name of the target file.
- Volume Address Format: `volume://volume_name/upper.jar`
  - `volume_name` is the name of the created volume.
  - `upper.jar` is the name of the target file.
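
The three address formats share one shape, which a short Python sketch can make explicit. The parser below is illustrative only: `VolumeAddress`, `parse_volume_address`, and the label `"named"` for the plain `volume://` form are hypothetical names, and the regular expression is inferred from the three examples above rather than from a formal grammar.

```python
import re
from typing import NamedTuple


class VolumeAddress(NamedTuple):
    kind: str   # "user", "table", or "named" (hypothetical label)
    owner: str  # "~" for User Volume, else the table or volume name
    path: str   # target file, e.g. "upper.jar"


# volume[:user|:table]://<owner>/<path>
_PATTERN = re.compile(
    r"^volume(?::(?P<proto>user|table))?://(?P<owner>[^/]+)/(?P<path>.+)$"
)


def parse_volume_address(addr):
    """Split a VOLUME address into its protocol, owner, and file path."""
    match = _PATTERN.match(addr)
    if match is None:
        raise ValueError(f"not a recognized volume address: {addr!r}")
    kind = match.group("proto") or "named"
    owner = match.group("owner")
    # For User Volume the owner segment is the fixed value "~".
    if kind == "user" and owner != "~":
        raise ValueError("User Volume addresses use the fixed value '~'")
    return VolumeAddress(kind, owner, match.group("path"))
```

For example, `parse_volume_address("volume:table://table_name/upper.jar")` yields kind `"table"`, owner `"table_name"`, and path `"upper.jar"`.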
Volume
- 【Preview Release】 New internal Named Volume object: A Named Volume is a user-defined storage location, primarily used for staging data files before importing them into a table. Unlike the automatically created user-level (User Volume) and table-level (Table Volume) volumes, a Named Volume must be explicitly created by the user and offers more flexible configuration options, which makes it better suited for team collaboration and complex data loading scenarios. An internal Volume is stored in the internal storage space managed by Cloud Lakehouse, so no additional cloud storage configuration is needed.
Caching
- Preload Cache Status Query: Previously, cache usage could only be checked via `SHOW VCLUSTER vcname PRELOAD CACHED STATUS` while the virtual cluster was running. The command now also works when the virtual cluster is paused, reporting the cache status and indicating the cluster's running state.
INFORMATION SCHEMA
- The Information Schema has added the OBJECT_PRIVILEGES view, which allows querying all data object privilege grants within the system:
  - It can directly query all privileges granted to a specified user (including those indirectly obtained through roles).
  - It can directly query which users have been granted privileges on a specified object (such as a table or view, including those indirectly granted through roles).
  - It does not currently support querying privileges granted on functions.
  - The authorization data within the view may lag real-time data by up to 15 minutes.
SDK
Python SDK
The Python SDK has been enhanced to support the SQLAlchemy 2 interface.
Bug Fixes
- Python SDK
  - Fixed the issue of the `hints` parameter being ineffective in the `executemany` method.
  - Resolved the error in the Bulkload SDK when writing to partitioned tables.
  - Fixed the error in the Python SDK when executing the `optimize` syntax.
Behavioral Changes
- Simplified External Function Authorization: Changed from requiring both `USE FUNCTION` and `VOLUME` usage privileges to only needing the `USE FUNCTION` privilege.
- PIPE Function: Added a new `failed` state (previously only RUNNING and PAUSED states existed). This lets monitoring systems capture exceptions and set up alerts on the FAILED state. The state can be viewed via `DESC <pipe_name>`.