2025.05.20 Lakehouse Studio Release Notes

This release (2025.05.20) introduces new features, enhancements, and fixes. The updates will be rolled out gradually over one to two weeks from the release date, depending on your region, to the following regions:

  • Tencent Cloud Shanghai
  • Tencent Cloud Beijing
  • Tencent Cloud Guangzhou
  • Alibaba Cloud Shanghai
  • Alibaba Cloud Singapore
  • AWS Singapore
  • AWS Beijing

Incompatible Changes

  • None

New Features

Data Sync

  • Batch Sync: CSV files exported to AWS S3 can now use a tab character as the field delimiter.
  • Batch Sync: One-click table creation for a Lakehouse target now automatically detects primary key fields in source tables that have a primary key and includes them in the generated CREATE TABLE statement.
  • Real-time Sync: Custom extended fields are now supported and can be set as part of a composite primary key, for scenarios that require additional identification of source data.
  • Real-time Sync: The task details monitoring page now shows the number of source rows that require full synchronization, which helps you assess overall data volume and sync progress.
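As an illustration of the tab-delimiter option, the following minimal Python sketch shows what a tab-delimited file looks like and how a downstream consumer might parse one. It uses only the Python standard library; the function names `write_tab_delimited` and `read_tab_delimited` and the sample rows are hypothetical, and the sketch does not describe how Lakehouse Studio itself writes files (quoting, encoding, and line endings in the actual export may differ).

```python
import csv
import io


def write_tab_delimited(rows):
    """Serialize rows (lists of strings) using a tab as the field delimiter."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def read_tab_delimited(text):
    """Parse tab-delimited text back into a list of rows."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [row for row in reader]


# Hypothetical sample data: a comma inside a value needs no quoting,
# because the comma is no longer the delimiter.
rows = [["id", "comment"], ["1", "contains, a comma"], ["2", "plain"]]
text = write_tab_delimited(rows)
parsed = read_tab_delimited(text)
```

A tab delimiter is often chosen when values themselves contain commas, since those values can then be stored without quoting.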

Task Development

  • Overview: This release adds batch addition of downstream nodes within task groups, bucketing for visualizations, and bulk download history management to help you manage tasks, analyze data, and retrieve results more efficiently. Details are as follows:
  • Task Groups: Starting from a specific node, you can now batch-add the downstream child nodes in that task's lineage chain to a task group. You can add a single level, all levels, or a custom number of levels. Note: The task lineage chain refers to task chains that have been submitted and published to Operations Center.
  • Task Query Results: The X-axis of visualization charts now supports bucketing. For time and date types, data can be aggregated by time unit, such as year, month, or day.
  • Bulk Download: Added history management for bulk data downloads. After you click download, you can view download records from the past 3 days. Once a download link is generated, click it to download the file. Download links are retained for a maximum of 1 hour.
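To make the idea of X-axis bucketing concrete, the sketch below groups timestamps by year, month, or day and counts the rows in each bucket. It is a conceptual illustration in plain Python, not the chart's actual implementation; the `bucket_counts` function, the `FORMATS` mapping, and the sample events are assumptions introduced for this example, and the product may offer other units or aggregations.

```python
from collections import Counter
from datetime import datetime

# Hypothetical mapping from a bucketing unit to a label format.
FORMATS = {"year": "%Y", "month": "%Y-%m", "day": "%Y-%m-%d"}


def bucket_counts(timestamps, unit):
    """Count timestamps per bucket, where the bucket is set by the time unit."""
    fmt = FORMATS[unit]
    return dict(Counter(ts.strftime(fmt) for ts in timestamps))


# Hypothetical query result: three event timestamps.
events = [
    datetime(2025, 5, 1, 9, 30),
    datetime(2025, 5, 20, 14, 0),
    datetime(2025, 6, 2, 8, 15),
]

by_month = bucket_counts(events, "month")  # two buckets: 2025-05 and 2025-06
by_year = bucket_counts(events, "year")    # one bucket: 2025
```

Choosing a coarser unit produces fewer, larger buckets, which is usually what makes a chart over a long time range readable.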

Operations Center

  • Scheduled Task List: The scheduled task list now shows the run time and status of each scheduled task's most recent instance.

Monitoring & Alerting

  • Added health monitoring for real-time sync tasks to detect whether data is being consumed and processed normally (based on CheckPoint).
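The release notes state only that the health check is based on CheckPoint and do not describe the rule that is applied. One plausible way to think about such a check, shown purely as an assumption, is to flag a task when its most recent checkpoint is older than a chosen threshold. The function name `is_stalled` and the 10-minute default below are hypothetical and are not taken from the product.

```python
from datetime import datetime, timedelta


def is_stalled(last_checkpoint_at, now, max_age=timedelta(minutes=10)):
    """Return True if the latest checkpoint is older than max_age.

    Assumption: a task that keeps consuming and processing data also keeps
    completing checkpoints, so a stale checkpoint suggests a stalled task.
    """
    return now - last_checkpoint_at > max_age


now = datetime(2025, 5, 20, 12, 0)
recent = is_stalled(datetime(2025, 5, 20, 11, 55), now)  # checkpoint 5 minutes old
stale = is_stalled(datetime(2025, 5, 20, 11, 30), now)   # checkpoint 30 minutes old
```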

Optimizations

  • Product-wide: Refined the global page design and improved how pages adapt to browser width.

  • Data Sync:

    • Optimized batch sync page loading performance, improving page response speed and smoothness of operation.
    • Field mapping in batch integration tasks now supports manually selecting fields and copying field information.
    • SLS batch data integration now supports parameters that specify the start time and end time of data consumption.
    • Optimized namespace selection behavior when Kafka is used as the source in batch sync.
    • When you create a new data source for single-table real-time sync, only data sources that support real-time reading are listed.
  • Task Development:

    • Improved the operation experience within task groups, including quick operation methods on the editing canvas, the overall page layout, a new operations jump link, and clearer prompts when moving or adding task group nodes.
    • Column settings for SQL task result tables now include search, so you can quickly find and filter the columns displayed.
    • Improved the prompts shown in the visualization chart area when data query results return no field values.
    • When you select a data table from the data tree and click query, if the script cursor is not currently in the code area, the inserted query is now automatically positioned in the code area.
    • Improved page interaction when rendering is slow because data query results contain very large data volumes.
    • The task parameter configuration dialog now supports editing and saving directly. After you configure the parameters and click save, both the current script code and the parameter information are saved.
    • The parameter value box now wraps long values so that the full content is displayed.
    • Added an entry point to the cluster management page in the cluster selection area.
    • Refined how thousands separators (commas) are displayed in data query results.
  • Data Catalog:

    • Data preview in data details now lets you select from all available clusters under the current instance.
    • Added a force refresh option to the table details page in data management, so you can manually refresh the page to get the latest metadata.
  • Data Upload: Optimized backend performance for data upload, improving file upload response speed.

  • Operations Center: Optimized default configuration for instance reruns, with "include current node" selected by default.

  • Data Quality:

    • When you create a quality rule, the current workspace is now filled in automatically and cannot be switched.
    • Quality rule IDs are now displayed more prominently on the page to help distinguish rules with identical or similar names.
    • The validation content of quality rules now shows the rule type and validation details.

Bug Fixes

  • Data Sync: Fixed an issue where incremental sync produced incorrect results for the bit and varbit data types of PostgreSQL data sources.
  • Task Development: Fixed an issue where parameter values did not update accordingly when the parameter value source was changed from task group parameters to task parameters.
  • Data Catalog: Fixed an issue where custom service users were not visible or selectable on the authorization page.

Known Limitations

  • There are no known major limitations introduced in this release.