December 12, 2024 Lakehouse Studio Release Notes

This release (Release 2024.12.12) introduces a series of new features, enhancements, and fixes. The updates are rolled out gradually to the regions listed below and will be completed within one to two weeks of the release date, depending on your region.

  • Tencent Cloud Beijing Region
  • Tencent Cloud Shanghai Region
  • Tencent Cloud Guangzhou Region
  • Alibaba Cloud Shanghai Region
  • Alibaba Cloud Singapore Region
  • Amazon Cloud Beijing Region
  • Amazon Cloud Singapore Region

Incompatible Changes

  • Permission Changes: This release includes fine-grained permission control optimizations. The original instance_sre and workspace_sre roles no longer have task development and release permissions in all workspaces; their permissions are reduced to read-only. After the adjustment, only the workspace_admin and workspace_dev roles can create and edit tasks, configure scheduling properties, and submit releases within a workspace. This adjustment does not affect the original roles' access to data or the normal execution of periodic scheduling tasks.
  • Data Quality: Quality verification result data is now retained for a maximum of 3 months.
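
The permission change above can be read as a role-to-action table. The following is a minimal sketch under stated assumptions, not the product's actual permission model: the action names (`view`, `create_task`, and so on) and the `can` helper are hypothetical, and only the four role names and the capabilities described in this note come from the release itself. It covers only the task development actions named above, not any other permissions these roles may hold.

```python
# Hypothetical action names; only the role names come from the release note.
READ_ONLY = frozenset({"view"})
DEVELOP = frozenset(
    {"view", "create_task", "edit_task", "configure_scheduling", "submit_release"}
)

# Role-to-action table after the 2024.12.12 adjustment, as described in the note.
ROLE_ACTIONS = {
    "instance_sre": READ_ONLY,     # reduced to read-only
    "workspace_sre": READ_ONLY,    # reduced to read-only
    "workspace_admin": DEVELOP,    # may develop and release tasks
    "workspace_dev": DEVELOP,      # may develop and release tasks
}


def can(role: str, action: str) -> bool:
    """Return True if the role is allowed to perform the action (unknown roles get nothing)."""
    return action in ROLE_ACTIONS.get(role, frozenset())


print(can("workspace_dev", "submit_release"))
print(can("workspace_sre", "edit_task"))
```

A table like this can help when auditing which users need to be moved to workspace_admin or workspace_dev before the rollout reaches your region.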

New Features

  • Product Global: Supports visual (GUI-based) development, scheduling orchestration, and operation and maintenance monitoring for Databricks task nodes (SQL and Notebook).
  • Data Source: Added an Amazon Redshift data source, which supports writing data through offline synchronization tasks.
  • Data Synchronization: Unified the resource pool; synchronous computing clusters can now be configured in tasks.
  • Task Development: Added intelligent parsing of scheduling dependencies and task outputs for offline synchronization tasks.
  • Monitoring and Alerts: Added support for monitoring and alerting on scheduling run delay metrics for periodic task instances.
  • Computing Cluster: Added synchronous computing clusters, which can be selected for use in offline and real-time data synchronization tasks.

Optimizations and Improvements

  • Product Global: Adjusted and optimized the functional permissions of built-in roles. Based on the latest built-in roles, permission restrictions were adjusted across product scenarios, including workspace, development, task operation and maintenance, data source, cluster, and job history.
  • Account Center: Optimized the display of the billing bar chart for the past 30 days on the account homepage.
  • Data Synchronization: When the source table has more fields than the target table, the extra source table fields are now listed.
  • Data Synchronization: For offline synchronization task configuration, the default field mapping is now same-name mapping, in which fields are matched by column name.
  • Data Synchronization: For multi-table real-time synchronization in mirror synchronization mode, no database tables are selected by default.
  • Task Development: Optimized page loading to resolve the stuttering that occurred when opening offline integration tasks and task group DAGs.
  • Task Development: Added a horizontal view mode to the task group DAG to make better use of space on the task orchestration page.
  • Task Development: Improved handling of overly long parameter names, which previously caused abnormal value input when running code with parameters.
  • Operation and Maintenance Center: The instance operation and maintenance search box supports searching for temporary instances by task name.
  • Operation and Maintenance Center: Optimized the log information of offline synchronization tasks; complete error logs are now exposed to help locate the key information about an issue.
  • Computing Cluster: Computing cluster specifications can now be adjusted at a finer granularity to improve resource utilization. Specifications are now expressed numerically in units of CRU (Compute Resource Unit), such as 1CRU or 2CRU, instead of size codes such as XSmall or Large.
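
Two of the Data Synchronization items above (same-name default field mapping, and listing the extra source fields when the source has more fields than the target) can be illustrated together. This is a rough sketch of the idea only, not Lakehouse Studio's implementation: the function name `map_fields_by_name` and the sample column names are hypothetical, and exact, case-sensitive name matching is an assumption.

```python
def map_fields_by_name(source_fields, target_fields):
    """Map source fields to target fields with the same column name.

    Returns (mapping, extra_source_fields). Matching is exact and
    case-sensitive, which is an assumption of this sketch.
    """
    target_set = set(target_fields)
    # Same-name mapping: a source column maps to the identically named target column.
    mapping = {name: name for name in source_fields if name in target_set}
    # Source columns with no same-name target column are reported as extras.
    extra = [name for name in source_fields if name not in target_set]
    return mapping, extra


# Hypothetical tables: the source has one more field than the target.
mapping, extra = map_fields_by_name(
    ["id", "name", "created_at", "debug_flag"],
    ["id", "name", "created_at"],
)
print(mapping)  # {'id': 'id', 'name': 'name', 'created_at': 'created_at'}
print(extra)    # ['debug_flag']
```

Under this reading, fields reported as extras are the ones to review manually before running the task.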

Bug Fixes

  • Data Synchronization: Fixed an issue where a multi-table real-time synchronization task that was deleted on another page caused a page error; after deletion, the page now defaults to a new page.
  • Data Synchronization: Fixed an issue where a tar.gz file was imported successfully in offline synchronization but the resulting data was incorrect.
  • Task Development: Fixed an issue on the instance details page of the Operation and Maintenance Center where, when the task lineage had a very large number of levels, locating failed after clicking the "Error Status" category.
  • Task Development: Fixed an issue where scheduling parameters and external parameters interfered with each other in task development.

Known Limitations

  • Synchronous computing clusters do not currently support querying cluster load and usage.