March 22, 2024 Lakehouse Studio Release Notes

This release (Release 2024.03.22) introduces the following new features, updates (enhancements and fixes), and behavior changes. The updates will be gradually rolled out to the following regions:

  • Alibaba Cloud Shanghai Region
  • Tencent Cloud Shanghai Region
  • Alibaba Cloud Singapore Region
  • Tencent Cloud Beijing Region

Note: The updates will be completed within one to two weeks from the release date, depending on your region.

Incompatible Changes

  • Monitoring Alerts: In rule configuration, trigger conditions now support only one monitoring item. Previously, up to three items could be configured simultaneously.

New Features

  • Global: Support for per-tenant custom logo replacement
  • Data Integration: Added multi-table real-time synchronization, with second-level data freshness [Beta]
  • Data Integration: Offline synchronization of ES data sources supports data filtering and dynamic index name configuration
  • Data Integration: Offline synchronization of ES supports extracting values from JSON inner fields
  • Data Integration: Real-time synchronization of PostgreSQL supports converting UUID type to String type
  • Data Integration: Support for dirty data configuration, allowing users to view and download samples of dirty data.
  • Data Integration: Multi-table real-time synchronization tasks now support a "backfill synchronization" function, which can be run on a single table or on multiple tables in a batch.
  • Data Integration: Added export of Lakehouse CDC data to Mongo [Beta]
  • Data Integration: Real-time synchronization supports batch selection of tables for backfill by tenant
  • Task Development: Added task groups and task group parameters [Beta]
  • Task Development: The JDBC node now supports the PostgreSQL (PG) type
  • Task Development: Sensitive information such as AK is now masked
  • Task Scheduling: Completed support for hot upgrade and seamless release
  • Task Scheduling: Support for setting the time interval for automatic rerun of tasks
  • Task Operations: Added a data backfill function on DAG nodes. After selecting the backfill date range, you can preview the instance information that will be generated
  • Monitoring Alerts: Comprehensive upgrade of the monitoring alerts section, with page layout adjustments and optimization of user operation paths. Added multiple trigger rule calculation methods for metric monitoring
  • Monitoring Alerts: Added monitoring for multi-table real-time synchronization tasks, including the metric monitor "multi-table real-time synchronization task delay" and the event monitors "multi-table real-time synchronization task failure" and "multi-table real-time synchronization task target table field change failure"
  • Data Management: Data upload now supports adding data while creating a new table. More types of local files can also be uploaded, including the newly added Parquet, Avro, and ORC file types, along with support for custom column delimiters and more null value representations, among other options.
  • Open System: Provides the first version of OpenAPI for customer use

Optimizations and Improvements

  • Global: Added an option to display up to 100 rows per page in tables.
  • Global: Added a DataGPT product entry point on the account homepage.
  • Global: Optimizations and improvements in product interaction and user experience
  • Data Integration: Offline synchronization tasks now pre-check read permissions, field type matching, etc., to avoid task failure at the final write stage
  • Data Integration: Offline synchronization task concurrency can now be entered manually
  • Task Development: Enhanced the version comparison capability and expanded the area from which content can be copied.
  • Job History: DAG graph supports scrolling operations using the touchpad, optimized interaction design
  • Task Operations: The instance execution log now shows the time consumed on the engine server side
  • Task Operations: Task name search is now case-insensitive
  • Security Center: System service accounts are hidden when adding users to the workspace to prevent selection
  • Monitoring Alerts: Optimized the display of monitoring alert information for real-time synchronization tasks, detailing specific individual tables
  • Data Management: Optimized the display and drag-and-drop functionality of the data directory tree.

Bug Fixes

  • Real-time Synchronization: Fixed the issue where task failover information was displayed as empty
  • Offline Synchronization: Fixed the page error issue when switching OSS data sources
  • Task Development: Fixed the issue of missing dependency relationships for some tables in scheduling configuration
  • Data Management: Fixed the issues where data lineage was inaccurate and only a single dependency was displayed when multiple dependencies existed
  • Security Center: Fixed the issue where users with the workspace_admin role could not grant permissions
  • Monitoring Alerts: Fixed the issue of inaccurate timeout monitoring for data quality tasks
  • Data Security: Fixed the issue of SQL errors when performing operations on multiple selected objects while granting data permissions to new users
  • Metering and Billing: Fixed the issue where CRU metering data was displayed incorrectly

Known Limitations

  • Only offline integration tasks support dirty data management functionality.