October 25, 2023 Lakehouse Platform Release Notes

Overview

This update (Release 2023.10.25) brings new features, enhancements, and security improvements to Singdata Lakehouse users. It will be rolled out gradually in the following regions:

  • Alibaba Cloud Shanghai Region
  • Tencent Cloud Shanghai Region
  • Alibaba Cloud Singapore Region

Note: Depending on your region, your product version may be updated within one to two weeks after the release date.

New Features and Enhancements

Stream Processing Tasks

Incremental Materialized Views (Beta)

We have introduced incremental materialized views, which refresh a materialized view incrementally based on changes to the data in its base tables. Because only the changed data is processed, a refresh consumes significantly fewer resources while the view retains real-time data processing capabilities. Combined with the real-time ingestion feature of the data ingestion service, this lets you quickly build an efficient real-time data processing workflow.
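
The idea behind incremental refresh can be illustrated with a minimal Python sketch. It is a conceptual model only, not the Lakehouse implementation: the function names `full_refresh` and `incremental_refresh` are hypothetical, and the "view" is assumed to be a simple per-key sum over append-only base rows.

```python
from collections import defaultdict


def full_refresh(base_rows):
    """Recompute the aggregate from every row in the base table."""
    totals = defaultdict(int)
    for key, amount in base_rows:
        totals[key] += amount
    return dict(totals)


def incremental_refresh(view, new_rows):
    """Fold only the newly appended rows into the existing view state."""
    updated = dict(view)
    for key, amount in new_rows:
        updated[key] = updated.get(key, 0) + amount
    return updated


# Hypothetical base table of (key, amount) rows.
base = [("a", 10), ("b", 5), ("a", 7)]
view = full_refresh(base)

# Two rows arrive later; only they are processed during the refresh.
delta = [("b", 3), ("c", 1)]
view = incremental_refresh(view, delta)
base.extend(delta)
```

In this model the incremental result matches a full recomputation over the enlarged base table, but the refresh touches two rows instead of five.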

Table Stream Change Data Capture (Beta)

The new Table Stream feature captures and records changes to table objects. A Table Stream is created on an existing table that you specify. It uses the multi-version mechanism and incremental identification capability of Lakehouse tables, so the change records of the source table can be retrieved through queries. The current version mainly supports capturing append operations on the source table.
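
As a rough mental model of append capture over a multi-version table, consider the following Python sketch. The classes `VersionedTable` and `AppendStream` are hypothetical and exist only for illustration; they assume that every write creates a new table version and that a stream remembers the last version it has consumed.

```python
class VersionedTable:
    """Append-only table that tags each row with the version that wrote it."""

    def __init__(self):
        self.version = 0
        self.rows = []  # list of (version, row) pairs

    def append(self, batch):
        self.version += 1
        self.rows.extend((self.version, row) for row in batch)


class AppendStream:
    """Tracks an offset into the table's version history."""

    def __init__(self, table):
        self.table = table
        self.offset = table.version  # rows written earlier are not reported

    def peek(self):
        """Return rows appended after the offset without advancing it."""
        return [row for version, row in self.table.rows if version > self.offset]

    def consume(self):
        """Return pending rows and advance the offset to the latest version."""
        changes = self.peek()
        self.offset = self.table.version
        return changes


table = VersionedTable()
table.append([1, 2])          # written before the stream exists
stream = AppendStream(table)
table.append([3])
table.append([4, 5])

first = stream.consume()      # rows appended since the stream was created
second = stream.consume()     # nothing new since the previous call
```

Here `first` holds the three rows appended after the stream was created and `second` is empty, which is the behavior a consumer relies on to process each change once.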

Data Import & Export

Real-time Upsert Write API

The real-time data loading service now provides a real-time Upsert write API, which supports writing database CDC (Change Data Capture) data into Lakehouse tables in real time. Using tools such as the Flink Connector and the SDK, users can apply database changes to Lakehouse tables as they occur, improving the timeliness of data processing.
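
Upsert semantics can be sketched in a few lines of Python. This is not the Lakehouse SDK or the Flink Connector API: the function `apply_changes`, the event tuple layout, and the handling of delete events are all assumptions made for illustration, with the table modeled as a dictionary keyed by primary key.

```python
def apply_changes(table, changes):
    """Apply CDC-style events to a table keyed by primary key.

    Each event is a hypothetical (op, key, row) tuple. An "upsert" inserts
    the row if the key is new and overwrites it otherwise.
    """
    for op, key, row in changes:
        if op == "upsert":
            table[key] = row
        elif op == "delete":
            table.pop(key, None)  # deleting a missing key is a no-op
        else:
            raise ValueError(f"unknown operation: {op}")
    return table


table = {1: {"name": "alice"}}
changes = [
    ("upsert", 2, {"name": "bob"}),      # new key: insert
    ("upsert", 1, {"name": "alice-2"}),  # existing key: update
    ("delete", 2, None),                 # remove the row inserted above
]
result = apply_changes(table, changes)
```

Because each event is keyed, replaying the same batch leaves the table in the same final state, which is why upsert writes suit CDC pipelines that may redeliver events.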

Data Lake Management and Analysis

Singdata Lakehouse now lets users access and manage data held in cloud providers' object storage. With the Lakehouse SQL engine, or with AI and large language models (LLMs), users can run analyses in a variety of scenarios, such as geospatial data analysis, image parsing, and processing of special file formats. Users can also apply the Singdata Lakehouse permission system to control access to object storage data in the cloud. Specific features include:

  • STORAGE Connection Type: A new STORAGE connection type is added to store the authentication and connection information required to access object storage. It supports accessing object storage using Access Key Pair and Role authentication methods.
  • Data Lake Volume Objects: Data lake volume objects have been improved with metadata localization, which enhances the management and governance of data lake data.
  • get_presigned_url Function: A new get_presigned_url function is added to generate access links with temporary tokens for files in object storage.
  • PUT/GET Command Implementation: The PUT and GET commands have been implemented, allowing users to upload data from the local file system to a volume and download data from a volume, using tools such as the CLI, JDBC, and the SDK.
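
To illustrate what an access link with a temporary token involves, the Python sketch below signs a path together with an expiry time and rejects links that are expired or altered. It does not show how get_presigned_url works internally and does not follow any cloud provider's signing scheme; the helpers `presign` and `verify`, the query parameter names, and the HMAC-SHA256 choice are assumptions.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse


def presign(secret: bytes, path: str, expires_at: int) -> str:
    """Return the path with an expiry and a signature over both values."""
    payload = f"{path}\n{expires_at}".encode()
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires_at, "signature": signature})
    return f"{path}?{query}"


def verify(secret: bytes, url: str, now: int) -> bool:
    """Accept the link only if it is unmodified and not yet expired."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    payload = f"{parsed.path}\n{expires_at}".encode()
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature, expected) and now <= expires_at


secret = b"demo-secret"                       # hypothetical signing key
url = presign(secret, "/volume/data.csv", expires_at=1000)

valid_before_expiry = verify(secret, url, now=999)
valid_after_expiry = verify(secret, url, now=1001)
valid_if_tampered = verify(secret, url.replace("data.csv", "other.csv"), now=999)
```

The holder of the link needs no credentials of their own: the signature proves that the link was issued by a party that knows the key, and the expiry bounds how long it can be used.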

Security Management

Time Travel Query (Beta)

The Time Travel query feature lets users query data as it existed at any point within a defined retention period, including data that has since been changed or deleted. This is particularly valuable for data recovery and auditing.
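
A minimal Python sketch of the concept follows, assuming the table keeps one snapshot per commit and a query selects the latest snapshot at or before the requested time. The class `TimeTravelTable` and its methods are hypothetical, and the retention window is not modeled.

```python
import bisect


class TimeTravelTable:
    """Keeps every committed snapshot so past states remain queryable."""

    def __init__(self):
        self._timestamps = []  # commit times, strictly increasing
        self._snapshots = []   # table contents at each commit

    def commit(self, timestamp, rows):
        if self._timestamps and timestamp <= self._timestamps[-1]:
            raise ValueError("timestamps must be strictly increasing")
        self._timestamps.append(timestamp)
        self._snapshots.append(dict(rows))

    def as_of(self, timestamp):
        """Return the table contents as of the given time."""
        index = bisect.bisect_right(self._timestamps, timestamp)
        if index == 0:
            raise LookupError("no snapshot at or before the requested time")
        return dict(self._snapshots[index - 1])


table = TimeTravelTable()
table.commit(100, {1: "alice", 2: "bob"})
table.commit(200, {1: "alice"})        # row 2 is deleted at time 200

before_delete = table.as_of(150)       # still contains row 2
after_delete = table.as_of(200)        # row 2 is gone
```

Reading `before_delete` recovers the row that was deleted at time 200, which is the recovery and audit scenario described above.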

Storage Encryption

Singdata Lakehouse now supports encrypting stored data at the workspace level, using encryption keys managed by the platform. When creating a workspace, users can choose whether to encrypt the data within it. Encryption is disabled by default and can be enabled as needed.

SQL Capability Updates

Data Types

  • The varchar and char types can now be declared without an explicit length, in which case a default length is used.
  • Added support for interval ..week.
  • The interval format has been extended, allowing time units to be written into the string, for example: interval '365 day'.
  • Added support for type conversion.

New SQL Functions

Ecosystem Tools

  • JDBC client supports PUT / GET commands: The JDBC client now supports the PUT and GET commands for uploading data to and downloading data from Volume objects.
  • SQL Syntax Conversion Tool (Preview): Added a tool that converts DorisDB SQL to Singdata Lakehouse SQL syntax, enabling rapid migration of DorisDB SQL jobs to Singdata Lakehouse.
  • JDBC supports connecting to Lakehouse services using the HTTP protocol.

Platform Optimization

  • The Lakehouse platform control service and compute service now support online hot upgrades, avoiding service interruptions during version upgrades.
  • Optimized compaction concurrency control to improve compaction efficiency.

Behavior Changes

  • The default maximum length for the Varchar data type, when not specified, has been adjusted from 65535 to 2147483647.

Bug Fixes

  • SQL: Fixed an issue where constant columns referenced by their aliases in GROUP BY were not recognized.