CONNECTION
In Lakehouse, a CONNECTION object stores the authentication information and access management entities for a third-party service. Because users reference the CONNECTION instead of the credentials themselves, authentication information does not need to be exposed in plain text during data processing. CONNECTION also supports STS (Security Token Service) authentication, which allows cross-account authorization to access services outside of Lakehouse.
Types of CONNECTION
Depending on its purpose and the external service it connects to, a CONNECTION object is one of the following three types:
- API Connection: Stores and protects the authentication information of third-party application services so that Lakehouse can call them securely through their APIs. The external services currently supported are Alibaba Cloud's Function Compute (FC) and Tencent Cloud's Cloud Function service. API Connection is usually used together with Remote Function to call these services from within Lakehouse.
- Storage Connection: Stores the authentication information of third-party storage services so that Lakehouse can securely access and manage the data they hold. The external storage services currently supported are Alibaba Cloud's Object Storage Service (OSS), Tencent Cloud's COS, and AWS S3.
- Catalog Connection: Associates the data lake with an external metadata store, such as Hive Metastore. With a Catalog Connection, users can manage and access metadata in one place and read data stored in external systems directly. Lakehouse currently supports connecting only to Hive.
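The three types and the services listed for each can be pictured as a small data model. The following Python sketch is purely illustrative and is not Lakehouse syntax or a Lakehouse API: the names `ConnectionType`, `SUPPORTED_SERVICES`, `Connection`, and the service identifiers are all hypothetical. It shows two ideas from the text: each type accepts only its own set of external services, and the stored credentials stay out of printed output.

```python
from dataclasses import dataclass, field
from enum import Enum


class ConnectionType(Enum):
    """The three CONNECTION types described above (hypothetical model)."""
    API = "api"
    STORAGE = "storage"
    CATALOG = "catalog"


# External services per type, taken from the list above.
# The string identifiers are made up for this sketch.
SUPPORTED_SERVICES = {
    ConnectionType.API: {"alibaba_fc", "tencent_cloud_function"},
    ConnectionType.STORAGE: {"oss", "cos", "s3"},
    ConnectionType.CATALOG: {"hive"},
}


@dataclass(frozen=True)
class Connection:
    name: str
    type: ConnectionType
    service: str
    # repr=False keeps the credentials out of repr() and print() output.
    credentials: dict = field(default_factory=dict, repr=False)

    def __post_init__(self):
        # Reject a service that the chosen connection type does not support.
        if self.service not in SUPPORTED_SERVICES[self.type]:
            raise ValueError(
                f"{self.service!r} is not a supported "
                f"{self.type.value} connection service"
            )


if __name__ == "__main__":
    conn = Connection(
        name="my_oss_conn",
        type=ConnectionType.STORAGE,
        service="oss",
        credentials={"access_key": "hypothetical-secret"},
    )
    # Callers refer to the connection by name; the secret is not printed.
    print(conn)
```

In this model, code that needs external access passes around `conn` or its name, which mirrors how a CONNECTION lets users avoid writing credentials in plain text.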