Create API CONNECTION
This type of CONNECTION is mainly used to store and protect the authentication information of third-party application services. Through API Connection, Lakehouse can securely interact with these services via API calls. Currently, the external services supported by API Connection include Alibaba Cloud Function Compute (FC) and Tencent Cloud Function Service.
Syntax
Syntax 1
- PROVIDER: Cloud function provider, such as TENCENT, ALIYUN, and AWS
- REGION: The region where the corresponding cloud function is located, such as 'cn-shanghai'. For Alibaba Cloud, refer to the link region, for Tencent Cloud refer to the link region such as: ap-beijing, for AWS refer to: China region endpoints, international endpoints region
- ROLE_ARN: The role assumed when creating the cloud function, such as acs:ram::12228000000000000:role/czudfrole
- CODE_BUCKET: The name of the object storage bucket where the cloud function program files are located.
- NAMESPCE: Required when using Tencent Cloud. For other cloud services, you can leave it blank or directly fill in 'default'. This value can be obtained as shown in the figure below
Syntax two
Parameter Description
connection_name
: The name of the connection to be created.TYPE
: Specifies the type of data source for the connection, such asCLOUD_FUNCTION
.WITH PROPERTIES
: Specifies the authentication and connection information required for the external data source.parameter_key
: Property key.parameter_value
: Property value.IF NOT EXISTS
: Optional parameter. If the specified connection already exists, no changes are made, and a message indicating the connection exists is returned; if not specified and the connection exists, an error message is returned.COMMENT
: Optional parameter for adding comment information.
Supported Data Source Types and Parameters
The following are the parameters required for the CLOUD_FUNCTION
type data source:
Parameter Name | Description | Example Value | Required |
---|---|---|---|
cloud_function.provider | Cloud function service provider | aliyun | Yes |
cloud_function.region | Region where the cloud function service is located | cn-beijing / cn-hangzhou | Yes |
cloud_function.role_arn | User's ARN authorization information | acs:ram::123456789012:role/YourRoleName | Yes |
cloud_function.namespace | Specifies the namespace of the external function, required for Tencent Cloud | your_namespace | Yes |
cloud_function.code_bucket | Object storage Bucket information where the user's code is stored | your_bucket_name | Yes |
Case Description
API CONNECTION is mainly used for creating EXTERNAL FUNCTION. The usage process of EXTERNAL FUNCTION is as follows:
- Users activate cloud function computing services (such as Alibaba Cloud Function Compute FC) and object storage services.
- Upload the function execution code & executable files, dependent libraries, models, and data files to object storage.
- Grant Singdata Lakehouse permission to operate the above services and access function files.
- Users call Remote function in Singdata Lakehouse SQL statements.
- Singdata Lakehouse sends an HTTP request to call the running function based on the provided service address and authentication information.
- Singdata Lakehouse retrieves the response information and returns the result. Therefore, you must activate function computing services and object storage services and grant Lakehouse permissions.
Example 1: Creating API CONNECTION on Alibaba Cloud
-
Environment Preparation UDF relies on Alibaba Cloud's "Object Storage" and "Function Compute" services. Please ensure that the relevant services are activated.
-
step1: Users activate cloud function computing services (such as Alibaba Cloud Function Compute FC) and object storage services.
-
step2. Alibaba Cloud operations: Create a permission policy (CzUdfOssAccess) in the Alibaba Cloud RAM console: Note: Users need to have RAM permissions.
- Access the Alibaba Cloud Resource Access Management (RAM) product console.
- In the left navigation bar, Permission Management -> Permission Policies, search for AliyunFCFullAccess in the permission control interface -> Edit the AliyunFCFullAccess permission policy to add the following "acs**:Service": "fc.aliyuncs.com"** part.
-
step3: Create a permission policy (CzUdfOssAccess) in the Alibaba Cloud RAM console: Note: The user needs to have RAM permissions
- Access the Alibaba Cloud Resource Access Management (RAM) product console
- In the left navigation bar, go to Permission Management -> Permission Policies, and select Create Permission Policy in the permission control interface
- On the Create Permission Policy page, select the Script Editor tab, and replace
[bucket_name_1|2|3]
below with the actual OSS bucket names. Note: According to Alibaba Cloud OSS conventions, the same bucket needs to have two Resource entries:"acs:oss:*:*:bucket_name_1"
and"acs:oss:*:*:bucket_name_1/*"
must both exist to achieve the authorization effect:
-
step4 Alibaba Cloud Console: Create a role in Alibaba Cloud RAM (e.g., CzUDFRole):
- In the RAM console, navigate to Identity Management -> Roles on the left sidebar, and click Create Role
- On the Create Role page, select the type as Alibaba Cloud Account, fill in the custom Role Name (e.g., CzUDFRole), select Other Cloud Account in Select Trusted Cloud Account, and enter: 1384322691904283 (Singdata Lakehouse Shanghai's cloud main account), then click Complete
- After creation, click Authorize Role:
- In System Policies, grant the AliyunFCFullAccess policy to the role CzUDFRole
- In Custom Policies, grant the newly created policy (CzUdfOssAccess) to the role
-
step5: After creation, click Authorize Role: In Custom Policies, grant the newly created policy (CzUdfOssAccess) to the role. In the role CzUDFRole details page, obtain the RoleARN information of the role:
'acs:ram::1222808864xxxxxxx:role/czudfrole'
-
step6: Fill the above role_arn into the syntax parameter, and create an Alibaba Cloud Function Compute connection
- step7: desc connection to obtain external ID information: In this example, the external ID is:
VW9UaGwYENBQ7cFp
- In Alibaba Cloud RAM -> Roles -> Trust Policy, modify the trust policy of CzUDFRole:
Example 2: Create API CONNECTION on Tencent Cloud
Environment Preparation UDF relies on Tencent Cloud's "Object Storage" and "Cloud Functions" services. Please ensure that the relevant services are activated.
-
Object Storage: Required in the Lakehouse deployment region (e.g., ap-shanghai) to store UDF base code;
-
Cloud Functions: After activating the Cloud Functions service, it is recommended to manually create a function using the template creation feature, preferably using templates with the WebFunc tag such as the Flask framework template. During this process, the Tencent Cloud console will guide users through some initial configurations, such as activating the log service (CLS) and other dependent services, creating necessary Access Control (CAM) roles, and granting necessary Access Control (CAM) permissions.
-
step1: Users activate Tencent Cloud's cloud function computing service. The cloud function region should be consistent with the Lakehouse service region.
-
step2: Data creation permission policy (LakehouseAccess):
- Log in to Tencent Cloud and go to the Access Management product console
- In the Access Management page, navigate to Policies on the left sidebar, and in the permission control interface, select Create Custom Policy -> Create by Policy Generator -> Visual Policy Generator.
- In the Visual Policy Generator tab, Service: Select Cloud Functions; Action: Select All Actions (you can make more granular selections based on actual needs); Resource: Select All Resources or Specific Resources as needed. In this case, select specific resources, use namespace authorization as shown below, click the edit button, select the region activated in step1, the resource can be * or a specified namespace, in this case, the namespace from step1: default. As shown in the red-marked area of the cloud function in Figure 2Click to create the policy
-
step3: Create Role CzUdfRole
- Create a new role
- Select Tencent Cloud Account
- Select Other Main Account 100029595716 (Singdata Main Account), keep other options as default, and click Next
- In the Configure Role Policy configuration, authorize the newly created LakehouseAccess custom policy to the current role. Click Next, and in Role Naming, fill in
LakehouseRole
to complete the creation. - After successful creation, go to the details page of the role
LakehouseRole
in the role list to get the RoleARN information of the role:qcs::cam::uin/1000*******:roleName/LakehouseRole
- Remember the RoleArn, for example:
qcs::cam::uin/1000*******:roleName/LakehouseRole
-
step4: Activate COS and create a new BUCKET
- Create a new bucket to store udf code, the region should be consistent with the Lakehouse service region. The newly created bucket is myfunction as shown below
- Authorize Lakehouse to access the bucket (myfunction)
- Go to the Access Management product console. Find the newly created "LakehouseAccess" policy. Select Edit
- Select the Visual Policy Generator. Add permissions
- Create a new bucket to store udf code, the region should be consistent with the Lakehouse service region. The newly created bucket is myfunction as shown below
-
Service: Select Object Storage (cos); Action: Select All Actions (you can make more granular selections based on actual needs); Resource: Select All Resources or Specific Resources as needed. In this example, select specific resources, which is
myfunction-131xxxxx
in Shanghai. -
step5: Create Connection on the Lakehouse side * Execute the following command in Studio or the Lakehouse JDBC client:
- Note: To prevent the ROLE_ARN from being obtained by third parties for unauthorized data access, you can use
EXTERNAL ID
as an additional layer of verification to ensure that access is only allowed when the request contains the presetEXTERNAL ID
. This means that even if a third party knows some other access information (such as the role ARN), they cannot access the resource without the correctEXTERNAL ID
. - During the API Connection process, Lakehouse will generate this EXTERNAL ID, which can be configured into the role verification of the COS account to achieve access control:
- Note: To prevent the ROLE_ARN from being obtained by third parties for unauthorized data access, you can use
- Client side: Enter the Tencent Cloud Access Management console, Role -> CzUDFRole -> Role Carrier -> Manage Carrier, select Add Account -> select Current Main Account, and fill in the main account ID:
100029595716
(Singdata's Tencent Cloud main account), and check Enable Verification, enter the EXTERNAL_ID from the previous DESC result, click Confirm -> Update
Example 3: Create API CONNECTION on AWS
-
Environment Preparation UDF relies on Alibaba Cloud's "Object Storage" and "Lambda Function" services. Please ensure that the relevant services are activated.
-
step1: User activates Lambda and Object Storage services on the cloud
-
step2: Create permission policy on AWS side (LakehouseAccess):
- Log in to the AWS cloud platform and enter the Identity and Access Management (IAM) product console
- In the IAM page's left navigation bar, go to Account Settings, in the Security Token Service (STS) section, find the Endpoint list, locate the region corresponding to the Singdata Lakehouse for the current instance, and if the STS Status is not enabled, please enable it.
- In the IAM page's left navigation bar, go to Policies, in the Policies interface, select Create Policy, and choose Json in the policy editor.
- Add the policy to allow Singdata Lakehouse to access the S3 bucket and directory. Below is a sample policy, please replace
<bucket>
with the actual bucket and path prefix name.
-
Select Next, enter the policy name (e.g., LakehouseAccess) and description (optional)
- Click Create Policy to complete the policy creation
step3: Create a role on the AWS side (LakehouseVolumeRole):
- Log in to the AWS cloud platform and go to the Identity and Access Management (IAM) product console
- In the IAM page's left navigation bar, go to Roles -> Create role -> AWS account, select Another AWS account, and enter
028022243208
in the Account ID
- Select Next, on the Add permissions page, choose the policy created in step2
LakehouseAccess
, then select Next - Fill in the Role name (e.g.,
LakehouseVolumeRole
) and description, click Create role to complete the role creation - On the role details page, obtain the value of Role ARN to create the STORAGE CONNECTION
step4: Create an API CONNECTION on the Singdata Lakehouse side:
- Execute the following commands in Studio or the Lakehouse JDBC client:
- During the process of creating a storage connection, Lakehouse will generate this EXTERNAL ID. You can configure this EXTERNAL ID into the Trust Policy of the AWS IAM role (
LakehouseVolumeRole
) created in step 3 to achieve additional access control:
- In the AWS IAM console, navigate to Roles in the left sidebar, find the role created in step 3 and enter the role details page. In Trust relationships, replace the value of
sts:ExternalId
000000
with theEXTERNAL_ID
from the DESC result. Click Update to complete the role policy update.