ClickZetta Connector Official Python SDK

clickzetta-connector is the official Python SDK for Singdata ClickZetta Lakehouse. It provides a database API that follows the PEP 249 specification, along with bulk data upload (bulkload) functionality, so that Python applications can interact with ClickZetta directly.

Installation

Install clickzetta-connector using pip:

pip install clickzetta-connector
  • Note: clickzetta-connector includes all cloud environment dependencies by default and does not support the extras syntax clickzetta-connector[xxx]. To install only specific cloud environment packages, use the installation methods below.

Cloud environment packages can also be installed on demand. Each cloud environment package includes the clickzetta-connector base package:

| Type | Command | Comment |
| --- | --- | --- |
| None | pip install "clickzetta-connector-python" | Installs only the base package; generally used for local development |
| all | pip install "clickzetta-connector-python[all]" | Installs the base package and all cloud environment packages |
| s3 | pip install "clickzetta-connector-python[s3]" | Installs the base package and the Amazon cloud environment package |
| amazon | pip install "clickzetta-connector-python[amazon]" | Installs the base package and the Amazon cloud environment package |
| aws | pip install "clickzetta-connector-python[aws]" | Installs the base package and the Amazon cloud environment package |
| oss | pip install "clickzetta-connector-python[oss]" | Installs the base package and the Alibaba cloud environment package |
| aliyun | pip install "clickzetta-connector-python[aliyun]" | Installs the base package and the Alibaba cloud environment package |
| cos | pip install "clickzetta-connector-python[cos]" | Installs the base package and the Tencent cloud environment package |
| tencent | pip install "clickzetta-connector-python[tencent]" | Installs the base package and the Tencent cloud environment package |
| gcp | pip install "clickzetta-connector-python[gcp]" | Installs the base package and the Google cloud environment package |
| google | pip install "clickzetta-connector-python[google]" | Installs the base package and the Google cloud environment package |
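
After installing, it can be useful to confirm which distribution ended up in the current environment. The following is a minimal sketch using the standard library's importlib.metadata; the helper name installed_version is hypothetical, and the two distribution names are the ones used in the pip commands above.

```python
from importlib import metadata


def installed_version(package_names=("clickzetta-connector",
                                     "clickzetta-connector-python")):
    """Return (name, version) for the first installed distribution, else None.

    installed_version is an illustrative helper, not part of the SDK.
    """
    for name in package_names:
        try:
            return name, metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is not installed; try the next name.
            continue
    return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("No clickzetta connector distribution is installed")
    else:
        print("Found %s version %s" % found)
```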

Quick Start

Execute SQL Query

Below is a simple example demonstrating how to use clickzetta-connector to execute an SQL query:

from clickzetta import connect

# Establish connection
conn = connect(
    username='your_username',
    password='your_password',
    service='api.singdata.com',
    instance='your_instance',
    workspace='your_workspace',
    schema='public',
    vcluster='default'
)

| Parameter | Required | Description |
| --- | --- | --- |
| username | Y | Username |
| password | Y | Password |
| service | Y | Address used to connect to the Lakehouse, in the form region.api.singdata.com. You can find it in the JDBC connection string under Lakehouse Studio Management -> Workspace |
| instance | Y | Instance name. You can find it in the JDBC connection string under Lakehouse Studio Management -> Workspace |
| workspace | Y | Workspace in use |
| vcluster | Y | Virtual cluster (VC) in use |
| schema | Y | Name of the schema to access |
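
Because all seven parameters are required, one way to keep credentials out of source code is to read them from environment variables and validate them before connecting. The sketch below assumes hypothetical variable names (CLICKZETTA_USERNAME and so on) and a hypothetical helper load_connection_params; neither is defined by the SDK.

```python
import os

# Hypothetical environment variable names mapped to connect() parameters.
ENV_TO_PARAM = {
    "CLICKZETTA_USERNAME": "username",
    "CLICKZETTA_PASSWORD": "password",
    "CLICKZETTA_SERVICE": "service",
    "CLICKZETTA_INSTANCE": "instance",
    "CLICKZETTA_WORKSPACE": "workspace",
    "CLICKZETTA_SCHEMA": "schema",
    "CLICKZETTA_VCLUSTER": "vcluster",
}


def load_connection_params(environ=None):
    """Build keyword arguments for connect() from environment variables.

    Raises KeyError listing every required variable that is missing or empty.
    """
    environ = os.environ if environ is None else environ
    missing = sorted(name for name in ENV_TO_PARAM if not environ.get(name))
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {param: environ[name] for name, param in ENV_TO_PARAM.items()}


# Usage (requires the SDK to be installed and the variables to be set):
# from clickzetta import connect
# conn = connect(**load_connection_params())
```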

Simple Query Example

# Create a cursor object
cursor = conn.cursor()
# Execute SQL query
cursor.execute('SELECT * FROM clickzetta_sample_data.ecommerce_events_history.ecommerce_events_multicategorystore_live LIMIT 10;')
# Fetch query results
results = cursor.fetchall()
for row in results:
    print(row)
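
fetchall() loads the whole result set into memory, which is fine for 10 rows but can be costly for large queries. PEP 249 also defines fetchmany(), so a batched iterator is one alternative. This is a minimal sketch that assumes the ClickZetta cursor implements fetchmany() as the specification describes; iter_rows is a hypothetical helper, not part of the SDK.

```python
def iter_rows(cursor, batch_size=1000):
    """Yield rows one at a time, fetching them from the cursor in batches.

    Works with any PEP 249 cursor on which execute() has already been called.
    """
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            # An empty sequence means the result set is exhausted.
            break
        yield from batch


# Usage with the cursor from the example above:
# cursor.execute('YOUR_SQL_QUERY')
# for row in iter_rows(cursor, batch_size=500):
#     print(row)
```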

Using SQL hints

SQL hints that you would set with the set command in JDBC can be passed to execute() through its parameters argument, under the 'hints' key. For supported parameters, refer to Parameter Management. Below is an example:

# Set the job run timeout to 30 seconds
my_param = {
    'hints': {
        'sdk.job.timeout': 30
    }
}
cursor.execute('YOUR_SQL_QUERY', parameters=my_param)
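
When several queries share the same hints, building the parameters dictionary in one place avoids repeating the nested structure. The helper below is a hypothetical convenience, not an SDK function; it only assumes the {'hints': {...}} layout shown in the example above.

```python
def hint_params(hints, base=None):
    """Return a parameters dict for cursor.execute() carrying the given hints.

    `base` is an optional existing parameters dict; it is copied, not modified.
    Hints in `hints` override hints of the same name already present in `base`.
    """
    params = dict(base or {})
    merged = dict(params.get("hints", {}))
    merged.update(hints)
    params["hints"] = merged
    return params


# Usage: set the job run timeout to 30 seconds.
# cursor.execute('YOUR_SQL_QUERY',
#                parameters=hint_params({'sdk.job.timeout': 30}))
```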

More Examples

1. Handling Query Results

The following example demonstrates how to handle query results, such as saving the results to a CSV file:

import csv

# Execute query
cursor.execute('SELECT * FROM clickzetta_sample_data.ecommerce_events_history.ecommerce_events_multicategorystore_live LIMIT 10;')

# Fetch query results
results = cursor.fetchall()

# Save results to CSV file
with open('output.csv', 'w', newline='', encoding='utf-8') as csvfile:
    csv_writer = csv.writer(csvfile)
    csv_writer.writerow([column[0] for column in cursor.description])
    csv_writer.writerows(results)
# Close connection
cursor.close()
conn.close()
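
In the example above, cursor.close() and conn.close() are skipped if the query or the file write raises an exception. One way to guarantee cleanup is contextlib.closing from the standard library, which calls close() on exit. This is a sketch under the assumption that the connection and cursor expose close() as shown above; export_query_to_csv is a hypothetical helper name.

```python
import csv
from contextlib import closing


def export_query_to_csv(conn, sql, path):
    """Run `sql` on a PEP 249 connection, write the result to `path` as CSV.

    The cursor is closed even if the query fails. Returns the number of
    data rows written (the header row is not counted).
    """
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql)
        header = [column[0] for column in cursor.description]
        rows = cursor.fetchall()
    with open(path, "w", newline="", encoding="utf-8") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)


# Usage (connection parameters as in the Quick Start):
# from clickzetta import connect
# with closing(connect(...)) as conn:
#     export_query_to_csv(conn, 'YOUR_SQL_QUERY', 'output.csv')
```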