ClickZetta Connector Official Python SDK

clickzetta-connector is the official Python SDK for Singdata ClickZetta Lakehouse. It provides a database API that follows the PEP-249 specification, along with bulk data upload (bulkload) functionality, so Python applications can work with ClickZetta directly.

Installation

Install clickzetta-connector using pip:


pip install clickzetta-connector

Note: clickzetta-connector includes all cloud environment dependencies by default and does not support the extras syntax clickzetta-connector[xxx]. To install cloud environment packages selectively, use the installation methods below.

Cloud environment packages can also be installed on demand. Each cloud environment package includes the base package.

Type	Command	Comment
None	`pip install "clickzetta-connector-python"`	Only install the base package, generally used for local development
all	`pip install "clickzetta-connector-python[all]"`	Install the base package and all cloud environment packages
s3	`pip install "clickzetta-connector-python[s3]"`	Install the base package and Amazon cloud environment package
amazon	`pip install "clickzetta-connector-python[amazon]"`	Install the base package and Amazon cloud environment package
aws	`pip install "clickzetta-connector-python[aws]"`	Install the base package and Amazon cloud environment package
oss	`pip install "clickzetta-connector-python[oss]"`	Install the base package and Alibaba cloud environment package
aliyun	`pip install "clickzetta-connector-python[aliyun]"`	Install the base package and Alibaba cloud environment package
cos	`pip install "clickzetta-connector-python[cos]"`	Install the base package and Tencent cloud environment package
tencent	`pip install "clickzetta-connector-python[tencent]"`	Install the base package and Tencent cloud environment package
gcp	`pip install "clickzetta-connector-python[gcp]"`	Install the base package and Google cloud environment package
google	`pip install "clickzetta-connector-python[google]"`	Install the base package and Google cloud environment package

Quick Start

Execute SQL Query

Below is a simple example demonstrating how to use clickzetta-connector to execute an SQL query:


from clickzetta import connect

# Establish connection
conn = connect(
    username='your_username',
    password='your_password',
    service='api.singdata.com',
    instance='your_instance',
    workspace='your_workspace',
    schema='public',
    vcluster='default'
)

Parameter	Required	Description
username	Y	Username
password	Y	Password
service	Y	Address of the Lakehouse service, in the form region.api.singdata.com. It appears in the JDBC connection string, which you can view in Lakehouse Studio under Management -> Workspace
instance	Y	Instance name. It appears in the JDBC connection string, which you can view in Lakehouse Studio under Management -> Workspace
workspace	Y	Workspace in use
vcluster	Y	Virtual cluster (VC) in use
schema	Y	Name of the schema to access
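
Since every parameter in the table is required, it can help to check them before connecting. The following is a minimal sketch that reads them from environment variables. The helper name load_connect_kwargs and the CZ_ variable prefix are assumptions made for illustration and are not part of the SDK.

```python
import os

# Parameter names taken from the table above; all are marked as required.
REQUIRED_PARAMS = (
    "username", "password", "service", "instance",
    "workspace", "vcluster", "schema",
)


def load_connect_kwargs(env=None, prefix="CZ_"):
    """Build keyword arguments for connect() from environment variables.

    Hypothetical helper: the CZ_ prefix (e.g. CZ_USERNAME) is an
    illustrative convention, not something the SDK defines.
    """
    env = os.environ if env is None else env
    kwargs = {name: env.get(prefix + name.upper()) for name in REQUIRED_PARAMS}
    # Report every missing or empty value at once instead of failing one by one.
    missing = [name for name, value in kwargs.items() if not value]
    if missing:
        raise ValueError("missing connection parameters: " + ", ".join(missing))
    return kwargs
```

With the variables set, the result can be passed on as conn = connect(**load_connect_kwargs()), which keeps credentials out of the source file.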

Simple Query Example


# Create a cursor object
cursor = conn.cursor()
# Execute SQL query
cursor.execute('SELECT * FROM clickzetta_sample_data.ecommerce_events_history.ecommerce_events_multicategorystore_live LIMIT 10;')
# Fetch query results
results = cursor.fetchall()
for row in results:
    print(row)
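
For larger result sets, PEP-249 also describes cursor.fetchmany, which returns rows in batches rather than all at once. The sketch below shows that pattern using the standard library's sqlite3 module as a stand-in so it runs without a Lakehouse connection. That the clickzetta cursor supports fetchmany in the same way is an assumption based on the SDK following PEP-249, and iter_rows is a hypothetical helper.

```python
import sqlite3


def iter_rows(cursor, batch_size=1000):
    """Yield rows from an executed PEP-249 cursor, one batch at a time."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield from batch


# Stand-in database: sqlite3 is used only so the example is self-contained.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE events (id INTEGER, name TEXT)")
demo_cursor.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i, f"event_{i}") for i in range(5)],
)
demo_cursor.execute("SELECT id, name FROM events ORDER BY id")
rows = list(iter_rows(demo_cursor, batch_size=2))
demo_cursor.close()
demo_conn.close()
```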

Using SQL hints

SQL hints that are set with the set command in JDBC can be passed through the parameters argument of cursor.execute. For supported parameters, refer to Parameter Management. Below is an example:


# Set the job run timeout to 30 seconds
my_param = {
    'hints': {
        'sdk.job.timeout': 30
    }
}
cursor.execute('YOUR_SQL_QUERY', parameters=my_param)
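
When several queries share the same hints, the nested dictionary can be assembled in one place. This is a small sketch under the assumption that the parameters dictionary has the shape shown above, with hints under the 'hints' key. build_parameters is a hypothetical helper, not an SDK function.

```python
def build_parameters(hints, base=None):
    """Return a new parameters dict with the given hints merged under 'hints'.

    Neither the base dict nor its existing 'hints' mapping is modified.
    """
    params = dict(base or {})
    merged = dict(params.get("hints", {}))
    merged.update(hints)
    params["hints"] = merged
    return params


# Same hint as in the example above: job run timeout of 30 seconds.
params = build_parameters({"sdk.job.timeout": 30})
# Then: cursor.execute('YOUR_SQL_QUERY', parameters=params)
```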

More Examples

1. Handling Query Results

The following example demonstrates how to handle query results, such as saving the results to a CSV file:


import csv

# Execute query
cursor.execute('SELECT * FROM clickzetta_sample_data.ecommerce_events_history.ecommerce_events_multicategorystore_live LIMIT 10;')

# Fetch query results
results = cursor.fetchall()

# Save results to CSV file
with open('output.csv', 'w', newline='', encoding='utf-8') as csvfile:
    csv_writer = csv.writer(csvfile)
    csv_writer.writerow([column[0] for column in cursor.description])
    csv_writer.writerows(results)
# Close connection
cursor.close()
conn.close()
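
If the query or the file write raises an exception, the explicit close calls above are never reached. One way to make cleanup unconditional is contextlib.closing, which calls close() when the block exits. The sketch below uses sqlite3 as a stand-in connection so it runs on its own. export_query_to_csv is a hypothetical helper, and it assumes only the cursor(), execute(), description, fetchall(), and close() members used in the example above.

```python
import csv
import os
import sqlite3
import tempfile
from contextlib import closing


def export_query_to_csv(conn, sql, path):
    """Run sql on a PEP-249 connection and write header plus rows to path."""
    with closing(conn.cursor()) as cursor:  # cursor is closed even on error
        cursor.execute(sql)
        rows = cursor.fetchall()
        with open(path, "w", newline="", encoding="utf-8") as csvfile:
            writer = csv.writer(csvfile)
            writer.writerow([column[0] for column in cursor.description])
            writer.writerows(rows)
    return len(rows)


# Stand-in data: sqlite3 replaces the Lakehouse connection for this demo.
csv_path = os.path.join(tempfile.mkdtemp(), "output.csv")
with closing(sqlite3.connect(":memory:")) as demo_conn:
    demo_conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    demo_conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])
    written = export_query_to_csv(
        demo_conn, "SELECT id, name FROM t ORDER BY id", csv_path
    )
```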