Lakehouse Python SDK

ClickZetta Lakehouse Python SDK is a toolkit that simplifies working with ClickZetta Lakehouse from Python. It consists of two packages, clickzetta-connector and clickzetta-sqlalchemy, which provide different programming interfaces for different scenarios.

clickzetta-connector

clickzetta-connector follows the PEP 249 specification (Python Database API 2.0) and provides a DB-API style interface for executing SQL. With it, you can run SQL queries, inserts, updates, and deletes from Python applications.

clickzetta-connector also supports bulk data upload (bulkload), which can significantly speed up data imports and is particularly useful when loading large volumes of data.

Usage Example

1. Install clickzetta-connector (Python 3.6 or later is required):


pip install clickzetta-connector

2. Connect to a ClickZetta Lakehouse instance:


from clickzetta import connect

# Establish connection
conn = connect(username='username',
               password='password',
               service='api.singdata.com',
               instance='instance',
               workspace='quickstart_ws',
               schema='public',
               vcluster='default')


| **Parameter** | **Required** | **Description**                                                                                                                          |
| ------------- | ------------ | --------------------------------------------------------------------------------------------------------------------------------------- |
| username      | Y            | Username                                                                                                                                |
| password      | Y            | Password                                                                                                                                |
| service       | Y            | Address used to connect to Lakehouse, in the form region.api.singdata.com. You can find it in the JDBC connection string under Management -> Workspace in Lakehouse Studio.![](../.topwrite/assets/image_1728887857029.png) |
| instance      | Y            | Instance name. You can find it in the JDBC connection string under Management -> Workspace in Lakehouse Studio.![](../.topwrite/assets/image_1729051500396.png) |
| workspace     | Y            | Workspace in use                                                                                                                        |
| vcluster      | Y            | Virtual cluster (VC) in use                                                                                                             |
| schema        | Y            | Name of the schema to access                                                                                                            |
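
The parameters above do not have to be hard-coded. The following is a minimal sketch, assuming the values are supplied through environment variables; the `CZ_` variable prefix and the `load_connect_kwargs` helper are hypothetical and not part of the SDK.

```python
import os

# Parameter names taken from the table above; all are marked as required.
REQUIRED_PARAMS = ("username", "password", "service", "instance",
                   "workspace", "schema", "vcluster")


def load_connect_kwargs(env=None, prefix="CZ_"):
    """Collect connect() keyword arguments from environment-style variables.

    The CZ_ prefix and this helper are illustrative assumptions, not SDK features.
    """
    env = os.environ if env is None else env
    # Map e.g. CZ_USERNAME -> username, CZ_VCLUSTER -> vcluster.
    kwargs = {name: env.get(prefix + name.upper()) for name in REQUIRED_PARAMS}
    # Fail early with a clear message instead of a late connection error.
    missing = [name for name, value in kwargs.items() if not value]
    if missing:
        raise ValueError("missing connection parameters: " + ", ".join(missing))
    return kwargs
```

With the variables set, the connection could then be opened as `connect(**load_connect_kwargs())`.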

3. Execute SQL statements:


# Execute query
cursor = conn.cursor()
cursor.execute("SELECT * FROM your_table")
rows = cursor.fetchall()

# Print query results
for row in rows:
    print(row)

# Close connection
cursor.close()
conn.close()
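
Because the connector follows PEP 249, the cursor lifecycle shown above can be wrapped so that cursors and connections are closed even when a statement fails. The sketch below uses the standard library's `sqlite3` module purely as a stand-in DB-API driver so that it runs without a Lakehouse instance; the `fetch_all` helper is hypothetical, and it only assumes that the connection offers `cursor()` and `close()` as in the example above.

```python
import sqlite3
from contextlib import closing


def fetch_all(conn, sql):
    """Run one statement on a PEP 249 connection and return every row."""
    # closing() calls cursor.close() even if execute() or fetchall() raises.
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql)
        return cursor.fetchall()


# sqlite3 is used only as a stand-in driver so the sketch is runnable;
# with the SDK, conn would come from clickzetta.connect(...) instead.
with closing(sqlite3.connect(":memory:")) as conn:
    with closing(conn.cursor()) as cursor:
        cursor.execute("CREATE TABLE your_table (id INTEGER, name TEXT)")
        cursor.execute("INSERT INTO your_table VALUES (1, 'a'), (2, 'b')")
    rows = fetch_all(conn, "SELECT * FROM your_table ORDER BY id")
    for row in rows:
        print(row)
```

Parameter placeholder syntax differs between DB-API drivers, so the sketch deliberately avoids bound parameters.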

clickzetta-sqlalchemy

clickzetta-sqlalchemy provides a SQLAlchemy adapter, so you can interact with ClickZetta Lakehouse through the SQLAlchemy programming interface. This makes it easy to integrate ClickZetta Lakehouse with SQLAlchemy-based applications such as Superset and Streamlit.

Usage Example

1. Install clickzetta-sqlalchemy:


pip install clickzetta-sqlalchemy


2. Configure SQLAlchemy:


from sqlalchemy import create_engine
from sqlalchemy import text

# Establish connection, using the clickzetta:// prefix
engine = create_engine("clickzetta://username:password@instance.api.singdata.com/workspace?schema=schema&vcluster=default")


3. Use SQLAlchemy for data operations:


sql = text('select * from clickzetta_sample_data.ecommerce_events_history.ecommerce_events_multicategorystore_live;')

# Execute and get results
with engine.connect() as conn:
    results = conn.execute(sql)
    for r in results:
        print(r)
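
The connection URL in step 2 embeds the credentials directly, which breaks if the password contains characters such as `@` or `/`. The following is a minimal sketch of assembling the URL with percent-encoded credentials; the `build_clickzetta_url` helper is hypothetical, and the URL layout is simply copied from the example above.

```python
from urllib.parse import quote_plus


def build_clickzetta_url(username, password, instance, workspace,
                         schema, vcluster, service="api.singdata.com"):
    """Assemble a clickzetta:// URL in the layout used in step 2."""
    # Percent-encode credentials so '@', '/' or ':' cannot be mistaken
    # for URL delimiters when SQLAlchemy parses the string.
    user = quote_plus(username)
    secret = quote_plus(password)
    return (f"clickzetta://{user}:{secret}@{instance}.{service}/{workspace}"
            f"?schema={schema}&vcluster={vcluster}")


url = build_clickzetta_url("username", "p@ss/word", "instance",
                           "workspace", "schema", "default")
print(url)
```

The resulting string can be passed to `create_engine` in place of the literal URL.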

With these two packages, you can work with ClickZetta Lakehouse from your Python applications through either the DB-API interface or the SQLAlchemy interface.