Ecosystem
Singdata Lakehouse is compatible with mainstream data integration, BI, AI, and development tools, and is deployed on seven public clouds, including Alibaba Cloud, Tencent Cloud, and AWS. This document summarizes, by category, the verified third-party tools and how each one connects.
If the tool you need is not listed, it may still be supported: Lakehouse provides standard access through JDBC, the MySQL protocol, and Python and Java SDKs, so any tool compatible with these interfaces can connect directly. To develop a new connector or integration based on Lakehouse, contact our partner team.
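As a rough illustration of the MySQL protocol route, the sketch below builds a SQLAlchemy-style connection URL of the kind many MySQL-compatible tools accept. The host, port, database, and credentials are hypothetical placeholders, and the choice of `pymysql` as the client driver is an assumption; the actual endpoint and any required parameters come from the Lakehouse MySQL protocol documentation.

```python
from urllib.parse import quote_plus


def mysql_protocol_url(user: str, password: str, host: str, port: int,
                       database: str, driver: str = "pymysql") -> str:
    """Build a SQLAlchemy-style URL for a MySQL-protocol endpoint."""
    # Percent-encode credentials so characters such as '@' or '/'
    # cannot be mistaken for URL delimiters.
    return (
        f"mysql+{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# All values below are hypothetical and for illustration only.
url = mysql_protocol_url(
    user="analyst",
    password="p@ss/word",
    host="lakehouse.example.com",
    port=3306,  # assumed port; check the real endpoint settings
    database="sales",
)
print(url)
```

A URL in this shape can be pasted into tools that ask for a SQLAlchemy URI, or split into host, port, user, and password fields for clients that take them separately.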
Cloud Platforms (CSP)
Lakehouse is deployed on seven clouds: Alibaba Cloud, Tencent Cloud, AWS, GCP, Huawei Cloud, Baidu AI Cloud, and Volcengine. Alibaba Cloud, Tencent Cloud, and AWS have complete dedicated documentation (covering storage connections, private network connections, and permission configuration); the other cloud platforms follow the same configuration approach. BYOS (Bring Your Own Storage) deployment is also supported: data is stored under the user's own cloud account and does not pass through the Singdata platform. See Supported Cloud Platforms and Private Storage Overview for details.
Data Integration
Studio Data Sync connects directly to 50+ data sources (MySQL, Oracle, PostgreSQL, MongoDB, Hive, MaxCompute, etc.) without third-party tools. In addition, the following data integration tools are compatible with Lakehouse, covering offline batch, real-time CDC, message streaming, and log collection scenarios:
| Tool | Connection | Description | Reference |
|---|---|---|---|
| Apache Kafka | Kafka Connector | Real-time message stream writing to Lakehouse | Kafka Data Source |
| AutoMQ | Kafka Protocol | Next-generation message queue, compatible with Kafka protocol | AutoMQ Data Source |
| Airbyte | JDBC | Open-source ELT platform with a rich connector ecosystem | Airbyte Integration Guide |
| DataX | Plugin-based | Alibaba open-source tool, suitable for batch data sync | DataX Integration Guide |
| Apache Flink | Flink Connector | Stream processing engine for real-time writes to Lakehouse | Flink Connector |
| Apache Spark | Spark Connector | Large-scale data reads and writes for Lakehouse tables | Spark Connector |
| Logstash | Logstash Connector | Import log data into Lakehouse | Logstash Integration Guide |
| Bluepipe | Native integration | Real-time CDC sync from Oracle to Lakehouse | Bluepipe Sync Guide |
BI and Visualization
Any BI tool that supports JDBC, ODBC, or the MySQL protocol can connect directly; the tools below are examples rather than a complete list:
| Tool | Connection | Description | Reference |
|---|---|---|---|
| FineBI | JDBC / MySQL | BI tool widely used in the Chinese market | JDBC Connection · MySQL Protocol |
| Tableau | JDBC | Suitable for complex visualizations and exploratory analysis | Tableau Connection Guide |
| Power BI | MySQL Protocol | Connect via MySQL protocol | Power BI Connection Guide |
| Apache Superset | SQLAlchemy | Open-source, suitable for self-service analytics | Superset Connection Guide |
| Metabase | JDBC | Open-source, easy to deploy, suitable for small and medium teams | Metabase Connection Guide |
| Apache Zeppelin | JDBC | Notebook-style data exploration | Zeppelin Connection Guide |
| Rath | JDBC | Open-source intelligent analytics with automatic insight support | Rath Connection Guide |
| Streamlit | Python SDK | Rapidly build data apps for data science teams | Streamlit Connection Guide |
Transformation and Compute Engines
The following data transformation tools and compute engines are compatible with Lakehouse:
| Tool | Connection | Description | Reference |
|---|---|---|---|
| dbt | dbt-clickzetta adapter | Data modeling and transformation, supports Dynamic Table materialization | dbt Integration Guide |
| Apache Spark | Spark Connector | Large-scale batch processing and machine learning | Spark Connector |
| Apache Flink | Flink Connector | Real-time stream processing | Flink Connector |
The dbt documentation series covers all scenarios from quick start to migration practice: jaffle-shop experience, Snowflake/BigQuery migration, incremental processing, real-time pipelines, and data quality testing. See DBT Practice Series.
AI and Machine Learning
The following AI frameworks and platforms are compatible with Lakehouse, supporting vector storage, RAG applications, and AI workflow scenarios:
| Tool | Integration | Description | Reference |
|---|---|---|---|
| LangChain | Python SDK | Vector storage and RAG application development | LangChain Integration |
| LlamaIndex | Python SDK | Data indexing and retrieval | LlamaIndex Integration |
| Dify | MCP Server / SDK | Vector database + file storage | Dify Integration Overview |
| n8n | MCP Server | Unified AI workflows | N8N Integration |
| MindsDB | JDBC | ML/LLM modeling and prediction on Lakehouse data | MindsDB Integration |
| Datus | MCP Server | Data engineering agent | Datus Integration |
| Zilliz | Joint solution | Vector database joint solution | Zilliz Joint Solution |
| Unstructured.io | SDK | Unstructured document parsing and vectorization | Unstructured.io Integration |
Lakehouse also provides an MCP Server that can be called by any AI Agent supporting the MCP protocol.
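MCP messages are JSON-RPC 2.0 requests, so the shape of what an agent sends to an MCP server can be sketched with the standard library alone. The `tools/list` and `tools/call` methods come from the MCP specification; the tool name `run_sql` and its `sql` argument are hypothetical, since the tools actually exposed by the Lakehouse MCP Server are not listed here. A real session also starts with an `initialize` handshake, usually handled by an MCP client library.

```python
import json
from itertools import count

# Monotonically increasing request ids, as JSON-RPC expects a unique id per request.
_request_ids = count(1)


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object for an MCP server."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Call one tool. "run_sql" and its "sql" argument are hypothetical names.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "run_sql", "arguments": {"sql": "SELECT 1"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```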
Programmatic Interfaces
Lakehouse provides the following native programming interfaces and SDKs:
| Interface | Language | Description | Reference |
|---|---|---|---|
| JDBC Driver | Java / JVM | Standard JDBC interface, usable from any JVM language or tool | JDBC Driver |
| MySQL Protocol | All | No client dependency, compatible with MySQL ecosystem | MySQL Protocol Connection |
| Python SDK | Python | PEP 249 compatible, supports batch/real-time writes | Python SDK |
| Java SDK | Java | Supports BulkLoad and real-time stream writes | Java SDK Batch Upload |
| SQLAlchemy | Python | Standard Python ORM / SQL toolkit | SQLAlchemy Connection |
| cz-cli | Shell | Command-line client for SQL, Studio Tasks, and AI Agent | cz-cli Guide |
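
Because the Python SDK is described as PEP 249 compatible, application code can follow the usual connect, cursor, execute, fetch pattern. The sketch below shows that pattern using the standard library's `sqlite3` module purely as a stand-in PEP 249 driver, so it runs anywhere; the Lakehouse SDK's module name and connection arguments are not shown, and the `orders` table is made up. Note that the parameter placeholder style (`?` here) differs between drivers and is reported by each driver's `paramstyle` attribute.

```python
import sqlite3  # stand-in PEP 249 driver, NOT the Lakehouse SDK


def top_customers(conn, limit):
    """Return (customer, total) rows using only PEP 249 calls."""
    cur = conn.cursor()
    try:
        cur.execute(
            "SELECT customer, SUM(amount) AS total FROM orders "
            "GROUP BY customer ORDER BY total DESC LIMIT ?",
            (limit,),  # bound parameter; sqlite3 uses the 'qmark' paramstyle
        )
        return cur.fetchall()
    finally:
        cur.close()


# Set up a small in-memory table so the example is self-contained.
conn = sqlite3.connect(":memory:")
setup = conn.cursor()
setup.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
setup.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", 120.0), ("globex", 75.5), ("acme", 30.0)],
)
conn.commit()
setup.close()

rows = top_customers(conn, 1)
print(rows)
conn.close()
```

Keeping query code behind a function that takes a connection object means the same logic should work with any PEP 249 driver, provided the placeholder style is adjusted to match.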
SQL Clients and Database Management Tools
These tools connect via JDBC or the MySQL protocol and support standard SQL operations:
| Tool | Connection | Description | Reference |
|---|---|---|---|
| DBeaver | JDBC | Free and open source; the community edition is sufficient for daily queries and data exploration | DBeaver Connection Guide |
| DataGrip | JDBC | JetBrains product with strong code completion and SQL analysis | DataGrip Connection Guide |
| SQL Workbench/J | JDBC | Lightweight, basic SQL execution | SQL Workbench/J Connection Guide |
| Navicat | MySQL Protocol | Visual management with intuitive operations | Navicat Connection Guide |
Data Lake Formats
Lakehouse is built natively on Apache Iceberg: tables are stored in Iceberg format, which supports time travel, partition evolution, schema evolution, and cross-engine access. Delta Lake and Hudi formats are also supported through external tables:
| Format | Relationship | Description | Reference |
|---|---|---|---|
| Apache Iceberg | Native format | Underlying format for all Lakehouse tables, cross-engine access | Spark + Iceberg Analytics |
| Delta Lake | External table | Open table format from the Databricks ecosystem | Delta Lake External Table |
| Apache Hudi | External table | Open table format optimized for streaming writes | Hudi External Table |
Federated queries: External Catalog lets you query Iceberg tables in Hive, Databricks, and Snowflake OpenCatalog directly, without data migration. See Federated Query.
Modern Data Stack
The following solution combinations show how to build a complete data platform using Lakehouse and ecosystem tools:
| Solution | Toolchain | Reference |
|---|---|---|
| ELT-oriented | Airbyte → Lakehouse → dbt → Metabase | ELT Modern Data Stack |
| Analytics-oriented | Lakehouse ← dbt → Superset | Analytics Modern Data Stack |
| BI + AI | Lakehouse + Zilliz | BI + AI Joint Solution |
Quick Navigation
- Understand product concepts: Key Concepts · Incremental Computing
- Start ingesting data: Data Integration · 50+ Data Source Support
- Connect BI tools: BI and Visualization
- Data modeling: dbt Integration Guide · DBT Practice Series
- Programmatic access: Programmatic Interfaces
- AI application development: AI and Machine Learning
