Ecosystem

Singdata Lakehouse is compatible with mainstream data integration, BI, AI, and development tools, and is deployed on seven public clouds including Alibaba Cloud, Tencent Cloud, and AWS. This document summarizes verified third-party tools and connection solutions organized by category.

A tool that is not listed here is not necessarily unsupported: Lakehouse provides standard access via JDBC, the MySQL protocol, and Python/Java SDKs, so any tool compatible with these interfaces can connect directly. To develop a new connector or integration solution based on Lakehouse, contact our partner team.

Cloud Platforms (CSP)

Lakehouse is deployed on seven clouds: Alibaba Cloud, Tencent Cloud, AWS, GCP, Huawei Cloud, Baidu AI Cloud, and Volcengine. Alibaba Cloud, Tencent Cloud, and AWS each have complete dedicated documentation covering storage connections, private network connections, and permission configuration; the other platforms follow the same configuration approach. BYOS (Bring Your Own Storage) deployment is also supported: data is stored under the user's own cloud account and does not pass through the Singdata platform. See Supported Cloud Platforms and Private Storage Overview for details.


Data Integration

Lakehouse connects directly to 50+ data sources (MySQL, Oracle, PostgreSQL, MongoDB, Hive, MaxCompute, etc.) through Studio Data Sync, without third-party tools. The following data integration tools are also compatible with Lakehouse, covering offline batch, real-time CDC, message streaming, and log collection scenarios:

| Tool | Connection | Description | Reference |
| --- | --- | --- | --- |
| Apache Kafka | Kafka Connector | Real-time message stream writing to Lakehouse | Kafka Data Source |
| AutoMQ | Kafka Protocol | Next-generation message queue, compatible with Kafka protocol | AutoMQ Data Source |
| Airbyte | JDBC | Open-source ELT platform with a rich connector ecosystem | Airbyte Integration Guide |
| DataX | Plugin-based | Alibaba open-source tool, suitable for batch data sync | DataX Integration Guide |
| Apache Flink | Flink Connector | Stream processing engine for real-time writes to Lakehouse | Flink Connector |
| Apache Spark | Spark Connector | Large-scale data reads and writes for Lakehouse tables | Spark Connector |
| Logstash | Logstash Connector | Import log data into Lakehouse | Logstash Integration Guide |
| Bluepipe | Native integration | Real-time CDC sync from Oracle to Lakehouse | Bluepipe Sync Guide |

BI and Visualization

The following BI tools are compatible with Lakehouse. Any BI tool that supports JDBC, ODBC, or the MySQL protocol can connect directly; the list below is not exhaustive:

| Tool | Connection | Description | Reference |
| --- | --- | --- | --- |
| FineBI | JDBC / MySQL | Leading domestic BI tool | JDBC Connection · MySQL Protocol |
| Tableau | JDBC | Suitable for complex visualizations and exploratory analysis | Tableau Connection Guide |
| Power BI | MySQL Protocol | Connect via MySQL protocol | Power BI Connection Guide |
| Apache Superset | SQLAlchemy | Open-source, suitable for self-service analytics | Superset Connection Guide |
| Metabase | JDBC | Open-source, easy to deploy, suitable for small and medium teams | Metabase Connection Guide |
| Apache Zeppelin | JDBC | Notebook-style data exploration | Zeppelin Connection Guide |
| Rath | JDBC | Open-source intelligent analytics with automatic insight support | Rath Connection Guide |
| Streamlit | Python SDK | Rapidly build data apps for data science teams | Streamlit Connection Guide |

Transformation and Compute Engines

The following data transformation tools and compute engines are compatible with Lakehouse:

| Tool | Connection | Description | Reference |
| --- | --- | --- | --- |
| dbt | dbt-clickzetta adapter | Data modeling and transformation, supports Dynamic Table materialization | dbt Integration Guide |
| Apache Spark | Spark Connector | Large-scale batch processing and machine learning | Spark Connector |
| Apache Flink | Flink Connector | Real-time stream processing | Flink Connector |

The dbt documentation series covers scenarios ranging from quick start to migration practice: a jaffle-shop walkthrough, Snowflake/BigQuery migration, incremental processing, real-time pipelines, and data quality testing. See DBT Practice Series.


AI and Machine Learning

The following AI frameworks and platforms are compatible with Lakehouse, supporting vector storage, RAG applications, and AI workflow scenarios:

| Tool | Integration | Description | Reference |
| --- | --- | --- | --- |
| LangChain | Python SDK | Vector storage and RAG application development | LangChain Integration |
| LlamaIndex | Python SDK | Data indexing and retrieval | LlamaIndex Integration |
| Dify | MCP Server / SDK | Vector database + file storage | Dify Integration Overview |
| N8N | MCP Server | Unified AI workflows | N8N Integration |
| MindsDB | JDBC | ML/LLM modeling and prediction on Lakehouse data | MindsDB Integration |
| Datus | MCP Server | Data engineering agent | Datus Integration |
| Zilliz | Joint solution | Vector database joint solution | Zilliz Joint Solution |
| Unstructured.io | SDK | Unstructured document parsing and vectorization | Unstructured.io Integration |

Lakehouse also provides an MCP Server that can be called by any AI Agent supporting the MCP protocol.
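
As a rough illustration of what an MCP-capable agent sends, the sketch below builds the JSON-RPC 2.0 messages that the MCP specification defines for listing and calling tools. The tool name `run_sql` and its `sql` argument are hypothetical placeholders, not names taken from the Lakehouse MCP Server; a real client would first discover the available tools through `tools/list`. Transport and session initialization are left out.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC requires an id per request.
_ids = count(1)


def mcp_request(method, params=None):
    """Build a JSON-RPC 2.0 request envelope as used by MCP."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def list_tools_request():
    """Ask the server which tools it exposes."""
    return mcp_request("tools/list")


def call_tool_request(name, arguments):
    """Invoke one tool by name with a dict of arguments."""
    return mcp_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    print(json.dumps(list_tools_request()))
    # "run_sql" and "sql" are hypothetical; use names returned by tools/list.
    print(json.dumps(call_tool_request("run_sql", {"sql": "SELECT 1"})))
```

In practice an agent framework or MCP client library produces these messages; the sketch only shows the shape of the exchange.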


Programmatic Interfaces

Lakehouse provides the following native programming interfaces and SDKs:

| Interface | Language | Description | Reference |
| --- | --- | --- | --- |
| JDBC Driver | Java / JVM | Standard JDBC interface, compatible with all JVM ecosystems | JDBC Driver |
| MySQL Protocol | All | No client dependency, compatible with MySQL ecosystem | MySQL Protocol Connection |
| Python SDK | Python | PEP 249 compatible, supports batch/real-time writes | Python SDK |
| Java SDK | Java | Supports BulkLoad and real-time stream writes | Java SDK Batch Upload |
| SQLAlchemy | Python | Standard Python ORM / SQL toolkit | SQLAlchemy Connection |
| cz-cli | Shell | Command-line client: SQL + Studio Tasks + AI Agent | cz-cli Guide |
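
Because the Python SDK is described as PEP 249 compatible, query code can be written against the generic DB-API surface (`cursor()`, `execute()`, `description`, `fetchall()`) rather than a vendor-specific one. The sketch below is only illustrative: it uses the standard-library `sqlite3` module as a stand-in connection so that it runs anywhere, and the helper name `fetch_dicts` and the `orders` table are hypothetical. How a Lakehouse connection is actually created is covered in the Python SDK reference and is not shown here.

```python
import sqlite3
from contextlib import closing


def fetch_dicts(conn, sql, params=()):
    """Run a query on any PEP 249 connection and return rows as dicts."""
    with closing(conn.cursor()) as cur:
        cur.execute(sql, params)
        # cursor.description holds one 7-item sequence per column; item 0 is the name.
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]


def demo():
    # sqlite3 is a stand-in for a real PEP 249 connection (assumption for the demo).
    conn = sqlite3.connect(":memory:")
    try:
        cur = conn.cursor()
        cur.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
        cur.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
        conn.commit()
        return fetch_dicts(
            conn,
            "SELECT id, amount FROM orders WHERE amount > ? ORDER BY id",
            (10,),
        )
    finally:
        conn.close()


if __name__ == "__main__":
    print(demo())
```

The placeholder style (`?` here) depends on the driver's `paramstyle` attribute, so it should be checked for whichever driver is in use.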

SQL Clients and Database Management Tools

These tools connect via JDBC or the MySQL protocol and support standard SQL operations:

| Tool | Connection | Description | Reference |
| --- | --- | --- | --- |
| DBeaver | JDBC | Open-source and free, community edition is sufficient, suitable for daily queries and data exploration | DBeaver Connection Guide |
| DataGrip | JDBC | JetBrains product with strong code completion and SQL analysis | DataGrip Connection Guide |
| SQL Workbench/J | JDBC | Lightweight, basic SQL execution | SQL Workbench/J Connection Guide |
| Navicat | MySQL Protocol | Visual management with intuitive operations | Navicat Connection Guide |

Data Lake Formats

Lakehouse is built natively on Apache Iceberg: tables are stored in Iceberg format, which supports time travel, partition evolution, schema evolution, and cross-engine access. Delta Lake and Hudi formats are also supported via external tables:

| Format | Relationship | Description | Reference |
| --- | --- | --- | --- |
| Apache Iceberg | Native format | Underlying format for all Lakehouse tables, cross-engine access | Spark + Iceberg Analytics |
| Delta Lake | External table | Open table format from the Databricks ecosystem | Delta Lake External Table |
| Apache Hudi | External table | Open table format optimized for streaming writes | Hudi External Table |

Federated Queries: Query Iceberg tables in Hive, Databricks, and Snowflake OpenCatalog directly via External Catalog, without data migration. See Federated Query.


Modern Data Stack

The following solution combinations show how to build a complete data platform using Lakehouse and ecosystem tools:

| Solution | Toolchain | Reference |
| --- | --- | --- |
| ELT-oriented | Airbyte → Lakehouse → dbt → Metabase | ELT Modern Data Stack |
| Analytics-oriented | Lakehouse ← dbt → Superset | Analytics Modern Data Stack |
| BI + AI | Lakehouse + Zilliz | BI + AI Joint Solution |