Datus and Singdata Lakehouse Integration Overview
What is Datus
Datus is an open-source data engineering agent designed to build evolvable contextual environments for data systems. Datus represents a paradigm shift in data engineering: from the traditional approach of "building tables and data pipelines" to "providing domain-aware intelligent agents for analysts and business users."
[Figure: CLI quick overview]
[Figure: Web quick overview]
Core Components
Datus-CLI: An AI-driven command-line interface for data engineers, best thought of as "Claude Code for data engineers." Key features include:
- Interactive SQL Writing: Generate and optimize SQL queries through natural language
- Subagent Building: Create domain-specific intelligent agents (subagents)
- Context Building: Interactively build and evolve contextual knowledge for data systems
Datus-Chat: A web chatbot for data analysts that provides:
- Multi-turn Conversations: Continuous data exploration and analysis dialogue
- Feedback Mechanisms: Built-in feedback options such as likes, issue reports, and saved success cases
- User-friendly Interface: An interface optimized for non-technical users
Datus-API: A stable, accurate data service API for other agents or applications.
Technical Features
- Multi-AI Model Support: Integrates Qwen, DeepSeek, OpenAI, Claude, and other AI models
- Extensible Architecture: Supports MCP (Model Context Protocol) tool integration
- Multi-data Source Connectivity: Supports various database and data warehouse platforms
- Chinese Language Optimization: Specially optimized for Chinese language contexts and usage habits
Integration Architecture
Architecture Description
User Interface Layer:
- Datus-CLI: Provides a command-line interface for data engineers
- Datus-Chat: Provides a web interface for data analysts and business users
Datus Agent Core:
- AI Model Layer: Supports multiple large language models, allowing selection of the most suitable model based on task type
- Subagent Management: Dedicated subagents handle different business scenarios
- Context Management: Maintains the knowledge graph and query context of the data system
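The task-based model selection described above can be pictured as a simple lookup. The following is a minimal sketch only: the task-type names, the routing table, and the `select_model` function are hypothetical and are not taken from Datus's actual configuration; only the model family names come from this overview.

```python
# Hypothetical routing table: task type -> model family label.
# The task names and the mapping are assumptions for illustration;
# the overview does not describe how Datus configures this.
ROUTING_TABLE = {
    "sql_generation": "deepseek",
    "result_explanation": "qwen",
    "context_building": "claude",
}

# Fallback used when a task type has no explicit entry (also an assumption).
DEFAULT_MODEL = "openai"


def select_model(task_type: str) -> str:
    """Return the model label configured for a task type, or the default."""
    return ROUTING_TABLE.get(task_type, DEFAULT_MODEL)


if __name__ == "__main__":
    for task in ("sql_generation", "ad_hoc_question"):
        print(task, "->", select_model(task))
```

A static table keeps the choice auditable; a real deployment might instead read the mapping from a configuration file.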
Data Layer:
- Singdata Lakehouse: Provides data storage, computing, and SQL execution capabilities
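To make the data layer's role concrete, the sketch below shows the shape of a connector that offers SQL execution and metadata retrieval, the two capabilities this overview attributes to the Datus-Singdata connector. It is an illustration under stated assumptions: the class and method names are hypothetical, and Python's built-in `sqlite3` is used purely as a local stand-in so the example runs anywhere. It is not how Datus connects to Singdata Lakehouse.

```python
import sqlite3


class StandInConnector:
    """Hypothetical connector shape; sqlite3 stands in for the lakehouse."""

    def __init__(self) -> None:
        # In-memory database so the sketch needs no external service.
        self._conn = sqlite3.connect(":memory:")

    def execute(self, sql: str, params: tuple = ()) -> list[tuple]:
        """Run one SQL statement and return all result rows."""
        cursor = self._conn.execute(sql, params)
        rows = cursor.fetchall()
        self._conn.commit()
        return rows

    def list_tables(self) -> list[str]:
        """Metadata retrieval: names of all tables, sorted."""
        rows = self.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        return [name for (name,) in rows]

    def describe(self, table: str) -> list[tuple]:
        """Metadata retrieval: (column name, declared type) pairs."""
        return self.execute(
            "SELECT name, type FROM pragma_table_info(?)", (table,)
        )


connector = StandInConnector()
connector.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
connector.execute("INSERT INTO orders VALUES (1, 9.5), (2, 20.0)")
print(connector.list_tables())
print(connector.describe("orders"))
print(connector.execute("SELECT SUM(amount) FROM orders"))
```

An agent typically needs the metadata calls first, to learn which tables and columns exist, before it can generate SQL for the `execute` call.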
Tool Extension Layer:
- Singdata Lakehouse MCP Server: The official MCP server from Singdata Lakehouse, which exposes advanced management and analysis tools over the standardized Model Context Protocol
Connection Relationships
- Datus <-> Singdata Lakehouse: Connected through the Datus-Singdata connector, which handles SQL query execution and metadata retrieval.
- Datus <-> Singdata Lakehouse MCP Server: Connected via MCP; Datus invokes the server's advanced management and analysis tools.
- Singdata Lakehouse MCP Server <-> Singdata Lakehouse: The MCP Server acts as an extension service for Singdata Lakehouse and can access and manage the underlying data platform.
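For readers unfamiliar with MCP, a tool invocation travels as a JSON-RPC 2.0 request whose method is `tools/call`. The sketch below builds such a request in Python. The tool name `list_job_history` and its arguments are hypothetical placeholders; the actual tool names exposed by the Singdata Lakehouse MCP Server are not listed in this overview.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects each request to carry an id.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tool invocation as a JSON-RPC 2.0 request."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical tool name and arguments, for illustration only.
payload = build_tool_call("list_job_history", {"limit": 10})
print(payload)
```

In practice an MCP client library handles this framing and the transport; the sketch only shows what crosses the wire between Datus and the MCP server.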
Integration Value
Datus + Singdata Lakehouse
Singdata Lakehouse is a modern data lakehouse platform with powerful data processing and storage capabilities. Integrating it with Datus brings the following benefits:
- Lower the Barrier to Entry: Business users can directly query and analyze massive datasets without learning SQL
- Improve Analysis Efficiency: Natural language queries significantly reduce the time cost of data exploration
- Intelligent Insights: AI-driven query optimization and result interpretation help users better understand data
- Chinese-friendly: Optimized for Chinese-language contexts and the usage habits of Chinese-speaking users
Datus + Singdata Lakehouse MCP Server
Integrating the official Singdata Lakehouse MCP Server extends the system's capabilities further:
- Instance Management: Intelligently switch between different Singdata Lakehouse instances and environments
- Job Monitoring: Query and analyze SQL job execution history and performance metrics
- System Operations: Query system status and manage configuration through natural language
- Advanced Analytics: Use specialized analysis tools for deeper data insights
- Workflow Automation: Encapsulate complex data processing workflows as simple natural language instructions
Use Cases
- Data Analysts: Quickly explore and analyze business data, generate reports and insights
- Business Users: Users without technical backgrounds can easily query the data they need
- Data Engineers: Perform system management and job monitoring through MCP tools
- Decision Makers: Quickly access key business metrics and trend analysis
Last updated: November 2025
