Federation Query
Federation Query lets you query data in external data systems (Hive, Databricks, Iceberg, Snowflake, etc.) directly using standard SQL — no data migration or copying required. By creating an EXTERNAL CATALOG, you map an external data catalog into Lakehouse for unified cross-system queries.
Supported External Data Sources
| External System | Connection Method | Typical Use Case |
|---|---|---|
| Apache Hive | Hive Metastore URIs | In-place acceleration of existing Hive warehouses, replacing Presto/Trino |
| Databricks Unity Catalog | Databricks API | Cross-platform federated analytics without moving Databricks data |
| Iceberg REST Catalog | Iceberg REST API | Query any data lake compatible with the Iceberg REST protocol |
| Snowflake Open Catalog | Iceberg REST API + OAuth | Access Iceberg tables managed by Snowflake |
Core Concepts
There are three independent approaches to accessing external data, each suited to different scenarios:
External Catalog (Recommended)
Maps an external data system (Hive/Databricks/Snowflake/Iceberg REST) as a top-level catalog; the Schemas and Tables underneath automatically correspond to the external system's structure:
- Queries use three-level naming: external_catalog.schema.table
- Supports: Hive, Databricks Unity Catalog, Iceberg REST (including Snowflake Open Catalog)
External Schema (Standalone)
Mounts an external Hive database directly into the current Workspace's internal Catalog, without going through an External Catalog. Tables are addressed with two-level naming, schema.table (equivalent to <current internal catalog>.schema.table). Because the mapping goes directly to the Hive Metastore (HMS), every table in the database is immediately queryable, and newly added tables become visible automatically, with no per-table definitions required.
Supports: Hive (OSS/COS/GCS/HDFS). Read-only; DML is not supported.
External Table (Standalone)
Creates a single table that points to external storage, placed under an ordinary Schema in the current Workspace's internal Catalog and addressed with two-level naming, schema.table. Unlike External Schema, an External Table lets you customize column names and types, and it supports renaming and comment modification.
Supports: Kafka, Delta Lake, Hudi. Read-only; DML is not supported.
Comparison of the Three Approaches
| | External Catalog | External Schema (Standalone) | External Table (Standalone) |
|---|---|---|---|
| Catalog location | Independent external Catalog | Current Workspace's internal Catalog | Current Workspace's internal Catalog |
| Naming | Three-level catalog.schema.table | Two-level schema.table | Two-level schema.table |
| Use case | Cross-platform federated analytics | Mount an entire Hive database into the workspace | Custom mapping for a single external table |
| Supported sources | Hive, Databricks, Iceberg REST, Snowflake | Hive | Kafka, Delta Lake, Hudi |
| Schema definition | Auto-mapped from external system | Auto-mapped from HMS | Manually define column names and types |
| New external tables visible | Requires re-mapping | Automatically visible | Must be created one by one |
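The naming difference between the three approaches can be sketched with plain SELECT statements. This is a minimal illustration, not output from a real environment: every object name (`my_hive_catalog`, `sales`, `orders`, `hive_sales`, `staging`, `kafka_events`) is a hypothetical placeholder, and the sketch assumes the engine accepts `LIMIT`.

```sql
-- All object names below are hypothetical placeholders.

-- External Catalog: three-level naming (catalog.schema.table).
SELECT * FROM my_hive_catalog.sales.orders LIMIT 10;

-- Standalone External Schema: two-level naming (schema.table),
-- resolved in the current Workspace's internal Catalog.
SELECT * FROM hive_sales.orders LIMIT 10;

-- Standalone External Table: also two-level naming,
-- under an ordinary Schema of the internal Catalog.
SELECT * FROM staging.kafka_events LIMIT 10;
```

Only the External Catalog form needs the catalog prefix; the two standalone forms are resolved against the current Workspace's internal Catalog.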
Selection guide: External Catalog vs External Schema
Quick Start
An External Catalog depends on a pre-created Catalog Connection. The objects are created in this order: Storage Connection → Catalog Connection → External Catalog. For the complete configuration steps, see the External Object User Guide.
Once configured, query the external tables with standard SQL.
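As a rough sketch of a cross-system query, the statement below joins a table reached through an External Catalog with a table stored in the Workspace itself. The catalog, schema, table, and column names (`my_hive_catalog`, `sales.orders`, `crm.customers`, `customer_id`, `customer_name`) are assumptions made for illustration and do not come from any real deployment.

```sql
-- Hypothetical names throughout:
--   my_hive_catalog.sales.orders : external table, three-level naming
--   crm.customers                : internal table in the current Workspace
SELECT c.customer_name,
       COUNT(*) AS order_count
FROM my_hive_catalog.sales.orders AS o
JOIN crm.customers AS c
  ON o.customer_id = c.customer_id
GROUP BY c.customer_name
ORDER BY order_count DESC
LIMIT 10;
```

The external table is referenced by its three-level name and the internal table by its two-level name, so no data has to be copied before the join.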
This Section
| Page | Description |
|---|---|
| External Object User Guide | Complete operations for creating, querying, and managing External Catalog / Schema / Table |
| Query Snowflake OpenCatalog Iceberg Tables | Federated queries on Snowflake-managed Iceberg data via Iceberg REST API |
| Databricks Unity Catalog Federation Query Practice | Full step-by-step setup guide with verified results and common error troubleshooting |
Related Documentation
- External Catalog Overview — Feature introduction, supported data sources, permission details
- External Catalog Federation Query Guide — Detailed operation examples and architecture principles
- Create External Catalog — DDL syntax reference
- Create Hive Catalog — Hive connection configuration details
- In-Place Lake Acceleration Guide — Complete guide for replacing Spark/Hive and Presto/Trino without moving data
- Databricks Unity Catalog Federation Query Practice — Full step-by-step setup guide with verified results
