Lakehouse Data Share — Cross-Account Data Sharing Getting Started Guide
1. Solution Overview
1.1 What is Data Share
Data Share is a zero-copy data sharing feature provided by Lakehouse, enabling cross-account or cross-service-instance data authorization and usage within the same service region. Unlike traditional data replication or import solutions, Data Share does not require manually building data sync links between enterprises. Instead, it directly provides read-only access to source data through "authorization," with real-time updates.
1.2 Core Advantages
- Zero-Copy, Low Latency: No need for duplicate storage or periodic syncing between enterprises; data changes can be propagated to consumers in real time.
- Secure and Controllable: Data is shared only in "read-only" form, and shared objects can be removed through permission control at any time.
- Easy to Scale: Use Views to flexibly filter data columns or rows, achieving fine-grained sharing control.
- Reduced Costs: Consumers do not need to pay for the storage of shared data; they only need to use their own compute resources to process the data.
1.3 Use Cases
- Real-Time Sharing: Company A shares transaction logs, user behavior data, etc., in real time with Company B, allowing Company B to query in real time without data sync.
- Multi-Party Collaborative Analysis: Multiple partners analyze based on the same real-time data, reducing duplicate storage and transfer costs.
- Data Provider-Consumer Model: Data providers open partial data to partners, while consumers obtain read-only data without bearing storage costs.
2. How It Works
The core working principle of Data Share is:
- Data Remains with the Provider: Data is always stored in the data provider's account.
- Share Metadata: Metadata and access permissions are shared with the consumer.
- Access Control: Consumers query data through controlled read-only access permissions.
- Real-Time Updates: After the provider's data is updated, consumers can immediately access the latest data.
Data Sharing Flow:

The data provider retains full control and can update or remove shared data at any time. When source data is updated, the consumer immediately obtains the latest data without additional data sync operations.
3. Prerequisites
- Both the data provider and the data consumer must have Lakehouse instances.
- Both parties must be within the same service region.
- The data provider must have the "Instance Admin" (instance_admin) role permission.
- Data sharing operations require the consumer to provide their service instance name.
4. Hands-On Guide: Data Provider
4.1 Creating a Share Object
The data provider (data owner) needs to execute the following steps to create and configure a share object:
4.1.1 SQL Approach
4.1.2 Web UI Approach
- Log in to the Lakehouse console with an account that has the "Instance Admin" role.
- In the left navigation, select Data Management -> Data Share, and click + New Share.
- Fill in the share name (e.g., taxi_data_share) and select the workspace.
- Click Add to select the data object to share (e.g., taxi_zone_lookup table).
- Under Receiving Instances, click Add and enter the consumer's service instance name.
- Click Confirm to complete creation.

4.1.3 Creating a View for Partial Data Sharing
If you only need to share part of a table's data, you can first create a view and then share the view:
4.2 Verifying the Share Configuration
After creation and configuration, you can verify that the share is correctly configured using the following commands:
Example output:
5. Hands-On Guide: Data Consumer
The data consumer (receiver) needs to execute the following steps to access shared data:
5.1 Viewing Available Shares
5.1.1 SQL Approach
Example output (viewed from the consumer instance):
5.1.2 Web UI Approach
- Log in to the data consumer's Lakehouse console.
- In the left navigation, select Data Management -> Data Share.
- Switch to the Shared with Me tab to see all received shares.

5.2 Creating a Local Schema Linked to Shared Data
5.2.1 SQL Approach
Where:
shared_taxi_data: The schema name created locally (customizable).source_instance_name: The provider's instance name.taxi_data_share: The share object name.source_schema: The provider's source schema name.
5.2.2 Web UI Approach
- Find the target share object in the Shared with Me tab.
- Click the Extract button.
- In the popup, select the Source Schema and specify the schema name to be created locally.
- Click Confirm to complete the extraction.

5.3 Using Shared Data
After creating the schema, the shared data appears in the local schema in read-only mode:
6. Complete Example: Cross-Account Sharing of Taxi Zone Data
6.1 Background Scenario
Company A operates New York City taxi services and maintains the taxi_zone_lookup table, containing all zone information. Company B is an analytics company that needs access to this zone data for analysis but does not need to copy the data.
6.2 Implementation Steps
6.2.1 View Source Table Data
First, Company A views the structure of the table to be shared:
The result shows the table structure as follows:
The table contains 265 rows of data, recording NYC taxi zone information.
6.2.2 Create the Share Object
Company A creates and configures the share object:
6.2.3 Verify the Share Configuration
Company A verifies that the share configuration is correct:
The output result shows:
The output includes the newly created share:
6.2.4 Company B Accesses the Shared Data
Company B (instance target_instance_name) performs the following operations to use the shared data:
Company B can query results and create reports without copying data. When Company A updates the original table data, Company B immediately sees the latest data.
7. Best Practices
7.1 Security Best Practices
- Use Views to Restrict Sensitive Data: Filter rows or columns through views to ensure that only necessary data is shared.
- Regularly Audit Share Objects: Use the
SHOW SHARESandDESC SHAREcommands to regularly check share contents. - Principle of Least Privilege: Grant only the necessary permissions (SELECT and READ METADATA).
- Do Not Share Credentials: Strictly prohibit sharing administrator accounts; use Data Share for secure data sharing.
7.2 Performance and Cost Optimization
- Use Filtered Views: Limit data volume through views to improve query performance.
- Pay Attention to Timezone Settings: Ensure that the provider and consumer use the same timezone settings to avoid time-related query issues.
8. FAQ and Troubleshooting
Issue 1: Unable to See Shared Data
Possible Causes:
- Incorrect permission configuration, such as not granting the
READ METADATApermission. - Incorrect instance name entry.
Solutions:
- The provider should use the
SHOW GRANTS TO SHAREcommand to check whether the objects in the share have been granted bothSELECTandREAD METADATApermissions. - Verify that the instance name is correct.
Issue 2: Unable to Query Shared Data
Possible Causes:
- Missing SELECT permission.
Solutions:
