Data Sharing

What Is Data Sharing

Data sharing is a Singdata Lakehouse feature that lets one service instance share tables or views with other service instances in real time, without copying data.

Traditional data sharing requires copying data from one system to another, which not only consumes storage resources but also requires maintaining synchronization pipelines, resulting in data update latency.

Lakehouse data sharing uses a zero-copy approach: the data provider creates a Share object and authorizes it, and the consumer creates a read-only schema via CREATE SCHEMA FROM SHARE to directly access the original data. Data changes are reflected instantly without synchronization.

Core Principles

No data movement: Consumers directly read the provider's original data when querying
Real-time visibility: Provider data changes are immediately visible to consumers, with no synchronization step
Controllable permissions: Providers can add/remove shared data objects or consumer instances at any time
Read-only access: Consumers can only query; they cannot modify, delete, or re-share

Comparison with Data Synchronization

Dimension	Data Sharing	Data Synchronization
Data Copy	No copying, direct access to original data	Data copied to consumer side
Timeliness	Real time; provider changes are immediately visible	Delayed; depends on sync frequency
Storage Cost	Consumer does not incur storage cost	Consumer incurs additional storage cost
Compute Cost	Consumer uses own compute resources for queries	Sync task consumes compute resources
Data Modification	Consumer read-only, cannot modify	Consumer can modify copied data
Use Case	Cross-team/cross-company data collaboration	Data migration, independent analysis

Core Concepts

Share Object

A Share is the carrier for data sharing, containing:

Data objects: Tables or views to be shared (up to 1,000)
Recipient instances: Consumer service instances authorized for access
Permissions: SELECT (query data) and READ METADATA (view metadata)

Provider

The data owner, responsible for:

Creating a Share object
Adding data objects to share into the Share
Configuring recipient instances (consumer's service instance name)
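
The provider steps above can be sketched in SQL. This is a minimal sketch, not an authoritative reference: the statement names (CREATE SHARE, GRANT TO SHARE, ALTER SHARE ADD INSTANCE) come from the operation flow in this document, but the exact clause order is an assumption, and sales_share, sales_schema.orders, and consumer_instance are hypothetical names.

```sql
-- Run on the provider instance.
-- All object names are hypothetical; clause syntax is assumed,
-- so verify it against the Lakehouse SQL reference before use.

-- 1. Create the Share object.
CREATE SHARE sales_share;

-- 2. Add a table to the Share by granting the two share permissions.
GRANT SELECT, READ METADATA ON TABLE sales_schema.orders TO SHARE sales_share;

-- 3. Authorize the consumer's service instance as a recipient.
ALTER SHARE sales_share ADD INSTANCE consumer_instance;
```

Granting both SELECT and READ METADATA mirrors the two permissions listed for a Share: one for querying data, one for viewing metadata.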

Consumer

The data user, responsible for:

Viewing shared Share objects
Executing CREATE SCHEMA FROM SHARE to create a read-only schema
Querying the shared data
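
The consumer steps above might look like the following. Again a sketch under assumptions: the statement names come from this document, while the provider_instance.sales_share qualifier, the shared_sales schema name, and the orders table are hypothetical.

```sql
-- Run on the consumer instance.
-- Names and the <provider instance>.<share> qualifier are assumptions.

-- 1. List the Shares that providers have made available to this instance.
SHOW SHARES;

-- 2. Inspect the objects contained in one Share.
DESC SHARE provider_instance.sales_share;

-- 3. Create a read-only schema backed by the Share (no data is copied).
CREATE SCHEMA shared_sales FROM SHARE provider_instance.sales_share;

-- 4. Query the provider's data using the consumer's own compute resources.
SELECT COUNT(*) FROM shared_sales.orders;
```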

Typical Application Scenarios

Scenario 1: Cross-enterprise Data Collaboration

Company A needs to provide sales data to partner Company B for analysis:

Company A creates a Share and adds the sales table to it
Company A adds Company B's service instance as a recipient
Company B runs CREATE SCHEMA FROM SHARE and can then query Company A's sales data in real time
Neither company needs to build or maintain a costly real-time synchronization pipeline

Scenario 2: Master Data Sharing Within an Enterprise

Corporate headquarters shares unified master data (customers, products, regions) with various business departments:

Headquarters creates a Share containing all master data tables
Adds each business department's service instance as a recipient
Each department gets the latest master data in real time without maintaining it separately

Scenario 3: Data Products as a Service

A data service provider offers processed data products to clients:

Creates a Share containing data product tables
Adds recipient instances by client
Clients access the data product in real time; the provider can control the access scope at any time
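
Narrowing the access scope could look like the sketch below. Neither statement appears in this document: REVOKE ... FROM SHARE and ALTER SHARE ... REMOVE INSTANCE are assumed counterparts of GRANT TO SHARE and ALTER SHARE ADD INSTANCE, and every object name is hypothetical. Confirm the actual syntax in the SQL reference.

```sql
-- Run on the provider instance. Syntax and names are assumptions.

-- Stop sharing one table while leaving the rest of the Share intact.
REVOKE SELECT, READ METADATA ON TABLE products.customer_scores FROM SHARE data_product_share;

-- Withdraw access from a single client without affecting other recipients.
ALTER SHARE data_product_share REMOVE INSTANCE client_a_instance;
```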

Notes

Share scope limit: A Share can contain at most 1,000 tables or views
Partial data sharing: To share a subset of data in a table, create a View first and then share the View
Secondary sharing prohibited: Shared data cannot be re-shared to other instances
Data copy risk: Consumers can copy data via CREATE TABLE AS SELECT; providers should carefully choose the sharing scope
Workspace isolation: Share objects can only contain data objects from the same workspace
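
The partial-sharing and data-copy notes above can be illustrated together. This is a sketch with hypothetical names; CREATE VIEW and CREATE TABLE AS SELECT are standard SQL, while the GRANT ... ON VIEW ... TO SHARE form is an assumption based on the GRANT TO SHARE step in this document.

```sql
-- Provider: expose only selected columns and rows through a view.
-- The view must live in the same workspace as the Share.
CREATE VIEW sales_schema.orders_east AS
SELECT order_id, order_date, amount
FROM sales_schema.orders
WHERE region = 'east';

-- Share the view instead of the underlying table (assumed syntax).
GRANT SELECT, READ METADATA ON VIEW sales_schema.orders_east TO SHARE sales_share;

-- Consumer: read-only access does not prevent materializing a private copy,
-- which is why the provider should limit the scope up front.
CREATE TABLE local_schema.orders_copy AS
SELECT * FROM shared_sales.orders_east;
```

Once the copy exists, it no longer reflects provider changes and is outside the provider's control, so removing the consumer from the Share later does not remove the copied rows.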

Operation Flow

Provider: CREATE SHARE --> GRANT TO SHARE --> ALTER SHARE ADD INSTANCE
Consumer: SHOW SHARES --> DESC SHARE --> CREATE SCHEMA FROM SHARE --> SELECT
