Build a Healthcare Operations Data Warehouse

Integrate data from Hospital Information Systems (HIS), Electronic Medical Records (EMR), and laboratory systems to build a healthcare operations data warehouse that outputs department performance metrics and hospital-level KPI dashboards. Using Q1 2025 California hospital admission data as the dataset, this document provides an end-to-end demonstration of the complete build process from MySQL offline sync → ODS → DWD → DWS → ADS, and covers the practical implementation of three key capabilities: Column Masking, RBAC tiered permissions, and Time Travel compliance auditing.


Overview

Core challenges for healthcare data warehouses: patient privacy protection, multi-source system integration, automated computation of operational metrics, and historical data traceability for medical insurance reconciliation.

ChallengeSingdata Solution
HIS/EMR data synced daily in full or incrementallyMySQL multi-table offline sync (full database mirror mode), automatic scheduling
Importing HL7/JSON format lab report filesCOPY INTO + Volume, supporting JSON format batch loading
ODS→DWD→DWS→ADS automatic incremental computationDynamic Table, declarative SQL, system automatically maintains refresh dependency chain
Masking of PII fields such as patient ID and diagnostic informationColumn Masking, bound to columns, dynamically controls display by role
Tiered permissions for clinicians, management, and BI analystsRBAC custom roles, fine-grained Schema/Table-level authorization
Medical insurance reconciliation and compliance auditing require historical version traceabilityTime Travel, DESC HISTORY + TIMESTAMP AS OF to query any historical version

SQL Commands Used

Command / FunctionPurposeNotes
CREATE TABLECreate ODS layer raw tables and dimension tablesRegular tables, upstream for Dynamic Tables
CREATE DYNAMIC TABLECreate DWD / DWS / ADS layer incremental computation tablesDeclarative SQL, system automatically refreshes incrementally
REFRESH DYNAMIC TABLEManually trigger a single refreshUsed during initial build or debugging
CREATE FUNCTIONCreate Column Masking policy functionSQL UDF, dynamically masks patient PII fields by role
ALTER TABLE ... CHANGE COLUMN ... SET MASKBind the masking function to a columnTransparent to the bound column; dynamically executed at read time
CREATE ROLECreate custom RBAC rolesDifferentiate permissions for clinical, management, and BI user types
GRANTGrant roles data access permissionsSchema-level or table-level authorization
DESC HISTORYView table historical version listReturns timestamp, operation type, and row change count for each version
SELECT ... TIMESTAMP AS OFQuery historical data at a specified point in timeMedical insurance reconciliation, auditing, and data recovery scenarios

Prerequisites

All examples in this document run under the best_practice_healthcare_dw schema.

CREATE SCHEMA IF NOT EXISTS best_practice_healthcare_dw;


ODS Layer: Raw Business Data Ingestion

The ODS layer stores raw data synchronized from HIS, EMR, and laboratory systems without business transformation, while configuring Column Masking on patient privacy fields.

Create Tables

Hospital dimension table (static reference data):

CREATE TABLE IF NOT EXISTS best_practice_healthcare_dw.doc_hospital_dim ( hospital_id STRING, hospital_name STRING, county STRING, city STRING, hospital_type STRING, bed_count INT, is_teaching BOOLEAN );

Admission record main table (HIS offline sync target table):

CREATE TABLE IF NOT EXISTS best_practice_healthcare_dw.doc_ods_admissions ( admission_id STRING, hospital_id STRING, patient_id STRING, age INT, sex STRING, race STRING, county_of_residence STRING, admission_date DATE, discharge_date DATE, los_days INT, admission_type STRING, admission_source STRING, discharge_disposition STRING, principal_diagnosis_code STRING, principal_diagnosis_desc STRING, major_diagnostic_category STRING, department STRING, payer_type STRING, total_charges DECIMAL(12,2), drg_code STRING, drg_description STRING, load_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP() );

Laboratory results table (LIS system JSON file COPY INTO target table):

CREATE TABLE IF NOT EXISTS best_practice_healthcare_dw.doc_ods_lab_results ( lab_id STRING, admission_id STRING, patient_id STRING, hospital_id STRING, test_name STRING, test_code STRING, result_value STRING, reference_range STRING, abnormal_flag STRING, collection_time TIMESTAMP, result_time TIMESTAMP, ordering_physician STRING, department STRING, load_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP() );

MySQL Multi-Table Offline Sync Configuration

HIS systems typically run on MySQL and contain multiple business tables including admissions, billing, and procedures. In Lakehouse Studio, use a multi-table offline sync task to implement daily full + incremental sync:

  • Sync mode: Full database mirror (full database sync), automatically mapping all business tables in the HIS database to the best_practice_healthcare_dw schema
  • Scheduling policy: run incremental sync at 02:00 daily; run full sync on first run
  • Studio task path: best_practices/healthcare_dw/

COPY INTO to Import HL7/JSON Format Lab Reports

Laboratory systems (LIS) typically output lab report files in JSON or HL7 format stored in object storage (OSS/S3), loaded periodically via COPY INTO + Volume. First create the object storage access credentials and External Volume, then execute COPY INTO:

-- Step 1: Create a Storage Connection pointing to object storage -- Alibaba Cloud OSS example; for AWS S3 / Tencent Cloud COS, replace TYPE, ENDPOINT, and credential fields accordingly. CREATE STORAGE CONNECTION IF NOT EXISTS lab_reports_oss_conn TYPE oss ENDPOINT = 'oss-cn-hangzhou.aliyuncs.com' ACCESS_ID = '<your_access_key_id>' ACCESS_KEY = '<your_access_key_secret>'; -- Step 2: Mount the directory containing LIS lab reports as an External Volume CREATE EXTERNAL VOLUME IF NOT EXISTS best_practice_healthcare_dw.lab_reports_volume LOCATION 'oss://<your-bucket>/lis/lab-reports/' USING CONNECTION lab_reports_oss_conn DIRECTORY = (enable = true, auto_refresh = true) RECURSIVE = true;

After mounting, batch import JSON lab reports from lab_reports_volume:

COPY INTO best_practice_healthcare_dw.doc_ods_lab_results FROM VOLUME best_practice_healthcare_dw.lab_reports_volume USING json ON_ERROR = 'CONTINUE';


ODS Layer: Column Masking for Patient PII

Patient IDs and diagnostic information in medical data are Protected Health Information (PHI) that needs to be dynamically masked based on the accessor's role. Singdata Lakehouse binds masking policies at the column level through Column Masking, transparently shielding sensitive fields from non-privileged users.

Create Masking Policy Functions

Patient ID masking: non-workspace_admin roles only see the first and last characters, with the middle replaced by *.

CREATE OR REPLACE FUNCTION best_practice_healthcare_dw.mask_patient_id(pid STRING) RETURNS STRING AS CASE WHEN array_contains(current_roles(), 'workspace_admin') THEN pid ELSE CONCAT(SUBSTR(pid, 1, 1), REPEAT('*', LENGTH(pid) - 2), SUBSTR(pid, LENGTH(pid), 1)) END;

Diagnosis masking: only workspace_admin and workspace_dev roles can see the full diagnosis; other roles see RESTRICTED.

CREATE OR REPLACE FUNCTION best_practice_healthcare_dw.mask_diagnosis(diag STRING) RETURNS STRING AS CASE WHEN array_contains(current_roles(), 'workspace_admin') OR array_contains(current_roles(), 'workspace_dev') THEN diag ELSE 'RESTRICTED' END;

Bind Masking Policies to Columns

-- Bind patient ID masking ALTER TABLE best_practice_healthcare_dw.doc_ods_admissions CHANGE COLUMN patient_id SET MASK best_practice_healthcare_dw.mask_patient_id; -- Bind diagnosis masking ALTER TABLE best_practice_healthcare_dw.doc_ods_admissions CHANGE COLUMN principal_diagnosis_desc SET MASK best_practice_healthcare_dw.mask_diagnosis;

Query result for workspace_admin user after binding (full data):

admission_id | patient_id | principal_diagnosis_code | principal_diagnosis_desc | payer_type -------------|------------|--------------------------|------------------------------|------------ ADM001 | P10001 | I21.9 | Acute myocardial infarction | Medicare ADM002 | P10002 | K92.1 | Melena | Commercial ADM003 | P10003 | J18.9 | Pneumonia unspecified | Medicare

Regular BI analyst queries the same table after masking takes effect:

admission_id | patient_id | principal_diagnosis_code | principal_diagnosis_desc | payer_type -------------|------------|--------------------------|--------------------------|------------ ADM001 | P****1 | I21.9 | RESTRICTED | Medicare ADM002 | P****2 | K92.1 | RESTRICTED | Commercial ADM003 | P****3 | J18.9 | RESTRICTED | Medicare


DWD Layer: Patient Visit Event Wide Table

The DWD layer uses a Dynamic Table to join ODS layer admission records with the hospital dimension table, while deriving fields such as age group and discharge category, forming an analysis-friendly patient visit event wide table.

Create Tables

CREATE DYNAMIC TABLE best_practice_healthcare_dw.doc_dwd_patient_visits REFRESH INTERVAL 60 MINUTE VCLUSTER DEFAULT AS SELECT a.admission_id, a.hospital_id, h.hospital_name, h.county AS hospital_county, h.hospital_type, h.is_teaching, a.patient_id, a.age, CASE WHEN a.age < 18 THEN 'Pediatric' WHEN a.age < 45 THEN 'Adult' WHEN a.age < 65 THEN 'Middle-Aged' ELSE 'Senior' END AS age_group, a.sex, a.race, a.county_of_residence, a.admission_date, a.discharge_date, a.los_days, a.admission_type, a.admission_source, a.discharge_disposition, a.principal_diagnosis_code, a.principal_diagnosis_desc, a.major_diagnostic_category, a.department, a.payer_type, a.total_charges, a.drg_code, a.drg_description, CASE WHEN a.discharge_disposition IN ('Home', 'Home Health') THEN 'Routine' WHEN a.discharge_disposition = 'SNF' THEN 'Extended Care' ELSE 'Other' END AS discharge_category, CASE WHEN a.los_days > 7 THEN 1 ELSE 0 END AS is_long_stay, a.load_time FROM best_practice_healthcare_dw.doc_ods_admissions a LEFT JOIN best_practice_healthcare_dw.doc_hospital_dim h ON a.hospital_id = h.hospital_id;

Lab report event wide table, calculating lab turnaround time (TAT):

CREATE DYNAMIC TABLE best_practice_healthcare_dw.doc_dwd_lab_events REFRESH INTERVAL 60 MINUTE VCLUSTER DEFAULT AS SELECT l.lab_id, l.admission_id, l.patient_id, l.hospital_id, h.hospital_name, l.test_name, l.test_code, l.result_value, l.reference_range, l.abnormal_flag, l.collection_time, l.result_time, TIMESTAMPDIFF(MINUTE, l.collection_time, l.result_time) AS tat_minutes, l.ordering_physician, l.department FROM best_practice_healthcare_dw.doc_ods_lab_results l LEFT JOIN best_practice_healthcare_dw.doc_hospital_dim h ON l.hospital_id = h.hospital_id;

REFRESH DYNAMIC TABLE best_practice_healthcare_dw.doc_dwd_patient_visits; REFRESH DYNAMIC TABLE best_practice_healthcare_dw.doc_dwd_lab_events;

DWD patient visit wide table query results:

admission_id | hospital_name | age_group | department | los_days | discharge_category | is_long_stay -------------|----------------------------|--------------|---------------|----------|--------------------|------------- ADM001 | Cedars-Sinai Medical Center | Senior | Cardiology | 4 | Routine | 0 ADM003 | UCSF Medical Center | Senior | Pulmonology | 7 | Extended Care | 0 ADM008 | Stanford Health Care | Senior | Neurology | 8 | Extended Care | 1

DWD lab event wide table, TAT column reflects the minutes from sample collection to result delivery:

lab_id | hospital_name | test_name | abnormal_flag | tat_minutes -------|----------------------------|------------------|---------------|------------ LAB001 | Cedars-Sinai Medical Center | Complete Blood Count | H | 150 LAB002 | Cedars-Sinai Medical Center | Troponin I | H | 45 LAB007 | Huntington Hospital | BNP | H | 60 LAB011 | Cedars-Sinai Medical Center | ABG | H | 30

Distribution of abnormal lab results by department (abnormal_flag IN ('H', 'A')):

department | abnormal_results | avg_tat_min --------------|------------------|------------ Cardiology | 5 | 111 Neurology | 3 | 80 Orthopedics | 2 | 120 Pulmonology | 1 | 120 Nephrology | 1 | 90

Cardiology has the most abnormal results (5), with an average turnaround time of 111 minutes; ICU is the fastest (30 minutes).


DWS Layer: Department Monthly Performance Aggregation

The DWS layer aggregates by month × hospital × department dimensions to generate department operational performance metrics.

Create Table

CREATE DYNAMIC TABLE best_practice_healthcare_dw.doc_dws_dept_monthly REFRESH INTERVAL 60 MINUTE VCLUSTER DEFAULT AS SELECT DATE_TRUNC('month', admission_date) AS admission_month, hospital_id, hospital_name, department, major_diagnostic_category, COUNT(*) AS total_admissions, COUNT(DISTINCT patient_id) AS unique_patients, ROUND(AVG(los_days), 2) AS avg_los_days, MAX(los_days) AS max_los_days, SUM(CASE WHEN is_long_stay = 1 THEN 1 ELSE 0 END) AS long_stay_count, ROUND(SUM(CASE WHEN is_long_stay = 1 THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 1) AS long_stay_pct, ROUND(SUM(total_charges), 2) AS total_charges, ROUND(AVG(total_charges), 2) AS avg_charges, SUM(CASE WHEN admission_type = 'Emergency' THEN 1 ELSE 0 END) AS emergency_count, SUM(CASE WHEN payer_type = 'Medicare' THEN 1 ELSE 0 END) AS medicare_count, SUM(CASE WHEN payer_type = 'Medicaid' THEN 1 ELSE 0 END) AS medicaid_count, SUM(CASE WHEN payer_type = 'Commercial' THEN 1 ELSE 0 END) AS commercial_count FROM best_practice_healthcare_dw.doc_dwd_patient_visits GROUP BY 1, 2, 3, 4, 5;

REFRESH DYNAMIC TABLE best_practice_healthcare_dw.doc_dws_dept_monthly;

Query January department performance (sorted by total charges descending):

SELECT department, total_admissions, avg_los_days, long_stay_pct, total_charges, emergency_count, medicare_count FROM best_practice_healthcare_dw.doc_dws_dept_monthly WHERE admission_month = '2025-01-01' ORDER BY total_charges DESC LIMIT 6;

department | total_admissions | avg_los_days | long_stay_pct | total_charges | emergency_count | medicare_count --------------|------------------|--------------|---------------|---------------|-----------------|--------------- ICU | 1 | 7.00 | 0.0 | 96800.00 | 1 | 1 Neurology | 1 | 8.00 | 100.0 | 89400.00 | 1 | 1 Orthopedics | 1 | 7.00 | 0.0 | 74500.00 | 1 | 1 Pulmonology | 1 | 7.00 | 0.0 | 62100.00 | 1 | 1 Cardiology | 1 | 6.00 | 0.0 | 58300.00 | 1 | 1 Gastroenterology | 1 | 6.00 | 0.0 | 52700.00 | 1 | 1

ICU and Neurology have the highest average charges per case; both departments also have 100% emergency admission rates, reflecting that high-complexity cases are concentrated in these departments.

Distribution by major diagnostic category (full quarter):

SELECT major_diagnostic_category, COUNT(*) AS admission_count, ROUND(AVG(los_days), 2) AS avg_los, ROUND(AVG(total_charges), 0) AS avg_charges FROM best_practice_healthcare_dw.doc_dwd_patient_visits GROUP BY major_diagnostic_category ORDER BY admission_count DESC LIMIT 8;

major_diagnostic_category | admission_count | avg_los | avg_charges --------------------------|-----------------|---------|------------ Circulatory | 8 | 5.13 | 50913 Nervous System | 8 | 5.50 | 62638 Digestive | 7 | 2.71 | 28800 Kidney | 6 | 4.33 | 39050 Respiratory | 6 | 6.00 | 60883 Reproductive | 2 | 2.50 | 30600 Endocrine | 2 | 2.00 | 16000 ENT | 2 | 1.00 | 10000

Circulatory and Nervous System diseases tied for the highest admission volume. Nervous System diseases have the highest average hospital charges ($62,638); Respiratory diseases have the longest average hospital stay (6 days).

Analysis by payer mix:

SELECT payer_type, COUNT(*) AS admission_count, ROUND(AVG(los_days), 2) AS avg_los, ROUND(SUM(total_charges), 0) AS total_revenue, ROUND(AVG(total_charges), 0) AS avg_charges FROM best_practice_healthcare_dw.doc_dwd_patient_visits GROUP BY payer_type ORDER BY admission_count DESC;

payer_type | admission_count | avg_los | total_revenue | avg_charges ------------|-----------------|---------|---------------|------------ Medicare | 25 | 6.20 | 1588600 | 63544 Commercial | 16 | 1.63 | 302000 | 18875 Medicaid | 9 | 2.44 | 241600 | 26844

Medicare patients account for 50% of total admissions, but their average length of stay (6.2 days) and average charges ($63,544) are far higher than other payers, consistent with the clinical characteristics of this population's older age and more complex conditions.


ADS Layer: Hospital-Level KPI Executive Dashboard

The ADS layer further aggregates to the hospital dimension to generate the core KPIs needed for the hospital executive management dashboard.

Create Table

CREATE DYNAMIC TABLE best_practice_healthcare_dw.doc_ads_hospital_kpi REFRESH INTERVAL 60 MINUTE VCLUSTER DEFAULT AS SELECT hospital_id, hospital_name, hospital_county, hospital_type, is_teaching, COUNT(*) AS total_admissions, COUNT(DISTINCT patient_id) AS unique_patients, ROUND(AVG(los_days), 2) AS avg_los_days, SUM(CASE WHEN is_long_stay = 1 THEN 1 ELSE 0 END) AS long_stay_count, ROUND(SUM(CASE WHEN is_long_stay = 1 THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 1) AS long_stay_rate_pct, ROUND(SUM(total_charges), 2) AS total_revenue, ROUND(AVG(total_charges), 2) AS avg_revenue_per_admission, SUM(CASE WHEN admission_type = 'Emergency' THEN 1 ELSE 0 END) AS emergency_admissions, ROUND(SUM(CASE WHEN admission_type = 'Emergency' THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 1) AS emergency_rate_pct, SUM(CASE WHEN discharge_category = 'Extended Care' THEN 1 ELSE 0 END) AS snf_discharges, SUM(CASE WHEN payer_type = 'Medicare' THEN 1 ELSE 0 END) AS medicare_cases, SUM(CASE WHEN payer_type = 'Medicaid' THEN 1 ELSE 0 END) AS medicaid_cases, SUM(CASE WHEN payer_type = 'Commercial' THEN 1 ELSE 0 END) AS commercial_cases, COUNT(DISTINCT department) AS active_departments FROM best_practice_healthcare_dw.doc_dwd_patient_visits GROUP BY hospital_id, hospital_name, hospital_county, hospital_type, is_teaching;

REFRESH DYNAMIC TABLE best_practice_healthcare_dw.doc_ads_hospital_kpi;

Hospital-level KPI query results (sorted by total revenue descending):

SELECT hospital_name, total_admissions, avg_los_days, long_stay_rate_pct, total_revenue, avg_revenue_per_admission, emergency_rate_pct, medicare_cases, commercial_cases, medicaid_cases FROM best_practice_healthcare_dw.doc_ads_hospital_kpi ORDER BY total_revenue DESC;

hospital_name | total_admissions | avg_los_days | long_stay_rate_pct | total_revenue | avg_revenue | emergency_rate_pct -----------------------------|------------------|--------------|-------------------|---------------|-------------|------------------- Cedars-Sinai Medical Center | 8 | 4.88 | 0.0 | 447100.00 | 55887.50 | 75.0 UCSF Medical Center | 7 | 4.29 | 0.0 | 322200.00 | 46028.57 | 42.9 Stanford Health Care | 6 | 5.17 | 16.7 | 312600.00 | 52100.00 | 66.7 Huntington Hospital | 6 | 4.17 | 0.0 | 257200.00 | 42866.67 | 50.0 Kaiser Permanente Fontana | 6 | 4.00 | 0.0 | 217100.00 | 36183.33 | 50.0

Cedars-Sinai has the highest total revenue ($447,100) and also the highest emergency admission rate (75%); Stanford Health Care has a 16.7% long-stay rate (los > 7 days), reflecting its higher proportion of critically ill cases.

Quarterly trend analysis by month:

SELECT CASE EXTRACT(month FROM admission_date) WHEN 1 THEN 'January' WHEN 2 THEN 'February' WHEN 3 THEN 'March' END AS month_name, COUNT(*) AS total_admissions, ROUND(AVG(los_days), 2) AS avg_los, ROUND(SUM(total_charges), 0) AS total_revenue, SUM(CASE WHEN admission_type = 'Emergency' THEN 1 ELSE 0 END) AS emergency_count FROM best_practice_healthcare_dw.doc_dwd_patient_visits GROUP BY EXTRACT(month FROM admission_date), CASE EXTRACT(month FROM admission_date) WHEN 1 THEN 'January' WHEN 2 THEN 'February' WHEN 3 THEN 'March' END ORDER BY EXTRACT(month FROM admission_date);

month_name | total_admissions | avg_los | total_revenue | emergency_count -----------|------------------|---------|---------------|---------------- January | 25 | 3.84 | 996800 | 12 February | 15 | 4.33 | 686700 | 8 March | 10 | 4.20 | 448700 | 6

January had the highest admission volume (25 cases), consistent with California's winter peak of respiratory illnesses and flu season. February had a slightly higher average length of stay (4.33 days), suggesting the average case complexity in February was higher.


RBAC: Tiered Permission Configuration

Different users in the healthcare data warehouse have significantly varying data access needs and permission levels; custom RBAC roles provide layered access control.

Create Custom Roles

-- Clinical staff: can view patient visit events (DWD layer), but patient PII fields are controlled by Column Masking CREATE ROLE IF NOT EXISTS healthcare_clinical_viewer COMMENT 'Clinical staff - view patient clinical data with PII masking'; -- Operations management: can only access department-level aggregated data (DWS/ADS layers), no patient-level records CREATE ROLE IF NOT EXISTS healthcare_mgmt_analyst COMMENT 'Management - view aggregate KPIs only, no patient-level access'; -- BI analysts: can only access ADS layer KPI tables for reports and dashboards CREATE ROLE IF NOT EXISTS healthcare_bi_analyst COMMENT 'BI team - ADS layer only, read-only for reporting';

Grant Schema Access Permissions

-- Clinical staff: grant schema metadata read permission (actual table-level permissions are controlled separately on each table) GRANT READ METADATA ON SCHEMA best_practice_healthcare_dw TO ROLE healthcare_clinical_viewer; -- Management and BI analysts also only granted metadata read GRANT READ METADATA ON SCHEMA best_practice_healthcare_dw TO ROLE healthcare_mgmt_analyst; GRANT READ METADATA ON SCHEMA best_practice_healthcare_dw TO ROLE healthcare_bi_analyst;

Permission Tier Summary

RoleAccessible LayersPatient PII VisibleTypical Users
workspace_adminODS / DWD / DWS / ADSFull dataDBA, data platform administrators
healthcare_clinical_viewerDWD visit eventspatient_id masked, diagnosis RESTRICTEDClinicians, nurses
healthcare_mgmt_analystDWS / ADS layersNo patient-level fieldsDepartment heads, operations management
healthcare_bi_analystADS layer KPIs onlyNo patient-level fieldsBI engineers, report developers

Time Travel: Compliance Auditing and Medical Insurance Reconciliation

Healthcare industry compliance requirements make data change traceability mandatory. Singdata Lakehouse Time Travel supports querying any historical version through DESC HISTORY and TIMESTAMP AS OF, applicable to medical insurance reconciliation, regulatory audits, and recovery from erroneous data operations.

View Historical Versions

DESC HISTORY best_practice_healthcare_dw.doc_ods_admissions;

version | time | total_rows | operation | user | stats --------|---------------------------|------------|-------------|------------|---------------------------------- 5 | 2026-06-06T13:32:08.973 | 50 | ALTER | admin_user | rows_inserted:25, rows_deleted:0 4 | 2026-06-06T13:32:04.327 | 50 | ALTER | admin_user | rows_inserted:25, rows_deleted:0 3 | 2026-06-06T13:28:59.337 | 50 | INSERT_INTO | admin_user | rows_inserted:25, rows_deleted:0 2 | 2026-06-06T13:28:29.718 | 25 | INSERT_INTO | admin_user | rows_inserted:25, rows_deleted:0 1 | 2026-06-06T13:26:34.306 | 0 | CREATE | admin_user | —

Version 2 had only 25 records (the first batch of admission data); versions 4 and 5 are ALTER operations (binding Column Masking).

Timestamp-Based Historical Data Query

Medical insurance reconciliation scenario: query the state of the raw data at the time of submitting a reconciliation report, confirming the number of admission records at that time.

-- Query the data snapshot at version 2 (2026-06-06 13:28:29) SELECT COUNT(*) AS row_count FROM best_practice_healthcare_dw.doc_ods_admissions TIMESTAMP AS OF '2026-06-06T13:28:29.718';

row_count --------- 25

The current table has 50 records, but through TIMESTAMP AS OF you can precisely return to the state at version 2, confirming that only 25 records existed at that time. This is applicable to scenarios where medical insurance auditors require "a data snapshot at the time of a specific claim submission."

Data Recovery Scenario

Recover data after an erroneous operation using Time Travel:

-- Restore accidentally deleted data to a specified point in time RESTORE TABLE best_practice_healthcare_dw.doc_ods_admissions TO TIMESTAMP AS OF '2026-06-06T13:28:29.718';


Notes

  • Column Masking is a preview feature: Contact technical support to activate it; confirm activation before using in production. Masking policies take effect on all query paths (including indirect reads through Dynamic Tables) after binding.
  • Dynamic Table and upstream ODS are decoupled: The Column Masking behavior of DWD/DWS/ADS layer Dynamic Tables is determined by the ODS table's policy; masking does not need to be re-bound on DWD tables.
  • RBAC role authorization follows the least privilege principle: Assign roles by department or function; avoid granting high-privilege roles like workspace_admin or workspace_dev to ordinary business users.
  • Time Travel retention period: Defaults to 7 days; after that, historical versions are cleaned up. For scenarios with long-term audit requirements such as medical insurance reconciliation, it is recommended to periodically archive key versions to separate tables via snapshots (CTAS).
  • HIS offline sync full/incremental strategy: For the first sync, it is recommended to run full sync during off-peak hours (e.g., 02:00 AM), then switch to incremental mode to avoid excessive compute resource consumption from full sync as data volumes grow.