DataOps Pipeline Data Quality Gates Best Practices

Automatically running assertion checks after each pipeline layer refresh, isolating non-conforming data into a Quarantine zone, and triggering alerts when a gate fails — these are the core components of a DataOps closed-loop quality control system. This guide uses an e-commerce event stream as the dataset, demonstrating the complete end-to-end setup of Bronze → Quality Gate → Passed (Silver) / Quarantine, and covers three key capabilities: Dynamic Table built-in quality filtering, Lakehouse Studio task DAG dependency orchestration, and information_schema.job_history trend tracking.


Overview

The core challenge for DataOps data quality is: after the data warehouse refreshes, how do you know whether the data that just came in is trustworthy? Singdata Lakehouse addresses this with the following combination:

ProblemSolution
Bronze layer has null values, negative numbers, and out-of-range anomaliesDynamic Table built-in WHERE filter — only rows that pass checks flow into Silver
Non-conforming data needs separate storage for manual reviewdoc_events_quarantine Dynamic Table automatically labels rows with quarantine_reason
Quality check results need to be traceable and trend-analyzabledoc_quality_results + doc_quality_summary persist check details from each run
The order between pipeline refresh and quality checks must be guaranteedStudio Task DAG: refresh Passed/Quarantine first → then refresh Summary → then evaluate gate → then trigger Gold
Gate failures need timely push notificationsStudio Task configures Webhook to push alerts to Lark/DingTalk operations channels

SQL Commands Used

Command / FunctionPurposeNotes
CREATE TABLECreate Bronze raw event table, quality rules table, and quality results tableStatic tables serving as upstream data sources for Dynamic Tables
CREATE DYNAMIC TABLECreate Passed / Quarantine / Summary layersNo REFRESH INTERVAL — scheduling managed by Studio Tasks
REFRESH DYNAMIC TABLEManually trigger first refreshUsed for initial builds or debugging
FILTER (WHERE ...)Conditional filtering in aggregate functions, counts rows of each dirty data typeUsed for failed row count calculation in quality rules
SHOW TABLESView all objects under a schemaConfirm table creation status
sys.information_schema.job_historyTrack elapsed time and status of each Dynamic Table refreshAnalyze quality check run trends

Prerequisites

All examples in this guide run under the best_practice_dataops_quality schema.

CREATE SCHEMA IF NOT EXISTS best_practice_dataops_quality;


Bronze Layer: Raw Event Table with Dirty Data

Create the Table

CREATE TABLE IF NOT EXISTS best_practice_dataops_quality.doc_raw_events ( event_id STRING, user_id STRING, event_type STRING, amount DOUBLE, ts TIMESTAMP, region STRING, platform STRING, status STRING );

Insert Sample Data (with Dirty Data)

The dataset intentionally includes four types of quality problems: user_id IS NULL (3 rows), amount IS NULL (4 rows), purchase type with amount < 0 (1 row: EVT012), refund type with amount < 0 (4 rows: EVT003/EVT011/EVT021/EVT028 — valid negative refunds), amount > 10000 (2 rows), and logical cross-event duplicates (EVT001/EVT006/EVT016 have the same user_id + ts + amount).

Import from a local CSV file (recommended):

-- Step 1: Upload the local CSV file to User Volume via SQL PUT PUT '/path/to/doc_raw_events.csv' TO USER VOLUME FILE 'doc_raw_events.csv';

-- Step 2: COPY INTO the table from User Volume COPY INTO best_practice_dataops_quality.doc_raw_events FROM USER VOLUME USING csv OPTIONS('header'='true', 'sep'=',', 'nullValue'='') FILES ('doc_raw_events.csv');

You can also directly insert a small batch of test data inline (no CSV file needed):

INSERT INTO best_practice_dataops_quality.doc_raw_events VALUES ('EVT001','U001','purchase', 99.9, CAST('2026-06-01 08:00:00' AS TIMESTAMP), 'CN', 'iOS', 'completed'), ('EVT002','U002','purchase', 159.0, CAST('2026-06-01 08:05:00' AS TIMESTAMP), 'CN', 'Android', 'completed'), ('EVT003','U003','refund', -50.0, CAST('2026-06-01 08:10:00' AS TIMESTAMP), 'CN', 'Web', 'completed'), ('EVT004',NULL, 'purchase', 80.0, CAST('2026-06-01 08:15:00' AS TIMESTAMP), 'US', 'iOS', 'completed'), ('EVT005','U005','purchase', NULL, CAST('2026-06-01 08:20:00' AS TIMESTAMP), 'US', 'Android', 'completed'), ('EVT006','U001','purchase', 99.9, CAST('2026-06-01 08:00:00' AS TIMESTAMP), 'CN', 'iOS', 'completed'), ('EVT007','U007','purchase', 12500.0,CAST('2026-06-01 08:25:00' AS TIMESTAMP),'EU', 'Web', 'completed'), -- ... 33 rows total, dirty data distribution: -- null user_id: EVT004, EVT013, EVT022 -- null amount: EVT005, EVT014, EVT023, EVT033 -- purchase negative: EVT012(-999) (quarantine) -- refund negative: EVT003(-50), EVT011(-30), EVT021(-80), EVT028(-200) (valid refunds, not quarantined) -- amount > 10000: EVT007(12500), EVT019(15000) -- logical duplicates: EVT001/EVT006/EVT016 (U001 @ 08:00:00, 99.9) ...

Verify total row count:

SELECT COUNT(*) AS total_rows FROM best_practice_dataops_quality.doc_raw_events;

total_rows ---------- 33


Quality Rule Definitions

Quality rules are maintained in a table-driven way. Each rule includes: a check SQL expression (returning failure rate), an acceptable threshold, and an alert severity level.

Create the Table

CREATE TABLE IF NOT EXISTS best_practice_dataops_quality.doc_quality_rules ( rule_id STRING, rule_name STRING, target_table STRING, target_col STRING, sql_expr STRING, threshold DOUBLE, severity STRING, description STRING );

Insert 10 Quality Rules

Import from a local CSV file (recommended):

-- Step 1: Upload the local CSV file to User Volume via SQL PUT PUT '/path/to/doc_quality_rules.csv' TO USER VOLUME FILE 'doc_quality_rules.csv';

-- Step 2: COPY INTO the table from User Volume COPY INTO best_practice_dataops_quality.doc_quality_rules FROM USER VOLUME USING csv OPTIONS('header'='true', 'sep'=',', 'nullValue'='') FILES ('doc_quality_rules.csv');

You can also directly insert a small batch of test data inline (no CSV file needed):

INSERT INTO best_practice_dataops_quality.doc_quality_rules VALUES ('R001','user_id_not_null', 'doc_raw_events','user_id', 'COUNT(*) FILTER (WHERE user_id IS NULL) * 1.0 / COUNT(*)', 0.02, 'ERROR', 'user_id null rate must be below 2%'), ('R002','amount_not_null', 'doc_raw_events','amount', 'COUNT(*) FILTER (WHERE amount IS NULL) * 1.0 / COUNT(*)', 0.05, 'ERROR', 'amount null rate must be below 5%'), ('R003','amount_positive', 'doc_raw_events','amount', 'COUNT(*) FILTER (WHERE event_type = ''purchase'' AND amount < 0) * 1.0 / COUNT(*)', 0.0, 'ERROR', 'purchase amount must not be negative'), ('R004','amount_range_check', 'doc_raw_events','amount', 'COUNT(*) FILTER (WHERE amount > 10000) * 1.0 / COUNT(*)', 0.01, 'WARNING', 'amount > 10000 rate must be below 1%'), ('R005','event_type_whitelist', 'doc_raw_events','event_type', 'COUNT(*) FILTER (WHERE event_type NOT IN (''purchase'',''refund'',''login'')) * 1.0 / COUNT(*)', 0.0, 'ERROR', 'event_type must be in whitelist'), ('R006','status_whitelist', 'doc_raw_events','status', 'COUNT(*) FILTER (WHERE status NOT IN (''completed'',''pending'',''error'')) * 1.0 / COUNT(*)', 0.0, 'ERROR', 'status must be in whitelist'), ('R007','duplicate_event_id', 'doc_raw_events','event_id', '(COUNT(*) - COUNT(DISTINCT event_id)) * 1.0 / COUNT(*)', 0.0, 'ERROR', 'event_id must be unique'), ('R008','ts_not_future', 'doc_raw_events','ts', 'COUNT(*) FILTER (WHERE ts > CURRENT_TIMESTAMP()) * 1.0 / COUNT(*)', 0.0, 'ERROR', 'ts must not be in the future'), ('R009','region_not_null', 'doc_raw_events','region', 'COUNT(*) FILTER (WHERE region IS NULL) * 1.0 / COUNT(*)', 0.0, 'WARNING', 'region should not be null'), ('R010','duplicate_events', 'doc_raw_events','*', '(COUNT(*) - COUNT(DISTINCT event_id || CAST(ts AS STRING) || COALESCE(user_id,''''))) * 1.0 / COUNT(*)', 0.05,'WARNING', 'logical duplicate rate must be below 5%');

The sql_expr field in each rule stores a directly executable SQL expression. The actual check is performed by a quality check task that dynamically concatenates and runs the expression. threshold = 0.0 means zero tolerance — any single failed row triggers an alert.


Quality Check: Run Assertions and Record Results

Create the Results Table

CREATE TABLE IF NOT EXISTS best_practice_dataops_quality.doc_quality_results ( check_id STRING, rule_id STRING, rule_name STRING, target_table STRING, check_ts TIMESTAMP, total_rows BIGINT, failed_rows BIGINT, fail_rate DOUBLE, threshold DOUBLE, passed BOOLEAN, severity STRING, pipeline_run STRING );

Execute Quality Checks

Run each rule's SQL expression against the Bronze layer to collect actual failure rates:

SELECT COUNT(*) AS total_rows, COUNT(*) FILTER (WHERE user_id IS NULL) AS null_user_id, ROUND(COUNT(*) FILTER (WHERE user_id IS NULL) * 1.0 / COUNT(*), 4) AS null_user_id_rate, COUNT(*) FILTER (WHERE amount IS NULL) AS null_amount, ROUND(COUNT(*) FILTER (WHERE amount IS NULL) * 1.0 / COUNT(*), 4) AS null_amount_rate, COUNT(*) FILTER (WHERE event_type = 'purchase' AND amount < 0) AS negative_amount, ROUND(COUNT(*) FILTER (WHERE event_type = 'purchase' AND amount < 0) * 1.0 / COUNT(*), 4) AS negative_amount_rate, COUNT(*) FILTER (WHERE amount > 10000) AS outlier_amount, ROUND(COUNT(*) FILTER (WHERE amount > 10000) * 1.0 / COUNT(*), 4) AS outlier_amount_rate, COUNT(*) - COUNT(DISTINCT event_id) AS dup_event_id FROM best_practice_dataops_quality.doc_raw_events;

total_rows | null_user_id | null_user_id_rate | null_amount | null_amount_rate | negative_amount | negative_amount_rate | outlier_amount | outlier_amount_rate | dup_event_id -----------+--------------+-------------------+-------------+------------------+-----------------+----------------------+----------------+---------------------+------------- 33 | 3 | 0.0909 | 4 | 0.1212 | 1 | 0.0303 | 2 | 0.0606 | 0

Results interpretation: user_id null rate is 9.09%, far exceeding R001's threshold of 2%; amount null rate is 12.12%, exceeding R002's threshold of 5%; purchase type negative rate is 3.03%, triggering zero-tolerance rule R003 (negative values for refund type are valid refunds and are not counted); out-of-range rate is 6.06%, exceeding R004's threshold of 1%. event_id has no duplicates itself (R007 passes), but logical duplicates exist (events with the same user_id + ts + amount).

Write Check Results

Import from a local CSV file (recommended):

-- Step 1: Upload the local CSV file to User Volume via SQL PUT PUT '/path/to/doc_quality_results.csv' TO USER VOLUME FILE 'doc_quality_results.csv';

-- Step 2: COPY INTO the table from User Volume COPY INTO best_practice_dataops_quality.doc_quality_results FROM USER VOLUME USING csv OPTIONS('header'='true', 'sep'=',', 'nullValue'='') FILES ('doc_quality_results.csv');

You can also directly insert a small batch of test data inline (no CSV file needed):

INSERT INTO best_practice_dataops_quality.doc_quality_results VALUES ('CHK001','R001','user_id_not_null', 'doc_raw_events', CAST('2026-06-01 12:00:00' AS TIMESTAMP), 33, 3, 0.0909, 0.02, false, 'ERROR', 'run_2026060601'), ('CHK002','R002','amount_not_null', 'doc_raw_events', CAST('2026-06-01 12:00:00' AS TIMESTAMP), 33, 4, 0.1212, 0.05, false, 'ERROR', 'run_2026060601'), ('CHK003','R003','amount_positive', 'doc_raw_events', CAST('2026-06-01 12:00:00' AS TIMESTAMP), 33, 1, 0.0303, 0.0, false, 'ERROR', 'run_2026060601'), ('CHK004','R004','amount_range_check', 'doc_raw_events', CAST('2026-06-01 12:00:00' AS TIMESTAMP), 33, 2, 0.0606, 0.01, false, 'WARNING', 'run_2026060601'), ('CHK005','R005','event_type_whitelist','doc_raw_events', CAST('2026-06-01 12:00:00' AS TIMESTAMP), 33, 0, 0.0, 0.0, true, 'ERROR', 'run_2026060601'), ('CHK006','R006','status_whitelist', 'doc_raw_events', CAST('2026-06-01 12:00:00' AS TIMESTAMP), 33, 0, 0.0, 0.0, true, 'ERROR', 'run_2026060601'), ('CHK007','R007','duplicate_event_id', 'doc_raw_events', CAST('2026-06-01 12:00:00' AS TIMESTAMP), 33, 0, 0.0, 0.0, true, 'ERROR', 'run_2026060601'), ('CHK008','R008','ts_not_future', 'doc_raw_events', CAST('2026-06-01 12:00:00' AS TIMESTAMP), 33, 0, 0.0, 0.0, true, 'ERROR', 'run_2026060601'), ('CHK009','R009','region_not_null', 'doc_raw_events', CAST('2026-06-01 12:00:00' AS TIMESTAMP), 33, 0, 0.0, 0.0, true, 'WARNING', 'run_2026060601'), ('CHK010','R010','duplicate_events', 'doc_raw_events', CAST('2026-06-01 12:00:00' AS TIMESTAMP), 33, 5, 0.1515, 0.05, false, 'WARNING', 'run_2026060601');


Quality Gate Layer: Passed and Quarantine Dynamic Tables

Passed Layer: Let Only Clean Data Flow into Silver

CREATE DYNAMIC TABLE IF NOT EXISTS best_practice_dataops_quality.doc_events_passed AS SELECT event_id, user_id, event_type, amount, ts, region, platform, status FROM best_practice_dataops_quality.doc_raw_events WHERE user_id IS NOT NULL AND amount IS NOT NULL AND (event_type != 'purchase' OR amount >= 0) AND amount <= 10000 AND event_type IN ('purchase','refund','login') AND status IN ('completed','pending','error');

Quarantine Layer: Store Non-Conforming Data in Isolation

CREATE DYNAMIC TABLE IF NOT EXISTS best_practice_dataops_quality.doc_events_quarantine AS SELECT event_id, user_id, event_type, amount, ts, region, platform, status, CASE WHEN user_id IS NULL THEN 'null_user_id' WHEN amount IS NULL THEN 'null_amount' WHEN event_type = 'purchase' AND amount < 0 THEN 'negative_amount' WHEN amount > 10000 THEN 'amount_out_of_range' WHEN event_type NOT IN ('purchase','refund','login') THEN 'invalid_event_type' ELSE 'other' END AS quarantine_reason FROM best_practice_dataops_quality.doc_raw_events WHERE user_id IS NULL OR amount IS NULL OR (event_type = 'purchase' AND amount < 0) OR amount > 10000 OR event_type NOT IN ('purchase','refund','login');

Quality Summary Dynamic Table

CREATE DYNAMIC TABLE IF NOT EXISTS best_practice_dataops_quality.doc_quality_summary AS SELECT pipeline_run, check_ts, COUNT(*) AS total_rules, COUNT(*) FILTER (WHERE passed = true) AS passed_rules, COUNT(*) FILTER (WHERE passed = false) AS failed_rules, COUNT(*) FILTER (WHERE passed = false AND severity = 'ERROR') AS error_count, COUNT(*) FILTER (WHERE passed = false AND severity = 'WARNING') AS warning_count, ROUND(COUNT(*) FILTER (WHERE passed = true) * 1.0 / COUNT(*), 4) AS pass_rate, CASE WHEN COUNT(*) FILTER (WHERE passed = false AND severity = 'ERROR') > 0 THEN 'BLOCKED' WHEN COUNT(*) FILTER (WHERE passed = false AND severity = 'WARNING') > 0 THEN 'WARNING' ELSE 'PASSED' END AS gate_decision FROM best_practice_dataops_quality.doc_quality_results GROUP BY pipeline_run, check_ts;

gate_decision is the final gate verdict: any single failed ERROR rule outputs BLOCKED, and the downstream Gold layer's Studio Task will not trigger a refresh upon seeing BLOCKED.

Trigger the First Manual Refresh

REFRESH DYNAMIC TABLE best_practice_dataops_quality.doc_events_passed; REFRESH DYNAMIC TABLE best_practice_dataops_quality.doc_events_quarantine; REFRESH DYNAMIC TABLE best_practice_dataops_quality.doc_quality_summary;

View Passed Layer Results

SELECT COUNT(*) AS passed_count FROM best_practice_dataops_quality.doc_events_passed;

passed_count ------------ 23

View Quarantine Layer Distribution

SELECT quarantine_reason, COUNT(*) AS cnt FROM best_practice_dataops_quality.doc_events_quarantine GROUP BY quarantine_reason ORDER BY cnt DESC;

quarantine_reason | cnt -------------------+---- null_amount | 4 null_user_id | 3 amount_out_of_range| 2 negative_amount | 1

Results interpretation: Of 33 raw events, 23 passed all quality gates and entered the Silver layer; 10 were quarantined. The main quarantine reasons are amount null values (4 rows, e.g. EVT005/EVT014/EVT023/EVT033) and user_id null values (3 rows), followed by oversized amounts (EVT007/EVT019) and purchase type negative amounts (EVT012). Negative refund type amounts (EVT003/EVT011/EVT021/EVT028) are valid refunds and passed the quality gate to enter the Silver layer.

View Gate Decision

SELECT pipeline_run, total_rules, passed_rules, failed_rules, error_count, warning_count, pass_rate, gate_decision FROM best_practice_dataops_quality.doc_quality_summary;

pipeline_run | total_rules | passed_rules | failed_rules | error_count | warning_count | pass_rate | gate_decision ----------------+-------------+--------------+--------------+-------------+---------------+-----------+-------------- run_2026060601 | 10 | 5 | 5 | 3 | 2 | 0.5000 | BLOCKED

Results interpretation: In this run, 5 out of 10 rules failed, including 3 ERROR rules (user_id null rate, amount null rate, negative amount) and 2 WARNING rules (oversized amount, logical duplicates). gate_decision = BLOCKED means the downstream Gold layer should not refresh after this run.

View Rule Details

SELECT rule_id, rule_name, total_rows, failed_rows, fail_rate, threshold, passed, severity FROM best_practice_dataops_quality.doc_quality_results ORDER BY passed ASC, severity DESC, fail_rate DESC;

rule_id | rule_name | total_rows | failed_rows | fail_rate | threshold | passed | severity --------+-----------------------+------------+-------------+-----------+-----------+--------+--------- R010 | duplicate_events | 33 | 5 | 0.1515 | 0.05 | false | WARNING R004 | amount_range_check | 33 | 2 | 0.0606 | 0.01 | false | WARNING R002 | amount_not_null | 33 | 4 | 0.1212 | 0.05 | false | ERROR R001 | user_id_not_null | 33 | 3 | 0.0909 | 0.02 | false | ERROR R003 | amount_positive | 33 | 1 | 0.0303 | 0.0 | false | ERROR R009 | region_not_null | 33 | 0 | 0.0 | 0.0 | true | WARNING R005 | event_type_whitelist | 33 | 0 | 0.0 | 0.0 | true | ERROR R006 | status_whitelist | 33 | 0 | 0.0 | 0.0 | true | ERROR R007 | duplicate_event_id | 33 | 0 | 0.0 | 0.0 | true | ERROR R008 | ts_not_future | 33 | 0 | 0.0 | 0.0 | true | ERROR


Studio Task DAG Orchestration

Create Studio Task Refresh Jobs

Rather than writing REFRESH INTERVAL in Dynamic Table DDL, manage all refresh scheduling and dependencies uniformly in Lakehouse Studio.

In Lakehouse Studio under Development → Tasks, at path best_practices/dataops_quality/, create the following tasks in sequence:

  1. task_refresh_gate: Executes REFRESH DYNAMIC TABLE best_practice_dataops_quality.doc_events_passed and doc_events_quarantine

  2. task_refresh_summary: Executes REFRESH DYNAMIC TABLE best_practice_dataops_quality.doc_quality_summary

    • Depends on: task_refresh_gate
  3. task_gate_eval: Queries doc_quality_summary; if gate_decision = 'BLOCKED', interrupts and triggers an alert

    • Depends on: task_refresh_summary
  4. task_alert_webhook (optional): Calls a Webhook to push gate_decision + fail_rate to the Lark/DingTalk operations channel

    • Depends on: task_gate_eval (executes only on BLOCKED branch)
  5. task_refresh_gold: Executes downstream Gold layer refresh

    • Depends on: task_gate_eval (executes only on PASSED branch)

Configure scheduling: Cron expression 0/30 * * * ? (every 30 minutes), set in the task properties of task_refresh_gate.


sys.information_schema.job_history records the execution status of each Dynamic Table refresh, and can be used to track the run history of quality check tasks:

SELECT job_id, status, ROUND(execution_time, 2) AS exec_s, rows_produced, rows_inserted, start_time, SUBSTR(job_text, 1, 80) AS sql_preview FROM sys.information_schema.job_history WHERE pt_date = CAST(CURRENT_DATE() AS STRING) AND LOWER(job_text) LIKE '%best_practice_dataops_quality%' ORDER BY start_time DESC LIMIT 10;


Object Overview

SHOW TABLES IN best_practice_dataops_quality;

schema_name | table_name | is_dynamic ----------------------------------+-------------------------+----------- best_practice_dataops_quality | doc_events_passed | true best_practice_dataops_quality | doc_events_quarantine | true best_practice_dataops_quality | doc_quality_results | false best_practice_dataops_quality | doc_quality_rules | false best_practice_dataops_quality | doc_quality_summary | true best_practice_dataops_quality | doc_raw_events | false

Architecture structure:

doc_raw_events (Bronze, 33 rows) │ ├──[WHERE passed]──→ doc_events_passed (DT, 23 rows)──→ Gold Layer │ └──[WHERE failed]──→ doc_events_quarantine (DT, 10 rows) quarantine_reason: null_amount / null_user_id / amount_out_of_range / negative_amount (purchase only) doc_quality_rules (10 rules) │ └── [write check results] ──→ doc_quality_results (10 records/run) │ └──→ doc_quality_summary (DT) gate_decision: BLOCKED / WARNING / PASSED Studio Task DAG: task_refresh_gate → task_refresh_summary → task_gate_eval → (BLOCKED) task_alert_webhook → (PASSED) task_refresh_gold


Notes

  • Dynamic Tables do not use REFRESH INTERVAL: All Dynamic Table refresh cycles are managed uniformly in Studio Tasks. This allows attaching monitoring alerts, dependency conditions, and other rules to the same task, avoiding splitting DDL and scheduling configuration across two places.

  • Quarantine data is not automatically repaired: Data in doc_events_quarantine requires manual review to decide whether to repair and write back to the Bronze layer or discard. Consider running periodic cleanup tasks on the Quarantine table to avoid historical junk data accumulating continuously.

  • Semantics of FILTER (WHERE ...): COUNT(*) FILTER (WHERE condition) counts only rows satisfying the condition, equivalent to SUM(CASE WHEN condition THEN 1 ELSE 0 END), but with cleaner syntax. Not all aggregate functions support FILTERMEDIAN does not.

  • Dynamic Table first full refresh: The first REFRESH DYNAMIC TABLE performs a full scan of upstream; subsequent incremental refreshes only process rows that are new or changed since the last refresh point. If the Bronze layer uses INSERT OVERWRITE to write data, this causes the Dynamic Table to degrade to full refresh every time.

  • Gate thresholds need business adjustment: The example R001 threshold of 0.02 (2%) is reasonable for real-time data streams, but batch historical data migration scenarios may need temporary relaxation. It is recommended to maintain threshold versions in the doc_quality_rules table and restore strict thresholds after migration is complete.

  • gate_decision = BLOCKED does not affect already refreshed Silver data: BLOCKED only informs the downstream Gold layer not to refresh after this run. Data already in doc_events_passed will not be rolled back. If rollback is needed, use Time Travel (RESTORE TABLE ... TO TIMESTAMP) to restore to the snapshot before the last refresh.