Regulatory Reporting Data Warehouse Best Practices: BCBS 239 / IFRS 9
Banking and securities regulatory reporting systems must meet three core requirements: data auditability (BCBS 239 Principle 11: historical snapshots must be reproducible at any point in time), accurate risk data aggregation (IFRS 9 Expected Credit Loss (ECL) with three-stage classification), and access control isolation (compliance, risk, IT operations, and external auditors see different levels of sensitivity for sensitive parameters). This article uses the Bondora P2P Lending Dataset (CC0, 25 loan records) as a foundation to demonstrate an end-to-end ODS → DWD → DWS → ADS four-layer architecture, covering specific applications of Time Travel, Dynamic Table, and Column Masking in regulatory scenarios.
Overview
Problem
Singdata Solution
Data changes must be audited and historical snapshots must be reproducible at any time
Time Travel (TIMESTAMP AS OF) + DESC HISTORY fully records every DML operation
IFRS 9 requires ECL provisioning by three stages (Stage 1/2/3)
Takes effect transparently, including Dynamic Table queries
Prerequisites
All examples run under the best_practice_reg_reporting schema.
CREATE SCHEMA IF NOT EXISTS best_practice_reg_reporting;
ODS Layer: Loan Ledger Raw Snapshot
The ODS layer stores T+1 snapshot data exported from core systems. Each record carries the PD, LGD, and EAD parameters required by IFRS 9.
Create Table
CREATE TABLE IF NOT EXISTS best_practice_reg_reporting.doc_ods_loan_book (
loan_id STRING COMMENT 'Unique loan identifier (PK)',
loan_number BIGINT COMMENT 'Sequential loan number',
loan_date DATE COMMENT 'Origination date',
maturity_date DATE COMMENT 'Scheduled maturity date',
amount DECIMAL(18,4) COMMENT 'Disbursed principal amount (EUR)',
applied_amount DECIMAL(18,4) COMMENT 'Requested amount',
interest_rate DECIMAL(8,4) COMMENT 'Annual interest rate (%)',
loan_duration_months INT COMMENT 'Loan term in months',
monthly_payment DECIMAL(12,4) COMMENT 'Scheduled monthly payment',
country STRING COMMENT 'Borrower country code (ISO 3166-1)',
age INT COMMENT 'Borrower age at origination',
gender INT COMMENT '0=Male 1=Female 2=Unknown',
employment_status INT COMMENT 'Employment status code',
income_total DECIMAL(14,2) COMMENT 'Total monthly income EUR',
debt_to_income DECIMAL(8,4) COMMENT 'Debt-to-income ratio pct',
probability_of_default DECIMAL(10,8) COMMENT 'IFRS 9 PD parameter',
loss_given_default DECIMAL(8,6) COMMENT 'IFRS 9 LGD parameter',
expected_loss DECIMAL(12,8) COMMENT 'ECL = PD x LGD x EAD',
expected_return DECIMAL(12,8) COMMENT 'Expected return rate',
rating STRING COMMENT 'Internal credit rating A-HR',
status STRING COMMENT 'Loan status Current Late Repaid Default',
default_date DATE COMMENT 'Date of default NULL if no default',
current_debt_days INT COMMENT 'Days past due DPD',
ead1 DECIMAL(14,4) COMMENT 'Exposure at Default method 1',
ead2 DECIMAL(14,4) COMMENT 'Exposure at Default method 2',
principal_balance DECIMAL(14,4) COMMENT 'Outstanding principal balance',
recovery_stage INT COMMENT '0=No default 1=Stage1 recovery 2=Stage2',
report_as_of_eod DATE COMMENT 'T+1 snapshot date',
load_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
) COMMENT 'ODS: Bondora P2P loan book T+1 daily snapshot for IFRS 9 ECL';
Load Data
Data source: Bondora P2P Lending Dataset (Kaggle, CC0 license). The following are 25 records extracted from real fields. The complete dataset includes 112 columns; here we extract the key columns needed for regulatory calculations.
Import from local CSV (recommended)
-- Step 1: Upload local CSV file to User Volume via SQL PUT
PUT '/path/to/your/doc_ods_loan_book.csv' TO USER VOLUME FILE 'doc_ods_loan_book.csv';
-- Step 2: COPY INTO table from User Volume
COPY INTO best_practice_reg_reporting.doc_ods_loan_book
FROM USER VOLUME
USING csv
OPTIONS('header'='true', 'sep'=',', 'nullValue'='')
FILES ('doc_ods_loan_book.csv');
You can also insert a small batch of test data inline (no CSV file required):
INSERT INTO best_practice_reg_reporting.doc_ods_loan_book
(loan_id, loan_number, loan_date, maturity_date, amount, applied_amount, interest_rate,
loan_duration_months, monthly_payment, country, age, gender, employment_status,
income_total, debt_to_income, probability_of_default, loss_given_default,
expected_loss, expected_return, rating, status, default_date, current_debt_days,
ead1, ead2, principal_balance, recovery_stage, report_as_of_eod)
VALUES
-- From Bondora raw data (real loan IDs)
('66AE108B-532B-4BB3-BAB7-0019A46412C1',483449,CAST('2016-03-23' AS DATE),CAST('2020-06-26' AS DATE),
2125.0000,2125.0000,20.97,60,62.0500,'EE',53,1,6,354.00,26.29,
0.06851186,0.580000,0.03965108,0.14114493,'C','Late',CAST('2020-01-14' AS DATE),
552,1251.9800,64.0700,1155.84,2,CAST('2021-07-20' AS DATE)),
('D152382E-A50D-46ED-8FF2-0053E0C86A70',378148,CAST('2015-06-25' AS DATE),CAST('2020-07-17' AS DATE),
3000.0000,3000.0000,17.12,60,84.7500,'EE',50,1,5,900.00,30.58,
0.03079912,0.650000,0.02001943,0.14043561,'B','Late',CAST('2016-06-02' AS DATE),
1918,2730.8400,2370.7700,2436.41,2,CAST('2021-07-20' AS DATE))
-- ... full 25 records, all from actual Bondora CSV fields
;
Verify row count:
SELECT COUNT(*) AS row_count FROM best_practice_reg_reporting.doc_ods_loan_book;
row_count
---------
25
Time Travel: Audit Tracing and Snapshot Replay
BCBS 239 Principle 11 requires financial institutions to be able to "reproduce the data state as of a historical reporting period." Time Travel provides a native solution without additional storage design.
Simulate Data State Changes
-- Simulate loan status change (from Current to Late)
UPDATE best_practice_reg_reporting.doc_ods_loan_book
SET status = 'Late', current_debt_days = 45
WHERE loan_id IN (
'A1234567-0005-4ABC-8000-555555555555',
'A1234567-0010-4ABC-8000-101010101010'
);
View Full Change History (DESC HISTORY)
DESC HISTORY best_practice_reg_reporting.doc_ods_loan_book LIMIT 5;
Every DML operation is recorded: the executing user, timestamp, and rows affected. This log can be submitted directly during external audits.
Query the Snapshot Before the Change
-- Roll back to before the UPDATE to confirm the original state of the two loans
SELECT loan_id, status, current_debt_days
FROM best_practice_reg_reporting.doc_ods_loan_book
TIMESTAMP AS OF '2026-06-06 23:53:15'
WHERE loan_id IN (
'A1234567-0005-4ABC-8000-555555555555',
'A1234567-0010-4ABC-8000-101010101010'
);
loan_id | status | current_debt_days
----------------------------------------+---------+------------------
A1234567-0005-4ABC-8000-555555555555 | Current | 0
A1234567-0010-4ABC-8000-101010101010 | Current | 5
Result interpretation: Both loans had Current status before the UPDATE, with DPD of 0 and 5 days respectively (Stage 1). After the UPDATE they entered Late status with DPD=45, which will trigger Stage 2 reclassification in the DWD layer.
⚠️ Note: TIMESTAMP AS OF only accepts string literals and does not support dynamic expressions like NOW() - INTERVAL 1 HOUR. Use precise timestamps.
💡 Tip: The Time Travel retention window is controlled by the DATA_RETENTION_TIME parameter (default 1 day, extendable up to 90 days). For regulatory scenarios, setting this to ≥ 30 days is recommended to cover quarterly reporting cycles.
DWD Layer: Compliance Standardization and IFRS 9 Three-Stage Classification
The DWD layer does two things on top of ODS: standardizes field conventions (data types, null handling) and applies IFRS 9 stage classification.
Create Table
CREATE TABLE IF NOT EXISTS best_practice_reg_reporting.doc_dwd_loan_std (
loan_id STRING,
loan_date DATE,
maturity_date DATE,
amount DECIMAL(18,4),
interest_rate DECIMAL(8,4),
loan_duration_months INT,
country STRING,
rating STRING,
status STRING,
current_debt_days INT COMMENT 'Days past due DPD',
ifrs9_stage INT COMMENT '1=performing 2=underperforming 3=credit-impaired',
probability_of_default DECIMAL(10,8),
loss_given_default DECIMAL(8,6),
ead DECIMAL(14,4) COMMENT 'Exposure at Default',
ecl_amount DECIMAL(14,4) COMMENT 'ECL = PD x LGD x EAD',
principal_balance DECIMAL(14,4),
default_date DATE,
report_date DATE,
dwd_load_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
) COMMENT 'DWD: Regulatory-caliber standardized loan data with IFRS9 stage classification';
IFRS 9 Three-Stage Classification Logic
INSERT INTO best_practice_reg_reporting.doc_dwd_loan_std
SELECT
loan_id,
loan_date,
maturity_date,
amount,
interest_rate,
loan_duration_months,
country,
rating,
status,
COALESCE(current_debt_days, 0) AS current_debt_days,
-- IFRS 9 three-stage classification rules
CASE
WHEN status = 'Default'
OR COALESCE(current_debt_days, 0) > 90 THEN 3 -- Credit-impaired
WHEN COALESCE(current_debt_days, 0) > 30
OR rating IN ('E', 'F', 'HR')
OR probability_of_default > 0.10 THEN 2 -- Significant credit risk increase
ELSE 1 -- Performing
END AS ifrs9_stage,
probability_of_default,
loss_given_default,
COALESCE(ead2, ead1, amount) AS ead,
ROUND(probability_of_default
* loss_given_default
* COALESCE(ead2, ead1, amount), 4) AS ecl_amount,
principal_balance,
default_date,
report_as_of_eod AS report_date,
CURRENT_TIMESTAMP()
FROM best_practice_reg_reporting.doc_ods_loan_book;
Stage classification notes:
Stage
Condition
IFRS 9 Meaning
This Dataset
Stage 1
DPD ≤ 30 days AND PD ≤ 10% AND Rating not E/F/HR
Performing
13 loans
Stage 2
DPD 31–90 days OR PD > 10% OR Rating ∈ {E,F,HR}
Significant credit risk increase
1 loan
Stage 3
DPD > 90 days OR Status = Default
Credit-impaired
11 loans
Verify distribution:
SELECT ifrs9_stage, COUNT(*) AS loan_count, ROUND(SUM(ecl_amount), 2) AS total_ecl
FROM best_practice_reg_reporting.doc_dwd_loan_std
GROUP BY ifrs9_stage
ORDER BY ifrs9_stage;
Result interpretation: Although Stage 3 (credit-impaired) has only 11 loans, their combined ECL provision of €2,134 far exceeds the 13 Stage 1 loans' combined €1,549. This reflects Stage 3 loans having significantly higher PD × LGD products — which is precisely the design intent of IFRS 9: the higher the stage, the greater the expected loss.
PD (Probability of Default) and LGD (Loss Given Default) are model-sensitive parameters that should not be fully exposed to IT operations staff or external auditors.
Create Masking Function
CREATE OR REPLACE FUNCTION best_practice_reg_reporting.mask_sensitive_rate(rate DOUBLE)
RETURNS DOUBLE
AS CASE
WHEN current_user() IN ('privileged_user') THEN rate -- Replace with actual authorized usernames
ELSE ROUND(rate, 2)
END;
💡 Tip: Replace 'privileged_user' with the actual usernames who need access to plaintext data. Column Masking uses the current_user() function to match the current connection's username. All authorized usernames must be explicitly listed in the IN() list.
⚠️ Note: Column Masking takes effect transparently for all queries, including JOIN queries from Dynamic Tables against upstream tables.
Bind to Column
ALTER TABLE best_practice_reg_reporting.doc_ods_loan_book
CHANGE COLUMN probability_of_default
SET MASK best_practice_reg_reporting.mask_sensitive_rate;
ALTER TABLE best_practice_reg_reporting.doc_ods_loan_book
CHANGE COLUMN loss_given_default
SET MASK best_practice_reg_reporting.mask_sensitive_rate;
Verify Masking Effect
SELECT current_user() AS executing_user,
0.06851186 AS raw_pd,
best_practice_reg_reporting.mask_sensitive_rate(0.06851186) AS masked_pd;
privileged_user is an authorized user (listed in the masking policy) and sees full precision. An unauthorized user executing the same query will see masked_pd displayed as 0.07 (ROUND to 2 decimal places).
The DWS layer aggregates ECL provisions by country and rating dimensions, supporting daily risk monitoring and regulatory report generation.
Aggregation by Country (Dynamic Table)
CREATE DYNAMIC TABLE IF NOT EXISTS best_practice_reg_reporting.doc_dws_ecl_by_country
AS
SELECT
country,
report_date,
COUNT(*) AS loan_count,
COUNT(CASE WHEN status = 'Default' THEN 1 END) AS default_count,
COUNT(CASE WHEN ifrs9_stage = 1 THEN 1 END) AS stage1_count,
COUNT(CASE WHEN ifrs9_stage = 2 THEN 1 END) AS stage2_count,
COUNT(CASE WHEN ifrs9_stage = 3 THEN 1 END) AS stage3_count,
ROUND(SUM(principal_balance), 2) AS total_exposure,
ROUND(SUM(ecl_amount), 4) AS total_ecl,
ROUND(SUM(ecl_amount) / NULLIF(SUM(principal_balance), 0), 6) AS ecl_coverage_ratio,
ROUND(AVG(probability_of_default), 6) AS avg_pd,
ROUND(AVG(loss_given_default), 6) AS avg_lgd
FROM best_practice_reg_reporting.doc_dwd_loan_std
GROUP BY country, report_date;
⚠️ Note: The Dynamic Table DDL does not include a REFRESH INTERVAL parameter. Refresh scheduling is managed via Lakehouse Studio tasks (see the "Configure Refresh Schedule" section), where data quality rules and monitoring alerts can be attached.
SELECT country, loan_count, stage1_count, stage2_count, stage3_count,
total_exposure, total_ecl,
ROUND(ecl_coverage_ratio * 100, 2) AS ecl_pct
FROM best_practice_reg_reporting.doc_dws_ecl_by_country
ORDER BY total_ecl DESC;
Result interpretation: Spain (ES) has the highest ECL coverage rate at 15.55%, mainly because 3 of 5 loans have entered Stage 3 and 1 is Stage 2 — high-risk loans dominate the portfolio. Finland (FI) has the largest loan balance but only 1 Stage 3 loan, with a coverage rate of 1.92% in a healthy range. These country-segmented ECL metrics directly map to the "risk data aggregation" requirements of BCBS 239 reporting.
Aggregation by Rating (Dynamic Table)
CREATE DYNAMIC TABLE IF NOT EXISTS best_practice_reg_reporting.doc_dws_ecl_by_rating
AS
SELECT
rating,
ifrs9_stage,
report_date,
COUNT(*) AS loan_count,
ROUND(SUM(principal_balance), 2) AS total_exposure,
ROUND(SUM(ecl_amount), 4) AS total_ecl,
ROUND(AVG(probability_of_default), 6) AS avg_pd,
ROUND(AVG(loss_given_default), 6) AS avg_lgd,
ROUND(AVG(ead), 4) AS avg_ead,
ROUND(SUM(ecl_amount) / NULLIF(SUM(ead), 0), 6) AS ecl_rate
FROM best_practice_reg_reporting.doc_dwd_loan_std
GROUP BY rating, ifrs9_stage, report_date;
After refresh, view the ECL coverage rate distribution:
SELECT rating, ifrs9_stage, loan_count,
ROUND(total_exposure, 2) AS exposure,
ROUND(total_ecl, 2) AS ecl,
ROUND(ecl_rate * 100, 4) AS ecl_rate_pct
FROM best_practice_reg_reporting.doc_dws_ecl_by_rating
ORDER BY ifrs9_stage ASC, ecl_rate DESC;
Result interpretation: HR-rated loans (highest risk) have an ECL coverage rate of 25.76%, meaning every €1 lent requires €0.26 in provisions. AA-rated Stage 1 loans have a coverage rate of only 0.75%, consistent with rating expectations. The difference in ECL rates for the same rating (e.g., B, A, C) between Stage 1 and Stage 3 directly illustrates the impact of stage migration on provisions.
ADS Layer: IFRS 9 Provisioning Regulatory Report
The ADS layer generates summary metrics for regulatory reporting, directly outputting the figures required for CCAR, LCR, and other reports.
IFRS 9 Provisioning Summary (Dynamic Table)
CREATE DYNAMIC TABLE IF NOT EXISTS best_practice_reg_reporting.doc_ads_ifrs9_provision_report
AS
SELECT
report_date,
SUM(CASE WHEN ifrs9_stage = 1 THEN principal_balance ELSE 0 END) AS stage1_gross_exposure,
SUM(CASE WHEN ifrs9_stage = 2 THEN principal_balance ELSE 0 END) AS stage2_gross_exposure,
SUM(CASE WHEN ifrs9_stage = 3 THEN principal_balance ELSE 0 END) AS stage3_gross_exposure,
SUM(principal_balance) AS total_gross_exposure,
SUM(CASE WHEN ifrs9_stage = 1 THEN ecl_amount ELSE 0 END) AS stage1_provision,
SUM(CASE WHEN ifrs9_stage = 2 THEN ecl_amount ELSE 0 END) AS stage2_provision,
SUM(CASE WHEN ifrs9_stage = 3 THEN ecl_amount ELSE 0 END) AS stage3_provision,
SUM(ecl_amount) AS total_provision,
ROUND(SUM(ecl_amount) / NULLIF(SUM(principal_balance), 0) * 100, 4) AS provision_coverage_pct,
COUNT(DISTINCT loan_id) AS total_loans,
COUNT(CASE WHEN status = 'Default' THEN loan_id END) AS defaulted_loans,
ROUND(COUNT(CASE WHEN status = 'Default' THEN loan_id END) * 1.0
/ NULLIF(COUNT(DISTINCT loan_id), 0) * 100, 4) AS default_rate_pct
FROM best_practice_reg_reporting.doc_dwd_loan_std
GROUP BY report_date;
Refresh and view the report:
REFRESH DYNAMIC TABLE best_practice_reg_reporting.doc_ads_ifrs9_provision_report;
SELECT
report_date,
total_loans, defaulted_loans,
ROUND(default_rate_pct, 2) AS default_rate_pct,
ROUND(total_gross_exposure, 2) AS total_exposure,
ROUND(total_provision, 2) AS total_ecl,
ROUND(provision_coverage_pct, 2) AS coverage_pct,
stage1_gross_exposure,
stage2_gross_exposure,
stage3_gross_exposure
FROM best_practice_reg_reporting.doc_ads_ifrs9_provision_report;
Total exposure €100,828, ECL provision €3,943, coverage rate 3.91%
Default rate 12% (3 loans in Default status); Stage 3 exposure €17,478, representing 17.3% of total exposure
Stage 1 (performing) exposure €80,650 represents the largest share (80%), indicating the overall portfolio quality is acceptable
These figures can be directly entered into IFRS 9 quarterly disclosure reports and CCAR stress test data packages
Configure Refresh Schedule: Lakehouse Studio Task
All Dynamic Tables do not include REFRESH INTERVAL in their DDL. Instead, scheduling is managed via Lakehouse Studio tasks. This allows monitoring alerts and data quality checks to be attached to the same task.
Create Refresh Tasks
💡 Tip: The example below uses cz-cli (the Singdata Lakehouse command-line tool). If cz-cli is not installed, refer to the cz-cli setup guide. If you prefer not to use the command line, you can execute SQL in Develop -> SQL Editor in Lakehouse Studio and configure and trigger scheduled tasks in the Studio -> Task page.
💡 Tip: In the Lakehouse Studio task monitoring page, you can configure execution failure alerts for each task, and also attach data quality rules (e.g., "total_provision cannot be NULL", "total_loans > 0") to ensure ADS layer data passes quality checks before entering the regulatory reporting system.
Time Travel timestamp syntax: TIMESTAMP AS OF only accepts string literals ('2026-06-06 23:53:15') and does not support dynamic expressions like NOW() - INTERVAL 1 HOUR. DESC HISTORY returns UTC timestamps; remember to convert to your local timezone when querying (default UTC+8).
Dynamic Table without REFRESH INTERVAL: All DDL does not include the REFRESH INTERVAL parameter; refresh scheduling is managed by Lakehouse Studio tasks (cron expressions). This allows alert rules to be attached to tasks and runtime status to be monitored uniformly in the Studio interface.
IFRS 9 stage classification changes with ODS status: When a loan's DPD crosses a stage boundary (e.g., increases from 31 days to 91 days), the ifrs9_stage in the DWD layer will be reclassified on the next INSERT/MERGE. The DWS and ADS layers will automatically update on their next REFRESH.
Column Masking takes effect transparently for Dynamic Tables: When DWD layer Dynamic Tables query the ODS layer, the probability_of_default and loss_given_default values for unauthorized users are already ROUNDed to 2 decimal places, and those masked values are stored in DWD as well. If you need full precision for internal modeling, use an authorized account to query ODS directly.
ECL calculation precision: This article uses ECL = PD × LGD × EAD (point estimate). Actual banking systems typically also incorporate a time discount factor (Effective Interest Rate) and macro-economic scenario weighting. This logic can be implemented via a ZettaPark Python Task that calls external risk model APIs and writes results back to the DWS layer.
Data retention policy: Regulatory requirements call for retaining raw data for at least 5–7 years. It is recommended to combine DATA_RETENTION_TIME (short-term, ≤ 90 days) with periodic COPY INTO OSS/COS (long-term archiving) to ensure BCBS 239 historical data requirements are met.
⚠️ Pending manual verification: Column Masking currently matches authorized users by username via current_user(). All usernames with permission to view plaintext must be explicitly added to the whitelist in the masking function. If your Lakehouse version supports role-based dynamic evaluation (e.g., HAS_ROLE('role_name')), you can use roles instead of username lists for more flexible maintenance. Please contact Singdata technical support to confirm whether the current version supports this function.