Cumulative Calculation and Running Totals
Overview
Cumulative calculation (Running Total / Cumulative Aggregation) is a high-frequency requirement in data analysis: calculating cumulative sales up to a point in time, computing rolling N-day averages, tracking how much a metric has changed from its starting point. Singdata Lakehouse implements these calculations through window function ORDER BY clauses and frame specifications (ROWS BETWEEN) without requiring self-joins or subqueries.
SQL Syntax Covered
| Syntax | Purpose |
|---|---|
SUM(...) OVER (ORDER BY ...) | Cumulative sum (default frame: from first row to current row) |
ROWS BETWEEN N PRECEDING AND CURRENT ROW | Rolling N-row window |
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW | From partition start to current row |
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING | Entire partition |
FIRST_VALUE / LAST_VALUE | Get first/last row value in window |
LAG(col, n) | Get value of n rows before, used for period-over-period calculation |
Prerequisite Data
The following examples use two test tables:
Scenario 1: Cumulative Sum by Partition
Calculate the cumulative daily sales for each region.
Result:
| sale_date | region | amount | running_total |
|---|---|---|---|
| 2024-01-01 | East | 1200.00 | 1200.00 |
| 2024-01-02 | East | 850.00 | 2050.00 |
| 2024-01-03 | East | 1500.00 | 3550.00 |
| 2024-01-04 | East | 600.00 | 4150.00 |
| 2024-01-05 | East | 2100.00 | 6250.00 |
| 2024-01-01 | West | 900.00 | 900.00 |
| 2024-01-02 | West | 1100.00 | 2000.00 |
| 2024-01-03 | West | 750.00 | 2750.00 |
| 2024-01-04 | West | 1300.00 | 4050.00 |
| 2024-01-05 | West | 980.00 | 5030.00 |
Scenario 2: Cumulative Proportion of Total (YTD Progress)
Calculate each category's monthly YTD revenue and its proportion of total annual revenue.
Result:
| month_date | category | revenue | ytd_revenue | annual_total | ytd_pct |
|---|---|---|---|---|---|
| 2024-01-01 | Apparel | 50000.00 | 50000.00 | 215000.00 | 23.26 |
| 2024-02-01 | Apparel | 45000.00 | 95000.00 | 215000.00 | 44.19 |
| 2024-03-01 | Apparel | 62000.00 | 157000.00 | 215000.00 | 73.02 |
| 2024-04-01 | Apparel | 58000.00 | 215000.00 | 215000.00 | 100.00 |
| 2024-01-01 | Electronics | 120000.00 | 120000.00 | 498000.00 | 24.10 |
| 2024-02-01 | Electronics | 135000.00 | 255000.00 | 498000.00 | 51.20 |
| 2024-03-01 | Electronics | 98000.00 | 353000.00 | 498000.00 | 70.88 |
| 2024-04-01 | Electronics | 145000.00 | 498000.00 | 498000.00 | 100.00 |
Scenario 3: Rolling N-Row Window (Moving Average)
Calculate a 3-day moving average of sales for each region.
Result:
| sale_date | region | amount | moving_avg_3day |
|---|---|---|---|
| 2024-01-01 | East | 1200.00 | 1200.000000 |
| 2024-01-02 | East | 850.00 | 1025.000000 |
| 2024-01-03 | East | 1500.00 | 1183.333333 |
| 2024-01-04 | East | 600.00 | 983.333333 |
| 2024-01-05 | East | 2100.00 | 1400.000000 |
Scenario 4: Aggregate First, Then Roll (Two-Level Window)
First aggregate total daily sales across all regions, then calculate a 7-day rolling sum.
Result:
| sale_date | daily_total | rolling_7day_sum | rolling_7day_avg |
|---|---|---|---|
| 2024-01-01 | 2100.00 | 2100.00 | 2100.000000 |
| 2024-01-02 | 1950.00 | 4050.00 | 2025.000000 |
| 2024-01-03 | 2250.00 | 6300.00 | 2100.000000 |
| 2024-01-04 | 1900.00 | 8200.00 | 2050.000000 |
| 2024-01-05 | 3080.00 | 11280.00 | 2256.000000 |
Scenario 5: Cumulative Maximum / Minimum
Track the historical highest and lowest single-day sales for each region.
Result (East region only):
| sale_date | region | amount | running_max | running_min |
|---|---|---|---|---|
| 2024-01-01 | East | 1200.00 | 1200.00 | 1200.00 |
| 2024-01-02 | East | 850.00 | 1200.00 | 850.00 |
| 2024-01-03 | East | 1500.00 | 1500.00 | 850.00 |
| 2024-01-04 | East | 600.00 | 1500.00 | 600.00 |
| 2024-01-05 | East | 2100.00 | 2100.00 | 600.00 |
Scenario 6: Period-over-Period Change (MoM / DoD)
Calculate the month-over-month change amount and change rate for each category.
Result (Apparel category):
| month_date | category | revenue | prev_month | mom_change | mom_pct |
|---|---|---|---|---|---|
| 2024-01-01 | Apparel | 50000.00 | NULL | NULL | NULL |
| 2024-02-01 | Apparel | 45000.00 | 50000.00 | -5000.00 | -10.00 |
| 2024-03-01 | Apparel | 62000.00 | 45000.00 | 17000.00 | 37.78 |
| 2024-04-01 | Apparel | 58000.00 | 62000.00 | -4000.00 | -6.45 |
Scenario 7: Change from Starting Point (FIRST_VALUE)
Calculate each region's daily sales change relative to the first day, and the final day's sales.
Result (East region only):
| sale_date | region | amount | first_day_amount | last_day_amount | change_from_start |
|---|---|---|---|---|---|
| 2024-01-01 | East | 1200.00 | 1200.00 | 2100.00 | 0.00 |
| 2024-01-02 | East | 850.00 | 1200.00 | 2100.00 | -350.00 |
| 2024-01-03 | East | 1500.00 | 1200.00 | 2100.00 | 300.00 |
| 2024-01-04 | East | 600.00 | 1200.00 | 2100.00 | -600.00 |
| 2024-01-05 | East | 2100.00 | 1200.00 | 2100.00 | 900.00 |
Important Notes
- Frame Specification Defaults: When
ORDER BYis present, the default frame isRANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW(value-range based). For date columns with duplicate dates,RANGEandROWSmay produce different results. Use explicitROWS BETWEENwhen precise control is needed. - Two-Level Windows: When applying window functions over
GROUP BYaggregated results, the window function argument must wrap the aggregate function (e.g.,SUM(SUM(col))), otherwise an error will occur. - LAST_VALUE Trap: Without
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING,LAST_VALUEonly returns the current row's value (because the default frame stops at the current row). - NULL Handling:
LAG/LEADreturn NULL at boundaries. UseCOALESCE(LAG(col, 1), 0)to substitute default values.
