Cumulative Calculation and Running Totals

Overview

Cumulative calculation (Running Total / Cumulative Aggregation) is a high-frequency requirement in data analysis: calculating cumulative sales up to a point in time, computing rolling N-day averages, tracking how much a metric has changed from its starting point. Singdata Lakehouse implements these calculations through window function ORDER BY clauses and frame specifications (ROWS BETWEEN) without requiring self-joins or subqueries.

SQL Syntax Covered

SyntaxPurpose
SUM(...) OVER (ORDER BY ...)Cumulative sum (default frame: from first row to current row)
ROWS BETWEEN N PRECEDING AND CURRENT ROWRolling N-row window
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROWFrom partition start to current row
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWINGEntire partition
FIRST_VALUE / LAST_VALUEGet first/last row value in window
LAG(col, n)Get value of n rows before, used for period-over-period calculation

Prerequisite Data

The following examples use two test tables:

-- Daily sales details table
CREATE TABLE IF NOT EXISTS doc_running_sales (
  sale_date DATE,
  region    VARCHAR(20),
  amount    DECIMAL(10,2)
);

INSERT INTO doc_running_sales VALUES
  (CAST('2024-01-01' AS DATE), 'East', 1200.00),
  (CAST('2024-01-02' AS DATE), 'East',  850.00),
  (CAST('2024-01-03' AS DATE), 'East', 1500.00),
  (CAST('2024-01-04' AS DATE), 'East',  600.00),
  (CAST('2024-01-05' AS DATE), 'East', 2100.00),
  (CAST('2024-01-01' AS DATE), 'West',  900.00),
  (CAST('2024-01-02' AS DATE), 'West', 1100.00),
  (CAST('2024-01-03' AS DATE), 'West',  750.00),
  (CAST('2024-01-04' AS DATE), 'West', 1300.00),
  (CAST('2024-01-05' AS DATE), 'West',  980.00);

-- Monthly revenue table
CREATE TABLE IF NOT EXISTS doc_running_monthly (
  month_date DATE,
  category   VARCHAR(20),
  revenue    DECIMAL(12,2)
);

INSERT INTO doc_running_monthly VALUES
  (CAST('2024-01-01' AS DATE), 'Apparel',   50000.00),
  (CAST('2024-02-01' AS DATE), 'Apparel',   45000.00),
  (CAST('2024-03-01' AS DATE), 'Apparel',   62000.00),
  (CAST('2024-04-01' AS DATE), 'Apparel',   58000.00),
  (CAST('2024-01-01' AS DATE), 'Electronics', 120000.00),
  (CAST('2024-02-01' AS DATE), 'Electronics', 135000.00),
  (CAST('2024-03-01' AS DATE), 'Electronics',  98000.00),
  (CAST('2024-04-01' AS DATE), 'Electronics', 145000.00);

Scenario 1: Cumulative Sum by Partition

Calculate the cumulative daily sales for each region.

SELECT
  sale_date,
  region,
  amount,
  SUM(amount) OVER (
    PARTITION BY region
    ORDER BY sale_date
  ) AS running_total
FROM doc_running_sales
ORDER BY region, sale_date;

Result:

sale_dateregionamountrunning_total
2024-01-01East1200.001200.00
2024-01-02East850.002050.00
2024-01-03East1500.003550.00
2024-01-04East600.004150.00
2024-01-05East2100.006250.00
2024-01-01West900.00900.00
2024-01-02West1100.002000.00
2024-01-03West750.002750.00
2024-01-04West1300.004050.00
2024-01-05West980.005030.00

Scenario 2: Cumulative Proportion of Total (YTD Progress)

Calculate each category's monthly YTD revenue and its proportion of total annual revenue.

SELECT
  month_date,
  category,
  revenue,
  SUM(revenue) OVER (
    PARTITION BY category
    ORDER BY month_date
  )                                                          AS ytd_revenue,
  SUM(revenue) OVER (PARTITION BY category)                 AS annual_total,
  ROUND(
    SUM(revenue) OVER (PARTITION BY category ORDER BY month_date)
    / SUM(revenue) OVER (PARTITION BY category) * 100,
    2
  )                                                          AS ytd_pct
FROM doc_running_monthly
ORDER BY category, month_date;

Result:

month_datecategoryrevenueytd_revenueannual_totalytd_pct
2024-01-01Apparel50000.0050000.00215000.0023.26
2024-02-01Apparel45000.0095000.00215000.0044.19
2024-03-01Apparel62000.00157000.00215000.0073.02
2024-04-01Apparel58000.00215000.00215000.00100.00
2024-01-01Electronics120000.00120000.00498000.0024.10
2024-02-01Electronics135000.00255000.00498000.0051.20
2024-03-01Electronics98000.00353000.00498000.0070.88
2024-04-01Electronics145000.00498000.00498000.00100.00

Scenario 3: Rolling N-Row Window (Moving Average)

Calculate a 3-day moving average of sales for each region.

SELECT
  sale_date,
  region,
  amount,
  AVG(amount) OVER (
    PARTITION BY region
    ORDER BY sale_date
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  ) AS moving_avg_3day
FROM doc_running_sales
ORDER BY region, sale_date;

Result:

sale_dateregionamountmoving_avg_3day
2024-01-01East1200.001200.000000
2024-01-02East850.001025.000000
2024-01-03East1500.001183.333333
2024-01-04East600.00983.333333
2024-01-05East2100.001400.000000

Scenario 4: Aggregate First, Then Roll (Two-Level Window)

First aggregate total daily sales across all regions, then calculate a 7-day rolling sum.

SELECT
  sale_date,
  SUM(amount)                                                    AS daily_total,
  SUM(SUM(amount)) OVER (
    ORDER BY sale_date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  )                                                              AS rolling_7day_sum,
  AVG(SUM(amount)) OVER (
    ORDER BY sale_date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  )                                                              AS rolling_7day_avg
FROM doc_running_sales
GROUP BY sale_date
ORDER BY sale_date;

Result:

sale_datedaily_totalrolling_7day_sumrolling_7day_avg
2024-01-012100.002100.002100.000000
2024-01-021950.004050.002025.000000
2024-01-032250.006300.002100.000000
2024-01-041900.008200.002050.000000
2024-01-053080.0011280.002256.000000

Scenario 5: Cumulative Maximum / Minimum

Track the historical highest and lowest single-day sales for each region.

SELECT
  sale_date,
  region,
  amount,
  MAX(amount) OVER (
    PARTITION BY region
    ORDER BY sale_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS running_max,
  MIN(amount) OVER (
    PARTITION BY region
    ORDER BY sale_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS running_min
FROM doc_running_sales
ORDER BY region, sale_date;

Result (East region only):

sale_dateregionamountrunning_maxrunning_min
2024-01-01East1200.001200.001200.00
2024-01-02East850.001200.00850.00
2024-01-03East1500.001500.00850.00
2024-01-04East600.001500.00600.00
2024-01-05East2100.002100.00600.00

Scenario 6: Period-over-Period Change (MoM / DoD)

Calculate the month-over-month change amount and change rate for each category.

SELECT
  month_date,
  category,
  revenue,
  LAG(revenue, 1) OVER (
    PARTITION BY category
    ORDER BY month_date
  )                                                              AS prev_month,
  revenue - LAG(revenue, 1) OVER (
    PARTITION BY category
    ORDER BY month_date
  )                                                              AS mom_change,
  ROUND(
    (revenue - LAG(revenue, 1) OVER (PARTITION BY category ORDER BY month_date))
    / LAG(revenue, 1) OVER (PARTITION BY category ORDER BY month_date) * 100,
    2
  )                                                              AS mom_pct
FROM doc_running_monthly
ORDER BY category, month_date;

Result (Apparel category):

month_datecategoryrevenueprev_monthmom_changemom_pct
2024-01-01Apparel50000.00NULLNULLNULL
2024-02-01Apparel45000.0050000.00-5000.00-10.00
2024-03-01Apparel62000.0045000.0017000.0037.78
2024-04-01Apparel58000.0062000.00-4000.00-6.45

Scenario 7: Change from Starting Point (FIRST_VALUE)

Calculate each region's daily sales change relative to the first day, and the final day's sales.

SELECT
  sale_date,
  region,
  amount,
  FIRST_VALUE(amount) OVER (
    PARTITION BY region
    ORDER BY sale_date
  )                                                              AS first_day_amount,
  LAST_VALUE(amount) OVER (
    PARTITION BY region
    ORDER BY sale_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  )                                                              AS last_day_amount,
  amount - FIRST_VALUE(amount) OVER (
    PARTITION BY region
    ORDER BY sale_date
  )                                                              AS change_from_start
FROM doc_running_sales
ORDER BY region, sale_date;

Result (East region only):

sale_dateregionamountfirst_day_amountlast_day_amountchange_from_start
2024-01-01East1200.001200.002100.000.00
2024-01-02East850.001200.002100.00-350.00
2024-01-03East1500.001200.002100.00300.00
2024-01-04East600.001200.002100.00-600.00
2024-01-05East2100.001200.002100.00900.00

Important Notes

  • Frame Specification Defaults: When ORDER BY is present, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (value-range based). For date columns with duplicate dates, RANGE and ROWS may produce different results. Use explicit ROWS BETWEEN when precise control is needed.
  • Two-Level Windows: When applying window functions over GROUP BY aggregated results, the window function argument must wrap the aggregate function (e.g., SUM(SUM(col))), otherwise an error will occur.
  • LAST_VALUE Trap: Without ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, LAST_VALUE only returns the current row's value (because the default frame stops at the current row).
  • NULL Handling: LAG / LEAD return NULL at boundaries. Use COALESCE(LAG(col, 1), 0) to substitute default values.