SQL Pivot Tables and Cross-Tab Reports — Reshaping Data for Analytics and Dashboards

One of the most common requests from analytics teams is “can you make it so the months are columns instead of rows?” This is pivoting — transforming row values into columns. It sounds simple but requires careful SQL, especially when the column names are not known in advance.

This tutorial covers static pivots, conditional aggregation, and dynamic pivots across PostgreSQL, Spark SQL, and Snowflake.


The Dataset

Monthly sales data in a long format — the standard output of most pipeline aggregations.

CREATE TABLE monthly_sales (
    sale_year    INT,
    sale_month   INT,
    category     VARCHAR(30),
    region       VARCHAR(20),
    revenue      DECIMAL(12,2),
    order_count  INT
);

INSERT INTO monthly_sales VALUES
(2024, 1, 'Hardware', 'North', 12000.00, 45),
(2024, 1, 'Software', 'North',  3000.00, 30),
(2024, 1, 'Hardware', 'South',  8000.00, 32),
(2024, 1, 'Software', 'South',  2000.00, 18),
(2024, 2, 'Hardware', 'North', 15000.00, 52),
(2024, 2, 'Software', 'North',  4500.00, 40),
(2024, 2, 'Hardware', 'South',  9000.00, 38),
(2024, 2, 'Software', 'South',  2800.00, 22),
(2024, 3, 'Hardware', 'North',  9000.00, 35),
(2024, 3, 'Software', 'North',  5000.00, 45),
(2024, 3, 'Hardware', 'South', 11000.00, 42),
(2024, 3, 'Software', 'South',  3200.00, 28);

Long format (how data looks in the pipeline):

sale_yearsale_monthcategoryregionrevenue
20241HardwareNorth12000.00
20241SoftwareNorth3000.00
20242HardwareNorth15000.00

Pivot target (what the analytics team wants):

categoryregionJanFebMarTotal
HardwareNorth1200015000900036000
HardwareSouth800090001100028000
SoftwareNorth30004500500012500
SoftwareSouth2000280032008000

Method 1 — Conditional Aggregation (Universal SQL)

This works in every SQL database. You use SUM(CASE WHEN ...) to manually create each column.

SELECT
    category,
    region,
    SUM(CASE WHEN sale_month = 1 THEN revenue ELSE 0 END) AS jan,
    SUM(CASE WHEN sale_month = 2 THEN revenue ELSE 0 END) AS feb,
    SUM(CASE WHEN sale_month = 3 THEN revenue ELSE 0 END) AS mar,
    SUM(revenue) AS total
FROM monthly_sales
WHERE sale_year = 2024
GROUP BY category, region
ORDER BY category, region;

Output:

categoryregionjanfebmartotal
HardwareNorth12000.0015000.009000.0036000.00
HardwareSouth8000.009000.0011000.0028000.00
SoftwareNorth3000.004500.005000.0012500.00
SoftwareSouth2000.002800.003200.008000.00

Exactly the format the analytics team requested.

Adding percentage columns:

SELECT
    category,
    region,
    SUM(CASE WHEN sale_month = 1 THEN revenue ELSE 0 END) AS jan,
    SUM(CASE WHEN sale_month = 2 THEN revenue ELSE 0 END) AS feb,
    SUM(CASE WHEN sale_month = 3 THEN revenue ELSE 0 END) AS mar,
    SUM(revenue) AS total,
    ROUND(
        SUM(CASE WHEN sale_month = 3 THEN revenue ELSE 0 END)
        * 100.0
        / NULLIF(SUM(CASE WHEN sale_month = 2 THEN revenue ELSE 0 END), 0)
        - 100,
        1
    ) AS mar_vs_feb_pct
FROM monthly_sales
WHERE sale_year = 2024
GROUP BY category, region
ORDER BY category, region;

Output includes:

categoryregionjanfebmartotalmar_vs_feb_pct
HardwareNorth12000.0015000.009000.0036000.00-40.0%
SoftwareNorth3000.004500.005000.0012500.00+11.1%

Method 2 — PIVOT Syntax (Snowflake, SQL Server, Oracle)

Snowflake and SQL Server support native PIVOT syntax which is more concise for known column values.

-- Snowflake syntax
SELECT *
FROM monthly_sales
PIVOT (
    SUM(revenue)
    FOR sale_month IN (1 AS jan, 2 AS feb, 3 AS mar)
)
ORDER BY category, region;

Output: Same result as Method 1, written with less code.

The limitation: the values in IN (1, 2, 3) must be hardcoded. If you want to pivot on a dynamic set of values (all distinct months in a table), PIVOT syntax cannot do it — you need dynamic SQL.


Method 3 — Unpivot (Wide to Long)

Sometimes you receive data in wide format and need to convert it to long format for your pipeline. UNPIVOT is the reverse of PIVOT.

-- Snowflake / SQL Server UNPIVOT
SELECT category, region, month_name, revenue
FROM (
    SELECT category, region, 12000.00 AS jan, 15000.00 AS feb, 9000.00 AS mar
    WHERE category = 'Hardware' AND region = 'North'
) src
UNPIVOT (
    revenue FOR month_name IN (jan, feb, mar)
) AS unpvt;

PostgreSQL alternative using CROSS JOIN LATERAL:

-- Convert wide format to long using VALUES()
SELECT
    category,
    region,
    month_name,
    revenue
FROM (
    SELECT 'Hardware' AS category, 'North' AS region,
           12000.00 AS jan, 15000.00 AS feb, 9000.00 AS mar
) wide_data
CROSS JOIN LATERAL (
    VALUES ('jan', jan), ('feb', feb), ('mar', mar)
) AS months(month_name, revenue);

Output:

categoryregionmonth_namerevenue
HardwareNorthjan12000.00
HardwareNorthfeb15000.00
HardwareNorthmar9000.00

Method 4 — Dynamic Pivot (When Columns Are Not Known in Advance)

The hardest case: you want to pivot on a column where the distinct values are not known at query-write time — for example, pivoting by product name when new products are added monthly.

PostgreSQL — dynamic pivot with plpgsql:

-- Step 1: generate the CASE WHEN list dynamically
SELECT
    'SUM(CASE WHEN sale_month = ' || sale_month || ' THEN revenue ELSE 0 END) AS month_' || sale_month
FROM (SELECT DISTINCT sale_month FROM monthly_sales ORDER BY sale_month) months;

Output of Step 1 (the generated SQL fragments):

SUM(CASE WHEN sale_month = 1 THEN revenue ELSE 0 END) AS month_1
SUM(CASE WHEN sale_month = 2 THEN revenue ELSE 0 END) AS month_2
SUM(CASE WHEN sale_month = 3 THEN revenue ELSE 0 END) AS month_3
-- Step 2: assemble and execute dynamically (PostgreSQL plpgsql)
DO $$
DECLARE
    col_list TEXT;
    sql      TEXT;
BEGIN
    SELECT STRING_AGG(
        'SUM(CASE WHEN sale_month = ' || sale_month ||
        ' THEN revenue ELSE 0 END) AS month_' || sale_month,
        ', '
        ORDER BY sale_month
    )
    INTO col_list
    FROM (SELECT DISTINCT sale_month FROM monthly_sales) m;

    sql := '
        SELECT category, region, ' || col_list || ', SUM(revenue) AS total
        FROM monthly_sales
        GROUP BY category, region
        ORDER BY category, region
    ';

    EXECUTE sql;
END;
$$;

Databricks / Spark SQL dynamic pivot:

# PySpark: pivot is built into DataFrame API
df = spark.table("monthly_sales")

result = (
    df.groupBy("category", "region")
    .pivot("sale_month")   # automatically discovers all distinct months
    .agg({"revenue": "sum"})
    .orderBy("category", "region")
)

result.show()

Spark’s .pivot() is the cleanest dynamic pivot available — it automatically discovers all distinct values in the pivot column without requiring dynamic SQL.


Building a Cross-Tab Report — Category vs Region

A cross-tab shows how two categorical dimensions intersect. This is different from a time pivot — instead of months as columns, you have categories as columns and regions as rows (or vice versa).

-- Revenue by region (rows) x category (columns)
SELECT
    region,
    SUM(CASE WHEN category = 'Hardware' THEN revenue ELSE 0 END) AS hardware,
    SUM(CASE WHEN category = 'Software' THEN revenue ELSE 0 END) AS software,
    SUM(revenue) AS grand_total,
    ROUND(
        SUM(CASE WHEN category = 'Hardware' THEN revenue ELSE 0 END) * 100.0 / SUM(revenue),
        1
    ) AS hardware_pct
FROM monthly_sales
WHERE sale_year = 2024
GROUP BY region
ORDER BY grand_total DESC;

Output:

regionhardwaresoftwaregrand_totalhardware_pct
North36000.0012500.0048500.0074.2%
South28000.008000.0036000.0077.8%

Adding Subtotals and Grand Totals with ROLLUP

ROLLUP automatically adds subtotal rows for each grouping level and a grand total row at the end.

SELECT
    COALESCE(category, 'ALL CATEGORIES') AS category,
    COALESCE(region, 'ALL REGIONS')      AS region,
    SUM(revenue)                          AS total_revenue,
    SUM(order_count)                      AS total_orders
FROM monthly_sales
GROUP BY ROLLUP(category, region)
ORDER BY category NULLS LAST, region NULLS LAST;

Output:

categoryregiontotal_revenuetotal_orders
HardwareNorth36000.00132
HardwareSouth28000.00112
HardwareALL REGIONS64000.00244
SoftwareNorth12500.00115
SoftwareSouth8000.0068
SoftwareALL REGIONS20500.00183
ALL CATEGORIESALL REGIONS84500.00427

ROLLUP(category, region) generates subtotals at the category level and a grand total. Use CUBE(category, region) if you also need subtotals at the region level.


Common Mistakes

Mistake 1 — Using NULL instead of 0 for missing pivot values

When a category has no data in a particular month, SUM(CASE WHEN ... THEN revenue END) returns NULL. Always use ELSE 0:

-- WRONG: returns NULL for missing months
SUM(CASE WHEN sale_month = 4 THEN revenue END)

-- CORRECT: returns 0 for missing months
SUM(CASE WHEN sale_month = 4 THEN revenue ELSE 0 END)

Mistake 2 — Hardcoding pivot columns then wondering why new months are missing

If you hardcode January to March and April data arrives, it never appears. Either: update the pivot SQL each month, or use a dynamic pivot in Spark/PySpark.

Mistake 3 — Pivoting at the wrong granularity

If your source data has multiple rows per category per month (by region), a naive pivot without proper GROUP BY will double-count. Always define the grouping level before pivoting.


What to Learn Next

Pivot tables and cross-tab reports are a common Gold layer output format. The next tutorial covers SQL window functions for advanced ranking and distribution analysis — the queries used to build percentile reports, tier classifications, and distribution charts.

Leave a Comment