One of the most common requests from analytics teams is “can you make it so the months are columns instead of rows?” This is pivoting — transforming row values into columns. It sounds simple but requires careful SQL, especially when the column names are not known in advance.
This tutorial covers static pivots, conditional aggregation, and dynamic pivots across PostgreSQL, Spark SQL, and Snowflake.
The Dataset
Monthly sales data in a long format — the standard output of most pipeline aggregations.
CREATE TABLE monthly_sales (
sale_year INT,
sale_month INT,
category VARCHAR(30),
region VARCHAR(20),
revenue DECIMAL(12,2),
order_count INT
);
INSERT INTO monthly_sales VALUES
(2024, 1, 'Hardware', 'North', 12000.00, 45),
(2024, 1, 'Software', 'North', 3000.00, 30),
(2024, 1, 'Hardware', 'South', 8000.00, 32),
(2024, 1, 'Software', 'South', 2000.00, 18),
(2024, 2, 'Hardware', 'North', 15000.00, 52),
(2024, 2, 'Software', 'North', 4500.00, 40),
(2024, 2, 'Hardware', 'South', 9000.00, 38),
(2024, 2, 'Software', 'South', 2800.00, 22),
(2024, 3, 'Hardware', 'North', 9000.00, 35),
(2024, 3, 'Software', 'North', 5000.00, 45),
(2024, 3, 'Hardware', 'South', 11000.00, 42),
(2024, 3, 'Software', 'South', 3200.00, 28);
Long format (how data looks in the pipeline):
| sale_year | sale_month | category | region | revenue |
|---|---|---|---|---|
| 2024 | 1 | Hardware | North | 12000.00 |
| 2024 | 1 | Software | North | 3000.00 |
| 2024 | 2 | Hardware | North | 15000.00 |
| … | … | … | … | … |
Pivot target (what the analytics team wants):
| category | region | Jan | Feb | Mar | Total |
|---|---|---|---|---|---|
| Hardware | North | 12000 | 15000 | 9000 | 36000 |
| Hardware | South | 8000 | 9000 | 11000 | 28000 |
| Software | North | 3000 | 4500 | 5000 | 12500 |
| Software | South | 2000 | 2800 | 3200 | 8000 |
Method 1 — Conditional Aggregation (Universal SQL)
This works in every SQL database. You use SUM(CASE WHEN ...) to manually create each column.
SELECT
category,
region,
SUM(CASE WHEN sale_month = 1 THEN revenue ELSE 0 END) AS jan,
SUM(CASE WHEN sale_month = 2 THEN revenue ELSE 0 END) AS feb,
SUM(CASE WHEN sale_month = 3 THEN revenue ELSE 0 END) AS mar,
SUM(revenue) AS total
FROM monthly_sales
WHERE sale_year = 2024
GROUP BY category, region
ORDER BY category, region;
Output:
| category | region | jan | feb | mar | total |
|---|---|---|---|---|---|
| Hardware | North | 12000.00 | 15000.00 | 9000.00 | 36000.00 |
| Hardware | South | 8000.00 | 9000.00 | 11000.00 | 28000.00 |
| Software | North | 3000.00 | 4500.00 | 5000.00 | 12500.00 |
| Software | South | 2000.00 | 2800.00 | 3200.00 | 8000.00 |
Exactly the format the analytics team requested.
Adding percentage columns:
SELECT
category,
region,
SUM(CASE WHEN sale_month = 1 THEN revenue ELSE 0 END) AS jan,
SUM(CASE WHEN sale_month = 2 THEN revenue ELSE 0 END) AS feb,
SUM(CASE WHEN sale_month = 3 THEN revenue ELSE 0 END) AS mar,
SUM(revenue) AS total,
ROUND(
SUM(CASE WHEN sale_month = 3 THEN revenue ELSE 0 END)
* 100.0
/ NULLIF(SUM(CASE WHEN sale_month = 2 THEN revenue ELSE 0 END), 0)
- 100,
1
) AS mar_vs_feb_pct
FROM monthly_sales
WHERE sale_year = 2024
GROUP BY category, region
ORDER BY category, region;
Output includes:
| category | region | jan | feb | mar | total | mar_vs_feb_pct |
|---|---|---|---|---|---|---|
| Hardware | North | 12000.00 | 15000.00 | 9000.00 | 36000.00 | -40.0% |
| Software | North | 3000.00 | 4500.00 | 5000.00 | 12500.00 | +11.1% |
Method 2 — PIVOT Syntax (Snowflake, SQL Server, Oracle)
Snowflake and SQL Server support native PIVOT syntax which is more concise for known column values.
-- Snowflake syntax
SELECT *
FROM monthly_sales
PIVOT (
SUM(revenue)
FOR sale_month IN (1 AS jan, 2 AS feb, 3 AS mar)
)
ORDER BY category, region;
Output: Same result as Method 1, written with less code.
The limitation: the values in IN (1, 2, 3) must be hardcoded. If you want to pivot on a dynamic set of values (all distinct months in a table), PIVOT syntax cannot do it — you need dynamic SQL.
Method 3 — Unpivot (Wide to Long)
Sometimes you receive data in wide format and need to convert it to long format for your pipeline. UNPIVOT is the reverse of PIVOT.
-- Snowflake / SQL Server UNPIVOT
SELECT category, region, month_name, revenue
FROM (
SELECT category, region, 12000.00 AS jan, 15000.00 AS feb, 9000.00 AS mar
WHERE category = 'Hardware' AND region = 'North'
) src
UNPIVOT (
revenue FOR month_name IN (jan, feb, mar)
) AS unpvt;
PostgreSQL alternative using CROSS JOIN LATERAL:
-- Convert wide format to long using VALUES()
SELECT
category,
region,
month_name,
revenue
FROM (
SELECT 'Hardware' AS category, 'North' AS region,
12000.00 AS jan, 15000.00 AS feb, 9000.00 AS mar
) wide_data
CROSS JOIN LATERAL (
VALUES ('jan', jan), ('feb', feb), ('mar', mar)
) AS months(month_name, revenue);
Output:
| category | region | month_name | revenue |
|---|---|---|---|
| Hardware | North | jan | 12000.00 |
| Hardware | North | feb | 15000.00 |
| Hardware | North | mar | 9000.00 |
Method 4 — Dynamic Pivot (When Columns Are Not Known in Advance)
The hardest case: you want to pivot on a column where the distinct values are not known at query-write time — for example, pivoting by product name when new products are added monthly.
PostgreSQL — dynamic pivot with plpgsql:
-- Step 1: generate the CASE WHEN list dynamically
SELECT
'SUM(CASE WHEN sale_month = ' || sale_month || ' THEN revenue ELSE 0 END) AS month_' || sale_month
FROM (SELECT DISTINCT sale_month FROM monthly_sales ORDER BY sale_month) months;
Output of Step 1 (the generated SQL fragments):
SUM(CASE WHEN sale_month = 1 THEN revenue ELSE 0 END) AS month_1
SUM(CASE WHEN sale_month = 2 THEN revenue ELSE 0 END) AS month_2
SUM(CASE WHEN sale_month = 3 THEN revenue ELSE 0 END) AS month_3
-- Step 2: assemble and execute dynamically (PostgreSQL plpgsql)
DO $$
DECLARE
col_list TEXT;
sql TEXT;
BEGIN
SELECT STRING_AGG(
'SUM(CASE WHEN sale_month = ' || sale_month ||
' THEN revenue ELSE 0 END) AS month_' || sale_month,
', '
ORDER BY sale_month
)
INTO col_list
FROM (SELECT DISTINCT sale_month FROM monthly_sales) m;
sql := '
SELECT category, region, ' || col_list || ', SUM(revenue) AS total
FROM monthly_sales
GROUP BY category, region
ORDER BY category, region
';
EXECUTE sql;
END;
$$;
Databricks / Spark SQL dynamic pivot:
# PySpark: pivot is built into DataFrame API
df = spark.table("monthly_sales")
result = (
df.groupBy("category", "region")
.pivot("sale_month") # automatically discovers all distinct months
.agg({"revenue": "sum"})
.orderBy("category", "region")
)
result.show()
Spark’s .pivot() is the cleanest dynamic pivot available — it automatically discovers all distinct values in the pivot column without requiring dynamic SQL.
Building a Cross-Tab Report — Category vs Region
A cross-tab shows how two categorical dimensions intersect. This is different from a time pivot — instead of months as columns, you have categories as columns and regions as rows (or vice versa).
-- Revenue by region (rows) x category (columns)
SELECT
region,
SUM(CASE WHEN category = 'Hardware' THEN revenue ELSE 0 END) AS hardware,
SUM(CASE WHEN category = 'Software' THEN revenue ELSE 0 END) AS software,
SUM(revenue) AS grand_total,
ROUND(
SUM(CASE WHEN category = 'Hardware' THEN revenue ELSE 0 END) * 100.0 / SUM(revenue),
1
) AS hardware_pct
FROM monthly_sales
WHERE sale_year = 2024
GROUP BY region
ORDER BY grand_total DESC;
Output:
| region | hardware | software | grand_total | hardware_pct |
|---|---|---|---|---|
| North | 36000.00 | 12500.00 | 48500.00 | 74.2% |
| South | 28000.00 | 8000.00 | 36000.00 | 77.8% |
Adding Subtotals and Grand Totals with ROLLUP
ROLLUP automatically adds subtotal rows for each grouping level and a grand total row at the end.
SELECT
COALESCE(category, 'ALL CATEGORIES') AS category,
COALESCE(region, 'ALL REGIONS') AS region,
SUM(revenue) AS total_revenue,
SUM(order_count) AS total_orders
FROM monthly_sales
GROUP BY ROLLUP(category, region)
ORDER BY category NULLS LAST, region NULLS LAST;
Output:
| category | region | total_revenue | total_orders |
|---|---|---|---|
| Hardware | North | 36000.00 | 132 |
| Hardware | South | 28000.00 | 112 |
| Hardware | ALL REGIONS | 64000.00 | 244 |
| Software | North | 12500.00 | 115 |
| Software | South | 8000.00 | 68 |
| Software | ALL REGIONS | 20500.00 | 183 |
| ALL CATEGORIES | ALL REGIONS | 84500.00 | 427 |
ROLLUP(category, region) generates subtotals at the category level and a grand total. Use CUBE(category, region) if you also need subtotals at the region level.
Common Mistakes
Mistake 1 — Using NULL instead of 0 for missing pivot values
When a category has no data in a particular month, SUM(CASE WHEN ... THEN revenue END) returns NULL. Always use ELSE 0:
-- WRONG: returns NULL for missing months
SUM(CASE WHEN sale_month = 4 THEN revenue END)
-- CORRECT: returns 0 for missing months
SUM(CASE WHEN sale_month = 4 THEN revenue ELSE 0 END)
Mistake 2 — Hardcoding pivot columns then wondering why new months are missing
If you hardcode January to March and April data arrives, it never appears. Either: update the pivot SQL each month, or use a dynamic pivot in Spark/PySpark.
Mistake 3 — Pivoting at the wrong granularity
If your source data has multiple rows per category per month (by region), a naive pivot without proper GROUP BY will double-count. Always define the grouping level before pivoting.
What to Learn Next
Pivot tables and cross-tab reports are a common Gold layer output format. The next tutorial covers SQL window functions for advanced ranking and distribution analysis — the queries used to build percentile reports, tier classifications, and distribution charts.