Hierarchical data is everywhere in enterprise systems: organizational charts, product category trees, bill of materials, geographic hierarchies (city → region → country), account rollups in finance. Standard SQL aggregations cannot traverse parent-child relationships across multiple levels — that is what recursive CTEs are built for.
This tutorial covers recursive CTEs with three real-world data engineering use cases, including how to handle cycles and build path-based hierarchies.
The Core Structure — How Recursive CTEs Work
A recursive CTE has exactly two parts joined by UNION ALL:
WITH RECURSIVE cte_name AS (
-- Part 1: Anchor — the starting rows (no recursion)
SELECT ...
FROM table
WHERE [starting condition]
UNION ALL
-- Part 2: Recursive step — joins cte_name to itself, extending one level
SELECT ...
FROM table
JOIN cte_name ON [parent-child join condition]
WHERE [optional termination condition]
)
SELECT * FROM cte_name;
The engine executes the anchor first, then repeatedly executes the recursive step using the previous iteration’s output as input, until no new rows are produced.
Use Case 1 — Organizational Hierarchy
The classic example: finding all employees under a given manager, at any depth.
CREATE TABLE employees (
employee_id INT,
name VARCHAR(50),
title VARCHAR(50),
manager_id INT, -- NULL for the CEO (root node)
department VARCHAR(30),
salary DECIMAL(10,2)
);
INSERT INTO employees VALUES
(1, 'Sarah Chen', 'CEO', NULL, 'Executive', 250000),
(2, 'Mark Davis', 'VP Engineering', 1, 'Engineering', 180000),
(3, 'Lisa Park', 'VP Sales', 1, 'Sales', 175000),
(4, 'Tom Wilson', 'VP Finance', 1, 'Finance', 165000),
(5, 'Anna Lee', 'Engineering Manager', 2, 'Engineering', 130000),
(6, 'James Kim', 'Sales Manager', 3, 'Sales', 120000),
(7, 'Rachel Green', 'Senior Engineer', 5, 'Engineering', 110000),
(8, 'Chris Brown', 'Engineer', 5, 'Engineering', 95000),
(9, 'David Chen', 'Sales Rep', 6, 'Sales', 80000),
(10, 'Emily White', 'Sales Rep', 6, 'Sales', 80000),
(11, 'Kevin Patel', 'Financial Analyst', 4, 'Finance', 90000);
Query 1 — Full org chart with depth and path
WITH RECURSIVE org_tree AS (
-- Anchor: start at the root (CEO)
SELECT
employee_id,
name,
title,
manager_id,
salary,
0 AS depth,
name::VARCHAR(500) AS full_path,
ARRAY[employee_id] AS path_ids -- track visited nodes (PostgreSQL)
FROM employees
WHERE manager_id IS NULL -- root node
UNION ALL
-- Recursive: join each employee to their manager's row
SELECT
e.employee_id,
e.name,
e.title,
e.manager_id,
e.salary,
ot.depth + 1,
(ot.full_path || ' > ' || e.name)::VARCHAR(500),
ot.path_ids || e.employee_id
FROM employees e
JOIN org_tree ot ON e.manager_id = ot.employee_id
)
SELECT
REPEAT(' ', depth) || name AS indented_name,
title,
depth AS org_level,
full_path,
salary
FROM org_tree
ORDER BY full_path;
Output:
| indented_name | title | org_level | full_path |
|---|---|---|---|
| Sarah Chen | CEO | 0 | Sarah Chen |
| Lisa Park | VP Sales | 1 | Sarah Chen > Lisa Park |
| James Kim | Sales Manager | 2 | Sarah Chen > Lisa Park > James Kim |
| David Chen | Sales Rep | 3 | Sarah Chen > Lisa Park > James Kim > David Chen |
| Emily White | Sales Rep | 3 | Sarah Chen > Lisa Park > James Kim > Emily White |
| Mark Davis | VP Engineering | 1 | Sarah Chen > Mark Davis |
| Anna Lee | Eng Manager | 2 | Sarah Chen > Mark Davis > Anna Lee |
| … | … | … | … |
Query 2 — Total headcount and payroll under each manager
WITH RECURSIVE org_tree AS (
SELECT employee_id, name, manager_id, salary, employee_id AS root_manager_id
FROM employees
UNION ALL
SELECT e.employee_id, e.name, e.manager_id, e.salary, ot.root_manager_id
FROM employees e
JOIN org_tree ot ON e.manager_id = ot.employee_id
)
SELECT
m.name AS manager_name,
m.title,
COUNT(ot.employee_id) - 1 AS headcount_under, -- subtract self
SUM(ot.salary) - m.salary AS total_team_payroll -- subtract own salary
FROM org_tree ot
JOIN employees m ON ot.root_manager_id = m.employee_id
GROUP BY m.employee_id, m.name, m.title, m.salary
HAVING COUNT(ot.employee_id) > 1 -- only show managers
ORDER BY total_team_payroll DESC;
Output:
| manager_name | title | headcount_under | total_team_payroll |
|---|---|---|---|
| Sarah Chen | CEO | 10 | 1225000 |
| Mark Davis | VP Engineering | 3 | 335000 |
| Lisa Park | VP Sales | 3 | 280000 |
| Anna Lee | Eng Manager | 2 | 205000 |
| James Kim | Sales Manager | 2 | 160000 |
| Tom Wilson | VP Finance | 1 | 90000 |
Use Case 2 — Product Category Tree
E-commerce category hierarchies: Electronics → Computers → Laptops → Gaming Laptops.
CREATE TABLE categories (
category_id INT,
category_name VARCHAR(50),
parent_id INT -- NULL for root categories
);
INSERT INTO categories VALUES
(1, 'Electronics', NULL),
(2, 'Computers', 1),
(3, 'Phones', 1),
(4, 'Accessories', 1),
(5, 'Laptops', 2),
(6, 'Desktops', 2),
(7, 'Smartphones', 3),
(8, 'Tablets', 3),
(9, 'Gaming Laptops', 5),
(10, 'Business Laptops', 5),
(11, 'Cases', 4),
(12, 'Chargers', 4);
Find all descendants of a given category
WITH RECURSIVE category_tree AS (
-- Anchor: start at a specific category (Computers = 2)
SELECT
category_id,
category_name,
parent_id,
0 AS depth
FROM categories
WHERE category_id = 2 -- start node
UNION ALL
SELECT
c.category_id,
c.category_name,
c.parent_id,
ct.depth + 1
FROM categories c
JOIN category_tree ct ON c.parent_id = ct.category_id
)
SELECT
REPEAT(' ', depth) || category_name AS category_tree,
depth,
category_id
FROM category_tree
ORDER BY depth, category_name;
Output:
| category_tree | depth | category_id |
|---|---|---|
| Computers | 0 | 2 |
| Laptops | 1 | 5 |
| Desktops | 1 | 6 |
| Business Laptops | 2 | 10 |
| Gaming Laptops | 2 | 9 |
Find the full path for any category (bottom-up traversal)
Going up the tree instead of down — find the breadcrumb path for any leaf node.
WITH RECURSIVE breadcrumb AS (
-- Anchor: start at the leaf node
SELECT
category_id,
category_name,
parent_id,
category_name::VARCHAR(500) AS path
FROM categories
WHERE category_id = 9 -- Gaming Laptops
UNION ALL
-- Recursive: traverse upward to parent
SELECT
c.category_id,
c.category_name,
c.parent_id,
(c.category_name || ' > ' || bc.path)::VARCHAR(500)
FROM categories c
JOIN breadcrumb bc ON c.category_id = bc.parent_id
)
SELECT path AS full_breadcrumb
FROM breadcrumb
WHERE parent_id IS NULL; -- return only the root row (which has the full path)
Output:
| full_breadcrumb |
|---|
| Electronics > Computers > Laptops > Gaming Laptops |
This is the classic e-commerce breadcrumb navigation path, built entirely in SQL.
Use Case 3 — Bill of Materials (Parts Explosion)
Manufacturing and supply chain: a product is made of components, which are made of sub-components, each with a quantity.
CREATE TABLE bom (
parent_part_id INT,
child_part_id INT,
quantity DECIMAL(10,3),
unit VARCHAR(10)
);
CREATE TABLE parts (
part_id INT,
part_name VARCHAR(50),
unit_cost DECIMAL(10,2)
);
INSERT INTO parts VALUES
(1, 'Laptop', 0.00), -- finished good
(2, 'Chassis', 45.00),
(3, 'Screen', 80.00),
(4, 'Motherboard', 120.00),
(5, 'CPU', 200.00),
(6, 'RAM Module', 30.00),
(7, 'Keyboard', 15.00),
(8, 'Screw Set', 2.00),
(9, 'Thermal Paste', 1.50);
INSERT INTO bom VALUES
(1, 2, 1, 'each'), -- Laptop needs 1 Chassis
(1, 3, 1, 'each'), -- Laptop needs 1 Screen
(1, 4, 1, 'each'), -- Laptop needs 1 Motherboard
(1, 7, 1, 'each'), -- Laptop needs 1 Keyboard
(4, 5, 1, 'each'), -- Motherboard needs 1 CPU
(4, 6, 2, 'each'), -- Motherboard needs 2 RAM Modules
(4, 8, 12, 'each'), -- Motherboard needs 12 Screws
(4, 9, 0.5, 'gram'); -- Motherboard needs 0.5g Thermal Paste
Full parts explosion — all raw materials for 1 laptop, with quantities
WITH RECURSIVE parts_explosion AS (
-- Anchor: start with the top-level product
SELECT
b.parent_part_id,
b.child_part_id,
b.quantity AS total_quantity,
b.unit,
1 AS depth
FROM bom b
WHERE b.parent_part_id = 1 -- start with Laptop
UNION ALL
-- Recursive: expand sub-components, multiplying quantities upward
SELECT
b.parent_part_id,
b.child_part_id,
pe.total_quantity * b.quantity AS total_quantity, -- multiply quantities at each level
b.unit,
pe.depth + 1
FROM bom b
JOIN parts_explosion pe ON pe.child_part_id = b.parent_part_id
)
SELECT
p.part_name,
pe.total_quantity,
pe.unit,
pe.depth,
p.unit_cost,
ROUND(pe.total_quantity * p.unit_cost, 2) AS total_cost
FROM parts_explosion pe
JOIN parts p ON pe.child_part_id = p.part_id
ORDER BY pe.depth, p.part_name;
Output:
| part_name | total_quantity | unit | depth | unit_cost | total_cost |
|---|---|---|---|---|---|
| Chassis | 1 | each | 1 | 45.00 | 45.00 |
| Keyboard | 1 | each | 1 | 15.00 | 15.00 |
| Motherboard | 1 | each | 1 | 120.00 | 120.00 |
| Screen | 1 | each | 1 | 80.00 | 80.00 |
| CPU | 1 | each | 2 | 200.00 | 200.00 |
| RAM Module | 2 | each | 2 | 30.00 | 60.00 |
| Screw Set | 12 | each | 2 | 2.00 | 24.00 |
| Thermal Paste | 0.5 | gram | 2 | 1.50 | 0.75 |
Total raw material cost: 544.75.
The key insight is pe.total_quantity * b.quantity — this cascades the multiplied quantities through multiple levels. If you needed 10 laptops, change the anchor quantity to 10 and all downstream quantities scale automatically.
Handling Cycles in Hierarchical Data
Real-world data sometimes has circular references (A is parent of B, B is parent of A). Without a cycle guard, your recursive CTE will loop forever.
WITH RECURSIVE safe_tree AS (
SELECT
employee_id,
manager_id,
ARRAY[employee_id] AS visited -- track all visited IDs
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT
e.employee_id,
e.manager_id,
st.visited || e.employee_id
FROM employees e
JOIN safe_tree st ON e.manager_id = st.employee_id
WHERE NOT e.employee_id = ANY(st.visited) -- stop if we have seen this node before
AND ARRAY_LENGTH(st.visited, 1) < 20 -- hard depth limit as backup
)
SELECT * FROM safe_tree;
NOT e.employee_id = ANY(st.visited) is the cycle guard. If any node appears in the visited array, the recursion stops for that branch.
Common Mistakes
Mistake 1 — No termination condition
Without WHERE NOT id = ANY(visited) or WHERE depth < max_depth, a circular reference causes an infinite loop. Always add a cycle guard when your data could have cycles.
Mistake 2 — Forgetting to multiply quantities in BOM explosions
In a multi-level BOM, you must multiply the parent quantity by the child quantity at each level (pe.total_quantity * b.quantity). Failing to do this gives you the per-unit quantity at each level, not the total requirement.
Mistake 3 — Using UNION instead of UNION ALL
Recursive CTEs must use UNION ALL, not UNION. Using UNION causes the engine to deduplicate after every step, which breaks the recursion.
Mistake 4 — Selecting from the recursive CTE multiple times
Some databases (PostgreSQL) may re-evaluate a CTE each time it is referenced. If you need the result of a recursive CTE in multiple places, materialize it into a temp table first.
Quick Reference — When to Use Recursive CTEs
| Data structure | Use recursive CTE when |
|---|---|
| Org chart | Finding all reports under a manager at any depth |
| Category tree | Getting all descendants or ancestors of a node |
| Bill of materials | Exploding a product into all raw material components |
| Geographic hierarchy | Rolling up city → region → country → global |
| Account hierarchy | Financial roll-ups across subsidiary accounts |
| Network graph | Finding connected components (with cycle guard) |
What to Learn Next
Hierarchical queries give you the traversal logic. The next tutorial covers SQL pivot tables and cross-tab reports — reshaping your aggregated data from long format to wide format for dashboard consumption.