SQL for Hierarchical Data — Recursive CTEs for Org Charts, Category Trees, and Bill of Materials

Hierarchical data is everywhere in enterprise systems: organizational charts, product category trees, bill of materials, geographic hierarchies (city → region → country), account rollups in finance. Standard SQL aggregations cannot traverse parent-child relationships across multiple levels — that is what recursive CTEs are built for.

This tutorial covers recursive CTEs with three real-world data engineering use cases, including how to handle cycles and build path-based hierarchies.


The Core Structure — How Recursive CTEs Work

A recursive CTE has exactly two parts joined by UNION ALL:

WITH RECURSIVE cte_name AS (
    -- Part 1: Anchor — the starting rows (no recursion)
    SELECT ...
    FROM table
    WHERE [starting condition]

    UNION ALL

    -- Part 2: Recursive step — joins cte_name to itself, extending one level
    SELECT ...
    FROM table
    JOIN cte_name ON [parent-child join condition]
    WHERE [optional termination condition]
)
SELECT * FROM cte_name;

The engine executes the anchor first, then repeatedly executes the recursive step using the previous iteration’s output as input, until no new rows are produced.


Use Case 1 — Organizational Hierarchy

The classic example: finding all employees under a given manager, at any depth.

CREATE TABLE employees (
    employee_id   INT,
    name          VARCHAR(50),
    title         VARCHAR(50),
    manager_id    INT,     -- NULL for the CEO (root node)
    department    VARCHAR(30),
    salary        DECIMAL(10,2)
);

INSERT INTO employees VALUES
(1,  'Sarah Chen',    'CEO',               NULL, 'Executive',   250000),
(2,  'Mark Davis',    'VP Engineering',       1, 'Engineering', 180000),
(3,  'Lisa Park',     'VP Sales',             1, 'Sales',       175000),
(4,  'Tom Wilson',    'VP Finance',           1, 'Finance',     165000),
(5,  'Anna Lee',      'Engineering Manager',  2, 'Engineering', 130000),
(6,  'James Kim',     'Sales Manager',        3, 'Sales',       120000),
(7,  'Rachel Green',  'Senior Engineer',      5, 'Engineering', 110000),
(8,  'Chris Brown',   'Engineer',             5, 'Engineering',  95000),
(9,  'David Chen',    'Sales Rep',            6, 'Sales',        80000),
(10, 'Emily White',   'Sales Rep',            6, 'Sales',        80000),
(11, 'Kevin Patel',   'Financial Analyst',    4, 'Finance',      90000);

Query 1 — Full org chart with depth and path

WITH RECURSIVE org_tree AS (
    -- Anchor: start at the root (CEO)
    SELECT
        employee_id,
        name,
        title,
        manager_id,
        salary,
        0                          AS depth,
        name::VARCHAR(500)         AS full_path,
        ARRAY[employee_id]         AS path_ids    -- track visited nodes (PostgreSQL)
    FROM employees
    WHERE manager_id IS NULL   -- root node

    UNION ALL

    -- Recursive: join each employee to their manager's row
    SELECT
        e.employee_id,
        e.name,
        e.title,
        e.manager_id,
        e.salary,
        ot.depth + 1,
        (ot.full_path || ' > ' || e.name)::VARCHAR(500),
        ot.path_ids || e.employee_id
    FROM employees e
    JOIN org_tree ot ON e.manager_id = ot.employee_id
)
SELECT
    REPEAT('    ', depth) || name AS indented_name,
    title,
    depth                         AS org_level,
    full_path,
    salary
FROM org_tree
ORDER BY full_path;

Output:

indented_nametitleorg_levelfull_path
Sarah ChenCEO0Sarah Chen
    Lisa ParkVP Sales1Sarah Chen > Lisa Park
        James KimSales Manager2Sarah Chen > Lisa Park > James Kim
            David ChenSales Rep3Sarah Chen > Lisa Park > James Kim > David Chen
            Emily WhiteSales Rep3Sarah Chen > Lisa Park > James Kim > Emily White
    Mark DavisVP Engineering1Sarah Chen > Mark Davis
        Anna LeeEng Manager2Sarah Chen > Mark Davis > Anna Lee

Query 2 — Total headcount and payroll under each manager

WITH RECURSIVE org_tree AS (
    SELECT employee_id, name, manager_id, salary, employee_id AS root_manager_id
    FROM employees

    UNION ALL

    SELECT e.employee_id, e.name, e.manager_id, e.salary, ot.root_manager_id
    FROM employees e
    JOIN org_tree ot ON e.manager_id = ot.employee_id
)
SELECT
    m.name              AS manager_name,
    m.title,
    COUNT(ot.employee_id) - 1   AS headcount_under,    -- subtract self
    SUM(ot.salary) - m.salary   AS total_team_payroll   -- subtract own salary
FROM org_tree ot
JOIN employees m ON ot.root_manager_id = m.employee_id
GROUP BY m.employee_id, m.name, m.title, m.salary
HAVING COUNT(ot.employee_id) > 1   -- only show managers
ORDER BY total_team_payroll DESC;

Output:

manager_nametitleheadcount_undertotal_team_payroll
Sarah ChenCEO101225000
Mark DavisVP Engineering3335000
Lisa ParkVP Sales3280000
Anna LeeEng Manager2205000
James KimSales Manager2160000
Tom WilsonVP Finance190000

Use Case 2 — Product Category Tree

E-commerce category hierarchies: Electronics → Computers → Laptops → Gaming Laptops.

CREATE TABLE categories (
    category_id   INT,
    category_name VARCHAR(50),
    parent_id     INT     -- NULL for root categories
);

INSERT INTO categories VALUES
(1,  'Electronics',      NULL),
(2,  'Computers',           1),
(3,  'Phones',              1),
(4,  'Accessories',         1),
(5,  'Laptops',             2),
(6,  'Desktops',            2),
(7,  'Smartphones',         3),
(8,  'Tablets',             3),
(9,  'Gaming Laptops',      5),
(10, 'Business Laptops',    5),
(11, 'Cases',               4),
(12, 'Chargers',            4);

Find all descendants of a given category

WITH RECURSIVE category_tree AS (
    -- Anchor: start at a specific category (Computers = 2)
    SELECT
        category_id,
        category_name,
        parent_id,
        0 AS depth
    FROM categories
    WHERE category_id = 2   -- start node

    UNION ALL

    SELECT
        c.category_id,
        c.category_name,
        c.parent_id,
        ct.depth + 1
    FROM categories c
    JOIN category_tree ct ON c.parent_id = ct.category_id
)
SELECT
    REPEAT('  ', depth) || category_name AS category_tree,
    depth,
    category_id
FROM category_tree
ORDER BY depth, category_name;

Output:

category_treedepthcategory_id
Computers02
  Laptops15
  Desktops16
    Business Laptops210
    Gaming Laptops29

Find the full path for any category (bottom-up traversal)

Going up the tree instead of down — find the breadcrumb path for any leaf node.

WITH RECURSIVE breadcrumb AS (
    -- Anchor: start at the leaf node
    SELECT
        category_id,
        category_name,
        parent_id,
        category_name::VARCHAR(500) AS path
    FROM categories
    WHERE category_id = 9   -- Gaming Laptops

    UNION ALL

    -- Recursive: traverse upward to parent
    SELECT
        c.category_id,
        c.category_name,
        c.parent_id,
        (c.category_name || ' > ' || bc.path)::VARCHAR(500)
    FROM categories c
    JOIN breadcrumb bc ON c.category_id = bc.parent_id
)
SELECT path AS full_breadcrumb
FROM breadcrumb
WHERE parent_id IS NULL;   -- return only the root row (which has the full path)

Output:

full_breadcrumb
Electronics > Computers > Laptops > Gaming Laptops

This is the classic e-commerce breadcrumb navigation path, built entirely in SQL.


Use Case 3 — Bill of Materials (Parts Explosion)

Manufacturing and supply chain: a product is made of components, which are made of sub-components, each with a quantity.

CREATE TABLE bom (
    parent_part_id   INT,
    child_part_id    INT,
    quantity         DECIMAL(10,3),
    unit             VARCHAR(10)
);

CREATE TABLE parts (
    part_id    INT,
    part_name  VARCHAR(50),
    unit_cost  DECIMAL(10,2)
);

INSERT INTO parts VALUES
(1, 'Laptop',         0.00),    -- finished good
(2, 'Chassis',       45.00),
(3, 'Screen',        80.00),
(4, 'Motherboard',  120.00),
(5, 'CPU',          200.00),
(6, 'RAM Module',    30.00),
(7, 'Keyboard',      15.00),
(8, 'Screw Set',      2.00),
(9, 'Thermal Paste',  1.50);

INSERT INTO bom VALUES
(1, 2, 1,   'each'),    -- Laptop needs 1 Chassis
(1, 3, 1,   'each'),    -- Laptop needs 1 Screen
(1, 4, 1,   'each'),    -- Laptop needs 1 Motherboard
(1, 7, 1,   'each'),    -- Laptop needs 1 Keyboard
(4, 5, 1,   'each'),    -- Motherboard needs 1 CPU
(4, 6, 2,   'each'),    -- Motherboard needs 2 RAM Modules
(4, 8, 12,  'each'),    -- Motherboard needs 12 Screws
(4, 9, 0.5, 'gram');    -- Motherboard needs 0.5g Thermal Paste

Full parts explosion — all raw materials for 1 laptop, with quantities

WITH RECURSIVE parts_explosion AS (
    -- Anchor: start with the top-level product
    SELECT
        b.parent_part_id,
        b.child_part_id,
        b.quantity     AS total_quantity,
        b.unit,
        1              AS depth
    FROM bom b
    WHERE b.parent_part_id = 1   -- start with Laptop

    UNION ALL

    -- Recursive: expand sub-components, multiplying quantities upward
    SELECT
        b.parent_part_id,
        b.child_part_id,
        pe.total_quantity * b.quantity   AS total_quantity,   -- multiply quantities at each level
        b.unit,
        pe.depth + 1
    FROM bom b
    JOIN parts_explosion pe ON pe.child_part_id = b.parent_part_id
)
SELECT
    p.part_name,
    pe.total_quantity,
    pe.unit,
    pe.depth,
    p.unit_cost,
    ROUND(pe.total_quantity * p.unit_cost, 2)  AS total_cost
FROM parts_explosion pe
JOIN parts p ON pe.child_part_id = p.part_id
ORDER BY pe.depth, p.part_name;

Output:

part_nametotal_quantityunitdepthunit_costtotal_cost
Chassis1each145.0045.00
Keyboard1each115.0015.00
Motherboard1each1120.00120.00
Screen1each180.0080.00
CPU1each2200.00200.00
RAM Module2each230.0060.00
Screw Set12each22.0024.00
Thermal Paste0.5gram21.500.75

Total raw material cost: 544.75.

The key insight is pe.total_quantity * b.quantity — this cascades the multiplied quantities through multiple levels. If you needed 10 laptops, change the anchor quantity to 10 and all downstream quantities scale automatically.


Handling Cycles in Hierarchical Data

Real-world data sometimes has circular references (A is parent of B, B is parent of A). Without a cycle guard, your recursive CTE will loop forever.

WITH RECURSIVE safe_tree AS (
    SELECT
        employee_id,
        manager_id,
        ARRAY[employee_id] AS visited   -- track all visited IDs
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    SELECT
        e.employee_id,
        e.manager_id,
        st.visited || e.employee_id
    FROM employees e
    JOIN safe_tree st ON e.manager_id = st.employee_id
    WHERE NOT e.employee_id = ANY(st.visited)   -- stop if we have seen this node before
      AND ARRAY_LENGTH(st.visited, 1) < 20      -- hard depth limit as backup
)
SELECT * FROM safe_tree;

NOT e.employee_id = ANY(st.visited) is the cycle guard. If any node appears in the visited array, the recursion stops for that branch.


Common Mistakes

Mistake 1 — No termination condition

Without WHERE NOT id = ANY(visited) or WHERE depth < max_depth, a circular reference causes an infinite loop. Always add a cycle guard when your data could have cycles.

Mistake 2 — Forgetting to multiply quantities in BOM explosions

In a multi-level BOM, you must multiply the parent quantity by the child quantity at each level (pe.total_quantity * b.quantity). Failing to do this gives you the per-unit quantity at each level, not the total requirement.

Mistake 3 — Using UNION instead of UNION ALL

Recursive CTEs must use UNION ALL, not UNION. Using UNION causes the engine to deduplicate after every step, which breaks the recursion.

Mistake 4 — Selecting from the recursive CTE multiple times

Some databases (PostgreSQL) may re-evaluate a CTE each time it is referenced. If you need the result of a recursive CTE in multiple places, materialize it into a temp table first.


Quick Reference — When to Use Recursive CTEs

Data structureUse recursive CTE when
Org chartFinding all reports under a manager at any depth
Category treeGetting all descendants or ancestors of a node
Bill of materialsExploding a product into all raw material components
Geographic hierarchyRolling up city → region → country → global
Account hierarchyFinancial roll-ups across subsidiary accounts
Network graphFinding connected components (with cycle guard)

What to Learn Next

Hierarchical queries give you the traversal logic. The next tutorial covers SQL pivot tables and cross-tab reports — reshaping your aggregated data from long format to wide format for dashboard consumption.

Leave a Comment