Materialize and Advent of Code: Using SQL to solve your puzzles!

This past year Team Materialize struck out to do each day of 2023’s Advent of Code, an annual programming event with thought-provoking problems that you are encouraged to approach from non-standard directions. We figured we’d try and use SQL for the whole thing.

SQL is a bold choice because it is meant for querying data, and not as much for general computation. Several of the problems call for interesting algorithms, specific data structures, and some flexibility. However, Materialize’s core thesis is that you can do so much more with SQL that just query your data. If you want to move operational logic from bespoke code into SQL, you’ll need to be able to express that logic. And so, Advent of Code was a great opportunity to stretch our legs, and fingers, and see just how much logic fits into SQL.

Preliminaries

There’s a lot of content in the month’s problems. There are 49 problems, and although there is some overlap really there is too much to say about all of them. We aren’t going to recount each of the problems, the whimsical backstories, and the shape of the problem inputs. We’ll try and flag some surprising moments, though, and you should dive into those problems if you are keen (they can each be done on their own).

I (Frank) wrote all of my solutions using Materialize’s WITH MUTUALLY RECURSIVE even when recursion was not required. This just helped me start writing, as the blocks allow you to just start naming subqueries and writing SQL.

My solutions all had the same skeletal structure:

sql

WITH MUTUALLY RECURSIVE

    -- Parse the problem input into tabular form.
    lines(line TEXT) AS ( .. ),

    -- SQL leading up to part 1.
    part1(part1 BIGINT) AS ( .. ),

    -- SQL leading up to part 2.
    part2(part2 BIGINT) AS ( .. ) 

SELECT * FROM part1, part2;

As mentioned, we won’t always need recursion. However, we often do use recursion, and may even need it. We’ll call this out, as the use (and ease) of recursion in SQL was one of the main unlocks.

Week one

Day one was largely about text manipulation, specifically extracting numbers from text, and was well-addressed by using regular expressions to manipulate and search the text.

See the solution

Link to puzzle(s) 🟢 🟢

Part one

The newly-improved calibration document consists of lines of text; each line originally contained a specific calibration value that the Elves now need to recover. On each line, the calibration value can be found by combining the first digit and the last digit (in that order) to form a single two-digit number.

Consider your entire calibration document. What is the sum of all of the calibration values?

sql

SELECT SUM(LEFT(r, 1)::int * 10 + RIGHT(r, 1)::int) AS part1
FROM (
	SELECT regexp_replace(input, '[^\d]', '', 'g') AS r
	FROM aoc_1201
);

Part two

Your calculation isn’t quite right. It looks like some of the digits are actually spelled out with letters: one, two, three, four, five, six, seven, eight, and nine also count as valid “digits”.

Equipped with this new information, you now need to find the real first and last digit on each line.

sql

WITH
    lines AS (
        SELECT regexp_split_to_table(input, '\n') AS line
        FROM aoc_1201
    ),
    slices AS (
        SELECT line, index, substring(line, index, width) AS slice
        FROM
            lines,
            generate_series(1, length(line)) AS index,
            generate_series(1, 5) AS width
    ),
    numbers (t, n) AS (
        VALUES ('0', 0), ('1', 1), ('2', 2), ('3', 3), ('4', 4), ('5', 5), ('6', 6), ('7', 7), ('8', 8), ('9', 9),
               ('zero', 0), ('one', 1), ('two', 2), ('three', 3), ('four', 4), ('five', 5), ('six', 6), ('seven', 7), ('eight', 8), ('nine', 9)
    ),
    findings AS (
        SELECT line, index, n AS number
        FROM slices, numbers
        WHERE slices.slice = numbers.t
    ),
    first AS ( SELECT DISTINCT ON (line) line, number AS f FROM findings ORDER BY line, index ),
    last AS ( SELECT DISTINCT ON (line) line, number AS l FROM findings ORDER BY line, index DESC )
SELECT SUM(f * 10 + l)
FROM first, last
WHERE first.line = last.line

Contributors

Day 1 was brought to you by: @chass, @def-, @doy-materialize, @frankmcsherry, @josharenberg, @morsapaes, @nrainer-materialize

Day two was largely about aggregation: rolling up counts and maxima for games involving numbers of colored cubes; SQL did great here.

See the solution

Link to puzzle(s) 🟢 🟢

Part one

Given a table with the following format:

sql

game_id   | set_id | green_cnt | red_cnt | blue_cnt
----------+--------+-----------+---------+----------
 Game 4   | set_2  |        12 |       0 |        0
...

sql

WITH game_cnt AS (
SELECT split_part(game_id,' ', 2)::int AS game_id,
       COUNT(set_id) AS total_set_cnt,
       COUNT(set_id) FILTER (WHERE (green_cnt <= 13) AND (red_cnt <= 12) AND (blue_cnt <= 14)) AS possible_set_cnt
FROM aoc_1202
GROUP BY game_id
)
SELECT SUM(game_id) FROM game_cnt WHERE total_set_cnt = possible_set_cnt;

Part two

sql

WITH game_min AS (
SELECT split_part(game_id,' ', 2)::int AS game_id,
       MAX(green_cnt) AS green_min,
       MAX(red_cnt) AS red_min,
       MAX(blue_cnt) AS blue_min
FROM aoc_1202
GROUP BY split_part(game_id,' ', 2)::int
)
SELECT SUM(green_min*red_min*blue_min) FROM game_min;

Part one + two in one go!

sql

-- Pre-supposes a view `input(input TEXT)` containing the string from AOC
with mutually recursive
    -- Parse the input up
    lines(line TEXT) as (select regexp_split_to_table(input, '\n') as line from input),
    games(game TEXT, report TEXT) as (select regexp_split_to_array(line, ':')[1], regexp_split_to_array(line, ':')[2] from lines),
    round(game TEXT, visible TEXT) as (select game, regexp_split_to_table(report, ';') from games),
    bacon(game TEXT, color TEXT) as (select game, regexp_split_to_table(visible, ',') from round),
    parsed(game INT, color TEXT, number INT) as (
        select
            substring(game, 5)::INT as game,
            regexp_split_to_array(color, ' ')[3] as color,
            regexp_split_to_array(color, ' ')[2]::INT as number
        from bacon
    ),
    -- PART 1
    limits(color TEXT, number INT) as (SELECT * FROM (VALUES ('red', 12), ('green', 13), ('blue', 14))),
    bad_news(game INT) as (
        select game
        from parsed, limits
        where parsed.color = limits.color
          AND parsed.number > limits.number
    ),
    plausible(game INT) as (select distinct parsed.game from parsed left join bad_news on(parsed.game = bad_news.game) where bad_news.game IS NULL),
    part1(part1 BIGINT) as (select SUM(game) from plausible),
    -- PART 2
    maximum(game INT, color TEXT, number INT) as (select game, color, max(number) from parsed GROUP BY game, color),
    red(game INT) as (select game from maximum, generate_series(1, number) where color = 'red'),
    blue(game INT) as (select game from maximum, generate_series(1, number) where color = 'blue'),
    green(game INT) as (select game from maximum, generate_series(1, number) where color = 'green'),
    power(game INT, product BIGINT) as (SELECT red.game, count(*) from red, blue, green where red.game = blue.game and blue.game = green.game GROUP BY red.game),
    part2(part2 BIGINT) as (select sum(product)::BIGINT from power)
select * from part1, part2;

Contributors

Day 2 was brought to you by: @def-, @frankmcsherry, @morsapaes

Day three has inputs in grid form, where there can be interaction between multiple lines (with symbols above or below others). You are looking for runs of numerals, and I used WMR to track these down; reportedly you can also use regular expressions, but I was not clever enough for that!

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

-- Pre-supposes a view `input(input TEXT)` containing the string from AOC
    WITH MUTUALLY RECURSIVE
        -- PART 0
        -- Parse the input as lines of text with line numbers.
        lines(line TEXT, row_idx INT) AS (
            SELECT
               regexp_split_to_array(input, '\n')[row_idx],
               row_idx
             FROM
                input,
                generate_series(1, (SELECT COUNT(*)::INT FROM (SELECT regexp_split_to_table(input, '\n') FROM input))) as row_idx
        ),
        chars(symbol TEXT, row_idx INT, col_idx INT) AS (
            SELECT
                substring(line, start, 1),
                row_idx,
                start
            FROM
                lines,
                generate_series(1, length(line)) as start
            WHERE
                substring(line, start, 1) != '.'
        ),
        numerals(number TEXT, row_idx INT, col_idx INT) AS (
            SELECT symbol, row_idx, col_idx
            FROM chars
            WHERE symbol IN ( VALUES ('0'), ('1'), ('2'), ('3'), ('4'), ('5'), ('6'), ('7'), ('8'), ('9') )
        ),
        symbols(symbol TEXT, row_idx INT, col_idx INT) AS (
            SELECT symbol, row_idx, col_idx
            FROM chars
            WHERE symbol NOT IN ( VALUES ('0'), ('1'), ('2'), ('3'), ('4'), ('5'), ('6'), ('7'), ('8'), ('9') )
        ),
        -- PART 1
        -- Recursively build up ranges of numerals that are "active", in the sense of being adjacent to a symbol.
        -- Each range has an accumulated number (as a string), a row index, a column index and length of the run.
        active(number TEXT, row_idx INT, col_idx INT, length INT) AS (
            -- Base case: numerals adjacent to a symbol
            SELECT numerals.*, 1
            FROM
                numerals,
                symbols,
                generate_series(-1, 1) row_off,
                generate_series(-1, 1) col_off
            WHERE numerals.row_idx = symbols.row_idx + row_off
              AND numerals.col_idx = symbols.col_idx + col_off
            UNION
            -- Inductive case 1: Join to the left
            SELECT numerals.number || active.number, numerals.row_idx, numerals.col_idx, active.length + 1
            FROM numerals, active
            WHERE numerals.row_idx = active.row_idx
              AND numerals.col_idx = active.col_idx - 1
            UNION
            -- Inductive case 2: Join to the right
            SELECT active.number || numerals.number, numerals.row_idx, active.col_idx, active.length + 1
            FROM numerals, active
            WHERE numerals.row_idx = active.row_idx
              AND numerals.col_idx = active.col_idx + active.length
        ),
        parts(number INT, row_idx INT, col_idx INT, length INT) AS (
            SELECT active.number::INT, row_idx, col_idx, length
            FROM active
            WHERE (active.row_idx, active.col_idx-1) NOT IN (SELECT row_idx, col_idx FROM numerals)
              AND (active.row_idx, active.col_idx+length) NOT IN (SELECT row_idx, col_idx FROM numerals)
        ),
        part1(part1 BIGINT) AS ( SELECT SUM(parts.number::INT) FROM parts ),
        -- PART 2
        -- A "gear" is a `*` adjacent to exactly two part numbers. We want the sum over gears of their product.
        -- A gear is identified by a location, which we will want to attempt to join with part numbers.
        gear_adjacent(row_idx INT, col_idx INT, number INT, part_row INT, part_col INT) AS (
            SELECT DISTINCT symbols.row_idx, symbols.col_idx, parts.number, parts.row_idx, parts.col_idx
            FROM
                symbols,
                generate_series(-1, 1) gear_r_off,
                generate_series(-1, 1) gear_c_off,
                parts,
                generate_series(parts.col_idx, parts.col_idx + parts.length - 1) part_col
            WHERE symbols.symbol = '*'
              AND symbols.row_idx + gear_r_off = parts.row_idx
              AND symbols.col_idx + gear_c_off = part_col
        ),
        gears(row_idx INT, col_idx INT) AS (
            SELECT row_idx, col_idx
            FROM gear_adjacent
            GROUP BY row_idx, col_idx
            HAVING COUNT(*) = 2
        ),
        gear_products(row_idx INT, col_idx INT, product INT) AS (
            SELECT DISTINCT gears.row_idx, gears.col_idx, p1.number * p2.number
            FROM gears, gear_adjacent p1, gear_adjacent p2
            WHERE gears.row_idx = p1.row_idx
              AND gears.col_idx = p1.col_idx
              AND gears.row_idx = p2.row_idx
              AND gears.col_idx = p2.col_idx
              AND (p1.part_row != p2.part_row OR p1.part_col != p2.part_col)
        ),
        part2(part2 BIGINT) AS ( SELECT SUM(product) FROM gear_products)

    SELECT * FROM part1, part2;

Contributors

Day 3 was brought to you by: @frankmcsherry, @morsapaes

Day four introduced scratch cards where each line of input has some winners and losers. This was easy SQL until part two, in which winners give you other scratch cards, which have winners that give you other scratch cards, which .. you can see the recursion. Despite being wordy and complicated, the SQL isn’t so bad:

sql

    -- PART 2
    -- Each card provides a copy of the next `score` cards.
    expanded(card INT, score BIGINT) AS (
        SELECT * FROM matches
        UNION ALL
        SELECT
            matches.card,
            matches.score
        FROM
            expanded,
            matches,
            generate_series(1, expanded.score) as step
        WHERE
            expanded.card + step = matches.card
    ),
    part2(part2 BIGINT) AS ( SELECT COUNT(*) FROM expanded)

This would be tricky to do with non-recursive SQL, as the data itself tells us how to unfold the results. Hooray for recursion!

See the solution

Link to puzzle(s) 🟢 🟢

Part one

sql

WITH parsed AS (
  SELECT regexp_split_to_table(input, '\n') AS line FROM aoc_1204
),
numbers AS (
  SELECT split_part(line,':',1) AS card_id,
         replace(split_part(line,':',2),'|','') AS nrs
  FROM parsed
),
arr AS (
  SELECT card_id,
         nrs,
         regexp_split_to_array(ltrim(rtrim(nrs)),'\s') AS nrs_arr
  FROM numbers
),
winning AS (
  SELECT card_id,
         unnest(array_remove(nrs_arr,'')) nr,
         ROW_NUMBER() OVER (PARTITION BY card_id) AS row_num
  FROM arr
  GROUP BY card_id, nr HAVING COUNT(*)>1
  ORDER BY card_id
),
winning_points AS (
  SELECT ROUND(EXP(SUM(LN(CASE WHEN row_num = 1 THEN row_num ELSE 2 END)))) AS points
  FROM winning
  GROUP BY card_id
)
SELECT SUM(points)
FROM winning_points;

Part two

sql

WITH MUTUALLY RECURSIVE
lines(line string) AS (
    SELECT
        regexp_split_to_table(input, '\n') AS line
    FROM
        aoc_1204
),
cards(match string[]) AS (
    SELECT
        regexp_match(line, 'Card +(\d+): (.*)') AS match
    FROM
        lines
),
card_parts(card_id int, parts string[]) AS (
    SELECT
        match[1]::int AS card_id,
        regexp_split_to_array(match[2], ' \| ') AS parts
    FROM
        cards
),
winners(card_id int, val int) AS (
    SELECT
        card_id,
        regexp_split_to_table(trim(parts[1]), '\s+')::int AS val
    FROM
        card_parts
),
ours(card_id int, val int) AS (
    SELECT
        card_id,
        regexp_split_to_table(trim(parts[2]), '\s+')::int AS val
    FROM
        card_parts
),
count_winning_numbers(card_id int, count int) AS (
    SELECT
        ours.card_id,
        count(winners.val)::int AS count
    FROM
        ours LEFT OUTER JOIN winners ON (
            ours.card_id = winners.card_id AND
            ours.val = winners.val
        )
    GROUP BY ours.card_id
),
prizes(card_id int, prize_id int) AS (
    SELECT
        card_id,
        prize_id
    FROM
        count_winning_numbers CROSS JOIN generate_series(card_id + 1, card_id + count) AS prize_id
    UNION
    SELECT
        0 AS card_id,
        ours.card_id AS prize_id
    FROM
        ours
),
multipliers(card_id int, multiplier int) AS (
    SELECT
        prizes.prize_id AS card_id,
        SUM(coalesce(multipliers.multiplier, 1))::int AS multiplier
    FROM
        prizes left outer JOIN multipliers ON (
            prizes.card_id = multipliers.card_id
        )
    GROUP BY prizes.prize_id
)
SELECT
    SUM(multiplier) AS answer
FROM
    multipliers;

Part one + two in one go!

sql

-- Pre-supposes a view `input(input TEXT)` containing the string FROM AOC
WITH MUTUALLY RECURSIVE
    -- PART 0
    -- Parse the input as lines of text with line numbers.
    lines(line TEXT) AS (
        SELECT regexp_split_to_table(input, '\n')
        FROM   input
    ),
    blocks(card TEXT, wins TEXT, have TEXT) AS (
        SELECT
            TRIM (regexp_split_to_array(line, '(:|\|)')[1]),
            TRIM (regexp_split_to_array(line, '(:|\|)')[2]),
            TRIM (regexp_split_to_array(line, '(:|\|)')[3])
        FROM
            lines
    ),
    parsed(card INT, wins TEXT[], have TEXT[]) AS (
        SELECT
            regexp_match(card, '[0-9]+')[1]::INT,
            regexp_split_to_array(wins, ' '),
            regexp_split_to_array(have, ' ')
        FROM blocks
    ),

    -- PART 1
    -- Count "have"s in "wins" for each row, exponentiate, sum.
    matches(card INT, score BIGINT) AS (
        SELECT card, (
            SELECT COUNT(*)
            FROM (
                SELECT unnest(wins) w
                INTERSECT
                SELECT unnest(have) w
            )
            WHERE w != ''
        )
        FROM parsed
    ),
    part1(part1 NUMERIC) AS (
        SELECT SUM(pow(2, score - 1))::NUMERIC
        FROM matches
        WHERE score > 0
    ),

    -- PART 2
    -- Each card provides a copy of the next `score` cards.
    -- This could be prefix sum if we want to be clever ...
    expanded(card INT, score BIGINT) AS (
        SELECT * FROM matches
        UNION ALL
        SELECT
            matches.card,
            matches.score
        FROM
            expanded,
            matches,
            generate_series(1, expanded.score) as step
        WHERE
            expanded.card + step = matches.card
    ),
    part2(part2 BIGINT) AS ( SELECT COUNT(*) FROM expanded)

select * from part1, part2;

Contributors

Day 4 was brought to you by: @chass, @doy-materialize, @frankmcsherry, @morsapaes

Day five was a bit of a bear. It was the same day we were doing a Materialize on-site and we were all a bit distracted, but also it was pretty beefy. You first have to “route” various elements through a sequence of remappings, whose length is defined in the data. You then have to expand that out to routing whole intervals (rather than elements), and .. there is just lots of potential for error. I used recursive SQL to handle all the remapping, but other folks just expanded out their SQL for each of the (ten-ish) remappings.

See the solution

Link to puzzle(s) 🟢 🟢

Part one

sql

WITH seeds AS (
    SELECT
        regexp_split_to_table(
            regexp_split_to_array(
                regexp_split_to_array(input, '\n')[1],
                ': '
            )[2],
            ' '
        )::bigint AS seed
    FROM
        input
),
seed_to_soil_lines AS (
    SELECT
        regexp_split_to_array(
            regexp_split_to_table(
                regexp_match(input, 'seed-to-soil map:\n([0-9 \n]*?)\n\n')[1],
                '\n'
            ),
            ' '
        )::bigint[] AS line
    FROM
        input
),
seed_to_soil AS (
    SELECT
        line[1] AS dst_base,
        line[2] AS src_base,
        line[3] AS len
    FROM
        seed_to_soil_lines
),
soil_to_fertilizer_lines AS (
    SELECT
        regexp_split_to_array(
            regexp_split_to_table(
                regexp_match(input, 'soil-to-fertilizer map:\n([0-9 \n]*?)\n\n')[1],
                '\n'
            ),
            ' '
        )::bigint[] AS line
    FROM
        input
),
soil_to_fertilizer AS (
    SELECT
        line[1] AS dst_base,
        line[2] AS src_base,
        line[3] AS len
    FROM
        soil_to_fertilizer_lines
),
fertilizer_to_water_lines AS (
    SELECT
        regexp_split_to_array(
            regexp_split_to_table(
                regexp_match(input, 'fertilizer-to-water map:\n([0-9 \n]*?)\n\n')[1],
                '\n'
            ),
            ' '
        )::bigint[] AS line
    FROM
        input
),
fertilizer_to_water AS (
    SELECT
        line[1] AS dst_base,
        line[2] AS src_base,
        line[3] AS len
    FROM
        fertilizer_to_water_lines
),
water_to_light_lines AS (
    SELECT
        regexp_split_to_array(
            regexp_split_to_table(
                regexp_match(input, 'water-to-light map:\n([0-9 \n]*?)\n\n')[1],
                '\n'
            ),
            ' '
        )::bigint[] AS line
    FROM
        input
),
water_to_light AS (
    SELECT
        line[1] AS dst_base,
        line[2] AS src_base,
        line[3] AS len
    FROM
        water_to_light_lines
),
light_to_temperature_lines AS (
    SELECT
        regexp_split_to_array(
            regexp_split_to_table(
                regexp_match(input, 'light-to-temperature map:\n([0-9 \n]*?)\n\n')[1],
                '\n'
            ),
            ' '
        )::bigint[] AS line
    FROM
        input
),
light_to_temperature AS (
    SELECT
        line[1] AS dst_base,
        line[2] AS src_base,
        line[3] AS len
    FROM
        light_to_temperature_lines
),
temperature_to_humidity_lines AS (
    SELECT
        regexp_split_to_array(
            regexp_split_to_table(
                regexp_match(input, 'temperature-to-humidity map:\n([0-9 \n]*?)\n\n')[1],
                '\n'
            ),
            ' '
        )::bigint[] AS line
    FROM
        input
),
temperature_to_humidity AS (
    SELECT
        line[1] AS dst_base,
        line[2] AS src_base,
        line[3] AS len
    FROM
        temperature_to_humidity_lines
),
humidity_to_location_lines AS (
    SELECT
        regexp_split_to_array(
            regexp_split_to_table(
                regexp_match(input, 'humidity-to-location map:\n([0-9 \n]*)')[1],
                '\n'
            ),
            ' '
        )::bigint[] AS line
    FROM
        input
),
humidity_to_location AS (
    SELECT
        line[1] AS dst_base,
        line[2] AS src_base,
        line[3] AS len
    FROM
        humidity_to_location_lines
),
soil AS (
    SELECT
        seed,
        coalesce(
            MAX(
                CASE
                    WHEN seed >= src_base AND seed < src_base + len
                    THEN dst_base + (seed - src_base)
                    ELSE null
                END
            ),
            seed
        ) AS soil
    FROM
        seeds, seed_to_soil
    GROUP BY seed
),
fertilizer AS (
    SELECT
        soil,
        coalesce(
            MAX(
                CASE
                    WHEN soil >= src_base AND soil < src_base + len
                    THEN dst_base + (soil - src_base)
                    ELSE null
                END
            ),
            soil
        ) AS fertilizer
    FROM
        soil, soil_to_fertilizer
    GROUP BY soil
),
water AS (
    SELECT
        fertilizer,
        coalesce(
            MAX(
                CASE
                    when fertilizer >= src_base AND fertilizer < src_base + len
                    then dst_base + (fertilizer - src_base)
                    else null
                END
            ),
            fertilizer
        ) AS water
    FROM
        fertilizer, fertilizer_to_water
    GROUP BY fertilizer
),
light AS (
    SELECT
        water,
        coalesce(
            MAX(
                CASE
                    WHEN water >= src_base AND water < src_base + len
                    THEN dst_base + (water - src_base)
                    ELSE null
                END
            ),
            water
        ) AS light
    FROM
        water, water_to_light
    GROUP BY water
),
temperature AS (
    SELECT
        light,
        coalesce(
            MAX(
                CASE
                    WHEN light >= src_base AND light < src_base + len
                    THEN dst_base + (light - src_base)
                    ELSE null
                END
            ),
            light
        ) AS temperature
    FROM
        light, light_to_temperature
    GROUP BY light
),
humidity AS (
    SELECT
        temperature,
        coalesce(
            MAX(
                CASE
                    WHEN temperature >= src_base AND temperature < src_base + len
                    THEN dst_base + (temperature - src_base)
                    ELSE null
                END
            ),
            temperature
        ) AS humidity
    FROM
        temperature, temperature_to_humidity
    GROUP BY temperature
),
location AS (
    SELECT
        humidity,
        coalesce(
            MAX(
                CASE
                    WHEN humidity >= src_base AND humidity < src_base + len
                    THEN dst_base + (humidity - src_base)
                    ELSE null
                END
            ),
            humidity
        ) AS location
    FROM
        humidity, humidity_to_location
    GROUP BY humidity
)
SELECT
    MIN(location) AS answer
FROM
    location;

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE
    blocks(head TEXT, body TEXT) AS (
        SELECT
            split_part(regexp_split_to_table(input, '\n\n'), ':', 1),
            split_part(regexp_split_to_table(input, '\n\n'), ':', 2)
        FROM
            input
    ),
    seeds(seed BIGINT) AS (
        SELECT regexp_split_to_table(trim(body), ' ')::BIGINT
        FROM blocks
        WHERE head = 'seeds'
    ),
    entry0(src_name TEXT, dst_name TEXT, dst_idx TEXT, src_idx TEXT, len TEXT) AS (
        SELECT
            split_part(split_part(head, ' ', 1), '-', 1),
            split_part(split_part(head, ' ', 1), '-', 3),
            split_part(regexp_split_to_table(body, '\n'), ' ', 1),
            split_part(regexp_split_to_table(body, '\n'), ' ', 2),
            split_part(regexp_split_to_table(body, '\n'), ' ', 3)
        FROM
            blocks
        WHERE
            head != 'seeds'
    ),
    entry(src_name TEXT, dst_name TEXT, src_idx BIGINT, dst_idx BIGINT, len BIGINT) AS (
        SELECT
            src_name,
            dst_name,
            src_idx::BIGINT,
            dst_idx::BIGINT,
            len::BIGINT
        FROM
            entry0
        WHERE
            src_idx != ''
    ),

    -- PART 1
    -- Our active inventory of .. "stuff"
    active(name TEXT, idx BIGINT) AS (
        SELECT 'seed', seed FROM seeds
        UNION ALL
        SELECT
            intent.dst_name,
            COALESCE(intent.idx + (entry.dst_idx - entry.src_idx), idx)
        FROM intent LEFT JOIN entry ON (
            intent.src_name = entry.src_name AND
            intent.dst_name = entry.dst_name AND
            intent.idx BETWEEN entry.src_idx AND entry.src_idx + len - 1)
    ),
    -- We would like to perform this mapping, but must find a range.
    intent(src_name TEXT, dst_name TEXT, idx BIGINT) AS (
        SELECT DISTINCT entry.src_name, dst_name, idx
        FROM active, entry
        WHERE active.name = entry.src_name
    ),
    part1(part1 BIGINT) AS (
        SELECT MIN(idx) FROM active WHERE name = 'location'
    ),

    -- PART 2
    -- Now we are doing *ranges* of seeds, rather than seed identifiers.
    -- They are big ranges, so we'll need to be smarter!
    seeds2(start_idx BIGINT, end_idx BIGINT) AS (
        SELECT
            regexp_split_to_array(trim(body), ' ')[2*x-1]::BIGINT,
            regexp_split_to_array(trim(body), ' ')[2*x-1]::BIGINT + regexp_split_to_array(trim(body), ' ')[2*x]::BIGINT
        FROM
            blocks,
            generate_series(1, array_length(regexp_split_to_array(trim(body), ' '), 1)/2) x
        WHERE head = 'seeds'
    ),
    active2(name TEXT, start_idx BIGINT, end_idx BIGINT) AS (
        SELECT 'seed', start_idx, end_idx
        FROM seeds2
        UNION
        SELECT
            dst_name,
            clipped_start + (entry_dst - entry_start),
            clipped_end   + (entry_dst - entry_start)
        FROM intersection
        UNION
        SELECT
            name,
            start_idx,
            end_idx
        FROM hole
    ),
    -- We would like to perform this mapping, but must find a range.
    intent2(src_name TEXT, dst_name TEXT, start_idx BIGINT, end_idx BIGINT) AS (
        SELECT DISTINCT entry.src_name, dst_name, start_idx, end_idx
        FROM active2, entry
        WHERE active2.name = entry.src_name
    ),
    -- Each mapping has a potential intersection with a requested range.
    intersection(src_name TEXT, dst_name TEXT, start_idx BIGINT, end_idx BIGINT, entry_start BIGINT, entry_end BIGINT, clipped_start BIGINT, clipped_end BIGINT, entry_dst BIGINT) AS (
        SELECT
            intent2.src_name,
            intent2.dst_name,
            intent2.start_idx,
            intent2.end_idx,
            entry.src_idx,
            entry.src_idx + entry.len,
            GREATEST(start_idx, entry.src_idx),
            LEAST(end_idx, entry.src_idx + entry.len),
            entry.dst_idx
        FROM intent2, entry
        WHERE intent2.src_name = entry.src_name
          AND intent2.dst_name = entry.dst_name
          AND GREATEST(intent2.start_idx, entry.src_idx)
            < LEAST(intent2.end_idx, entry.src_idx + entry.len)
    ),
    -- We may have holes in our intervals. Each intersection's start and end is the end and
    -- start, respectively, of some hole we may have that needs to remain the identity.
    hole(name TEXT, start_idx BIGINT, end_idx BIGINT) AS (
        SELECT * FROM (
            SELECT
                dst_name,
                clipped_end start_idx,
                (
                    SELECT COALESCE(MIN(i2.clipped_start), i1.end_idx)
                    FROM intersection i2
                    WHERE i2.clipped_start >= i1.clipped_end
                    AND i2.clipped_start < i1.end_idx
                    AND i1.src_name = i2.src_name
                    AND i1.dst_name = i2.dst_name
                    AND i1.start_idx = i2.start_idx
                    AND i1.end_idx = i2.end_idx
                ) end_idx
            FROM intersection i1
            UNION
            SELECT DISTINCT
                dst_name,
                start_idx,
                (
                    SELECT COALESCE(MIN(i2.clipped_start), i1.end_idx)
                    FROM intersection i2
                    WHERE i2.clipped_start >= i1.start_idx
                    AND i2.clipped_start < i1.end_idx
                    AND i1.src_name = i2.src_name
                    AND i1.dst_name = i2.dst_name
                    AND i1.start_idx = i2.start_idx
                    AND i1.end_idx = i2.end_idx
                )
            FROM intent2 i1
        )
        WHERE start_idx < end_idx
    ),
    part2(part2 BIGINT) AS ( SELECT MIN(start_idx) FROM active2 WHERE name = 'location')

SELECT * FROM part1, part2;

Contributors

Day 5 was brought to you by: @doy-materialize, @frankmcsherry, @nrainer-materialize

Day six was about whether you knew (or were willing to learn about) the quadratic formula.

See the solution

Link to puzzle(s) 🟢 🟢

Part one

sql

WITH options AS
(
	SELECT
	  (floor((time - sqrt(time * time - 4 * record)) / 2) + 1)::int low,
	  (ceil((time + sqrt(time * time - 4 * record)) / 2) - 1)::int hi,
	FROM input
)
SELECT exp(sum(ln(hi - low + 1)))::int
FROM options;

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE

    ties(slower NUMERIC, faster NUMERIC) AS (
        SELECT
            (time + sqrt(time * time - 4 * distance)) / 2 as slower,
            (time - sqrt(time * time - 4 * distance)) / 2 as faster
        FROM input
    ),
    options(choices NUMERIC) AS (
        SELECT 1 + FLOOR(slower)::NUMERIC - CEIL(faster)::NUMERIC FROM ties
    ),
    part12(part12 NUMERIC) AS (
        SELECT pow(10.0, SUM(log(choices))) FROM options
    )

SELECT * FROM part12;

Contributors

Day 6 was brought to you by: @doy-materialize, @frankmcsherry, @nrainer-materialize, @petrosagg

Day seven is about scoring poker hands, using some new rules for tie breaking. This was mostly SQL aggregation, as the numbers of each card in each hand largely determine the outcome, other than tie-breaking where I learned about the translate function.

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

    -- Hands of cards (e.g. 'AKJQT') and integer bids.
    WITH MUTUALLY RECURSIVE
        lines(line TEXT) AS ( SELECT regexp_split_to_table(input, '\n') FROM input ),
        hands(hand TEXT, bid INT) as (
            SELECT regexp_split_to_array(line, ' ')[1],
                   regexp_split_to_array(line, ' ')[2]::INT
            FROM lines
        ),
        cards(hand TEXT, value TEXT, position INT) AS (
            SELECT hand, substring(hand, pos, 1), pos
            FROM hands, generate_series(1, 5) pos
        ),

        -- Part1
        counts(hand TEXT, value TEXT, count INT) AS (
            SELECT hand, value, COUNT(*)
            FROM cards
            GROUP BY hand, value
        ),
        ranked(hand TEXT, bid INT, rank INT, score TEXT) AS (
            SELECT
                hand,
                bid,
                CASE WHEN hand IN (SELECT hand FROM counts WHERE count = 5) THEN 1
                     WHEN hand IN (SELECT hand FROM counts WHERE count = 4) THEN 2
                     WHEN hand IN (SELECT hand FROM counts WHERE count = 3)
                      AND hand IN (SELECT hand FROM counts WHERE count = 2) THEN 3
                     WHEN hand IN (SELECT hand FROM counts WHERE count = 3) THEN 4
                     WHEN hand IN (SELECT hand FROM (SELECT hand FROM counts WHERE count = 2) GROUP BY hand HAVING COUNT(*) = 2) THEN 5
                     WHEN hand IN (SELECT hand FROM counts WHERE count = 2) THEN 6
                     ELSE 7
                END,
                translate(hand, 'AKQJT98765432', 'ABCDEFGHIJKLM')
            FROM
                hands
        ),
        part1(part1 INT) AS (
            SELECT SUM(r1.bid)
            FROM ranked r1, ranked r2
            WHERE r1.rank < r2.rank OR (r1.rank = r2.rank AND r1.score <= r2.score)
        ),

        -- Part2: J are now wild for determining rank, but last for score.
        wild(hand TEXT, value TEXT, position INT) AS (
            SELECT * FROM cards
            UNION
            SELECT c1.hand, c2.value, c1.position
            FROM cards c1, cards c2
            WHERE c1.hand = c2.hand
              AND c1.value = 'J'
        ),
        wild_hands(hand TEXT, new_hand TEXT) AS (
            SELECT DISTINCT w1.hand, w1.value || w2.value || w3.value || w4.value || w5.value
            FROM (SELECT * FROM wild w1 WHERE position = 1) w1,
                 (SELECT * FROM wild w2 WHERE position = 2) w2,
                 (SELECT * FROM wild w3 WHERE position = 3) w3,
                 (SELECT * FROM wild w4 WHERE position = 4) w4,
                 (SELECT * FROM wild w5 WHERE position = 5) w5
            WHERE w1.hand = w2.hand
              AND w1.hand = w3.hand
              AND w1.hand = w4.hand
              AND w1.hand = w5.hand
        ),
        wild_cards(hand TEXT, value TEXT, position INT) AS (
            SELECT DISTINCT new_hand, substring(new_hand, pos, 1), pos
            FROM wild_hands, generate_series(1, 5) pos
        ),
        wild_counts(hand TEXT, value TEXT, count INT) AS (
            SELECT hand, value, COUNT(*)
            FROM wild_cards
            GROUP BY hand, value
        ),
        wild_ranked(hand TEXT, new_hand TEXT, rank INT, score TEXT) AS (
            SELECT
                hand,
                new_hand,
                CASE WHEN new_hand IN (SELECT hand FROM wild_counts WHERE count = 5) THEN 1
                     WHEN new_hand IN (SELECT hand FROM wild_counts WHERE count = 4) THEN 2
                     WHEN new_hand IN (SELECT hand FROM wild_counts WHERE count = 3)
                      AND new_hand IN (SELECT hand FROM wild_counts WHERE count = 2) THEN 3
                     WHEN new_hand IN (SELECT hand FROM wild_counts WHERE count = 3) THEN 4
                     WHEN new_hand IN (SELECT hand FROM (SELECT hand FROM wild_counts WHERE count = 2) GROUP BY hand HAVING COUNT(*) = 2) THEN 5
                     WHEN new_hand IN (SELECT hand FROM wild_counts WHERE count = 2) THEN 6
                     ELSE 7
                END,
                translate(hand, 'AKQT98765432J', 'ABCDEFGHIJKLM')
            FROM
                wild_hands
        ),
        best_hands(hand TEXT, new_hand TEXT, rank INT, score TEXT) AS (
            SELECT DISTINCT ON (hand) hand, new_hand, rank, score
            FROM wild_ranked
            ORDER BY hand, rank, score
        ),
        wild_bids(hand TEXT, bid INT, rank INT, score TEXT) AS (
            SELECT hands.hand, hands.bid, rank, score
            FROM hands, best_hands
            WHERE hands.hand = best_hands.hand
        ),
        part2(part2 INT) AS (
            SELECT SUM(r1.bid)
            FROM wild_bids r1, wild_bids r2
            WHERE r1.rank < r2.rank OR (r1.rank = r2.rank AND r1.score <= r2.score)
        )

    SELECT * FROM part1, part2;

Contributors

Day 7 was brought to you by: @frankmcsherry, @nrainer-materialize

Week two

Day eight involved some graph navigation (recursion), and some mathematics. The mathematics were of the form “notice that various things are relatively prime”, and it was important to rely on SQL as a tool to support reasoning, as opposed to directly attacking the specified computation. In this case, my problem called for 14,935,034,899,483 steps, and no tool is going to make direct simulation be the right answer.

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

    WITH MUTUALLY RECURSIVE

        route(step TEXT, steps INT) AS (
            SELECT substring(input, steps, 1), steps
            FROM steps_input, generate_series(1, length(input)) steps
        ),

        -- Part 1: Start at 'AAA` and go until `ZZZ`.
        pos1(state TEXT, steps INT) AS (
            SELECT 'AAA', 0
            UNION ALL
            SELECT
                CASE WHEN route.step = 'L' THEN paths.left
                     WHEN route.step = 'R' THEN paths.right
                     ELSE '???'
                END,
                pos1.steps + 1
            FROM paths, pos1, route
            WHERE pos1.state = paths.state
              AND 1 + (pos1.steps % 263) = route.steps
              AND pos1.state != 'ZZZ'
              AND pos1.state != '???'
        )
        part1(part1 INT) AS (SELECT steps FROM pos1 WHERE pos1.state = 'ZZZ'),

        -- Part 2: Start at all '**A` and go until all at '**Z'
        pos2(start TEXT, state TEXT, steps INT) AS (
            SELECT state, state, 0
            FROM paths
            WHERE substring(state, 3, 1) = 'A'
            UNION ALL
            SELECT
                pos2.start,
                CASE WHEN route.step = 'L' THEN paths.left
                     WHEN route.step = 'R' THEN paths.right
                     ELSE '???'
                END,
                pos2.steps + 1
            FROM paths, pos2, route
            WHERE pos2.state = paths.state
              AND 1 + (pos2.steps % 263) = route.steps
              AND substring(pos2.state, 3, 1) != 'Z'
        )

    SELECT * FROM pos2 WHERE substring(state, 3, 1) = 'Z';

Contributors

Day 8 was brought to you by: @doy-materialize, @frankmcsherry, @nrainer-materialize

Day nine was a refreshing introduction to polynomials, and how if you take enough derivatives of them they end up at zero. The task was to do this, repeatedly difference adjacent measurements, or adjacent differences, etc., until you get all zeros. Then, integrate back up to get projections in the forward and reverse direction. I used recursion here to accommodate the unknown degree of the polynomial (somewhere in the twenties).

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE (RETURN AT RECURSION LIMIT 30)

        lines (line TEXT, line_no INT) AS (
            SELECT regexp_split_to_array(input, '\n')[i], i
            FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) i
        ),

        numbers(value INT, line_no INT, col_no INT) AS (
            SELECT regexp_split_to_array(line, ' ')[j]::INT, line_no, j
            FROM lines, generate_series(1, array_length(regexp_split_to_array(line, ' '), 1)) j
        ),

        -- Contains non-zero values of differences after each round.
        derivatives(value INT, line_no INT, col_no INT, round INT) AS (
            SELECT numbers.*, 1
            FROM numbers
            UNION
            SELECT
                COALESCE(i2.value, 0) - COALESCE(i1.value, 0),
                COALESCE(i1.line_no, i2.line_no),
                COALESCE(i1.col_no + 1, i2.col_no),
                COALESCE(i1.round, i2.round) + 1
            FROM derivatives i1 FULL OUTER JOIN derivatives i2 ON (i1.line_no = i2.line_no AND i1.round = i2.round AND i1.col_no + 1 = i2.col_no)
            WHERE COALESCE(i2.value, 0) - COALESCE(i1.value, 0) != 0
              AND COALESCE(i1.col_no + 1, i2.col_no) > COALESCE(i1.round, i2.round)
              AND COALESCE(i1.col_no + 1, i2.col_no) <= 21
        ),

        -- Accumulate the derivatives at the leading edge
        part1(part1 BIGINT) AS (
            SELECT SUM(value)
            FROM derivatives
            WHERE col_no = 21
        ),

        -- Accumulate the derivatives at the preceding edge
        part2(part2 BIGINT) AS (
            SELECT SUM(pow(-1, round + 1) * value)
            FROM derivatives
            WHERE col_no = round
        )

    -- SELECT * FROM derivatives WHERE line_no = 1 ORDER BY round, col_no;
    SELECT * FROM part1, part2;

Contributors

Day 9 was brought to you by: @frankmcsherry, @nrainer-materialize

Day ten presents you with a grid of pipe (symbols |, -, J, 7, F, and L), and questions about how long a loop of pipe is, and then how many cells are contained within it. The first part involved recursion, and I used it again for a dynamic programming solution to the second part.

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

    WITH MUTUALLY RECURSIVE

        lines(line TEXT, row_no INT) AS (
            SELECT regexp_split_to_array(input, '\n')[i], i
            FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) i
        ),

        symbols(symb TEXT, row_no INT, col_no INT) as (
            SELECT substring(line, j, 1), row_no, j
            FROM lines, generate_series(1, length(line)) j
        ),

        -- Each location that is pipe has two neighbors
        edge1(r1 INT, c1 INT, r2 INT, c2 INT) AS (
            SELECT
                row_no,
                col_no,
                CASE WHEN symb = '-' THEN row_no
                     WHEN symb = '|' THEN row_no - 1
                     WHEN symb = 'F' THEN row_no + 1
                     WHEN symb = 'L' THEN row_no - 1
                     WHEN symb = 'J' THEN row_no
                     WHEN symb = '7' THEN row_no
                     ELSE NULL
                END,
                CASE WHEN symb = '-' THEN col_no - 1
                     WHEN symb = '|' THEN col_no
                     WHEN symb = 'F' THEN col_no
                     WHEN symb = 'L' THEN col_no
                     WHEN symb = 'J' THEN col_no - 1
                     WHEN symb = '7' THEN col_no - 1
                     ELSE NULL
                END
            FROM symbols
            WHERE symb != '.' AND symb != 'S'
        ),
        edge2(r1 INT, c1 INT, r2 INT, c2 INT) AS (
            SELECT
                row_no,
                col_no,
                CASE WHEN symb = '-' THEN row_no
                     WHEN symb = '|' THEN row_no + 1
                     WHEN symb = 'F' THEN row_no
                     WHEN symb = 'L' THEN row_no
                     WHEN symb = 'J' THEN row_no - 1
                     WHEN symb = '7' THEN row_no + 1
                     ELSE NULL
                END,
                CASE WHEN symb = '-' THEN col_no + 1
                     WHEN symb = '|' THEN col_no
                     WHEN symb = 'F' THEN col_no + 1
                     WHEN symb = 'L' THEN col_no + 1
                     WHEN symb = 'J' THEN col_no
                     WHEN symb = '7' THEN col_no
                     ELSE NULL
                END
            FROM symbols
            WHERE symb != '.' AND symb != 'S'
        ),
        -- Symmetrized graph
        symm(r1 INT, c1 INT, r2 INT, c2 INT) AS (
            SELECT r1, c1, r2, c2
            FROM (
                SELECT * FROM edge1
                UNION ALL
                SELECT * FROM edge2
                UNION ALL
                SELECT r2, c2, r1, c1 FROM edge1
                UNION ALL
                SELECT r2, c2, r1, c1 FROM edge2
                UNION ALL
                SELECT row_no, col_no, row_no + 1, col_no FROM symbols WHERE symb = 'S'
                UNION ALL
                SELECT row_no, col_no, row_no, col_no + 1 FROM symbols WHERE symb = 'S'
                UNION ALL
                SELECT row_no, col_no, row_no - 1, col_no FROM symbols WHERE symb = 'S'
                UNION ALL
                SELECT row_no, col_no, row_no, col_no - 1 FROM symbols WHERE symb = 'S'
            )
            GROUP BY r1, c1, r2, c2
            HAVING COUNT(*) = 2
        ),
        reach(r INT, c INT) AS (
            SELECT row_no, col_no
            FROM symbols
            WHERE symb = 'S'
            UNION
            SELECT r2, c2
            FROM reach, symm
            WHERE r = r1 AND c = c1
        ),
        part1(part1 BIGINT) AS (
            SELECT COUNT(*)/2 FROM reach
        ),

        -- Part 2: how many cells are *inside* the loop?
        -- All (1, *) and (*, 1) cells have their upper left outside the loop (outer edge of the diagram).
        -- Each cell inherits from its UL neighbor, toggled by any pipe except '7' and 'L' pipe.
        -- Rewrite the pipe to have symbols, and resolve 'S' to actual oriented pipe.
        pipe(r INT, c INT, symb TEXT) AS (
            SELECT r, c, symb
            FROM reach, symbols
            WHERE r = row_no AND c = col_no AND symb != 'S'
            UNION
            SELECT
                row_no,
                col_no,
                CASE WHEN row_no = s1.r1 AND col_no = s1.c1 + 1 AND row_no = s2.r2 + 1 AND col_no = s2.c2 THEN 'J' -- toggle
                     WHEN row_no = s1.r1 AND col_no = s1.c1 + 1 AND row_no = s2.r2 AND col_no = s2.c2 - 1 THEN '-' -- toggle
                     WHEN row_no = s1.r1 AND col_no = s1.c1 + 1 AND row_no = s2.r2 - 1 AND col_no = s2.c2 THEN '7' -- no toggle
                     WHEN row_no = s1.r1 + 1 AND col_no = s1.c1 AND row_no = s2.r2 AND col_no = s2.c2 - 1 THEN 'L' -- no toggle
                     WHEN row_no = s1.r1 + 1 AND col_no = s1.c1 AND row_no = s2.r2 - 1 AND col_no = s2.c2 THEN '|' -- toggle
                     WHEN row_no = s1.r1 AND col_no = s1.c1 - 1 AND row_no = s2.r2 AND col_no = s2.c2 - 1 THEN 'F' -- toggle
                     ELSE '???'
                END
            FROM symbols, symm s1, symm s2
            WHERE symb = 'S'
              AND row_no = s1.r1
              AND col_no = s1.c1
              AND row_no = s2.r1
              AND col_no = s2.c1
        ),
        -- Enclosed(1,*) and Enclosed(*,1) are all false.
        -- Enclosed(x+1,y+1) = Enclosed(x,y) perhaps toggled by pipe(x,y)
        status(r INT, c INT, encl BOOL) AS (
            SELECT row_no, col_no, false
            FROM symbols
            WHERE row_no = 1 OR col_no = 1
            UNION
            SELECT
                row_no + 1,
                col_no + 1,
                CASE WHEN pipe.symb IN (VALUES ('J'),('-'),('|'),('F')) THEN NOT encl
                     ELSE encl
                END
            FROM status LEFT JOIN pipe ON (status.r = pipe.r AND status.c = pipe.c)
            JOIN symbols ON (status.r = symbols.row_no AND status.c = symbols.col_no)
        ),
        part2(part2 BIGINT) AS (
            SELECT COUNT(*)
            FROM status
            WHERE encl = true AND (r, c) NOT IN (SELECT r, c FROM pipe)
        )

    SELECT * FROM part1, part2;

Contributors

Day 10 was brought to you by: @frankmcsherry

Day eleven presents a grid of “galaxies” and has you calculate the distance between pairs (the L1 or “Manhattan” distance, always the sum of absolute values of coordinate differences). Parts one and two were the same, but with different magnitudes of numbers. No recursion here!

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

    WITH MUTUALLY RECURSIVE

        lines(line TEXT, r INT) AS (
            SELECT regexp_split_to_array(input, '\n')[i], i
            FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) i
        ),

        symbols(symb TEXT, r INT, c INT) as (
            SELECT substring(line, j, 1), r, j
            FROM lines, generate_series(1, length(line)) j
        ),

        row_gaps(r INT) AS (
            SELECT r
            FROM symbols
            GROUP BY r
            HAVING COUNT(*) FILTER (WHERE symb = '#') = 0
        ),

        col_gaps(c INT) AS (
            SELECT c
            FROM symbols
            GROUP BY c
            HAVING COUNT(*) FILTER (WHERE symb = '#') = 0
        ),

        -- Part1: Expand space and restrict to galaxies
        galaxies(r INT, c INT) AS (
            SELECT
                r + (SELECT COUNT(*) FROM row_gaps WHERE row_gaps.r < symbols.r),
                c + (SELECT COUNT(*) FROM col_gaps WHERE col_gaps.c < symbols.c)
            FROM symbols
            WHERE symb = '#'
        ),
        -- Sum of L1 distance between distinct galaxies
        part1(part1 BIGINT) AS (
            SELECT SUM(ABS(g1.r - g2.r) + ABS(g1.c - g2.c))
            FROM galaxies g1, galaxies g2
            WHERE g1.r < g2.r
               OR (g1.r = g2.r AND g1.c < g2.c)
        )

        -- Part2: Expand space MORE and restrict to galaxies
        galaxies2(r INT, c INT) AS (
            SELECT
                r + 999999 * (SELECT COUNT(*) FROM row_gaps WHERE row_gaps.r < symbols.r),
                c + 999999 * (SELECT COUNT(*) FROM col_gaps WHERE col_gaps.c < symbols.c)
            FROM symbols
            WHERE symb = '#'
        ),
        -- Sum of L1 distance between distinct galaxies
        part2(part2 BIGINT) AS (
            SELECT SUM(ABS(g1.r - g2.r) + ABS(g1.c - g2.c))
            FROM galaxies2 g1, galaxies2 g2
            WHERE g1.r < g2.r
               OR (g1.r = g2.r AND g1.c < g2.c)
        )

    SELECT * FROM part1, part2;

Contributors

Day 11 was brought to you by: @frankmcsherry, @nrainer-materialize

Day twelve was about sequence alignment, matching partial observations with hard constraints. Dynamic programming was a great solution here, using recursion.

See the solution

Link to puzzle(s) 🟢 🟢

Part one

sql

WITH MUTUALLY RECURSIVE

    lines(r INT, characters TEXT, springs TEXT) AS (
        SELECT
            row_id,
            regexp_split_to_array(regexp_split_to_array(input, '\n')[row_id], ' ')[1] || '.',
            regexp_split_to_array(regexp_split_to_array(input, '\n')[row_id], ' ')[2]
        FROM
            input,
            generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) row_id
    ),
    characters(r INT, pos INT, symb TEXT) AS (
        SELECT
            r,
            pos,
            substring(characters, pos, 1)
        FROM
            lines,
            generate_series(1, length(characters)) pos
    ),
    springs(r INT, pos INT, len INT) AS (
        SELECT
            r,
            pos,
            regexp_split_to_array(springs, ',')[pos]::INT
        FROM
            lines,
            generate_series(1, array_length(regexp_split_to_array(springs, ','), 1)) pos
    ),

    -- How many ways can we pack row `r`'s first `spring` springs (plus a space) into the first `chars` characters?
    -- Importantly, the "plus a space" applies to the last spring also! Each of these should admit the immediate appending of a new spring.
    fits(r INT, chars INT, spring INT) AS (
        -- We can pack no springs into no characters.
        SELECT r, 0, 0
        FROM lines
        -- We can extend any fits with a blank, as long as there are no '#' observations.
        UNION ALL
        SELECT fits.r, fits.chars + 1, fits.spring
        FROM fits, characters
        WHERE fits.r = characters.r
          AND fits.chars + 1 = characters.pos
          AND characters.symb != '#'
        -- We can extend any fits with the next spring and a blank, as long as no '.' in the spring and no '#' in the blank.
        UNION ALL
        SELECT fits.r, fits.chars + springs.len + 1, fits.spring + 1
        FROM
            fits,
            springs,
            characters
        WHERE fits.r = springs.r
          AND fits.spring + 1 = springs.pos
          AND fits.r = characters.r
          AND fits.chars + springs.len + 1 = characters.pos
          AND characters.symb != '#'
          AND NOT EXISTS (SELECT FROM characters c WHERE c.r = fits.r AND c.symb = '.' AND c.pos BETWEEN fits.chars + 1 AND fits.chars + springs.len)
    ),

    fit_counts(r INT, chars INT, spring INT, count BIGINT) AS (
        SELECT r, chars, spring, COUNT(*) AS count
        FROM fits
        GROUP BY r, chars, spring
    ),
    counts(r INT, chars INT, spring INT, count BIGINT) AS (
        SELECT DISTINCT ON (r) r, chars, spring, count
        FROM fit_counts
        ORDER BY r, chars DESC, spring DESC
    ),

    potato (x INT) AS ( SELECT 1 )

SELECT SUM(count) FROM counts;

Contributors

Day 12 was brought to you by: @frankmcsherry, @nrainer-materialize

Day thirteen had grids of observations with the hypothesis that each is mirrored, horizontally or vertically, at some point that you need to find. SQL and subqueries were a great way to validate hypothetical mirroring axes.

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

    WITH MUTUALLY RECURSIVE

        blocks(b INT, block TEXT) AS (
            SELECT b, regexp_split_to_array(input, '\n\n')[b] as block
            FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n\n'), 1)) b
        ),
        lines(b INT, r INT, line TEXT) AS (
            SELECT b, r, regexp_split_to_array(block, '\n')[r] as block
            FROM blocks, generate_series(1, array_length(regexp_split_to_array(block, '\n'), 1)) r
        ),
        cells(b INT, r INT, c INT, symbol TEXT) AS (
            SELECT b, r, c, substring(line, c, 1)
            FROM lines, generate_series(1, length(line)) c
        ),
        columns(b INT, c INT, column TEXT) AS (
            SELECT b, c, string_agg(symbol, '' ORDER BY r) FROM cells GROUP BY b, c
        ),

        row_mirror(b INT, r INT) AS (
            SELECT *
            FROM (SELECT DISTINCT b, r FROM cells) o
            WHERE NOT EXISTS (
                -- We would be upset to find rows at mirrored positions that do not match
                -- Rows that match, or have no mirrored position, are fine.
                SELECT FROM lines
                WHERE o.b = lines.b
                GROUP BY abs(2 * lines.r - (2 * o.r - 1))
                HAVING COUNT(DISTINCT lines.line) > 1
            )
        ),

        col_mirror(b INT, c INT) AS (
            SELECT *
            FROM (SELECT DISTINCT b, c FROM cells) o
            WHERE NOT EXISTS (
                -- We would be upset to find rows at mirrored positions that do not match
                -- Rows that match, or have no mirrored position, are fine.
                SELECT FROM columns
                WHERE o.b = columns.b
                GROUP BY abs(2 * columns.c - (2 * o.c - 1))
                HAVING COUNT(DISTINCT columns.column) > 1
            )
        ),

        part1(part1 BIGINT) AS (
            SELECT COALESCE((SELECT SUM(r-1) FROM row_mirror), 0) * 100
                 + COALESCE((SELECT SUM(c-1) FROM col_mirror), 0)
        ),

        row_mirror2(b INT, r INT) AS (
            SELECT *
            FROM (SELECT DISTINCT b, r FROM cells) o
            WHERE 1 = (
                SELECT COUNT(*)
                FROM cells c1, cells c2
                WHERE abs(2 * c1.r - (2 * o.r - 1)) = abs(2 * c2.r - (2 * o.r - 1))
                  AND c1.r < c2.r
                  AND c1.c = c2.c
                  AND c1.b = c2.b
                  AND c1.b = o.b
                  AND c1.symbol != c2.symbol
            )
        ),

        col_mirror2(b INT, c INT) AS (
            SELECT *
            FROM (SELECT DISTINCT b, c FROM cells) o
            WHERE 1 = (
                SELECT COUNT(*)
                FROM cells c1, cells c2
                WHERE abs(2 * c1.c - (2 * o.c - 1)) = abs(2 * c2.c - (2 * o.c - 1))
                  AND c1.c < c2.c
                  AND c1.r = c2.r
                  AND c1.b = c2.b
                  AND c1.b = o.b
                  AND c1.symbol != c2.symbol
            )
        ),

        part2(part2 BIGINT) AS (
            SELECT COALESCE((SELECT SUM(r-1) FROM row_mirror2), 0) * 100
                 + COALESCE((SELECT SUM(c-1) FROM col_mirror2), 0)
        ),

        potato (x INT) AS ( SELECT 1 )

    SELECT * FROM part1, part2;

Contributors

Day 13 was brought to you by: @frankmcsherry, @nrainer-materialize

Day fourteen was a treat, in that it used nested recursion: a WMR block within a WMR block. The problem was simulation of rocks that roll in cardinal directions, changing the direction ninety degrees, and repeating. Each simulation was recursive (rocks roll until they stop), and we were meant to repeat the larger progress a great many times (1,000,000,000 cycles). The only bummer here was the amount of copy/paste re-use, as each of the four cardinal directions had different subqueries.

See the solution

Link to puzzle(s) 🟢 🟢

Part 1

sql

WITH MUTUALLY RECURSIVE

        lines(r INT, line TEXT) AS (
            SELECT r, regexp_split_to_array(input, '\n')[r] as block
            FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
        ),
        cells(r INT, c INT, symbol TEXT) AS (
            SELECT r, c, substring(line, c, 1)
            FROM lines, generate_series(1, length(line)) c
        ),

        northward(r INT, c INT, symbol TEXT) AS (
            SELECT * FROM northward
            -- Anyone on the move does so
            UNION  ALL SELECT r - 1, c, 'O' FROM north_move
            EXCEPT ALL SELECT r - 1, c, '.' FROM north_move
            UNION  ALL SELECT r, c, '.' FROM north_move
            EXCEPT ALL SELECT r, c, 'O' FROM north_move
            -- Initial state is cells, but not refreshed each round.
            UNION  ALL SELECT * FROM cells
            EXCEPT ALL SELECT * FROM cells_delay
        ),

        -- Each 'O' with a '.' to the north will move.
        north_move(r INT, c INT) AS (
            SELECT n1.r, n1.c
            FROM northward n1, northward n2
            WHERE n1.symbol = 'O'
              AND n1.r = n2.r + 1
              AND n1.c = n2.c
              AND n2.symbol = '.'
        ),

        part1(part1 BIGINT) AS (
            SELECT SUM(1 + (SELECT MAX(r) FROM lines) - r)
            FROM northward
            WHERE symbol = 'O'
        ),

        output (r INT, line TEXT) AS (
            SELECT r, string_agg(symbol, ' ' ORDER BY c)
            FROM northward
            GROUP BY r
        ),

        cells_delay(r INT, c INT, symbol TEXT) AS ( SELECT * FROM cells )

    SELECT * FROM part1;

Part 2

sql

    WITH MUTUALLY RECURSIVE (RETURN AT RECURSION LIMIT 142)

        lines(r INT, line TEXT) AS (
            SELECT r, regexp_split_to_array(input, '\n')[r] as block
            FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
        ),
        cells(r INT, c INT, symbol TEXT) AS (
            SELECT r, c, substring(line, c, 1)
            FROM lines, generate_series(1, length(line)) c
        ),

        -- Where should we start each iteration from?
        -- From `east`, once it exits, but initially `cells`.
        round(r INT, c INT, symbol TEXT) AS (
            SELECT * FROM east
            UNION  ALL SELECT * FROM cells
            EXCEPT ALL SELECT * FROM cells_delay
        ),

        north(r INT, c INT, symbol TEXT) AS (
            WITH MUTUALLY RECURSIVE
                start(r INT, c INT, symbol TEXT) AS (
                    SELECT * FROM round
                ),
                northward(r INT, c INT, symbol TEXT) AS (
                    SELECT * FROM northward
                    -- Anyone on the move does so
                    UNION  ALL SELECT r - 1, c, 'O' FROM north_move
                    EXCEPT ALL SELECT r - 1, c, '.' FROM north_move
                    UNION  ALL SELECT r, c, '.' FROM north_move
                    EXCEPT ALL SELECT r, c, 'O' FROM north_move
                    -- Second time around, the above cancels and `east` is non-empty.
                    UNION  ALL SELECT * FROM start
                    EXCEPT ALL SELECT * FROM start_delay
                ),
                -- Each 'O' with a '.' in front of them will move.
                north_move(r INT, c INT) AS (
                    SELECT n1.r, n1.c
                    FROM northward n1, northward n2
                    WHERE n1.symbol = 'O'
                    AND n1.r = n2.r + 1
                    AND n1.c = n2.c
                    AND n2.symbol = '.'
                ),
                start_delay(r INT, c INT, symbol TEXT) AS ( SELECT * FROM start )

            SELECT * FROM northward
        ),

         west(r INT, c INT, symbol TEXT) AS (
            WITH MUTUALLY RECURSIVE
                start(r INT, c INT, symbol TEXT) AS (
                    SELECT * FROM north
                ),
                westward(r INT, c INT, symbol TEXT) AS (
                    SELECT * FROM westward
                    -- Anyone on the move does so
                    UNION  ALL SELECT r, c - 1, 'O' FROM west_move
                    EXCEPT ALL SELECT r, c - 1, '.' FROM west_move
                    UNION  ALL SELECT r, c, '.' FROM west_move
                    EXCEPT ALL SELECT r, c, 'O' FROM west_move
                    -- Initial state is cells, but not refreshed each round.
                    UNION  ALL SELECT * FROM start
                    EXCEPT ALL SELECT * FROM start_delay
                ),
                -- Each 'O' with a '.' in front of them will move.
                west_move(r INT, c INT) AS (
                    SELECT w1.r, w1.c
                    FROM westward w1, westward w2
                    WHERE w1.symbol = 'O'
                    AND w1.r = w2.r
                    AND w1.c = w2.c + 1
                    AND w2.symbol = '.'
                ),
                start_delay(r INT, c INT, symbol TEXT) AS ( SELECT * FROM start )

            SELECT * FROM westward
        ),

        south(r INT, c INT, symbol TEXT) AS (
            WITH MUTUALLY RECURSIVE
                start(r INT, c INT, symbol TEXT) AS (
                    SELECT * FROM west
                ),
                southward(r INT, c INT, symbol TEXT) AS (
                    SELECT * FROM southward
                    -- Anyone on the move does so
                    UNION  ALL SELECT r + 1, c, 'O' FROM south_move
                    EXCEPT ALL SELECT r + 1, c, '.' FROM south_move
                    UNION  ALL SELECT r, c, '.' FROM south_move
                    EXCEPT ALL SELECT r, c, 'O' FROM south_move
                    -- Initial state is cells, but not refreshed each round.
                    UNION  ALL SELECT * FROM start
                    EXCEPT ALL SELECT * FROM start_delay
                ),
                -- Each 'O' with a '.' in front of them will move.
                south_move(r INT, c INT) AS (
                    SELECT s1.r, s1.c
                    FROM southward s1, southward s2
                    WHERE s1.symbol = 'O'
                    AND s1.r = s2.r - 1
                    AND s1.c = s2.c
                    AND s2.symbol = '.'
                ),
                start_delay(r INT, c INT, symbol TEXT) AS ( SELECT * FROM start )
            SELECT * FROM southward
        ),

        east(r INT, c INT, symbol TEXT) AS (
            WITH MUTUALLY RECURSIVE
                start(r INT, c INT, symbol TEXT) AS (
                    SELECT * FROM south
                ),
                eastward(r INT, c INT, symbol TEXT) AS (
                    SELECT * FROM eastward
                    -- Anyone on the move does so
                    UNION  ALL SELECT r, c + 1, 'O' FROM east_move
                    EXCEPT ALL SELECT r, c + 1, '.' FROM east_move
                    UNION  ALL SELECT r, c, '.' FROM east_move
                    EXCEPT ALL SELECT r, c, 'O' FROM east_move
                    -- Initial state is cells, but not refreshed each round.
                    UNION  ALL SELECT * FROM start
                    EXCEPT ALL SELECT * FROM start_delay
                ),
                -- Each 'O' with a '.' in front of them will move.
                east_move(r INT, c INT) AS (
                    SELECT e1.r, e1.c
                    FROM eastward e1, eastward e2
                    WHERE e1.symbol = 'O'
                    AND e1.r = e2.r
                    AND e1.c = e2.c - 1
                    AND e2.symbol = '.'
                ),
                start_delay(r INT, c INT, symbol TEXT) AS ( SELECT * FROM start )
            SELECT * FROM eastward
        ),

        output (r INT, line TEXT) AS (
            SELECT r, string_agg(symbol, ' ' ORDER BY c)
            FROM round
            GROUP BY r
        ),

        transitions(source TEXT, target TEXT) AS (
            SELECT
                (SELECT string_agg(symbol, '' ORDER BY r, c) FROM round),
                (SELECT string_agg(symbol, '' ORDER BY r, c) FROM east)
            UNION ALL
            SELECT * FROM transitions
        ),

        part2(part2 BIGINT) AS (
            SELECT SUM(1 + (SELECT MAX(r) FROM lines) - r)
            FROM east
            WHERE symbol = 'O'
        ),

        cells_delay(r INT, c INT, symbol TEXT) AS ( SELECT * FROM cells )

    -- SELECT count, COUNT(*)
    -- FROM (
    --     SELECT source, target, COUNT(*) count
    --     FROM transitions
    --     GROUP BY source, target)
    -- GROUP BY count;

    -- SELECT * FROM output ORDER BY r;

    SELECT * FROM part2;

Contributors

Day 14 was brought to you by: @frankmcsherry

Week three

Day fifteen has you implement a hash function, and then a hash map. Recursion was a handy way to walk through the input to be hashed, though the hash function was simple enough that you could have used math directly instead. The second part (the hash map) did not require recursion, as rather than simulate the operations you could leap to the final state you were looking for.

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE (RETURN AT RECURSION LIMIT 10)

        strings(r INT, string TEXT) AS (
            SELECT r, regexp_split_to_array(input, ',')[r]
            FROM input, generate_series(1, array_length(regexp_split_to_array(input, ','), 1)) r
        ),

        -- Advance the hash by one character, until all strings are empty.
        hashes(string TEXT, hash BIGINT) AS (
            SELECT string, 0 as hash
            FROM strings
            UNION ALL
            SELECT substring(string, 2), ((hash + ascii(substring(string, 1, 1))) * 17) % 256
            FROM hashes
            WHERE length(string) > 0
        ),

        part1(part1 BIGINT) AS (
            SELECT SUM(hash)
            FROM hashes
            WHERE string = ''
        ),

        -- Parse strings as symbol plus commands; either `-` or `=X`.
        commands(r INT, symb TEXT, op INT) AS (
            SELECT
                r,
                CASE WHEN substring(string, length(string)) = '-'
                     THEN substring(string, 1, length(string)-1)
                     ELSE substring(string, 1, length(string)-2)
                END,
                CASE WHEN substring(string, length(string)) = '-'
                     THEN 0
                     ELSE substring(string, length(string))::INT
                END
            FROM strings
        ),
        -- Operations that happen after a symbol's last delete operation.
        -- All other operations do not matter, and do not affect the state.
        final_ops(r INT, symb TEXT, op INT) AS (
            SELECT *
            FROM commands
            WHERE r > COALESCE(
                (SELECT MAX(r)
                FROM commands c2
                WHERE commands.symb = c2.symb
                  AND c2.op = 0), 0)
        ),
        -- Each symbol is summarized by their first final insert time, and the last final operation
        final_state(r INT, symb TEXT, op INT) AS (
            SELECT DISTINCT ON(symb)
                (SELECT MIN(r) FROM final_ops fo2 WHERE fo2.symb = final_ops.symb),
                symb,
                op
            FROM final_ops
            ORDER BY symb, r DESC, op
        ),
        -- Redo the hash computation on symbols rather than commands.
        hashes2(start TEXT, string TEXT, hash BIGINT) AS (
            SELECT symb as start, symb as string, 0 as hash
            FROM final_state
            UNION ALL
            SELECT start, substring(string, 2), ((hash + ascii(substring(string, 1, 1))) * 17) % 256
            FROM hashes2
            WHERE length(string) > 0
        ),
        -- Bin up the state, so's we can tabulate it
        binned(hash BIGINT, r INT, symb TEXT, op INT) AS (
            SELECT hash, final_state.*
            FROM hashes2, final_state
            WHERE hashes2.start = symb
              AND hashes2.string = ''
        ),
        -- Sum the product of 1 + hash, the position in bin by r, and the op.
        part2(part2 BIGINT) AS (
            SELECT SUM(
                (1 + hash) *
                (SELECT COUNT(*) FROM binned b2 WHERE binned.hash = b2.hash AND binned.r >= b2.r) *
                op
            )
            FROM binned
        ),

        potato(x int) as (select 1)

    SELECT * FROM part1, part2;

Contributors

Day 15 was brought to you by: @frankmcsherry, @nrainer-materialize

Day sixteen was about bouncing light around in a grid, and seeing how many grid cells are illuminated. The illumination process was classic recursive SQL, where you keep expanding (row, col, dir) triples until the set reaches a fixed point. In the second part the light sources had an origin, which is just a fourth column to add, tracking the source of each ray of light.

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE

    lines(r INT, line TEXT) AS (
        SELECT r, regexp_split_to_array(input, '\n')[r] as block
        FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
    ),
    cells(r INT, c INT, symbol TEXT) AS (
        SELECT r, c, substring(line, c, 1)
        FROM lines, generate_series(1, length(line)) c
    ),

    shift(dir TEXT, symbol TEXT, dr INT, dc INT, new_dir TEXT) AS (
        VALUES
            ('r', '.',  0,  1, 'r'),
            ('r', '-',  0,  1, 'r'),
            ('r', '|',  1,  0, 'd'),
            ('r', '|', -1,  0, 'u'),
            ('r', '/', -1,  0, 'u'),
            ('r', '\',  1,  0, 'd'),
            ('l', '.',  0, -1, 'l'),
            ('l', '-',  0, -1, 'l'),
            ('l', '|',  1,  0, 'd'),
            ('l', '|', -1,  0, 'u'),
            ('l', '/',  1,  0, 'd'),
            ('l', '\', -1,  0, 'u'),
            ('u', '.', -1,  0, 'u'),
            ('u', '-',  0,  1, 'r'),
            ('u', '-',  0, -1, 'l'),
            ('u', '|', -1,  0, 'u'),
            ('u', '/',  0,  1, 'r'),
            ('u', '\',  0, -1, 'l'),
            ('d', '.',  1,  0, 'd'),
            ('d', '-',  0,  1, 'r'),
            ('d', '-',  0, -1, 'l'),
            ('d', '|',  1,  0, 'd'),
            ('d', '/',  0, -1, 'l'),
            ('d', '\',  0,  1, 'r')
    ),

    -- Light is in a location, and has a direction.
    light(r INT, c INT, dir TEXT) AS (
        SELECT 1, 1, 'r'
        UNION
        SELECT light.r + dr, light.c + dc, new_dir
        FROM light, cells, shift
        WHERE light.r = cells.r
            AND light.c = cells.c
            AND light.dir = shift.dir
            AND cells.symbol = shift.symbol
    ),

    part1(part1 BIGINT) AS (
        SELECT COUNT(*) FROM (
            SELECT DISTINCT light.r, light.c
            FROM light, cells
            WHERE light.r = cells.r
                AND light.c = cells.c
        )
    ),

    -- Light is in a location, a direction, and an origin.
    light2(r INT, c INT, dir TEXT, source TEXT) AS (
        SELECT DISTINCT * FROM (SELECT r, (SELECT MIN(c) FROM cells), 'r', 'r' || r FROM cells) UNION
        SELECT DISTINCT * FROM (SELECT r, (SELECT MAX(c) FROM cells), 'l', 'l' || r FROM cells) UNION
        SELECT DISTINCT * FROM (SELECT (SELECT MIN(r) FROM cells), c, 'd', 'd' || c FROM cells) UNION
        SELECT DISTINCT * FROM (SELECT (SELECT MAX(c) FROM cells), c, 'u', 'u' || c FROM cells) UNION
        SELECT light2.r + dr, light2.c + dc, new_dir, source
        FROM light2, cells, shift
        WHERE light2.r = cells.r
            AND light2.c = cells.c
            AND light2.dir = shift.dir
            AND cells.symbol = shift.symbol
    ),

    part2(part2 BIGINT) AS (
        SELECT MAX(count) FROM (
            SELECT source, COUNT(*) FROM (
                SELECT DISTINCT light2.r, light2.c, source
                FROM light2, cells
                WHERE light2.r = cells.r
                    AND light2.c = cells.c
            )
            GROUP BY source
        )
    )

SELECT * FROM part1, part2;

Contributors

Day 16 was brought to you by: @frankmcsherry

Day seventeen is a pathfinding problem, with constraints on how you move around the path (not too short or too long in any direction at once). Classic recursive SQL to implement Bellman-Ford.

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE

    lines(r INT, line TEXT) AS (
        SELECT r, regexp_split_to_array(input, '\n')[r] as block
        FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
    ),
    cells(r INT, c INT, cost INT) AS (
        SELECT r, c, substring(line, c, 1)::INT
        FROM lines, generate_series(1, length(line)) c
    ),

    -- For each cell, we can be headed n, e, w, s and have gone 1, 2, 3 steps already.
    -- There is a mimimum cost path to reach this configuration, and .. we might need
    -- to remember how we got there but let's do that in part 2.
    min_cost(r INT, c INT, dr INT, dc INT, steps INT, cost INT) AS (
        SELECT r, c, dr, dc, steps, MIN(cost)
        FROM (
            SELECT 1 as r, 1 as c, 1 as dr, 0 as dc, 0 as steps, 0 as cost
            UNION ALL
            SELECT 1, 1, 0, 1, 0, 0
            -- We could have just stepped to r, c in a few ways, incurring its cost.
            UNION ALL
            SELECT cells.r, cells.c, dr, dc, steps + 1, min_cost.cost + cells.cost
            FROM min_cost, cells
            WHERE steps < 3
              AND cells.r = min_cost.r + dr
              AND cells.c = min_cost.c + dc
            -- We could take a ??? turn
            UNION ALL
            SELECT cells.r, cells.c, dc, dr, 1, min_cost.cost + cells.cost
            FROM min_cost, cells
            WHERE cells.r = min_cost.r + dc
              AND cells.c = min_cost.c + dr
            -- We could take a ??? turn
            UNION ALL
            SELECT cells.r, cells.c, -dc, -dr, 1, min_cost.cost + cells.cost
            FROM min_cost, cells
            WHERE cells.r = min_cost.r - dc
              AND cells.c = min_cost.c - dr
        )
        GROUP BY r, c, dr, dc, steps
    ),

    part1(part1 INT) AS (
        SELECT MIN(cost)
        FROM min_cost
        WHERE r = (SELECT MAX(r) FROM cells)
          AND c = (SELECT MAX(c) FROM cells)
    ),

    potato(x INT) AS (SELECT 1),

    -- For each cell, we can be headed n, e, w, s and have gone 1, 2, 3 steps already.
    -- There is a mimimum cost path to reach this configuration, and .. we might need
    -- to remember how we got there but let's do that in part 2.
    min_cost2(r INT, c INT, dr INT, dc INT, steps INT, cost INT) AS (
        SELECT r, c, dr, dc, steps, MIN(cost)
        FROM (
            SELECT 1 as r, 1 as c, 1 as dr, 0 as dc, 0 as steps, 0 as cost
            UNION ALL
            SELECT 1, 1, 0, 1, 0, 0
            -- We could have just stepped to r, c in a few ways, incurring its cost.
            UNION ALL
            SELECT cells.r, cells.c, dr, dc, steps + 1, min_cost2.cost + cells.cost
            FROM min_cost2, cells
            WHERE steps < 10
              AND cells.r = min_cost2.r + dr
              AND cells.c = min_cost2.c + dc
            -- We could take a XYZ turn
            UNION ALL
            SELECT cells.r, cells.c, dc, dr, 1, min_cost2.cost + cells.cost
            FROM min_cost2, cells
            WHERE steps >= 4
              AND cells.r = min_cost2.r + dc
              AND cells.c = min_cost2.c + dr
            -- We could take a ZYX turn
            UNION ALL
            SELECT cells.r, cells.c, -dc, -dr, 1, min_cost2.cost + cells.cost
            FROM min_cost2, cells
            WHERE steps >= 4
              AND cells.r = min_cost2.r - dc
              AND cells.c = min_cost2.c - dr
        )
        GROUP BY r, c, dr, dc, steps
    ),
    part2(part2 INT) AS (
        SELECT MIN(cost)
        FROM min_cost2
        WHERE r = (SELECT MAX(r) FROM cells)
          AND c = (SELECT MAX(c) FROM cells)
          AND steps >= 4
    ),

SELECT * FROM part1, part2;

Contributors

Day 17 was brought to you by: @frankmcsherry

Day eighteen provides instructions of how a digger will move around, excavating earth, and asks you to calculate the area. This is an opportunity to learn about the Trapezoid formula for computing the area as the addition and subtraction of trapezoid areas.

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE

    lines(r INT, line TEXT) AS (
        SELECT r, regexp_split_to_array(input, '\n')[r] as line
        FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
    ),

    split1(r INT, dr INT, dc INT, steps INT) AS (
        SELECT
            r,
            CASE WHEN regexp_split_to_array(line, ' ')[1] = 'U' THEN -1
                 WHEN regexp_split_to_array(line, ' ')[1] = 'D' THEN  1
                 ELSE 0
            END,
            CASE WHEN regexp_split_to_array(line, ' ')[1] = 'L' THEN -1
                 WHEN regexp_split_to_array(line, ' ')[1] = 'R' THEN  1
                 ELSE 0
            END,
            regexp_split_to_array(line, ' ')[2]::INT
        FROM lines
    ),

    -- Part 1 is prefix sum followed by area calculations.
    -- We'll brute force the prefix sum part, and use the
    -- "trapezoid formula", summing + and - contributions
    -- as the path moves around.
    path1(r1 INT, c1 INT, r2 INT, c2 INT, rounds INT) AS (
        SELECT 0, 0, 0, 0, 1
        UNION
        SELECT
            path1.r2,
            path1.c2,
            path1.r2 + split1.dr * split1.steps,
            path1.c2 + split1.dc * split1.steps,
            path1.rounds + 1
        FROM path1, split1
        WHERE path1.rounds = split1.r
    ),
    -- The area carved by the path, plus half a unit of area
    -- for each path step, plus 4 * (1/4) units for the net
    -- four 90 degree turns.
    part1(part1 BIGINT) AS (
        SELECT
            ABS((SELECT SUM((r1 + r2) * (c1 - c2)) FROM path1)) / 2
          + (SELECT SUM(steps) FROM split1) / 2
          + 1
    ),

    -- Part 2 changes how we parse each line to give long paths.
    split2(r INT, dr INT, dc INT, steps INT) AS (
        SELECT
            r,
            CASE WHEN substring(regexp_split_to_array(line, ' ')[3], 8, 1) = '3' THEN -1
                 WHEN substring(regexp_split_to_array(line, ' ')[3], 8, 1) = '1' THEN  1
                 ELSE 0
            END,
            CASE WHEN substring(regexp_split_to_array(line, ' ')[3], 8, 1) = '2' THEN -1
                 WHEN substring(regexp_split_to_array(line, ' ')[3], 8, 1) = '0' THEN  1
                 ELSE 0
            END,
            256 * 256 * get_byte(decode('0' || substring(regexp_split_to_array(line, ' ')[3], 3, 5), 'hex'), 0)
                + 256 * get_byte(decode('0' || substring(regexp_split_to_array(line, ' ')[3], 3, 5), 'hex'), 1)
                      + get_byte(decode('0' || substring(regexp_split_to_array(line, ' ')[3], 3, 5), 'hex'), 2)
        FROM lines
    ),

    path2(r1 BIGINT, c1 BIGINT, r2 BIGINT, c2 BIGINT, rounds INT) AS (
        SELECT 0, 0, 0, 0, 1
        UNION
        SELECT
            path2.r2,
            path2.c2,
            path2.r2 + split2.dr * split2.steps,
            path2.c2 + split2.dc * split2.steps,
            path2.rounds + 1
        FROM path2, split2
        WHERE path2.rounds = split2.r
    ),
    -- The area carved by the path, plus half a unit of area
    -- for each path step, plus 4 * (1/4) units for the net
    -- four 90 degree turns.
    part2(part2 BIGINT) AS (
        SELECT
            ABS((SELECT SUM((r1 + r2) * (c1 - c2)) FROM path2)) / 2
          + (SELECT SUM(steps) FROM split2) / 2
          + 1
    )

SELECT * FROM part1, part2;

Contributors

Day 18 was brought to you by: @frankmcsherry

Day nineteen sneakily introduces you to binary space partitioning, where rules based on inequality tests route you to new rules, until eventually you reach some rule that says “accept” or “reject”. This was all pretty easy, except for a substantial amount of SQL overhead related to the various symbols and characters and coordinates all of which required their own columns.

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE

    blocks(block1 TEXT, block2 TEXT) AS (
        SELECT
            regexp_split_to_array(input, '\n\n')[1] block1,
            regexp_split_to_array(input, '\n\n')[2] block2
        FROM input
    ),
    states(state TEXT, trans TEXT) AS (
        SELECT
            regexp_split_to_array(line, '\{')[1] state,
            trim('}' FROM regexp_split_to_array(line, '\{')[2]) trans
        FROM (SELECT regexp_split_to_table(block1, '\n') line FROM blocks)
    ),
    steps(state TEXT, priority INT, rule TEXT) AS (
        SELECT
            state,
            priority,
            regexp_split_to_array(trans, ',')[priority]
        FROM states, generate_series(1, array_length(regexp_split_to_array(trans, ','), 1)) priority
    ),

    starts(x INT, m INT, a INT, s INT) AS (
        SELECT
            substring(regexp_split_to_array(trimmed, ',')[1], 3)::INT,
            substring(regexp_split_to_array(trimmed, ',')[2], 3)::INT,
            substring(regexp_split_to_array(trimmed, ',')[3], 3)::INT,
            substring(regexp_split_to_array(trimmed, ',')[4], 3)::INT
        FROM (SELECT trim('\{' FROM trim('\}' FROM regexp_split_to_table(block2, '\n'))) trimmed FROM blocks)
    ),

    --
    rules(state TEXT, priority INT, field TEXT, cmp TEXT, val INT, next TEXT) AS (
        SELECT
            state,
            priority,
            CASE WHEN substring(rule, 2, 1) = '<' OR substring(rule, 2, 1) = '>'
                THEN substring(rule, 1, 1)
                ELSE 'x'
            END,
            CASE WHEN substring(rule, 2, 1) = '<' OR substring(rule, 2, 1) = '>'
                THEN substring(rule, 2, 1)
                ELSE '>'
            END,
            CASE WHEN substring(rule, 2, 1) = '<' OR substring(rule, 2, 1) = '>'
                THEN regexp_split_to_array(substring(rule, 3), ':')[1]::INT
                ELSE '0'
            END,
            CASE WHEN substring(rule, 2, 1) = '<' OR substring(rule, 2, 1) = '>'
                THEN regexp_split_to_array(substring(rule, 3), ':')[2]
                ELSE rule
            END
        FROM steps
    ),

    -- PART 1: iterate folks forward from `in`
    movement(state TEXT, x INT, m INT, a INT, s INT) AS (
        SELECT 'in' state, * FROM starts
        UNION ALL
        SELECT next, x, m, a, s
        FROM (
            SELECT DISTINCT ON (state, x, m, a, s) state, x, m, a, s, priority, next
            FROM (
                SELECT movement.*, rules.next, rules.priority
                FROM movement, rules
                WHERE movement.state = rules.state
                AND CASE WHEN rules.cmp = '<'
                         THEN CASE WHEN rules.field = 'x' THEN x < val
                                   WHEN rules.field = 'm' THEN m < val
                                   WHEN rules.field = 'a' THEN a < val
                                   WHEN rules.field = 's' THEN s < val
                                   ELSE false
                              END
                         WHEN rules.cmp = '>'
                         THEN CASE WHEN rules.field = 'x' THEN x > val
                                   WHEN rules.field = 'm' THEN m > val
                                   WHEN rules.field = 'a' THEN a > val
                                   WHEN rules.field = 's' THEN s > val
                                   ELSE false
                              END
                         ELSE false
                    END
            )
            ORDER BY state, x, m, a, s, priority
        )
    ),

    part1(part1 BIGINT) AS (
        SELECT SUM(x + m + a + s)
        FROM movement
        WHERE state = 'A'
    ),

    -- PART 2: just find all the bounding regions and label them 'A' or 'R'.
    region(state TEXT, priority INT, xl INT, xu INT, ml INT, mu INT, al INT, au INT, sl INT, su INT) AS (
        SELECT 'in', 1, 1, 4000, 1, 4000, 1, 4000, 1, 4000
        -- Could satisfy the rule, and transition to the next state ..
        UNION ALL
        SELECT
            next,
            1,
            CASE WHEN rules.field = 'x' AND rules.cmp = '>' THEN GREATEST(val+1, xl) ELSE xl END,
            CASE WHEN rules.field = 'x' AND rules.cmp = '<' THEN LEAST(val-1, xu) ELSE xu END,
            CASE WHEN rules.field = 'm' AND rules.cmp = '>' THEN GREATEST(val+1, ml) ELSE ml END,
            CASE WHEN rules.field = 'm' AND rules.cmp = '<' THEN LEAST(val-1, mu) ELSE mu END,
            CASE WHEN rules.field = 'a' AND rules.cmp = '>' THEN GREATEST(val+1, al) ELSE al END,
            CASE WHEN rules.field = 'a' AND rules.cmp = '<' THEN LEAST(val-1, au) ELSE au END,
            CASE WHEN rules.field = 's' AND rules.cmp = '>' THEN GREATEST(val+1, sl) ELSE sl END,
            CASE WHEN rules.field = 's' AND rules.cmp = '<' THEN LEAST(val-1, su) ELSE su END
        FROM region, rules
        WHERE region.state = rules.state
          AND region.priority = rules.priority
        -- .. or could fail the rule, and advance to the next priority.
        UNION ALL
        SELECT
            region.state,
            region.priority + 1,
            CASE WHEN rules.field = 'x' AND rules.cmp = '<' THEN GREATEST(val, xl) ELSE xl END,
            CASE WHEN rules.field = 'x' AND rules.cmp = '>' THEN LEAST(val, xu) ELSE xu END,
            CASE WHEN rules.field = 'm' AND rules.cmp = '<' THEN GREATEST(val, ml) ELSE ml END,
            CASE WHEN rules.field = 'm' AND rules.cmp = '>' THEN LEAST(val, mu) ELSE mu END,
            CASE WHEN rules.field = 'a' AND rules.cmp = '<' THEN GREATEST(val, al) ELSE al END,
            CASE WHEN rules.field = 'a' AND rules.cmp = '>' THEN LEAST(val, au) ELSE au END,
            CASE WHEN rules.field = 's' AND rules.cmp = '<' THEN GREATEST(val, sl) ELSE sl END,
            CASE WHEN rules.field = 's' AND rules.cmp = '>' THEN LEAST(val, su) ELSE su END
        FROM region, rules
        WHERE region.state = rules.state
          AND region.priority = rules.priority
    ),

    part2(part2 NUMERIC) AS (
        SELECT SUM((1 + xu - xl)::BIGINT * (1 + mu - ml)::BIGINT * (1 + au - al)::BIGINT * (1 + su - sl)::BIGINT)
        FROM region
        WHERE state = 'A'
    ),

    potato(x INT) AS (SELECT 1)

SELECT * FROM part1, part2;

Contributors

Day 19 was brought to you by: @frankmcsherry

Day twenty presents you with the simulation of an asynchronous circuit, and this is the day that almost broke me. Mechanically the SQL isn’t that complicated, but debugging the SQL was a real challenge. It got done over the course of a quite long train ride into the evening.

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE

    lines(line TEXT) AS ( SELECT regexp_split_to_table(input, '\n') FROM input ),
    links(name TEXT, link TEXT) AS (
        SELECT
            substring(regexp_split_to_array(line, ' ')[1], 2),
            trim(',' FROM regexp_split_to_array(line, ' ')[x])
        FROM
            lines, generate_series(3, array_length(regexp_split_to_array(line, ' '), 1)) x
    ),
    -- One special line has op 'b' and name 'roadcaster'.
    types(op TEXT, name TEXT) AS (
        SELECT
            substring(regexp_split_to_array(line, ' ')[1], 1, 1),
            substring(regexp_split_to_array(line, ' ')[1], 2)
        FROM
            lines
    ),

    -- Part one: simulate 1000 steps of 'broadcaster' being activated with a low pulse.
    -- tally up total low and high pulses, and then multiply.
    -- The state carried across steps are the last-transmitted pulses of each operator.
    -- This should also tell us the final state of the `%` operators.
    -- We'll also need the totals of low and high pulses, so that we can add them up.

    seed(press INT, counter INT) AS (
        SELECT 1, 1
        UNION
        SELECT press, counter - 1
        FROM seed
        WHERE counter > 0
        UNION
        SELECT press + 1, 20
        FROM seed
        WHERE counter = 0
          AND press < 4100
    ),

    -- Emitted pulses after various button presses, in various rounds of resolution.
    pulses(name TEXT, press INT, round INT, pulse TEXT) AS (
        -- One thousand button presses, each followed by rounds of resolution.
        SELECT 'roadcaster', press, 1, 'lo' FROM seed WHERE counter = 0
        UNION ALL SELECT * FROM flip
        UNION ALL SELECT * FROM conj
    ),

    -- Counters; every 'lo' input pulse flips and emits the state.
    flip(name TEXT, press INT, round INT, pulse TEXT) AS (
        -- Each `signal` needs to behave as if all "prior" signals have been processed, ordered by (press, round, source).
        SELECT 
            name, 
            press,
            round + 1, 
            -- Look for the most recently emitted signal, and we'll produce the opposite of that one.
            CASE WHEN (
                    SELECT COUNT(*)
                    FROM signal s1 
                    WHERE s1.target = types.name 
                      AND s1.pulse = 'lo'
                      AND ((s1.press < signal.press) OR 
                           (s1.press = signal.press AND s1.round < signal.round) OR 
                           (s1.press = signal.press AND s1.round = signal.round AND s1.source < signal.source))
                ) % 2 = 0
                THEN 'hi'
                ELSE 'lo'
            END
        FROM signal, types
        WHERE signal.target = types.name
            AND types.op = '%'
            AND signal.pulse = 'lo'
    ),

    -- NAND gates; every input pulse evokes the NAND of most recent inputs.
    conj(name TEXT, press INT, round INT, pulse TEXT) AS (
        SELECT
            name,
            press,
            round + 1,
            -- Look for the most recently received signals from each input,
            -- including this one, and iff all 'hi' then 'lo'.
            CASE WHEN (
                    (SELECT COUNT(*) FROM links WHERE link = types.name)
                    =
                    (SELECT COUNT(*) FROM (
                        SELECT DISTINCT ON (source) source, pulse
                        FROM signal s1
                        WHERE s1.target = types.name
                          AND ((s1.press < signal.press) OR
                               (s1.press = signal.press AND s1.round < signal.round) OR
                               (s1.press = signal.press AND s1.round = signal.round AND s1.source <= signal.source))
                        OPTIONS (DISTINCT ON INPUT GROUP SIZE = 1000)
                        ORDER BY source, press DESC, round DESC
                    )
                    WHERE pulse = 'hi'))
                 THEN 'lo'
                 ELSE 'hi'
            END
        FROM signal, types
        WHERE signal.target = types.name
            AND types.op = '&'
    ),

    -- A record of a pulse into an operator, from another operator.
    -- We track the source so that '&' operators can make any sense.
    signal(source TEXT, target TEXT, press INT, round INT, pulse TEXT) AS (
        SELECT pulses.name, links.link, pulses.press, pulses.round, pulses.pulse
        FROM pulses, links
        WHERE pulses.name = links.name
          AND pulses.round > 0
    ),

    part1(pulse TEXT, count BIGINT) AS (
        SELECT pulse, count(*) FROM signal GROUP BY pulse
    ),

    potato(x INT) AS (SELECT 1)

SELECT * FROM signal WHERE target = 'cn' AND pulse = 'hi';

Contributors

Day 20 was brought to you by: @frankmcsherry

Day twenty-one was another example of some (recursive) SQL for grid exploration, followed by some mathematics. In this case the grid exploration was standard, determining reachable locations on the grid, and then the math was quadratic extrapolation from a sequence of measurements (to something too large to actually evaluate, an answer of 621,289,922,886,149 reachable states).

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE

    lines(r INT, line TEXT) AS (
        SELECT r, regexp_split_to_array(input, '\n')[r] as block
        FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
    ),
    cells(r INT, c INT, symbol TEXT) AS (
        SELECT r, c, substring(line, c, 1)
        FROM lines, generate_series(1, length(line)) c
    ),

    steps(r INT, c INT) AS (
        SELECT r, c FROM cells WHERE symbol = 'S'
        EXCEPT ALL
        SELECT * FROM s_delay
        UNION
        SELECT cells.r, cells.c
        FROM cells, (
                  SELECT r + 1, c FROM steps
            UNION SELECT r - 1, c FROM steps
            UNION SELECT r, c + 1 FROM steps
            UNION SELECT r, c - 1 FROM steps
        ) as potato(r,c)
        WHERE cells.r = potato.r
          AND cells.c = potato.c
          AND cells.symbol != '#'
    ),

    s_delay(r INT, c INT) AS (
        SELECT r, c FROM cells WHERE symbol = 'S'
    ),

    part1(part1 BIGINT) AS (
        SELECT COUNT(*) FROM (SELECT DISTINCT * FROM steps)
    ),

    -- PART 2 wants a much larger step count on an infinite repeating grid.
    -- We know it will be quadratic based on the clear paths if nothing else.
    -- Map out enough points to reverse out polynomial coefficients.
    -- For me they were `ax^2 + bx + c` with a = 60724, b = 30602, c =  3849.

    dists(r INT, c INT, d INT) AS (
        SELECT r, c, MIN(d)
        FROM (
            SELECT r, c, 0 d
            FROM cells
            WHERE symbol = 'S'
            UNION ALL
            SELECT potato.r, potato.c, d + 1
            FROM cells, (
                      SELECT r + 1, c, d FROM dists
                UNION SELECT r - 1, c, d FROM dists
                UNION SELECT r, c + 1, d FROM dists
                UNION SELECT r, c - 1, d FROM dists
            ) as potato(r,c,d)
            WHERE cells.r = 1 + (((potato.r - 1) % 131) + 131) % 131
              AND cells.c = 1 + (((potato.c - 1) % 131) + 131) % 131
              AND cells.symbol != '#'
              AND potato.d < 1000
        )
        GROUP BY r, c
    ),

    part2(x0 BIGINT, x2 BIGINT, x4 BIGINT, x6 BIGINT) AS (
        SELECT
            (SELECT COUNT(*) FROM dists WHERE d <=  0 * 131 + 65 AND d % 2 = 1),
            (SELECT COUNT(*) FROM dists WHERE d <=  2 * 131 + 65 AND d % 2 = 1),
            (SELECT COUNT(*) FROM dists WHERE d <=  4 * 131 + 65 AND d % 2 = 1),
            (SELECT COUNT(*) FROM dists WHERE d <=  6 * 131 + 65 AND d % 2 = 1)
    ),

    potato (x INT) AS ( SELECT 1 )

SELECT 'idk';

Contributors

Day 21 was brought to you by: @frankmcsherry

Week four

The last week was shorter, but also culminated in some pretty exciting problems and techniques.

Day twenty-two had shapes made of cubes falling into a well, and coming to rest on others (or the ground). There were then questions about how many pieces are load bearing, and also for each load bearing piece how many others would fall if they were removed. Dropping the pieces used recursive SQL, determining the load bearing pieces did not, but then scoring the load bearing pieces again required recursion.

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE

    lines(r INT, line TEXT) AS (
        SELECT r, regexp_split_to_array(input, '\n')[r] as line
        FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
    ),

    cells(r INT, x INT, y INT, z INT) AS (
        SELECT xs.r, x, y, z
        FROM (SELECT r, generate_series(regexp_split_to_array(regexp_split_to_array(line, '~')[1], ',')[1]::INT,
                                        regexp_split_to_array(regexp_split_to_array(line, '~')[2], ',')[1]::INT) x FROM lines) xs,
             (SELECT r, generate_series(regexp_split_to_array(regexp_split_to_array(line, '~')[1], ',')[2]::INT,
                                        regexp_split_to_array(regexp_split_to_array(line, '~')[2], ',')[2]::INT) y FROM lines) ys,
             (SELECT r, generate_series(regexp_split_to_array(regexp_split_to_array(line, '~')[1], ',')[3]::INT,
                                        regexp_split_to_array(regexp_split_to_array(line, '~')[2], ',')[3]::INT) z FROM lines) zs
        WHERE xs.r = ys.r
          AND xs.r = zs.r
    ),

    -- Part one: let the pieces fall, with a minimum z value of one.
    parts(r INT, x INT, y INT, z INT) AS (
        SELECT * FROM cells
        EXCEPT ALL SELECT * FROM cells_delayed
        UNION ALL
        SELECT r, x, y, CASE WHEN r IN (SELECT * FROM supported) THEN z ELSE z - 1 END
        FROM parts
    ),
    -- One piece supports a *different* piece if it is directly below a piece of the other.
    supports(r1 INT, r2 INT) AS (
        SELECT DISTINCT p1.r, p2.r
        FROM parts p1, parts p2
        WHERE p1.x = p2.x
          AND p1.y = p2.y
          AND p1.z + 1 = p2.z
          AND p1.r != p2.r
    ),
    supported(r INT) AS (
        SELECT r FROM parts WHERE z = 1
        UNION
        SELECT r2 FROM supports
    ),
    -- A piece is safe to remove if it is does not uniquely support any other piece.
    part1(part1 BIGINT) AS (
        SELECT COUNT(DISTINCT r)
        FROM lines
        WHERE r NOT IN (
            SELECT r1
            FROM supports
            WHERE r2 IN (
                SELECT r2
                FROM supports
                GROUP BY r2
                HAVING COUNT(*) = 1
            )
        )
    ),

    cells_delayed(r INT, x INT, y INT, z INT) AS ( SELECT * FROM cells ),

    -- Part two: for each piece, how many pieces would fall if you removed it?
    -- Extend `supports` to transitive support: if r1 vanished would r2 fall?
    supports_trans(r1 INT, r2 INT) AS (
        -- Uniquely supported pieces would certainly fall.
        SELECT *
        FROM supports
        WHERE r2 IN (SELECT r2 FROM supports GROUP BY r2 HAVING COUNT(*) = 1)
        -- Any piece all of whose supports would fall without 'a' also falls without it.
        UNION
        SELECT st.r1, s1.r2
        FROM supports_trans st, supports s1
        WHERE st.r2 = s1.r1
        GROUP BY st.r1, s1.r2
        HAVING COUNT(*) = (SELECT COUNT(*) FROM supports WHERE supports.r2 = s1.r2)
    ),

    part2(part2 BIGINT) AS (SELECT COUNT(*) FROM supports_trans)

SELECT * FROM part1, part2;

Contributors

Day 22 was brought to you by: @frankmcsherry

Day twenty-three is a classic example of finding the “longest path” in a directed graph. This is a relatively easy problem when the input is acyclic (part one), and it is NP-hard when the input may have cycles (part two). Part one was a mostly vanilla recursive SQL query, and part two encoded the 32 prior state options in a large integer and just did a lot of work.

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE

    lines(r INT, line TEXT) AS (
        SELECT r, regexp_split_to_array(input, '\n')[r] as line
        FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
    ),

    cells(r INT, c INT, symbol TEXT) AS (
        SELECT r, c, substring(line, c, 1)
        FROM lines, generate_series(1, length(line)) c
    ),

    -- Part one: longest path (on probably a DAG)
    paths(r INT, c INT) AS (
        SELECT r, c FROM cells WHERE symbol = '.'
    ),

    steps(r1 INT, c1 INT, r2 INT, c2 INT) AS (
        SELECT r, c, r + 1, c FROM paths WHERE (r + 1, c) IN (SELECT * FROM PATHS) UNION
        SELECT r, c, r - 1, c FROM paths WHERE (r - 1, c) IN (SELECT * FROM PATHS) UNION
        SELECT r, c, r, c + 1 FROM paths WHERE (r, c + 1) IN (SELECT * FROM PATHS) UNION
        SELECT r, c, r, c - 1 FROM paths WHERE (r, c - 1) IN (SELECT * FROM PATHS)
    ),

    -- A directional trip, forced by a slope and the no-revisting rule.
    force(r1 INT, c1 INT, r2 INT, c2 INT) AS (
        SELECT r-1, c, r+1, c FROM cells WHERE symbol = 'v' UNION ALL
        SELECT r+1, c, r-1, c FROM cells WHERE symbol = '^' UNION ALL
        SELECT r, c-1, r, c+1 FROM cells WHERE symbol = '>' UNION ALL
        SELECT r, c+1, r, c-1 FROM cells WHERE symbol = '<'
    ),

    dists(r INT, c INT, d INT) AS (
        SELECT 1, 2, 0
        UNION
        SELECT steps.r2, steps.c2, 1 + MIN(d)
        FROM dists, steps
        WHERE dists.r = steps.r1
          AND dists.c = steps.c1
        GROUP BY steps.r2, steps.c2
        UNION 
        SELECT force.r2, force.c2, 2 + MAX(d)
        FROM dists, force
        WHERE dists.r = force.r1
          AND dists.c = force.c1
        GROUP BY force.r2, force.c2
    ),

    -- Part two: longest path on definitely not a DAG.
    -- There are 32 optional nodes (not including first and last nodes)
    -- Clearly meant to pack in to an int and avoid duplication.
    paths2(r INT, c INT) AS (
        SELECT r, c FROM cells WHERE symbol != '#'
    ),

    steps2(r1 INT, c1 INT, r2 INT, c2 INT) AS (
        SELECT r, c, r + 1, c FROM paths2 WHERE (r + 1, c) IN (SELECT * FROM paths2) UNION
        SELECT r, c, r - 1, c FROM paths2 WHERE (r - 1, c) IN (SELECT * FROM paths2) UNION
        SELECT r, c, r, c + 1 FROM paths2 WHERE (r, c + 1) IN (SELECT * FROM paths2) UNION
        SELECT r, c, r, c - 1 FROM paths2 WHERE (r, c - 1) IN (SELECT * FROM paths2)
    ),
    -- Locations where a choice exists (or start/end).
    nodes(r INT, c INT) AS (
        SELECT r1, c1 FROM steps2 GROUP BY r1, c1 HAVING COUNT(*) != 2
    ),
    -- Determine node-to-node path lengths. Do not cross nodes.
    trail(r1 INT, c1 INT, d INT, r2 INT, c2 INT) AS (
        SELECT r1, c1, MIN(d), r2, c2
        FROM (
            SELECT r1, c1, 1 d, r2, c2 FROM steps2 WHERE (r1, c1) IN (SELECT * FROM nodes)
            UNION ALL
            SELECT trail.r1, trail.c1, d + 1, steps2.r2, steps2.c2
            FROM trail, steps2
            WHERE trail.r2 = steps2.r1
            AND trail.c2 = steps2.c1
            AND (trail.r1 != steps2.r2 OR trail.c1 != steps2.c2)
            AND (steps2.r1, steps2.c1) NOT IN (SELECT * FROM nodes)
        )
        GROUP BY r1, c1, r2, c2
    ),

    links(r1 INT, c1 INT, d INT, r2 INT, c2 INT) AS (
        SELECT * FROM trail WHERE (r2, c2) IN (SELECT * FROM nodes)
    ),

    -- These rows in links show that (12, 20) and (130, 126) are mandatory,
    -- and are the first moments we have a choice. The remainaing 32 nodes
    -- can each get a number, and be used in a bit pattern somewhere.
    --
    --          1 |   2 | 105 |  12 |  20
    --        141 | 140 | 121 | 130 | 126

    -- Re-key nodes to dense integers.
    internal(r INT, c INT, id INT) AS (
        SELECT r, c, (
            SELECT COUNT(*)
            FROM nodes n1
            WHERE (n1.r < n2.r OR (n1.r = n2.r AND n1.c < n2.c))
              AND (n1.r, n1.c) NOT IN (VALUES (1,2), (12,20), (130,126), (141,140))
        )
        FROM nodes n2
        WHERE (r, c) NOT IN (VALUES (1,2), (12,20), (130,126), (141,140))
    ),

    longest(r INT, c INT, d INT, v BIGINT) AS (
        SELECT r, c, MAX(d), v
        FROM (
            SELECT 12 r, 20 c, 0 d, 0 v
            UNION ALL
            SELECT r2, c2, longest.d + links.d, v + (1::BIGINT << internal.id)
            FROM longest, links, internal
            WHERE longest.r = links.r1
              AND longest.c = links.c1
              AND links.r2 = internal.r
              AND links.c2 = internal.c
              AND ((v >> internal.id) % 2) != 1
        )
        GROUP BY r, c, v
    ),

    potato(x INT) AS ( SELECT 1 )

SELECT * FROM longest ORDER BY d DESC;

Contributors

Day 23 was brought to you by: @frankmcsherry

Day twenty-four had most folks reach for a numerical solver, something like Mathematica or z3. That is less easy in SQL, and I needed to learn some math instead (specifically how to find the intersection of two line segments). Although part two seemed quite complex, it ended up being relatively easy when you realize a few simplifications (an added dimension that can be ignored until the end, allowing you to re-use part one).

See the solution

Link to puzzle(s) 🟢 🟢

Part one + two in one go!

sql

WITH MUTUALLY RECURSIVE

    lines(r INT, line TEXT) AS (
        SELECT r, regexp_split_to_array(input, '\n')[r] as line
        FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
    ),

    observation(r INT, x NUMERIC, y NUMERIC, z NUMERIC, dx NUMERIC, dy NUMERIC, dz NUMERIC) AS (
        SELECT
            r,
            trim(',' FROM regexp_split_to_array(line, ' ')[1])::NUMERIC,
            trim(',' FROM regexp_split_to_array(line, ' ')[2])::NUMERIC,
            trim(',' FROM regexp_split_to_array(line, ' ')[3])::NUMERIC,
            trim(',' FROM regexp_split_to_array(line, ' ')[5])::NUMERIC,
            trim(',' FROM regexp_split_to_array(line, ' ')[6])::NUMERIC,
            trim(',' FROM regexp_split_to_array(line, ' ')[7])::NUMERIC
        FROM
            lines
    ),

    -- Part one: for each pair, solve for a future (x,y) intersection of their traced paths.
    -- https://en.wikipedia.org/wiki/Line–line_intersection#Given_two_points_on_each_line_segment
    meeting(r1 INT, r2 INT, x NUMERIC, y NUMERIC, t1 NUMERIC, t2 NUMERIC) AS (
        SELECT
            o1.r,
            o2.r,
            o1.x + o1.dx * (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
            o1.y + o1.dy * (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
            (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
            (((o2.x - o1.x) * o1.dy) - ((o2.y - o1.y) * o1.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx)
        FROM observation o1, observation o2
        WHERE o1.dx * o2.dy != o1.dy * o2.dx
          AND o1.r < o2.r
    ),
    part1(part1 BIGINT) AS (
        SELECT COUNT(*)
        FROM meeting
        WHERE t1 >= 0
          AND t2 >= 0
          AND x BETWEEN 200000000000000 AND 400000000000000
          AND y BETWEEN 200000000000000 AND 400000000000000
    ),

    -- Part two: find an initial x, y, z, dx, dy, dz such that you intersect every observation in the future.
    -- Hypothesize dx and dy, subtract them, and assses the number of coincidences.
    hypotheses(r INT, x NUMERIC, y NUMERIC, dx NUMERIC, dy NUMERIC, ox NUMERIC, oy NUMERIC) AS (
        SELECT
            r, x, y, dx - ox, dy - oy, ox, oy
        FROM
            observation,
            generate_series(-500, 500) ox,
            generate_series(-500, 500) oy
        WHERE r < 10
          AND 5 * (ox + 21) = 16 * (oy + 39)    -- derived from input pair with same (dx, dy).
    ),
    coincidence(r1 INT, r2 INT, x NUMERIC, y NUMERIC, ox NUMERIC, oy NUMERIC) AS (
        SELECT
            o1.r,
            o2.r,
            o1.x + o1.dx * (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
            o1.y + o1.dy * (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
            o1.ox,
            o1.oy
        FROM hypotheses o1, hypotheses o2
        WHERE o1.dx * o2.dy != o1.dy * o2.dx
          AND o1.r < o2.r
          AND o1.ox = o2.ox
          AND o1.oy = o2.oy
    ),

    hypotheses_xz(r INT, x NUMERIC, y NUMERIC, dx NUMERIC, dy NUMERIC, ox NUMERIC, oy NUMERIC) AS (
        SELECT
            r, x, z, dx - ox, dz - oz, ox, oz
        FROM
            observation,
            generate_series(-117, -117) ox,
            generate_series(-500, 500) oz
        WHERE r < 10
    ),
    coincidence_xz(r1 INT, r2 INT, x NUMERIC, y NUMERIC, ox NUMERIC, oy NUMERIC) AS (
        SELECT
            o1.r,
            o2.r,
            o1.x + o1.dx * (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
            o1.y + o1.dy * (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
            o1.ox,
            o1.oy
        FROM hypotheses_xz o1, hypotheses_xz o2
        WHERE o1.dx * o2.dy != o1.dy * o2.dx
          AND o1.r < o2.r
          AND o1.ox = o2.ox
          AND o1.oy = o2.oy
    ),

    potato (x INT) AS ( SELECT 1 )

-- SELECT x, y, ox, oy, COUNT(*) FROM coincidence GROUP BY x, y, ox, oy HAVING COUNT(*) > 1;
SELECT x, y, ox, oy, COUNT(*) FROM coincidence_xz GROUP BY x, y, ox, oy HAVING COUNT(*) > 1;

Contributors

Day 24 was brought to you by: @frankmcsherry

Day twenty-five asked for a minimum graph cut (of three edges). This is a standard optimization problem, but rather than try to implement the Stoer-Wagner algorithm I went with something from my PhD thesis: partitioning the graph based on the Fiedler vector. It turns out this gave the right answer on the first try, and the holidays were saved!

See the solution

Link to puzzle(s) 🟢

Part one

sql

WITH MUTUALLY RECURSIVE (RETURN AT RECURSION LIMIT 50)

    lines(r INT, line TEXT) AS (
        SELECT r, regexp_split_to_array(input, '\n')[r] as line
        FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
    ),

    edges(src TEXT, dst TEXT) AS (
        SELECT
            trim(':' FROM regexp_split_to_array(line, ' ')[1]),
            trim(',' FROM regexp_split_to_array(line, ' ')[x])
        FROM
            lines, generate_series(2, array_length(regexp_split_to_array(line, ' '), 1)) x
    ),

    symm(src TEXT, dst TEXT) AS (
        SELECT src, dst FROM edges
        UNION ALL
        SELECT dst, src FROM edges
    ),

    init(src TEXT, val NUMERIC) AS (
        SELECT src, CASE WHEN src < 'n' THEN 1.0 ELSE -1.0 END
        FROM (SELECT src FROM edges UNION ALL SELECT dst FROM edges)
    ),
    -- determine the second eigenvector of the adjacency matrix
    weight(src TEXT, val NUMERIC) AS (
        SELECT * FROM init
        EXCEPT ALL
        SELECT * FROM init_delayed
        UNION ALL
        SELECT symm.src, SUM((val - (SELECT AVG(val) FROM weight))/(SELECT STDDEV(val) FROM weight))
        FROM symm, weight
        WHERE symm.dst = weight.src
        GROUP BY symm.src
    ),

    init_delayed(src TEXT, val NUMERIC) AS ( SELECT * FROM init ),

    part1(part1 BIGINT) AS (
        SELECT
            (SELECT COUNT(*) FROM weight WHERE val < 0.0) *
            (SELECT COUNT(*) FROM weight WHERE val > 0.0)
    ),

    potato(x INT) AS ( SELECT 1 )

SELECT * FROM part1;

Contributors

Day 25 was brought to you by: @frankmcsherry

Conclusions

The exercise was certainly helpful and informative, on multiple levels.

First, it really reinforced for me that WITH MUTUALLY RECURSIVE is a very valuable tool to have access to when faced with a new problem. Often your problem is a bunch of joins and reductions, but when it isn’t you are immediately in a bit of a pickle. In most cases, algorithmic challenges immediately gave way to recursive SQL.

That being said, there’s clearly an accessibility gap when reaching for recursive SQL. I find the idioms approachable, but I’ve spent a while working with data-parallel algorithms, and have seen several of the tricks. There’s still plenty of work to do before the casual SQL author feels comfortable with recursive SQL.

Second, the majority of my time was spent debugging rather than authoring. This is a classic challenge with declaritive languages, who go from input program to output data in often inscrutable ways. I borrowed some techniques from debugging Datalog, but ideally the system itself would help me with this (and several research systems do provide integrated lineage).

Debugging the logic of SQL queries only gets harder when the data are changing underneath you. Techniques like spot checking data become infeasible when the data changes faster than you can observe records that are meant to line up. Materialize should help in these cases, with maintained diagnostic views that represent assertions, or better violations thereof, whose contents spell out records that at some moment violated something that was meant to be true. Materialize’s SUBSCRIBE provides a full account of these views, reporting records that existed even for a moment, where anything other than “always empty” represents an error in your SQL (or your assertions).

Third, using Materialize in new and weird ways shook out several bugs. We’ve already fixed them. Dogfooding your own product, especially in surprising contexts, is a great way to forcibly increase your test coverage. Issues ranged from the silly (“why would you name a table count?”) to the abstruse (doubly nested recursive SQL blocks), but they spilled out in the early days and became less frequent as the weeks went on.

Finally, the main conclusion was that it was all possible. Despite substantial anxiety about whether and when we would need to bail out, defeated, the whole project did work out. We were able to express a rich variety of computational tasks as data-driven SQL both expressed and maintained by Materialize.

Materialize and Advent of Code: Using SQL to solve your puzzles!

See Materialize in action

Preliminaries

Week one

See the solution

Part one

Part two

Contributors

See the solution

Part one

Part two

Part one + two in one go!

Contributors

See the solution

Part one + two in one go!

Contributors

See the solution

Part one

Part two

Part one + two in one go!

Contributors

See the solution

Part one

Part one + two in one go!

Contributors

See the solution

Part one

Part one + two in one go!

Contributors

See the solution

Part one + two in one go!

Contributors

Week two

See the solution

Part one + two in one go!

Contributors

See the solution

Part one + two in one go!

Contributors

See the solution

Part one + two in one go!

Contributors

See the solution

Part one + two in one go!

Contributors

See the solution

Part one

Contributors

See the solution

Part one + two in one go!

Contributors

See the solution

Part 1

Part 2

Contributors

Week three

See the solution

Part one + two in one go!

Contributors

See the solution

Part one + two in one go!

Contributors

See the solution

Part one + two in one go!

Contributors

See the solution

Part one + two in one go!

Contributors

See the solution

Part one + two in one go!

Contributors

See the solution

Part one + two in one go!

Contributors

See the solution

Part one + two in one go!

Contributors

Week four

See the solution

Part one + two in one go!