advent of code 2020, day 7: droste and bourke

Written: 2020-12-07 to 2020-12-07
by Norman Liu

The previous few days of Advent have been kinda boring - day 4 was regex, then binary search, then set intersections. We’re finally back to graph traversals, and so I felt like this was worth a post.

The problem goes as follows - you’ve sledded to the airport, made it past security, boarded your plane, and gotten through customs. Now that you’re out of the North Pole and at your layover, bizarre baggage rules are only the next obstacle on your trip.

Specifically, bags must be colour-coded and must contain specific quantities of other colour-coded bags. The input consists of several hundred lines that each describe the colour of a bag, and its required contents, like

shiny red bags contain 5 muted pink bags, 1 faded green bag.
faded green bags contain no other bags.
bright white bags contain 2 muted black bags.

That’s a ton of nested bags, and it’s starting to sound like a graph theory problem. We can label each node by its colour description, and assign a weight to each edge by bag quantity. Since logically, a bag can’t contain itself, and inner bags can’t contain outer bags (forming a cycle), we can look at this as a directed acyclic graph. Part 1 seems like it’s asking a lot at first. The prompt reads, “You have a shiny gold bag. If you wanted to carry it in at least one other bag, how many different bag colors would be valid for the outermost bag?”. My first thought was just to do a topological sort of our graph, count the number of nodes that occur before shiny gold, and call it a day. There’s a logical error, though - for a given node that has a path to shiny gold, we don’t necessarily know if all of its other edges also converge to shiny gold. If we had a graph that looked like

                      Г── gold|
  Г──────── green ───┘
red ─────── blue|

blue would come before gold in sorted topological order. We’ll just have to walk the graph normally, then.

Sounds simple enough. We’ll conjure up some Haskell as usual, and model our graph as a Map of Strings (bag colours) to lists, [(String, Int)], denoting other bag colours and their respective counts.

I took some time this weekend and refactored my Advent of Code repo a little bit. It makes use of the advent-of-code-api library to automatically fetch inputs and submit solutions. I’ve also abstracted away a bunch of modules and useful functions into one monolithic file for faster imports. All in all, I save about 30 seconds now.

module Solutions.Day7 where

import Helpers -- Control.Arrow, Control.Applicative, Control.Monad, Data.Maybe, Data.List.Split, Data.List

We also need a way to map keys to values. Data.Map is comfy, although it uses trees instead of hash tables internally. That’s fine for our purposes.

import Data.Map (Map, (!))
import qualified Data.Map as M

Following my tradition of avoiding regex like the plague, we’ll spit out something naive to parse our input into the format we want. In my defense, even Parsec is an ordeal to debug and it was late.

It seems like every line has the term bags contain separating the description of a bag and the description of its children, so we’ll use splitOn from Data.List.Split. Only matching a 2-element list like that is dangerous in the real world, because you’ll hit a parse error if you try to match a list of any other length.

If a description says no other, we can (safely) assume that there won’t be any outgoing edges from this graph node. It also saves us from a potential edge case when trying to parse the descriptions of inner bags. We’ll focus on a single line at first.

parseBag :: String -> (String, [(String, Int)])
parseBag = splitOn " bags contain" >>> \[colour, inners] -> (colour, edges)
  where edges = ("no other" `isPrefixOf` inners ? [] $ ((map parseDscrp . splitComma) inners)))
  -- we can optimize by using a shorter matching string, like "no o"
  -- no regex allowed

“But Haskell doesn’t have ternary operators???” you say, bound by the shackles of inflexible syntax.

(?) :: Bool -> a -> a -> a
(?) True = const
(?) False = const id

-- splitComma is also defined in helpers
splitComma :: [Char] -> [[Char]]
splitComma = splitOn ","

We also need to parse the predicates of each sentence. We know that each description is separated by commas, so we can break the task down to parsing a single description at a time. The inputs for parseDscrp will look like <number> <adjective> <colour> (bag[s]) at this point. Wikipedia doesn’t list any colours that start with “bag”, and Eric isn’t evil, so we’ll filter out any “bag”-like word too.

Since every sequence starts with a number, we’ll just split by words, turn the <number> into an Int, and use the remainder as our colour. These colours should match up with the colour descriptions that preface each line, that we’ll use as keys in our eventual Map.

parseDscrp :: String -> (String, Int)
parseDscrp = words >>> filter (not . ("bag" `isPrefixOf`)) >>> (tail &&& head) >>> (unwords *** read) 
                                                           -- \(n:bs) -> (unwords bs, read n)) 

Let’s try these out now.

> parseDscrp "5 muted pink bags" 
(5, "muted pink")

> parseBag "bright orange bags contain 5 muted pink bags, 1 dark black bag."
("bright orange", [(5, "muted pink"), (1, "dark black")])

> parseBag "dark black bags contain no other bags."
("dark black", [])

Way easier than writing regular expressions, right? At least it’s more fun this way. Let’s build our graph now - fromList from Data.Map expects a list [(key, value)], which matches up with the output type of parseBag. All our types match up, so we’ll connect our functions into a neat little pipeline and build a dependency graph from our input.

type Graph = Map String [(String, Int)]

fromInput :: String -> Graph
fromInput = lines >>> map parseBag >>> M.fromList

To put the prompt into simpler terms, it’s asking how many nodes in our graph can reach “shiny gold”. We know that each bag doesn’t contain that many different bags, and that the result of searching for “shiny gold” in a bag’s path depends on whether we find it deeper down in our traversal. In any other languages, this is a dynamic programming problem. We have a lot of inputs, a (potentially expensive) recursive call, and a ton of sharing between values. Luckily, we’re working in a lazy language, and values only get computed once within a given closure. If we pass “matte beige” into walk, find an answer, and encounter “matte beige” again during a later traversal, the answer will be returned immediately since the results will be cached by the GHC runtime. walk is recursive, and we’re likely to encounter the same values many times, so having implicit memoization in the background saves us a ton of effort. To be honest, getting implicit memoization to kick in is a subtle thing, but my code took less than 60ms for both parts combined so I’ll count that as a win. I’ll have to come back and do some profiling at a later date to get a better insight of what’s going on underneath.

In the meantime, we’ve done it. Either a node has “shiny gold”, or one of its children does. We’ll traverse from every node in our graph, and count the number of successful starting points. Again, passing 3 functions h, f, g into liftA2 will apply functions f, g to the same input, passing both results as arguments into h. It’s incredibly useful for capturing the pattern of passing a single input into two computations, and combining their results.

walk :: Graph -> String -> Bool
walk g b = "shiny gold" `elem` kids || any (walk g) kids
    where kids = map fst $ (g ! b)
    -- liftA2 (||) (elem "shiny gold") (any $ walk g)

That’s it. We can map walk over the keys of our graph directly, saving us from having to convert it back down to a list.

day7Pt1 :: String -> Int
day7Pt1 = fromInput >>> (\graph -> M.filterWithKey (\k _ -> walk graph k) graph) >>> M.size

If you want to be even more clever, you can use the Monad instance of the function type ((->) r) (lifting sequential function application with the same input into a monadic bind) to cut this down even more. Brevity is free in the lambda calculus.

instance Monad ((->) r) where
  (>>=) :: (r -> a) -> (a -> (r -> b)) -> (r -> b)
  g >>= f = \x -> f (g x) x 

day7Pt1' :: String -> Int
day7Pt1' = fromInput >>> ((walk >>> (const .)) >>= M.filterWithKey) >>> M.size

Really useful stuff. The Applicative instance of the function type ((->) r) additionally corresponds to the S combinator, which encapsulates the flipped version of this pattern,

instance Applicative ((->) r) where

 -- pure :: a -> ((->) r a)
 pure :: a -> (r -> a)
 pure = const

 -- (<*>) :: ((->) r) (a -> b) -> ((->) r a) -> r -> b
 (<*>) :: (r -> a -> b) -> (r -> a) -> r -> b
 (<*>) f g x = f x (g x)

Part 2 coming soon!

innerBags :: Graph -> String -> Int
innerBags g b = succ (sum $ uncurry ((*) . innerBags g) <$> (g ! b))
               
day7Pt2 :: String -> Int
day7Pt2 = fromInput >>> (pred . flip innerBags "shiny gold")
    -- pred because a shiny gold bag can't contain itself