Generalized Dijkstra in Haskell
This years’ Advent of Code has loads of 2D grids, and makes you traverse them all to find paths of various kinds. At some point I had to implement Dijkstra’s algorithm, in Haskell. In trying to make my implementation reusable for the following days, I realized that Dijkstra’s algorithm is actually way more general than I remembered (or was taught)! In short, weights don’t have to be real-valued!
In this post, I describe a general interface for the algorithm, such that we can implement it exactly once and use it to compute many different things.
- A primer on Dijkstra’s algorithm
- Taking a step back, generalizing
- Abstract Haskell interface and implementation
- Instantiating the interface
- Closing thoughts
This article is a literate Haskell file, so feel free to download it and try it 1 for yourself! As such, let’s get a few imports and language extensions out of the way:
Haskell Bookkeeping
{-# LANGUAGE GHC2021 #-}
{-# LANGUAGE BlockArguments #-}
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE DerivingVia #-}
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE MultiWayIf #-}
import Control.Monad (when, foldM, forM_)
import Control.Monad.ST (ST, runST)
import Data.Array (Array, (!), listArray, range)
import Data.Array qualified as Array (bounds)
import Data.Array.MArray (newArray, freeze, readArray, writeArray)
import Data.Array.ST (STArray, runSTArray)
import Data.Bifunctor (bimap)
import Data.Function ((&))
import Data.Functor ((<&>))
import Data.Ix (Ix, inRange)
import Data.Kind (Type)
import Data.List (transpose, find)
import Data.Ord (Down(Down))
import Data.Semigroup (Arg(Arg))
import Data.Set (Set)
import Data.Set qualified as Set
A primer on Dijkstra’s algorithm
Let’s recall the problem statement that Dijkstra’s algorithm solves, and the pseudo-code of the algorithm. If you know that already, you can skip to the next section.
The shortest path problem
It’s been a while since I had to use any formalism to talk about graphs proper. So I will be using the notations from the Cormen book that I just looked up for a refresher.
Consider a weighted directed graph G = (V, E, w).
- V denotes the set of vertices.
- E \subseteq V \times V denotes the set of edges.
- Every edge e \in E has an associated (non-negative) weight w(e) \in \mathbb{R}, w(e) > 0.
We call path a sequence of vertices such that there is an edge between every consecutive vertex in the sequence. If we denote \text{paths}(a, b) the set of paths from a to b in G, this means p = \langle v_0, \dots, v_k \rangle \in \text{paths}(a, b) if v_0 = a, v_k = b and \forall 0 ≤ i < k, (v_i, v_{i + 1}) \in E.
We can define the weight of a path as the sum of the weights of its constituent edges.
w(p) = \sum_{i = 1}^k{w(v_{i - 1}, v_i)}
Shortest-paths problems ask questions along the lines of:
- What is the minimum weight from a to b in G?
- What is one path from a to b with minimum weight?
- Can we find all such paths from a to b?
If you interpret the weight as a physical distance, this amounts to finding the shortest trip from one vertex to the other.
Dijkstra’s algorithm is an infamous technique for solving the single-source shortest-paths problem: finding a shortest path from a given source vertex s to every other vertex v \in V. It’s essentially a generalization of Breadth-First Search (BFS) to (positively) weighted graphs. And it’s pretty fast!
Dijkstra’s algorithm
It’s really “just” an extension of BFS to graphs with different non-negative weights. So it traverses vertices in the graph in the order of their shortest path distance from the root. Instead of using a regular queue like in BFS, or a stack like in DFS, Dijsktra’s algorithm uses a priority queue to store discovered vertices for which we have yet to find the shortest distance. Their priority in the queue is the current distance along the current known shortest paths in the graph.
The algorithm can be written as such:
for (u ∈ V) u.distance = ∞
.distance = 0
start
let pqueue = new PositivityQueue
.insert(start, 0)
pqueue
while (!pqueue.isEmpty()) {
= pqueue.popLowest()
u
for (v ∈ V, (u, v) ∈ E) {
let distThroughU = u.distance + w(u, v)
if (distThroughU < v.distance) {
// we've found a new shorter path to v (going through u)
if (v.distance == ∞):
// first time we see v
// it isn't in the queue yet
.insert(v, distThroughU)
pqueue
}else {
.update(v, distThroughU)
pqueue
}.distance = distThroughU
v
}
} }
By the end of the algorithm, u.distance
contains the
shortest distance/weight from u
to the start
,
for every u
.
Let’s look past the proof of correctness.
Taking a step back, generalizing
One thing to notice in the problem statement from earlier and its accompanying algorithm is that weights have very little to do with real numbers. In fact, they don’t have to be scalars at all!
If we denote W the set of weights, the algorithm merely requires:
- An equivalence relation (\cdot \approx \cdot) \subseteq W \times W on weights. It doesn’t have to be definitional equality!
- A total order on weights, that is: (\cdot\leq\cdot) \subseteq W \times W such that it is transitive, reflexive and anti-symmetric (w.r.t \approx).
- A way to add weights together (\cdot
\oplus \cdot) ∷ W \rightarrow W \rightarrow W such that:
\approx is compatible with \oplus: equivalent inputs lead to equivalent sums:
\forall a, b, c, d \in W, a \approx b, c \approx d \Rightarrow a \oplus c \approx b \oplus d
\leq is compatible with \oplus: it must be order-preserving over both its arguments.
\forall a, b, c, d \in W, a \leq b, c \leq d \Rightarrow a \oplus c \leq b \oplus d
\oplus is associative. I don’t think it’s a requirement for Dijkstra’s algorithm but it’s a nice property to have, and makes writing sums unambiguously. And it is probably necessary for Floyd-Warshall.
- To initialize the algorithm, we require a default “largest” weight: an absorbing element \infty for \oplus. It follows that it is an upper bound of W.
- The algorithm also needs a neutral element 0_u for \oplus, the initial weight at the start vertex u\in V. It follows that it is a lower bound of W.
If we summarize, it looks like (W/\approx, \oplus, 0_u) should be a monoid, totally ordered by \leq and with absorbing element \infty. I think this encompasses all the properties stated above, and nothing more, but I haven’t looked that deeply into the formalism.
The restriction that edges must have non-negative weights can simply be reworded as weights having to be strictly larger than the identity element.
\forall e \in E, w(e) > 0_u
Now we can freely redefine the weight of a path:
w(p) = \bigoplus_{i = 1}^k{w(v_{i - 1}, v_i)}
Equipped with this toolkit, we can state the single-source shortest-path problem again: for a given source vertex s \in V, how do we compute the smallest weight achievable on a path from s to any other vertex e \in V?
Without too much surprise, Dijkstra’s algorithm works just as well. I will not prove correctness here, though I probably should.
Abstract Haskell interface and implementation
Weights
Given the requirements on weights we established earlier, we can try to map each of them to their corresponding Haskell counterpart.
- Weights should have an equivalence relation: that’s
Eq
. - Weights should have a total order: that’s
Ord
. - Weights should have an associative addition operation that respects
the order: that’s
Semigroup
.
Sadly we’re not using Agda, and can’t enforce that the order relation and the semigroup operation be compatible using the type system. We’ll just have to be careful when defining the instances.
Therefore, a Weight
should provide instances for all
three classes above.
class (Eq a, Ord a, Semigroup a) => Weight a where
infty :: a
updateWeight :: a -> a -> a
= const updateWeight
infty
is the absorbing element of W. As stated earlier, it must be an upper
bound of W.
But what is this updateWeight
operation here? It is used
to merge equivalent weights. Indeed, during the execution of
the Dijkstra algorithm, in the relaxation phase, we may find that the
weight of going to v by passing through
u is equal to the cost we have already
computed for v. Because we haven’t
decreased the weight, we shouldn’t update the priority of v in the queue, however it can often still be
useful to take those new paths into account.
That’s what this function is for. The only requirement for
updateWeight
is that the output should be in the same
equivalence class as its (equivalent) inputs.
\forall w, w' \in W, w \approx w' \Rightarrow \texttt{updateWeight}(w, w') \approx w \approx w'
As a convention, the first argument is the already computed weight, and the second argument is the newly discovered (equivalent) weight along the new path(s) through u.
The resulting weight won’t change the priority of v in the queue, and the order of traversal. But the new information is now accounted for.
The default implementation for updateWeight
discards the
new weight entirely. This is quite common, say, if we only want to find
“a shortest path”, and not every one of them.
Graphs
Now that we know what weights are, we need to describe what kind of graphs are suitable for our Dijkstra algorithm.
data Dijkstra i c = Dijkstra
bounds :: (i, i)
{ startWeight :: i -> c
, next :: i -> c -> [(c, i)]
, }
So, let’s expand a bit on the fields of the interface.
bounds
describes the lower and upper bound of V. This is just an implementation detail: I want to store intermediate weights in a mutable array during the traversal, for efficiency purposes. So I need to know the size of V.If you cannot reasonnably enumerate all vertices, you can drop the
bounds
field and use a purely-functional persistentMap
instead in the implementation.initCost
returns the initial cost we use for a given start vertex. It must always be an identity element of W, and a lower bound of W.\forall s \in V, w \in W, \texttt{startCost}(s) \oplus w \approx w \oplus \texttt{startCost}(s) \approx w \forall s \in V, w \in W, \texttt{startCost}(s) \leq w
Concretely, this means that rather than have a single identity 0 \in W, we have one for every vertex. By anti-symmetry of the ordering relation they are all equivalent anyway. This is very useful to store information about the starting vertex in the weight. Say, if we’re computing paths, we initially store a 0-length path containing only the starting vertex.
And finally, the bread and butter of the graph: a transition function
next
. For any vertex u and its associated weight w,next u w
returns the neighbours of u, with the weight of the edges. As discussed earlier, weight of edges must be strictly larger than 0.One may wonder why we take as input the weight of u, and indeed it is weird. Most reasonable transition functions ignore it. But this means you can define funky graphs where the weight of an edge depends on the minimal weight to get there from a specific source. I think it is perfectly fine w.r.t the assumptions of the Dijkstra algorithm, though knowing exactly what kind of graph this corresponds to is a bit more tedious.
I show one such example where I rely on this input weight later on.
And here we have it! A description of graphs that can serve as input for the Dijkstra algorithm to solve the single-source shortest-path problem.
Note that this interface is completely agnostic to how we encode our graphs, so long as we can extract a transition function from this underlying representation.
Generic Dijkstra implementation
Finally. It is time. We can implement the Dijkstra algorithm. But first we need a priority queue, with the following interface:
type PQueue :: Type -> Type
emptyQ :: PQueue a
singletonQ :: Ord a => a -> PQueue a
insertQ :: Ord a => a -> PQueue a -> PQueue a
pattern EmptyQ :: PQueue a
pattern (:<) :: Ord a => a -> PQueue a -> PQueue a
For simplicity, let’s just use a wrapper around
Data.Set
.
PQueue
implementation
newtype PQueue a = PQueue (Set a)
= PQueue Set.empty
emptyQ = PQueue . Set.singleton
singletonQ PQueue s) = PQueue (Set.insert x s)
insertQ x (
minView :: PQueue a -> Maybe (a, PQueue a)
PQueue s) =
minView (case Set.minView s of
Just (x, s') -> Just (x, PQueue s')
Nothing -> Nothing
pattern EmptyQ <- (minView -> Nothing)
pattern (:<) x q <- (minView -> Just (x, q))
I haven’t tried existing implementations available on Hackage yet, I should get around to it at some point. It also looks like I may want a priority search queue, so that I can really update the priority for a given key.
At last, the implementation for Dijkstra’s algorithm:
dijkstraFrom :: (Ix i, Weight c) => Dijkstra i c -> i -> Array i c
= dijkstra d s Nothing
dijkstraFrom d s
dijkstraFromTo :: (Ix i, Weight c) => Dijkstra i c -> i -> i -> c
= dijkstra d s (Just e) ! e
dijkstraFromTo d s e
dijkstra :: (Ix i, Weight c)
=> Dijkstra i c -> i -> Maybe i
-> Array i c
Dijkstra{..} :: Dijkstra i c) start cutoff = runSTArray do
dijkstra (<- newArray bounds infty
costs let zero = startWeight start
= singletonQ (zero, start)
queue
writeArray costs start zero
aux costs queuepure costs
where
aux :: STArray s i c -> PQueue (c, i) -> ST s ()
EmptyQ = pure ()
aux costs :< queue) | Just u == cutoff = pure ()
aux costs ((uWeight, u) :< queue) = do
aux costs ((uWeight, u) <- readArray costs u
uWeight' == uWeight') do
when (uWeight let edges = next u uWeight'
!queue <- foldM (relaxNeighbour costs uWeight') queue edges
aux costs queue
relaxNeighbour :: STArray s i c -> c -> PQueue (c, i) -> (c, i)
-> ST s (PQueue (c, i))
!queue (uvWeight, v) = do
relaxNeighbour costs uWeight let !vWeight = uWeight <> uvWeight
<- readArray costs v
vWeight' case vWeight `compare` vWeight' of
GT -> pure queue
EQ -> do
$ updateWeight vWeight' vWeight
writeArray costs v pure queue
LT -> do
writeArray costs v vWeightpure $ insertQ (vWeight, v) queue
Et voilà! Quite straightforward. Short and sweet.
Perhaps the most perplexing bit is the
when (uWeight == uWeight')
block when we pop the queue, and
the fact that we use uWeight'
(the “old” weight)
afterwards. Both of those are necessary because of our priority queue
interface. Because we don’t have a proper way to update the
priority/weight of a vertex in the queue, we instead insert it
again with a lower priority.
But this means that we will find the vertex several times in the queue afterward, and always with a weight strictly larger than what we know to be the minimal weight. Adding this check enforces that we process them only once. Likewise, when we merge weights, we don’t touch the priority queue, so the merged weight is only stored in our array, wheras the (equivalent weight) in the queue is outdated.
All this to say: yes, I know my interface for a priority queue is not very good, and I should really just switch to a priority search queue or something.
Instantiating the interface
Cool cool cool, we have a general Dijkstra implementation now, for any weight with the right structure. But what can we do with it? Time to figure this out.
Let’s define a few utilities for working with 2d grids, and a small grid that we will traverse in all the follow-up examples.
type Coord = (Int, Int)
type Grid = Array Coord Char
-- check if a cell is in bounds and empty
isValidCell :: Grid -> Coord -> Bool
=
isValidCell grid p inRange (Array.bounds grid) p && grid ! p /= '#'
-- get the coordinates around a cell
cellsAround :: Coord -> [Coord]
=
cellsAround (x, y) - 1, y )
[ (x + 1, y )
, (x - 1)
, (x , y + 1)
, (x , y
]
-- get empty cells in bounds around a cell
neighbours :: Grid -> Coord -> [Coord]
=
neighbours grid p filter (isValidCell grid) $ cellsAround p
-- get the coordinates of a given char in the grid
findCoord :: Grid -> Char -> Maybe Coord
=
findCoord grid c range (Array.bounds grid)
& find \p -> grid ! p == c
toGrid :: (Coord, Coord) -> [String] -> Grid
= listArray bounds . concat . transpose toGrid bounds
example :: Grid
= toGrid ((0, 0), (14, 14))
example "###############"
[ "#.......#.....#"
, "#.#.###.#.###.#"
, "#.....#.#...#.#"
, "#.###.#####.#.#"
, "#.#.#.......#.#"
, "#.#.#####.###.#"
, "#...........#.#"
, "###.#.#####.#.#"
, "#...#.....#.#.#"
, "#.#.#.###.#.#.#"
, "#.....#...#.#.#"
, "#.###.#.#.#.#.#"
, "#...#.....#...#"
, "###############"
,
]
endPos :: Coord
startPos,= (1,1)
startPos = (13,13) endPos
Minimum distance
Our simplest example consists in trying to compute the length of the shortest path between two vertices. We define our weight, in this case either an integer for the minimum distance, or \infty for unreachable vertices.
data MinDist = Dist !Int | Infinite deriving (Eq, Ord)
instance Semigroup MinDist where
Infinite <> _ = Infinite
<> Infinite = Infinite
_ Dist x <> Dist y = Dist (x + y)
instance Monoid MinDist where mempty = Dist 0
instance Weight MinDist where infty = Infinite
instance Show MinDist where
show Infinite = "∞"
show (Dist d) = show d
Let’s also define a partial Num MinDist
to be able to
create weights using integer literals.
Num MinDist
instance Num MinDist where
+) = (<>)
(-) = error "not supported"
(*) = error "not supported"
(abs = id
signum = const 1
fromInteger = Dist . fromInteger
Then we instantiate the interface. All sources have a default weight
of 0
, and all edges have a weight of 1
.
minDist :: Grid -> Dijkstra Coord MinDist
= Dijkstra
minDist grid = Array.bounds grid
{ bounds = const 0
, startWeight = \u _ -> (1,) <$> neighbours grid u
, next
}
getMinDist :: Grid -> Coord -> Coord -> MinDist
= dijkstraFromTo . minDist getMinDist
>>> getMinDist example startPos endPos
24
Shortest path
Nice, but what if we want to get an actual shortest path? The answer, again, is to introduce a custom weight.
data ShortestPath = SPath MinDist [Coord]
deriving Show
The goal is to compute the minimal ShortestPath
weight
to a vertex such that it contains the path from the source to the
vertex, in reverse order. So the vertex appears at the beginning of the
list, followed by all the vertices until the source.
Given that we only want to find a shortest path, we can put paths with the same length in the same equivalence class, and compare paths only by looking at their length. To add shortest paths together, we concatenate them. Because the paths are stored in reverse order, the concatenation is flipped.
instance Eq ShortestPath where
SPath x _ == SPath y _ = x == y
instance Ord ShortestPath where
SPath x _ `compare` SPath y _ = compare x y
instance Semigroup ShortestPath where
SPath x xs <> SPath y ys = SPath (x <> y) (ys <> xs)
And here we have it: a new weight for computing a shortest path!
instance Weight ShortestPath where
= SPath infty [] infty
shortestPath :: Grid -> Dijkstra Coord ShortestPath
= Dijkstra
shortestPath grid = Array.bounds grid
{ bounds = \s -> SPath 0 [s]
, startWeight = \u _ -> neighbours grid u
, next & map \v -> (SPath 1 [v], v)
}
getShortestPath :: Grid -> Coord -> Coord -> ShortestPath
= dijkstraFromTo . shortestPath getShortestPath
Note that this time the inital (identity) weight does depend on the source vertex, since the shortest path from the source always contains the source.
>>> getShortestPath example startPos endPos
SPath 25 [(13,13),(12,13),(11,13),...,(1,4),(1,3),(1,2),(1,1)]
Plotting it on the grid, with cells visited by the math marked with
O
:
###############
#O......#.....#
#O#.###.#.###.#
#O....#.#...#.#
#O###.#####.#.#
#O#.#.......#.#
#O#.#####.###.#
#OOOOOOOOOOO#.#
###.#.#####O#.#
#...#.....#O#.#
#.#.#.###.#O#.#
#.....#...#O#.#
#.###.#.#.#O#.#
#...#.....#OOO#
###############
Shortest paths (all of them)
Now what if we want all the shortest paths? Simple, we define a new weight!
data ShortestPaths = SPaths MinDist [[Coord]] deriving Show
The only difference with ShortestPath
is that we store
multiple paths. But we compare the weights in the exact same way.
instance Eq ShortestPaths where
SPaths x _ == SPaths y _ = x == y
instance Ord ShortestPaths where
SPaths x _ `compare` SPaths y _ = compare x y
When adding two weights together, we have to concatenate every possible pair of paths from both weights:
instance Semigroup ShortestPaths where
SPaths x xss <> SPaths y yss =
SPaths (x <> y) [ys <> xs | xs <- xss, ys <- yss]
Finally, we make sure to update the set of paths if we find another (distinct but equivalent) set of paths with the same length:
instance Weight ShortestPaths where
= SPaths infty []
infty
SPaths l xss) (SPaths _ yss) =
updateWeight (SPaths l (xss <> yss)
shortestPaths :: Grid -> Dijkstra Coord ShortestPaths
= Dijkstra
shortestPaths grid = Array.bounds grid
{ bounds = \s -> SPaths 0 [[s]]
, startWeight = \u _ -> neighbours grid u
, next & map \v -> (SPaths 1 [[v]], v)
}
getShortestPaths :: Grid -> Coord -> Coord -> ShortestPaths
= dijkstraFromTo . shortestPaths getShortestPaths
>>> getShortestPaths example startPos endPos
SPaths 24 [[(13,13),(12,13),(11,13),...,(1,4),(1,3),(1,2),(1,1)]
13,13),(12,13),(11,13),...,(2,3),(1,3),(1,2),(1,1)]
,[(13,13),(12,13),(11,13),...,(3,2),(3,1),(2,1),(1,1)]] ,[(
There are exactly 3 shortest paths on the grid.
Those are perhaps the 3 most common ways in which the Dijsktra algorithm is used. It’s already pretty neat that we can use a single implementation to compute them all. But so far all our edges have the same weight, so a BFS would have been sufficient.
Let’s look at fancier instantiations, all taken from my solutions to some Advent of Code puzzles.
Advent of Code, Day 16
Day 16 asks us to find the minimal score to go from
S
to E
in a maze.
day16example :: Grid
=
day16example 0, 0), (14, 14)) $ concat $ transpose
listArray (("###############"
[ "#.......#....E#"
, "#.#.###.#.###.#"
, "#.....#.#...#.#"
, "#.###.#####.#.#"
, "#.#.#.......#.#"
, "#.#.#####.###.#"
, "#...........#.#"
, "###.#.#####.#.#"
, "#...#.....#.#.#"
, "#.#.#.###.#.#.#"
, "#.....#...#.#.#"
, "#.###.#.#.#.#.#"
, "#S..#.....#...#"
, "###############"
, ]
Every step along the way from one cell to its neighbour increases the
score by 1
. But here’s the twist: every left or right
turn along the path adds an extra 1000
to the
score.
data Dir = U | R | D | L deriving (Eq, Ord, Enum, Ix)
Dir
utils
opposite :: Dir -> Dir
=
opposite dir case dir of
U -> D
R -> L
D -> U
L -> R
dirCellsAround :: Coord -> [(Dir, Coord)]
= zip [L, R, U, D] . cellsAround
dirCellsAround
dirNeighbours :: Grid -> Coord -> [(Dir, Coord)]
=
dirNeighbours grid p filter (isValidCell grid . snd) $ dirCellsAround p
Go away if you still intend to solve it on your own. Otherwise, let’s kick on. The trick is to define a new graph that is (conceptually) made of 4 copies of the original graph. One for every direction. A vertex on this new graph represents the position we’re at on the initial graph, along with our current orientation at this coordinate.
day16 :: Grid -> Dijkstra (Dir, Coord) MinDist
= Dijkstra
day16 grid = bimap (U,) (L,) $ Array.bounds grid
{ bounds = const 0
, startWeight = next
, next
}where
next :: (Dir, Coord) -> MinDist -> [(MinDist, (Dir, Coord))]
=
next (udir, u) _
dirNeighbours grid u& map \dv@(vdir, _) ->
if | vdir == opposite udir -> (infty , dv)
| vdir == udir -> (1 , dv)
| otherwise -> (1001 , dv)
In the transition function, we compute the appropriate weight of the edge depending on whether we make a turn or continue on our trajectory. In case we go back on our footsteps, we put an infinite weight.
The puzzle instructions say we start facing East, so with
direction R
. If we run the instantiation above, we know the
smallest score along a path that puts us in a given direction. To find
out the smallest score possible to reach the end, we take the minimum
across all directions, as we don’t care which way we’re facing by the
end of the path.
answerDay16 :: Grid -> MinDist
=
answerDay16 grid let
Just start = findCoord grid 'S'
Just end = findCoord grid 'E'
= dijkstraFrom (day16 grid) (R, start)
dists in minimum $ map (dists !) $ [U, R, D, L] <&> (,end)
>>> answerDay16 day16example
7036
Multiple shortest paths achieve this score. Part 2 asks us to find the total number of cells traversed by at least one such shortest path.
I think you get the picture. We could use ShortestPaths
,
concat them all and nub. Better even, we could define a weight that
stores a Set Coord
of the cells traversed on all shortest
paths. You get it, it’s weights all the way down.
Advent of Code, Day 18
Last example, because I want to finish this post by the end of the weekend. Day 18 gives us as input a list of coordinates (called bytes).
day18bytes :: [Coord]
=
day18bytes 5,4), (4,2), (4,5), (3,0), (2,1), (6,3), (2,4)
[ (1,5), (0,6), (3,3), (2,6), (5,1), (1,2), (5,5)
, (2,5), (6,5), (1,4), (0,4), (6,4), (1,1), (6,1)
, (1,0), (0,5), (1,6), (2,0)]
, (
day18bounds :: (Coord, Coord)
= ((0, 0), (6, 6)) day18bounds
Every nanosecond that passes by, the next coordinate/byte in the input list falls onto a 2d grid, creating a wall at this coordinate. The more we wait, the more walls fill the grid.
Part 2 asks us the question:
What are the coordinates of the first byte that will prevent the exit from being reachable from your starting position?
In order to find this byte, we need to know when we can no longer reach the exit. What if we were able to find the longest-surviving path to the target? Its lifespan would be the time at which the annoying byte appeared, i.e its index in the input list.
Because we want to be able to efficiently check if there is a wall at a given coordinate at a given time, let’s transform the input list into a grid.
data Cell = Empty | Wall Int deriving Show
type Space = Array Coord Cell
Walls in our grid are annotated with the time offset after which they become activated.
day18grid :: Space
= runSTArray do
day18grid <- newArray day18bounds Empty
grid zip day18bytes [0..]) \(b, k) ->
forM_ (Wall k)
writeArray grid b (pure grid
Now we define a weight that encourages longest-surviving shortest paths.
data Oldest = Oldest
lifespan :: Int
{ distance :: MinDist
,deriving Eq
}
instance Ord Oldest where
Oldest l1 d1 `compare` Oldest l2 d2 =
Down l1, d1) `compare` (Down l2, d2) (
We use Down
in our Ord
instance to flip the
ordering relation on the lifespan
field: paths that survive
longer are considered to have a lesser weight.
When joining two optimal paths together, the resulting path will only survive as long as both of them have, and the distances are added up.
instance Semigroup Oldest where
Oldest l1 d1 <> Oldest l2 d2 =
Oldest (l1 `min` l2) (d1 <> d2)
instance Weight Oldest where infty = Oldest 0 infty
day18 :: Space -> Dijkstra Coord Oldest
= Dijkstra
day18 grid = Array.bounds grid
{ bounds = const $ Oldest maxBound 0
, startWeight = next
, next
}where
next :: Coord -> Oldest -> [(Oldest, Coord)]
Oldest _ d) =
next u (
cellsAround u& filter (inRange $ Array.bounds grid)
& map \v -> case grid ! v of
Empty -> (Oldest maxBound 1 , v)
Wall t | Dist t > d -> (Oldest t 1 , v)
Wall t -> (Oldest 0 infty, v)
And then it’s the same old:
startWeight
: The (empty) path from the start to itself will survive forever.next
:- The edge to an empty cell always exists, and has length 1.
- The edge to an as-of-yet unactivated wall will survive until the wall activates.
- The edge to an already activated wall is dead already, and the vertex is inaccessible.
answerDay18 :: Space -> Coord -> Coord -> Coord
=
answerDay18 grid start end let weight = dijkstraFromTo (day18 grid) start end
in day18bytes !! lifespan weight
>>> answerDay18 day18grid (0, 0) (6, 6)
6,1) (
I find this such an elegant way to solve the problem. And it’s crazy fast too! When I ran it on my input, it computed the answer in about 300µs. I’m not used to trying to optimize Haskell for speed, and I’m sure there are ways to make my implementation more efficient, but still.
Now, for the weird part about the definition of
next
in day18
: how is it that we can check if
a wall has been activated yet when looking for the neighbours
of a cell? It’s unusual, right? This means that the weight of an edge is
allowed to depend on the weight of the optimal path reaching it. Indeed,
we use the length d
of the (shortest) longest-surviving
path to u to infer that d
nanoseconds have elapsed before reaching u.
I think that such a definition of next is equivalent to working on a new graph such that any coordinate on the grid corresponds to infinitely many vertices: the coordinate annotated with the time it took to reach it. Then, the transition function of this new graph brings coordinates annotated with time t to coordinates with time t + 1, with the weights from the original graph. And now the dependency on the weight of the incoming path is gone.
I don’t really have a conclusion on this. It feels like the type signature for our transition function is very permissive, and yet the correctness argument still stands under our assumptions. To be investigated.
Closing thoughts
Here we are. I hope this weekend obsession of mine was interesting to someone. It sure was quite surprising to me that an algorithm I was taught a while back could be applied in a more general context quite easily.
I’m not breaking new grounds by any means. Most people know that the Dijsktra algorithm can be tuned to compute different things. But it’s funny to see that generating the shortest path or all of them can be considered specific instances of the generalized smallest weight problem. I also find it quite easy to mess up the implementation when trying to adapt Dijkstra’s algorithm to my problem of interest. With a single implementation and interface, this doesn’t happen.
Disclaimer: I have done little to no research about whether this generalization has been discussed at large already. I did find a few research papers on routing algorithms over networks that give more algebraic structure to weights. I don’t think they match one to one with what I describe here, because they seem to be interested in more general probems. And I haven’t found anything targeted at a non-scientific audience.
One thing that I still find missing in my generalization is the
complete lack of rules over updateWeight
. In our examples,
we implemented it such that it returned what we cared about — a single
path or all of them — but how do we state this formally? I think there’s
hope in introducing another finer equivalence relation for every
equivalence class produced by \approx.
Then I think updateWeight
should be a commutative
and associative operation with respect to these finer
equivalence relations. Who knows. I certainly don’t.
Update: commenters on Reddit have kindly pointed me to related work:
- Algebraic Path
Finding from Iago Leal de Freitas.
They introduce a general version of the Floyd-Warshall algorithm, over any tropical semiring. - A Very General
Method of Computing Shortest Paths from Russel O’Connor.
Similar to the previous article, in that it showcases a general implementation of the Floyd-Warshall algorithm, over *-semirings. - This research paper discussing semiring frameworks for shortest-distance problems.
It looks like requiring Ord c
and a commutative
updateWeight
always yields a semiring as described above,
by picking x \otimes y = min (x,
\texttt{updateWeight}(x, x \oplus y)), or something along those
lines.
Therefore it looks like the only significant difference with my approach is the fact that I allow the weight of an edge in a graph to depend on the smallest weight to the incoming vertex. Because of how Dijkstra’s algorithm works, this is not a problem. But this kind of edge weight would not be possible if we require \oplus to be associative, and is incompatible with how algorithms like Floyd-Warshall construct shortest paths, by concatenating path segments.
But we crucially rely on this flexibility in Day 18 of AOC!
As always, if you have any feedback, or any additional insight on what’s discussed here, please reach out! In particular, I would love to hear about other useful weird weights that could be defined.
Feel free to discuss this post on reddit.
main :: IO ()
= putStrLn "Thank you for reading!" main
You can use markdown-unlit to compile the markdown file with GHC.↩︎