Dec 22, 2024

Generalized Dijkstra in Haskell

This years’ Advent of Code has loads of 2D grids, and makes you traverse them all to find paths of various kinds. At some point I had to implement Dijkstra’s algorithm, in Haskell. In trying to make my implementation reusable for the following days, I realized that Dijkstra’s algorithm is actually way more general than I remembered (or was taught)! In short, weights don’t have to be real-valued!

In this post, I describe a general interface for the algorithm, such that we can implement it exactly once and use it to compute many different things.

A primer on Dijkstra’s algorithm
1. The shortest path problem
2. Dijkstra’s algorithm
Taking a step back, generalizing
Abstract Haskell interface and implementation
Instantiating the interface
Closing thoughts

This article is a literate Haskell file, so feel free to download it and try it ¹ for yourself! As such, let’s get a few imports and language extensions out of the way:

Haskell Bookkeeping

{-# LANGUAGE GHC2021 #-}
{-# LANGUAGE BlockArguments #-}
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE DerivingVia #-}
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE MultiWayIf #-}

import Control.Monad (when, foldM, forM_)
import Control.Monad.ST (ST, runST)
import Data.Array (Array, (!), listArray, range)
import Data.Array qualified as Array (bounds)
import Data.Array.MArray (newArray, freeze, readArray, writeArray)
import Data.Array.ST (STArray, runSTArray)
import Data.Bifunctor (bimap)
import Data.Function ((&))
import Data.Functor ((<&>))
import Data.Ix (Ix, inRange)
import Data.Kind (Type)
import Data.List (transpose, find)
import Data.Ord (Down(Down))
import Data.Semigroup (Arg(Arg))
import Data.Set (Set)
import Data.Set qualified as Set

A primer on Dijkstra’s algorithm

Let’s recall the problem statement that Dijkstra’s algorithm solves, and the pseudo-code of the algorithm. If you know that already, you can skip to the next section.

The shortest path problem

It’s been a while since I had to use any formalism to talk about graphs proper. So I will be using the notations from the Cormen book that I just looked up for a refresher.

Consider a weighted directed graph G = (V, E, w).

V denotes the set of vertices.
E \subseteq V \times V denotes the set of edges.
Every edge e \in E has an associated (non-negative) weight w(e) \in \mathbb{R}, w(e) > 0.

We call path a sequence of vertices such that there is an edge between every consecutive vertex in the sequence. If we denote \text{paths}(a, b) the set of paths from a to b in G, this means p = \langle v_0, \dots, v_k \rangle \in \text{paths}(a, b) if v_0 = a, v_k = b and \forall 0 ≤ i < k, (v_i, v_{i + 1}) \in E.

We can define the weight of a path as the sum of the weights of its constituent edges.

w(p) = \sum_{i = 1}^k{w(v_{i - 1}, v_i)}

Shortest-paths problems ask questions along the lines of:

What is the minimum weight from a to b in G?
What is one path from a to b with minimum weight?
Can we find all such paths from a to b?

If you interpret the weight as a physical distance, this amounts to finding the shortest trip from one vertex to the other.

Dijkstra’s algorithm is an infamous technique for solving the single-source shortest-paths problem: finding a shortest path from a given source vertex s to every other vertex v \in V. It’s essentially a generalization of Breadth-First Search (BFS) to (positively) weighted graphs. And it’s pretty fast!

Dijkstra’s algorithm

It’s really “just” an extension of BFS to graphs with different non-negative weights. So it traverses vertices in the graph in the order of their shortest path distance from the root. Instead of using a regular queue like in BFS, or a stack like in DFS, Dijsktra’s algorithm uses a priority queue to store discovered vertices for which we have yet to find the shortest distance. Their priority in the queue is the current distance along the current known shortest paths in the graph.

The algorithm can be written as such:

for (u ∈ V) u.distance = ∞
start.distance = 0

let pqueue = new PositivityQueue
pqueue.insert(start, 0)

while (!pqueue.isEmpty()) {
  u = pqueue.popLowest()

  for (v ∈ V, (u, v) ∈ E) {
    let distThroughU = u.distance + w(u, v)
    if (distThroughU < v.distance) {
      // we've found a new shorter path to v (going through u)
      if (v.distance == ∞):
        // first time we see v
        // it isn't in the queue yet
        pqueue.insert(v, distThroughU)
      }
      else {
        pqueue.update(v, distThroughU)
      }
      v.distance = distThroughU
    }
  }
}

By the end of the algorithm, u.distance contains the shortest distance/weight from u to the start, for every u.

Let’s look past the proof of correctness.

Taking a step back, generalizing

One thing to notice in the problem statement from earlier and its accompanying algorithm is that weights have very little to do with real numbers. In fact, they don’t have to be scalars at all!

If we denote W the set of weights, the algorithm merely requires:

An equivalence relation (\cdot \approx \cdot) \subseteq W \times W on weights. It doesn’t have to be definitional equality!
A total order on weights, that is: (\cdot\leq\cdot) \subseteq W \times W such that it is transitive, reflexive and anti-symmetric (w.r.t \approx).
A way to add weights together (\cdot \oplus \cdot) ∷ W \rightarrow W \rightarrow W such that:
- \approx is compatible with \oplus: equivalent inputs lead to equivalent sums:
  
  \forall a, b, c, d \in W, a \approx b, c \approx d \Rightarrow a \oplus c \approx b \oplus d
- \leq is compatible with \oplus: it must be order-preserving over both its arguments.
  
  \forall a, b, c, d \in W, a \leq b, c \leq d \Rightarrow a \oplus c \leq b \oplus d
- \oplus is associative. I don’t think it’s a requirement for Dijkstra’s algorithm but it’s a nice property to have, and makes writing sums unambiguously. And it is probably necessary for Floyd-Warshall.
To initialize the algorithm, we require a default “largest” weight: an absorbing element \infty for \oplus. It follows that it is an upper bound of W.
The algorithm also needs a neutral element 0_u for \oplus, the initial weight at the start vertex u\in V. It follows that it is a lower bound of W.

If we summarize, it looks like (W/\approx, \oplus, 0_u) should be a monoid, totally ordered by \leq and with absorbing element \infty. I think this encompasses all the properties stated above, and nothing more, but I haven’t looked that deeply into the formalism.

The restriction that edges must have non-negative weights can simply be reworded as weights having to be strictly larger than the identity element.

\forall e \in E, w(e) > 0_u

Now we can freely redefine the weight of a path:

w(p) = \bigoplus_{i = 1}^k{w(v_{i - 1}, v_i)}

Equipped with this toolkit, we can state the single-source shortest-path problem again: for a given source vertex s \in V, how do we compute the smallest weight achievable on a path from s to any other vertex e \in V?

Without too much surprise, Dijkstra’s algorithm works just as well. I will not prove correctness here, though I probably should.

Abstract Haskell interface and implementation

Weights

Given the requirements on weights we established earlier, we can try to map each of them to their corresponding Haskell counterpart.

Weights should have an equivalence relation: that’s Eq.
Weights should have a total order: that’s Ord.
Weights should have an associative addition operation that respects the order: that’s Semigroup.

Sadly we’re not using Agda, and can’t enforce that the order relation and the semigroup operation be compatible using the type system. We’ll just have to be careful when defining the instances.

Therefore, a Weight should provide instances for all three classes above.

class (Eq a, Ord a, Semigroup a) => Weight a where
  infty :: a

  updateWeight :: a -> a -> a
  updateWeight = const

infty is the absorbing element of W. As stated earlier, it must be an upper bound of W.

But what is this updateWeight operation here? It is used to merge equivalent weights. Indeed, during the execution of the Dijkstra algorithm, in the relaxation phase, we may find that the weight of going to v by passing through u is equal to the cost we have already computed for v. Because we haven’t decreased the weight, we shouldn’t update the priority of v in the queue, however it can often still be useful to take those new paths into account.

That’s what this function is for. The only requirement for updateWeight is that the output should be in the same equivalence class as its (equivalent) inputs.

\forall w, w' \in W, w \approx w' \Rightarrow \texttt{updateWeight}(w, w') \approx w \approx w'

As a convention, the first argument is the already computed weight, and the second argument is the newly discovered (equivalent) weight along the new path(s) through u.

The resulting weight won’t change the priority of v in the queue, and the order of traversal. But the new information is now accounted for.

The default implementation for updateWeight discards the new weight entirely. This is quite common, say, if we only want to find “a shortest path”, and not every one of them.

Graphs

Now that we know what weights are, we need to describe what kind of graphs are suitable for our Dijkstra algorithm.

data Dijkstra i c = Dijkstra
    { bounds      :: (i, i)
    , startWeight :: i -> c
    , next        :: i -> c -> [(c, i)]
    }

So, let’s expand a bit on the fields of the interface.

bounds describes the lower and upper bound of V. This is just an implementation detail: I want to store intermediate weights in a mutable array during the traversal, for efficiency purposes. So I need to know the size of V.

If you cannot reasonnably enumerate all vertices, you can drop the bounds field and use a purely-functional persistent Map instead in the implementation.
initCost returns the initial cost we use for a given start vertex. It must always be an identity element of W, and a lower bound of W.

\forall s \in V, w \in W, \texttt{startCost}(s) \oplus w \approx w \oplus \texttt{startCost}(s) \approx w \forall s \in V, w \in W, \texttt{startCost}(s) \leq w

Concretely, this means that rather than have a single identity 0 \in W, we have one for every vertex. By anti-symmetry of the ordering relation they are all equivalent anyway. This is very useful to store information about the starting vertex in the weight. Say, if we’re computing paths, we initially store a 0-length path containing only the starting vertex.
And finally, the bread and butter of the graph: a transition function next. For any vertex u and its associated weight w, next u w returns the neighbours of u, with the weight of the edges. As discussed earlier, weight of edges must be strictly larger than 0.

One may wonder why we take as input the weight of u, and indeed it is weird. Most reasonable transition functions ignore it. But this means you can define funky graphs where the weight of an edge depends on the minimal weight to get there from a specific source. I think it is perfectly fine w.r.t the assumptions of the Dijkstra algorithm, though knowing exactly what kind of graph this corresponds to is a bit more tedious.

I show one such example where I rely on this input weight later on.

And here we have it! A description of graphs that can serve as input for the Dijkstra algorithm to solve the single-source shortest-path problem.

Note that this interface is completely agnostic to how we encode our graphs, so long as we can extract a transition function from this underlying representation.

Generic Dijkstra implementation

Finally. It is time. We can implement the Dijkstra algorithm. But first we need a priority queue, with the following interface:

type PQueue :: Type -> Type

emptyQ     :: PQueue a
singletonQ :: Ord a => a -> PQueue a
insertQ    :: Ord a => a -> PQueue a -> PQueue a

pattern EmptyQ :: PQueue a
pattern (:<)   :: Ord a => a -> PQueue a -> PQueue a

For simplicity, let’s just use a wrapper around Data.Set.

PQueue implementation

newtype PQueue a = PQueue (Set a)

emptyQ = PQueue Set.empty
singletonQ = PQueue . Set.singleton
insertQ x (PQueue s) = PQueue (Set.insert x s)

minView :: PQueue a -> Maybe (a, PQueue a)
minView (PQueue s) =
  case Set.minView s of
    Just (x, s') -> Just (x, PQueue s')
    Nothing      -> Nothing

pattern EmptyQ   <- (minView -> Nothing)
pattern (:<) x q <- (minView -> Just (x, q))

I haven’t tried existing implementations available on Hackage yet, I should get around to it at some point. It also looks like I may want a priority search queue, so that I can really update the priority for a given key.

At last, the implementation for Dijkstra’s algorithm:

dijkstraFrom :: (Ix i, Weight c) => Dijkstra i c -> i -> Array i c
dijkstraFrom d s = dijkstra d s Nothing

dijkstraFromTo :: (Ix i, Weight c) => Dijkstra i c -> i -> i -> c
dijkstraFromTo d s e = dijkstra d s (Just e) ! e

dijkstra
  :: (Ix i, Weight c) 
  => Dijkstra i c -> i -> Maybe i 
  -> Array i c
dijkstra (Dijkstra{..} :: Dijkstra i c) start cutoff = runSTArray do
  costs <- newArray bounds infty
  let zero  = startWeight start
      queue = singletonQ (zero, start)
  writeArray costs start zero
  aux costs queue
  pure costs

  where

  aux :: STArray s i c -> PQueue (c, i) -> ST s ()
  aux costs EmptyQ = pure ()
  aux costs ((uWeight, u) :< queue) | Just u == cutoff = pure ()
  aux costs ((uWeight, u) :< queue) = do
    uWeight' <- readArray costs u
    when (uWeight == uWeight') do
      let edges = next u uWeight'
      !queue <- foldM (relaxNeighbour costs uWeight') queue edges
      aux costs queue

  relaxNeighbour
    :: STArray s i c -> c -> PQueue (c, i) -> (c, i) 
    -> ST s (PQueue (c, i))
  relaxNeighbour costs uWeight !queue (uvWeight, v) = do
    let !vWeight = uWeight <> uvWeight
    vWeight' <- readArray costs v
    case vWeight `compare` vWeight' of
      GT -> pure queue
      EQ -> do
        writeArray costs v $ updateWeight vWeight' vWeight
        pure queue
      LT -> do
        writeArray costs v vWeight
        pure $ insertQ (vWeight, v) queue

Et voilà! Quite straightforward. Short and sweet.

Perhaps the most perplexing bit is the when (uWeight == uWeight') block when we pop the queue, and the fact that we use uWeight' (the “old” weight) afterwards. Both of those are necessary because of our priority queue interface. Because we don’t have a proper way to update the priority/weight of a vertex in the queue, we instead insert it again with a lower priority.

But this means that we will find the vertex several times in the queue afterward, and always with a weight strictly larger than what we know to be the minimal weight. Adding this check enforces that we process them only once. Likewise, when we merge weights, we don’t touch the priority queue, so the merged weight is only stored in our array, wheras the (equivalent weight) in the queue is outdated.

All this to say: yes, I know my interface for a priority queue is not very good, and I should really just switch to a priority search queue or something.

Instantiating the interface

Cool cool cool, we have a general Dijkstra implementation now, for any weight with the right structure. But what can we do with it? Time to figure this out.

Let’s define a few utilities for working with 2d grids, and a small grid that we will traverse in all the follow-up examples.

type Coord = (Int, Int)
type Grid  = Array Coord Char

-- check if a cell is in bounds and empty
isValidCell :: Grid -> Coord -> Bool
isValidCell grid p =
  inRange (Array.bounds grid) p && grid ! p /= '#'

-- get the coordinates around a cell
cellsAround :: Coord -> [Coord]
cellsAround (x, y) =
    [ (x - 1, y    )
    , (x + 1, y    )
    , (x    , y - 1)
    , (x    , y + 1)
    ]

-- get empty cells in bounds around a cell
neighbours :: Grid -> Coord -> [Coord]
neighbours grid p =
  filter (isValidCell grid) $ cellsAround p

-- get the coordinates of a given char in the grid
findCoord :: Grid -> Char -> Maybe Coord
findCoord grid c =
  range (Array.bounds grid)
  & find \p -> grid ! p == c

toGrid :: (Coord, Coord) -> [String] -> Grid
toGrid bounds = listArray bounds . concat . transpose

example :: Grid
example = toGrid ((0, 0), (14, 14))
  [ "###############"
  , "#.......#.....#"
  , "#.#.###.#.###.#"
  , "#.....#.#...#.#"
  , "#.###.#####.#.#"
  , "#.#.#.......#.#"
  , "#.#.#####.###.#"
  , "#...........#.#"
  , "###.#.#####.#.#"
  , "#...#.....#.#.#"
  , "#.#.#.###.#.#.#"
  , "#.....#...#.#.#"
  , "#.###.#.#.#.#.#"
  , "#...#.....#...#"
  , "###############"
  ]

startPos, endPos :: Coord
startPos = (1,1)
endPos   = (13,13)

Minimum distance

Our simplest example consists in trying to compute the length of the shortest path between two vertices. We define our weight, in this case either an integer for the minimum distance, or \infty for unreachable vertices.

data MinDist = Dist !Int | Infinite deriving (Eq, Ord)

instance Semigroup MinDist where
  Infinite <> _        = Infinite
  _        <> Infinite = Infinite
  Dist x   <> Dist y   = Dist (x + y)

instance Monoid MinDist where mempty = Dist 0
instance Weight MinDist where infty  = Infinite

instance Show MinDist where
  show Infinite = "∞"
  show (Dist d) = show d

Let’s also define a partial Num MinDist to be able to create weights using integer literals.

Num MinDist

instance Num MinDist where
  (+) = (<>)
  (-) = error "not supported"
  (*) = error "not supported"
  abs = id
  signum = const 1
  fromInteger = Dist . fromInteger

Then we instantiate the interface. All sources have a default weight of 0, and all edges have a weight of 1.

minDist :: Grid -> Dijkstra Coord MinDist
minDist grid = Dijkstra
  { bounds      = Array.bounds grid
  , startWeight = const 0
  , next        = \u _ -> (1,) <$> neighbours grid u
  }

getMinDist :: Grid -> Coord -> Coord -> MinDist
getMinDist = dijkstraFromTo . minDist

>>> getMinDist example startPos endPos
24

Shortest path

Nice, but what if we want to get an actual shortest path? The answer, again, is to introduce a custom weight.

data ShortestPath = SPath MinDist [Coord]
  deriving Show

The goal is to compute the minimal ShortestPath weight to a vertex such that it contains the path from the source to the vertex, in reverse order. So the vertex appears at the beginning of the list, followed by all the vertices until the source.

Given that we only want to find a shortest path, we can put paths with the same length in the same equivalence class, and compare paths only by looking at their length. To add shortest paths together, we concatenate them. Because the paths are stored in reverse order, the concatenation is flipped.

instance Eq ShortestPath where
  SPath x _ == SPath y _ = x == y

instance Ord ShortestPath where
  SPath x _ `compare` SPath y _ = compare x y

instance Semigroup ShortestPath where
  SPath x xs <> SPath y ys = SPath (x <> y) (ys <> xs)

And here we have it: a new weight for computing a shortest path!

instance Weight ShortestPath where
  infty = SPath infty []

shortestPath :: Grid -> Dijkstra Coord ShortestPath
shortestPath grid = Dijkstra
  { bounds      = Array.bounds grid
  , startWeight = \s -> SPath 0 [s]
  , next        = \u _ -> neighbours grid u
                            & map \v -> (SPath 1 [v], v)
  }

getShortestPath :: Grid -> Coord -> Coord -> ShortestPath
getShortestPath = dijkstraFromTo . shortestPath

Note that this time the inital (identity) weight does depend on the source vertex, since the shortest path from the source always contains the source.

>>> getShortestPath example startPos endPos
SPath 25 [(13,13),(12,13),(11,13),...,(1,4),(1,3),(1,2),(1,1)]

Plotting it on the grid, with cells visited by the math marked with O:

###############
#O......#.....#
#O#.###.#.###.#
#O....#.#...#.#
#O###.#####.#.#
#O#.#.......#.#
#O#.#####.###.#
#OOOOOOOOOOO#.#
###.#.#####O#.#
#...#.....#O#.#
#.#.#.###.#O#.#
#.....#...#O#.#
#.###.#.#.#O#.#
#...#.....#OOO#
###############

Shortest paths (all of them)

Now what if we want all the shortest paths? Simple, we define a new weight!

data ShortestPaths = SPaths MinDist [[Coord]] deriving Show

The only difference with ShortestPath is that we store multiple paths. But we compare the weights in the exact same way.

instance Eq ShortestPaths where
  SPaths x _ == SPaths y _ = x == y

instance Ord ShortestPaths where
  SPaths x _ `compare` SPaths y _ = compare x y

When adding two weights together, we have to concatenate every possible pair of paths from both weights:

instance Semigroup ShortestPaths where
  SPaths x xss <> SPaths y yss =
    SPaths (x <> y) [ys <> xs | xs <- xss, ys <- yss]

Finally, we make sure to update the set of paths if we find another (distinct but equivalent) set of paths with the same length:

instance Weight ShortestPaths where
  infty = SPaths infty []

  updateWeight (SPaths l xss) (SPaths _ yss) =
    SPaths l (xss <> yss)

shortestPaths :: Grid -> Dijkstra Coord ShortestPaths
shortestPaths grid = Dijkstra
  { bounds      = Array.bounds grid
  , startWeight = \s -> SPaths 0 [[s]]
  , next        = \u _ -> neighbours grid u 
                            & map \v -> (SPaths 1 [[v]], v)
  }

getShortestPaths :: Grid -> Coord -> Coord -> ShortestPaths
getShortestPaths = dijkstraFromTo . shortestPaths

>>> getShortestPaths example startPos endPos
SPaths 24 [[(13,13),(12,13),(11,13),...,(1,4),(1,3),(1,2),(1,1)]
          ,[(13,13),(12,13),(11,13),...,(2,3),(1,3),(1,2),(1,1)]
          ,[(13,13),(12,13),(11,13),...,(3,2),(3,1),(2,1),(1,1)]]

There are exactly 3 shortest paths on the grid.

Those are perhaps the 3 most common ways in which the Dijsktra algorithm is used. It’s already pretty neat that we can use a single implementation to compute them all. But so far all our edges have the same weight, so a BFS would have been sufficient.

Let’s look at fancier instantiations, all taken from my solutions to some Advent of Code puzzles.

Advent of Code, Day 16

Day 16 asks us to find the minimal score to go from S to E in a maze.

day16example :: Grid
day16example = 
  listArray ((0, 0), (14, 14)) $ concat $ transpose
  [ "###############"
  , "#.......#....E#"
  , "#.#.###.#.###.#"
  , "#.....#.#...#.#"
  , "#.###.#####.#.#"
  , "#.#.#.......#.#"
  , "#.#.#####.###.#"
  , "#...........#.#"
  , "###.#.#####.#.#"
  , "#...#.....#.#.#"
  , "#.#.#.###.#.#.#"
  , "#.....#...#.#.#"
  , "#.###.#.#.#.#.#"
  , "#S..#.....#...#"
  , "###############"
  ]

Every step along the way from one cell to its neighbour increases the score by 1. But here’s the twist: every left or right turn along the path adds an extra 1000 to the score.

data Dir = U | R | D | L deriving (Eq, Ord, Enum, Ix)

Dir utils

opposite :: Dir -> Dir
opposite dir =
  case dir of
    U -> D
    R -> L
    D -> U
    L -> R

dirCellsAround :: Coord -> [(Dir, Coord)]
dirCellsAround = zip [L, R, U, D] . cellsAround

dirNeighbours :: Grid -> Coord -> [(Dir, Coord)]
dirNeighbours grid p =
  filter (isValidCell grid  . snd) $ dirCellsAround p

Go away if you still intend to solve it on your own. Otherwise, let’s kick on. The trick is to define a new graph that is (conceptually) made of 4 copies of the original graph. One for every direction. A vertex on this new graph represents the position we’re at on the initial graph, along with our current orientation at this coordinate.

day16 :: Grid -> Dijkstra (Dir, Coord) MinDist
day16 grid = Dijkstra
  { bounds      = bimap (U,) (L,) $ Array.bounds grid
  , startWeight = const 0
  , next        = next
  }
  where 
  next :: (Dir, Coord) -> MinDist -> [(MinDist, (Dir, Coord))]
  next (udir, u) _ =
    dirNeighbours grid u
    & map \dv@(vdir, _) -> 
        if | vdir == opposite udir -> (infty , dv)
           | vdir == udir          -> (1     , dv)
           | otherwise             -> (1001  , dv)

In the transition function, we compute the appropriate weight of the edge depending on whether we make a turn or continue on our trajectory. In case we go back on our footsteps, we put an infinite weight.

The puzzle instructions say we start facing East, so with direction R. If we run the instantiation above, we know the smallest score along a path that puts us in a given direction. To find out the smallest score possible to reach the end, we take the minimum across all directions, as we don’t care which way we’re facing by the end of the path.

answerDay16 :: Grid -> MinDist
answerDay16 grid =
  let
    Just start = findCoord grid 'S'
    Just end   = findCoord grid 'E'
    dists = dijkstraFrom (day16 grid) (R, start)
  in minimum $ map (dists !) $ [U, R, D, L] <&> (,end)

>>> answerDay16 day16example
7036

Multiple shortest paths achieve this score. Part 2 asks us to find the total number of cells traversed by at least one such shortest path.

I think you get the picture. We could use ShortestPaths, concat them all and nub. Better even, we could define a weight that stores a Set Coord of the cells traversed on all shortest paths. You get it, it’s weights all the way down.

Advent of Code, Day 18

Last example, because I want to finish this post by the end of the weekend. Day 18 gives us as input a list of coordinates (called bytes).

day18bytes :: [Coord]
day18bytes =
  [ (5,4), (4,2), (4,5), (3,0), (2,1), (6,3), (2,4)
  , (1,5), (0,6), (3,3), (2,6), (5,1), (1,2), (5,5)
  , (2,5), (6,5), (1,4), (0,4), (6,4), (1,1), (6,1)
  , (1,0), (0,5), (1,6), (2,0)]

day18bounds :: (Coord, Coord)
day18bounds = ((0, 0), (6, 6))

Every nanosecond that passes by, the next coordinate/byte in the input list falls onto a 2d grid, creating a wall at this coordinate. The more we wait, the more walls fill the grid.

Part 2 asks us the question:

What are the coordinates of the first byte that will prevent the exit from being reachable from your starting position?

In order to find this byte, we need to know when we can no longer reach the exit. What if we were able to find the longest-surviving path to the target? Its lifespan would be the time at which the annoying byte appeared, i.e its index in the input list.

Because we want to be able to efficiently check if there is a wall at a given coordinate at a given time, let’s transform the input list into a grid.

data Cell  = Empty | Wall Int deriving Show
type Space = Array Coord Cell

Walls in our grid are annotated with the time offset after which they become activated.

day18grid :: Space
day18grid = runSTArray do
  grid <- newArray day18bounds Empty
  forM_ (zip day18bytes [0..]) \(b, k) ->
    writeArray grid b (Wall k)
  pure grid

Now we define a weight that encourages longest-surviving shortest paths.

data Oldest = Oldest
  { lifespan :: Int
  , distance :: MinDist
  } deriving Eq

instance Ord Oldest where
  Oldest l1 d1 `compare` Oldest l2 d2 =
    (Down l1, d1) `compare` (Down l2, d2)

We use Down in our Ord instance to flip the ordering relation on the lifespan field: paths that survive longer are considered to have a lesser weight.

When joining two optimal paths together, the resulting path will only survive as long as both of them have, and the distances are added up.

instance Semigroup Oldest where
  Oldest l1 d1 <> Oldest l2 d2 =
    Oldest (l1 `min` l2) (d1 <> d2)

instance Weight Oldest where infty = Oldest 0 infty

day18 :: Space -> Dijkstra Coord Oldest
day18 grid = Dijkstra
  { bounds      = Array.bounds grid
  , startWeight = const $ Oldest maxBound 0
  , next        = next
  }
  where
  next :: Coord -> Oldest -> [(Oldest, Coord)]
  next u (Oldest _ d) =
    cellsAround u
    & filter (inRange $ Array.bounds grid)
    & map \v -> case grid ! v of
        Empty               -> (Oldest maxBound 1    , v)
        Wall t | Dist t > d -> (Oldest t        1    , v)
        Wall t              -> (Oldest 0        infty, v)

And then it’s the same old:

startWeight: The (empty) path from the start to itself will survive forever.
next:
- The edge to an empty cell always exists, and has length 1.
- The edge to an as-of-yet unactivated wall will survive until the wall activates.
- The edge to an already activated wall is dead already, and the vertex is inaccessible.

answerDay18 :: Space -> Coord -> Coord -> Coord
answerDay18 grid start end =
  let weight = dijkstraFromTo (day18 grid) start end
  in day18bytes !! lifespan weight

>>> answerDay18 day18grid (0, 0) (6, 6)
(6,1)

I find this such an elegant way to solve the problem. And it’s crazy fast too! When I ran it on my input, it computed the answer in about 300µs. I’m not used to trying to optimize Haskell for speed, and I’m sure there are ways to make my implementation more efficient, but still.

Now, for the weird part about the definition of next in day18: how is it that we can check if a wall has been activated yet when looking for the neighbours of a cell? It’s unusual, right? This means that the weight of an edge is allowed to depend on the weight of the optimal path reaching it. Indeed, we use the length d of the (shortest) longest-surviving path to u to infer that d nanoseconds have elapsed before reaching u.

I think that such a definition of next is equivalent to working on a new graph such that any coordinate on the grid corresponds to infinitely many vertices: the coordinate annotated with the time it took to reach it. Then, the transition function of this new graph brings coordinates annotated with time t to coordinates with time t + 1, with the weights from the original graph. And now the dependency on the weight of the incoming path is gone.

I don’t really have a conclusion on this. It feels like the type signature for our transition function is very permissive, and yet the correctness argument still stands under our assumptions. To be investigated.

Closing thoughts

Here we are. I hope this weekend obsession of mine was interesting to someone. It sure was quite surprising to me that an algorithm I was taught a while back could be applied in a more general context quite easily.

I’m not breaking new grounds by any means. Most people know that the Dijsktra algorithm can be tuned to compute different things. But it’s funny to see that generating the shortest path or all of them can be considered specific instances of the generalized smallest weight problem. I also find it quite easy to mess up the implementation when trying to adapt Dijkstra’s algorithm to my problem of interest. With a single implementation and interface, this doesn’t happen.

Disclaimer: I have done little to no research about whether this generalization has been discussed at large already. I did find a few research papers on routing algorithms over networks that give more algebraic structure to weights. I don’t think they match one to one with what I describe here, because they seem to be interested in more general probems. And I haven’t found anything targeted at a non-scientific audience.

One thing that I still find missing in my generalization is the complete lack of rules over updateWeight. In our examples, we implemented it such that it returned what we cared about — a single path or all of them — but how do we state this formally? I think there’s hope in introducing another finer equivalence relation for every equivalence class produced by \approx. Then I think updateWeight should be a commutative and associative operation with respect to these finer equivalence relations. Who knows. I certainly don’t.

Update: commenters on Reddit have kindly pointed me to related work:

Algebraic Path Finding from Iago Leal de Freitas.
They introduce a general version of the Floyd-Warshall algorithm, over any tropical semiring.
A Very General Method of Computing Shortest Paths from Russel O’Connor.
Similar to the previous article, in that it showcases a general implementation of the Floyd-Warshall algorithm, over *-semirings.
This research paper discussing semiring frameworks for shortest-distance problems.

It looks like requiring Ord c and a commutative updateWeight always yields a semiring as described above, by picking x \otimes y = min (x, \texttt{updateWeight}(x, x \oplus y)), or something along those lines.

Therefore it looks like the only significant difference with my approach is the fact that I allow the weight of an edge in a graph to depend on the smallest weight to the incoming vertex. Because of how Dijkstra’s algorithm works, this is not a problem. But this kind of edge weight would not be possible if we require \oplus to be associative, and is incompatible with how algorithms like Floyd-Warshall construct shortest paths, by concatenating path segments.

But we crucially rely on this flexibility in Day 18 of AOC!

As always, if you have any feedback, or any additional insight on what’s discussed here, please reach out! In particular, I would love to hear about other useful weird weights that could be defined.

Feel free to discuss this post on reddit.

main :: IO ()
main = putStrLn "Thank you for reading!"

You can use markdown-unlit to compile the markdown file with GHC.↩︎