AoC 2024 Day 6: Wanderinator

Source: Day 6: Guard Gallivant

Full solution for today (spoilers!).

Part 1

You are given a grid of walls (#), floors (.), and a guard (^, initially facing up/north). The guard walks forward until they run into a wall at which point they turn right. How many tiles does the guard reach before leaving the map.

This is a great opportunity to use that Grid we introduced last time!

#[derive(Debug, Copy, Clone, Default)]
enum Tile {
    #[default]
    Empty,
    Wall,
}

#[derive(Debug, Clone)]
struct Map {
    guard: Point,
    facing: Direction,
    grid: Grid<Tile>,
}

#[aoc_generator(day6)]
fn parse(input: &str) -> Map {
    let grid = Grid::read(input, &|c| match c {
        '.' => Tile::Empty,
        '#' => Tile::Wall,
        '^' => Tile::Empty,
        _ => panic!("Invalid character: {}", c),
    });

    let guard_index = input.find('^').unwrap();

    let per_row = grid.width + 1;
    let guard = Point::new(
        (guard_index % per_row) as i32,
        (guard_index / per_row) as i32,
    );
    let facing = Direction::Up;

    Map {
        guard,
        facing,
        grid,
    }
}

We do introduce two new structure (that I have used in previous years: Point and Direction). I’ve done a few new things, so let’s see how those work.

`Direction`

A Direction is something we need for the facing of the guard. It can be represented as North/South/East/West or Up/Down/Left/Right. For the moment, we’ll go with the latter. It’s mostly just a data structure. Here’s the full code.

#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
pub enum Direction {
    Up,
    Down,
    Left,
    Right,
}

#[allow(dead_code)]
impl Direction {
    pub fn rotate_cw(&self) -> Direction {
        match self {
            Direction::Up => Direction::Right,
            Direction::Right => Direction::Down,
            Direction::Down => Direction::Left,
            Direction::Left => Direction::Up,
        }
    }

    pub fn rotate_right(&self) -> Direction {
        self.rotate_cw()
    }

    pub fn rotate_ccw(&self) -> Direction {
        match self {
            Direction::Up => Direction::Left,
            Direction::Left => Direction::Down,
            Direction::Down => Direction::Right,
            Direction::Right => Direction::Up,
        }
    }

    pub fn rotate_left(&self) -> Direction {
        self.rotate_ccw()
    }
}

`Point`

Currently only a 32-bit number. I may decide to change that down the line when (I doubt this is an if) I need bigger numbers 😄. It’s signed because we quite often need to add/subtract negative points.

The full code is here if you’re curious. Some specific bits:

#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

#[allow(dead_code)]
impl Point {
    pub fn new(x: i32, y: i32) -> Self {
        Self { x, y }
    }

    pub fn manhattan_distance(&self, other: &Self) -> i32 {
        (self.x - other.x).abs() + (self.y - other.y).abs()
    }

    pub fn neighbors(&self) -> [Self; 4] {
        [
            *self + Direction::Up,
            *self + Direction::Down,
            *self + Direction::Left,
            *self + Direction::Right,
        ]
    }
}

Next up, I have standard implementations of Add, AddAssign, etc. But what’s interesting is a few From conversions that I haven’t done before:

impl From<(i32, i32)> for Point {
    fn from((x, y): (i32, i32)) -> Point {
        Point::new(x, y)
    }
}

impl From<(isize, isize)> for Point {
    fn from((x, y): (isize, isize)) -> Point {
        Point::new(x as i32, y as i32)
    }
}

impl From<(usize, usize)> for Point {
    fn from((x, y): (usize, usize)) -> Point {
        Point::new(x as i32, y as i32)
    }
}

This lets me take isize or usize tuples and turn them into Point, which is kind of neat. Of course, if those values are out of range, things will explode… but we’ll just deal with that for the time being!

Otherwise, we also want the ability to turn a Direction into a Point (for adding) or adding them together:

// Interop between points and directions

impl From<Direction> for Point {
    fn from(direction: Direction) -> Point {
        match direction {
            Direction::Up => Point::new(0, -1),
            Direction::Down => Point::new(0, 1),
            Direction::Left => Point::new(-1, 0),
            Direction::Right => Point::new(1, 0),
        }
    }
}

impl std::ops::Add<Direction> for Point {
    type Output = Point;

    fn add(self, rhs: Direction) -> Self::Output {
        self + std::convert::Into::<Point>::into(rhs)
    }
}

Actually solving the problem

Okay, we have the framework out of the way.

To solve it, I’m going to implement a Map::walk function:

impl Map {
    fn walk(&self) -> Grid<bool> {
        let Map {
            mut guard,
            mut facing,
            grid,
        } = self;

        let mut visited = Grid::new(grid.width, grid.height);
        visited.set(guard, true);

        let mut duplicates = hashbrown::HashSet::new();
        duplicates.insert((guard, facing));

        while grid.in_bounds(guard) {
            match grid.get(guard + facing) {
                Some(Tile::Empty) => {
                    guard += facing.into();
                    visited.set(guard, true);
                }
                Some(Tile::Wall) => {
                    facing = facing.rotate_cw();
                }
                None => break,
            }
        }

        visited
    }
}

Essentially, this implements the algorithm described in the problem statement and returns a Grid<bool> of all the points the guard visits. We’ll have a problem if the guard ever starts walking in a loop… but that’s a problem for part 2. 😄

Now that we have that, the actual solution is… count them!

#[aoc(day6, part1, v1)]
fn part1_v1(input: &Map) -> usize {
    input.walk(false).unwrap().iter().filter(|&v| *v).count()
}

That’s it. 😄

I suppose that needs an iter function on Grid (and I implemented an iter_enumerate for good measure that will also return the point for each):

impl Grid {
    pub fn iter(&self) -> impl Iterator<Item = &T> {
        self.data.iter()
    }

    pub fn iter_enumerate(&self) -> impl Iterator<Item = (Point, &T)> {
        self.data
            .iter()
            .enumerate()
            .map(|(i, v)| ((i % self.width, i / self.width).into(), v))
    }
}

Running it:

$ cargo aoc --day 6 --part 1

AOC 2024
Day 6 - Part 1 - v1 : 5551
	generator: 59.375µs,
	runner: 38.458µs

Quick enough for now!

I did have some fun rendering this one:

The colors don’t mean anything, they just look fun.

Part 2

How many points are there such that you can add exactly 1 wall to make the guard walk in a repeating loop?

Told you we’d need to loop!

This does require a small modification to the walk function:

impl Map {
    fn walk(&self, check_loops: bool) -> Option<Grid<bool>> {
        let Map {
            mut guard,
            mut facing,
            grid,
        } = self;

        let mut visited = Grid::new(grid.width, grid.height);
        visited.set(guard, true);

        let mut duplicates = hashbrown::HashSet::new();
        duplicates.insert((guard, facing));

        while grid.in_bounds(guard) {
            match grid.get(guard + facing) {
                Some(Tile::Empty) => {
                    guard += facing.into();
                    visited.set(guard, true);
                }
                Some(Tile::Wall) => {
                    facing = facing.rotate_cw();
                }
                None => break,
            }

            if check_loops {
                if duplicates.contains(&(guard, facing)) {
                    return None;
                }
                duplicates.insert((guard, facing));
            }
        }

        Some(visited)
    }
}

I did leave it optional, since Hash isn’t free. So if we don’t think we need it, just don’t use it. Part 1 with check_loops=true:

# With check_loops=false
$ cargo aoc --day 6 --part 1

AOC 2024
Day 6 - Part 1 - v1 : 5551
	generator: 59.375µs,
	runner: 38.458µs

# With check_loops=true
$ cargo aoc --day 6

AOC 2024
Day 6 - Part 1 - v1 : 5551
	generator: 56.875µs,
	runner: 159.667µs

Significant for such a small change.

For the first round on this problem, let’s brute force it. Try every single point on the grid (even if it’s already a wall!). Run walk and check if we get None:

// For each point on the grid, check if adding a wall there would create a loop
#[aoc(day6, part2, v1)]
fn part2_v1(input: &Map) -> usize {
    iproduct!(0..input.grid.width, 0..input.grid.height)
        .filter(|&(x, y)| {
            // The 'visited' function returns None on loops (no path found)
            let mut input = input.clone();
            input.grid.set((x, y), Tile::Wall);
            input.walk(true).is_none()
        })
        .count()
}

iproduct! is a macro from itertools that basically will generate each point for me in a clean way.

Anyways, how does it do?

$ cargo aoc --day 6 --part 2

AOC 2024
Day 6 - Part 2 - v1 : 1939
	generator: 50.583µs,
	runner: 1.584841459s

More than a second?! That cannot stand!

First optimization? We don’t need to check every point.

After that, we’re also doing a clone every iteration of the loop, which I suppose we can avoid by undoing each change. I’ll have to try that.

It is nice to have an answer to verify against though.

Optimization 1: Only checking the path

Okay, first optimization. Because the guard only changes their facing (and thus may introduce a loop) when they run into a wall, we can optimize by only putting walls where they might have some impact:

// Only check adding walls to the original path
// We don't have to check adjacent since you have to 'run into' a wall to turn
#[aoc(day6, part2, limited)]
fn part2_limited(input: &Map) -> usize {
    let visited = input.walk(false).unwrap();
    iproduct!(0..input.grid.width, 0..input.grid.height)
        .filter(|&(x, y)| {
            let p = Point::from((x, y));
            if visited.get(p) == Some(&true) {
                let mut input = input.clone();
                input.grid.set(p, Tile::Wall);
                input.walk(true).is_none()
            } else {
                false
            }
        })
        .count()
}

For this, we’re going to run the algorithm one extra time first (basically part 1), to get all the points on the path. Then, we filter out any points that are not on the path and only check the rest.

As mentioned in the comment, we don’t have to check points adjacent to the path (I originally) did, since if you are ‘beside’ the original path, it won’t turn.

This is a bit more than 3x faster:

$ cargo aoc --day 6 --part 2

AOC 2024
Day 6 - Part 2 - v1 : 1939
	generator: 50.583µs,
	runner: 1.584841459s

Day 6 - Part 2 - limited : 1939
	generator: 54.084µs,
	runner: 377.670083ms

Not too bad, but we can do better!

Optimization 2: Rayon parallelization

This problem fits pretty well into the embarrassingly parallel category. Since we’re cloning, we can check every single point at the same time (if we have enough CPU cores) and just add them up. Enter rayon: easy to use (mostly) drop in parallelization!

// Add rayon parallelization
#[aoc(day6, part2, limited_rayon)]
fn part2_limited_rayon(input: &Map) -> usize {
    let visited = input.walk(false).unwrap();
    iproduct!(0..input.grid.width, 0..input.grid.height)
        .par_bridge()
        .into_par_iter()
        .map(|(x, y)| {
            let p = Point::from((x, y));

            if visited.get(p) == Some(&true) {
                let mut input = input.clone();
                input.grid.set(p, Tile::Wall);
                if input.walk(true).is_none() {
                    1
                } else {
                    0
                }
            } else {
                0
            }
        })
        .sum::<usize>()
}

Okay, it’s not quite a free change. Because there’s a bit of a change in API from a standard Iter to the Itertools iproduct!, we need to add a par_bridge to convert. After that, you appearnly can’t filter in rayon, so instead we’ll map and sum. But the general idea is the same!

$ cargo aoc --day 6 --part 2

AOC 2024
Day 6 - Part 2 - limited : 1939
	generator: 54.084µs,
	runner: 377.670083ms

Day 6 - Part 2 - limited_rayon : 1939
	generator: 50.667µs,
	runner: 44.247666ms

Another 10x speedup! Which is mostly due to the machine I’m running on. On a single core machine, it would probably run if anything a bit slower. But I like it, so in it stays!

Optimization 3: Avoiding clone

One last thing to look at: clone. In both of the above cases, we clone the input on every iterations. We can’t actually fix that in the parallel rayon case (because we don’t want to mutate data in one thread that another might be trying to use), but we can for the single threaded case.

The basic idea: before running the inner walk, set the Wall. Then when we’re done, unset it!

That’s really it:

// Try without cloning the input (more than once)
#[aoc(day6, part2, no_clone)]
fn part2_limited_no_clone(input: &Map) -> usize {
    let mut input = input.clone();

    let visited = input.walk(false).unwrap();
    iproduct!(0..input.grid.width, 0..input.grid.height)
        // Any points not on or adjacent to original path cannot introduce a loop
        .filter(|&(x, y)| {
            let p = Point::from((x, y));
            if visited.get(p) == Some(&true) {
                input.grid.set((x, y), Tile::Wall);
                let result = input.walk(true).is_none();
                input.grid.set((x, y), Tile::Empty);
                result
            } else {
                false
            }
        })
        .count()
}

Unfortunately:

$ cargo aoc --day 6 --part 2

AOC 2024
Day 6 - Part 2 - v1 : 1939
	generator: 50.583µs,
	runner: 1.584841459s

Day 6 - Part 2 - limited : 1939
	generator: 54.084µs,
	runner: 377.670083ms

Day 6 - Part 2 - limited_rayon : 1939
	generator: 50.667µs,
	runner: 44.247666ms

Day 6 - Part 2 - no_clone : 1939
	generator: 39.209µs,
	runner: 376.155625ms

It’s actually pretty much exactly the same speed as the initial limited case. This one is dominated by the cost of the Hash in loop detection. Still, it was an interesting idea I suppose.

Optimization 4: No hash

Speaking of which… didn’t we say that Hash is expensive? (Even with hashbrown. It’s worse with the built in hash function, since it has to make additional security guarantees).

Instead of a duplicates HashSet, let’s create 4 Grid<bool>, one for each direction:

impl Map {
    fn walk(&self, check_loops: bool) -> Option<Grid<bool>> {
        let Map {
            mut guard,
            mut facing,
            grid,
        } = self;

        let mut visited = Grid::new(grid.width, grid.height);
        visited.set(guard, true);

        let mut duplicates_up = Grid::new(grid.width, grid.height);
        duplicates_up.set(guard, true);

        let mut duplicates_left = Grid::new(grid.width, grid.height);
        let mut duplicates_right = Grid::new(grid.width, grid.height);
        let mut duplicates_down = Grid::new(grid.width, grid.height);

        while grid.in_bounds(guard) {
            match grid.get(guard + facing) {
                Some(Tile::Empty) => {
                    guard += facing.into();
                    visited.set(guard, true);
                }
                Some(Tile::Wall) => {
                    facing = facing.rotate_cw();
                }
                None => break,
            }

            if check_loops {
                let duplicates = &mut match facing {
                    Direction::Up => &mut duplicates_up,
                    Direction::Left => &mut duplicates_left,
                    Direction::Right => &mut duplicates_right,
                    Direction::Down => &mut duplicates_down,
                };

                if duplicates.get(guard) == Some(&true) {
                    return None;
                }
                duplicates.set(guard, true);
            }
        }

        Some(visited)
    }
}

That… is slightly ugly. But was it worth it?

# hashmap 
$ cargo aoc --day 6 --part 2

AOC 2024
Day 6 - Part 2 - v1 : 1939
	generator: 50.583µs,
	runner: 1.584841459s

Day 6 - Part 2 - limited_rayon : 1939
	generator: 50.667µs,
	runner: 44.247666ms

# Grid<bool>
$ cargo aoc --day 6 --part 2

Day 6 - Part 2 - v1 : 1939
	generator: 45.666µs,
	runner: 409.640417ms

Day 6 - Part 2 - limited_rayon : 1939
	generator: 52µs,
	runner: 13.176792ms

Huh… that’s pretty good. Even the original solution is sub half a second now!

Worth it!

Benchmarks

Overall benchmarks:

# hashmap
cargo aoc bench --day 6

Day6 - Part1/v1             time:   [18.340 µs 18.428 µs 18.521 µs]
Day6 - Part2/v1             time:   [1.6568 s 1.6652 s 1.6735 s]
Day6 - Part2/limited        time:   [392.86 ms 394.18 ms 395.57 ms]
Day6 - Part2/limited_rayon  time:   [45.311 ms 45.548 ms 45.807 ms]
Day6 - Part2/no_clone       time:   [413.63 ms 416.32 ms 419.08 ms]

# Grid<bool>
$ cargo aoc bench --day 6

Day6 - Part1/v1             time:   [19.739 µs 19.906 µs 20.106 µs]
Day6 - Part2/v1             time:   [428.06 ms 429.59 ms 431.26 ms]
Day6 - Part2/limited        time:   [95.354 ms 95.921 ms 96.571 ms]
Day6 - Part2/limited_rayon  time:   [12.414 ms 12.441 ms 12.467 ms]
Day6 - Part2/no_clone       time:   [97.974 ms 98.957 ms 100.32 ms]