Source: Day 5: If You Give A Seed A Fertilizer
Full solution for today (spoilers!)
Part 1
You are given a set of initial values (seeds) and a series of range maps (where a range of numbers
src..src+len
maps todst..dst+len
). Apply each range map in tur, return the lowest resulting value.
Types
Okay, let’s start with types:
#[derive(Debug, Copy, Clone, PartialEq)]
pub enum Category {
Seed,
Soil,
Fertilizer,
Water,
Light,
Temperature,
Humidity,
Location,
}
// A range mapping; defines a range from src..=(src+len) to dst..=(dst+len)
#[derive(Debug)]
pub struct RangeMap {
pub src: u64,
pub dst: u64,
pub len: u64,
}
impl RangeMap {
// If x is in the source range, map to the destination
pub fn apply(&self, x: u64) -> Option<u64> {
if x < self.src || x >= self.src + self.len {
None
} else {
Some(self.dst + x - self.src)
}
}
}
#[derive(Debug)]
pub struct CategoryMap {
pub src_cat: Category,
pub dst_cat: Category,
pub range_maps: Vec<RangeMap>,
}
impl CategoryMap {
pub fn apply(&self, x: u64) -> u64 {
self.range_maps
.iter()
.find_map(|range_map| range_map.apply(x))
.unwrap_or(x)
}
}
#[derive(Debug)]
pub struct Simulation {
pub seeds: Vec<u64>,
pub category_maps: Vec<CategoryMap>,
}
As mentioned, a RangeMap
takes a value x
and if it’s in the range src..src+len
(inclusive), it will map it.
A category map combines several of these. The Category
fields end up not particular mattering (they’re always ordered in the input) but they were useful enough for debugging. The interesting thing to note here is that the range_maps
in a CategoryMap
are defined as non-overlapping.
So you basically need to find which range (if any) the input value falls in and apply that map. If none do, the input is preserved. This is why RangeMap::apply
returns Option<u64>
. If
So you basically need to find which range (if any) the input value falls in and apply that map. If none do, the input is preserved. This is why RangeMap::apply
returns Option<u64>
. The find_map
in CategoryMap::apply
will then automatically find the first RangeMap
to return Some(...)
(or use the unwrap_or
to default).
Finally, the Simulation
stores the original seed
values plus the ordered list of CategoryMaps
.
Parsing
Okay, we know how we want the data to look, so how do we get it in that form?
nom
.
fn category(s: &str) -> IResult<&str, Category> {
alt((
map(tag("seed"), |_| Seed),
map(tag("soil"), |_| Soil),
map(tag("fertilizer"), |_| Fertilizer),
map(tag("water"), |_| Water),
map(tag("light"), |_| Light),
map(tag("temperature"), |_| Temperature),
map(tag("humidity"), |_| Humidity),
map(tag("location"), |_| Location),
))(s)
}
fn range_map(s: &str) -> IResult<&str, RangeMap> {
let (s, (dst, src, len)) = tuple((
complete::u64,
preceded(space0, complete::u64),
preceded(space0, complete::u64),
))(s)?;
Ok((s, RangeMap { src, dst, len }))
}
fn category_map(s: &str) -> IResult<&str, CategoryMap> {
let (s, (src_cat, dst_cat)) = separated_pair(
category,
tag("-to-"),
terminated(category, terminated(preceded(space1, tag("map:")), newline)),
)(s)?;
let (s, range_maps) = separated_list1(newline, range_map)(s)?;
Ok((
s,
CategoryMap {
src_cat,
dst_cat,
range_maps,
},
))
}
pub fn simulation(s: &str) -> IResult<&str, Simulation> {
let (s, seeds) = delimited(
preceded(tag("seeds:"), space1),
separated_list1(space1, complete::u64),
many1(newline),
)(s)?;
let (s, range_maps) = separated_list1(many1(newline), category_map)(s)?;
let (s, _) = many0(newline)(s)?;
Ok((s, Simulation { seeds, range_maps }))
}
I enjoy nom
.
Solution
Okay, so we have a Simulation
. Should be easy to solve it?
fn main() -> Result<()> {
let stdin = io::stdin();
let input = io::read_to_string(stdin.lock())?;
let (s, simulation) = parse::simulation(&input).unwrap();
assert_eq!(s, "");
let (cat, values) = simulation.category_maps.iter().fold(
(Category::Seed, simulation.seeds),
|(cat, values), range_map| {
assert_eq!(cat, range_map.src_cat);
(
range_map.dst_cat,
values.iter().map(|x| range_map.apply(*x)).collect(),
)
},
);
assert_eq!(cat, Category::Location);
let result = values.iter().min().unwrap();
println!("{result}");
Ok(())
}
Not bad. And it at least verifies that the CategoryMaps
are in the right order!
Keen eyed observers may note that I’ve changed my general solution to solving these problems. Now rather than a part1
and part2
function, I have a separate bin
for each part. This let’s me write up alternate solutions. I’ll write it up (at some point!)
Part 2
Treat each pair of input values as a range. So if the first two inputs to part one were
79 14
, now you have79..=79+14
.
Solution 1: Brute Force
Okay, so the obvious(ish) problem here is that we’re got way more input to deal with. Instead of 20 input values, we not have ~billions.
Let’s try it anyways!
fn main() -> Result<()> {
let stdin = io::stdin();
let input = io::read_to_string(stdin.lock())?;
let (s, mut simulation) = parse::simulation(&input).unwrap();
assert_eq!(s, "");
// Replace seeds with ranges
simulation.seeds = simulation
.seeds
.chunks(2)
.flat_map(|lo_hi| (lo_hi[0]..=(lo_hi[0] + lo_hi[1])).collect::<Vec<_>>())
.collect::<Vec<_>>();
let (cat, values) = simulation.category_maps.iter().fold(
(Category::Seed, simulation.seeds),
|(cat, values), range_map| {
assert_eq!(cat, range_map.src_cat);
(
range_map.dst_cat,
values.iter().map(|x| range_map.apply(*x)).collect(),
)
},
);
assert_eq!(cat, Category::Location);
let result = values.iter().min().unwrap();
println!("{result}");
Ok(())
}
We’re literally chunk(2)
to pull out two values at a time, generating the Ranges
, flat_map
ing them to values, and then running the same program.
And it’s not actually terrible. At least not in release mode.
$ time just run 5 2-brute
cat data/$(printf "%02d" 5).txt | cargo run --release -p day$(printf "%02d" 5) --bin part2-brute
Finished release [optimized] target(s) in 0.01s
Running `target/release/part2-brute`
136096660
just run 5 2-brute 89.57s user 9.88s system 96% cpu 1:42.89 total
Yes. I’ll explain my Justfile
as well. But as you can see, it runs. And it’s only a minute and a half. For brute forcing the simulation billions of times… it’s not bad.
But we can do better!
Solution 2: Parallel Brute Force
Okay, only slightly better. Let’s throw rayon
at the problem. In a nutshell, it makes trivial parallelism (which we certainly have here) easy to implement.
The only change is this line:
values.par_iter().map(|x| range_map.apply(*x)).collect(),
And the only change in that is .iter()
to .par_iter()
. Rayon does the rest.
$ time just run 5 2-brute-par
cat data/$(printf "%02d" 5).txt | cargo run --release -p day$(printf "%02d" 5) --bin part2-brute-par
Finished release [optimized] target(s) in 0.01s
Running `target/release/part2-brute-par`
136096660
just run 5 2-brute-par 106.71s user 15.79s system 627% cpu 19.509 total
That… doesn’t seem right.
date +%T && time just run 5 2-brute-par && date +%T
22:38:17
cat data/$(printf "%02d" 5).txt | cargo run --release -p day$(printf "%02d" 5) --bin part2-brute-par
Finished release [optimized] target(s) in 0.04s
Running `target/release/part2-brute-par`
136096660
just run 5 2-brute-par 105.90s user 15.90s system 583% cpu 20.886 total
22:38:38
Yeah… that isn’t right at all, it’s closer to 20 seconds. I expect it’s summing the time across the parallel workers? Weird.
Edit: Based on this StackExchange answer it’s actually the zsh
(which I’m using) time
versus the bash
one. You can get better behavior with TIMEFMT
or command time
, but…
We’re doing Rusty things anyways, let’s install and use hyperfine.
$ hyperfine 'just run 5 2-brute'
Benchmark 1: just run 5 2-brute
Time (mean ± σ): 101.821 s ± 1.130 s [User: 88.594 s, System: 9.401 s]
Range (min … max): 100.571 s … 103.679 s 10 runs
$ hyperfine 'just run 5 2-brute-par'
Benchmark 1: just run 5 2-brute-par
Time (mean ± σ): 20.431 s ± 0.550 s [User: 106.872 s, System: 16.119 s]
Range (min … max): 19.815 s … 21.537 s 10 runs
There we go. ~5 times faster. But we can do better!
How?
Solution 3: Treat the Ranges as … Ranges
So despite how cool that .chunks(2).flat_map(...).collect
looks, as mentioned, it’s really kind of terrible, multiplying the amount of work a billion times (or so). But really, we don’t have to do this.
If, instead, we treat the ranges as ranges, there are really one a handful of cases of how the range (in a RangeMap
) can overlap the input range:
- The entire input range is lower than the
RangeMap
- The input range starts below but includes some of the
RangeMap
- The input range is contained in the
RangeMap
- The input range starts in the
RangeMap
and goes off the top end - The input range is entirely higher than the
RangeMap
- The input range is larger and completely contains the
RangeMap
If you look at this a different way, there are a maximum of three sub-input ranges:
- The part below the
RangeMap
- The part overlapping the
RangeMap
- And the part above the
RangeMap
When we extend this to a CategoryMap
, the first and third cases are the case where input
is sent to the next RangeMap
(as before), while the middle case is the one that’s actually mapped and done.
So how does this actually turn into code?
impl RangeMap {
// Apply over an input range
// Returns three optional ranges:
// 1. The portion of the original range below self's range
// 2. The portion of the original range overlapping self's range mapped to destination
// 3. The portion of the original range above self's range
#[allow(clippy::type_complexity)]
pub fn apply_range(
&self,
input: RangeInclusive<u64>,
) -> (
Option<RangeInclusive<u64>>,
Option<RangeInclusive<u64>>,
Option<RangeInclusive<u64>>,
) {
let (input_start, input_end) = input.clone().into_inner();
let src_end = self.src + self.len - 1;
let below = if input_start < self.src {
Some(input_start..=self.src.saturating_sub(1).min(input_end))
} else {
None
};
let overlap = if input_end >= self.src && input_start <= src_end {
let overlap_start = input_start.max(self.src);
let overlap_end = input_end.min(src_end);
Some((self.dst + overlap_start - self.src)..=(self.dst + overlap_end - self.src))
} else {
None
};
let above = if input_end > src_end {
Some(src_end.saturating_add(1).max(input_start)..=input_end)
} else {
None
};
(below, overlap, above)
}
}
impl CategoryMap {
pub fn apply_range(&self, input: RangeInclusive<u64>) -> Vec<RangeInclusive<u64>> {
let mut ranges = vec![input.clone()];
let mut result = vec![];
for range_map in self.range_maps.iter() {
let mut unchanged = vec![];
// Mapped ranges are ready to return
// Anything else passes to the next range map
for range in ranges.iter() {
let (below, overlap, above) = range_map.apply_range(range.clone());
if let Some(below) = below {
unchanged.push(below);
}
if let Some(overlap) = overlap {
result.push(overlap);
}
if let Some(above) = above {
unchanged.push(above);
}
}
ranges.clear();
ranges.append(&mut unchanged);
}
// Any unchanged ranges after all maps are returned
result.append(&mut ranges);
result
}
}
I’m not going to lie, the code for this is certainly uglier.
The magic happens in the middle of CategoryMap::apply_range
, where the below
and above
lists are pushed to unchanged
(and sent to the next RangeMap
), but overlap
goes straight to the result
.
You might think that we’d want to do some deduplication/merging of ranges here, but really, it’s not that bad. With 10 or so steps, splitting a maximum of three times, we’re still dealing with well well less than a billion entries (and most don’t overlap anyways).
The comments help a bit, but you really have to decide if performance or clarity is what you need.
To actually use these though, that’s not that bad (still longer):
fn main() -> Result<()> {
let stdin = io::stdin();
let input = io::read_to_string(stdin.lock())?;
let (s, simulation) = parse::simulation(&input).unwrap();
assert_eq!(s, "");
// Replace seeds with ranges
let ranges = simulation
.seeds
.chunks(2)
.map(|lo_hi| lo_hi[0]..=(lo_hi[0] + lo_hi[1]))
.collect::<Vec<_>>();
let (cat, values) =
simulation
.category_maps
.iter()
.fold((Category::Seed, ranges), |(cat, values), range_map| {
assert_eq!(cat, range_map.src_cat);
(
range_map.dst_cat,
values
.iter()
.flat_map(|r| range_map.apply_range(r.clone()))
.collect(),
)
});
assert_eq!(cat, Category::Location);
assert_eq!(cat, Category::Location);
let result = values
.iter()
.map(|r| r.clone().min().unwrap())
.min()
.unwrap();
println!("{result}");
Ok(())
}
As before, we replace the original seeds, but this time we keep RangeInclusives
. Then we fold
as before, using apply_range
instead.
So, does it actually speed up the code?
$ hyperfine 'just run 5 2'
Benchmark 1: just run 5 2
Time (mean ± σ): 82.8 ms ± 4.9 ms [User: 30.9 ms, System: 12.6 ms]
Range (min … max): 79.0 ms … 103.6 ms 28 runs
Why yes. Yes it does.
It’s not microseconds. But it’s awfully fast.
Performance
So how did all the solutions compare?
Solution | Time |
---|---|
Part 1 | 82.4 ms ± 5.6 ms |
Part 2 (Brute Force) | 101.821 s ± 1.130 s |
Part 2 (Parallel Brute Force) | 20.431 s ± 0.550 s |
Part 2 (Ranges) | 82.8 ms ± 4.9 ms |
So… we’re not in the microsecond range anymore. But I think that milliseconds are still absolutely fine. Especially with the ~1000x speedup between Brute Force and Ranges.
And I learned about a new fun tool (hyperfine) along the way!
I’m good with this.
Onward!