Advent of Code: Day 22

Source

Part 1: Simulate an RPG mage battle; finding the winning solution using the least mana. See the original writeup for more details.

First, let’s create some simple abstractions for the players:

class Entity(dict):
    '''Represent a damagable entity such as the player or a boss'''

    def __init__(self, **kwargs):
        for key, val in kwargs.items():
            self[key] = val

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except:
            return 0

    def damage(self, points):
        '''Apply damage to this entity; minimum damage is always 1'''

        self['Hit Points'] -= max(1, points - self['Armor'])

    def tick_active_spells(self, target):
        '''Apply all active spells to the target, remove any that have expired.'''

        if self['Active Spells']:
            for spell in list(self['Active Spells']):
                self['History'].append(str(spell))
                spell.tick(self, target)
                spell.Duration -= 1
                if spell.Duration <= 0:
                    self['History'].append('{} fades'.format(spell.__class__.__name__))
                    spell.fade(self, target)
                    self['Active Spells'].remove(spell)

Basically, we have an extended dict that will default values to 0, includes a method for applying damage while taking armor into account, and applying an active spells to a given player. That will make more sense once you see the way spells are defined:

class Spell(dict):
    '''
    Create a spell. Spells have `Cost` mana and last `Duration` turns.

    cast() is called when a spell is first cast
    tick() is called each turn (for Duration > 0)
    fade() is called when a spell runs out of duration
    '''


    Cost = float("inf")
    Duration = 0

    def __init__(self):
        self['Duration'] = self.__class__.Duration

    def cast(self, caster, target):
        pass

    def tick(self, caster, target):
        pass

    def fade(self, caster, target):
        pass

    def __repr__(self):
        return '{}({})'.format(self.__class__.__name__, self.Duration)

    def __eq__(self, other):
        return self.__class__.__name__ == other.__class__.__name__

    def __hash__(self):
        return hash(self.__class__.__name__)

class MagicMissle(Spell):
    Cost = 53

    def cast(self, caster, target):
        target.damage(4)

class Drain(Spell):
    Cost = 73

    def cast(self, caster, target):
        target.damage(2)
        caster['Hit Points'] += 2

class Shield(Spell):
    Cost = 113
    Duration = 6

    def cast(self, caster, target):
        caster['Armor'] += 7

    def fade(self, caster, target):
        caster['Armor'] -= 7

class Poison(Spell):
    Cost = 173
    Duration = 6

    def tick(self, caster, target):
        target.damage(3)

class Recharge(Spell):
    Cost = 229
    Duration = 5

    def tick(self, caster, target):
        caster['Mana Points'] += 101

spells = [MagicMissle, Drain, Shield, Poison, Recharge]

As noted in the comment for the Spell class, there are two interesting fields (Cost is the mana cost and Duration is how long an ongoing spell will last) and three functions that can be overridden. cast will be called when the spell is first cast, tick will be called each turn it runs for ongoing spells, and fade will be called when an ongoing spell runs out of time. That will let us encode the five spells in the problem statement.

Finally, we can load the player and boss:

boss = lib.Entity()
for line in sys.stdin:
    key, val = line.strip().split(': ')
    boss[key] = int(val)

player = lib.Entity(**{
    'Hit Points': 50,
    'Mana Points': 500,
    'Active Spells': [],
    'History': [],
})

Now we have everything to solve the problem. My first take at a solution using a priority queue based on the mana spent. That means that as soon as we find a solution where boss['Hit Points'] <= 0, we have the minimal solution:

queue_breaker = 0

states = queue.PriorityQueue()
states.put((0, queue_breaker, player, boss))

best_player = {'Mana Spent': float('inf')}

# This will be used to break ties in the queue since Entities are not orderable
queue_breaker += 1

while not states.empty():
    score, _, player, boss = states.get()

    # If we win, because of the priority queue, this is the best solution
    if boss['Hit Points'] <= 0:
        return player

    # Player died, no point in continuing on this track
    if player['Hit Points'] <= 0:
        continue

    # --- Player's turn ---
    player = copy.deepcopy(player)
    boss = copy.deepcopy(boss)
    player['History'].append('>> Player Turn <<')
    player.tick_active_spells(boss)

    # Branch (see the copy below) to applying each possible spell for the player's turn
    for potential_spell in lib.spells:
        if player['Mana Points'] < potential_spell.Cost:
            continue

        spell = potential_spell()
        if spell in player['Active Spells']:
            continue

        current_player = copy.deepcopy(player)
        current_boss = copy.deepcopy(boss)

        # Cast the player's new spell
        current_player['Mana Points'] -= potential_spell.Cost
        current_player['Mana Spent'] += potential_spell.Cost
        spell.cast(current_player, current_boss)

        if spell['Duration']:
            current_player['Active Spells'].append(spell)

        current_player['History'].append(str(spell))

        # --- Boss's turn ---
        current_player['History'].append('>> Boss Turn <<')
        current_player.tick_active_spells(current_boss)
        current_player.damage(current_boss['Damage'])

        # Store the altered copies back in the queue
        states.put((current_player['Mana Spent'], queue_breaker, current_player, current_boss))
        queue_breaker += 1

Most of the code is spent getting the order of events exactly correct. It’s a bit weird, but it does work in the end (I think I rewrote exactly this code a dozen times and it finally worked…). The only problem with this solution: the search space is huge. I let it run for rather a while and it simulated literally millions of states still without finding a final solution. I need to cut that down.

My next trial was to tweak the scoring algorithm. By doing this, we lose the ability to return immediately once we have a solution, but we gain the ability to find a solution quickly and then throw out any solutions that would be worse than that one. We only have to make a few tweaks to the above code:

best_player = {'Mana Spent': float('inf')}

while not states.empty():
    score, _, player, boss = states.get()

    # If we win, because of the priority queue, this is the best solution
    if boss['Hit Points'] <= 0:
        if early_exit:
            return player
        elif player['Mana Spent'] < best_player['Mana Spent']:
            print('New best mana spent:', player['Mana Spent'])
            best_player = player
            continue

    ...

    # Store the altered copies back in the queue
    score = scoring_function(current_player, current_boss)
    states.put((score, queue_breaker, current_player, current_boss))
    queue_breaker += 1

return best_player

Interestingly, this converges very quickly (a few seconds) on my correct solution, then spends a (long) while making sure it’s correct. In earlier (incorrect) simulations, it would find a few increasingly good solutions before finally ending up at a steady state.

This runs much more quickly, but guaranteeing that we have a correct solution is still difficult. Instead, let’s try a quick Monte Carlo simulation:

def random_spells():
    while True:
        yield random.choice(lib.spells)

class GameOverException(Exception):
    def __init__(self, player_won, reason):
        self.player_won = player_won
        self.reason = reason

def check_game_over(player, boss):
    if boss['Hit Points'] <= 0:
        raise GameOverException(True, 'boss died')

    if player['Hit Points'] <= 0:
        raise GameOverException(False, 'player died')

def fight(player, boss, spell_iterator):
    while True:
        check_game_over(player, boss)

        # --- Player turn ---
        player.tick_active_spells(boss)
        check_game_over(player, boss)

        for i, potential_spell in enumerate(spell_iterator):
            if i >= 10:
                raise GameOverException(False, 'failed to cast 10 spells')

            if potential_spell.Cost > player['Mana Points']:
                continue

            spell = potential_spell()
            if spell in player['Active Spells']:
                continue

            player['History'].append('Player casts {}'.format(potential_spell.__name__))
            player['Mana Points'] -= potential_spell.Cost
            player['Mana Spent'] += potential_spell.Cost

            spell.cast(player, boss)
            check_game_over(player, boss)

            if spell.Duration:
                player['Active Spells'].append(spell)

            break

        # --- Boss turn ---
        player.tick_active_spells(boss)
        check_game_over(player, boss)

        player.damage(boss['Damage'])
        check_game_over(player, boss)

def monte_carlo(player, boss, timeout = TIME_TO_RUN):
    start = time.time()
    best_player = {'Mana Spent': float('inf')}
    simulations = 0
    wins = 0

    while True:
        if time.time() - start > TIME_TO_RUN:
            break

        simulations += 1
        current_boss = copy.deepcopy(boss)
        current_player = copy.deepcopy(player)

        try:
            fight(current_player, current_boss, random_spells())
        except GameOverException as game_over:
            if game_over.player_won:
                wins += 1
                if current_player['Mana Spent'] < best_player['Mana Spent']:
                    print('New best:', current_player['Mana Spent'])
                    best_player = current_player

    return simulations, wins, best_player

Basically, just fire off random spells (trying up to 10 times in a given round to account for running low on mana and no duplicates) until one player wins (I’m using a try catch block to handle that so that I can check for winners more cleanly than in the first solution). If it’s the player, see if we spent less mana than any solution we’ve found thus far. Rinse and repeat. Running it for a minute, it seems to find the best solution after roughly 2-3 minutes of running on my laptop. So not great, but at least an alternative.

Part 2: On each of the player’s turns (not on the boss’s turns), the player loses 1 HP. Find the winning combination of spells that uses the least mana.

Based on the way that I structured the code, this is actually as easy as adding a new spell with unlimited Duration:

class HardMode(Spell):
    Duration = float('inf')

    def tick(self, caster, target):
        self.toggle = not getattr(self, 'toggle', False)
        if self.toggle:
            caster.damage(1)

There’s a bit of weirdness there to make sure that it only runs half as often as most spells (since we don’t have half hit points), but this works perfectly. We then start the player with this as an active spell:

player = lib.Entity(**{
    'Hit Points': 50,
    'Mana Points': 500,
    'Active Spells': [lib.HardMode()],
    'History': [],
})

And that’s it. The rest of the simulation is identical. Since HardMode isn’t in the lib.spells list, it won’t get cast by either solution. Even if we wanted to we couldn’t because of the limitation of only one of each ongoing spell at a time.

This actually makes the problem a bit harder computationally. There were already not that many combinations of spells that would win in the first case and there are even fewer this time around. I ran the Monte Carlo simulation several times for five minutes each without it randomly stumbling on a valid ordering that works for this case. I could probably have tweaked the generation algorithm to be a bit smarter, but it wasn’t necessary. The priority queue solution with the boss HP weighting found the solution quickly enough.

Since the code for this one is a bit more complicated, feel free to check it out on GitHub to see the whole picture: GitHub:jpverkamp/advent-of-code. I’ve been (and will continue to) uploading my solutions there, but previously I’ve just directly included the entire code in order in the posts.

That all being said, I honestly think this was my least favorite problem of them thus far. It was interesting in that it actually mattered what algorithm you chose to solve it (I imagine that a recursive solution with memoization could be even faster), but the implementation details were just way too fiddly. As mentioned earlier, I wrote out more or less exactly the same algorithms a dozen times before I finally had one that actually returned the correct answer (for the most part, they were finding solutions that were too low). So it goes.

I look forward to the final three problems!