Go is faster than Python? (an example parsing huge JSON logs)

Recently at work I came across a problem where I had to go through a year’s worth of logs and corelate two different fields across all of our requests. On the good side, we have the logs stored as JSON objects (archived from Datadog which collects them). On the down side… it’s kind of a huge amount of data. Not as much as I’ve dealt with at previous jobs/in some academic problems, but we’re still talking on the order of terabytes.

On one hand, write up a quick Python script, fire and forget. It takes maybe ten minutes to write the code and (for this specific example) half an hour to run it on the specific cloud instance the logs lived on. So we’ll start with that. But then I got thinking… Python is supposed to be super slow right? Can I do better?

(Note: This problem is mostly disk bound. So Python actually for the most part does just fine.)


AoC 2021 Day 23: Amphipodinator

Source: Amphipod

Part 1: Given 4 rooms full of amphipods with various energy costs for movement (a=1, b=10, c=100, d=1000) and a hallway, how much energy does it take (at minimum) to sort the amphipods into their own rooms with the following conditions:


AoC 2021 Day 21: Dicinator

Source: Dirac Dice

Part 1: Play a simple game (describe below) with a loaded D100 (that always rolls 1, 2, 3, … 99, 100, 1, …). Return the score of the losing player times the number of times the die was rolled.


AoC 2021 Day 12: Submarine Spider

Source: Passage Pathing

Part 1: Given a list of edges in a bi-directional graph, count the number of paths from start to end such that nodes named with lowercase letters are visited once, and nodes with uppercase letters can be visited any number of times.


AoC 2017 Day 18: Duetvm

Source: Duet

Part 1: Create a virtual machine with the following instruction set:

  • snd X plays a sound with a frequency equal to the value of X
  • set X Y sets register X to Y
  • add X Y set register X to X + Y
  • mul X Y sets register X to X * Y
  • mod X Y sets register X to X mod Y
  • rcv X recovers the frequency of the last sound played, if X is not zero
  • jgz X Y jumps with an offset of the value of Y, iff X is greater than zero

In most cases, X and Y can be either an integer value or a register.

What is the value recovered by rcv the first time X is non-zero?


AoC 2017 Day 17: Spinlock

Source: Spinlock1

Part 1: Start with a circular buffer containing [0] and current_position = 0. For n from 1 up to 2017:

  1. Step forward steps (puzzle input)
  2. Input the next value for n, set current_position to n, increment n
  3. Repeat

What is the value after 2017?

It’s a bit weird to describe, but the given example helps (assume steps = 3):

0 (1)
0 (2) 1
0  2 (3) 1
0  2 (4) 3  1
0 (5) 2  4  3  1
0  5  2  4  3 (6) 1
0  5 (7) 2  4  3  6  1
0  5  7  2  4  3 (8) 6  1
0 (9) 5  7  2  4  3  8  6  1