A Quick Look at RC4

In cryptography work, RC4 (Rivest Cipher 4) is well known as both one of the easiest to implement and fastest to run symmetric encryption algorithms. Unfortunately, over time there have been a number of attacks on RC4, both in poorly written protocols (such as in the case of WEP) or statistical attacks against the protocol itself.

Still, for how well it formed, it’s an amazingly simple algorithm, so I decided to try my hand at implementing it.

read more...


Performance problems with Flask and Docker

I had an interesting problem recently on a project I was working on. It’s a simple Flask-based webapp, designed to be deployed to AWS using Docker. The application worked just fine when I was running it locally, but as soon as I pushed the docker container…

Latency spikes. Bad enough that the application was failing AWS’s healthy host checks, cycling in and out of existence1:

read more...


ts: Timestamping stdout

Loving data as much as I do, I like to optimize things. To make sure I’m actually going the right way, it’s useful to time things. While it’s trivial in most languages to add timing, it’s even easier if you don’t have to.

read more...


Updating dotfiles

After all of these updates to my dotfiles, I finally want something that I can use to keep them up to date. For that, let’s write a quick script that can do just that.

read more...


Regex search and replace

Another random task that I find myself doing distressingly often: performing a regular expression search and replace recursively across a bunch of files. You can do this relatively directly with tools like sed, but I can never quite remember the particularly flavor of regular expression syntax sed uses.

read more...


Gorellian sorting

It’s been a while, so I figured I should get in a quick coding post. From /r/dailyprogrammer, we have this challenge:

The Gorellians, at the far end of our galaxy, have discovered various samples of English text from our electronic transmissions, but they did not find the order of our alphabet. Being a very organized and orderly species, they want to have a way of ordering words, even in the strange symbols of English. Hence they must determine their own order.

For instance, if they agree on the alphabetical order: UVWXYZNOPQRSTHIJKLMABCDEFG

Then the following words would be in sorted order based on the above alphabet order: WHATEVER ZONE HOW HOWEVER HILL ANY ANTLER COW

read more...


Command line user agent parsing

Quite often when working with internet data, you will find yourself wanting to figure out what sort of device users are using to access your content. Luckily, if you’re using HTTP, there is a standard for that: The user-agent header.

Since I’m in exactly that position, I’ve added a new script to my Dotfiles that reads user agents on stdin, parses them, and writes them back out in a given format.

read more...


Combining sort and uniq

A fairly common set of command line tools (at least for me) is to combine sort and uniq to get a count of unique items in a list of unsorted data. Something like this:

$ find . -type 'f' | rev | cut -d "." -f "1" | rev | sort | uniq -c | sort -nr | head

2649 htm
1458 png
 993 cache
 612 jpg
 135 css
 102 zip
  99 svg
  60 gif
  45 js
  27 pdf

read more...