The earliest memory I have of ‘programming’ is from the early to mid 90s, when my father brought home a computer from work. We could play games on it … so of course I took the spreadsheet program he used (Lotus 1-2-3, did I date myself with that?) and tried to modify it to print out a helpful message for him. It … halfway worked? At least I could undo it so he could get back to work…

After that, I picked up programming for real in QBasic (I still have a few of those programs lying around), got my own (junky) Linux desktop from my cousin, tried to learn Visual Basic (without a Windows machine), and eventually made it to high school… In college, I studied computer science and mathematics, mostly programming in Java/.NET, although with a bit of everything in the mix. A few of my oldest programming posts on this blog are from that time.

After that, on to grad school! Originally, I was going to study computational linguistics, but that fell through. Then programming languages (the school’s specialty). And finally I ended up studying censorship and computer security. That’s about where I am today!

But really, I still have a habit of doing a little bit of everything. Whatever seems interesting at the time!

Novel compression

2014-05-19

Last week on /r/dailyprogrammer, there was a neat trio of posts all about a new compression algorithm:

More specifically, we’re going to represent compressed text with the following rules:

If the chunk is just a number (e.g. 37), word number 37 from the dictionary (zero-indexed, so 0 is the 1st word) is printed lower-case.
If the chunk is a number followed by a caret (e.g. 37^), then word 37 from the dictionary will be printed lower-case, with the first letter capitalised.
If the chunk is a number followed by an exclamation point (e.g. 37!), then word 37 from the dictionary will be printed upper-case.
If it’s a hyphen (-), then instead of putting a space in-between the previous and next words, put a hyphen instead.
If it’s any of the following symbols: . , ? ! ; : (edit: also ’ and “), then put that symbol at the end of the previous outputted word.
If it’s a letter R (upper or lower), print a new line.
If it’s a letter E (upper or lower), the end of input has been reached.
edit: any other block of text, represent as a literal ‘word’ in the dictionary
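
To make the rules above concrete, here is a minimal decompression sketch in Python. It is not the solution from the full source; the function name `decompress` and its arguments (a list of dictionary words plus one whitespace-separated chunk string) are assumptions made for illustration, and the quote characters added in the later edit are left out.

```python
import re

# Punctuation chunks that attach to the previously printed word.
PUNCTUATION = {".", ",", "?", "!", ";", ":"}


def decompress(dictionary, chunks):
    """Expand a whitespace-separated chunk string using the rules above.

    `dictionary` is a list of words; `chunks` is the compressed text.
    Both names are hypothetical, chosen for this sketch.
    """
    out = []   # pieces of output text, joined at the end
    sep = ""   # separator to emit before the next word
    for chunk in chunks.split():
        match = re.fullmatch(r"(\d+)([\^!]?)", chunk)
        if match:
            word = dictionary[int(match.group(1))].lower()
            if match.group(2) == "^":
                word = word.capitalize()
            elif match.group(2) == "!":
                word = word.upper()
            out.append(sep + word)
            sep = " "
        elif chunk == "-":
            sep = "-"            # join previous and next word with a hyphen
        elif chunk in PUNCTUATION:
            out.append(chunk)    # attach to the previous word
            sep = " "
        elif chunk in ("R", "r"):
            out.append("\n")
            sep = ""             # no leading space on a new line
        elif chunk in ("E", "e"):
            break                # end of input
        else:
            raise ValueError(f"unrecognised chunk: {chunk!r}")
    return "".join(out)
```

For example, with the dictionary `["hello", "world", "well", "known"]`, the chunks `0^ , 1 ! R 2 - 3 . E` expand to two lines: `Hello, world!` and `well-known.`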

Got it? Let’s go!

(If you’d like to follow along: full source)

Trigonometric Triangle Trouble

2014-05-02

Yesterday’s post at /r/dailyprogrammer managed to pique my interest¹:

A triangle on a flat plane is described by its angles and side lengths, and you don’t need all of the angles and side lengths to work out everything about the triangle. (This is the same as last time.) However, this time, the triangle will not necessarily have a right angle. This is where more trigonometry comes in. Break out your trig again, people.
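
As a rough illustration of the trigonometry involved, here is a small Python sketch for one case only: two sides and the angle between them are known. The excerpt does not show the challenge's input format, so the function name `solve_sas` and its parameters are assumptions. It uses the law of cosines, c² = a² + b² − 2ab·cos(γ), and the fact that the angles sum to 180°.

```python
import math


def solve_sas(a, b, gamma_deg):
    """Solve a triangle from sides a, b and the included angle gamma (degrees).

    Returns (c, alpha_deg, beta_deg), where c is the side opposite gamma,
    alpha is opposite a, and beta is opposite b. Hypothetical helper.
    """
    gamma = math.radians(gamma_deg)
    # Law of cosines for the unknown side.
    c = math.sqrt(a * a + b * b - 2 * a * b * math.cos(gamma))
    # Law of cosines again, rearranged for the angle opposite side a.
    alpha_deg = math.degrees(math.acos((b * b + c * c - a * a) / (2 * b * c)))
    # The three angles of a plane triangle sum to 180 degrees.
    beta_deg = 180.0 - alpha_deg - gamma_deg
    return c, alpha_deg, beta_deg
```

Calling `solve_sas(3, 4, 90)` recovers the familiar 3-4-5 right triangle, which makes a handy sanity check before trying triangles without a right angle.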

Gorellian sorting

2014-04-01

It’s been a while, so I figured I should get in a quick coding post. From /r/dailyprogrammer, we have this challenge:

The Gorellians, at the far end of our galaxy, have discovered various samples of English text from our electronic transmissions, but they did not find the order of our alphabet. Being a very organized and orderly species, they want to have a way of ordering words, even in the strange symbols of English. Hence they must determine their own order.

For instance, if they agree on the alphabetical order: UVWXYZNOPQRSTHIJKLMABCDEFG

Then the following words would be in sorted order based on the above alphabet order: WHATEVER ZONE HOW HOWEVER HILL ANY ANTLER COW
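
One way to sketch this ordering in Python is to map each letter to its position in the alien alphabet and sort on those positions. The function name `gorellian_sort` is an assumption for this sketch, and it assumes every letter of every word appears in the given alphabet.

```python
def gorellian_sort(words, alphabet):
    """Sort words according to a custom alphabet order (case-insensitive)."""
    # Position of each letter in the custom alphabet.
    rank = {letter: i for i, letter in enumerate(alphabet.upper())}
    # Lists of ranks compare element by element, and a shorter list that is
    # a prefix of a longer one sorts first (so HOW comes before HOWEVER).
    return sorted(words, key=lambda word: [rank[ch] for ch in word.upper()])
```

Sorting the example words above with the alphabet `UVWXYZNOPQRSTHIJKLMABCDEFG` gives back exactly the order listed in the challenge.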

Caesar cipher

2014-03-12

Here’s a 5 minute¹ coding challenge from Programming Praxis:

A Caesar cipher, named after Julius Caesar, who either invented the cipher or was an early user of it, is a simple substitution cipher in which letters are substituted at a fixed distance along the alphabet, which cycles; children’s magic decoder rings implement a Caesar cipher. Non-alphabetic characters are passed unchanged. For instance, the plaintext PROGRAMMINGPRAXIS is rendered as the ciphertext SURJUDPPLQJSUDALV with a shift of 3 positions.

– Source: Wikipedia, public domain
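
The cipher described above fits in a few lines of Python. This is a minimal sketch, not the post's own solution; the function name `caesar` is an assumption, and it only shifts the 26 ASCII letters, passing everything else through unchanged.

```python
def caesar(text, shift):
    """Shift ASCII letters by `shift` positions, wrapping around the alphabet."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            # Wrap with modulo so that X + 3 cycles back to A.
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)  # non-alphabetic characters pass through unchanged
    return "".join(out)
```

`caesar("PROGRAMMINGPRAXIS", 3)` returns `SURJUDPPLQJSUDALV`, matching the example in the quote, and a shift of -3 decodes it again.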

Brownian trees

2014-03-11

Pretty pretty picture time¹:

Dis/re-emvowelification

2014-02-27

So far this week we’ve had a pair of related posts at the DailyProgrammer subreddit¹:

Basically, if you’re given a string with vowels, take them out. If you’re given one without vowels, put them back in. One of the two is certainly easier than the other². :)
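
The easier direction is short enough to sketch in Python. The function name `disemvowel` is an assumption, and the sketch treats only a, e, i, o, u as vowels. Putting the vowels back would need a word list to search against, so it is not attempted here.

```python
def disemvowel(text):
    """Remove the vowels a, e, i, o, u (either case) from text."""
    return "".join(ch for ch in text if ch.lower() not in "aeiou")
```

For example, `disemvowel("Hello, World")` returns `Hll, Wrld`.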

Crossing hands

2014-02-27

Thirty second programming problem from Programming Praxis:

Your task is to write a program that determines how many times the hands cross in one twelve-hour period, and computes a list of those times.

Ready?
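
Here is one way to reason about it, sketched in Python. The minute hand moves 360° per hour and the hour hand 30° per hour, so the minute hand gains 330° per hour and catches the hour hand once every 12/11 hours. The sketch below assumes the period starts at 12:00:00 and counts that starting overlap but not the one at the very end; the function name `crossing_times` is made up for illustration.

```python
from fractions import Fraction


def crossing_times():
    """Return the times (h:mm:ss, rounded to the second) when the hands overlap."""
    times = []
    for k in range(11):  # 11 overlaps in a half-open twelve-hour period
        # The k-th overlap happens k * 12/11 hours after 12:00:00.
        seconds = round(Fraction(12 * 3600 * k, 11))
        hours, rest = divmod(seconds, 3600)
        minutes, secs = divmod(rest, 60)
        # Show hour 0 as 12, the way a clock face does.
        times.append(f"{hours or 12}:{minutes:02d}:{secs:02d}")
    return times
```

This produces eleven times, starting at 12:00:00 and 1:05:27 and ending at 10:54:33. Whether the answer is 11 or 12 depends on whether both endpoints of the period are counted.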

Exploring parallelism in Racket with SHA-512 mining

2014-02-16

While I’ve been getting a fair few programming exercises from Reddit’s /r/dailyprogrammer, more recently I’ve started following a few other subreddits, such as /r/programming and /r/netsec. While browsing the former, I came across this intriguing gem of a problem: HashChallenge: can you find the lowest value SHA-512 hash?
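
The core of the search can be sketched in a few lines of single-threaded Python using the standard `hashlib` module. This is not the parallel approach the post title refers to, and the challenge's actual input rules are not shown in this excerpt; the function name `lowest_hash` and the idea of hashing a prefix plus a counter are assumptions for illustration.

```python
import hashlib


def lowest_hash(prefix, attempts):
    """Try prefix+0, prefix+1, ... and return (digest, candidate) for the lowest hash."""
    best = None
    for n in range(attempts):
        candidate = f"{prefix}{n}"
        digest = hashlib.sha512(candidate.encode("utf-8")).hexdigest()
        # Hex digests are fixed-length and lower-case, so comparing them as
        # strings gives the same order as comparing them as numbers.
        if best is None or digest < best[0]:
            best = (digest, candidate)
    return best
```

Each candidate is independent of the others, which is what makes the problem a natural fit for splitting across several workers.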

Command line user agent parsing

2014-02-07

Quite often when working with internet data, you will find yourself wanting to figure out what sort of device users are using to access your content. Luckily, if you’re using HTTP, there is a standard for that: the User-Agent header.

Since I’m in exactly that position, I’ve added a new script to my Dotfiles that reads user agents on stdin, parses them, and writes them back out in a given format.

Combining sort and uniq

2014-02-07

A fairly common command line pattern (at least for me) is to combine sort and uniq to count the unique items in a list of unsorted data. Something like this:

$ find . -type 'f' | rev | cut -d "." -f "1" | rev | sort | uniq -c | sort -nr | head

2649 htm
1458 png
 993 cache
 612 jpg
 135 css
 102 zip
  99 svg
  60 gif
  45 js
  27 pdf
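
For comparison, the same count can be sketched in Python with `collections.Counter`. This is an illustrative equivalent rather than anything from the post; the function name `count_extensions` is an assumption, and it takes an iterable of path strings instead of running `find`.

```python
from collections import Counter


def count_extensions(paths, top=10):
    """Count the text after the last '.' in each path, most common first."""
    # rsplit keeps the whole path when there is no dot, which mirrors what
    # the rev | cut | rev trick does in the pipeline above.
    counts = Counter(path.rsplit(".", 1)[-1] for path in paths)
    # most_common plays the role of sort | uniq -c | sort -nr | head.
    return counts.most_common(top)
```

Ties may come out in a different order than `sort -nr` would give, since `most_common` keeps equal counts in first-seen order.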