Command line user agent parsing

Quite often when working with internet data, you will want to figure out what sort of device people are using to access your content. Luckily, if you’re using HTTP, there is a standard for that: the User-Agent header.

Since I’m in exactly that position, I’ve added a new script to my Dotfiles that reads user agents on stdin, parses them, and writes them back out in a given format.

For example:

$ ua --format "{user_agent[family]} {user_agent[major]}.{user_agent[minor]}"
Mobile Safari 5.1

$ ua --format "{os[family]}"

$ ua --format "{device[family]}"

The code is very straightforward. It uses argparse to read the command-line argument, the Python ua-parser library to build a dictionary containing all of the necessary information, and Python’s str.format method to do the formatting:

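import argparse
import sys
from ua_parser import user_agent_parser

parser = argparse.ArgumentParser()
parser.add_argument('--format',
    default='{device[family]}\t{os[family]} {os[major]}.{os[minor]}\t{user_agent[family]} {user_agent[major]}.{user_agent[minor]}')
args = parser.parse_args()
# Read one user agent string per line of stdin
for line in sys.stdin:
    data = user_agent_parser.Parse(line.strip())
    print(args.format.format(**data))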
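To make the formatting step concrete, here is a minimal sketch of how str.format resolves fields such as {user_agent[family]} against a nested dictionary. It does not call ua-parser: the sample dictionary is hand-written to resemble the shape of a parsed result, its os and device values are made up for illustration, and render is a hypothetical helper name rather than part of the script.

```python
# Hand-written sample shaped like a parsed result; NOT real ua-parser output.
# The os and device values are illustrative assumptions.
sample = {
    "user_agent": {"family": "Mobile Safari", "major": "5", "minor": "1"},
    "os": {"family": "iOS", "major": "5", "minor": "1"},
    "device": {"family": "iPhone"},
}


def render(template, data):
    """Fill a str.format template using the top-level keys of data.

    Inside the template, {user_agent[family]} looks up data["user_agent"]
    and then indexes it with the string key "family" (no quotes needed).
    """
    return template.format(**data)


if __name__ == "__main__":
    template = "{user_agent[family]} {user_agent[major]}.{user_agent[minor]}"
    print(render(template, sample))  # prints "Mobile Safari 5.1"
```

Because the template is just a string, any combination of the dictionary's fields can be requested on the command line without changing the script.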

If you’d like to see or download the entire script, it is in my dotfiles repository: ua.
