A quick mitmproxy setup

2025-02-19

Another quick thing that I set up for the first time in a long time. It’s honestly as much a note for myself as anything, but perhaps you’ll find it useful too.

The problem: We were having intermittent issues with a content security policy. One of the warnings that cropped up was the inclusion of 'inline-speculation-rules' in the policy. This is currently only supported in Chrome and the issue was only appearing in Firefox. I could of course go through the effort of removing the header locally and testing–but what if I could lie to the browser and change the header on the fly?

Well, for that, you have a number of options. Burp Suite, ZAP, Charles Proxy. Many more, I’m sure. Any of these can modify traffic on the fly like that, but they’re all designed for so much more than that, making them a bit unwieldy. What I really wanted was something a whole lot smaller that did only this one thing (or could at least be configured as such).

Enter mitmproxy. I’ve used it before, but never quite like this. As the name suggests, mitmproxy is designed to man-in-the-middle yourself as a proxy–feed all web requests through it and it can read requests, modify and forward them (or block them), read responses, modify or replace them entirely, and so much more.

Exactly what I needed!
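A minimal sketch of what such a script could look like, assuming the goal is only to drop `'inline-speculation-rules'` from the `Content-Security-Policy` response header. mitmproxy calls a module-level `response(flow)` hook for each response; the filename and the helper `strip_csp_source` are names I made up for illustration, not the original setup.

```python
# strip_csp.py (hypothetical filename); load with: mitmdump -s strip_csp.py

KEYWORD = "'inline-speculation-rules'"


def strip_csp_source(policy: str, keyword: str = KEYWORD) -> str:
    """Return the CSP string with every occurrence of `keyword` removed."""
    directives = []
    for directive in policy.split(";"):
        # Each directive is a name followed by whitespace-separated sources.
        tokens = [t for t in directive.split() if t != keyword]
        if tokens:
            directives.append(" ".join(tokens))
    return "; ".join(directives)


def response(flow):
    """mitmproxy event hook: rewrite the header before the browser sees it."""
    headers = flow.response.headers
    policy = headers.get("Content-Security-Policy")
    if policy is not None:
        headers["Content-Security-Policy"] = strip_csp_source(policy)
```

One caveat: a directive whose only source was the keyword is left with just its name, which browsers treat as allowing nothing, so check the rewritten header before trusting the result.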

Infinite Craft Bot

2024-03-14

You’ve probably seen Neal.fun’s Infinite Craft game somewhere on the internet. If not, in a nutshell:

You start with 4 blocks: Earth, Fire, Water, and Wind.
You can combine any two blocks, for example:
- Earth + Water = Plant
- Plant + Fire = Smoke
- Smoke + Smoke = Cloud

That’s… pretty much it, from a gameplay perspective. There’s not really any goal, other than what you set yourself (try to make Cthulhu!). Although if you manage to find something no one has ever made before, you get a neat little note for it!

So wait, what do I mean by ‘something no one has ever seen before’?

Well, if two elements have ever been combined by anyone before, you get a cached response. Barring resets of the game (no idea if / how often this has happened, but I assume it has), if A + B = C for you, A + B = C for everyone.

And here’s the fun part: if you find a combination no one has ever found before, Neal.fun will send the combination out to an LLM to generate the new answer. The specific prompt isn’t public (so far as I know), but essentially what that means is that you have a basically infinite crafting tree¹!

So of course seeing something like this I want to automate it. 😄
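As a rough sketch of what automating this might involve: keep a list of known elements, try every untried pair, and add whatever comes back. The `combine` function here is a local stand-in that only knows the three recipes quoted above; it does not talk to the real game, and treating A + B as the same as B + A is my assumption.

```python
from itertools import combinations_with_replacement

# Stand-in for the game: only the three example recipes from the text.
KNOWN_RECIPES = {
    frozenset(["Earth", "Water"]): "Plant",
    frozenset(["Plant", "Fire"]): "Smoke",
    frozenset(["Smoke"]): "Cloud",  # Smoke + Smoke
}


def combine(a, b):
    """Return the result of a + b, or None if this stub has no recipe."""
    return KNOWN_RECIPES.get(frozenset([a, b]))


def explore(start, combine, max_rounds=10):
    """Repeatedly try all untried pairs until nothing new turns up."""
    found = list(start)
    tried = set()
    for _ in range(max_rounds):
        new = []
        for a, b in combinations_with_replacement(list(found), 2):
            key = frozenset([a, b])
            if key in tried:
                continue
            tried.add(key)
            result = combine(a, b)
            if result and result not in found and result not in new:
                new.append(result)
        if not new:
            break
        found.extend(new)
    return found


print(explore(["Earth", "Fire", "Water", "Wind"], combine))
```

Against the real game the `tried` set matters even more, since every pair costs a request.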

Mongo DB Data Exfiltration via Search Conditions

2023-11-07

I recently participated in a security capture the flag (CTF) exercise through work. The goal was–in a wide variety of ways–to find a hidden string of the form flag{...} somewhere in the problem. Some required exploiting sample websites, some parsing various data formats or captures, some required reverse engineering code or binaries, and (new this year) some required messing with LLMs.

As I tend to do for just about everything, I ended up writing up my own experiences. I won’t share that, since it’s fairly tuned to the specific problems and thus 1) not interesting and 2) probably not mine to share, but I did want to share a few interesting techniques I found/used. If it helps anyone either defend against similar attacks in the real world or (more importantly 😄) someone comes across this while trying to solve a CTF of their own, all the better.

Okay, first technique: extracting data from a MongoDB database using search conditions.
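The excerpt doesn’t say which search conditions are involved. One common form of this technique, and the one I’m assuming here, is a search field that accepts a `$regex` condition and only reveals whether anything matched. The sketch below simulates that yes/no oracle locally with Python’s `re` module (no database or endpoint involved); `make_oracle` and `extract` are hypothetical names.

```python
import re
import string

ALPHABET = string.ascii_letters + string.digits + "{}_"


def make_oracle(secret):
    """Simulate a search that only answers: did any record match?"""
    def oracle(condition):
        return re.search(condition["$regex"], secret) is not None
    return oracle


def extract(oracle, prefix="", max_len=64):
    """Recover a value one character at a time from yes/no answers."""
    value = prefix
    while len(value) < max_len:
        for ch in ALPHABET:
            guess = value + ch
            # Anchor at the start and escape regex metacharacters like "{".
            if oracle({"$regex": "^" + re.escape(guess)}):
                value = guess
                break
        else:
            break  # no character extends the match, so we are done
    return value


oracle = make_oracle("flag{example_123}")
print(extract(oracle, prefix="flag{"))
```

On the defensive side, the usual fix is to make sure user input reaches the query as a plain string rather than as an object that can carry operators.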

Automated transcripts from video with Whisper(.cpp)

2023-03-27

I tend to be something of a digital packrat. If there’s interesting data somewhere, I’ll collect it just in case I want to do something with it.

Helpful? Usually not. But it does lead to some interesting scripts.

In this case, I have a site that hosts videos. I want to download those videos and get a text based transcription of them. With new AI tools, that shouldn’t be hard at all. Let’s give it a try!
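A sketch of the transcription half of that pipeline, assuming `ffmpeg` and a built whisper.cpp binary are available. whisper.cpp expects 16 kHz mono 16-bit WAV input, so the audio is extracted first. The binary path `./main` and the model path are assumptions about the local checkout, and the download step is left out because it depends entirely on the site.

```python
from pathlib import Path


def ffmpeg_command(video: Path, wav: Path) -> list[str]:
    """Extract 16 kHz mono 16-bit PCM audio from a video file."""
    return ["ffmpeg", "-y", "-i", str(video),
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", str(wav)]


def whisper_command(binary: str, model: Path, wav: Path) -> list[str]:
    """Transcribe `wav`; -otxt asks whisper.cpp for a plain-text transcript."""
    return [binary, "-m", str(model), "-f", str(wav), "-otxt"]


def plan(video, binary="./main", model=Path("models/ggml-base.en.bin")):
    """Return the two commands needed for one video, in order."""
    video = Path(video)
    wav = video.with_suffix(".wav")
    return [ffmpeg_command(video, wav), whisper_command(binary, model, wav)]


for command in plan("talk.mp4"):
    print(" ".join(command))
```

Each command list can be handed to `subprocess.run(command, check=True)` once the paths are right for your machine.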

Go is faster than Python? (an example parsing huge JSON logs)

2022-02-11

Recently at work I came across a problem where I had to go through a year’s worth of logs and correlate two different fields across all of our requests. On the good side, we have the logs stored as JSON objects (archived from Datadog which collects them). On the down side… it’s kind of a huge amount of data. Not as much as I’ve dealt with at previous jobs/in some academic problems, but we’re still talking on the order of terabytes.

On one hand, write up a quick Python script, fire and forget. It takes maybe ten minutes to write the code and (for this specific example) half an hour to run it on the specific cloud instance the logs lived on. So we’ll start with that. But then I got thinking… Python is supposed to be super slow right? Can I do better?

(Note: This problem is mostly disk bound. So Python actually for the most part does just fine.)
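For a sense of what the ten-minute Python version might look like, here is a sketch that streams newline-delimited JSON and counts how often each pair of values occurs. The field names are hypothetical, and it assumes one JSON object per line with scalar values in both fields.

```python
import json
from collections import Counter


def correlate(lines, field_a, field_b):
    """Count how often each (field_a, field_b) pair appears in JSON lines."""
    pairs = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort a long run
        if not isinstance(record, dict):
            continue
        if field_a in record and field_b in record:
            pairs[(record[field_a], record[field_b])] += 1
    return pairs


sample = [
    '{"user_id": "u1", "endpoint": "/login"}',
    '{"user_id": "u1", "endpoint": "/login"}',
    '{"user_id": "u2", "endpoint": "/export"}',
    'not json',
]
print(correlate(sample, "user_id", "endpoint").most_common())
```

Because `lines` can be any iterable, an open (or `gzip.open`) file handle works directly and nothing is held in memory beyond the counter.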

A CLI Tool for Bulk Processing Github Dependabot Alerts (with GraphQL!)

2022-02-03

Dependabot is … somewhat useful, at least when it comes to letting you know that there are critical issues in your dependencies that can be fixed simply by upgrading the package (they did all the work for you*). The biggest problem is that it can just be insanely noisy. In a busy repo with multiple Node.JS codebases (especially), you can get dozens to even hundreds of reports a week. And for each one, you ideally would update the code… but sometimes it’s just not practical. So you have to decide which updates you actually apply.

So. How do we do it?

Well, the traditional REST-based GitHub APIs don’t expose the Dependabot data, but the newer GraphQL one does! I’ll admit, I haven’t used as much GraphQL as I probably should; it’s… a bit more complicated than REST. But it does expose what I need.
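As a sketch of what that looks like: GitHub’s GraphQL schema exposes a `vulnerabilityAlerts` connection on `repository`. The code below only builds the request and summarizes a response, so it runs without a token or network access; the helper names and the choice of fields are mine, not necessarily what the tool in the post uses.

```python
import json
from collections import Counter

QUERY = """
query($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    vulnerabilityAlerts(first: 100, after: $cursor) {
      pageInfo { hasNextPage endCursor }
      nodes {
        vulnerableManifestPath
        securityVulnerability {
          severity
          package { name }
          advisory { summary }
        }
      }
    }
  }
}
"""


def build_request(owner, name, token, cursor=None):
    """Describe the POST to send; any HTTP client can take it from here."""
    variables = {"owner": owner, "name": name, "cursor": cursor}
    return {
        "url": "https://api.github.com/graphql",
        "headers": {"Authorization": f"Bearer {token}"},
        "body": json.dumps({"query": QUERY, "variables": variables}),
    }


def count_by_severity(response):
    """Tally alerts per severity from a decoded GraphQL response."""
    alerts = response["data"]["repository"]["vulnerabilityAlerts"]["nodes"]
    return Counter(a["securityVulnerability"]["severity"] for a in alerts)
```

The `pageInfo` block is there so a caller can loop, passing `endCursor` back in as `cursor` while `hasNextPage` is true.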

A simple Flask Logging/Echo Server

2022-02-01

A very simple server that can be used to catch all incoming HTTP requests and just echo them back + log their contents. I needed it to test what a webhook actually returned to me, but I’m sure that there are a number of other things it could be dropped in for.

It will take in any GET/POST/PATCH/DELETE HTTP request with any path/params/data (optionally JSON), pack that data into a JSON object, and both log that object to a file (with a UUID1-based name) and return it in the response.

Warning: Off hand, there is already a potential security problem in this regarding DoS. It will happily try to log anything you throw at it, no matter how big and will store those in memory first. So long running requests / large requests / many requests will quickly eat up your RAM/disk. So… don’t leave this running unattended? At least not without additional configuration.
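The pack-and-log step can be sketched without the framework. This is not the post’s Flask code: in a Flask view the arguments would come from `request.method`, `request.path`, `request.args`, and `request.get_data()`. The function names and the `max_body_bytes` cap (a small nod to the warning above) are my additions.

```python
import json
import uuid
from pathlib import Path


def pack_request(method, path, params, body, max_body_bytes=1_000_000):
    """Pack one request into a JSON-serializable dict."""
    raw = body[:max_body_bytes]  # cap what we keep in memory and on disk
    text = raw.decode("utf-8", errors="replace")
    try:
        parsed = json.loads(text)
    except ValueError:
        parsed = None  # body was empty or not JSON
    return {
        "method": method,
        "path": path,
        "params": dict(params),
        "data": text,
        "json": parsed,
    }


def log_record(record, directory):
    """Write the record to <directory>/<uuid1>.json and return the path."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"{uuid.uuid1()}.json"
    target.write_text(json.dumps(record, indent=2))
    return target
```

The same dict that gets logged is what the server would send back as the response body.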

That’s it! Hope it’s helpful.

Pulling more than 5000 logs from datadog

2022-01-25

Datadog is pretty awesome. I wish I had it at my previous job, but better late than never. In particular, I’ve used it a lot for digging through recent logs to try to construct various events for various (security related) reasons.

One of the problems I’ve run into though is that eventually you’re going to hit the limits of what Datadog can do. In particular, I was trying to reconstruct users’ sessions and then check if they made one specific sequence of calls or another one. So far as I know, that isn’t directly possible, so instead, I wanted to download a subset of the Datadog logs and process them locally.

Easy enough, yes? Well: https://stackoverflow.com/questions/67281698/datadog-export-logs-more-than-5-000

Turns out, you just can’t export more than 5000 logs directly. But… they have an API with pagination!
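The pagination loop itself is short. This sketch assumes the request and response shape of Datadog’s v2 log search endpoint, where the cursor goes in as `page.cursor` and comes back as `meta.page.after`. `fetch_page` is a hypothetical callable that performs the actual HTTP POST, which keeps the loop testable offline.

```python
def fetch_all(fetch_page, query, limit=1000, max_pages=None):
    """Yield every log event for `query`, following cursors page by page."""
    cursor = None
    pages = 0
    while True:
        body = {"filter": {"query": query}, "page": {"limit": limit}}
        if cursor:
            body["page"]["cursor"] = cursor
        response = fetch_page(body)
        yield from response.get("data", [])
        cursor = response.get("meta", {}).get("page", {}).get("after")
        pages += 1
        if not cursor or (max_pages is not None and pages >= max_pages):
            break


# A fake two-page API so the loop can be run without credentials.
def fake_fetch(body):
    if "cursor" not in body["page"]:
        return {"data": [{"id": 1}, {"id": 2}],
                "meta": {"page": {"after": "next"}}}
    return {"data": [{"id": 3}]}


print(list(fetch_all(fake_fetch, "service:web")))
```

In real use, `fetch_page` would POST `body` as JSON with your API and application keys and return the decoded response; `max_pages` is a safety valve for queries that match far more than expected.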

AoC 2021 Day 25: Cucumbinator

2021-12-25

Source: Sea Cucumber

Part 1: Load a grid of empty cells (`.`), east movers (`>`), and south movers (`v`). Each step, move all east movers, then all south movers (only if they can this iteration). Wrap east/west and north/south. How many steps does it take the movers to get stuck?
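A compact sketch of that simulation (not necessarily how the full post solves it). Each herd decides based on the grid as it stood before that herd moved, which is what makes the movement simultaneous.

```python
def parse(text):
    """Turn the puzzle text into a list of rows of single characters."""
    return [list(line) for line in text.strip().splitlines()]


def step(grid):
    """Move the east herd, then the south herd. Return (grid, moved)."""
    height, width = len(grid), len(grid[0])
    moved = False
    for herd, dr, dc in ((">", 0, 1), ("v", 1, 0)):
        new = [row[:] for row in grid]
        for r in range(height):
            for c in range(width):
                if grid[r][c] != herd:
                    continue
                nr, nc = (r + dr) % height, (c + dc) % width  # wrap around
                if grid[nr][nc] == ".":  # check the pre-move grid
                    new[nr][nc] = herd
                    new[r][c] = "."
                    moved = True
        grid = new
    return grid, moved


def steps_until_stuck(grid):
    """Return the number of the first step on which nothing moves."""
    count = 0
    while True:
        grid, moved = step(grid)
        count += 1
        if not moved:
            return count
```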

AoC 2021 Day 24: Aluinator

2021-12-24

Source: Arithmetic Logic Unit

Part 1: Simulate an ALU with 4 registers (`w`, `x`, `y`, and `z`) and instructions defined below. Find the largest 14-digit number with no 0 digits that results in `z=0`.
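The instruction list isn’t reproduced in this excerpt; the puzzle defines six of them (`inp`, `add`, `mul`, `div`, `mod`, `eql`), with `div` truncating toward zero. A minimal interpreter, as a sketch of the simulation part only:

```python
def run_alu(program, inputs):
    """Run an ALU program and return the final register values."""
    regs = {"w": 0, "x": 0, "y": 0, "z": 0}
    feed = iter(inputs)
    for line in program.strip().splitlines():
        if not line.strip():
            continue
        op, *args = line.split()
        a = args[0]
        if op == "inp":
            regs[a] = next(feed)
            continue
        # The second operand is either a register name or an integer literal.
        b = regs[args[1]] if args[1] in regs else int(args[1])
        if op == "add":
            regs[a] += b
        elif op == "mul":
            regs[a] *= b
        elif op == "div":
            # Truncate toward zero, unlike Python's floor division.
            quotient = abs(regs[a]) // abs(b)
            regs[a] = quotient if (regs[a] < 0) == (b < 0) else -quotient
        elif op == "mod":
            regs[a] %= b
        elif op == "eql":
            regs[a] = int(regs[a] == b)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs
```

Simulating every candidate is not a realistic way to answer Part 1, since there are 9^14 of them; the interpreter is mostly useful for checking a guess or for studying what the program does to `z`.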

JP's Blog

Programming, Language: Python
