2025-07-24

Last modified: 2025-07-24 23:22:40

< 2025-07-23 2025-07-30 >

Webmail
Docker containers

Webmail

I was thinking a bit more about what I'd want from an email workflow. I think The Screener is not quite it.

I want the "Inbox" view to only show me things that are new or pinned. When some new email comes in, it should go into the inbox even if it's from a new person. I should have the ability to screen people out of the inbox.

And for things that I care about but not right now, I can pin so that they stay in the inbox - but only for a limited time! Stuff shouldn't get pinned indefinitely. If I don't deal with it before it expires, then it expires and that's it.

The philosophy is that if I ignore something then it should go away. It won't stack up for ever and ever until the inbox becomes unmanageable.

Obviously I will have a searchable "Archives" view or something that shows everything, even if it's seen or not pinned. But the "inbox" view is only:

new emails from people who are not blocked, and
pinned emails until the pin expires

As a specific immediate issue: "Mark all seen" should only mark the visible emails as seen, not the invisible ones.

Docker containers

The idea: people may be inadvertently leaking company secrets by publishing Docker containers on DockerHub that they think are private. I should make some tooling to run regexes over the "user-generated" layers of Docker containers, and highlight any secrets I find. And then a browser extension or something that lets me browse DockerHub and add things for processing with one click.

Firstly: how do I extract just the user-generated layers of a Docker container? Because I don't want to grep over the entire Ubuntu filesystem for every single container.

I think I'd want to just extract one layer at a time, grep over it, then clean it up and look at the next layer. Starting from the first layer that looks like it was written by the user. So probably from a "COPY" command rather than a "RUN"?

docker pull foo
docker save foo -o foo.tar
mkdir foo
tar -xf foo.tar -C foo
config=$(cat foo/manifest.json | jq -r '.[0].Config')
cat foo/$config

That gives you a JSON object where .history is a list of "layers". For each layer, either .empty_layer=true or .empty_layer is missing. If it's true then there is a corresponding sha256 in the outer object's .rootfs.diff_ds, which I guess are in order of creation (maybe reverse? unsure).

And then as well as .empty_layer you also have .created_by which tells you the line from the Dockerfile that made it. So we're firstly looking to combine the diff_ds and history sections to find out the sha256 for each layer. And then we're looking for the layers that begin "COPY" (for now), and then we'll extract the corresponding layers.

Actually potentially another vulnerability is if people include credentials in their "RUN" commands without realising that the command itself is included in the final container image. Keep that one in the back pocket. Also "ENV".

So in the example one I'm looking at, the "COPY" commands are not the final ones, but the two before that. The 2 diff_ids before the final one are:

  "sha256:fdfee7550ad9149252e6569a4e1d5192b4ebb08abf54273020259a5e5e666a5e",
  "sha256:0a9e0116e43789b59efd25019853be9a6f168a46b6b206cad18fa1c7cd7c7aed",

And indeed it looks like these correspond to the "COPY" commands. So the diff_ds list is not reversed. For "sh256:0a9..." I need to extract "foo/sha256/0a9..." as a tar file.

So let's automate this process. Let's say I have a list of regexes I want to run, and I want a script that goes from a container name to a list of regex matches.

Sweet, is working.

Now what patterns do I want?

Another idea is looking for containers that contain git repos with uncommitted changes.

Also, for searching for containers, what programs are people likely to have packaged in a way that accidentally leaks secrets?

From a few hours of looking at this:

I came across API keys for Auth0, Coinbase, Crypto.com, Telegram, Etherscan
files called .env are a good target
maybe I'd want a system that scrapes dockerhub for new pushes and then indexes the filenames?
is full-text indexing too much?
and the really hard part is making sure I don't waste any effort on images that aren't going to pay out a bug bounty

< 2025-07-23 2025-07-30 >