James Stanley

Problems with Protohackers

Fri 18 November 2022
Tagged: protohackers

This is a list of things wrong with Protohackers, and my thoughts on what to do about them.

1. Docker

Protohackers is the second "important" service I've deployed with Docker, and the second time I've regretted it.

The big problem is that there's a short period of unnecessary downtime on every deployment.

The API server is a Mojolicious application running under hypnotoad. Hypnotoad supports downtime-free deployments out of the box. Just run hypnotoad foo, and the running instance of foo is gracefully upgraded to the new version on disk. Great success. Only it doesn't work if your application is running in Docker, because Docker containers are meant to be immutable. So instead of updating your program code and asking hypnotoad to restart it, you have to stop the container and start a new one, and the application is down in the meantime.

The user-facing website is a Next.js site, compiled to a static site and served from nginx. This should be easy-mode for downtime-free deployments! Just replace the site content and do nothing else and nginx will start serving the fresh content. Sadly, the website is also deployed as a Docker container (running nginx inside the container to serve the static site), so when the site needs to be updated, the container needs to be stopped and a new one needs to be started.
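Without Docker, the standard trick is to deploy each release into its own directory and atomically swap a "current" symlink that nginx's root directive points at. This is a minimal sketch under invented paths (/tmp/demo-site standing in for the real docroot), not the actual Protohackers config:

```shell
# Assumes nginx's "root" points at "$site/current", which is a symlink.
# The rename() in mv -T is atomic, so nginx never sees a half-copied tree
# and never needs to be stopped or reloaded.
set -eu
site=/tmp/demo-site                    # stand-in for e.g. /var/www/protohackers
mkdir -p "$site/releases/v2"
echo "new content" > "$site/releases/v2/index.html"

ln -sfn "$site/releases/v2" "$site/current.tmp"   # build the new symlink off to the side
mv -T "$site/current.tmp" "$site/current"         # atomically replace the old one

cat "$site/current/index.html"                    # prints "new content"
```

Old release directories can be garbage-collected later at leisure, which also gives you instant rollback by pointing the symlink back.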

The solution checker is a Perl daemon that consumes jobs from a beanstalk tube and executes individual problem checker scripts, depending on the problem. Most of the time I'm changing the individual checker scripts rather than the "orchestrator", so most upgrades would be free here, if only I were updating the files in the container instead of destroying the container and making a new one. (Even if I need to update the "orchestrator" part, that should be trivial: I could start an instance of the new checker and send a signal to the old one to tell it to finish up its existing jobs, not reserve any new jobs, and exit when done - this would also work with Docker but is inconvenient to do with docker-compose).

So every significant component of my application can in principle be deployed with no downtime, and yet the website is going down for 30 seconds every time I fix a typo in the problem text. For what?

The benefits of using Docker are that it simplifies daemon supervision (compared to writing systemd units or something), and it makes sure that the versions of dependencies are the same in development as they are in production. But writing systemd units is only a 1-time effort, and not substantially more complicated than writing a docker-compose.yml. And maybe I don't care too much about keeping dependency versions in sync? So I think the plan is to move away from Docker and just run everything directly on Linux, with shell scripts to handle deployments.
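For scale, the per-service effort is roughly one unit file along these lines (names and paths invented; a sketch, not the actual Protohackers config):

```ini
# /etc/systemd/system/protohackers-checker.service (hypothetical)
[Unit]
Description=Protohackers solution checker
After=network.target

[Service]
User=protohackers
WorkingDirectory=/opt/protohackers/checker
ExecStart=/usr/bin/perl checker.pl
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Supervision, restart-on-crash, and logs in journalctl then come for free, and a deploy script can rsync the new code into place and restart (or signal) the service as appropriate.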

2. Problem statements

Every problem statement so far has had some part that is ambiguous or misleading.

It's very difficult for me to read a problem statement dispassionately when I already know everything about the protocol it is trying to specify, so I end up not writing down all of my assumptions. Different people have different preconceptions about how a protocol is likely to work. For example, I might assume that a server should close the connection when it receives malformed input, while someone else assumes malformed input should be silently ignored.

A good problem statement makes these assumptions explicit, but it's hard to think of all the things that you are taking for granted, whilst you're taking them for granted.

I expect the thing to do is to recruit a playtester or two to read the problem statements before they're released, implement a solution, and then tell me what was ambiguous or misleading. If you want to do it, get in touch. The upside is that you get to solve the problems sooner than everyone else, and you get to contribute to making the problems better. The downsides are that you aren't allowed to talk to anyone about the problems until after they're released, and you aren't allowed on the leaderboard. And I can't pay you (yet).

3. Checker error messages

People consistently complain that the error messages from the checker are not clear enough. This is quite frustrating, as I actually do go out of my way to make the error messages helpful, it's just that there are a lot of different ways to fail the checks, and I don't have time to make them all maximally helpful. Most users only ever encounter a small subset of the possible error messages, and obviously they think that those particular messages should have more work put into them. (And probably they're right).

Quite often the problem is that the user has sent a string of text (maybe thousands of characters long) that is not correct. Just printing out both strings and saying they're different is not likely to be substantially helpful when they're longer than trivial examples. So I think I want a general-purpose string diffing function that finds the first difference in 2 strings and elucidates it succinctly. That wouldn't be massively complicated to do.

Apart from that, a common problem is that the user has sent a binary object with a field or two missing, so the checker is stuck waiting for more bytes before it decides what to do. Maybe at the point that the check times out it would be useful if it dumped some information about the contents of its buffers. This isn't super easy to do, because the timeout is handled at a different layer, but it would be possible. At the very least it would be useful to somehow give an indication when the checker has received a partial message and is waiting for more. It's quite tricky to think of a general-case solution here; it would have to work differently for every problem.
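Even without per-problem logic, a generic "here is what was left in my buffer when I gave up" dump would go a long way, and od(1) already produces a usable hex-plus-ASCII view. Shown here on an invented 5-byte prefix of a longer binary message (a type byte followed by a truncated integer field):

```shell
# Simulate a receive buffer holding a partial binary message: a type byte
# 'I' followed by only some of the bytes the checker was expecting.
# Octal escapes: \060 is '0' (0x30), \071 is '9' (0x39).
printf 'I\000\000\060\071' > /tmp/partial.bin

# Hex dump with an ASCII column; decimal offsets (-A d) make a
# "stuck waiting at byte N" message easy to relate to the protocol spec.
od -A d -t x1z /tmp/partial.bin
```

Appending a dump like this to the timeout error would at least tell the user that the checker saw a valid message start and then stalled partway through.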

I think if I get a playtester or two, they're likely to have suggestions about the error messages that they encounter during testing, so that would help me work on the error messages more effectively.

Another view is that the checker shouldn't be providing useful error messages at all! The users should debug their own applications by looking at what they're actually sending and receiving and comparing it to what they expect. Having the checker provide "useful" error messages just causes the users to treat the checker as an automatic debugger, and then get frustrated when it doesn't work by magic. Maybe it would be better if the error messages were actually worse, and there was a tacit understanding that debugging the program is the user's problem rather than the checker's.

4. Frequency of problems

Making the problems is quite time-consuming. I was planning to do one problem every 2 weeks, but I've pushed the next one out to 3 weeks away, and it might turn into every 3 weeks now.

I originally picked a 2-week period because that makes 26 problems a year, and Advent of Code has 25 problems a year, so I thought it would be in the ballpark of reasonability. But the amount of time I spend on making Protohackers problems has given me a newfound appreciation for how hard Eric Wastl must work on Advent of Code.

Coming up with problem ideas hasn't been too hard, because I still haven't run out of principles that I want to cover. And writing a reference solution for each problem isn't too hard either. The hard parts are 1.) trying to make a checker that checks all of the important parts of the problem, without assuming any behaviour specific to my reference solution, and 2.) trying to write a problem statement that accurately and succinctly conveys the protocol specification, with every important behaviour explicitly documented.

5. User acquisition

There are 852 user accounts at the moment, of whom 399 have completed any problem, of whom 261 have solved any non-echo-server problem.

I don't know how most people are finding the site. I'm not advertising anywhere. It would be good to be more purposeful about attracting new users, but I don't have any great ideas. I tweet a little bit, but the Twitter account only has 74 followers, so I don't think it does a lot of good. I emailed a handful of programming newsletters to ask if they would mention it, but none of them were interested.

I need better ideas here.

6. Monetisation

Apart from being annoyed by the ambiguous problem statements and confusing error messages, people like Protohackers, and frequently tell me so. If you've made something people like, then you've created wealth and should try to keep some of it for yourself in the form of cash. This would also allow me to pay any would-be playtesters without feeling like I was throwing money in a hole.

So I'd love to make money out of Protohackers, but I'm not sure how viable it is until there are a lot more users. Ideas include:

Ask the users to pay

A couple of people have already offered to make a donation, but I don't have a good way to accept ad hoc donations, and I don't want people to feel obliged to donate. Maybe I could set up a Patreon, but I can't see more than 5% of people wanting to pay anything, and 5% of 260 arguably-active users is, like, 13 people? Paying a few quid a month? Doesn't seem worth it.

Sell merchandise

I'm very picky about clothing quality, and can't recall ever buying branded merchandise that I was actually satisfied with, so I'd probably spend a lot of time and money trying out different suppliers. It also has the same problem as "asking users to pay" where I'll earn a tiny amount of money each from a tiny handful of people, so it's not really worth doing.

Solicit recruitment ads

I think the best option is recruitment ads. The Protohackers userbase is highly targeted towards competent programmers, and in particular success at Protohackers implies competency at network programming, and I imagine hiring competent network programmers is incredibly valuable to some tech companies. Much more valuable than anything else I can imagine anyone wanting to advertise on Protohackers.

So putting recruitment ads on the site is probably the best idea. Then the problem becomes who exactly wants to recruit network programmers? How do I contact them? How much do they want to pay?

So if you want to advertise on Protohackers, or you know someone who might, please get in touch. For the time being, you can simply thumb down the leaderboard, click on the GitHub links, and contact people yourself without paying me anything. You would only want to pay if you want to invest in a recruitment pipeline, rather than a 1-time mailshot to a handful of candidates.

If you like my blog, please consider subscribing to the RSS feed or the mailing list: