How to read from a TCP socket (but were too afraid to ask)
Sat 10 February 2024
Tagged: software, protohackers
You can get surprisingly far, before it bites you, with only a fuzzy and incorrect understanding of how you should read from a TCP socket. I see this often in (failing) Protohackers solutions. Once you are over the initial hurdle of reading enough documentation to actually get a TCP session connected, there are 2 key things you need to understand:
- TCP gives you a stream of bytes, not packets.
- read() can give you fewer bytes than you asked for.
If either of those was a surprise: keep reading.
The main misconception people have is that when they're reading from a TCP socket, they are receiving packets. This is the wrong way to think about it. If you're writing anything higher level than the TCP implementation itself, then you should forget about packets. TCP is exposed to you via a pair of byte streams.
Typically both streams are on the same file handle (they have the same file descriptor), but remember there are two underlying streams: writing puts bytes into the stream that is sent to the other side, and reading gets bytes out of the stream coming from the other side.
There are a few reasons that people are not immediately disabused of packet-oriented thinking:
- When you learn about networking, you learn about packets, so it is natural to assume that you'll be handling packets.
- When you write test programs to send short strings over localhost, it probably appears as if every write() pops out in a corresponding read().
- When you test your server with netcat, it appears as if each line is sent as a discrete "packet" because the line-buffering on netcat's stdin means each line doesn't get read by netcat until you press enter, and then it all gets sent to the socket at once.
If you write a program that tries to read "packets", there are a handful of potential issues you can encounter:
- A single read might get less than an entire "packet": your program may think it is a malformed packet, drop some data, and potentially get out of sync with the stream.
- A single read might get more than an entire "packet": your program may not notice the extra, drop some data, and potentially get out of sync with the stream.
Just stop trying to think about packets. The kernel will deal with packets. You will deal with byte streams.
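Here is a minimal sketch of what that looks like in Python. It assumes a local pair of connected sockets from socket.socketpair() is a fair stand-in for a real TCP session (on most Unix platforms it is a Unix-domain stream socket rather than TCP, but it gives you the same byte-stream interface). The function name demo_stream is made up for illustration. Two separate writes go in, and the reader gets back whatever chunks it happens to get; nothing on the reading side tells you where one write ended and the next began.

```python
import socket

def demo_stream(chunk_size=4):
    """Write two 'messages', then read the stream back in small chunks."""
    # Assumption: socketpair() stands in for a connected TCP session.
    writer, reader = socket.socketpair()
    try:
        writer.sendall(b"hello ")   # first write
        writer.sendall(b"world\n")  # second write
        writer.close()              # no more data: reader will see end of stream

        chunks = []
        while True:
            chunk = reader.recv(chunk_size)  # may return fewer bytes than asked
            if not chunk:                    # b"" means end of stream
                break
            chunks.append(chunk)
        return chunks
    finally:
        writer.close()
        reader.close()
```

The exact chunk sizes you get back are up to the platform. The only thing you can rely on is that joining the chunks gives you the same bytes, in the same order, that were written.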
To transfer discrete messages over a byte stream, you need some sort of message encoding. Some simple schemes include:
- Fixed length: every message is the same length, so you know you have a full message when you have that many bytes (see Protohackers problem 2).
- Predictable length: not every message is the same length, but you can compute how long the message is going to be based on what type of message it is (see Protohackers problem 6).
- Length prefixed: the start of each message tells you how many bytes the message is.
- Line delimited: each message is a line of text, so messages are terminated by newline characters (see Protohackers problem 1).
You can do whatever you want as long as you can work out where the message boundaries are. If you want to transfer structured data I suggest JSON lines for a text format or length-prefixed protobuf for a binary format.
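As a rough sketch of two of the schemes above (length prefixed and line delimited), here are two small decoders in Python. Everything about them is an assumption for illustration: the class names, the feed() method, and the choice of a 4-byte big-endian length prefix are mine, not part of any particular protocol. The idea is that you hand the decoder whatever bytes a read happened to return, and it gives back only the complete messages, keeping any leftover bytes for next time.

```python
import struct

class LengthPrefixedDecoder:
    """Accumulates bytes and returns complete length-prefixed messages.

    Assumed wire format (illustrative only): a 4-byte big-endian unsigned
    length, followed by that many payload bytes.
    """
    HEADER = struct.Struct("!I")

    def __init__(self):
        self.buffer = bytearray()

    def feed(self, data):
        """Add newly read bytes; return a list of any complete messages."""
        self.buffer += data
        messages = []
        while True:
            if len(self.buffer) < self.HEADER.size:
                break  # length prefix not complete yet
            (length,) = self.HEADER.unpack_from(self.buffer)
            end = self.HEADER.size + length
            if len(self.buffer) < end:
                break  # payload not complete yet
            messages.append(bytes(self.buffer[self.HEADER.size:end]))
            del self.buffer[:end]  # keep any bytes belonging to the next message
        return messages


def encode(payload):
    """Prefix a payload with its length, matching LengthPrefixedDecoder."""
    return LengthPrefixedDecoder.HEADER.pack(len(payload)) + payload


class LineDecoder:
    """Accumulates bytes and returns complete newline-terminated lines."""

    def __init__(self):
        self.buffer = bytearray()

    def feed(self, data):
        """Add newly read bytes; return complete lines without the newline."""
        self.buffer += data
        *lines, rest = self.buffer.split(b"\n")
        self.buffer = rest  # a partial line waits here for more data
        return [bytes(line) for line in lines]
```

Neither decoder cares how the bytes were chunked on the way in: feeding one byte at a time produces the same messages as feeding everything at once.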
The other misconception is about the semantics of reading from a socket.
Depending on the platform you are using, this could bite you in 2 different ways. Normally read() takes an argument saying how many bytes you want, and then there are 2 common ways for it to work:
- it gives you back any number of bytes, up to the maximum you asked for
- it blocks, and keeps reading, until it either reaches the end of the stream or has exactly the number of bytes you asked for
The actual read() system call is "type 1": if there are some bytes available immediately, but not as many as you asked for, you'll just get back whatever is available immediately.
C's fread() works the second way: it blocks until it reaches the end of the stream, hits an error, or has exactly the number of bytes you asked for.
Neither of these semantics is necessarily "better" than the other. In the first case, you have to manually make sure you have all the bytes you want (i.e. keep calling read() until you have enough). In the second case, you have to make sure you don't block the entire program in the course of getting all the bytes you want (e.g. if you want 5 bytes but the kernel only has 1 to give you, fread() will block even though select() told you the socket was readable).
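To make "keep calling read() until you have enough" concrete, here is a minimal sketch in Python, whose socket.recv() is "type 1". The helper name recv_exactly and the choice to raise EOFError on a short stream are my own assumptions, not a standard API. Note that this turns a type 1 read into a type 2 read, so it inherits the type 2 caveat: it will block until all n bytes arrive.

```python
import socket

def recv_exactly(sock, n):
    """Return exactly n bytes from sock, or raise EOFError if the stream ends first."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))  # "type 1": may return fewer bytes
        if not chunk:                    # b"" means end of stream
            raise EOFError(f"stream ended after {len(buf)} of {n} bytes")
        buf += chunk
    return bytes(buf)
```

For example, a fixed-length protocol with 9-byte messages (an assumed size) would call recv_exactly(sock, 9) once per message instead of trusting a single recv(9) to return all 9 bytes.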
The following properties are common to both types:
- it can give you fewer bytes than you asked for (in the case of "type 2": only at the end of the stream)
- it can block indefinitely if no data is available (in the case of "type 2": it can also block indefinitely even if some data is available!)
- it can give you less than one full packet's worth of data (but you don't care about packets!)
- it can give you more than one full packet's worth of data (but you don't care about packets!)
- it is not the case that one call to write() on the other side will exactly land in one call to read() on your side
Here are some classifications that I'm aware of:
- C: read(): type 1, fread(): type 2
- Perl: read(): type 2, sysread(): type 1
- C#, NetworkStream: Read(): type 1, ReadExactly(): type 2
- Go, io.Reader: Read(): type 1, io.ReadFull(): type 2
If you want to write reliable software, you need to find out what read semantics you're using. Don't just guess.
(Also, for what it's worth, the write() system call has the same property as read(), in that it may write fewer bytes than you asked it to; check the return value to know how many bytes were actually written, and then try again to send the rest).
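The write side of that loop might look something like this in Python, where socket.send() returns the number of bytes it actually accepted. The helper name send_all is made up for illustration; in real Python code you would normally just call the built-in socket.sendall(), which does the same job.

```python
import socket

def send_all(sock, data):
    """Send every byte of data, retrying after short writes."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        sent = sock.send(view[total:])  # may accept fewer bytes than offered
        total += sent                   # carry on from where it stopped
    return total
```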
While you're here, I have an axe to grind...
There's one more thing that I want you to know: half-close. You remember how we agreed that there are 2 separate byte streams? Well, a corollary of that is that the 2 streams can be closed independently: you can close the stream you are writing to, even while you still want to read from the other side. If you understand the stream abstraction, this should be natural and good.
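In Python, half-close is spelled shutdown(socket.SHUT_WR). Here is a small sketch, again assuming socket.socketpair() is a fair stand-in for a client/server TCP connection; the function names read_until_eof and half_close_demo are made up. The client closes its writing stream, the server sees end of stream on its reading side, and the reply still travels back over the other stream.

```python
import socket

def read_until_eof(sock):
    """Collect bytes until recv() returns b"", i.e. the peer closed its writing side."""
    data = bytearray()
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return bytes(data)
        data += chunk

def half_close_demo():
    # Assumption: socketpair() stands in for a client/server TCP connection.
    client, server = socket.socketpair()
    try:
        client.sendall(b"request")
        client.shutdown(socket.SHUT_WR)  # close only the client-to-server stream

        seen_by_server = read_until_eof(server)        # ends because of the half-close
        server.sendall(b"reply to " + seen_by_server)  # the other stream is still open
        server.shutdown(socket.SHUT_WR)

        seen_by_client = read_until_eof(client)
        return seen_by_server, seen_by_client
    finally:
        client.close()
        server.close()
```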
Sadly, some "transparent proxies" inadvertently break half-close, by tearing down connections as soon as they see either of the streams get closed. Please don't do this!
The correct thing to do is to propagate the half-close onwards. If you imagine the TCP session as a pipeline running from the client, through the proxy, to the server, and then back through the proxy to the client, then it is obvious that the half-close should be propagated on through the pipeline in the same path that normal messages would take. A half-close shouldn't "jump ahead" of any pending data in the pipeline by closing the session immediately.
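One direction of a well-behaved proxy could be sketched like this in Python. The function name pump is hypothetical, and this is only a sketch: a real proxy would run one pump per direction concurrently (threads, async, or an event loop) and close the sockets only once both directions have finished. The point is the ordering: forward all pending data first, and only then pass the half-close along.

```python
import socket

def pump(src, dst):
    """Copy src's incoming stream to dst, then propagate the half-close."""
    while True:
        chunk = src.recv(4096)
        if not chunk:
            break                 # src's peer closed its writing side
        dst.sendall(chunk)        # forward pending data first...
    dst.shutdown(socket.SHUT_WR)  # ...then pass the half-close along, in order
```

Note that pump() never calls close(): the stream going the other way stays open until the other side decides it is finished too.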
(I wanted to draw a neat animation showing a client, a proxy, and a server, with water pipes connecting them, and in the good case, when the flow from the client to the proxy is shut off, all the valves will get shut off one by one, in order, through the whole pipeline, following the last of the flowing water; and in the bad case, as soon as the first valve is shut off, the rest of the pipes all get closed at once and the water they contain is dumped on the ground. But making animations is too much trouble, so please imagine it instead).
There is a blog post from Excentis about how some NAT proxies break half-close.
Nmap's ncat breaks half-close. I have submitted a patch, but it has been ignored.
Ngrok breaks half-close. I found out because people tried to solve the Protohackers Smoke Test using ngrok, and couldn't.
If you're writing any kind of proxy, please implement half-close properly.
It's not that hard to read from a TCP socket, but it is easy to get it subtly wrong if you don't have the right mental models.
And (this is turning into more of a Protohackers ad than intended, but) if you want to test your understanding, you could try solving some of the Protohackers problems. If you really want a challenge: Problem 7 has you implement a basic TCP-like protocol on top of UDP.