2025-06-30

Last modified: 2025-06-30 22:23:00

< 2025-06-28 2025-07-02 >

sngrep

Earlier today we saw sngrep (https://github.com/irontec/sngrep) segfault when presented with a very large (~60 KB) HEP packet.

Is this just an issue with old CentOS sngrep or is it current?

Can I reproduce it at home? Can I fix it?

I got Cursor to write me a HEP spamming tool. It got it almost right on the first try, then I copied the HEP3 spec (https://github.com/sipcapture/HEP/blob/master/docs/HEP3_Network_Protocol_Specification_REV_37.pdf) into the context and it got it totally right. What a time to be alive.

So now I need to make it send very large packets.
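For reference, a minimal sketch of what "a very large HEP3 packet" looks like on the wire, going by my reading of the HEP3 spec: a 4-byte "HEP3" magic, a 2-byte big-endian total length, then chunks that each carry a 2-byte vendor ID, 2-byte type and 2-byte length (the length includes the 6-byte chunk header). This is not the tool Cursor wrote; hep3_build and put_be16 are names made up for illustration, and a real sender would add the address, port and timestamp chunks as well.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Write a 16-bit value in network (big-endian) byte order. */
static void put_be16(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

/*
 * Hypothetical helper: build a minimal HEP3 packet that holds a single
 * captured-payload chunk (vendor 0x0000, type 0x000f).
 * Returns the number of bytes written, or 0 if the packet would not fit
 * in `out` or in the 16-bit length fields.
 */
size_t hep3_build(uint8_t *out, size_t out_cap,
                  const uint8_t *payload, size_t payload_len) {
    const size_t header_len = 6; /* "HEP3" + total length */
    const size_t chunk_hdr = 6;  /* vendor + type + chunk length */
    size_t total = header_len + chunk_hdr + payload_len;

    if (total > 0xffff || total > out_cap)
        return 0;

    memcpy(out, "HEP3", 4);
    put_be16(out + 4, (uint16_t)total);
    put_be16(out + 6, 0x0000);                            /* generic vendor */
    put_be16(out + 8, 0x000f);                            /* payload chunk */
    put_be16(out + 10, (uint16_t)(chunk_hdr + payload_len));
    memcpy(out + 12, payload, payload_len);
    return total;
}
```

With a 60000-byte payload this gives a 60012-byte packet, which still fits in a single UDP datagram.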

Yes! It reproduces the segfault. Great success. The code is now trash but if it reliably triggers the bug I don't care.

Interestingly, it doesn't reliably trigger the bug. It triggered it once but isn't doing it again. Annoying.

It has crashed again, but only once more, and slightly differently: no "Segmentation fault" message this time.

Maybe I should make it log all the packets to files, and a way to play them back, so that when it segfaults I have a chance of repeating it.
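A rough sketch of how that logging could work, assuming a simple record format of my own invention: each packet is stored as a 4-byte big-endian length followed by the raw bytes. packet_log_write and packet_log_read are hypothetical names, not part of any existing tool.

```c
#include <stdint.h>
#include <stdio.h>

/* Append one packet as a record: 4-byte big-endian length, then the bytes.
 * Returns 0 on success, -1 on a write error. */
int packet_log_write(FILE *f, const uint8_t *data, uint32_t len) {
    uint8_t hdr[4] = { (uint8_t)(len >> 24), (uint8_t)(len >> 16),
                       (uint8_t)(len >> 8), (uint8_t)len };
    if (fwrite(hdr, 1, 4, f) != 4)
        return -1;
    if (len > 0 && fwrite(data, 1, len, f) != len)
        return -1;
    return 0;
}

/* Read the next record into buf.
 * Returns the packet length, or -1 on EOF, on a read error, or if the
 * record is larger than cap. */
long packet_log_read(FILE *f, uint8_t *buf, size_t cap) {
    uint8_t hdr[4];
    if (fread(hdr, 1, 4, f) != 4)
        return -1;
    uint32_t len = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
                   ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];
    if (len > cap)
        return -1;
    if (len > 0 && fread(buf, 1, len, f) != len)
        return -1;
    return (long)len;
}
```

The sender would call packet_log_write just before each send, and a replay mode would loop over packet_log_read and resend each record in order.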

OK! I found out what triggers it. If you press F5 to clear the screen and then a really big packet comes in, it segfaults straight away.

(gdb) bt
#0  __memcpy_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:461
#1  0x00005555555607d8 in memcpy (__len=55630, __src=0x7ffff77b5df9, __dest=0x7ffff002ab30) at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:29
#2  capture_eep_receive_v3 (pkt=pkt@entry=0x0, size=size@entry=0) at capture_eep.c:833
#3  0x0000555555560a93 in capture_eep_receive () at capture_eep.c:581
#4  capture_eep_receive () at capture_eep.c:575
#5  accept_eep_client (info=0x5555555af0b0) at capture_eep.c:173
#6  0x00007ffff7c9caa4 in start_thread (arg=<optimised out>) at ./nptl/pthread_create.c:447
#7  0x00007ffff7d29c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

The memcpy call in question is:

payload = sng_malloc(header.caplen);
memcpy(payload, (void*) buffer + pos + sizeof(hep_chunk_t), header.caplen);

i.e. the exact same size is allocated right before, so it seems unlikely that the destination is being overflowed. What is sng_malloc?

It's just a wrapper around malloc that won't let you allocate negative bytes or too many bytes, and zeroes the allocated space.
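In other words, something along these lines. This is a sketch of the behaviour described above, not the actual sngrep source: checked_malloc is a made-up name, the signed parameter is my assumption, and so is rejecting a size of zero. The limit reuses the 102400 value of MALLOC_MAX_SIZE.

```c
#include <stdlib.h>

/* Same value as sngrep's MALLOC_MAX_SIZE; the name here is illustrative. */
#define SKETCH_MALLOC_MAX_SIZE 102400

/*
 * Hypothetical bounded allocator: refuses non-positive or oversized
 * requests and returns zero-filled memory otherwise.
 * The caller must check for NULL and release the memory with free().
 */
void *checked_malloc(long size) {
    if (size <= 0 || size > SKETCH_MALLOC_MAX_SIZE)
        return NULL;
    return calloc(1, (size_t)size);
}
```

A 55630-byte request, as in the backtrace, is under the limit, so a wrapper like this would hand back a valid pointer.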

Interestingly, MALLOC_MAX_SIZE is only set to 102400, which seems low, and is less than double the length we're trying to copy here. Although the dest pointer is non-NULL which suggests sng_malloc did allocate it.

But pkt and size in capture_eep_receive_v3 are both 0. Is that true or is that a debugger issue?

That is true; the function is just called that way.

Ha! The receive buffer is only MAX_CAPTURE_LEN bytes. recvfrom() is called with MAX_CAPTURE_LEN as its size limit, but the code doesn't seem to check that the lengths declared inside the HEP packet fit within what was actually received. So just sending a short HEP packet with a giant length field would probably do it.

Omg, so many issues here, where to start?

- always check the return value of sng_malloc
- check that the offsets within the HEP don't exceed the size of the buffer
- do this for all of the different HEP versions, not just HEP3
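To make the second point concrete, here is a sketch of the kind of bounds checking I mean for HEP3, assuming the chunk layout from the spec (2-byte vendor ID, 2-byte type, 2-byte length that includes the 6-byte chunk header). It is not the code from my patches; hep3_find_chunk and get_be16 are hypothetical names, and the vendor ID is ignored to keep it short.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Read a 16-bit big-endian value. */
static uint16_t get_be16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: find the first chunk of `type` in a HEP3 packet of
 * which `received` bytes actually arrived. On success returns 0 and points
 * *data / *len at the chunk body. Returns -1 if the packet is malformed or
 * the chunk is missing. Never reads past buf + received.
 */
int hep3_find_chunk(const uint8_t *buf, size_t received, uint16_t type,
                    const uint8_t **data, size_t *len) {
    if (received < 6 || memcmp(buf, "HEP3", 4) != 0)
        return -1;

    size_t total = get_be16(buf + 4);
    if (total < 6 || total > received)
        return -1; /* claimed size exceeds what arrived */

    size_t pos = 6;
    while (pos < total) {
        if (total - pos < 6)
            return -1; /* truncated chunk header */
        uint16_t chunk_type = get_be16(buf + pos + 2);
        size_t chunk_len = get_be16(buf + pos + 4);
        if (chunk_len < 6 || chunk_len > total - pos)
            return -1; /* chunk claims more bytes than remain */
        if (chunk_type == type) {
            *data = buf + pos + 6;
            *len = chunk_len - 6;
            return 0;
        }
        pos += chunk_len;
    }
    return -1;
}
```

Only after a check like this succeeds is it safe to allocate `*len` bytes, test the allocation for NULL, and memcpy from `*data`.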

Should everything that uses malloc be using sng_malloc? Or do we not care about some things?

https://github.com/irontec/sngrep/pull/511

packet.c uses malloc but doesn't check the return values. I don't have the heart to fix that bug across this entire application though.

I have submitted a bunch of fixes for memory errors. A new oddity: when I send a bunch of corrupted HEP packets, the "Idx" column no longer seems to be consecutive. You get loads of the same number in a row. How does that happen?

Ahh, it's just that once Idx goes above 9999 it starts dropping the leading digits. Is that expected? I guess so.
