This Article Is
Created at 2024-02-12
Last Modified at 2024-03-30
Referenced as ia.www.b36

Short Read/Write and Buffered File

Recently, I was bitten by a bug in my code where a buffer was too small and reads returned prematurely without an error. This led me into a deep dive into the maddening story of what I later learned is called a “short read” on Linux.

Yes, this article is mostly me rambling. If you are only interested in short read/write, please read this seriously written article.

First, you probably already know that blocking syscalls like p?(read|write)v? may short read or short write, that is, transfer fewer bytes than requested without reporting an error. This is nothing fancy.
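For reference, the usual way to cope with this is to loop until the requested amount has arrived. The following is a minimal sketch, not code from my project; `read_full` is a hypothetical helper name, and it assumes a blocking file descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read(2) until `len` bytes have
 * arrived, EOF is reached, or a real error occurs.
 * Returns the number of bytes read (smaller than `len` only at EOF),
 * or -1 on error with errno set by read(2). */
ssize_t read_full(int fd, void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    unsigned char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);
        if (n == 0)
            break;              /* EOF: nothing more will come */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        done += (size_t)n;      /* possibly a short read, keep going */
    }
    return (ssize_t)done;
}
```

A caller that needs exactly `len` bytes compares the return value against `len` and treats anything smaller as an unexpected EOF.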

Because of this correspondence, the same operations in io_uring may also short r/w. (As it turns out, on earlier kernels, they won’t!)

This does not make much sense to me, since io_uring allows linking requests together, and you can’t link read/write in any request chain whatsoever because the kernel might short r/w! Pain.

In the article about short r/w, I read that f(read|write) won’t do short IO! To find out how it works, we return once again to musl libc. Get a copy of musl libc and follow along.

We first examine fread in musl libc.

As it turns out, buffering and using the same file handle for read and write need special consideration. __toread handles clearing out the write buffer before trying to read in anything.
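From the caller's side, this shows up as a rule in ISO C: on a stream opened for update, output must not be directly followed by input without an intervening `fflush` or positioning call such as `fseek`. The following is a small sketch of that rule in use, not musl's code; `write_then_read` is a hypothetical name, and it assumes `tmpfile()` can create a file on the system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical demo: write a message and read it back through the
 * same FILE*. Returns the number of bytes read back into `out`
 * (at most `cap`), or 0 on failure. */
size_t write_then_read(char *out, size_t cap)
{
    assert(out != NULL);
    FILE *f = tmpfile();        /* opened for update ("w+b") */
    if (f == NULL)
        return 0;

    const char msg[] = "hello";
    size_t len = strlen(msg);
    size_t got = 0;

    /* The fseek is the required switch point between writing and
     * reading; it also flushes the pending write buffer. */
    if (fwrite(msg, 1, len, f) == len && fseek(f, 0, SEEK_SET) == 0)
        got = fread(out, 1, cap, f);

    fclose(f);
    return got;
}
```

Leaving out the `fseek` (or an `fflush`) between the two calls is undefined behavior as far as the standard is concerned, which is exactly the kind of bookkeeping the libc has to do internally when a stream changes direction.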

Next, we turn our eyes to fwrite. Besides the oddly specific logic where '\n' determines when the syscall is actually issued, we also find __towrite, in which the author documents their experience with summoning nasal demons.

Scary, the whole experience is.

Update at 2024-03-30:
I now know that atomic write exists on Linux, where the kernel will ensure short write does not happen often. So I guess I can chain write requests together in io_uring and expect it to work? If it fails (writes less than expected), revert the whole operation with seek.