Dirty Frag: page-cache corruption in Linux

How a shared page cache turns a local bug into file corruption

Linux keeps file-backed pages in a shared page cache so the same data can be reused across processes. That is normal and useful until kernel code writes into a page it should have treated as read-only content. Dirty Frag crosses that boundary and makes in-place kernel work happen on shared cache pages.

For an attacker, the useful part is that the overwrite lands in RAM without going through a visible file write or reaching disk. The file on storage can still be intact while reads pull the modified bytes from memory. That makes the damage awkward to spot and gives the attacker a clean way to shape the next stage of the exploit.

splice() matters here because it can place references to page cache pages into pipe and socket buffers without copying the data out first. When that path is combined with zero-copy frags, kernel crypto ends up operating on the same page-cache-backed memory that later gets read back as if nothing happened. It is the kind of mistake that looks small in a diff and rude in production.

splice() and zero-copy frags put attacker data next to kernel writes

The first step is getting attacker-controlled data into the right place without forcing a copy. Zero-copy send paths make that possible by keeping references to page cache pages in frag slots. From there, kernel code that expects private scratch space can end up writing in place.

That is the part that matters operationally. The attacker does not need to replace a whole file in one go, only to place bytes where kernel processing will touch them. Two small writes are enough when they land at the right offsets.

The overwrite lands in RAM first, then shows up in later reads

The page cache sits in memory, so corruption can show up before any on-disk evidence. A later read from the same file can return the modified bytes even though the file itself was not edited in the usual way. That makes the corruption persistent enough for an exploit chain and annoying enough for defenders.

The practical consequence is simple: checks that only compare file contents on disk can miss the issue. If the cache is already poisoned, the system can keep serving the bad data until the page is evicted or the machine is rebooted.
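
One way to see the gap between memory and storage is to hash the same file twice: once through a normal read, which the page cache can serve, and once through an O_DIRECT read, which asks the kernel to go to storage. The sketch below assumes GNU dd and sha256sum. compare_cache_vs_disk is a hypothetical helper name, and treating a mismatch as a sign of cache corruption is an assumption, not a documented Dirty Frag detection method. A file that is being written legitimately at that moment can also differ.

```shell
#!/bin/sh
# Hypothetical helper: compare a page-cache read with an O_DIRECT read.
# Prints "match", "MISMATCH", or "unsupported"; returns 0, 1, or 2.
compare_cache_vs_disk() {
    file=$1

    # Probe first: this fails if the file is unreadable or if the
    # filesystem rejects O_DIRECT (some do).
    if ! dd if="$file" iflag=direct bs=4096 count=1 of=/dev/null 2>/dev/null; then
        echo "unsupported"
        return 2
    fi

    # Buffered read: served from the page cache when the pages are resident.
    cached=$(sha256sum < "$file" | cut -d' ' -f1)

    # Direct read: asks the kernel to bypass the page cache.
    direct=$(dd if="$file" iflag=direct bs=4096 2>/dev/null | sha256sum | cut -d' ' -f1)

    if [ "$cached" = "$direct" ]; then
        echo "match"
    else
        echo "MISMATCH"
        return 1
    fi
}
```

Something like compare_cache_vs_disk /etc/passwd is the intended use: point it at files a privileged process will read, and treat "unsupported" as no answer rather than a clean result.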

Why the exploit chain reaches root on real systems

Dirty Frag is not a single neat bug. It is a chain. One path gives an in-place write primitive in the xfrm ESP decryption fast paths, and another fills the gap with a second write route through RxRPC and AF_ALG. Together they give enough control to turn cache corruption into root.

esp4 and esp6 give the first in-place write primitive

The xfrm ESP code paths in esp4 and esp6 are the first useful piece. They decrypt traffic in place, which is fine when the buffers are private and brittle when the buffers point at shared page-cache-backed pages. Once the crypto path writes back into those pages, the attacker has a controlled corruption primitive.

This is not a broad smash-everything bug. It is narrower and more annoying than that. The write is small, but small writes are plenty when the target is a structure, a pointer, or a byte sequence that later changes control flow.

RxRPC and AF_ALG fill in the second path and make the result usable

RxRPC gives the second path needed to turn the initial corruption into something practical. AF_ALG, via algif_aead, can also be used in the chain through Copy Fail-style behaviour. The point is not the name of the module; it is the ability to get another in-place write where the cache still matters.

That second route makes the result usable on real systems. A lone four-byte write is awkward. Two separate controlled writes, each landing in a predictable place, are enough to shape the file state that later gets read by a privileged process.

Why the bug survives across major distributions and older kernels

This chain is not tied to one odd distribution build. It reaches kernels going back to about 2017, which is old enough to be embarrassing and recent enough to be common. That gives it a wide blast radius across mainstream Linux installs, including systems that have been quietly left to age in place.

The exploit also arrived with public code because the disclosure embargo broke. That removes the usual delay where defenders get a neat window to patch before the details spread. In practice, that means exposed systems can be probed quickly once the issue is known, and CVE tracking alone can lag the real risk if identifiers are not assigned yet or do not match local naming.

What to disable, patch, and watch in practice

The blunt mitigation is to unload and denylist esp4, esp6, and rxrpc if the host can live without them. That removes the vulnerable paths, but it also breaks useful things. Disabling esp4 and esp6 will disrupt IPsec ESP traffic and can take VPN tunnels with it. Unloading rxrpc can break services that rely on it, including AFS.

On a live system, the command is straightforward. It needs root, and modprobe -r refuses to unload a module that is still in use:

modprobe -r esp4 esp6 rxrpc
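
To see which of the three modules are loaded, before or after that command, the module list can be read directly. This is a minimal sketch, assuming the standard /proc/modules layout with the module name in the first field. loaded_targets is a hypothetical helper, and the optional file argument exists only so it can be pointed at a saved copy. Code built into the kernel rather than as a module will not appear here.

```shell
#!/bin/sh
# Hypothetical helper: print each of esp4, esp6, rxrpc that appears
# in a module list (default /proc/modules), one name per line.
loaded_targets() {
    modules_file=${1:-/proc/modules}
    for mod in esp4 esp6 rxrpc; do
        # Match on the first field only, so names such as
        # "rxrpc_extra" are not counted as "rxrpc".
        if awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$modules_file"; then
            echo "$mod"
        fi
    done
}
```

Empty output from loaded_targets means none of the three is currently loaded as a module. It does not mean they cannot be loaded again later.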

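If the modules may come back later, denylist them through modprobe configuration, using the blacklist directive that modprobe.d expects. A blacklist entry only stops alias-based autoloading; an explicit modprobe of the module name still works.

echo 'blacklist esp4' >> /etc/modprobe.d/dirtyfrag-mitigation.conf
echo 'blacklist esp6' >> /etc/modprobe.d/dirtyfrag-mitigation.conf
echo 'blacklist rxrpc' >> /etc/modprobe.d/dirtyfrag-mitigation.conf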
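
It is worth confirming that the configuration says what was intended. modprobe.d recognizes two directives relevant here: blacklist, and install, which can point a module at /bin/false so that even an explicit load fails. The sketch below checks one directory for either form. denylist_status is a hypothetical helper, and it assumes all relevant entries live in *.conf files in that single directory, which is not guaranteed because modprobe reads several configuration directories.

```shell
#!/bin/sh
# Hypothetical helper: report whether each module has a blocking entry
# in a modprobe configuration directory (default /etc/modprobe.d).
denylist_status() {
    conf_dir=${1:-/etc/modprobe.d}
    for mod in esp4 esp6 rxrpc; do
        # Accept "blacklist MOD" or "install MOD /bin/false" (or /bin/true).
        # Commented-out lines do not match.
        pattern="^[[:space:]]*(blacklist[[:space:]]+$mod[[:space:]]*\$|install[[:space:]]+$mod[[:space:]]+/bin/(false|true)([[:space:]]|\$))"
        if grep -Eqs "$pattern" "$conf_dir"/*.conf; then
            echo "$mod: denylisted"
        else
            echo "$mod: not denylisted"
        fi
    done
}
```

Run with no arguments it inspects /etc/modprobe.d. Any "not denylisted" line means that module can still be autoloaded on demand.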

Patches matter more than workarounds once they are available. Live patching products such as KernelCare have been used for this class of fix, and patched kernels have also been published through testing repositories on some distributions. If a host carries container workloads, treat host kernel exposure as the real issue. A container does not stop a local kernel exploit from reaching the base system if the vulnerable modules are present.

Watch for short, odd file changes that do not line up with normal edit activity. Cache corruption often shows up as a file read that returns the wrong bytes, not as a tidy audit trail. If the system is meant to be exposed to untrusted local users, the safe position is to remove the vulnerable paths, install the patched kernel, and keep the modules unloaded until the patched kernel is actually running.
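
Whether the new kernel is actually running is easy to get wrong after an install without a reboot. The sketch below is a rough heuristic only. It assumes each installed kernel has a directory under /lib/modules and that the highest version string is the patched one, and neither is guaranteed. The function names are hypothetical. Live patching typically leaves uname -r unchanged, so this says nothing about livepatched hosts.

```shell
#!/bin/sh
# Hypothetical helper: print the highest kernel version that has a
# directory under MODULES_DIR (default /lib/modules).
newest_installed_kernel() {
    ls -1 "${1:-/lib/modules}" | sort -V | tail -n 1
}

# Hypothetical helper: succeed (exit 0) when the running release differs
# from the newest installed one, which suggests a reboot is still pending.
reboot_pending() {
    running=$1
    newest=$(newest_installed_kernel "${2:-/lib/modules}")
    [ -n "$newest" ] && [ "$running" != "$newest" ]
}
```

A typical call is reboot_pending "$(uname -r)" && echo "newer kernel installed but not running". Until that stops printing, the denylist should stay in place.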
