CopyFail (CVE-2026-31431): Linux Kernel Local Privilege Escalation Analysis, Affected Versions, and Fix Guide

CopyFail (CVE-2026-31431) is a local privilege escalation vulnerability that affects mainstream Linux distributions. The core issue comes from the interaction between AF_ALG, splice, and authencesn, allowing attackers to tamper with the page cache to achieve privilege escalation and potentially escape containers. Keywords: Linux kernel, privilege escalation vulnerability, page cache.

The technical specification snapshot summarizes the vulnerability at a glance

| Parameter | Details |
|---|---|
| Vulnerability Name | CopyFail |
| CVE | CVE-2026-31431 |
| Type | Local Privilege Escalation (LPE), logic flaw |
| Affected Platforms | Ubuntu, Debian, RHEL, SUSE, Amazon Linux, and others |
| Affected Kernels | Linux 4.14 through 6.18.21, 6.19.x through 6.19.11, and 7.0-rc1 |
| Exploitation Prerequisite | Unprivileged local user access |
| Primary Interfaces | AF_ALG, splice(), AEAD |
| Core Dependencies | algif_aead, authencesn |
| CVSS | 7.8 (High) |

This kernel vulnerability emerges from the combination of three “reasonable features”

The danger of CopyFail does not come from a single memory corruption primitive. It comes from a semantic mismatch across multiple subsystems. The bug is not a traditional race condition and does not rely on complex heap shaping. Instead, it exploits broken boundary assumptions between zero-copy page cache references and AEAD output buffers.

Compared with Dirty COW and Dirty Pipe, CopyFail has a lower exploitation barrier. Attackers do not need to win a timing race, adapt to many kernel variants, or persist changes to disk. With only ordinary local user privileges, they may be able to poison a setuid program in memory at the page cache layer.

The impact scope can be summarized as long-lived, broadly distributed, and easy to exploit

The flaw was introduced in 2017 by a performance optimization and persisted for nearly nine years across Linux mainline development and downstream distributions. Its key risk is that the modification happens only in the page cache while the file on disk remains unchanged. As a result, integrity checks based on on-disk hashes may completely miss the compromise.

Input SGL:  AAD | Ciphertext | Tag (target file page cache)
                    | copied |      | sg_chain linked in
Output SGL: AAD | Ciphertext |------| target file page cache
                    | user receive buffer | writable tail corruption point

This diagram shows that the tail of the output scatterlist includes references to the file page cache, which creates the condition for a later out-of-bounds write.

The trigger chain is built on the combination of AF_ALG and splice

AF_ALG is the Linux kernel crypto interface exposed to user space. Even unprivileged users can create the corresponding socket and invoke kernel algorithms. The splice() system call provides zero-copy transfer, allowing file page cache pages to enter the kernel data path by reference instead of being copied into a private buffer.

When a target file is passed into an AF_ALG socket through splice, the kernel may directly hold references to that file’s page cache pages in a scatterlist. Up to this point, the behavior still reflects a high-performance design. The real problem appears later, when the AEAD in-place optimization conflicts with assumptions made by the algorithm implementation.
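From a defensive angle, it can be useful to know whether a given context (a login shell, a service account, a container) can reach this interface at all. The following is a minimal sketch, assuming Python's standard `socket` module on Linux; the function name `aead_reachable` is hypothetical. Note that the bind attempt may cause the kernel to auto-load the corresponding modules if they are not blocked.

```python
import socket


def aead_reachable(alg_name: str = "authencesn(hmac(sha256),cbc(aes))") -> bool:
    """Return True if this process can bind an AF_ALG AEAD socket to alg_name."""
    family = getattr(socket, "AF_ALG", None)
    if family is None:
        return False  # platform or Python build without AF_ALG support
    try:
        with socket.socket(family, socket.SOCK_SEQPACKET, 0) as sock:
            sock.bind(("aead", alg_name))
            return True
    except OSError:
        # Typical causes: AF_ALG is unsupported or filtered (seccomp, LSM),
        # or the algorithm or its module is unavailable.
        return False
```

A result of `False` only says that this particular process could not bind the algorithm. It is not proof that the kernel is patched.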

The 2017 in-place optimization changed the trusted boundary of the output buffer

In the original out-of-place mode, input and output were separated, and output was written into a user-private buffer. After the optimization, src and dst pointed to the same scatterlist to avoid one data copy. However, this also allowed what should have remained a read-only input reference to indirectly enter the writable output path.

In particular, when processing the authentication tag, the kernel did not copy the corresponding data. Instead, it used sg_chain() to append the tag entry from the input directly to the tail of the output list. As a result, the first part of the output belonged to the user buffer, but the second part could already point to the target file page cache.

/* The core issue is not syntax, but that the output boundary is being reused */
sg_chain(dst_tail, 2, tag_sg);   // Link the input Tag page directly to the tail of the output SGL
req->src = req->dst;             // Enable in-place mode so input and output share the same data chain

This pseudocode expresses the root cause of the flaw: the output chain no longer belongs entirely to the caller’s private memory.
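The violated invariant can be stated as "every segment in the output chain is owned by the caller". The toy model below is not kernel code. `Segment`, `build_output`, and `foreign_segments` are hypothetical names, and Python bytearrays stand in for pages. It only illustrates the difference between copying the tag and chaining it by reference.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    owner: str        # who backs this memory, e.g. "user" or "page_cache"
    data: bytearray   # the bytes the segment refers to (shared, not copied)


def build_output(user_buf, tag_page, in_place):
    """Toy model of the two modes described above."""
    head = Segment("user", user_buf)
    if in_place:
        # Chaining links the tag segment by reference, as sg_chain() does.
        return [head, Segment("page_cache", tag_page)]
    # Out-of-place: the tag is copied into caller-owned memory first.
    return [head, Segment("user", bytearray(tag_page))]


def foreign_segments(dst_chain, caller="user"):
    """Return the segments in an output chain that the caller does not own."""
    return [seg for seg in dst_chain if seg.owner != caller]
```

In this model, the in-place chain reports one foreign segment that aliases the original page, while the out-of-place chain reports none.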

The scratch write in authencesn turns a questionable design into an exploitable vulnerability

authencesn is an AEAD implementation designed for IPsec ESN scenarios. For a long time, it assumed that the caller always provided an output buffer large enough for its needs, so it used the tail of the output as temporary scratch space. This historical behavior remained harmless inside the kernel, but once AF_ALG exposed it to user space, it became a dangerous assumption.

The critical step is a 4-byte write at offset assoclen + cryptlen, immediately past the associated data and ciphertext. If the tail of the output has already been chained to the target file page cache at that point, those 4 bytes do not land in temporary space. Instead, they directly overwrite the file page cache contents.

scatterwalk_map_and_copy(tmp, dst, 0, 8, 0);                  // Read the first 8 bytes of AAD
scatterwalk_map_and_copy(tmp, dst, 4, 4, 1);                  // Rearrange ESN-related bytes
scatterwalk_map_and_copy(tmp + 1, dst, assoclen + cryptlen, 4, 1); // Write 4 bytes past the output tail

This logic shows the origin of the write primitive used during exploitation: fixed length, reliably triggered, and independent of a valid authentication tag.
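The effect of that offset arithmetic can be shown with a small simulation. This is a sketch under simplifying assumptions: a chain is a plain list of bytearrays, and `write_at` and `toy_decrypt` are hypothetical helpers, not kernel functions. It models only the ordering described here, with the scratch write happening before the tag check fails.

```python
def write_at(chain, offset, payload):
    """Write payload into a list of bytearrays treated as one flat buffer."""
    for seg in chain:
        if offset >= len(seg):
            offset -= len(seg)       # skip segments before the target offset
            continue
        n = min(len(seg) - offset, len(payload))
        seg[offset:offset + n] = payload[:n]
        payload, offset = payload[n:], 0
        if not payload:
            return
    raise IndexError("write runs past the end of the chain")


def toy_decrypt(chain, assoclen, cryptlen, scratch):
    """Model the ordering: scratch write first, tag verification afterwards."""
    write_at(chain, assoclen + cryptlen, scratch)  # 4-byte scratch write
    raise ValueError("authentication failed")      # the write is not undone
```

With a 24-byte user buffer (8 bytes of AAD plus 16 bytes of ciphertext) followed by a chained page, the write at offset 24 lands entirely in the chained page, and the exception raised afterwards does not restore it.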

Authentication failure does not roll back page cache corruption

Worse, even if the final tag verification fails and the decryption path returns an error, the earlier 4-byte write has already occurred. An attacker does not need to construct valid decryptable ciphertext. They only need to drive the kernel far enough along this path to complete page-cache-level tampering.

This gives CopyFail strong real-world value. By repeatedly writing in 4-byte chunks, an attacker can inject shellcode or jump fragments into the page cache of a target setuid program and then gain privileges when that program executes.

The exploitation path is essentially a 4-byte arbitrary page cache write assembled into a privilege escalation chain

Public materials indicate that attackers often choose setuid-root programs such as /usr/bin/su as the corruption target. The process includes creating an AF_ALG socket, binding authencesn(hmac(sha256),cbc(aes)), sending crafted AAD, splicing a target file segment at a chosen offset, and then triggering the decryption path through recv.

Each trigger writes 4 bytes. After enough iterations, the attacker can assemble a full payload inside the page cache. Once the corrupted target program is executed, the kernel will preferentially load code from the page cache, which then activates the privilege escalation logic.

import os, socket

def trigger(fd, off, chunk):
    alg = socket.socket(38, 5, 0)
    alg.bind(("aead", "authencesn(hmac(sha256),cbc(aes))"))  # Bind the vulnerable AEAD algorithm
    cli, _ = alg.accept()
    r, w = os.pipe()
    os.splice(fd, w, off + 4, offset_src=0)                   # Send the target file page cache into the pipe
    os.splice(r, cli.fileno(), off + 4)                       # Then zero-copy it into the ALG socket
    try:
        cli.recv(8 + off)                                     # Trigger kernel decryption and the out-of-bounds write
    except OSError:
        pass

This minimized pseudocode is provided only to illustrate the exploitation chain structure. It does not include directly runnable attack details.
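Because setuid programs are the described targets, defenders may want an inventory of them to decide what to monitor. The following is a minimal sketch using only the Python standard library; `setuid_files` and its `owner_uid` parameter are hypothetical names introduced for illustration.

```python
import os
import stat


def setuid_files(root, owner_uid=0):
    """Yield regular files under root with the setuid bit set.

    owner_uid=0 restricts the result to setuid-root files;
    pass None to list setuid files regardless of owner.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # unreadable or vanished entry; skip it
            if not stat.S_ISREG(st.st_mode) or not st.st_mode & stat.S_ISUID:
                continue
            if owner_uid is None or st.st_uid == owner_uid:
                yield path
```

For example, `sorted(setuid_files("/usr/bin"))` lists the setuid-root binaries in that directory.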

The remediation strategy is already clear and should take priority over routine version upgrades

The official fix strategy is straightforward: revert the in-place optimization introduced in 2017 so that algif_aead returns to out-of-place mode, preventing file page cache references from entering a writable output scatterlist.

Prioritize upgrades to the following fixed versions: Linux 6.18.22+, 6.19.12+, 7.0+, or use a distribution-provided backport patch. After upgrading, you must reboot. Otherwise, the old kernel will still be running.
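As a rough triage aid, the version boundaries quoted in this article can be encoded in a small check. This is a sketch, not an authoritative detector: `is_affected` is a hypothetical helper, it knows only the ranges stated here, and it cannot see distribution backports, so a vendor kernel with an older version string may already be patched.

```python
import re

# First fixed patch level per series, as stated in this article.
FIXED_PATCH = {(6, 18): 22, (6, 19): 12}


def is_affected(release: str) -> bool:
    """Classify a `uname -r` style string using the ranges quoted above."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    series = (int(m.group(1)), int(m.group(2)))
    patch = int(m.group(3) or 0)
    if series < (4, 14):
        return False                      # predates the listed range
    if series < (6, 18):
        return True                       # inside 4.14 through 6.18.21
    if series in FIXED_PATCH:
        return patch < FIXED_PATCH[series]
    if series == (7, 0):
        # Only 7.0-rc1 is listed as affected; 7.0 final is listed as fixed.
        return re.search(r"-rc1(?!\d)", release) is not None
    return False
```

Treat a `True` result as "check the vendor advisory", not as confirmation of exposure.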

If you cannot upgrade immediately, you can first reduce exposure by disabling algif_aead

If your production environment cannot be rebooted immediately for an upgrade, you can temporarily disable the algif_aead module to reduce the exposed attack surface. This measure may affect a small number of applications that depend on AF_ALG AEAD, so validate the impact on business workloads first.

uname -r                                   # Confirm the currently running kernel version
lsmod | grep algif_aead                    # Check whether the vulnerable module is loaded
echo "install algif_aead /bin/false" | sudo tee /etc/modprobe.d/disable-algif-aead.conf   # Block future loads of the module
sudo rmmod algif_aead 2>/dev/null          # Temporarily unload the module to block common exploitation paths
sudo apt update && sudo apt upgrade -y && sudo reboot   # Debian/Ubuntu example; use your distribution's package manager elsewhere

These commands help verify the environment, apply a temporary mitigation, and complete the formal upgrade.
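To confirm the mitigation across many hosts, the same checks can be scripted. The sketch below assumes the usual `/proc/modules` format (module name as the first field) and a modprobe rule of the form shown above; `module_loaded` and `install_blocked` are hypothetical helpers that operate on text so they can be tested without root.

```python
def _norm(name: str) -> str:
    # modprobe treats "-" and "_" in module names as equivalent.
    return name.replace("-", "_")


def module_loaded(proc_modules_text: str, module: str = "algif_aead") -> bool:
    """True if module is listed as loaded in /proc/modules content."""
    return any(line.split()[:1] == [_norm(module)]
               for line in proc_modules_text.splitlines())


def install_blocked(modprobe_conf_text: str, module: str = "algif_aead") -> bool:
    """True if the text contains an `install <module> /bin/false` style rule."""
    for line in modprobe_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()   # drop trailing comments
        if (len(fields) >= 3 and fields[0] == "install"
                and _norm(fields[1]) == _norm(module)
                and fields[2] in ("/bin/false", "/bin/true")):
            return True
    return False
```

On a live system, the inputs would come from reading `/proc/modules` and the file written by the `tee` command above.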

The threat to cloud-native and container environments deserves special attention

Because the page cache is a globally shared resource, an unprivileged user inside a container may theoretically be able to poison setuid programs on the host through the shared host page cache, provided the exploitation conditions are met. That means the impact is not limited to single-host privilege escalation and may extend to container escape scenarios.

For that reason, you should not evaluate remediation priority based on the CVSS 7.8 score alone. You should also weigh four practical factors together: a public proof of concept, a low exploitation barrier, common availability of local user access, and spillover risk in containerized environments. In multi-tenant nodes, the real-world risk of this class of bug often exceeds what the score alone suggests.

The FAQ provides structured answers to common operational questions

Q1: Why is CopyFail easier to exploit than Dirty COW?

A1: Because it does not depend on a race window. Its write primitive is stable, the exploitation steps are fixed, and public exploitation chains require less version-specific adaptation, which increases reliability.

Q2: Why might file integrity tools fail to detect the issue?

A2: Because the corruption occurs in the page cache rather than in the on-disk file itself. Hash checks based on the disk image may still pass, even though the runtime execution path uses the poisoned cached contents.
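One way to look for this kind of divergence is to hash a file twice: once through ordinary buffered reads, which are served from the page cache, and once with `O_DIRECT`, which is intended to bypass it. The sketch below rests on assumptions: the filesystem supports `O_DIRECT`, the tampered pages are not written back to disk (as this article describes), and the function names are hypothetical. It is an illustration of the idea, not a validated detection tool.

```python
import hashlib
import mmap
import os


def sha256_buffered(path):
    """Hash the file through ordinary reads (served by the page cache)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def sha256_direct(path, chunk_size=1 << 20):
    """Hash the file with O_DIRECT reads (Linux only; may raise OSError)."""
    h = hashlib.sha256()
    buf = mmap.mmap(-1, chunk_size)      # anonymous mapping: page-aligned buffer
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        while True:
            n = os.readv(fd, [buf])
            h.update(buf[:n])
            if n < chunk_size:           # short read: end of file reached
                break
    finally:
        os.close(fd)
        buf.close()
    return h.hexdigest()


def cache_diverges(path):
    """True if the cached view and the direct-read view hash differently."""
    return sha256_buffered(path) != sha256_direct(path)
```

A mismatch on a file that nothing is legitimately writing, such as a setuid binary, would be a reason to investigate. A match does not prove the cache was never tampered with, because pages can be evicted and reloaded from disk.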

Q3: Will temporarily disabling algif_aead affect system-wide encryption capabilities?

A3: In most cases, it will not affect standard TLS, disk encryption, or SSH. However, it may affect specific applications that rely on the AF_ALG AEAD interface, so validate the change in a test environment before deployment.

The core summary reconstructs the vulnerability chain and the defense priorities

CopyFail (CVE-2026-31431) is a high-severity Linux kernel local privilege escalation vulnerability caused by the interaction of AF_ALG, splice, an in-place AEAD optimization, and an out-of-bounds scratch write in authencesn. This article breaks down the affected versions, exploitation chain, risk boundaries, and recommended remediation steps.

The exploit path is best understood as a trust-boundary failure: zero-copy page cache references flow into a writable AEAD output scatterlist, and a fixed 4-byte scratch write turns that design flaw into practical page cache corruption. From there, attackers can assemble a privilege escalation chain against setuid binaries or, in some environments, pursue container escape.