Severity: critical | 5 min read

How a buffer overflow happens in the C xxd utility and how to fix it

A critical buffer overflow vulnerability was discovered in the xxd utility's `xxdline()` function where `strcpy()` was used without bounds checking on file input. An attacker could craft a malicious hex dump file with oversized lines to trigger memory corruption. The fix replaces the unsafe `strcpy()` with `snprintf()` to enforce buffer size limits.

By Orbis AppSec
Published June 14, 2026 | Reviewed June 14, 2026

Answer Summary

This is a buffer overflow vulnerability (CWE-120) in C code within the xxd hex dump utility. The vulnerable `strcpy(z, l)` call at line 576 of `xxd.c` copied user-controlled file input without checking buffer bounds. The fix replaces `strcpy()` with `snprintf(z, sizeof(z), "%s", l)` to enforce the destination buffer's size limit, preventing memory corruption from oversized input lines.

Vulnerability at a Glance

CWE: CWE-120
Fix: Replace strcpy() with snprintf() using sizeof(z) as the size limit
Risk: Remote code execution or denial of service via crafted input file
Language: C
Root cause: strcpy() used without bounds checking on file-derived input
Vulnerability: Buffer Overflow (Classic)

Introduction

In the xxd hex dump utility, we discovered a critical buffer overflow vulnerability in src/xxd/xxd.c at line 576. The xxdline() function, which processes hex dump file input for the revert mode (xxd -r), used strcpy(z, l) to copy line data without any bounds checking. Since the source buffer l is populated directly from user-provided file input, an attacker could craft a hex dump file with lines exceeding the destination buffer z's size, triggering a buffer overflow.

This vulnerability is particularly dangerous because xxd is a command-line utility that processes arbitrary user-provided files, making the attack vector directly accessible to anyone who can supply input to the tool.

The Vulnerability Explained

The vulnerable code resided in the xxdline() function, which handles line-by-line processing of hex dump files during revert operations. Here's the problematic code:

static void xxdline(FILE *fp, char *l, char *colors, int nz)
{
  static signed char zero_seen = 0;

  if (!nz && zero_seen == 1) {
    strcpy(z, l);  // VULNERABLE: No bounds checking!
    if (colors) {
      memcpy(z_colors, colors, strlen(z));
    }

The issue is straightforward but severe:

  1. l is file-derived input: The buffer l contains data read from a hex dump file provided by the user
  2. z is a fixed-size buffer: The destination buffer z has a predetermined size
  3. strcpy() has no length limit: This function copies bytes until it encounters a null terminator, regardless of the destination buffer's capacity
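
To make the missing check concrete, here is a minimal sketch of a copy that compares the source length against the destination capacity before writing anything. The helper name checked_copy and its reject-on-overflow policy are assumptions for illustration; this is not code from xxd or from the PR.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of xxd): copy src into dst only if the
 * whole string, including its terminating NUL, fits in dst_size bytes.
 * Returns true on success. On failure dst is set to the empty string. */
static bool checked_copy(char *dst, size_t dst_size, const char *src)
{
  if (dst_size == 0) {
    return false;               /* no room even for the terminator */
  }

  size_t len = strlen(src);     /* number of bytes before the NUL */
  if (len >= dst_size) {        /* the copy would need len + 1 bytes */
    dst[0] = '\0';
    return false;
  }

  memcpy(dst, src, len + 1);    /* copy the string together with its NUL */
  return true;
}
```

A caller would pass the real capacity of the destination, for example checked_copy(buf, sizeof(buf), line) when buf is declared as an array in the same scope. Rejecting oversized input is one possible policy; truncating it is another.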

Attack Scenario

An attacker could exploit this vulnerability with these steps:

  1. Create a malicious hex dump file containing a line long enough to exceed z's capacity
  2. Run xxd -r malicious_file.hex > output.bin
  3. When xxdline() processes the oversized line, strcpy(z, l) writes beyond z's boundary
  4. This corrupts memory adjacent to z, which may include control data such as a saved return address, depending on where z is stored
  5. With a carefully crafted payload, the attacker may be able to achieve arbitrary code execution; at minimum, the process is likely to crash

A simple proof-of-concept payload would be a hex dump file containing a single line with 512+ hex characters:

444444444444444444444444444444444444444444444444... (512+ '4' characters)

The regression test in the PR demonstrates this exact attack vector:

/* Generate oversized hex payloads */
char payload_256[513];
memset(payload_256, '4', 512);
payload_256[512] = '\0';

The Fix

The fix follows C security best practices by replacing the unbounded strcpy() with the bounds-checked snprintf():

Before (Vulnerable)

strcpy(z, l);

After (Fixed)

snprintf(z, sizeof(z), "%s", l);

This change provides several security guarantees:

  1. sizeof(z) enforces the buffer limit: The second argument explicitly tells snprintf() the maximum number of bytes to write
  2. Automatic null-termination: Unlike strncpy(), snprintf() always null-terminates the output (if size > 0)
  3. Truncation over corruption: If l exceeds the buffer size, the data is truncated rather than overflowing
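
Truncation is silent unless the caller inspects the return value of snprintf(), which reports how many characters the full string would have needed. The sketch below shows one way to surface that; the wrapper name bounded_copy and its return codes are hypothetical and do not appear in the PR.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper (not part of the PR): bounded copy that reports
 * whether the source had to be truncated.
 * Returns 0 if src fit, 1 if it was truncated, -1 on an output error. */
static int bounded_copy(char *dst, size_t dst_size, const char *src)
{
  /* snprintf writes at most dst_size bytes, including the NUL, and
   * returns the length the untruncated result would have had. */
  int needed = snprintf(dst, dst_size, "%s", src);

  if (needed < 0) {
    return -1;
  }
  return ((size_t)needed >= dst_size) ? 1 : 0;
}
```

With a 4-byte destination and the source "abcdef", the destination ends up holding "abc" and the wrapper returns 1, so the caller can log or reject the oversized line instead of continuing with partial data.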

The PR also added a functional test to prevent regression:

it('handles long lines in revert mode', function()
  t.skip(t.is_arch('s390x'), 'FIXME: xxd not built correctly on s390x with QEMU?')
  local long_line = ('4'):rep(512) .. '\n'
  fn.system({ testprg('xxd'), '-r' }, long_line)
  eq(0, eval('v:shell_error'))
end)

This test checks that xxd exits with status 0 when given a 512-character line in revert mode, which verifies that the oversized input no longer crashes the tool.

Note on Additional Vulnerable Locations

The PR notes that line 1115 in the same file uses a similar pattern and may need review. This highlights an important principle: when fixing one instance of an unsafe pattern, always search for similar patterns throughout the codebase.

Prevention & Best Practices

Immediate Actions

  1. Ban unsafe string functions: Configure your compiler and linters to warn on strcpy(), strcat(), sprintf(), and gets()
  2. Use bounds-checked alternatives:
    - strcpy() → snprintf() or strlcpy()
    - strcat() → strncat() or strlcat()
    - sprintf() → snprintf()
    - gets() → fgets()
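
The replacements above only help when they are given the right size. As a rough sketch, the two helpers below show the usual sizing for strncat() and fgets(); the names bounded_append and read_line are made up for this example, and strlcpy()/strlcat() are left out because they are not available on every platform.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append src to the NUL-terminated string in dst
 * without writing past dst_size bytes. Excess input is dropped.
 * Precondition: dst already holds a terminated string inside dst_size. */
static void bounded_append(char *dst, size_t dst_size, const char *src)
{
  size_t used = strlen(dst);

  if (used + 1 >= dst_size) {
    return;                     /* buffer is already full */
  }
  /* strncat's count is the maximum number of characters to append,
   * not the total buffer size, and it always adds a NUL afterwards. */
  strncat(dst, src, dst_size - used - 1);
}

/* Hypothetical helper: read one line with fgets() instead of gets().
 * Returns 1 if a line was read, 0 on end of file or error.
 * Assumes buf_size is small enough to fit in an int. */
static int read_line(FILE *fp, char *buf, size_t buf_size)
{
  if (buf_size == 0 || fgets(buf, (int)buf_size, fp) == NULL) {
    return 0;
  }
  buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline, if any */
  return 1;
}
```

Note that fgets() stops at the buffer limit, so a line longer than the buffer arrives in several pieces; the caller has to decide whether to join or reject them.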

Compiler Flags

Enable stack protection and buffer overflow detection:

gcc -fstack-protector-strong -D_FORTIFY_SOURCE=2 -O2 source.c

Static Analysis

Use tools that detect buffer overflow patterns:

  • Semgrep: Rules for detecting unsafe C string functions
  • Clang Static Analyzer: Built-in checks for buffer overflows
  • Coverity: Commercial tool with deep buffer analysis

Code Review Checklist

When reviewing C code that handles external input:

  • [ ] Are all string copies bounds-checked?
  • [ ] Is sizeof() used correctly (not on pointers)?
  • [ ] Are buffer sizes validated before use?
  • [ ] Is input length checked before processing?
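
The sizeof() item in the checklist is easy to get wrong: once an array is passed to a function it is seen as a pointer, and sizeof() then yields the pointer size rather than the buffer capacity. The sketch below illustrates the usual workaround of passing the capacity explicitly; ARRAY_CAPACITY and copy_into are hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Element count of a true array. Only valid where the array declaration
 * is in scope; applied to a pointer it silently gives the wrong answer. */
#define ARRAY_CAPACITY(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: inside this function dst is a char pointer, so
 * sizeof(dst) would be the size of a pointer. The caller must therefore
 * pass the real capacity as dst_size.
 * Returns snprintf's result: the length the full string would need. */
static int copy_into(char *dst, size_t dst_size, const char *src)
{
  return snprintf(dst, dst_size, "%s", src);
}
```

A call site that owns the array can write copy_into(line, ARRAY_CAPACITY(line), input). The snprintf(z, sizeof(z), "%s", l) fix relies on the same condition: sizeof(z) is only the buffer size if z is declared as an array where the call is made.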

Key Takeaways

  • Never use strcpy() with file-derived input: The xxdline() function processed user-provided hex dump files, making strcpy(z, l) a direct attack vector
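  • snprintf() is a safe replacement for strcpy() in C: It enforces the size limit and null-terminates the output whenever the size is nonzero, though it truncates silently unless the return value is checked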
  • Command-line utilities processing user files are high-risk: xxd's -r mode reads arbitrary files, requiring defensive coding throughout
  • Search for pattern siblings: The PR notes line 1115 may have the same issue—one vulnerability often indicates more
  • Regression tests prevent security fixes from being undone: The Lua test with 512-character input ensures this specific attack vector stays closed

How Orbis AppSec Detected This

  • Source: File input processed by the xxd utility's revert mode (xxd -r), specifically the line buffer l populated from user-provided hex dump files
  • Sink: strcpy(z, l) at src/xxd/xxd.c:576 in the xxdline() function
  • Missing control: No bounds checking between the file-derived input length and the destination buffer z's capacity
  • CWE: CWE-120 (Buffer Copy without Checking Size of Input)
  • Fix: Replaced strcpy(z, l) with snprintf(z, sizeof(z), "%s", l) to enforce the buffer size limit

Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.

Conclusion

Buffer overflows remain one of the most dangerous vulnerability classes in C programming, and this xxd vulnerability demonstrates why. A single strcpy() call without bounds checking created a critical security flaw that could be exploited through a crafted input file. The fix—replacing strcpy() with snprintf()—is simple but essential.

When working with C code that processes external input, always assume the input is malicious. Use bounds-checked functions, enable compiler protections, and implement regression tests for security fixes. The few extra characters of code for snprintf(z, sizeof(z), "%s", l) versus strcpy(z, l) are the difference between secure software and a critical vulnerability.

Frequently Asked Questions

What is a buffer overflow vulnerability?

A buffer overflow occurs when a program writes data beyond the allocated memory boundary, potentially corrupting adjacent memory, crashing the application, or enabling arbitrary code execution.

How do you prevent buffer overflow in C?

Use bounds-checked functions such as snprintf() instead of strcpy() and sprintf(), strncat() or strlcat() instead of strcat(), and always validate input lengths before copying data into fixed-size buffers.

What CWE is buffer overflow?

Buffer overflow is classified as CWE-120 (Buffer Copy without Checking Size of Input), part of the broader CWE-119 family of memory corruption vulnerabilities.

Is using strncpy() enough to prevent buffer overflow?

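strncpy() helps but has pitfalls: it does not null-terminate the destination when the source is as long as or longer than the limit, and it pads shorter copies with null bytes. snprintf() is safer because it always null-terminates (for a nonzero size) and returns the number of characters that would have been written, which lets the caller detect truncation.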
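
For code that has to keep using strncpy(), a common pattern is to copy at most one byte less than the capacity and write the terminator by hand. The helper below is a minimal sketch of that pattern; the name copy_with_strncpy is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: strncpy() with explicit termination.
 * strncpy() leaves dst unterminated when src has dst_size - 1 or more
 * characters, so the last byte is always set to NUL afterwards. */
static void copy_with_strncpy(char *dst, size_t dst_size, const char *src)
{
  if (dst_size == 0) {
    return;                          /* nothing can be written safely */
  }
  strncpy(dst, src, dst_size - 1);   /* copy at most dst_size - 1 bytes */
  dst[dst_size - 1] = '\0';          /* guarantee termination */
}
```

Unlike snprintf(), this gives no indication that the source was truncated, so a separate length check is needed when truncation matters.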

Can static analysis detect buffer overflow?

Yes, static analysis tools can detect many buffer overflow patterns, especially unsafe function usage like strcpy(), gets(), and sprintf(). However, complex data flows may require dynamic analysis or manual review.

View the Security Fix

The pull request that fixed this vulnerability is PR #40236.
