Introduction
In the Leptonica image processing library, we discovered a critical buffer overflow vulnerability in prog/fuzzing/recog_basic_fuzzer.cc at line 12. The fuzzer entry point, LLVMFuzzerTestOneInput, uses sprintf to write a temporary file path into a fixed-size buffer, char filename[256], without any bounds checking. The current format string, /tmp/libfuzzer.%d, filled in with the parent process ID from getppid(), needs at most 27 bytes when int is 32 bits, so it does not overflow the buffer today. Even so, the use of sprintf violates secure coding principles and establishes a dangerous pattern, one that becomes exploitable if the format string or its arguments ever change.
This matters because fuzzer code runs with untrusted, adversarial input by design. Any weakness in the fuzzer harness itself can be leveraged to corrupt the fuzzing process or the host system.
The Vulnerability Explained
Here's the vulnerable code in prog/fuzzing/recog_basic_fuzzer.cc:
L_RECOG *recog;
char filename[256];
sprintf(filename, "/tmp/libfuzzer.%d", getppid());
FILE *fp = fopen(filename, "wb");
if (!fp)
The sprintf function writes formatted output to filename without any awareness of the buffer's size. The function signature is:
int sprintf(char *str, const char *format, ...);
Notice there's no size parameter — sprintf will write as many bytes as the formatted string requires, regardless of how much space filename actually has.
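To put a number on how much space the current format string can need, the sketch below asks vsnprintf for the output length without writing anything. The names formatted_length and worst_case_path_length are hypothetical helpers for illustration, not part of Leptonica, and the approach assumes a C99-conforming C library.

```c
#include <assert.h>
#include <limits.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper (not part of Leptonica): returns how many characters
 * the formatted string needs, excluding the terminating NUL, or a negative
 * value if formatting fails. */
static int formatted_length(const char *fmt, ...)
{
    assert(fmt != NULL);

    va_list args;
    va_start(args, fmt);
    /* With a NULL buffer and size 0, vsnprintf writes nothing and only
     * reports the length the complete output would have. */
    int needed = vsnprintf(NULL, 0, fmt, args);
    va_end(args);
    return needed;
}

/* INT_MIN prints with the most characters of any int, so it gives the
 * worst case for the harness's format string. */
static int worst_case_path_length(void)
{
    return formatted_length("/tmp/libfuzzer.%d", INT_MIN);
}
```

On a platform with a 32-bit int, worst_case_path_length() returns 26, so the path plus its terminator fits in 27 bytes. The risk described here therefore comes from future changes to the format string or its arguments, not from the current values.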
Why This Is Dangerous
- No bounds enforcement: If getppid() somehow returned a value that, combined with the format string, exceeded 255 characters (plus null terminator), sprintf would happily write past the end of filename.
- Stack corruption: Since filename is a stack-allocated buffer, overflow would corrupt adjacent stack variables (like the recog pointer), the saved frame pointer, or the return address: classic stack smashing.
- Predictable file path: The temporary file /tmp/libfuzzer.<pid> uses a predictable path based on the parent PID. Combined with the lack of input size validation before writing fuzzer data to this file, an attacker controlling fuzzer input could write arbitrary data to a known location.
- Pattern propagation: This code serves as a template for other fuzzers. The unsafe sprintf pattern gets copied into new fuzzer harnesses, multiplying the risk.
Attack Scenario
Consider this exploitation path specific to this code:
- An attacker identifies that recog_basic_fuzzer writes untrusted data to /tmp/libfuzzer.<pid>, a predictable path.
- The fuzzer harness itself uses sprintf without bounds checking for the filename.
- If the code is modified to include additional path components or user-controlled data in the format string (a common evolution in fuzzer harnesses), the buffer overflow becomes directly exploitable.
- Stack corruption could redirect execution when LLVMFuzzerTestOneInput returns, giving the attacker code execution in the context of the fuzzing process.
The Fix
The fix is surgical and precise — replacing sprintf with snprintf on line 12:
Before (Vulnerable)
sprintf(filename, "/tmp/libfuzzer.%d", getppid());
After (Fixed)
snprintf(filename, sizeof(filename), "/tmp/libfuzzer.%d", getppid());
How This Solves the Problem
The snprintf function signature includes an explicit size parameter:
int snprintf(char *str, size_t size, const char *format, ...);
By passing sizeof(filename) (which evaluates to 256), snprintf guarantees:
- At most 255 characters are written to filename (reserving one byte for the null terminator).
- Output is truncated rather than overflowing if the formatted string would exceed the buffer.
- The buffer is always null-terminated, preventing string functions from reading past the buffer.
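The truncation behavior can be seen with a deliberately undersized buffer. This is a minimal sketch: build_path is a hypothetical wrapper around the fixed call, and the PID is passed as a parameter (rather than read from getppid()) only so the result is reproducible.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper around the fixed call. The real harness passes
 * getppid(); a parameter is used here so the behavior is reproducible. */
static int build_path(char *buf, size_t cap, int pid)
{
    assert(buf != NULL && cap > 0);
    /* Returns the length the full string would have had, which is greater
     * than or equal to cap whenever the output was truncated. */
    return snprintf(buf, cap, "/tmp/libfuzzer.%d", pid);
}
```

With an 8-byte buffer and a PID of 12345, build_path returns 20 (the untruncated length) and the buffer holds the 7 characters /tmp/li followed by the terminator. Comparing the return value against the capacity is how a caller detects that truncation happened.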
Using sizeof(filename) rather than a hardcoded 256 ensures the size limit automatically stays correct if the buffer declaration ever changes — a defensive programming practice that eliminates an entire class of maintenance bugs.
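One caveat is worth illustrating: sizeof only yields the buffer size while filename is a true array in scope. If the buffer later becomes a pointer (a heap allocation, or a function parameter), sizeof yields the size of the pointer instead. The helper names below are hypothetical and exist only to show what sizeof reports in each case; static_assert assumes a C11 or later compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers (illustrative names) that report what sizeof sees. */
static size_t size_seen_for_array(void)
{
    char filename[256];
    /* filename is an array here, so sizeof yields its full size. */
    static_assert(sizeof(filename) == 256, "sizeof sees the whole array");
    return sizeof(filename);
}

static size_t size_seen_for_pointer(void)
{
    char storage[256];
    char *filename = storage;
    /* filename is now a pointer: sizeof yields the pointer size
     * (commonly 4 or 8), not the 256 bytes it points to. */
    return sizeof(filename);
}
```

When a buffer is handed to another function, passing its capacity as an explicit size_t argument keeps the bound correct.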
Prevention & Best Practices
Immediate Actions
- Ban sprintf in your codebase: Use linting rules to flag sprintf usage, or compiler flags like -Werror=deprecated-declarations on toolchains whose headers mark sprintf as deprecated (recent Apple SDKs do; glibc does not).
- Replace all sprintf with snprintf: This is a mechanical transformation that can be automated.
- Use sizeof() for stack buffers: Always pass sizeof(buffer) rather than magic numbers.
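One way to enforce such a ban at compile time, assuming GCC or Clang, is the #pragma GCC poison directive, which turns any later use of the named identifiers into a compile error. The sketch below is an assumption about how a project might wire this up, not something Leptonica is known to do, and format_pid_path is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

/* Must come after the standard headers are included, since the directive
 * rejects every later appearance of these identifiers in the source. */
#pragma GCC poison sprintf vsprintf

/* Hypothetical helper: a bounded way to build the path.
 * Returns 0 on success, -1 if formatting failed or the path was truncated. */
static int format_pid_path(char *buf, size_t cap, int pid)
{
    assert(buf != NULL && cap > 0);
    int written = snprintf(buf, cap, "/tmp/libfuzzer.%d", pid);
    return (written < 0 || (size_t)written >= cap) ? -1 : 0;
}
```

Placing the pragma in a project-wide header that is included after the system headers would apply the ban to every translation unit that uses it.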
Defensive Patterns for Fuzzer Harnesses
// GOOD: Bounded write with error checking
char filename[256];
int written = snprintf(filename, sizeof(filename), "/tmp/libfuzzer.%d", getppid());
if (written < 0 || (size_t)written >= sizeof(filename)) {
    return 0; // Formatting failed or the path was truncated: bail safely
}
Tools for Detection
- Compiler warnings: -Wall -Wformat-overflow -Wformat-truncation
- Static analysis: Coverity, PVS-Studio, and clang-tidy's bugprone-not-null-terminated-result
- Semgrep rules: Custom rules matching sprintf calls with fixed buffers
- AddressSanitizer (ASan): Catches overflows at runtime during fuzzing
Security Standards
- CWE-120: Buffer Copy without Checking Size of Input
- CERT C Rule STR31-C: Guarantee that storage for strings has sufficient space for character data and the null terminator
- OWASP: Buffer overflow prevention guidelines
Key Takeaways
- Never use sprintf() with stack-allocated buffers in fuzzer harness code. Fuzzers process adversarial input by design, making any weakness in the harness itself a security risk.
- The char filename[256] + sprintf pattern in recog_basic_fuzzer.cc is a textbook CWE-120 violation that modern compilers can catch with proper warning flags enabled.
- Using sizeof(filename) with snprintf creates a self-maintaining size constraint: if the buffer size changes, the bounds check automatically adapts.
- Fuzzer harness code IS production code: it ships in the repository, runs in CI/CD, and establishes patterns that other developers copy.
- Predictable temporary file paths (/tmp/libfuzzer.<pid>) combined with unbounded writes compound the risk. Fixing the buffer overflow is necessary, but the predictable path pattern also deserves review.
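The predictable-path concern is separate from the sprintf fix. A minimal sketch of one common alternative, assuming a POSIX system, is to let mkstemp choose a unique name and create the file exclusively. Here write_unique_temp is a hypothetical helper, not part of the Leptonica harness, and the template path is an illustrative choice.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes mkstemp() under strict -std modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of Leptonica): writes `size` bytes to a
 * newly created temporary file and stores its path in `path_out`.
 * Returns 0 on success, -1 on failure. The caller removes the file. */
static int write_unique_temp(const unsigned char *data, size_t size,
                             char *path_out, size_t path_cap)
{
    static const char tmpl[] = "/tmp/libfuzzer.XXXXXX";  /* illustrative */

    assert(data != NULL && path_out != NULL);
    if (path_cap < sizeof(tmpl))
        return -1;                      /* caller's buffer is too small */
    memcpy(path_out, tmpl, sizeof(tmpl));

    /* mkstemp replaces the XXXXXX suffix and creates the file exclusively,
     * so an existing file or symlink at that name is never reused. */
    int fd = mkstemp(path_out);
    if (fd < 0)
        return -1;

    size_t done = 0;
    while (done < size) {
        ssize_t n = write(fd, data + done, size - done);
        if (n <= 0) {                   /* treat any short-circuit as failure */
            close(fd);
            unlink(path_out);
            return -1;
        }
        done += (size_t)n;
    }
    return close(fd) == 0 ? 0 : -1;
}
```

A harness using this approach would pass path_out to the code under test and call unlink on it before returning.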
How Orbis AppSec Detected This
- Source: Untrusted data from the LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) fuzzer entry point
- Sink: sprintf(filename, "/tmp/libfuzzer.%d", getppid()) in prog/fuzzing/recog_basic_fuzzer.cc:12
- Missing control: No bounds checking on the sprintf output; the 256-byte filename buffer has no size enforcement during the write operation
- CWE: CWE-120 (Buffer Copy without Checking Size of Input)
- Fix: Replaced sprintf with snprintf(filename, sizeof(filename), ...) to enforce buffer size limits and prevent overflow
Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix.
Conclusion
Buffer overflows via sprintf remain one of the most common and dangerous vulnerabilities in C/C++ code. This specific instance in recog_basic_fuzzer.cc demonstrates that even "simple" code, such as constructing a filename from a PID, can harbor critical security flaws when unsafe functions are used. The fix is straightforward: replace sprintf with snprintf and always pass the buffer size. A single added letter in the function name (sprintf → snprintf), together with the size argument, is the difference between a potential buffer overflow and safe, bounded string formatting.
For teams maintaining C/C++ codebases, the lesson is clear: audit every sprintf call, enable format-overflow compiler warnings, and treat fuzzer harness code with the same security rigor as any production component.