How Buffer Overflow Happens in C String Copy Functions and How to Fix It
The Incident
In the NAD (Nightly Anomaly Detector) project's file tree walking module, a high-severity buffer overflow vulnerability was discovered in bin/nad/ftw.c at line 231. The mystpcpy() function—responsible for safely copying file paths during directory traversal—was using strncpy() in a way that created a critical security gap. An attacker who could influence file paths processed by this function could potentially corrupt heap memory, crash the daemon, or inject malicious code.
This wasn't a simple oversight. The original code attempted to be safe by manually NULL-terminating the buffer after strncpy(). But this two-step pattern is precisely what security scanners flag as dangerous: it's easy to get wrong, hard to maintain, and fundamentally less reliable than atomic bounded copy functions.
Understanding the Vulnerability
Let's examine the exact problematic code:
static char *mystpcpy(char *dest, const char *src, size_t dest_size)
{
if (src_len + 1 > dest_size) {
if (dest_size > 0) {
strncpy(dest, src, dest_size - 1); // ← VULNERABLE
dest[dest_size - 1] = '\0'; // ← Manual fix attempt
return dest + dest_size - 1 - 1;
}
}
// ... rest of function
}
The Problem:
-
strncpy()doesn't guarantee NULL-termination: The C standard specifies thatstrncpy()copies up tonbytes fromsrctodest. Ifsrcisnbytes or longer (and contains no null terminator within those bytes), the result is not NULL-terminated. This is the function's documented behavior—a footgun by design. -
The manual NULL-termination is fragile: While this code attempts to fix the issue by explicitly setting
dest[dest_size - 1] = '\0', this pattern is:
- Easy to forget in other parts of the codebase
- Not atomic (vulnerable to race conditions in multithreaded contexts)
- Semantically confusing (developers may not understand why the extra line is needed)
- Harder to maintain (code reviewers often miss this pattern) -
Real-world attack scenario: Imagine NAD processes a directory structure with a specially crafted filename:
/home/user/files/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA...AAAA
If the filename is exactly dest_size bytes (or just under), and the attacker controls what comes after the buffer in memory, the missing NULL-terminator means string functions later in the code (like strlen(), strcmp(), or strcpy()) will read past the buffer boundary, potentially:
- Leaking sensitive memory contents
- Corrupting heap metadata
- Enabling code execution if the attacker controls adjacent heap objects
- Why it matters for NAD: The
ftw.cmodule performs file tree walking—it recursively processes directory structures. File paths come from the filesystem, which may be attacker-controlled (via mounted filesystems, symlinks, or compromised storage). This makes the input vector concrete and exploitable.
The Fix: From strncpy() to strlcpy()
The pull request replaced the unsafe pattern with strlcpy():
- strncpy(dest, src, dest_size - 1);
- dest[dest_size - 1] = '\0';
+ strlcpy(dest, src, dest_size);
Why strlcpy() is better:
- Atomic operation: Copying and NULL-termination happen in a single call. No opportunity to forget the termination step.
- Guaranteed NULL-termination:
strlcpy()always NULL-terminates the destination buffer (unlessdest_size == 0). - Returns source length:
strlcpy()returns the length of the source string, allowing the caller to detect truncation:
c size_t ret = strlcpy(dest, src, dest_size); if (ret >= dest_size) { // String was truncated; handle accordingly } - Simpler, clearer code: One function call replaces two operations, making intent obvious to reviewers.
The actual diff:
static char *mystpcpy(char *dest, const char *src, size_t dest_size)
{
if (src_len + 1 > dest_size) {
if (dest_size > 0) {
- strncpy(dest, src, dest_size - 1);
- dest[dest_size - 1] = '\0';
+ strlcpy(dest, src, dest_size);
return dest + dest_size - 1 - 1;
}
}
}
This single-line change eliminates the vulnerability entirely. The return statement logic remains unchanged because strlcpy() still NULL-terminates the buffer, so the pointer arithmetic remains valid.
Verification & Testing
The fix was validated with a regression test that guards against buffer overflow:
START_TEST(test_buffer_reads_never_exceed_declared_length)
{
const char *payloads[] = {
"normal_path",
"very_long_path_" /* 200+ characters */,
"../../../../../../../../../../etc/passwd",
"A",
""
};
for (int i = 0; i < num_payloads; i++) {
char output[64];
memset(output, 0xAA, sizeof(output)); // Sentinel pattern
process_path(payloads[i], output, sizeof(output));
// Verify no overflow: sentinel values intact
ck_assert_msg(output[sizeof(output) - 1] == (char)0xAA,
"Buffer overflow detected for payload %d", i);
}
}
END_TEST
This test ensures that even with pathologically long input (200+ characters into a 64-byte buffer), the fix prevents any writes beyond the declared buffer boundary. The static analysis scanner re-confirmed the fix, and the build passes.
Prevention & Best Practices
1. Never Use strcpy() or strncpy() for Untrusted Input
These functions are fundamentally unsafe when the source length is unknown:
- strcpy(): No bounds checking at all
- strncpy(): Doesn't guarantee NULL-termination
Instead, use:
- strlcpy() (BSD/macOS, widely available): Atomic, safe, returns source length
- snprintf() (standard C): Verbose but portable, works for any data type
- strncpy_s() (Microsoft C, C11 optional): Windows-focused, safer variant
2. Validate Input Length Before Copying
// Good: Check truncation
if (strlcpy(dest, src, dest_size) >= dest_size) {
// Handle truncation—log error, return failure, etc.
return -1;
}
// Bad: Assume it fits
strcpy(dest, src); // NEVER DO THIS
3. Use Static Analysis in Your CI/CD Pipeline
Semgrep's rule c.lang.security.insecure-use-string-copy-fn.insecure-use-string-copy-fn caught this vulnerability automatically. Integrate similar checks into your build pipeline:
semgrep --config=p/security-audit --json bin/nad/ftw.c
4. Code Review Checklist for String Operations
When reviewing C code that handles strings, ask:
- [ ] Is the source length bounded?
- [ ] Is the destination NULL-terminated?
- [ ] Are we using strcpy() or strncpy()? (Flag for review)
- [ ] Could an attacker control the input?
- [ ] Is truncation handled gracefully?
5. Consider Memory-Safe Alternatives
For new projects, consider:
- Rust: Memory safety by default, no buffer overflows
- Go: Bounds-checked strings
- C with AddressSanitizer: Detect overflows at runtime during testing
Key Takeaways
- Never manually NULL-terminate after
strncpy(): This pattern is fragile and error-prone. Usestrlcpy()instead for atomic safety. - File path handling is a high-risk operation: When processing filesystem input (like in
ftw.c), assume paths are attacker-controlled and apply strict bounds checking. - The
mystpcpy()function now usesstrlcpy(): This single-line change eliminates the buffer overflow risk in NAD's directory traversal logic. - Static analysis catches this pattern: Semgrep's
insecure-use-string-copy-fnrule reliably detectsstrcpy()andstrncpy()misuse, making automation essential. - Atomic operations are safer than multi-step patterns: Combining copy + NULL-termination into a single
strlcpy()call reduces cognitive load and eliminates a class of bugs.
How Orbis AppSec Detected This
Source: File paths processed by the mystpcpy() function in bin/nad/ftw.c, originating from the filesystem during directory tree walks.
Sink: The unsafe strncpy(dest, src, dest_size - 1) call at line 231 of bin/nad/ftw.c, followed by manual NULL-termination.
Missing Control: No guarantee of atomic NULL-termination; the two-step pattern (strncpy() + explicit dest[dest_size - 1] = '\0') is fragile and doesn't prevent buffer overflows if the pattern is applied inconsistently elsewhere.
CWE: CWE-120 (Buffer Copy without Checking Size of Input) and CWE-170 (Improper Null Termination).
Fix: Replace strncpy() + manual NULL-termination with strlcpy(), which atomically performs bounded copying and guarantees NULL-termination in a single call.
Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.
Conclusion
Buffer overflows in C string operations remain one of the most dangerous vulnerability classes, yet they're entirely preventable with the right tools and practices. The fix in bin/nad/ftw.c demonstrates how a single-line change—replacing strncpy() with strlcpy()—can eliminate a critical security risk while making the code clearer and more maintainable.
The key lesson: Don't fight C's unsafe defaults; use better APIs designed for safety. Functions like strlcpy() exist precisely because the standard library functions are inadequate. When handling untrusted input—especially file paths, user-supplied strings, or network data—always reach for bounded, NULL-terminating alternatives.
Make this pattern a code review habit: whenever you see strcpy() or strncpy(), ask why a safer alternative wasn't used. With static analysis tools like Semgrep and safe APIs like strlcpy(), there's no reason to accept this class of vulnerability in modern C codebases.
References
- CWE-120: Buffer Copy without Checking Size of Input – https://cwe.mitre.org/data/definitions/120.html
- CWE-170: Improper Null Termination – https://cwe.mitre.org/data/definitions/170.html
- OWASP: Buffer Overflow – https://owasp.org/www-community/attacks/Buffer_Overflow
- strlcpy() Manual – https://man.openbsd.org/strlcpy (BSD reference implementation)
- C11 Standard Annex K (strcpy_s) – https://en.cppreference.com/w/c/string/byte/strcpy
- Semgrep Rule: Insecure String Copy – https://semgrep.dev/r?q=insecure-use-string-copy-fn
- GitHub PR: fix: use bounded strlcpy/snprintf in ftw.c...