Buffer Overflow in C: How Unsafe strcpy Puts Your App at Risk
Introduction
Few vulnerability classes have haunted software engineering as persistently as the buffer overflow. First documented in the 1970s and famously weaponized in the 1988 Morris Worm, buffer overflows remain a top cause of exploitable security bugs in C and C++ software today. Despite decades of awareness, they keep appearing in production codebases — including, as we'll explore here, a recent critical finding in sisyphus/board.c.
If you write C, maintain a legacy C codebase, or work on any system that links against native code, this post is for you. We'll walk through exactly what went wrong, how an attacker could exploit it, what the fix looks like, and how to prevent this class of bug from sneaking into your code in the first place.
The Vulnerability Explained
What Is a Buffer Overflow?
A buffer overflow occurs when a program writes more data into a fixed-size block of memory (a "buffer") than that buffer was allocated to hold. The excess data spills over into adjacent memory, corrupting whatever happens to live there — other variables, control structures, or even return addresses on the call stack.
In C, this happens most easily through string-handling functions that don't know — or don't check — the size of their destination buffer.
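To make the "spill into adjacent memory" idea concrete without triggering undefined behavior, here is a minimal model. It assumes a single 16-byte array standing in for a region of memory: the first 8 bytes play the buffer and the last 8 play a neighboring variable. The function name and sizes are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: bytes 0..7 are the "buffer", bytes 8..15 are the
 * "adjacent variable". Every write stays inside `memory`, so the model
 * itself is well-defined C. */
enum { BUF_SIZE = 8, REGION_SIZE = 16 };

/* Copies `len` bytes to the start of the region without checking BUF_SIZE,
 * mimicking an unbounded copy. `src` must hold at least `len` bytes.
 * Returns 1 if the adjacent bytes changed, 0 otherwise. */
int unchecked_write_corrupts_neighbor(const char *src, size_t len) {
    unsigned char memory[REGION_SIZE];
    unsigned char before[REGION_SIZE - BUF_SIZE];

    assert(src != NULL);
    memset(memory, 0, BUF_SIZE);
    memset(memory + BUF_SIZE, 0xAA, REGION_SIZE - BUF_SIZE);
    memcpy(before, memory + BUF_SIZE, sizeof(before));

    if (len > REGION_SIZE) {
        len = REGION_SIZE; /* keep the model itself in bounds */
    }
    memcpy(memory, src, len);

    return memcmp(before, memory + BUF_SIZE, sizeof(before)) != 0;
}
```

Writing 5 bytes leaves the neighbor untouched, while writing 12 bytes changes it. In a real overflow the neighbor is whatever the compiler happened to place next to the buffer.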
The Vulnerable Code
The vulnerability was found at sisyphus/board.c:389, flagged by the Semgrep rule utils.custom.buffer-overflow-strcpy. The root cause: an unsafe C buffer function used without size checking.
The classic offenders look like this:
// ❌ UNSAFE — strcpy does not check destination buffer size
char dest[256];
strcpy(dest, user_supplied_input);
// ❌ UNSAFE — strcat does not check remaining space
char buf[256] = "Hello, ";
strcat(buf, username);
// ❌ UNSAFE — sprintf does not limit output length
char message[256];
sprintf(message, "Welcome, %s!", username);
In each of these cases, if user_supplied_input, username, or any other source string is longer than the destination buffer allows, the write operation happily marches past the end of the buffer, overwriting adjacent memory.
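For the strcat case, one possible bounded replacement is sketched below. The helper name append_checked and its return convention are assumptions made for this example, not part of board.c; it refuses the append when the result would not fit, rather than truncating.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends `suffix` to the C string in `buf` only if
 * the result, including the terminator, fits in `buf_size` bytes.
 * Returns 0 on success, -1 if it would not fit (buf is left unchanged). */
int append_checked(char *buf, size_t buf_size, const char *suffix) {
    assert(buf != NULL && suffix != NULL && buf_size > 0);

    const char *end = memchr(buf, '\0', buf_size);
    if (end == NULL) {
        return -1; /* buf holds no terminator within buf_size bytes */
    }
    size_t used = (size_t)(end - buf);
    size_t add = strlen(suffix);

    if (add >= buf_size - used) {
        return -1; /* no room for suffix plus terminator */
    }
    memcpy(buf + used, suffix, add + 1); /* +1 copies the terminator */
    return 0;
}
```

Rejecting oversized input outright is a design choice; truncation is also possible, but it has to be detected and handled by the caller.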
How Could It Be Exploited?
Buffer overflows are not just crash bugs — in the right conditions, they are code execution vulnerabilities. Here's a simplified attack scenario:
Scenario: Stack-Based Buffer Overflow
Imagine board.c processes a board name submitted by a user over a network connection:
void process_board_name(const char *input) {
    char name_buf[256];
    strcpy(name_buf, input); // Vulnerable line
    // ... further processing
}
The stack frame for process_board_name looks roughly like this in memory:
[ name_buf (256 bytes) ][ saved registers ][ return address ]
If an attacker sends a carefully crafted string of 300+ bytes, the overflow can reach and overwrite the return address. When process_board_name returns, instead of jumping back to the legitimate caller, the CPU jumps to an address the attacker controls — potentially shellcode embedded in the payload itself, or a gadget in an existing library (a technique called Return-Oriented Programming, or ROP).
What an attacker can achieve:
- Crash the application (Denial of Service)
- Corrupt program state (logic bypass, privilege escalation)
- Execute arbitrary code (full system compromise)
- Leak sensitive memory contents (information disclosure)
Real-World Impact
This vulnerability is classified as HIGH severity and maps to CWE-120: Buffer Copy without Checking Size of Input. In a production application, exploitation could mean:
- Remote code execution if the affected function processes network input
- Privilege escalation if the process runs with elevated permissions
- Data corruption leading to silent integrity failures
- Application crashes causing service outages
The Fix
What Changed
The fix replaces unbounded string functions with size-bounded alternatives that accept an explicit maximum length parameter, ensuring writes can never exceed the declared buffer size.
Before (vulnerable):
// ❌ No size limit — will overflow if input > 255 chars
char dest[256];
strcpy(dest, source);
// ❌ No format output limit
char msg[256];
sprintf(msg, "Board: %s", board_name);
After (safe):
// ✅ strlcpy: copies at most sizeof(dest)-1 bytes, always null-terminates
char dest[256];
strlcpy(dest, source, sizeof(dest));
// ✅ snprintf: limits output to sizeof(msg)-1 chars, always null-terminates
char msg[256];
snprintf(msg, sizeof(msg), "Board: %s", board_name);
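Applied to the process_board_name scenario, a hardened variant might look like the sketch below. It assumes that rejecting over-long names is acceptable and that the function may return a status code; both are assumptions, since the actual patch to board.c is not shown here. The name process_board_name_checked is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOARD_NAME_MAX 256 /* buffer size taken from the scenario */

/* Hypothetical variant of process_board_name: it returns a status code
 * (the original returns void) so callers can see that input was rejected.
 * Returns 0 on success, -1 if `input` does not fit in the buffer. */
int process_board_name_checked(const char *input) {
    char name_buf[BOARD_NAME_MAX];

    assert(input != NULL);
    size_t len = strlen(input);
    if (len >= sizeof(name_buf)) {
        return -1; /* reject instead of truncating or overflowing */
    }
    memcpy(name_buf, input, len + 1); /* length already validated */

    /* ... further processing of name_buf would go here ... */
    (void)name_buf;
    return 0;
}
```

A 255-character name is accepted because it leaves exactly one byte for the terminator; anything longer is refused before a single byte is copied.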
Why This Works
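The key insight is explicit size propagation. By passing sizeof(dest) directly to the copy function:
- The buffer size comes straight from the array's declaration at the call site. This only works while dest is an array in scope; for a pointer parameter, sizeof yields the size of the pointer, so the length must be passed separately
- The function enforces the limit: it will truncate rather than overflow
- Null termination is guaranteed: strlcpy and snprintf always null-terminate within bounds when the size is nonzero
- The intent is documented: future readers immediately see the buffer is size-bounded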
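Explicit size propagation matters most once a buffer crosses a function boundary, where the array has decayed to a pointer. The sketch below assumes a hypothetical helper, format_board_label, that receives the destination size from its caller and reports truncation through its return value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats "Board: <name>" into `out`.
 * `out_size` must be passed by the caller, because inside this function
 * sizeof(out) would be the size of a pointer, not of the caller's array.
 * Returns 0 if the whole label fit, 1 if it was truncated, -1 on error. */
int format_board_label(char *out, size_t out_size, const char *board_name) {
    assert(out != NULL && board_name != NULL && out_size > 0);

    int needed = snprintf(out, out_size, "Board: %s", board_name);
    if (needed < 0) {
        return -1; /* encoding error reported by snprintf */
    }
    /* snprintf returns the length the full output would have had */
    return (size_t)needed >= out_size ? 1 : 0;
}
```

A caller would typically write format_board_label(msg, sizeof(msg), name) at the point where msg is still an array, so the size and the buffer cannot drift apart.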
A Note on strncpy vs strlcpy
You might wonder: why not just use strncpy? This is a common point of confusion.
// ⚠️ strncpy — prevents overflow, but does NOT guarantee null termination
char dest[256];
strncpy(dest, source, sizeof(dest));
// If source >= 256 chars, dest[255] is NOT set to '\0' — still dangerous!
strncpy was designed for fixed-width record fields, not general string handling. If the source is exactly as long as (or longer than) the buffer, it leaves the destination without a null terminator, which causes the next string operation to run off the end of the buffer anyway.
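The missing terminator can be observed safely by inspecting the destination bytes instead of treating them as a string. The two functions below are illustrative helpers with hypothetical names: the first reports whether strncpy left a terminator in a 4-byte buffer, and the second shows the common workaround of terminating by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if strncpy left a '\0' somewhere in a 4-byte destination,
 * 0 if the destination ended up unterminated. */
int strncpy_terminates_4(const char *src) {
    char dst[4];

    assert(src != NULL);
    strncpy(dst, src, sizeof(dst));
    return memchr(dst, '\0', sizeof(dst)) != NULL;
}

/* Common workaround: copy at most size-1 bytes, then terminate by hand.
 * Note that this still truncates silently. */
void strncpy_terminated(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

A 3-character source is terminated, but a source of 4 or more characters fills all four bytes and leaves no room for '\0'.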
strlcpy (available on BSD, macOS, glibc 2.38 and later, and via libbsd on older Linux systems) always null-terminates when the size is nonzero and returns the length of the source string, making it easy to detect truncation:
size_t result = strlcpy(dest, source, sizeof(dest));
if (result >= sizeof(dest)) {
    // Truncation occurred; handle it explicitly
    fprintf(stderr, "Warning: input truncated from %zu to %zu bytes\n",
            result, sizeof(dest) - 1);
}
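On platforms where strlcpy is not available, a small fallback with the same contract can be written in portable C. This is a sketch under that assumption, with the hypothetical name sketch_strlcpy; it is not the BSD implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fallback with the same contract as strlcpy: copies up to
 * dst_size-1 bytes, terminates if dst_size > 0, and returns strlen(src)
 * so the caller can detect truncation. */
size_t sketch_strlcpy(char *dst, const char *src, size_t dst_size) {
    assert(src != NULL);
    size_t src_len = strlen(src);

    if (dst_size > 0) {
        assert(dst != NULL);
        size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

As with strlcpy, a return value greater than or equal to the destination size means the copy was truncated.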
The Security Invariant
The PR establishes a clear security invariant that should be enforced going forward:
Buffer reads never exceed the declared length.
This invariant is verifiable, testable, and unambiguous — exactly what a good security property looks like.
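One way to turn a bounds invariant into a unit test is to place guard bytes directly after the buffer and check that they survive the operation. The sketch below checks the write side and assumes the copy under test is an snprintf call; the helper name and sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum { NAME_SIZE = 8, GUARD_SIZE = 8 };

/* Hypothetical test helper: copies `input` into the first NAME_SIZE bytes
 * of an arena using snprintf, then checks that the GUARD_SIZE bytes placed
 * right after it still hold their fill pattern.
 * Returns 1 if the guard is intact, 0 if it was modified. */
int bounded_copy_keeps_guard(const char *input) {
    char arena[NAME_SIZE + GUARD_SIZE];
    char expected[GUARD_SIZE];

    assert(input != NULL);
    memset(arena, 0, NAME_SIZE);
    memset(arena + NAME_SIZE, 0x5A, GUARD_SIZE);
    memset(expected, 0x5A, GUARD_SIZE);

    snprintf(arena, NAME_SIZE, "%s", input); /* the copy under test */

    return memcmp(arena + NAME_SIZE, expected, GUARD_SIZE) == 0;
}
```

A test suite would call this with short, boundary-length, and oversized inputs and expect 1 every time. Running the same tests under AddressSanitizer adds coverage for out-of-bounds reads, which guard bytes cannot observe.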
Prevention & Best Practices
1. Ban Unsafe Functions at the Compiler Level
Many teams use compiler flags or static analysis to forbid the most dangerous functions outright:
# GCC/Clang: make calls to undeclared functions an error
# (a general safety net; this flag does not ban strcpy by itself)
CFLAGS += -Werror=implicit-function-declaration
# Or use _FORTIFY_SOURCE for runtime checks
CFLAGS += -D_FORTIFY_SOURCE=2 -O2
You can also use #pragma or a custom header to poison unsafe symbols:
// security_guards.h
#pragma GCC poison strcpy strcat sprintf gets
Any use of these identifiers after the pragma will now fail at compile time, so include this header after the system headers that declare them.
2. Use Static Analysis in CI/CD
The vulnerability in this case was caught by Semgrep, an excellent open-source static analysis tool. Add it to your pipeline:
# .github/workflows/security.yml
# Ruleset names below are assumptions; check the Semgrep Registry for current names
- name: Run Semgrep
  uses: returntocorp/semgrep-action@v1
  with:
    config: >-
      p/c
      p/owasp-top-ten
Other tools worth integrating:
- CodeQL — deep semantic analysis, free for open source
- Coverity — industry-standard for C/C++
- AddressSanitizer (ASan) — runtime detection of memory errors
- Valgrind — memory error detection and profiling
3. Write Fuzz Tests
Fuzzing is one of the most effective techniques for finding buffer overflows:
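// fuzz_board.c: AFL/libFuzzer harness
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

void process_board_name(const char *input); // defined in board.c

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Copy the fuzzer input to the heap and null-terminate it,
    // because process_board_name expects a C string
    char *input = malloc(size + 1);
    if (input == NULL) return 0;
    memcpy(input, data, size);
    input[size] = '\0';
    process_board_name(input);
    free(input);
    return 0;
}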
Compile with:
clang -fsanitize=fuzzer,address -o fuzz_board fuzz_board.c board.c
./fuzz_board -max_len=65536
4. Enable Memory Safety Mitigations
Modern platforms offer several layers of defense-in-depth:
| Mitigation | What It Does | How to Enable |
|---|---|---|
| Stack Canaries | Detects stack smashing at runtime | -fstack-protector-strong |
| ASLR | Randomizes memory layout | OS-level, enabled by default on modern Linux |
| NX/DEP | Marks stack non-executable | -Wl,-z,noexecstack (linker flag) |
| RELRO | Hardens GOT/PLT sections | -Wl,-z,relro,-z,now |
| CFI | Validates indirect calls | -fsanitize=cfi -flto -fvisibility=hidden (Clang) |
# Recommended hardening flags for C projects
CFLAGS += -fstack-protector-strong -D_FORTIFY_SOURCE=2 -O2
LDFLAGS += -Wl,-z,relro,-z,now -Wl,-z,noexecstack
5. Consider Memory-Safe Languages for New Code
Where feasible, writing new systems code in Rust eliminates most of this vulnerability class by design. In safe Rust, the ownership model and bounds-checked slices mean an out-of-bounds access is either rejected at compile time or stopped by a runtime panic instead of silently corrupting memory. (Notably, this project already has Rust dependencies; expanding their use could prevent future vulnerabilities of this type.)
6. Relevant Standards and References
- CWE-120: Buffer Copy without Checking Size of Input
- CWE-121: Stack-based Buffer Overflow
- OWASP: Buffer Overflow
- SEI CERT C: STR31-C. Guarantee that storage for strings has sufficient space for character data and the null terminator
- SEI CERT C: MSC24-C. Do not use deprecated or obsolescent functions
Conclusion
Buffer overflows are old, well-understood, and entirely preventable, yet they continue to appear in real codebases every day. The fix here is elegant in its simplicity: swap strcpy for strlcpy, swap sprintf for snprintf, and pass the buffer size explicitly. One extra letter in each function name and a size argument stand between your users and potential code execution.
The key takeaways from this vulnerability and its fix:
- Never use strcpy, strcat, sprintf, or gets in new C code. They are functionally deprecated for security-sensitive contexts.
- Always use size-bounded alternatives (strlcpy, strlcat, snprintf) and pass sizeof(buffer) explicitly. Be careful with strncat: its length argument limits the bytes appended, not the total buffer size, so passing sizeof(buffer) to it is itself a bug.
- Check for truncation: when input is silently truncated, your program's behavior may change in ways that introduce other bugs.
- Layer your defenses: static analysis, fuzzing, compiler hardening flags, and runtime sanitizers work together to catch what code review misses.
- Establish and test security invariants: the invariant "buffer reads never exceed declared length" is simple, testable, and should be encoded in your CI pipeline.
Security is not a single fix — it's a culture of careful habits, automated tooling, and continuous vigilance. Every unsafe function replaced is one fewer foothold for an attacker.
This vulnerability was identified and fixed as part of an automated security scanning pipeline. Automated tools are a force multiplier for security — but they work best when developers understand the underlying principles behind the findings.