Buffer Overflow in C++: How Unsafe strcpy Puts Applications at Risk
Vulnerability: Buffer Overflow via Unsafe C String Functions
Severity: High
File: src/display.cpp
Scanner: Semgrep (utils.custom.buffer-overflow-strcpy)
Status: ✅ Fixed
Introduction
Few vulnerability classes have as long and storied a history as the buffer overflow. First documented in the 1970s and famously weaponized by the Morris Worm in 1988, buffer overflows remain a leading cause of critical security vulnerabilities in C and C++ codebases today. Despite decades of awareness, unsafe string-handling functions like strcpy, strcat, and sprintf continue to slip into production code — sometimes in security-sensitive paths.
This post covers a real-world buffer overflow vulnerability discovered by automated static analysis in src/display.cpp and explains how a targeted, surgical fix eliminates the risk. Whether you're a C++ veteran or a developer just starting to explore native code, understanding this class of bug is essential for writing safe, resilient software.
The Vulnerability Explained
What Is a Buffer Overflow?
A buffer overflow occurs when a program writes more data into a fixed-size memory buffer than it was allocated to hold. The excess data spills into adjacent memory regions, potentially overwriting other variables, return addresses, or control data.
In C and C++, this most commonly happens with string manipulation functions that do not check how much space is available in the destination buffer before copying data.
The offending pattern in this case was the use of functions like strcpy() — or similar unbounded variants — at line 44 of src/display.cpp.
The Dangerous Functions
The C standard library provides several string functions that are notoriously unsafe:
// ❌ UNSAFE: No bounds checking — copies until null terminator
strcpy(dest, src);
// ❌ UNSAFE: No bounds checking on destination
strcat(dest, src);
// ❌ UNSAFE: No limit on output size
sprintf(buffer, "User: %s", username);
The problem with all three is the same: they trust the source data to fit in the destination. If an attacker controls the source string — through user input, a network packet, a file read, or an environment variable — they can supply a string longer than the destination buffer.
What Happens When It Overflows?
Depending on where the buffer lives in memory and what surrounds it, a buffer overflow can cause:
| Outcome | Description |
|---|---|
| Application crash | Corrupted memory causes a segmentation fault |
| Data corruption | Adjacent variables are silently overwritten |
| Control flow hijack | Return addresses on the stack are overwritten, redirecting execution |
| Arbitrary code execution | Attacker-supplied shellcode is executed with the app's privileges |
| Privilege escalation | If the process runs as root or a privileged user, full system compromise |
A Concrete Attack Scenario
Imagine display.cpp is responsible for rendering a username or display label pulled from an external source (a config file, API response, or user-supplied input):
// Somewhere in src/display.cpp (line ~44)
char display_buffer[64];
strcpy(display_buffer, user_supplied_name); // ❌ No size check!
render_label(display_buffer);
An attacker who can influence user_supplied_name, say by crafting a malicious configuration file or tampering with an API response, could supply a 200-character string. strcpy then writes 201 bytes: the 200 characters plus the null terminator. The first 64 bytes fill display_buffer. The remaining 137 bytes overflow into adjacent stack memory, potentially overwriting the saved return address.
When the function that owns display_buffer returns, execution jumps to an address the attacker controls instead of going back to the legitimate caller. This is the classic stack smashing attack, and it has been exploited in the wild for over 35 years.
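The arithmetic in this scenario can be checked without performing the unsafe copy. The sketch below is illustrative only: bytes_past_end is a hypothetical helper, not code from src/display.cpp, and it simply counts the bytes (including the null terminator that strcpy also writes) that would land beyond a buffer of a given capacity.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: number of bytes an unbounded strcpy would write past
// the end of a destination buffer of `capacity` bytes. Returns 0 if src fits.
std::size_t bytes_past_end(const char* src, std::size_t capacity) {
    const std::size_t needed = std::strlen(src) + 1;  // characters + '\0'
    return needed > capacity ? needed - capacity : 0;
}
```

For a 200-character input and the 64-byte display_buffer, the helper reports 137 bytes written out of bounds.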
The Fix
What Changed
The fix replaces unbounded C string functions with size-bounded alternatives that accept an explicit maximum length parameter, preventing writes beyond the buffer's allocated size.
The two primary safe replacements used in this fix are:
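strlcpy instead of strcpy (note that strlcpy is a BSD extension rather than part of ISO C; it ships with the BSDs, macOS, and glibc 2.38 or later):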
// ❌ BEFORE: Unsafe, no bounds checking
char display_buffer[64];
strcpy(display_buffer, source_string);
// ✅ AFTER: Safe, size-bounded copy
char display_buffer[64];
strlcpy(display_buffer, source_string, sizeof(display_buffer));
snprintf instead of sprintf:
// ❌ BEFORE: Unsafe formatted output
char label[128];
sprintf(label, "Display: %s", user_input);
// ✅ AFTER: Safe formatted output with explicit size limit
char label[128];
snprintf(label, sizeof(label), "Display: %s", user_input);
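Because strlcpy is not available on every toolchain, a project may need a fallback with the same contract. The following is a minimal sketch of one, assuming the usual strlcpy semantics (copy at most size - 1 characters, always terminate, return the source length). The name bounded_copy is hypothetical and is not something the original fix is known to use.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for strlcpy on platforms that lack it.
// Copies at most dst_size - 1 characters, null-terminates whenever
// dst_size > 0, and returns strlen(src) so callers can detect truncation.
std::size_t bounded_copy(char* dst, const char* src, std::size_t dst_size) {
    const std::size_t src_len = std::strlen(src);
    if (dst_size > 0) {
        const std::size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;
        std::memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

A return value greater than or equal to dst_size tells the caller that the copy was truncated.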
How the Fix Works
The key insight is simple: tell the function how big your buffer is.
- strlcpy(dest, src, size) copies at most size - 1 characters and always null-terminates the result (for any size greater than zero). No matter how long src is, dest will never overflow.
- snprintf(dest, size, fmt, ...) writes at most size - 1 characters of formatted output followed by a null terminator. Excess output is truncated, and the return value reports the length the full output would have needed, but memory is never corrupted.
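Using sizeof(display_buffer) rather than a hardcoded number is a deliberate best practice: if the buffer size ever changes during a refactor, the size limit automatically updates with it, eliminating a whole class of maintenance bugs. This only holds while display_buffer is visible as an array. Once it has decayed to a char* (for example, as a function parameter), sizeof yields the size of the pointer, and the capacity must be passed explicitly.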
Why This Is a Complete Fix
Some developers wonder: "Can't I just make the buffer bigger?" The answer is: no, that only delays the problem. A larger buffer is still a fixed-size buffer. An attacker can always supply a longer string. The only real fix is to enforce a hard ceiling on how much data can be written — which is exactly what strlcpy and snprintf do.
Prevention & Best Practices
1. Ban Unsafe Functions at the Linter Level
The most effective prevention is making unsafe functions impossible to accidentally use. Configure your static analysis tools to flag them as errors:
Semgrep rule (like the one that caught this bug):
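rules:
  - id: buffer-overflow-strcpy
    # pattern-either matches any one of the listed calls; a plain
    # "patterns" list would require every pattern to match at once.
    pattern-either:
      - pattern: strcpy(...)
      - pattern: strcat(...)
      - pattern: sprintf(...)
      - pattern: gets(...)
    message: "Unsafe C string function. Use strlcpy, strncat, snprintf, or fgets."
    severity: ERROR
    languages: [c, cpp]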
Compiler flags that help:
# GCC/Clang: Enable fortified source checks (only active when optimization is on)
-O2 -D_FORTIFY_SOURCE=2
# Enable all warnings + treat as errors
-Wall -Wextra -Werror
# AddressSanitizer for runtime detection during testing
-fsanitize=address
2. Prefer C++ String Abstractions
In modern C++, the best solution is often to avoid raw character arrays entirely:
// ✅ std::string handles memory automatically
#include <string>
std::string display_buffer = source_string; // No overflow possible
render_label(display_buffer);
std::string dynamically allocates memory as needed and never overflows a fixed buffer. Reserve raw char[] arrays for performance-critical paths, FFI boundaries, or embedded contexts where dynamic allocation isn't available.
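As a small illustration of the std::string approach, the sketch below builds the label without any fixed-size buffer. make_label is a hypothetical function, and the "Display: " prefix is taken from the earlier sprintf example.

```cpp
#include <string>

// Hypothetical replacement for the fixed-size label buffer: the result
// grows to fit whatever name is supplied, so there is no capacity to exceed.
std::string make_label(const std::string& name) {
    return "Display: " + name;
}
```

If the result must be handed to a C-style API that expects a const char*, label.c_str() provides a null-terminated view that stays valid as long as the string is not modified or destroyed.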
3. Use the Complete Safe-Function Checklist
| Unsafe Function | Safe Replacement | Notes |
|---|---|---|
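| strcpy | strlcpy or strncpy | Prefer strlcpy; strncpy doesn't guarantee null termination |
| strcat | strlcat or strncat | Prefer strlcat; strncat's size argument is the number of characters to append, not the size of the destination |
| sprintf | snprintf | Always pass sizeof(buffer) |
| gets | fgets | gets was removed from the language in C11 |
| scanf("%s") | scanf("%63s") | Specify a max field width one less than the buffer size (63 for a 64-byte buffer) |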
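Two rows of the checklist are easy to get subtly wrong, so here is a hedged sketch of both. The helper names append_bounded and read_token are hypothetical, and sscanf stands in for scanf so the example does not depend on standard input. The field-width rule is the same for both functions.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// strncat's third argument is the maximum number of characters to append,
// not the size of dst, so the remaining space has to be computed first.
// Hypothetical helper; dst must already hold a null-terminated string.
void append_bounded(char* dst, std::size_t dst_size, const char* src) {
    const std::size_t used = std::strlen(dst);
    if (used + 1 < dst_size) {
        std::strncat(dst, src, dst_size - used - 1);
    }
}

// A field width of 63 leaves room for the terminator in a 64-byte buffer.
// Returns true if one whitespace-delimited token was read.
bool read_token(const char* input, char (&out)[64]) {
    return std::sscanf(input, "%63s", out) == 1;
}
```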
4. Adopt a Threat Modeling Mindset
Ask these questions about every buffer in your code:
- Where does the data come from? Is it user-supplied, network-received, or file-read?
- What's the maximum possible size? Is that maximum enforced before the copy?
- What happens if this buffer overflows? Is there sensitive data or a return address nearby?
Any buffer fed by external data is a potential attack surface.
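One way to act on these questions is to enforce the maximum at the point where external data enters the program, before any copy happens. The sketch below assumes a policy that is purely illustrative: a 63-byte limit on display names, with over-long or empty input rejected rather than truncated. Both kMaxDisplayName and accept_display_name are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Assumed policy for illustration: a display name may be at most 63 bytes.
constexpr std::size_t kMaxDisplayName = 63;

// Hypothetical boundary check: copies candidate into out only if it passes
// validation, and leaves out untouched otherwise.
bool accept_display_name(const std::string& candidate, std::string& out) {
    if (candidate.empty() || candidate.size() > kMaxDisplayName) {
        return false;  // reject instead of truncating
    }
    out = candidate;
    return true;
}
```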
5. Relevant Security Standards
This vulnerability maps to well-established security standards:
- CWE-120: Buffer Copy without Checking Size of Input ("Classic Buffer Overflow")
- CWE-676: Use of Potentially Dangerous Function
- OWASP Top 10 2021 – A03: Injection (a loose mapping at best; the Top 10 targets web application risks and has no dedicated memory-corruption category)
- SEI CERT C Coding Standard: STR31-C — Guarantee that storage for strings has sufficient space for character data and the null terminator
- MISRA C:2012: Rule 21.6 — The standard library input/output functions shall not be used
6. Automate Detection in CI/CD
The fact that this vulnerability was caught by an automated scanner — Semgrep — before it reached production is exactly how modern secure development should work. Integrate static analysis into your pipeline:
# Example GitHub Actions step
- name: Run Semgrep
  uses: returntocorp/semgrep-action@v1
  with:
    config: >-
      p/c
      p/cpp
      p/security-audit
Running security scans on every pull request means vulnerabilities are caught at the cheapest possible moment — before they're deployed.
Conclusion
Buffer overflows are a solved problem — we've had the tools to prevent them for decades. Yet they persist because unsafe functions are easy to reach for, code reviews miss subtle size mismatches, and the consequences aren't always immediately visible. This vulnerability in src/display.cpp is a reminder that even experienced developers can introduce these bugs, and that automated static analysis is an essential safety net.
The fix here is clean and principled:
- Replace strcpy/sprintf with strlcpy/snprintf: enforce hard size limits
- Use sizeof(buffer): keep size limits tied to the actual allocation
- Verify with a re-scan: confirm the fix is complete, not just cosmetic
More broadly, the best defense against buffer overflows is a layered one: safe-by-default language features (like std::string), compiler hardening flags, runtime sanitizers during testing, and static analysis in CI. No single layer is sufficient; together, they make this class of vulnerability extremely difficult to introduce and nearly impossible to miss.
Write code as if every string is adversarial. Because one day, it will be.
This vulnerability was automatically detected and fixed by OrbisAI Security. Automated security scanning helps teams find and fix issues like this before they reach production.