What is a buffer overflow vulnerability?

A buffer overflow occurs when a program writes data beyond the allocated boundary of a buffer in memory, potentially overwriting adjacent memory and enabling code execution or crashes.

How do you prevent buffer overflow in C?

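Use size-bounded functions such as snprintf(), strncpy(), and strncat() instead of their unbounded counterparts, and derive the limit from the size of the destination buffer. The limit means something different for each function: snprintf() takes the full buffer size, strncpy() needs a manual null terminator when the source is too long, and strncat() takes the remaining space minus one.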

What CWE is buffer overflow?

CWE-120 (Buffer Copy without Checking Size of Input) covers classic buffer overflows where data is copied into a buffer without verifying the input fits within the allocated space.

Is using strncpy() enough to prevent buffer overflow?

strncpy() prevents overflow but doesn't guarantee null-termination if the source exceeds the limit. You must manually null-terminate the buffer after strncpy() to avoid undefined behavior on subsequent string operations.

Can static analysis detect buffer overflow?

Yes, static analysis tools like Coverity, Clang Static Analyzer, and Semgrep can detect many buffer overflow patterns, especially those involving known-unsafe functions like strcpy() and sprintf().

Buffer Overflow in C: How Unsafe `strcpy` Almost Broke Everything

Severity: High | Type: Buffer Overflow | File: lib/source/algorithms/gimbal_md5.c | CWE: CWE-120 (Buffer Copy Without Checking Size of Input)

Introduction

Buffer overflows are one of the oldest classes of security vulnerabilities in software development — and yet, they remain stubbornly common in C codebases today. They've been responsible for some of the most devastating exploits in history, from the Morris Worm in 1988 to modern remote code execution vulnerabilities in widely-deployed software.

This post covers a real-world buffer overflow vulnerability discovered and patched in gimbal_md5.c, a C source file implementing MD5 hashing functionality. The root cause? A handful of unsafe string-handling functions that trusted the caller to provide well-behaved input — a trust that attackers are more than happy to betray.

Whether you write C professionally, maintain legacy codebases, or just want to understand why memory safety matters, this post will walk you through what went wrong, how it was fixed, and how to make sure it never happens again.

The Vulnerability Explained

What Is a Buffer Overflow?

A buffer overflow occurs when a program writes more data into a fixed-size block of memory (a "buffer") than it was designed to hold. The excess data spills into adjacent memory regions, potentially overwriting critical data structures, return addresses, or executable code.

In C, several standard library functions are notorious for enabling this class of bug because they perform string operations without any built-in length checking:

| Unsafe Function | What It Does | The Problem |
| --- | --- | --- |
| `strcpy(dest, src)` | Copies `src` to `dest` | Copies until `\0`, with no size limit |
| `strcat(dest, src)` | Appends `src` to `dest` | Same issue, no bounds |
| `sprintf(buf, fmt, ...)` | Formats into `buf` | No output size limit |
| `gets(buf)` | Reads a line into `buf` | So dangerous it was removed in C11 |

The vulnerability in gimbal_md5.c involved exactly this pattern: using strcpy (or similar unbounded functions) to copy data into a fixed-size destination buffer at line 301, without verifying that the source data would fit.

How Does It Work in Practice?

Consider a simplified version of the vulnerable pattern:

// VULNERABLE CODE — DO NOT USE
void process_hash_input(const char *user_input) {
    char buffer[256];  // Fixed-size stack buffer

    strcpy(buffer, user_input);  // ❌ No bounds check!
    // ... proceed to compute MD5 ...
}

If user_input is 300 characters long, strcpy will happily copy all 300 characters plus the terminating null byte, 301 bytes in total, into a 256-byte buffer. The extra 45 bytes don't just disappear; they overwrite whatever happens to live next to buffer on the stack. That might be:

Local variables — corrupting program logic
The saved frame pointer — destabilizing the call stack
The return address — redirecting execution to attacker-controlled code
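The size arithmetic can be checked without performing the unsafe copy. The sketch below uses a hypothetical helper, `overflow_bytes`, which is not part of gimbal_md5.c; it only compares lengths and counts the terminating null byte that strcpy also writes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: how many bytes strcpy would write past a
 * destination of dst_size bytes. Returns 0 when the copy fits.
 * Nothing is copied; only lengths are compared. */
size_t overflow_bytes(const char *src, size_t dst_size)
{
    assert(src != NULL);
    size_t needed = strlen(src) + 1;   /* characters plus the '\0' terminator */
    return needed > dst_size ? needed - dst_size : 0;
}
```

A nonzero result means the copy would spill that many bytes into whatever sits after the destination, which is the condition a bounds check has to rule out before copying.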

A Concrete Attack Scenario

Imagine this code is part of an application that accepts file paths or user-supplied strings to hash. An attacker crafts a malicious input:

# Attacker sends 264 bytes of 'A' followed by a crafted return address (offsets are illustrative)
payload = b"A" * 256          # Fill the buffer
payload += b"A" * 8           # Overwrite saved frame pointer
payload += b"\xef\xbe\xad\xde"  # Overwrite return address with attacker's target

When process_hash_input returns, instead of going back to the legitimate caller, the CPU jumps to the address the attacker injected. On modern systems, mitigations like ASLR (Address Space Layout Randomization), stack canaries, and NX bits make this harder to exploit — but not impossible, especially in embedded systems, older platforms, or when combined with information leak vulnerabilities.

Even without achieving code execution, an attacker can use this to:

Crash the process (Denial of Service)
Corrupt hash outputs, undermining data integrity checks
Leak sensitive memory contents by manipulating adjacent data

The Fix

The patch replaces all unbounded string operations with their size-aware counterparts, enforcing the security invariant:

Buffer reads must never exceed the declared length.

Before (Vulnerable)

// ❌ BEFORE: Unbounded copy — trusts the caller completely
void gimbal_md5_update_string(MD5Context *ctx, const char *input) {
    char staging_buffer[256];

    strcpy(staging_buffer, input);         // No size check
    sprintf(staging_buffer, "%s", input);  // Also unbounded

    // ... process staging_buffer ...
}

After (Fixed)

// ✅ AFTER: Bounded copy — enforces maximum size
void gimbal_md5_update_string(MD5Context *ctx, const char *input) {
    char staging_buffer[256];

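    // Option 1: strlcpy (BSD, macOS, glibc 2.38+; not part of ISO C)
    strlcpy(staging_buffer, input, sizeof(staging_buffer));
    // Option 2: snprintf (standard since C99); use one or the other
    snprintf(staging_buffer, sizeof(staging_buffer), "%s", input);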

    // ... process staging_buffer ...
}

Why This Works

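strlcpy(dst, src, size): Unlike strcpy, this function takes the total size of the destination buffer. It copies at most size - 1 characters and, provided size is nonzero, always null-terminates the result, so dst cannot overflow however large src is. Oversized input is silently truncated; the return value is strlen(src), so a result greater than or equal to size signals truncation. strlcpy is not part of ISO C: it originated in OpenBSD, ships with the BSDs and macOS, and was added to glibc in version 2.38.

snprintf(buf, size, fmt, ...): The size-limited counterpart of sprintf, standard since C99. It writes at most size bytes to buf, including the null terminator, which makes it safe for both formatting and simple string copying. It returns the length the untruncated output would have had, so a return value greater than or equal to size means the output was cut short.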

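The key insight is using sizeof(staging_buffer) directly as the size argument. This ties the bound to the actual allocation, so if the buffer size ever changes, the limit adjusts automatically and there are no magic numbers to forget to update. This only works where staging_buffer is declared as an array; applied to a pointer, sizeof yields the size of the pointer, not of the buffer it points to.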
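A bounded copy prevents the overflow, but it can still silently truncate input. As a minimal sketch, assuming truncation should be reported to the caller, the hypothetical helper `copy_checked` (not taken from the patch) uses the return value of snprintf to detect it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst (capacity dst_size) and report
 * whether the whole string fit. dst is always null-terminated. */
bool copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* snprintf returns the length the full output would have had,
     * excluding the terminator, or a negative value on error. */
    int n = snprintf(dst, dst_size, "%s", src);
    return n >= 0 && (size_t)n < dst_size;
}
```

A caller would pass `sizeof(staging_buffer)` as `dst_size` and decide how to handle a `false` result, for example by rejecting the input instead of hashing a truncated string.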

The Security Invariant

The fix enforces a clear, testable invariant:

For any input of any size S, and any buffer of declared size N:
  bytes_written <= N
  AND result is null-terminated
  AND no memory outside [buffer, buffer+N) is accessed

This invariant is verified by the regression test suite, which throws payloads of up to 40,960 bytes at the function and confirms the output is always bounded to the declared size.
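One way to check such an invariant directly in C is to place guard bytes after the destination and confirm they are untouched after the copy. The following is a sketch under stated assumptions, not the project's regression test: `invariant_holds`, the guard size, and the use of snprintf as a stand-in for the patched function are all illustrative choices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { DECLARED_SIZE = 256, GUARD_SIZE = 64, GUARD_BYTE = 0xA5 };

/* Returns true if a bounded copy of an input_len-byte payload stays inside
 * the first DECLARED_SIZE bytes of the arena and is null-terminated. */
bool invariant_holds(size_t input_len)
{
    /* Destination followed by guard bytes that must never change. */
    unsigned char arena[DECLARED_SIZE + GUARD_SIZE];
    memset(arena, GUARD_BYTE, sizeof(arena));

    char *payload = malloc(input_len + 1);
    assert(payload != NULL);
    memset(payload, 'A', input_len);
    payload[input_len] = '\0';

    /* Stand-in for the patched copy; swap in the real function under test. */
    snprintf((char *)arena, DECLARED_SIZE, "%s", payload);
    free(payload);

    /* A terminator must exist within the declared size. */
    if (memchr(arena, '\0', DECLARED_SIZE) == NULL)
        return false;

    /* Every guard byte must be untouched. */
    for (size_t i = DECLARED_SIZE; i < sizeof(arena); i++) {
        if (arena[i] != GUARD_BYTE)
            return false;
    }
    return true;
}
```

Because the guard region belongs to the same array as the destination, a copy that overruns by up to GUARD_SIZE bytes is caught by the check rather than corrupting unrelated memory. Running the same harness under AddressSanitizer would also cover larger overruns.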

Prevention & Best Practices

1. Treat All Unsafe C Functions as Banned by Default

Adopt a banned function list for your C/C++ projects. Microsoft's Security Development Lifecycle (SDL) publishes a well-known list. At minimum, flag these for review:

❌ strcpy  → ✅ strlcpy / strncpy (with explicit null-term)
❌ strcat  → ✅ strlcat / strncat (limit = remaining space - 1)
❌ sprintf → ✅ snprintf
❌ gets    → ✅ fgets
❌ scanf("%s", ...) → ✅ scanf("%255s", ...) with explicit width
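Of the replacements listed, strncat is the easiest to misuse, because its limit is the number of characters to append rather than the size of the destination. A minimal sketch of a correct call, wrapped in a hypothetical helper named `append_bounded`:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the string already in dst, where
 * dst_size is the total capacity of dst. Truncates rather than overflows. */
void append_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t used = strlen(dst);
    assert(used < dst_size);   /* dst must already hold a terminated string */

    /* strncat's limit counts characters to append, not the buffer size,
     * and it writes one extra byte for the terminator. */
    strncat(dst, src, dst_size - used - 1);
}
```

Passing `sizeof(dst)` as the strncat limit is a common mistake: it permits writing past the end whenever the destination already contains text.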

2. Always Use `sizeof` — Not Hardcoded Numbers

// ❌ Fragile — breaks silently if buffer size changes
strncpy(buf, src, 255);

// ✅ Robust — automatically tracks the actual allocation
strncpy(buf, src, sizeof(buf) - 1);
buf[sizeof(buf) - 1] = '\0';  // Ensure null termination
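The sizeof idiom applies only where the destination is visible as an array. Once a buffer is passed to a function it decays to a pointer, so the capacity has to travel with it. A sketch of that pattern, with a hypothetical function name and signature:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Inside this function dst is a pointer, so sizeof(dst) would be the
 * pointer size (commonly 8), not the buffer capacity. The capacity is
 * therefore passed in explicitly. Name and signature are hypothetical. */
void copy_label(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';   /* strncpy does not terminate on truncation */
}
```

The caller, which still sees the array, supplies the size: `copy_label(label, sizeof(label), input)`.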

3. Enable Compiler Hardening Flags

Modern compilers can catch many of these issues at compile time or add runtime protection:

CFLAGS += -Wall -Wextra           # Enable a broad set of warnings (not literally all)
CFLAGS += -Wformat-security       # Warn on unsafe format strings
CFLAGS += -O2 -D_FORTIFY_SOURCE=2 # Checked libc calls; requires optimization
CFLAGS += -fstack-protector-strong # Stack canaries
CFLAGS += -fsanitize=address      # AddressSanitizer (development/CI only)

4. Use Static Analysis Tools

Semgrep (the scanner that caught this vulnerability) is excellent for codebases of any size. Add it to your CI pipeline:

# .github/workflows/security.yml
- name: Run Semgrep
  uses: returntocorp/semgrep-action@v1
  with:
    config: >-
      p/c
      p/security-audit

Other tools worth integrating:

| Tool | Type | Best For |
| --- | --- | --- |
| Semgrep | SAST | Fast, rule-based pattern matching |
| Coverity | SAST | Deep interprocedural analysis |
| AddressSanitizer | Dynamic | Runtime memory error detection, including stack overflows |
| Valgrind | Dynamic | Memory leak and heap access checking |
| CodeQL | SAST | Semantic code analysis, GitHub-native |

5. Write Security Invariant Tests

The regression test included with this fix is a great model. It defines a clear invariant and tests it against a broad range of adversarial inputs — including format string payloads, null bytes, and inputs 100x larger than the buffer:

def test_buffer_reads_never_exceed_declared_length(payload):
    """
    Invariant: bounded_copy(payload, N) must always produce
    output of length <= N, regardless of input size.
    """
    declared_length = 256
    result = bounded_copy(payload, declared_length)

    assert len(result) <= declared_length, (
        f"VIOLATION: read {len(result)} bytes, max was {declared_length}"
    )

Writing tests that explicitly name and verify security invariants makes them first-class citizens in your test suite — not afterthoughts.

6. Consider Memory-Safe Languages for New Code

For new projects or when rewriting components, languages like Rust, Go, or Swift eliminate entire classes of memory safety bugs at the language level. Rust in particular makes buffer overflows essentially impossible without unsafe blocks. This is worth considering when the cost of a vulnerability is high.

Relevant Standards and References

CWE-120: Buffer Copy Without Checking Size of Input ("Classic Buffer Overflow")
CWE-121: Stack-based Buffer Overflow
OWASP: Buffer Overflow
CERT C Coding Standard: STR31-C (Guarantee that storage for strings has sufficient space)
NIST NVD: Tracks thousands of real-world CVEs rooted in this exact class of bug

Conclusion

Buffer overflows in C are a solved problem — not in the sense that they no longer occur, but in the sense that we have well-understood tools, techniques, and language features to prevent them. The fix here is straightforward: replace strcpy with strlcpy, replace sprintf with snprintf, and always pass the size of the destination buffer.

What makes this vulnerability interesting is the context: it appeared in a cryptographic utility file, where the irony of undermining security through an insecure implementation is particularly sharp. MD5 may be a legacy algorithm, but the code handling it still needs to be hardened.

The key takeaways:

Never use unbounded string functions in C — treat them as deprecated
Always bound copies to the destination size, using sizeof to avoid magic numbers
Automate detection with static analysis tools like Semgrep in your CI pipeline
Write invariant-based tests that verify security properties explicitly
Enable compiler hardening flags to add defense-in-depth at the binary level

Security isn't a feature you add at the end — it's a property you maintain throughout the lifetime of your code. One unsafe strcpy in a utility function is all it takes to unravel the security of a much larger system.

This vulnerability was automatically detected and fixed by OrbisAI Security.

cwe	CWE-120
fix	Replace strcpy/strcat/sprintf with strncpy/strncat/snprintf using explicit buffer size limits
risk	Arbitrary code execution, memory corruption, process crash
language	C
root cause	Use of strcpy() and other unbounded string functions without size validation
vulnerability	Buffer Overflow via unsafe string functions
