What is a buffer overflow in C string operations?

A buffer overflow occurs when data is written beyond the allocated size of a buffer. In this case, `strcat()` and `sprintf()` concatenate filter strings without checking if the destination buffer has enough space, allowing attackers to overflow the buffer with crafted filter specifications.

How do you prevent buffer overflow in C string operations?

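Prefer functions that take the destination size, such as `snprintf()`. `strncat()` and `strncpy()` can also help, but their length argument is easy to misuse: `strncat()` limits the number of characters appended rather than the total destination size, and `strncpy()` does not null-terminate when the source fills the buffer. Track the remaining buffer capacity and make sure every string operation respects it.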

What CWE is this buffer overflow?

CWE-120 (Buffer Copy without Checking Size of Input) and CWE-119 (Improper Restriction of Operations within the Bounds of a Memory Buffer). This is a classic unsafe string handling vulnerability.

Is input validation alone enough to prevent this buffer overflow?

No. While input validation helps, the real fix is using size-aware functions. Even with validation, a sufficiently long valid filter specification could still overflow the buffer if the buffer size is fixed and too small.

Can static analysis detect this buffer overflow?

Yes. Modern static analysis tools can flag `strcat()` and `sprintf()` calls as dangerous when used with unbounded input, and can detect fixed-size buffers that may overflow. Orbis AppSec detects this pattern automatically.

Critical Buffer Overflow in NCO Filter String Construction: How `strcat()` Without Bounds Checking Can Corrupt Memory

Introduction

Buffer overflow vulnerabilities are among the oldest and most dangerous classes of security bugs in systems programming. Despite decades of awareness, they continue to appear in production codebases, often in places that seem innocuous at first glance, like a loop that builds a string. This post examines a critical-severity buffer overflow discovered in the NetCDF Operators (NCO), a widely used suite of tools for manipulating scientific data in NetCDF format.

The vulnerability lived inside nco_flt.c, the component responsible for parsing and constructing compression filter specification strings. A loop that iteratively called strcat() and sprintf() to build a composite filter string had no bounds checking whatsoever — a classic recipe for heap or stack memory corruption.

If you write C or C++, or if you maintain code that processes user-supplied data into fixed-size buffers, this post is directly relevant to you.

The Vulnerability Explained

What Is a Buffer Overflow?

A buffer overflow occurs when a program writes more data into a buffer than it was allocated to hold. The excess data spills into adjacent memory regions, potentially overwriting control structures, return addresses, or other variables. In the worst case, an attacker can craft input that places executable shellcode into memory or hijacks program control flow.

What Went Wrong in `nco_flt.c`

The vulnerable code lived in the nco_cmp_prs function, which parses user-provided compression specifications and assembles them into a standardized string format. Here's a simplified look at what the original loop was doing:

/* VULNERABLE CODE (before fix) */
cmp_sng_std = (char *)nco_malloc(NCO_FLT_SNG_LNG_MAX * sizeof(char));
cmp_sng_std[0] = '\0';

for (flt_idx = 0; flt_idx < flt_nbr; flt_idx++) {
    if (flt_alg[flt_idx] != nco_flt_unk) {
        // ⚠️ No bounds check — appends directly to buffer
        (void)strcat(cmp_sng_std, nco_flt_enm2nmid(flt_alg[flt_idx], NULL));
    } else {
        flt_nm_id[0] = '\0';
        // ⚠️ sprintf with no length limit
        (void)sprintf(flt_nm_id, "%u", flt_id[flt_idx]);
        // ⚠️ Again, no bounds check on destination
        (void)strcat(cmp_sng_std, flt_nm_id);
    }

    if (flt_prm_nbr[flt_idx] > 0)
        (void)strcat(cmp_sng_std, ","); // ⚠️ Unchecked

    int_sng[0] = '\0';
    for (prm_idx = 0; prm_idx < flt_prm_nbr[flt_idx]; prm_idx++) {
        // ⚠️ Writes to intermediate buffer, then strcat to main buffer
        (void)sprintf(int_sng, "%d%s", flt_prm[flt_idx][prm_idx],
                      prm_idx < flt_prm_nbr[flt_idx] - 1 ? "," : "");
    }
    (void)strcat(cmp_sng_std, int_sng); // ⚠️ Unchecked concatenation

    if (flt_idx < flt_nbr - 1)
        (void)strcat(cmp_sng_std, spr_sng); // ⚠️ Unchecked
}

Let's break down the specific problems:

1. `strcat()` Is Blind to Buffer Boundaries

The C standard library function strcat(dst, src) appends src to dst by scanning for the null terminator in dst and then copying bytes from src until it hits src's null terminator. It has no idea how large dst's allocated buffer is. Every single strcat() call here is a potential overflow if the accumulated string grows beyond NCO_FLT_SNG_LNG_MAX.

2. `sprintf()` Without a Length Limit

sprintf(buf, fmt, ...) writes formatted output to buf with no concept of how much space buf has. The bounded alternative, snprintf(), accepts a maximum byte count. A %u conversion of a 32-bit unsigned int produces at most 10 digits, so flt_nm_id is safe only if it holds at least 11 bytes. Its declaration is not shown here, and sprintf() does nothing to enforce that assumption.
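To make the sizing question concrete, here is a minimal sketch that measures how many bytes a `%u` conversion needs. It relies on the C99 rule that `snprintf()` called with a null buffer and a size of zero writes nothing and returns the length the output would have had. The helper name `fmt_uint_len` is hypothetical and is not part of NCO.

```c
#include <stdio.h>

/* Hypothetical helper (not from NCO): number of bytes, including the
 * terminating NUL, needed to hold `value` formatted with "%u". */
static size_t fmt_uint_len(unsigned int value)
{
    /* C99: with a NULL buffer and size 0, snprintf() writes nothing and
     * returns the length the output would have had. */
    int len = snprintf(NULL, 0, "%u", value);
    return len < 0 ? 0 : (size_t)len + 1;
}
```

A caller could compare the result against the size of the intermediate buffer before formatting, or simply format with `snprintf()` and the real buffer size.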

3. The Loop Multiplies the Risk

Each iteration of the outer loop appends more data to cmp_sng_std. With enough filter entries (flt_nbr), enough parameters per filter (flt_prm_nbr), or long filter algorithm names, the cumulative writes can exceed NCO_FLT_SNG_LNG_MAX. There is no early exit, no length check, and no truncation, just unconditional appending.

How Could This Be Exploited?

NCO processes NetCDF files, which are commonly exchanged in scientific computing environments. An attacker could exploit this vulnerability in two ways:

Scenario 1: Crafted NetCDF File
An attacker crafts a NetCDF file with an unusually large number of filter specifications, or with filter algorithm names/IDs that are maximally long. When a victim processes this file with an NCO tool, the filter string construction loop overflows the buffer, potentially corrupting heap metadata or stack return addresses.

Scenario 2: Malicious Command-Line Input
A user (or automated pipeline script) passes a crafted compression filter specification string on the command line with many filters and many comma-separated filter parameters. Each iteration of the loop adds to the overflow.

Real-World Impact

Impact Category	Details
Memory Corruption	Heap or stack memory adjacent to `cmp_sng_std` gets overwritten
Crash / DoS	Corrupted heap metadata causes `malloc`/`free` to abort
Code Execution	In stack-based scenarios, return address overwrite enables arbitrary code execution
Data Integrity	Silent corruption may produce incorrect scientific output without crashing

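Because cmp_sng_std is allocated with nco_malloc (assumed here to wrap malloc()), overrunning it is a heap-based overflow, CWE-122. Overrunning the intermediate buffers flt_nm_id or int_sng would be CWE-121 (Stack-based Buffer Overflow) if they are local arrays; their declarations are not shown. Both cases also fall under the more general CWE-120 and CWE-119, and the critical severity rating is well earned.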

The Fix

What Changed

The fix is elegant and follows the industry-standard pattern for safe string construction in C: replace all strcat()/sprintf() calls with snprintf() and track the current write offset.

/* FIXED CODE (after patch) */
cmp_sng_std = (char *)nco_malloc(NCO_FLT_SNG_LNG_MAX * sizeof(char));
cmp_sng_std[0] = '\0';
size_t sng_off = 0; /* Current offset into cmp_sng_std */

for (flt_idx = 0; flt_idx < flt_nbr; flt_idx++) {
    if (flt_alg[flt_idx] != nco_flt_unk) {
        // ✅ snprintf with remaining capacity
        sng_off += (size_t)snprintf(cmp_sng_std + sng_off,
                                    NCO_FLT_SNG_LNG_MAX - sng_off,
                                    "%s",
                                    nco_flt_enm2nmid(flt_alg[flt_idx], NULL));
    } else {
        // ✅ Direct format into main buffer, bounded
        sng_off += (size_t)snprintf(cmp_sng_std + sng_off,
                                    NCO_FLT_SNG_LNG_MAX - sng_off,
                                    "%u", flt_id[flt_idx]);
    }

    if (flt_prm_nbr[flt_idx] > 0)
        sng_off += (size_t)snprintf(cmp_sng_std + sng_off,
                                    NCO_FLT_SNG_LNG_MAX - sng_off, ",");

    for (prm_idx = 0; prm_idx < flt_prm_nbr[flt_idx]; prm_idx++) {
        // ✅ Parameters written directly with bounds
        sng_off += (size_t)snprintf(cmp_sng_std + sng_off,
                                    NCO_FLT_SNG_LNG_MAX - sng_off,
                                    "%d%s",
                                    flt_prm[flt_idx][prm_idx],
                                    prm_idx < flt_prm_nbr[flt_idx] - 1 ? "," : "");
    }

    if (flt_idx < flt_nbr - 1)
        sng_off += (size_t)snprintf(cmp_sng_std + sng_off,
                                    NCO_FLT_SNG_LNG_MAX - sng_off,
                                    "%s", spr_sng);
}

Why This Fix Works

The `snprintf()` Guarantee

snprintf(buf, n, fmt, ...) writes at most n - 1 characters to buf and always null-terminates (when n > 0). Even if the formatted output would be 10,000 characters, snprintf() writes only as many characters as fit and returns the length the full output would have had. As long as n never exceeds the space actually available at buf, the buffer cannot overflow.
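The return value is what separates a complete write from a truncated one. The sketch below illustrates that check with a hypothetical helper, `copy_truncated`, which is not taken from NCO.

```c
#include <stdio.h>

/* Hypothetical demo helper: formats `text` into a `cap`-byte buffer and
 * returns 1 if the output had to be truncated (or formatting failed),
 * 0 if it fit completely. */
static int copy_truncated(char *buf, size_t cap, const char *text)
{
    int need = snprintf(buf, cap, "%s", text);
    /* A negative return is an encoding error. A return >= cap means the
     * output did not fit and was cut to cap - 1 characters plus the NUL. */
    return need < 0 || (size_t)need >= cap;
}
```

With an 8-byte buffer, a 14-character filter string is cut to its first 7 characters and the helper reports truncation, while a 4-character string fits and is reported as complete.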

The Offset Tracking Pattern

The key insight is the sng_off variable:

cmp_sng_std + sng_off   →  pointer to the next write position
NCO_FLT_SNG_LNG_MAX - sng_off  →  remaining bytes available

After each snprintf() call, sng_off is incremented by the call's return value. When the output fits, that is the number of characters written, so the next call picks up exactly where the last one left off and the capacity argument shrinks accordingly. On truncation, however, snprintf() returns the length the full output would have had, so sng_off can move past NCO_FLT_SNG_LNG_MAX. Because sng_off is a size_t, NCO_FLT_SNG_LNG_MAX - sng_off would then wrap around to a very large value instead of reaching zero, and cmp_sng_std + sng_off would point outside the buffer.

Note: A return value ≥ the size argument means truncation occurred. Production-hardened code should check for this condition after every call, then stop appending or clamp sng_off to the end of the buffer, and emit a warning or error rather than silently truncating the filter specification.
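A hardened variant of the offset pattern might clamp the offset and report truncation explicitly. The following is a minimal sketch under that assumption, not the NCO patch; the helper name `buf_appendf` and its return convention are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper (not the NCO patch): append formatted text to buf at
 * offset *off, never advancing *off past the last byte really written.
 * Returns 0 on success, -1 on truncation or formatting error. */
static int buf_appendf(char *buf, size_t cap, size_t *off, const char *fmt, ...)
{
    if (cap == 0 || *off >= cap)
        return -1;                  /* no room left, nothing is written */

    va_list ap;
    va_start(ap, fmt);
    int need = vsnprintf(buf + *off, cap - *off, fmt, ap);
    va_end(ap);

    if (need < 0)
        return -1;                  /* encoding error */
    if ((size_t)need >= cap - *off) {
        *off = cap - 1;             /* clamp: buffer is full, NUL sits at cap - 1 */
        return -1;                  /* caller can warn or abort */
    }
    *off += (size_t)need;
    return 0;
}
```

Each append in the loop would then become a call such as `buf_appendf(cmp_sng_std, NCO_FLT_SNG_LNG_MAX, &sng_off, "%u", flt_id[flt_idx])`, with the caller deciding whether a -1 result is a warning or a hard error.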

The Intermediate Buffer Is Eliminated

The original code used the intermediate buffers flt_nm_id and int_sng to format values before strcat()-ing them into the main buffer. The fix eliminates this two-step dance entirely: values are formatted directly into cmp_sng_std at the correct offset. This removes both the extra copies and the need to size those intermediate buffers correctly.

Before vs. After: Side-by-Side

Aspect	Before (Vulnerable)	After (Fixed)
String append function	`strcat()` — no bounds	`snprintf()` — bounded
Formatting function	`sprintf()` — no bounds	`snprintf()` — bounded
Bounds tracking	None	`sng_off` offset variable
Intermediate buffers	`flt_nm_id`, `int_sng`	Eliminated
Buffer overflow possible	✅ Yes	❌ No, while `sng_off` stays within the buffer
Overflow on crafted input	✅ Yes	❌ No (truncation instead)

Prevention & Best Practices

1. Never Use `strcat()` or `sprintf()` on Fixed Buffers

These functions are considered unsafe in modern C development. Replace them:

Unsafe	Safe Alternative
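`strcat(dst, src)`	`snprintf()` with offset tracking, or `strncat(dst, src, n)` with `n = sizeof dst - strlen(dst) - 1`
`sprintf(buf, fmt, ...)`	`snprintf(buf, size, fmt, ...)`
`strcpy(dst, src)`	`snprintf(dst, size, "%s", src)` or `strlcpy()` where available; `strncpy(dst, src, n)` only with explicit null termination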
`gets(buf)`	`fgets(buf, size, stream)`

Many compilers and static analyzers will warn about these unsafe functions. Enable those warnings (-Wall -Wformat -Wformat-overflow in GCC/Clang).
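`strncat()` deserves particular care: its third argument is the maximum number of characters to append, not the size of the destination. A minimal sketch of a correct call, using a hypothetical wrapper named `bounded_append`, could look like this:

```c
#include <string.h>

/* Hypothetical wrapper: append src to the NUL-terminated string in dst,
 * where dst_size is the total size of the dst array. Returns 0 if all of
 * src fit, -1 if it was truncated or dst was already full. */
static int bounded_append(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);
    if (used + 1 > dst_size)
        return -1;                      /* dst has no room for even a NUL */
    size_t room = dst_size - used - 1;  /* free bytes, excluding the NUL */
    /* strncat() copies at most `room` characters and then adds a NUL, so
     * the third argument must be the free space, not sizeof(dst). */
    strncat(dst, src, room);
    return strlen(src) > room ? -1 : 0;
}
```

Passing `sizeof(dst)` as the third argument instead is a common mistake and reintroduces the overflow whenever dst is not empty.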

2. Track Your Write Position

When building strings in a loop, always maintain an offset variable:

size_t offset = 0;
size_t remaining = BUFFER_SIZE;

for (int i = 0; i < count && remaining > 0; i++) {
    int written = snprintf(buf + offset, remaining, "%s,", items[i]);
    if (written > 0) {
        offset += (size_t)written;
        remaining = (offset < BUFFER_SIZE) ? BUFFER_SIZE - offset : 0;
    }
}

3. Consider Dynamic Allocation for Variable-Length Output

If the maximum output size is genuinely unknown, consider building the string dynamically:

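// Using a growing buffer via POSIX open_memstream() (declared in <stdio.h>, not part of ISO C)
char *result = NULL;
size_t result_len = 0;
FILE *stream = open_memstream(&result, &result_len);
if (stream != NULL) {
    for (int i = 0; i < count; i++) {
        fprintf(stream, "%s,", items[i]);
    }
    fclose(stream); // result and result_len are only valid after fclose() or fflush()
}
// On success, result holds the full string; release it with free(result)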

Or use a string-building library appropriate to your platform.
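Where `open_memstream()` is not available, a two-pass approach in standard C is another option: measure the total length first, allocate exactly that much, then format. The sketch below assumes a simple comma-joined list; the function name `join_with_commas` is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: join `count` strings with commas into a freshly
 * allocated buffer sized by a first measuring pass. The caller frees the
 * result. Returns NULL on allocation or formatting failure. */
static char *join_with_commas(const char *const *items, size_t count)
{
    size_t total = 1;                       /* room for the final NUL */
    for (size_t i = 0; i < count; i++) {
        /* Measuring pass: NULL buffer and size 0 write nothing. */
        int n = snprintf(NULL, 0, "%s%s", items[i], i + 1 < count ? "," : "");
        if (n < 0)
            return NULL;
        total += (size_t)n;
    }

    char *out = malloc(total);
    if (out == NULL)
        return NULL;
    out[0] = '\0';

    size_t off = 0;
    for (size_t i = 0; i < count; i++) {
        int n = snprintf(out + off, total - off, "%s%s",
                         items[i], i + 1 < count ? "," : "");
        if (n < 0 || (size_t)n >= total - off) {
            free(out);                      /* not expected after measuring */
            return NULL;
        }
        off += (size_t)n;
    }
    return out;
}
```

The second pass still passes the remaining capacity to `snprintf()`, so a mismatch between the two passes results in a NULL return rather than an overflow.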

4. Use Static and Dynamic Analysis Tools

Several tools can detect these vulnerabilities automatically, some statically at build time and some dynamically at run time:

Coverity — Detects buffer overflows, unsafe string functions
AddressSanitizer (ASan) — Runtime detection of buffer overflows (-fsanitize=address)
Valgrind — Memory error detection at runtime
Clang Static Analyzer — Compile-time analysis
Semgrep — Pattern-based detection of unsafe C functions
CodeQL — Semantic code analysis for security vulnerabilities

A simple Semgrep rule to catch strcat usage:

rules:
  - id: unsafe-strcat
    patterns:
      - pattern: strcat($DST, $SRC)
    message: "Unsafe strcat() call. Use snprintf() with bounds tracking instead."
    languages: [c, cpp]
    severity: ERROR

5. Compiler Hardening Flags

Enable compiler protections that can mitigate (though not prevent) buffer overflows:

# Stack canaries — detect stack overflows at runtime
-fstack-protector-strong

# Fortify source: adds bounds checks to string functions (only active with optimization, -O1 or higher)
-O2 -D_FORTIFY_SOURCE=2

# Position-independent executables — makes ROP harder
-fPIE -pie

# Full RELRO — hardens GOT against overwrites
-Wl,-z,relro,-z,now

6. Relevant Security Standards

CWE-119: Improper Restriction of Operations within the Bounds of a Memory Buffer
CWE-120: Buffer Copy without Checking Size of Input
CWE-121: Stack-based Buffer Overflow
CWE-122: Heap-based Buffer Overflow
CERT C Coding Standard STR31-C: Guarantee sufficient storage for strings
OWASP Buffer Overflow: OWASP guidance on buffer overflow prevention

Conclusion

This vulnerability is a textbook example of how a seemingly routine string-building loop can harbor a critical security flaw. The original code wasn't written by careless developers — it was a natural pattern in C that predates widespread awareness of buffer overflow risks. But strcat() and sprintf() without bounds checking are always dangerous when the input is of variable or attacker-controlled length.

The fix is clean, minimal, and idiomatic: replace unsafe functions with snprintf(), track the write offset, and pass the remaining capacity on every call. This pattern costs almost nothing in performance and, provided the offset is never allowed to move past the end of the buffer, guarantees that the buffer cannot be overrun.

Key Takeaways

✅ Never use strcat() or sprintf() on fixed-size buffers when input length is variable
✅ Use snprintf() with explicit size limits and track your write offset
✅ Eliminate intermediate buffers when you can write directly to the destination
✅ Enable compiler warnings and sanitizers to catch these issues early
✅ Process external data (files, command-line args) with extra scrutiny — it's attacker-controlled
✅ Automated security scanning can catch these patterns before they reach production

Buffer overflows have been exploited since the Morris Worm of 1988. More than 35 years later, memory-safety weaknesses such as out-of-bounds writes (CWE-787) still rank near the top of the CWE Top 25. The only way to eliminate them is through disciplined use of bounds-aware APIs, automated tooling, and a security-conscious code review culture.

This vulnerability was identified and patched via automated security scanning. Automated tools like these help catch memory safety issues at scale — but they work best when paired with developer education about why these patterns are dangerous in the first place.

Found a security issue in your codebase? Consider integrating static analysis into your CI/CD pipeline as a first line of defense.

cwe	CWE-120 (Buffer Copy without Checking Size of Input)
fix	Replace with snprintf() that tracks remaining buffer capacity
risk	Remote code execution, denial of service, heap memory corruption
language	C
root cause	Use of strcat() and sprintf() without buffer bounds checking in filter parsing loop
vulnerability	Buffer overflow in filter string construction via unbounded strcat()/sprintf()
