Severity: critical · 7 min read

Heap Buffer Overflow in yep.c: How sprintf() Broke the Resource Package Parser

A critical heap buffer overflow was discovered in `engine/src/yep.c` at line 448, where `sprintf()` copied an attacker-controlled file path into a fixed 64-byte `node->name` buffer with zero bounds checking. By crafting a malicious resource package file containing an oversized path, an attacker could corrupt adjacent heap memory — potentially enabling arbitrary code execution. The fix replaces the unbounded `sprintf()` call with `snprintf()`, enforcing the 64-byte limit at the call site.

By Orbis AppSec
Published June 1, 2026 · Reviewed June 3, 2026

Answer Summary

This is a heap buffer overflow vulnerability (CWE-122) in C, found in `engine/src/yep.c` at line 448. The root cause is an unbounded `sprintf()` call that copies an attacker-controlled file path into a fixed 64-byte `node->name` buffer without any length validation. The fix replaces `sprintf()` with `snprintf(node->name, 64, ...)`, which enforces the buffer size limit at the call site and prevents heap memory corruption.

Vulnerability at a Glance

CWE: CWE-122
Fix: Replace `sprintf()` with `snprintf()` specifying the 64-byte buffer limit
Risk: Heap memory corruption leading to potential arbitrary code execution
Language: C
Root cause: `sprintf()` copies attacker-controlled path data into a fixed 64-byte buffer with no bounds checking
Vulnerability: Heap Buffer Overflow


Introduction

The engine/src/yep.c file is responsible for recursively walking resource package directories and building a linked list of file nodes — a foundational piece of the engine's asset loading system. But a single line in _recurse_dir_callback() turned every resource package load into a potential heap corruption event:

sprintf(node->name, "%s", relative_path);

At line 448, this call copies the relative_path string, sourced directly from the filesystem enumeration callback, into node->name, a fixed-size buffer, with no length limit whatsoever. This is a textbook CWE-120 (Buffer Copy Without Checking Size of Input); because the destination buffer is heap-allocated, it is also the CWE-122 heap overflow named in the summary. In this context it's confirmed exploitable: an attacker who controls the contents of a .yep resource package can craft paths long enough to overflow the buffer and corrupt the heap.


The Vulnerability Explained

What's Actually Happening at Line 448

The _recurse_dir_callback function is an SDL directory enumeration callback. Each time it encounters a file, it allocates a new node and populates it with path information:

// Vulnerable code — engine/src/yep.c:448
sprintf(node->name, "%s", relative_path);
node->name[strlen(relative_path)] = '\0'; // ensure null termination

Two problems leap out immediately:

  1. sprintf() has no buffer-size argument. It will write as many bytes as relative_path contains, regardless of how large node->name is. The buffer is 64 bytes. The path can be arbitrarily longer.

  2. The null-termination line below it is dangerous when overflow has already occurred. node->name[strlen(relative_path)] computes an index based on the input length, not the buffer size — so if relative_path is 300 characters, this line writes a null byte 300 bytes past the start of node->name, deep into adjacent heap memory.
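The arithmetic behind both problems can be sketched in a few lines. This is an illustration only: `NAME_SIZE` and the helper names are hypothetical and do not come from yep.c, and the code only computes sizes and indices without performing the unsafe write.

```c
#include <stddef.h>
#include <string.h>

#define NAME_SIZE 64 /* size of node->name as described in the article */

/* Total bytes sprintf(name, "%s", path) would store, including the '\0'. */
size_t sprintf_bytes(const char *path)
{
    return strlen(path) + 1;
}

/* Number of bytes that would land past the end of the buffer (0 if it fits). */
size_t overflow_bytes(const char *path)
{
    size_t total = sprintf_bytes(path);
    return total > NAME_SIZE ? total - NAME_SIZE : 0;
}

/* name[strlen(path)] = '\0' is a valid write only if the index is below NAME_SIZE. */
int terminator_in_bounds(const char *path)
{
    return strlen(path) < NAME_SIZE;
}
```

For a 300-character path, `sprintf_bytes` reports 301 bytes, of which 237 fall outside the 64-byte buffer, and the manual terminator index (300) is far out of bounds as well.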

The Heap Layout Makes This Worse

Because node is heap-allocated, node->name sits on the heap. Directly after the 64-byte name buffer come the node's other fields: the node->fullpath pointer (set just three lines earlier with strdup(full_path)) and the linked-list next pointer. Past the end of the node allocation lies whatever the allocator placed next, potentially other nodes' data. Overflowing name corrupts these in order.
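To make the layout argument concrete, here is a minimal sketch that maps a write offset to the region it would hit. The struct below is an assumption for illustration: the article does not show the real definition, so the field order, types, and the names `file_node` and `region_at_offset` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical node layout; the real struct in yep.c may differ. */
struct file_node {
    char name[64];
    char *fullpath;
    struct file_node *next;
};

enum region { IN_NAME, IN_FULLPATH, IN_NEXT, PAST_NODE };

/* Which region a write at byte offset `off` from the start of name would touch. */
enum region region_at_offset(size_t off)
{
    if (off < offsetof(struct file_node, fullpath))
        return IN_NAME;
    if (off < offsetof(struct file_node, next))
        return IN_FULLPATH;
    if (off < sizeof(struct file_node))
        return IN_NEXT;
    return PAST_NODE;
}
```

Under this assumed layout, byte 64 of an oversized path is the first one to overwrite `fullpath`, and anything beyond `sizeof(struct file_node)` spills into neighboring heap memory.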

Concrete Attack Scenario

An attacker packages a malicious .yep resource file containing a directory entry with a path like:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA   64 bytes fills name
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB   overwrites node->fullpath
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC   overwrites node->next
...and so on

When the engine loads this package and _recurse_dir_callback processes the entry, sprintf() writes the full attacker-controlled string into node->name. The node->fullpath pointer, set moments earlier to a valid heap address, is now replaced with attacker-controlled bytes. When the engine later calls free(node->fullpath) or dereferences it for file I/O, the corrupted pointer is used, leading to an invalid free, a wild-pointer access, a crash, or, with careful heap grooming, arbitrary code execution.

This is not a theoretical risk. Resource packages are a standard distribution mechanism for game engines, and loading untrusted .yep files (downloaded mods, community assets, etc.) is a completely normal user action.


The Fix

Before and After

The change is a surgical, one-line edit:

// BEFORE — engine/src/yep.c:448 (vulnerable)
sprintf(node->name, "%s", relative_path);
node->name[strlen(relative_path)] = '\0'; // ensure null termination

// AFTER — engine/src/yep.c:448 (fixed)
snprintf(node->name, 64, "%s", relative_path);
node->name[strlen(relative_path)] = '\0'; // ensure null termination

The single change, sprintf → snprintf with an explicit size of 64, means the copy itself is now bounded. snprintf will write at most 63 characters plus a null terminator, regardless of how long relative_path is, so this call can no longer overflow the buffer.

Why snprintf Is the Right Tool Here

snprintf(dst, n, fmt, ...) guarantees, for n > 0:
- At most n - 1 characters are written to dst
- The output is always null-terminated inside the buffer (the terminator lands at dst[n-1] only when the output is truncated)
- The return value tells you whether truncation occurred (if ret >= n, the input was longer than the buffer)

This is precisely the guarantee needed when copying attacker-influenced data into a fixed-size buffer.
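These guarantees are easy to observe on a deliberately small buffer. The sketch below uses only standard `snprintf`; the helper names `bounded_copy` and `failed_or_truncated` are made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Copy src into dst with a hard bound; returns snprintf's result unchanged. */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    return snprintf(dst, dst_size, "%s", src);
}

/* Nonzero if snprintf reported an error or the output did not fit. */
int failed_or_truncated(int ret, size_t dst_size)
{
    return ret < 0 || (size_t)ret >= dst_size;
}
```

Copying the 17-character string "textures/hero.png" into an 8-byte buffer stores "texture" plus a terminator and returns 17, the length the full output would have needed, which is how the caller can tell that truncation happened.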

One Remaining Concern Worth Noting

The null-termination line that follows is now redundant, and it is still dangerous for long inputs:

node->name[strlen(relative_path)] = '\0'; // ensure null termination

After the snprintf fix, node->name is already null-terminated by snprintf itself. If relative_path is longer than 63 characters, strlen(relative_path) still returns the full input length, so this line still writes a single null byte out of bounds. The string in node->name is valid C either way, because snprintf places its own terminator at index 63 when it truncates, but the stray write remains. This line should be removed, or changed to node->name[sizeof(node->name) - 1] = '\0', in a follow-up cleanup.
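One possible shape for that follow-up cleanup is sketched below. It is not the project's actual code: `struct file_node` here is a stripped-down stand-in and `set_node_name` is a hypothetical helper. The point is that the copy relies on snprintf's own terminator and has no manual write indexed by the input length.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the engine's node type; only name matters here. */
struct file_node {
    char name[64];
};

/* Copy relative_path into node->name.
 * Returns 1 if the name was truncated (or snprintf failed), 0 otherwise. */
int set_node_name(struct file_node *node, const char *relative_path)
{
    int ret = snprintf(node->name, sizeof(node->name), "%s", relative_path);
    /* No manual terminator: snprintf already wrote '\0' inside the buffer. */
    return ret < 0 || (size_t)ret >= sizeof(node->name);
}
```

With a 300-character input, the helper reports truncation and leaves a 63-character, terminated name; nothing is written at index 300.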


Prevention & Best Practices

1. Ban sprintf() in Security-Sensitive Code

sprintf() has no place in code that processes external data. Most modern C codebases enforce this with compiler warnings or static analysis rules:

# GCC: warn when a sprintf() call may overflow its destination
-Wformat-overflow=2

# clang-tidy (Clang Static Analyzer) check that flags sprintf() and similar calls:
# clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling

Consider adding a project-wide #define sprintf BANNED_FUNCTION to catch any future regressions at compile time.
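A minimal sketch of such a ban, written as a function-like macro rather than the object-like form mentioned above. The `BANNED` macro and the resulting identifier are illustrative names, and redefining a standard library name this way is a pragmatic project convention rather than something the C standard blesses.

```c
#include <stdio.h>

/* Expands to an undeclared identifier, so any use fails to compile. */
#define BANNED(func) sorry_##func##_is_a_banned_function

/* Must come after <stdio.h>; some libcs already define sprintf as a macro. */
#undef sprintf
#define sprintf(...) BANNED(sprintf)

/* snprintf is unaffected and remains the sanctioned way to format into a buffer. */
int format_entry(char *dst, size_t dst_size, int index)
{
    return snprintf(dst, dst_size, "entry-%d", index);
}
```

With this header included project-wide, a stray `sprintf(buf, "%s", path);` stops the build with an error that names the banned function, while existing `snprintf` calls compile as before.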

2. Always Use the Buffer Size, Not the Input Size

A common mistake is writing:

snprintf(buf, strlen(input), "%s", input);  // WRONG — uses input length, not buffer length
snprintf(buf, sizeof(buf), "%s", input);    // CORRECT — uses actual buffer size

In yep.c, the hardcoded 64 works, but sizeof(node->name) is more maintainable — if the buffer size ever changes, the bound automatically updates.

3. Check snprintf's Return Value for Truncation

If truncated paths cause downstream logic errors (e.g., a node with a silently-shortened name that doesn't match its fullpath), the caller should detect and handle truncation:

int ret = snprintf(node->name, sizeof(node->name), "%s", relative_path);
if (ret < 0 || ret >= (int)sizeof(node->name)) {
    // log warning: path truncated or encoding error
    // decide: skip this node, return error, or continue with truncated name
}

4. Validate Input Length Before Copying

For a resource parser that controls its own format, rejecting oversized paths early is cleaner than silently truncating them:

if (strlen(relative_path) >= sizeof(node->name)) {
    SDL_LogWarn(SDL_LOG_CATEGORY_APPLICATION,
        "Path too long for node->name, skipping: %s", relative_path);
    return SDL_ENUM_CONTINUE;
}
snprintf(node->name, sizeof(node->name), "%s", relative_path);

5. Use AddressSanitizer During Development

The regression test suite included with this fix uses a canary value (0xDEADBEEF) placed immediately after the buffer to detect overflow. In CI, running with ASan provides the same protection automatically:

clang -fsanitize=address -fsanitize=undefined engine/src/yep.c

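ASan would have flagged this overflow the first time a test processed a path long enough to write past the end of the node allocation. Writes that stay inside the same struct, such as name spilling into a neighboring field, are not reported by default, so a canary check still adds coverage.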
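The article does not show the regression test itself, so the following is only a guess at what a canary-style check could look like. The fixture struct, the `CANARY` macro, and `canary_survives` are hypothetical names; the only detail taken from the text is the 0xDEADBEEF value placed directly after a 64-byte buffer.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CANARY 0xDEADBEEFu

/* Hypothetical test fixture: a 64-byte name followed directly by a canary word. */
struct guarded_name {
    char name[64];
    uint32_t canary;
};

/* Returns 1 if the bounded copy left the canary intact, 0 if it was overwritten. */
int canary_survives(const char *path)
{
    struct guarded_name g;
    memset(g.name, 0, sizeof(g.name));
    g.canary = CANARY;
    snprintf(g.name, sizeof(g.name), "%s", path);
    return g.canary == CANARY;
}
```

A test would call `canary_survives` with paths of 63, 64, and several hundred characters and expect 1 each time; the same harness pointed at the old `sprintf` call would be expected to fail on anything longer than 63 characters.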

Key Takeaways

  • sprintf() in _recurse_dir_callback() was the root cause — not a missing validation somewhere upstream. The fix had to happen at the copy site itself.
  • The null-termination line after the copy was a second latent vulnerability: node->name[strlen(relative_path)] = '\0' would still write out of bounds for long inputs even after a partial fix.
  • Resource package parsers are a high-value attack surface — any code that reads attacker-supplied file paths into fixed buffers needs explicit bounds on every string operation.
  • snprintf with sizeof(buffer) is the correct pattern, not strlen(input) — the bound must reflect the destination, not the source.
  • A 64-byte name buffer is tight for filesystem paths — teams should evaluate whether this limit is sufficient for all valid use cases, or whether the struct definition needs revisiting alongside this security fix.

Conclusion

A single missing size argument in a sprintf() call inside _recurse_dir_callback() created a critical heap buffer overflow in engine/src/yep.c. Because relative_path comes from a resource package file that any user can craft, this was a straightforward path to heap corruption in production. The fix — replacing sprintf(node->name, "%s", relative_path) with snprintf(node->name, 64, "%s", relative_path) — closes the overflow by enforcing the buffer's actual size at the only place that matters: the write itself.

The broader lesson for C developers working on file parsers and resource loaders: every string copy from external data into a fixed buffer is a potential CWE-120. Make snprintf with sizeof(destination) your default, ban sprintf from your codebase, and run ASan in CI so that the next oversized path fails loudly in a test rather than silently in production.

Frequently Asked Questions

What is a heap buffer overflow?

A heap buffer overflow occurs when a program writes more data into a heap-allocated buffer than the buffer can hold, corrupting adjacent memory. In C, functions like sprintf() that do not enforce output length limits are a common cause.

How do you prevent heap buffer overflow in C?

Always use length-limited string functions such as snprintf() instead of sprintf(), validate input lengths before copying, and use static analysis tools to catch unbounded writes to fixed-size buffers.

What CWE is heap buffer overflow?

Heap buffer overflow is classified as CWE-122 (Heap-based Buffer Overflow), a subset of CWE-119 (Improper Restriction of Operations within the Bounds of a Memory Buffer).

Is input validation alone enough to prevent buffer overflow in C?

Input validation helps but is not sufficient on its own. You must also use bounds-enforcing functions like snprintf() at every copy site, because validation logic can be bypassed or missed in deeply nested parsing code.

Can static analysis detect heap buffer overflow from sprintf()?

Yes. Static analysis tools such as Semgrep, Coverity, and clang-analyzer can flag unbounded sprintf() calls writing to fixed-size buffers, especially when the source data originates from external input like file paths.

View the Security Fix

Check out the pull request that fixed this vulnerability

View PR #21
