Heap Buffer Overflow in yep.c: How sprintf() Broke the Resource Package Parser
Introduction
The engine/src/yep.c file is responsible for recursively walking resource package directories and building a linked list of file nodes — a foundational piece of the engine's asset loading system. But a single line in _recurse_dir_callback() turned every resource package load into a potential heap corruption event:
sprintf(node->name, "%s", relative_path);
At line 448, this call copies a relative_path string — sourced directly from the filesystem enumeration callback — into node->name, a fixed-size buffer, with no length limit whatsoever. This is a textbook CWE-120 (Buffer Copy Without Checking Size of Input), and in this context, it's confirmed exploitable: an attacker who controls the contents of a .yep resource package can craft paths long enough to overflow the buffer and corrupt the heap.
The Vulnerability Explained
What's Actually Happening at Line 448
The _recurse_dir_callback function is an SDL directory enumeration callback. Each time it encounters a file, it allocates a new node and populates it with path information:
// Vulnerable code — engine/src/yep.c:448
sprintf(node->name, "%s", relative_path);
node->name[strlen(relative_path)] = '\0'; // ensure null termination
Two problems leap out immediately:
- sprintf() has no buffer-size argument. It will write as many bytes as relative_path contains, regardless of how large node->name is. The buffer is 64 bytes. The path can be arbitrarily longer.
- The null-termination line below it is dangerous when overflow has already occurred. node->name[strlen(relative_path)] computes an index based on the input length, not the buffer size, so if relative_path is 300 characters, this line writes a null byte 300 bytes past the start of node->name, deep into adjacent heap memory.
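To make the second problem concrete, here is a minimal sketch of the index arithmetic on its own, with no out-of-bounds write performed. NAME_CAPACITY, terminator_index and terminator_in_bounds are hypothetical names used only for illustration; the 64-byte size is the one stated above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NAME_CAPACITY 64 /* size of node->name as described in this article */

/* Index the original code writes its '\0' to: it depends only on the input. */
static size_t terminator_index(const char *relative_path) {
    return strlen(relative_path);
}

/* True when that index still lies inside a NAME_CAPACITY-byte buffer. */
static bool terminator_in_bounds(const char *relative_path) {
    return terminator_index(relative_path) < NAME_CAPACITY;
}
```

For a short path such as "textures/a.png" the index stays inside the buffer. For a 300-character path the index is 300 and the check fails, which is exactly the case the original line never tested for.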
The Heap Layout Makes This Worse
Because node is heap-allocated, node->name sits on the heap. Directly after the 64-byte name buffer come the node's other fields: the node->fullpath pointer (set just three lines earlier with strdup(full_path)) and the linked-list next pointer. Past the end of the node lies whatever the allocator placed next, potentially its own bookkeeping data or other nodes. Overflowing name corrupts the node's own pointers first and then that neighboring memory.
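The distances involved can be sketched with offsetof. The struct below is an assumed layout for illustration only (name first, then fullpath, then next); the real definition in yep.c may order or pad its fields differently.

```c
#include <stddef.h>

/* Hypothetical node layout; NOT copied from yep.c. */
struct yep_node_sketch {
    char name[64];                /* fixed-size destination buffer */
    char *fullpath;               /* heap pointer set via strdup() */
    struct yep_node_sketch *next; /* linked-list link */
};

/* Bytes a copy into `name` can write before it reaches `fullpath`. */
static size_t bytes_before_fullpath(void) {
    return offsetof(struct yep_node_sketch, fullpath)
         - offsetof(struct yep_node_sketch, name);
}

/* Bytes a copy into `name` can write before it reaches `next`. */
static size_t bytes_before_next(void) {
    return offsetof(struct yep_node_sketch, next)
         - offsetof(struct yep_node_sketch, name);
}
```

Under this assumed layout, on common platforms, the 65th byte written already lands on fullpath, and next follows one pointer width later. An overflow of only a few bytes is therefore enough to damage both pointers.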
Concrete Attack Scenario
An attacker packages a malicious .yep resource file containing a directory entry with a path like:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA ← 64 bytes fill name
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB ← runs over node->fullpath and node->next
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC ← continues into adjacent heap memory
...and so on
When the engine loads this package and _recurse_dir_callback processes the entry, sprintf() writes the full attacker-controlled string into node->name. The node->fullpath pointer, set moments earlier to a valid heap address, is now replaced with attacker-controlled bytes. When the engine later calls free(node->fullpath) or dereferences it for file I/O, the corrupted pointer is used, leading to an invalid free, a wild pointer dereference, a crash, or (with careful heap grooming) arbitrary code execution.
This is not a theoretical risk. Resource packages are a standard distribution mechanism for game engines, and loading untrusted .yep files (downloaded mods, community assets, etc.) is a completely normal user action.
The Fix
Before and After
The change to the copy itself is surgical:
// BEFORE — engine/src/yep.c:448 (vulnerable)
sprintf(node->name, "%s", relative_path);
node->name[strlen(relative_path)] = '\0'; // ensure null termination
// AFTER — engine/src/yep.c:448 (fixed)
snprintf(node->name, 64, "%s", relative_path);
node->name[strlen(relative_path)] = '\0'; // ensure null termination
The single change, sprintf → snprintf with an explicit size of 64, means the copy is now bounded. snprintf will write at most 63 characters plus a null terminator, regardless of how long relative_path is, so the copy itself can no longer overflow the buffer. The manual null-termination line that follows is unchanged and still indexes by input length.
Why snprintf Is the Right Tool Here
snprintf(dst, n, fmt, ...) guarantees:
- At most n - 1 characters are written to dst
- The output is always null-terminated when n > 0; the terminator lands at dst[n-1] only when the input is truncated
- The return value tells you whether truncation occurred (if ret >= n, the input was longer than the buffer)
This is precisely the guarantee needed when copying attacker-influenced data into a fixed-size buffer.
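These guarantees can be wrapped in a small helper that reports truncation to the caller. copy_bounded is a hypothetical name for illustration and does not appear in yep.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Copy src into dst (capacity dst_size bytes) using snprintf.
 * Returns true if the whole string fit, false on truncation or error.
 * dst is null-terminated in both cases as long as dst_size > 0.
 */
static bool copy_bounded(char *dst, size_t dst_size, const char *src) {
    int ret = snprintf(dst, dst_size, "%s", src);
    /* ret is the length snprintf wanted to write, excluding the '\0'. */
    return ret >= 0 && (size_t)ret < dst_size;
}
```

With an 8-byte destination and the input "abcdefghijkl", the buffer ends up holding "abcdefg" and the helper returns false, because snprintf reports a wanted length of 12.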
One Remaining Concern Worth Noting
The null-termination line that follows is redundant for inputs that fit and still dangerous for inputs that do not:
node->name[strlen(relative_path)] = '\0'; // ensure null termination
After the snprintf fix, node->name is already null-terminated by snprintf itself. If relative_path is longer than 63 characters, strlen(relative_path) still returns the full input length, so this line still writes a single null byte out of bounds, at an offset the attacker chooses. The string in node->name is valid C thanks to snprintf's own terminator at position 63, but the stray write remains until this line is removed or changed to node->name[63] = '\0' in a follow-up cleanup.
Prevention & Best Practices
1. Ban sprintf() in Security-Sensitive Code
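sprintf() has no place in code that processes external data. Many C codebases enforce this with compiler warnings or static analysis rules:
# GCC/Clang: flags sprintf only where the libc headers mark it deprecated (e.g. recent macOS SDKs)
-Wdeprecated-declarations
# GCC: warn when a formatted write can be shown to overflow its destination
-Wformat-overflow=2
# Or enable the Clang Static Analyzer check (available through clang-tidy) that flags sprintf and similar calls:
# clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling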
Consider adding a project-wide #define sprintf BANNED_FUNCTION to catch any future regressions at compile time.
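One way such a ban can be sketched is a function-like macro that expands to an undeclared identifier, so any remaining call fails to compile with a self-explanatory name in the error. BANNED and format_name are hypothetical names. Redefining a standard library name is, strictly speaking, outside what the C standard sanctions, so treat this as a pragmatic build-time guard rather than portable C.

```c
#include <stdio.h>

/* Expands to an identifier that is never declared, so any use breaks the build. */
#define BANNED(func) sorry_##func##_is_a_banned_function

/* Some libcs define sprintf as a macro already; drop that definition first. */
#undef sprintf
#define sprintf(...) BANNED(sprintf)

/* Bounded replacement that call sites are expected to use instead. */
static int format_name(char *dst, size_t dst_size, const char *src) {
    return snprintf(dst, dst_size, "%s", src);
}

/*
 * Uncommenting this function would now fail to compile:
 *
 * static void bad(char *dst, const char *src) { sprintf(dst, "%s", src); }
 */
```

The header carrying these macros has to be included after the standard headers, and in every translation unit, for the ban to be effective.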
2. Always Use the Buffer Size, Not the Input Size
A common mistake is writing:
snprintf(buf, strlen(input), "%s", input); // WRONG — uses input length, not buffer length
snprintf(buf, sizeof(buf), "%s", input); // CORRECT — uses actual buffer size
In yep.c, the hardcoded 64 works, but sizeof(node->name) is more maintainable — if the buffer size ever changes, the bound automatically updates.
3. Check snprintf's Return Value for Truncation
If truncated paths cause downstream logic errors (e.g., a node with a silently-shortened name that doesn't match its fullpath), the caller should detect and handle truncation:
int ret = snprintf(node->name, sizeof(node->name), "%s", relative_path);
if (ret < 0 || ret >= (int)sizeof(node->name)) {
// log warning: path truncated or encoding error
// decide: skip this node, return error, or continue with truncated name
}
4. Validate Input Length Before Copying
For a resource parser that controls its own format, rejecting oversized paths early is cleaner than silently truncating them:
if (strlen(relative_path) >= sizeof(node->name)) {
SDL_LogWarn(SDL_LOG_CATEGORY_APPLICATION,
"Path too long for node->name, skipping: %s", relative_path);
return SDL_ENUM_CONTINUE;
}
snprintf(node->name, sizeof(node->name), "%s", relative_path);
5. Use AddressSanitizer During Development
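The regression test suite included with this fix uses a canary value (0xDEADBEEF) placed immediately after the buffer to detect overflow. In CI, running with ASan adds automatic detection of writes that run past the end of a heap allocation:
clang -fsanitize=address -fsanitize=undefined engine/src/yep.c
ASan would have reported this overflow the first time a test processed a path long enough to write past the end of the node allocation. Overflows that stay inside the struct, from name into the fields that follow it, are not flagged by default, which is why the canary test remains useful alongside ASan.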
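The regression test itself is not reproduced in this article. As a rough sketch of what a canary-based check could look like, the fixture below places a 0xDEADBEEF canary directly after a 64-byte name buffer and runs the bounded copy. name_with_canary and copy_keeps_canary are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CANARY 0xDEADBEEFu

/* Hypothetical test fixture: a 64-byte name followed directly by a canary. */
struct name_with_canary {
    char name[64];
    uint32_t canary;
};

/* Runs the bounded copy and reports whether the canary survived intact. */
static bool copy_keeps_canary(const char *relative_path) {
    struct name_with_canary fixture;
    fixture.canary = CANARY;
    snprintf(fixture.name, sizeof(fixture.name), "%s", relative_path);
    /* The name must be terminated inside the buffer and the canary untouched. */
    return fixture.canary == CANARY
        && strlen(fixture.name) < sizeof(fixture.name);
}
```

A test would call copy_keeps_canary with both a short path and a path well beyond 64 characters and require true in both cases. The sketch deliberately exercises only the bounded copy, since running the unbounded version is undefined behavior and can mislead the test.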
References
- CWE-120: Buffer Copy without Checking Size of Input
- CERT C Coding Standard: STR31-C
- OWASP: Buffer Overflow
Key Takeaways
- sprintf() in _recurse_dir_callback() was the root cause, not a missing validation somewhere upstream. The fix had to happen at the copy site itself.
- The null-termination line after the copy was a second latent vulnerability: node->name[strlen(relative_path)] = '\0' would still write out of bounds for long inputs even after a partial fix.
- Resource package parsers are a high-value attack surface: any code that reads attacker-supplied file paths into fixed buffers needs explicit bounds on every string operation.
- snprintf with sizeof(buffer) is the correct pattern, not strlen(input): the bound must reflect the destination, not the source.
- A 64-byte name buffer is tight for filesystem paths: teams should evaluate whether this limit is sufficient for all valid use cases, or whether the struct definition needs revisiting alongside this security fix.
Conclusion
A single missing size argument in a sprintf() call inside _recurse_dir_callback() created a critical heap buffer overflow in engine/src/yep.c. Because relative_path comes from a resource package file that any user can craft, this was a straightforward path to heap corruption in production. The fix, replacing sprintf(node->name, "%s", relative_path) with snprintf(node->name, 64, "%s", relative_path), bounds the copy by enforcing the buffer's actual size at the write itself. Removing or clamping the leftover strlen-indexed null-termination line completes the repair.
The broader lesson for C developers working on file parsers and resource loaders: every string copy from external data into a fixed buffer is a potential CWE-120. Make snprintf with sizeof(destination) your default, ban sprintf from your codebase, and run ASan in CI so that the next oversized path fails loudly in a test rather than silently in production.