How Integer Overflow in Buffer Size Calculations Happens in C and How to Fix It
Introduction
The find_config_path() function in src/api.c is responsible for one of the most routine tasks in systems programming: constructing a filesystem path from environment variables and a known suffix. It reads APPDATA on Windows, HOME on macOS, or XDG_DATA_HOME on Linux, appends a constant path suffix (DBC_PATH_SUFFIX), and returns the result. Simple, right?
Not quite. Hidden inside three separate buffer size calculations was a classic integer overflow waiting to be triggered. The vulnerable pattern looked like this:
size_t n = strlen(base) + strlen(DBC_PATH_SUFFIX) + 1;
char *path = (char *)malloc(n);
If an attacker can control the value of base — and they can, because base is derived from environment variables — they can supply a string long enough to make strlen(base) + strlen(DBC_PATH_SUFFIX) + 1 wrap around SIZE_MAX back to a small number. malloc() happily allocates that small buffer, returns a non-NULL pointer, and the subsequent strcpy or snprintf blows right past the end of it.
This is the kind of bug that passes code review because the logic looks correct. You're adding two lengths and a null terminator — what could go wrong? In C, quite a lot.
The Vulnerability Explained
What Actually Goes Wrong
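On a 64-bit system, size_t can hold values up to SIZE_MAX (typically 18446744073709551615). If strlen(base) returns a value close to SIZE_MAX, adding even a small number to it causes the result to wrap around to near zero. Because size_t is unsigned, C defines this wraparound (the result is reduced modulo SIZE_MAX + 1), so it happens silently, with no trap and no error. This is integer overflow.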
Here's the specific vulnerable code from src/api.c before the fix, appearing in three separate locations within find_config_path():
Location 1 (Windows path, ~line 53):
// VULNERABLE: no overflow check
size_t n = strlen(base) + strlen(DBC_PATH_SUFFIX) + 1;
char *path = (char *)malloc(n);
Location 2 (HOME-based path, ~line 74):
// VULNERABLE: no overflow check
size_t n = strlen(home) + strlen(DBC_PATH_SUFFIX) + 1;
char *path = (char *)malloc(n);
Location 3 (XDG path with prefix + infix, ~line 94):
// VULNERABLE: three-term addition with no overflow check
size_t n = strlen(prefix) + strlen(infix) + strlen(DBC_PATH_SUFFIX) + 1;
char *path = (char *)malloc(n);
Notice that Location 3 is the most dangerous: it adds three strlen() results together. Each addition is an independent opportunity for overflow.
The Attack Scenario
An attacker with the ability to set environment variables (a realistic capability in many deployment contexts — CI/CD pipelines, containerized environments, or any scenario where a less-privileged process can influence the environment of a more-privileged one) sets HOME to a string of length SIZE_MAX - 5. When find_config_path() runs:
- strlen(home) returns SIZE_MAX - 5
- strlen(DBC_PATH_SUFFIX) returns, say, 10
- The addition (SIZE_MAX - 5) + 10 + 1 = SIZE_MAX + 6 wraps to 5
- malloc(5) succeeds and returns a valid 5-byte buffer
- The subsequent path construction writes far more than 5 bytes into that buffer
- The result is heap corruption
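The arithmetic in this scenario can be reproduced without building an enormous string. The sketch below is illustrative only: unchecked_path_size is a hypothetical helper, not code from src/api.c, that mirrors the vulnerable expression while taking the two lengths as plain numbers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the vulnerable expression
 * strlen(home) + strlen(DBC_PATH_SUFFIX) + 1, with both lengths
 * passed in as numbers. size_t is unsigned, so the sum is reduced
 * modulo SIZE_MAX + 1 instead of trapping. */
static size_t unchecked_path_size(size_t home_len, size_t suffix_len)
{
    return home_len + suffix_len + 1;
}
```

With home_len = SIZE_MAX - 5 and suffix_len = 10, the function returns 5, which is the undersized value that would then be handed to malloc().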
On a real system, HOME values this long are unusual but not impossible to construct programmatically. In contexts where this library is consumed by a Node.js application (as noted in the PR's threat model), the attack surface for environment variable manipulation is broader.
Why the NULL Check Doesn't Save You
A common misconception is that checking if (!path) { return nullptr; } after malloc() is a sufficient safety net. It isn't. When the overflow produces a small non-zero value like 5, malloc(5) succeeds. The pointer is valid. The NULL check passes. The overflow happens silently.
The Fix
The fix introduces explicit overflow guards before every buffer size calculation, using SIZE_MAX as the upper bound. It also switches from malloc() to calloc() for the wide-character buffer to ensure zero-initialization.
Before and After: The Three Fixes
Fix 1 — Windows path (lines 53–58):
// BEFORE
size_t n = strlen(base) + strlen(DBC_PATH_SUFFIX) + 1;
// AFTER
size_t base_len = strlen(base);
if (base_len > SIZE_MAX - sizeof(DBC_PATH_SUFFIX)) {
    free(base);
    return nullptr;
}
size_t n = base_len + sizeof(DBC_PATH_SUFFIX);
Fix 2 — HOME-based path (lines 74–79):
// BEFORE
size_t n = strlen(home) + strlen(DBC_PATH_SUFFIX) + 1;
// AFTER
size_t home_len = strlen(home);
if (home_len > SIZE_MAX - sizeof(DBC_PATH_SUFFIX)) {
    return nullptr; // LCOV_EXCL_LINE
}
size_t n = home_len + sizeof(DBC_PATH_SUFFIX);
Fix 3 — XDG three-term path (lines 104–110):
// BEFORE
size_t n = strlen(prefix) + strlen(infix) + strlen(DBC_PATH_SUFFIX) + 1;
// AFTER
size_t prefix_len = strlen(prefix);
size_t infix_len = strlen(infix);
if (prefix_len > SIZE_MAX - infix_len - sizeof(DBC_PATH_SUFFIX)) {
    return nullptr; // LCOV_EXCL_LINE
}
size_t n = prefix_len + infix_len + sizeof(DBC_PATH_SUFFIX);
Fix 4 — Wide-character buffer (line 41):
// BEFORE
wchar_t *wbase = (wchar_t *)malloc((size_t)needed * sizeof(wchar_t));
// AFTER
wchar_t *wbase = (wchar_t *)calloc((size_t)needed, sizeof(wchar_t));
Why These Changes Work
The SIZE_MAX guard pattern is the idiomatic C way to check for size_t overflow before addition. By checking if (base_len > SIZE_MAX - sizeof(DBC_PATH_SUFFIX)), you're asking: "would adding sizeof(DBC_PATH_SUFFIX) to base_len exceed SIZE_MAX?" If yes, return safely. If no, the addition is safe to perform.
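To see the guard in context, here is a minimal, self-contained sketch of a guarded path builder. It is not the code from src/api.c: build_path_guarded and EXAMPLE_SUFFIX are hypothetical names, and the suffix value is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* EXAMPLE_SUFFIX stands in for DBC_PATH_SUFFIX; the value is made up. */
#define EXAMPLE_SUFFIX "/example/config"

/* Hypothetical helper: returns a newly allocated base + EXAMPLE_SUFFIX,
 * or NULL if the size would wrap or the allocation fails.
 * The caller frees the result. */
static char *build_path_guarded(const char *base)
{
    size_t base_len = strlen(base);

    /* sizeof includes the terminator, so no extra + 1 is needed. */
    if (base_len > SIZE_MAX - sizeof(EXAMPLE_SUFFIX)) {
        return NULL;
    }
    size_t n = base_len + sizeof(EXAMPLE_SUFFIX);

    char *path = malloc(n);
    if (path == NULL) {
        return NULL;
    }
    /* Both copies stay within n bytes by construction. */
    memcpy(path, base, base_len);
    memcpy(path + base_len, EXAMPLE_SUFFIX, sizeof(EXAMPLE_SUFFIX));
    return path;
}
```

Because n is known to be exact, the sketch copies with memcpy and explicit lengths rather than relying on strcpy to stop at the right place.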
sizeof() instead of strlen() for the suffix is a subtle but important improvement. DBC_PATH_SUFFIX is a compile-time string literal. sizeof(DBC_PATH_SUFFIX) includes the null terminator and is evaluated at compile time, eliminating the runtime strlen() call and the need for the explicit + 1. This reduces the number of terms in the addition and makes the intent clearer.
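calloc() instead of malloc() for the wide-character buffer ensures the allocated memory is zero-initialized. This prevents potential reads of uninitialized memory if the subsequent GetEnvironmentVariableW call doesn't fully populate the buffer. Because calloc() receives the element count and the element size as separate arguments, the allocator can also reject a count whose product with sizeof(wchar_t) would not fit in size_t, something the hand-written multiplication passed to malloc() could not do.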
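Where the multiplication has to stay explicit, the same idea can be written as a division-based guard. The following is a sketch under assumptions: alloc_wide_buffer is a hypothetical helper, not a function from the patched file.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates a zeroed buffer of `count` wide
 * characters, or returns NULL if count * sizeof(wchar_t) would not
 * fit in size_t. The explicit check documents the intent; calloc()
 * is also expected to reject an overflowing product on its own. */
static wchar_t *alloc_wide_buffer(size_t count)
{
    if (count > SIZE_MAX / sizeof(wchar_t)) {
        return NULL;
    }
    return calloc(count, sizeof(wchar_t));
}
```

Dividing SIZE_MAX by the element size gives the largest count that can be multiplied safely, which is the multiplicative counterpart of the subtraction guard used for additions.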
The #include <stdint.h> added at the top of the file ensures that SIZE_MAX is available. The C standard defines SIZE_MAX in <stdint.h>; <limits.h> is not guaranteed to provide it.
Prevention & Best Practices
1. Always Guard size_t Arithmetic Before malloc()
The canonical pattern for two-term addition:
if (a > SIZE_MAX - b) {
    // overflow would occur
    return NULL;
}
size_t n = a + b;
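For three terms, first make sure the subtraction itself cannot wrap (if b + c exceeds SIZE_MAX, then SIZE_MAX - b - c wraps and the comparison proves nothing):
if (b > SIZE_MAX - c || a > SIZE_MAX - b - c) {
    return NULL;
}
size_t n = a + b + c;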
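Subtracting several terms from SIZE_MAX in a single expression is only safe when that subtraction cannot wrap. One way to sidestep the question is to chain two-term checks, so that every intermediate sum is validated before it is used. The helpers below are a sketch with hypothetical names (add_size_checked, add3_size_checked); GCC and Clang also provide __builtin_add_overflow, and C23 adds ckd_add in <stdckdint.h>, either of which could replace the hand-written comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores a + b in *out and returns true,
 * or returns false (leaving *out untouched) if the sum would wrap. */
static bool add_size_checked(size_t a, size_t b, size_t *out)
{
    if (a > SIZE_MAX - b) {
        return false;
    }
    *out = a + b;
    return true;
}

/* Hypothetical helper: sums three sizes by chaining two checked
 * additions, so no intermediate result is ever allowed to wrap. */
static bool add3_size_checked(size_t a, size_t b, size_t c, size_t *out)
{
    size_t partial;
    return add_size_checked(a, b, &partial)
        && add_size_checked(partial, c, out);
}
```

A call such as add3_size_checked(prefix_len, infix_len, sizeof(DBC_PATH_SUFFIX), &n) would then report failure for any combination of lengths that does not fit, including the case where the second and third terms alone exceed SIZE_MAX.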
2. Prefer sizeof() for Compile-Time String Literals
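When appending a known constant suffix, use sizeof(SUFFIX) instead of strlen(SUFFIX) + 1. It's evaluated at compile time, includes the null terminator automatically, and reduces runtime arithmetic. This only works when SUFFIX expands to a string literal or names a char array; applied to a char * it yields the size of the pointer, not the length of the string.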
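The difference between a literal and a pointer is easy to check directly. In this sketch, EXAMPLE_SUFFIX is a made-up stand-in for DBC_PATH_SUFFIX, and the two functions are hypothetical helpers that exist only to expose the two sizeof results.

```c
#include <stddef.h>

/* EXAMPLE_SUFFIX stands in for DBC_PATH_SUFFIX; the value is made up. */
#define EXAMPLE_SUFFIX "/example/config"

/* A pointer to the same characters: sizeof sees only the pointer. */
static const char *const example_suffix_ptr = EXAMPLE_SUFFIX;

/* Size of the literal's array type: 15 characters plus the terminator. */
static size_t suffix_size_from_literal(void)
{
    return sizeof(EXAMPLE_SUFFIX);
}

/* Size of a pointer object, unrelated to the length of the string. */
static size_t suffix_size_from_pointer(void)
{
    return sizeof(example_suffix_ptr);
}
```

The first function returns 16 for this literal, while the second returns whatever sizeof(const char *) is on the target platform.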
3. Use Safe String Functions
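Bounded or allocating string functions reduce how much size arithmetic you write by hand. snprintf() called with the real buffer size never writes past the end of the buffer, and asprintf() allocates a correctly sized buffer for you, eliminating the manual size calculation entirely. Note that asprintf() originated as a GNU/BSD extension and is not part of ISO C, so it may be missing on some platforms: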
// asprintf handles allocation for you
char *path = NULL;
if (asprintf(&path, "%s%s", base, DBC_PATH_SUFFIX) == -1) {
    return NULL;
}
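Where asprintf() is not available, a similar effect can be had with two calls to the standard snprintf(): one to measure, one to write. This is a sketch rather than the project's approach, and join_path is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a newly allocated "<base><suffix>"
 * string, or NULL on failure. The caller frees the result. */
static char *join_path(const char *base, const char *suffix)
{
    /* First pass: with a NULL buffer and size 0, snprintf only reports
     * how many characters the output needs (terminator excluded). */
    int needed = snprintf(NULL, 0, "%s%s", base, suffix);
    if (needed < 0) {
        return NULL;
    }

    /* needed is non-negative here, so the cast is value-preserving. */
    size_t size = (size_t)needed + 1;
    char *path = malloc(size);
    if (path == NULL) {
        return NULL;
    }

    /* Second pass: write at most `size` bytes, terminator included. */
    int written = snprintf(path, size, "%s%s", base, suffix);
    if (written < 0 || (size_t)written >= size) {
        free(path);
        return NULL;
    }
    return path;
}
```

The library computes the length, so no strlen() sums appear in the caller, and the second call is bounded by the size that was actually allocated.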
4. Enable Compiler Warnings and Sanitizers
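- Compile with -Wall -Wextra -Wformat to catch common issues
- Use AddressSanitizer (-fsanitize=address) during testing to catch heap overflows at runtime
- Use UndefinedBehaviorSanitizer (-fsanitize=undefined) to catch signed integer overflow at runtime. Unsigned wraparound, as in the size_t sums here, is defined behavior and is not reported by that flag; Clang offers -fsanitize=unsigned-integer-overflow as a separate opt-in check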
5. Reference Standards
- CWE-120: Buffer Copy without Checking Size of Input — https://cwe.mitre.org/data/definitions/120.html
- CWE-190: Integer Overflow or Wraparound — https://cwe.mitre.org/data/definitions/190.html
- CWE-131: Incorrect Calculation of Buffer Size — https://cwe.mitre.org/data/definitions/131.html
- OWASP: Review the OWASP C-Based Toolchain Hardening Cheat Sheet
Key Takeaways
- Never add strlen() results directly before malloc() without a SIZE_MAX guard: this pattern appeared three times in find_config_path() and each instance was independently exploitable.
- Environment variables are attacker-controlled input: HOME, APPDATA, and XDG_DATA_HOME in find_config_path() are all user-controllable, making this a realistic attack vector rather than a theoretical one.
- A successful malloc() return does not mean your size calculation was correct: overflow to a small non-zero value will produce a valid but undersized pointer that bypasses NULL checks.
- sizeof() is safer than strlen() for compile-time constants: switching from strlen(DBC_PATH_SUFFIX) + 1 to sizeof(DBC_PATH_SUFFIX) eliminates a runtime calculation and makes overflow arithmetic simpler to reason about.
- calloc() over malloc() for buffers that must be initialized: the switch for the wchar_t buffer in the Windows code path removes a potential uninitialized-read risk at negligible cost.
How Orbis AppSec Detected This
- Source: Attacker-controlled environment variables (APPDATA, HOME, XDG_DATA_HOME) read into the base, home, and prefix variables inside find_config_path() in src/api.c
- Sink: malloc(n) at lines ~56, ~77, and ~107, where n is computed from unchecked strlen() additions
- Missing control: No overflow check on size_t arithmetic before the allocation; the sum of strlen() values could silently wrap around SIZE_MAX
- CWE: CWE-120, Buffer Copy without Checking Size of Input ("Classic Buffer Overflow")
- Fix: Added SIZE_MAX subtraction guards before each size calculation, so that inputs that would overflow make the function return nullptr safely instead of allocating an undersized buffer
Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix.
Conclusion
Integer overflow in buffer size calculations is one of those vulnerabilities that hides in plain sight. The arithmetic in find_config_path() looked reasonable at a glance — add a base path length, add a suffix length, add one for the null terminator. But without explicit overflow guards, that innocuous addition becomes a heap corruption primitive for any attacker who can influence environment variables.
The fix is surgical and clear: test whether base_len > SIZE_MAX - sizeof(DBC_PATH_SUFFIX) before computing n, and return nullptr if it is. This pattern costs almost nothing in performance and eliminates the wraparound in the size calculation. Combined with the switch from malloc() to calloc() for the wide-character buffer, the repaired find_config_path() function now handles adversarial inputs safely.
If you're writing C code that constructs paths or strings from external inputs, make SIZE_MAX guard checks part of your muscle memory. They're cheap, readable, and the difference between a secure path-construction function and a heap overflow.