How Integer Overflow in path_join() Happens in C and How to Fix It
Introduction
The opencstl/filesystem.h file provides path manipulation utilities for a C library, but a critical flaw in the __cstl_join function at line 103 created a dangerous security risk. When computing the total buffer size needed to join two file paths, the function performed l1 + (need_sep ? 1 : 0) + l2 without verifying that this addition wouldn't overflow the size_type64 type. An attacker supplying adversarially crafted path strings could cause this computation to wrap around to a small value, resulting in a tiny malloc allocation followed by memcpy operations that write far beyond the buffer's bounds.
This vulnerability is particularly dangerous because opencstl is a library — any application importing this code inherits the exploitable condition. Since path joining is a fundamental filesystem operation often fed by user input (file uploads, configuration paths, plugin directories), the attack surface is broad.
The Vulnerability Explained
Let's look at the vulnerable code in opencstl/filesystem.h:
bool need_sep = !__cstl_is_sep(path1[l1 - 1]);
size_type64 total = l1 + (need_sep ? 1 : 0) + l2;
char *ret = (char *) malloc(total + 1);
memcpy(ret, path1, l1);
size_type64 pos = l1;
Here's what happens step by step:
1. l1 is the strlen() of path1, and l2 is the strlen() of path2.
2. The code computes total = l1 + separator + l2, but if l1 and l2 are both extremely large values (close to SIZE_MAX / 2), their sum wraps around to a small number.
3. malloc(total + 1) allocates a tiny buffer (perhaps just a few bytes).
4. memcpy(ret, path1, l1) then copies potentially gigabytes of data into that tiny buffer.
Concrete attack scenario: An attacker who can influence the file paths passed to path_join (for example, through a configuration file, a plugin system, or a network protocol that specifies paths) crafts two strings where strlen(path1) + strlen(path2) + 2 overflows size_type64. If size_type64 is a true 64-bit type, this requires path lengths totaling nearly 2^64 bytes, far more than any real process can hold in memory, so direct exploitation is impractical there. The risk becomes realistic on a 32-bit system, or on any build where size_type64 is actually a 32-bit type such as uint32_t: paths of about 2 GB each are then enough to wrap the total, resulting in a heap buffer overflow that can enable arbitrary code execution.
The subsequent memcpy operations at lines 105-108 copy l1 + l2 bytes total into a buffer that was allocated for the wrapped-around total + 1 bytes — a classic heap overflow that can corrupt heap metadata, overwrite adjacent allocations, and ultimately give an attacker control of program execution.
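The wraparound is easier to reproduce with a narrower type. The sketch below assumes a 32-bit unsigned length type and mirrors the unchecked arithmetic from the vulnerable code; the helper name wrapped_total32 is hypothetical and is not part of opencstl.

```c
#include <stdint.h>

/* Hypothetical helper, not part of opencstl: mirrors the unchecked
 * computation `l1 + (need_sep ? 1 : 0) + l2`, but with a 32-bit unsigned
 * type so the wraparound can be reproduced with small test values. */
static uint32_t wrapped_total32(uint32_t l1, uint32_t l2, int need_sep)
{
    /* Unsigned arithmetic wraps modulo 2^32; no error is signaled. */
    return (uint32_t)(l1 + (need_sep ? 1u : 0u) + l2);
}
```

With l1 = 0x80000000, l2 = 0x7FFFFFFF, and a separator, the helper returns 0. An allocation of total + 1 would then request a single byte while the first memcpy still copies l1 bytes.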
The Fix
The fix introduces an explicit overflow check before computing the total size:
Before (vulnerable):
bool need_sep = !__cstl_is_sep(path1[l1 - 1]);
size_type64 total = l1 + (need_sep ? 1 : 0) + l2;
char *ret = (char *) malloc(total + 1);
After (fixed):
bool need_sep = !__cstl_is_sep(path1[l1 - 1]);
size_type64 sep = need_sep ? 1 : 0;
if (l1 > (size_type64)-1 - sep - l2 - 1) { return NULL; }
size_type64 total = l1 + sep + l2;
char *ret = (char *) malloc(total + 1);
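Let's break down the overflow check. Assuming size_type64 is an unsigned type, (size_type64)-1 produces its maximum value (all bits set to 1). By checking whether l1 > MAX - sep - l2 - 1, the code verifies that l1 + sep + l2 + 1 (the actual allocation size including the null terminator) won't exceed the representable range. The right-hand side is itself unsigned arithmetic, so the comparison is only meaningful while l2 + sep + 1 does not exceed MAX; that holds for any length strlen() can return for a real in-memory string. If the sum would overflow, the function returns NULL instead of proceeding with a corrupted allocation.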
The separation of sep into its own variable also improves readability and makes the overflow check cleaner — a secondary benefit of this refactoring.
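One way to see why the check works is to write it as a sequence of single-step comparisons. The helper below is a hypothetical sketch, not opencstl code: the name join_alloc_size is invented for illustration, and uint64_t is assumed as a stand-in for size_type64. It checks each addition separately, so no intermediate subtraction can wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch, not the opencstl implementation. uint64_t stands in
 * for size_type64. Computes l1 + sep + l2 + 1 (the +1 is the null
 * terminator) and reports failure instead of wrapping. */
static bool join_alloc_size(uint64_t l1, uint64_t l2, bool need_sep,
                            uint64_t *out)
{
    /* Separator (0 or 1) plus null terminator: always 1 or 2. */
    uint64_t extra = (need_sep ? 1u : 0u) + 1u;

    if (l2 > UINT64_MAX - extra) {
        return false;               /* l2 + extra would wrap */
    }
    uint64_t tail = l2 + extra;

    if (l1 > UINT64_MAX - tail) {
        return false;               /* l1 + tail would wrap */
    }
    *out = l1 + tail;               /* safe: both steps were checked */
    return true;
}
```

A caller would pass the resulting size straight to malloc and treat a false return the same way the fixed code treats its NULL return.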
A regression test was also added in tests/test_invariant_filesystem.h that exercises the function with adversarial path lengths, boundary cases (empty strings), and normal operation to check that the fix holds across these cases.
Prevention & Best Practices
1. Always check arithmetic before allocation
Any time you compute a buffer size from multiple inputs, verify the computation won't overflow:
// Safe pattern for computing allocation sizes
if (a > SIZE_MAX - b) {
    return NULL; // or handle error
}
size_t total = a + b;
2. Use compiler overflow built-ins when available
GCC and Clang provide __builtin_add_overflow():
size_t total;
if (__builtin_add_overflow(l1, l2, &total)) {
    return NULL;
}
3. Consider safe integer arithmetic libraries
For C projects with extensive arithmetic, checked-arithmetic facilities reduce the risk of missing an overflow check: the C23 <stdckdint.h> header provides ckd_add, ckd_sub, and ckd_mul, and C++ code can use a library such as SafeInt.
4. Fuzz test path manipulation functions
Use AFL, libFuzzer, or similar tools to generate adversarial inputs for functions that combine user-controlled lengths. Path manipulation functions are prime targets.
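A minimal libFuzzer harness for a join routine might look like the sketch below. Everything except the LLVMFuzzerTestOneInput entry point is an assumption made for illustration: join_under_test is a hypothetical stand-in, and a real harness would call the library's own join function instead. The harness splits each input in half, joins the two halves, and aborts if the result is longer than both inputs plus one separator.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the function under test. */
static char *join_under_test(const char *p1, const char *p2)
{
    size_t l1 = strlen(p1), l2 = strlen(p2);
    size_t sep = (l1 > 0 && p1[l1 - 1] != '/') ? 1 : 0;

    /* Reject lengths whose sum (plus separator and terminator) would wrap. */
    if (l2 > SIZE_MAX - sep - 1 || l1 > SIZE_MAX - sep - 1 - l2) {
        return NULL;
    }
    char *ret = malloc(l1 + sep + l2 + 1);
    if (ret == NULL) {
        return NULL;
    }
    memcpy(ret, p1, l1);
    if (sep) {
        ret[l1] = '/';
    }
    memcpy(ret + l1 + sep, p2, l2);
    ret[l1 + sep + l2] = '\0';
    return ret;
}

/* libFuzzer entry point: split the input into two C strings and join them. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    if (size == 0) {
        return 0;                   /* nothing to split */
    }
    size_t half = size / 2;
    char *p1 = malloc(half + 1);
    char *p2 = malloc(size - half + 1);
    if (p1 != NULL && p2 != NULL) {
        memcpy(p1, data, half);
        p1[half] = '\0';
        memcpy(p2, data + half, size - half);
        p2[size - half] = '\0';

        char *joined = join_under_test(p1, p2);
        if (joined != NULL) {
            /* Invariant: output length never exceeds inputs + separator. */
            if (strlen(joined) > strlen(p1) + strlen(p2) + 1) {
                abort();
            }
            free(joined);
        }
    }
    free(p1);
    free(p2);
    return 0;
}
```

Fuzzer-generated inputs are normally far shorter than the lengths needed to wrap a size type, so a harness like this mainly exercises boundary cases such as empty strings and trailing separators. It complements the explicit overflow check and does not replace it.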
5. Static analysis integration
Run static analysis tools (Coverity, CodeQL, or AI-powered scanners) in CI/CD pipelines to catch integer overflow patterns automatically.
Key Takeaways
- The __cstl_join function in filesystem.h trusted that l1 + sep + l2 would not overflow, a dangerous assumption for any library function that accepts external input
- The fix uses (size_type64)-1 - sep - l2 - 1 as a ceiling check; this idiom validates that the full allocation (including null terminator) fits within the type's range
- Returning NULL on overflow is the correct fail-safe: callers must handle NULL returns, but this is far safer than silent heap corruption
- Library code has amplified risk: since opencstl is imported by other applications, this single overflow check protects every downstream consumer
- The + 1 for the null terminator must be included in the overflow check: the original code computed total then allocated total + 1, meaning even total == SIZE_MAX (without wrapping in the addition itself) would overflow at malloc(total + 1)
How Orbis AppSec Detected This
- Source: User-controlled path strings passed as path1 and path2 parameters to __cstl_join() in opencstl/filesystem.h
- Sink: malloc(total + 1) at line 103 followed by memcpy(ret, path1, l1) at line 105, where total is computed from unchecked string lengths
- Missing control: No validation that l1 + sep + l2 + 1 fits within size_type64 before allocation
- CWE: CWE-190 (Integer Overflow or Wraparound)
- Fix: Added an explicit overflow check (if (l1 > (size_type64)-1 - sep - l2 - 1) { return NULL; }) before computing the allocation size
Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.
Conclusion
Integer overflow vulnerabilities in C remain one of the most dangerous classes of memory safety bugs, especially in library code that processes variable-length inputs. The __cstl_join function's path concatenation logic seemed straightforward — compute a length, allocate, copy — but the missing overflow check meant that adversarial inputs could corrupt the heap. The fix is minimal (three lines) but transforms an exploitable condition into a safe failure mode. When writing C code that computes allocation sizes from external inputs, always validate arithmetic before calling malloc.