What is an integer overflow leading to heap buffer overflow?

An integer overflow occurs when an arithmetic operation produces a value that exceeds the maximum representable by the data type, wrapping around to a small or zero value. When that wrapped value is then used as the size argument to a heap allocation function like `malloc()`, the resulting buffer is far too small to hold the data that will be written into it, causing a heap buffer overflow.

How do you prevent integer overflow before malloc() in C?

Always validate that the size argument — and any arithmetic performed on it — cannot wrap around before calling `malloc()` or a wrapper like `MALLOC()`. For a common pattern like `malloc(len + 1)`, check `if (len == SIZE_MAX) { /* handle error */ }` before the allocation. For more complex expressions, use compiler-provided checked-arithmetic builtins (e.g., `__builtin_add_overflow`) or a dedicated safe-integer library.

What CWE is integer overflow leading to buffer overflow?

CWE-190: Integer Overflow or Wraparound. It is closely related to CWE-122 (Heap-Based Buffer Overflow), which describes the downstream memory corruption that results when the overflowed value is used as an allocation size.

Is using a safe allocator wrapper like MALLOC() enough to prevent integer overflow?

No. A wrapper such as `MALLOC()` typically only adds an out-of-memory check after the allocation; it does not validate the arithmetic used to compute the size argument. The overflow happens *before* the allocator is called, so the guard must be placed before the `MALLOC()` call itself.

Can static analysis detect integer overflow before malloc()?

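Yes. Static analyzers such as Semgrep, Coverity, and CodeQL can flag patterns where a value is incremented and immediately used as a `malloc` size without an overflow check. Runtime tools complement them rather than replace them: AddressSanitizer reports the resulting out-of-bounds write, and Clang's `-fsanitize=unsigned-integer-overflow` check reports the wraparound itself. Orbis AppSec detected this exact pattern automatically and opened a remediation pull request.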

Heap Overflow in TOML Parser: How Integer Overflow Leads to Memory Corruption

Introduction

Configuration file parsers are everywhere — they quietly read your settings, load your preferences, and initialize your application. Because they sit at the boundary between untrusted data and trusted program logic, they are a historically rich target for attackers. When that parser is written in C and handles attacker-controlled input without careful bounds checking, the consequences can be severe.

This post breaks down a critical heap buffer overflow (CWE-122) caused by an integer overflow (CWE-190), discovered and patched in deps/centitoml/toml_api.c, a C-based TOML parser embedded in a game engine. The root cause is a classic integer overflow that produces an undersized heap allocation, followed by a memcpy that writes beyond the buffer's end. We'll walk through exactly how it works, how an attacker could exploit it, and what the fix does to close the door.

The Vulnerability Explained

What Is an Integer Overflow Leading to Heap Overflow?

Integer overflow bugs in C are deceptively simple: arithmetic on an integer type wraps around when it exceeds the type's maximum value. In the context of memory allocation, this means a calculation meant to produce a large buffer size can silently produce a tiny one — or even zero. The allocator happily returns a valid pointer to that undersized buffer, and any subsequent write that assumes the buffer is full-sized corrupts adjacent heap memory.

The Vulnerable Code

Here is the relevant function before the fix:

// deps/centitoml/toml_api.c (BEFORE)
static char *STRNDUP(const char *s, size_t n)
{
    size_t len = strnlen(s, n);
    char *p = MALLOC(len + 1);   // ← integer overflow possible here
    if (p)
    {
        memcpy(p, s, len);       // ← writes `len` bytes into undersized buffer
        p[len] = '\0';
    }
    return p;
}

Let's trace the problem step by step:

len comes from attacker-controlled TOML data. The n parameter passed to strnlen is derived from values parsed out of a TOML file, which could be a community-distributed mod or map.
len + 1 can overflow. size_t is an unsigned type, so arithmetic on it wraps silently instead of trapping. On a 64-bit system, SIZE_MAX is 0xFFFFFFFFFFFFFFFF. If len equals SIZE_MAX (i.e., (size_t)-1), then len + 1 wraps to 0. Because strnlen never returns more than n, this requires n itself to be SIZE_MAX.
MALLOC(0) may return a valid but tiny pointer. The C standard leaves malloc(0) implementation-defined: it returns either NULL or a unique non-NULL pointer that must not be dereferenced. Where the allocator returns a non-NULL pointer, the if (p) check passes even though the buffer holds effectively zero usable bytes.
memcpy(p, s, len) writes len bytes anyway. With len equal to SIZE_MAX, this attempts to copy an astronomically large number of bytes starting from the tiny allocation, overwriting whatever follows it on the heap.
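The wraparound in the second step is easy to confirm in isolation. The sketch below uses a hypothetical helper, `alloc_size_for`, which is not part of toml_api.c; it only mirrors the unchecked `len + 1` expression from `STRNDUP`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from toml_api.c): computes the allocation size
 * the same way the vulnerable STRNDUP does, with no overflow check. */
size_t alloc_size_for(size_t len)
{
    /* size_t is unsigned, so this addition wraps modulo SIZE_MAX + 1
     * instead of trapping or raising an error. */
    return len + 1;
}
```

For any ordinary length the result is one byte larger than the input, but `alloc_size_for(SIZE_MAX)` yields 0, which is the size that reaches the allocator.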

Real-World Attack Scenario

The game loads TOML configuration files from its config/ directory. Community channels distribute mod and map files, which may include or reference TOML configs. An attacker crafts a malicious TOML file containing a string value whose encoded length makes the parser call STRNDUP with n equal to (size_t)-1, so that strnlen can return (size_t)-1. When the game loads this file:

The parser calls STRNDUP with the oversized string.
MALLOC(0) returns a small allocation.
memcpy overflows the heap, corrupting allocator metadata or adjacent objects.
The attacker can potentially achieve arbitrary code execution by carefully controlling heap layout — a well-known technique in heap exploitation.

Because this is triggered simply by opening a file, the attack requires no authentication, no network access, and no special privileges. A player downloading a popular-looking mod could silently execute attacker code.

CWE Reference: CWE-190: Integer Overflow or Wraparound

The Fix

What Changed

The patch adds a single guard clause that checks for the sentinel overflow value before any allocation occurs:

// deps/centitoml/toml_api.c (AFTER)
static char *STRNDUP(const char *s, size_t n)
{
    size_t len = strnlen(s, n);
+   if (len == (size_t)-1)       // ← guard: overflow sentinel check
+       return NULL;
    char *p = MALLOC(len + 1);
    if (p)
    {
        memcpy(p, s, len);
        p[len] = '\0';
    }
    return p;
}

How Does It Work?

strnlen(s, n) returns at most n, so len can equal (size_t)-1 (the maximum value of size_t) only when n is also (size_t)-1 and no terminating NUL byte is found earlier. No legitimate TOML string is that long, so this value can only indicate a corrupted or malicious length.

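The guard if (len == (size_t)-1) return NULL; intercepts this before len + 1 can wrap to zero. The function returns immediately, so there is no allocation, no copy, and no overflow. The caller receives NULL, the same value STRNDUP already returns when MALLOC fails, so callers that handle allocation failure handle this case too.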
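Seen from the caller's side, the guard looks roughly like the sketch below. The names are assumptions for illustration: `dup_bounded` is a hypothetical stand-in for the patched `STRNDUP` (plain `malloc` replaces the `MALLOC` wrapper, whose definition is not shown here), and `copy_key` is an invented caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the patched STRNDUP. */
char *dup_bounded(const char *s, size_t n)
{
    /* Bounded length via memchr, equivalent to strnlen(s, n),
     * which is a POSIX function rather than ISO C. */
    const char *end = memchr(s, '\0', n);
    size_t len = end ? (size_t)(end - s) : n;

    if (len == SIZE_MAX)        /* len + 1 would wrap to 0 */
        return NULL;

    char *p = malloc(len + 1);
    if (p) {
        memcpy(p, s, len);
        p[len] = '\0';
    }
    return p;
}

/* Invented caller: returns 0 on success, -1 if duplication failed. */
int copy_key(const char *src, size_t max_len, char **out)
{
    char *copy = dup_bounded(src, max_len);
    if (copy == NULL)
        return -1;              /* rejected length and out-of-memory share this path */
    *out = copy;
    return 0;
}
```

The design point is that the caller needs no new error path: a rejected length is reported the same way as an allocation failure.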

Why This Is the Right Fix

Concern	Before	After
Integer overflow on `len+1`	Possible	Prevented by early return
Undersized `MALLOC` call	Possible	Never reached
Heap corruption via `memcpy`	Possible	Never reached
Graceful failure on bad input	No	Yes — returns `NULL`

The fix is minimal, targeted, and does not change behavior for any legitimate input. It follows the fail-fast principle: detect the anomaly at the earliest possible point and return a safe error value rather than proceeding with corrupted state.

Prevention & Best Practices

1. Always Validate Sizes Before Arithmetic

Before any malloc(n + k) call, check that n + k does not overflow:

// Safe size addition pattern
if (n > SIZE_MAX - k) {
    // overflow would occur — handle error
    return NULL;
}
char *p = malloc(n + k);

Many projects use helper macros or inline functions for this:

static inline int size_add_overflow(size_t a, size_t b, size_t *result) {
    if (a > SIZE_MAX - b) return 1; // overflow
    *result = a + b;
    return 0;
}
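On GCC and Clang, the same check can be delegated to the `__builtin_add_overflow` builtin, which reports whether the exact sum fits in the result type. The following is a minimal sketch under that assumption; `alloc_with_extra` is a hypothetical name, and the builtin is not available on every compiler.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n + extra bytes, or returns NULL if the
 * sum does not fit in size_t. Relies on a GCC/Clang builtin, so it is not
 * portable to compilers that lack it. */
void *alloc_with_extra(size_t n, size_t extra)
{
    size_t total;

    /* The builtin returns true when the mathematically exact sum
     * cannot be represented in `total`. */
    if (__builtin_add_overflow(n, extra, &total))
        return NULL;

    return malloc(total);
}
```

Called as `alloc_with_extra(len, 1)`, this returns NULL when `len` is `SIZE_MAX` instead of requesting a zero-byte allocation.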

2. Treat All Parser Inputs as Untrusted

Even "local" files like config files can be attacker-controlled in many threat models (malicious mods, path traversal, symlink attacks). Apply the same input validation you would to network data.

3. Use Safe String Functions

Where possible, prefer higher-level abstractions that track buffer sizes:

In C: use strndup(), specified by POSIX.1-2008 and adopted into C23, which computes the bounded length and appends the null terminator internally.
In C++: use std::string or std::string_view.
In Rust: string types are bounds-checked by default.

4. Enable Compiler and Runtime Protections

Modern toolchains offer several mitigations:

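# AddressSanitizer: catches heap overflows at runtime
clang -fsanitize=address -o myapp myapp.c

# UndefinedBehaviorSanitizer: catches signed overflow and other undefined behavior.
# Unsigned wraparound (e.g. on size_t) is well-defined, so it needs Clang's separate check.
clang -fsanitize=undefined,unsigned-integer-overflow -o myapp myapp.c

# Stack and libc call hardening (GCC/Clang); _FORTIFY_SOURCE only takes effect with optimization
-O2 -D_FORTIFY_SOURCE=2 -fstack-protector-strong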

These don't replace correct code, but they catch bugs during development and testing.

5. Use Static Analysis Tools

Tools that can detect this class of vulnerability:

Coverity — commercial, excellent at integer overflow and buffer overrun detection
CodeQL — GitHub's semantic analysis engine, has queries for CWE-190
clang-tidy — catches some arithmetic overflow patterns
Semgrep — customizable rules for unsafe malloc/memcpy patterns

6. Fuzz Your Parsers

Configuration file parsers are ideal fuzzing targets. Tools like libFuzzer or AFL++ can generate millions of malformed TOML files and report crashes:

# Example: fuzz the TOML parser entry point
clang -fsanitize=fuzzer,address -o toml_fuzz toml_fuzz_target.c toml_api.c
./toml_fuzz corpus/

A fuzzer cannot literally produce a SIZE_MAX-byte string, but one that exercises the parser's length-handling paths under the sanitizers above is well placed to surface this class of bug.
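A fuzz target for the command above might look like the following sketch. The parser's real entry point is not shown in this post, so `parse_toml_text` is a hypothetical stub that stands in for it; only `LLVMFuzzerTestOneInput` is the actual libFuzzer interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the parser entry point. Replace the body with
 * a call into the real TOML parsing API. */
int parse_toml_text(const char *text)
{
    return text[0] == '\0' ? -1 : 0;
}

/* libFuzzer calls this once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    if (size == SIZE_MAX)       /* size + 1 below must not wrap */
        return 0;

    /* Copy into an exactly sized, NUL-terminated heap buffer so that
     * AddressSanitizer can flag any access past the end of the input. */
    char *text = malloc(size + 1);
    if (text == NULL)
        return 0;
    if (size > 0)
        memcpy(text, data, size);
    text[size] = '\0';

    (void)parse_toml_text(text);

    free(text);
    return 0;
}
```

Copying the input rather than parsing `data` in place is a deliberate choice: the fuzzer's buffer is not NUL-terminated, and a tight heap copy gives the sanitizer precise bounds to check against.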

Conclusion

This vulnerability is a textbook example of how a single missing bounds check in a low-level parser can open the door to critical heap corruption. The attack path is realistic — community mod files are a well-established vector for game engine exploits — and the impact is severe: heap corruption in C can escalate to arbitrary code execution in the hands of a skilled attacker.

The fix is elegantly simple: one if statement, two lines of code, zero behavior change for legitimate inputs. But reaching that fix requires understanding why len + 1 can overflow, what MALLOC(0) returns, and how memcpy blindly trusts its size argument.

Key takeaways for developers:

🔢 Treat size arithmetic as a security boundary — always check for overflow before allocation.
📂 Parser inputs are attacker-controlled — even local config files in user-modifiable directories.
🛡️ Fail fast and fail safely — return NULL early rather than proceeding with invalid state.
🔍 Fuzz your parsers — automated input generation finds these bugs faster than code review alone.
🧰 Use sanitizers in CI — AddressSanitizer and UBSan catch these issues before they reach production.

Memory safety bugs don't disappear on their own. Systematic validation, modern tooling, and a security-first mindset are the best defenses we have — and as this fix shows, applying them is often much simpler than the damage they prevent.

cwe	CWE-190
fix	Guard clause added before the `MALLOC` call to reject any `len` value that would overflow when incremented
risk	Arbitrary code execution via crafted TOML configuration / mod files
language	C
root cause	`MALLOC(len+1)` called without validating that `len < SIZE_MAX`, allowing integer wraparound to produce an undersized allocation
vulnerability	Integer Overflow leading to Heap Buffer Overflow
