What is integer overflow in malloc?

Integer overflow in malloc occurs when the arithmetic used to compute the allocation size exceeds the maximum value for the integer type, wrapping around to a small number and causing an undersized buffer allocation.

How do you prevent integer overflow in malloc in C?

Use calloc() instead of malloc() with multiplication, as calloc() performs overflow-checked multiplication internally. Alternatively, explicitly check for overflow before the multiplication or use safe integer arithmetic functions.

What CWE is integer overflow in malloc?

This vulnerability maps to CWE-190 (Integer Overflow or Wraparound) and CWE-122 (Heap-based Buffer Overflow), as the overflow leads to undersized allocation and subsequent heap corruption.

Is using size_t enough to prevent integer overflow in malloc?

No, size_t only ensures the type can hold valid allocation sizes, but multiplication of two size_t values can still overflow. You must either use calloc() or explicitly check for overflow before multiplying.

Can static analysis detect integer overflow in malloc?

Yes, static analysis tools like Semgrep can detect patterns where multiplication is used to compute allocation sizes without overflow checks, flagging them as potential vulnerabilities.

Integer Overflow in malloc() in C: A Safe Fix

Introduction

In QuickJS's regular expression library, we discovered a high-severity integer overflow vulnerability at line 3429 of libregexp.c. The test harness in main() allocated memory for regex capture groups using an unchecked multiplication that a specially crafted regular expression could exploit.

The vulnerable code computed the allocation size by multiplying sizeof(capture[0]) by the return value of lre_get_alloc_count(bc):

capture = malloc(sizeof(capture[0]) * lre_get_alloc_count(bc));

When lre_get_alloc_count(bc) returns a sufficiently large value—controlled by the complexity of the compiled regex bytecode—this multiplication can overflow, wrapping around to a small number. The result? A tiny buffer is allocated, but subsequent regex execution writes capture group data as if the full size were available, corrupting the heap.

The Vulnerability Explained

How Integer Overflow Leads to Heap Corruption

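In C, arithmetic operations on integers have no built-in overflow protection. For unsigned types such as size_t, a result that exceeds the maximum representable value "wraps around" modulo 2^N, where N is the width of the type, and can end up as a small number. Signed overflow is worse still: it is undefined behavior. Because sizeof yields a size_t, the multiplication in an allocation size is normally carried out in unsigned arithmetic and wraps silently.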

Consider the vulnerable line:

capture = malloc(sizeof(capture[0]) * lre_get_alloc_count(bc));

For illustration, suppose sizeof(capture[0]) is 16 bytes (for example, a structure holding two 8-byte pointers). If an attacker could craft a regex such that lre_get_alloc_count(bc) yields a value like 0x1000000000000001 on a 64-bit system, the multiplication becomes:

16 * 0x1000000000000001 = 0x10000000000000010

The true product needs 65 bits, so the stored result is truncated to 0x10 (16 bytes). The malloc() call allocates only 16 bytes, but lre_exec() then writes capture data as if the full array existed, far past the end of this tiny buffer.
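The wraparound can be reproduced in a few lines. The following is a minimal sketch, not code from QuickJS: unchecked_alloc_size and show_wraparound are hypothetical names, the numbers are the illustrative ones used above, and uint64_t stands in for size_t on a 64-bit target.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: multiplies the way an unchecked size computation
 * would on a 64-bit target. Unsigned arithmetic wraps modulo 2^64. */
uint64_t unchecked_alloc_size(uint64_t elem_size, uint64_t count)
{
    return elem_size * count;
}

/* Prints the size that would actually be passed to malloc() for the
 * illustrative element size and count from the text. */
void show_wraparound(void)
{
    uint64_t elem_size = 16;
    uint64_t count = UINT64_C(0x1000000000000001);
    uint64_t size = unchecked_alloc_size(elem_size, count);

    printf("requested bytes: 0x%llx\n", (unsigned long long)size);
}
```

Calling show_wraparound() prints `requested bytes: 0x10`: the top bit of the 65-bit product is discarded, and only 16 bytes would be requested.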

Attack Scenario Specific to libregexp.c

An attacker targeting this vulnerability would:

1. Craft a malicious regex pattern with an enormous number of capture groups. The test case in the PR demonstrates this with thousands of nested parentheses (((((...)))))
2. Compile the regex through QuickJS's regex compilation, which generates bytecode where lre_get_alloc_count() returns a massive value
3. Trigger the allocation in the test harness's main() function, causing the overflow
4. Execute the regex against any input, causing lre_exec() to write beyond the allocated buffer

The heap overflow enables:
- Arbitrary code execution by overwriting function pointers or vtables in adjacent heap objects
- Information disclosure by corrupting heap metadata to leak memory contents
- Denial of service by crashing the application through heap corruption

The Fix

The fix replaces the unsafe malloc()-with-multiplication pattern with a call to calloc():

Before (Vulnerable)

capture = malloc(sizeof(capture[0]) * lre_get_alloc_count(bc));

After (Fixed)

capture = calloc(lre_get_alloc_count(bc), sizeof(capture[0]));

Why calloc() Solves This Problem

The calloc() function takes two arguments, the number of elements and the size of each element, and performs the multiplication internally with overflow checking. Per the C standard and common implementations:

- calloc() checks for overflow before allocating. If nmemb * size would overflow, calloc() returns NULL instead of allocating an undersized buffer.
- Zero-initialization provides defense in depth. Even if some code path reads an element before writing it, the buffer is zeroed, preventing information leakage from uninitialized memory.
- Semantic clarity makes the intent obvious: we're allocating an array of lre_get_alloc_count(bc) elements, each of size sizeof(capture[0]).

The fix at line 3429 ensures that when an attacker provides a regex designed to trigger overflow, the allocation fails with a NULL return rather than succeeding with a wrapped size. The failure is only safe if the caller checks for NULL before using the pointer.
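A failed calloc() is only harmless when the caller handles the NULL. The sketch below shows one way a call site could do that. It is not the QuickJS code: alloc_capture_slots and run_with_captures are hypothetical names, and the element type uint8_t * is an assumption made for illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper around the fixed allocation. The element type
 * (uint8_t *) is an assumption made for this sketch. */
uint8_t **alloc_capture_slots(size_t slot_count)
{
    /* calloc() receives the count and element size separately, so it can
     * return NULL when their product does not fit in size_t. */
    return calloc(slot_count, sizeof(uint8_t *));
}

/* Returns 0 on success, -1 if the capture array could not be allocated. */
int run_with_captures(size_t slot_count)
{
    uint8_t **capture = alloc_capture_slots(slot_count);
    if (capture == NULL) {
        fprintf(stderr, "capture allocation failed\n");
        return -1;
    }

    /* ... the regex would be executed here, writing into capture[] ... */

    free(capture);
    return 0;
}
```

The point of the design is that the error path is taken before any write happens, so an oversized request ends in a clean failure instead of a heap overflow.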

Prevention & Best Practices

Safe Allocation Patterns in C

Always prefer calloc() for arrays:

// UNSAFE
ptr = malloc(count * element_size);

// SAFE
ptr = calloc(count, element_size);

When malloc() is required, check for overflow explicitly:

// Safe multiplication check
if (count > 0 && element_size > SIZE_MAX / count) {
    // Overflow would occur
    return NULL;
}
ptr = malloc(count * element_size);
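Wrapped in a function, the check above might look like the following sketch. malloc_array is a hypothetical helper name, not part of the C library or QuickJS.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: malloc() for an array of count elements, with an
 * explicit overflow check on the size computation. */
void *malloc_array(size_t count, size_t element_size)
{
    /* For count > 0, count * element_size overflows exactly when
     * element_size > SIZE_MAX / count. The division itself cannot overflow. */
    if (count > 0 && element_size > SIZE_MAX / count) {
        return NULL;
    }
    return malloc(count * element_size);
}
```

A request such as malloc_array(SIZE_MAX, 2) is rejected before malloc() is ever called, while malloc_array(4, 8) simply requests 32 bytes. As with malloc() itself, the caller still has to check the result for NULL.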

Use compiler built-ins when available:

size_t total;
if (__builtin_mul_overflow(count, element_size, &total)) {
    return NULL;
}
ptr = malloc(total);
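Because __builtin_mul_overflow is a compiler extension, portable code may want a fallback. The sketch below assumes GCC 5 or later, or a Clang version that provides the built-in, and otherwise falls back to a division check. size_mul_checked is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores a * b in *out and returns true, or returns
 * false and leaves *out untouched if the product does not fit in size_t. */
bool size_mul_checked(size_t a, size_t b, size_t *out)
{
#if defined(__clang__) || (defined(__GNUC__) && __GNUC__ >= 5)
    /* Assumption: the compiler provides __builtin_mul_overflow. */
    size_t product;
    if (__builtin_mul_overflow(a, b, &product)) {
        return false;
    }
    *out = product;
    return true;
#else
    /* Portable fallback: detect overflow with a division. */
    if (a != 0 && b > SIZE_MAX / a) {
        return false;
    }
    *out = a * b;
    return true;
#endif
}
```

A caller would compute the size with size_mul_checked(count, element_size, &total) and pass total to malloc() only when the function returns true.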

Static Analysis Integration

Configure your CI/CD pipeline to flag dangerous allocation patterns:
- Semgrep rules can detect malloc() calls with multiplication in the argument
- GCC's -Walloc-size-larger-than=<bytes> warning catches some cases where the compiler can see that the requested size is too large
- AddressSanitizer (ASan) detects the resulting heap overflow at runtime

Code Review Checklist

When reviewing C code that handles dynamic allocation:
- [ ] Is the allocation size computed safely?
- [ ] Could any input influence the size calculation?
- [ ] Is calloc() used for array allocations?
- [ ] Is the return value checked for NULL?

Key Takeaways

- Never use malloc(a * b) for array allocation: the multiplication can overflow silently, whereas calloc(a, b) handles this safely
- The lre_get_alloc_count() return value is influenced by regex complexity, making this a user-controllable attack vector in any application that compiles untrusted regexes
- QuickJS's test harness in main() was vulnerable, demonstrating that even test code can have security implications if it processes untrusted input
- calloc() provides two protections: overflow-checked multiplication AND zero-initialization
- Regex engines are high-value targets because they process complex, attacker-controlled input, so their memory operations deserve extra scrutiny

How Orbis AppSec Detected This

Source: The bc (bytecode) parameter passed to lre_get_alloc_count(), derived from compiling a user-provided regex pattern via argv[1]
Sink: malloc(sizeof(capture[0]) * lre_get_alloc_count(bc)) at quickjs/libregexp.c:3429
Missing control: No overflow check on the multiplication before allocation
CWE: CWE-190 (Integer Overflow or Wraparound) leading to CWE-122 (Heap-based Buffer Overflow)
Fix: Replaced malloc() with calloc() which performs overflow-checked multiplication internally

Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.

Conclusion

Integer overflow vulnerabilities in memory allocation are among the most dangerous bugs in C programs. They're subtle—the code looks correct at first glance—but can lead to complete system compromise through heap corruption.

The fix in QuickJS's libregexp.c demonstrates the simplest and most effective mitigation: use calloc() instead of malloc() with multiplication. This single-line change transforms a high-severity vulnerability into a safe allocation that fails gracefully when given malicious input.

When working with C code that handles untrusted input—especially complex parsers like regex engines—always assume that any size calculation could be manipulated. Design your allocation strategy to fail safely rather than corrupt memory silently.

Vulnerability at a Glance

CWE: CWE-190 (Integer Overflow or Wraparound) / CWE-122 (Heap-based Buffer Overflow)
Fix: Replace malloc() with calloc(), which performs safe multiplication internally
Risk: Heap corruption enabling arbitrary code execution or denial of service
Language: C
Root cause: Unchecked multiplication in allocation size computation can wrap to a small value
Vulnerability: Integer Overflow Leading to Heap Overflow
