High severity | 5 min read

How integer overflow in malloc happens in C libregexp and how to fix it

A high-severity integer overflow vulnerability was discovered in QuickJS's libregexp.c where multiplication to compute allocation size could wrap around, causing a heap overflow. The fix replaces the unsafe `malloc(sizeof(capture[0]) * lre_get_alloc_count(bc))` pattern with `calloc(lre_get_alloc_count(bc), sizeof(capture[0]))`, which safely handles the multiplication internally and prevents exploitation.

By Orbis AppSec
Published June 28, 2026
Reviewed June 28, 2026

Answer Summary

Integer overflow in malloc (CWE-190/CWE-122) occurs in C when arithmetic multiplication to compute buffer size wraps around, allocating a too-small buffer that leads to heap overflow. In QuickJS's libregexp.c, the vulnerable pattern `malloc(sizeof(capture[0]) * lre_get_alloc_count(bc))` was fixed by replacing it with `calloc(lre_get_alloc_count(bc), sizeof(capture[0]))`, which performs overflow-checked multiplication internally.

Vulnerability at a Glance

CWE: CWE-190 (Integer Overflow) / CWE-122 (Heap-based Buffer Overflow)
Fix: Replace malloc() with calloc(), which performs safe multiplication internally
Risk: Heap corruption enabling arbitrary code execution or denial of service
Language: C
Root cause: Unchecked multiplication in allocation size computation can wrap to a small value
Vulnerability: Integer Overflow Leading to Heap Overflow

Introduction

In QuickJS's regular expression library, we discovered a high-severity integer overflow vulnerability at line 3429 of libregexp.c. The test harness in main() allocated memory for regex capture groups using an unsafe multiplication pattern that a specially crafted regular expression could exploit.

The vulnerable code computed allocation size by multiplying sizeof(capture[0]) by the return value of lre_get_alloc_count(bc):

capture = malloc(sizeof(capture[0]) * lre_get_alloc_count(bc));

When lre_get_alloc_count(bc) returns a sufficiently large value—controlled by the complexity of the compiled regex bytecode—this multiplication can overflow, wrapping around to a small number. The result? A tiny buffer is allocated, but subsequent regex execution writes capture group data as if the full size were available, corrupting the heap.

The Vulnerability Explained

How Integer Overflow Leads to Heap Corruption

In C, arithmetic operations on integers have no built-in overflow protection. When you multiply two values and the result exceeds the maximum representable value for that type, the result "wraps around" to a small number due to modular arithmetic.

Consider the vulnerable line:

capture = malloc(sizeof(capture[0]) * lre_get_alloc_count(bc));

For illustration, suppose sizeof(capture[0]) is 16 bytes (as it would be for a capture group structure containing two 8-byte pointers). If an attacker crafts a regex such that lre_get_alloc_count(bc) returns an illustrative value like 0x1000000000000001 on a 64-bit system, the multiplication becomes:

16 * 0x1000000000000001 = 0x10000000000000010

This result needs 65 bits, so it is reduced modulo 2^64 to just 0x10 (16 bytes). The malloc() call allocates only 16 bytes, but lre_exec() then attempts to write capture data for far more entries than this tiny buffer can hold.
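The wraparound can be reproduced with fixed-width unsigned arithmetic. The sketch below is only an illustration: wrapped_size is a hypothetical helper, and the operands are the example values from the text, not values observed in QuickJS.

```c
#include <stdint.h>

/* Hypothetical helper: multiplies the way malloc(a * b) would on a 64-bit
   system. Unsigned arithmetic is defined to wrap modulo 2^64, so no error
   is raised when the true product does not fit. */
static uint64_t wrapped_size(uint64_t elem_size, uint64_t count)
{
    return elem_size * count;
}
```

Calling wrapped_size(16, 0x1000000000000001ULL) yields 0x10, the 16-byte request that malloc() would actually see.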

Attack Scenario Specific to libregexp.c

An attacker targeting this vulnerability would:

  1. Craft a malicious regex pattern with an enormous number of capture groups—the test case in the PR demonstrates this with thousands of nested parentheses (((((...)))))
  2. Compile the regex through QuickJS's regex compilation, which generates bytecode where lre_get_alloc_count() returns a massive value
  3. Trigger the allocation in the test harness's main() function, causing the overflow
  4. Execute the regex against any input, causing lre_exec() to write beyond the allocated buffer

The heap overflow enables:
- Arbitrary code execution by overwriting function pointers or vtables in adjacent heap objects
- Information disclosure by corrupting heap metadata to leak memory contents
- Denial of service by crashing the application through heap corruption

The Fix

The fix replaces the unsafe pattern of calling malloc() on a multiplied size with a call to calloc():

Before (Vulnerable)

capture = malloc(sizeof(capture[0]) * lre_get_alloc_count(bc));

After (Fixed)

capture = calloc(lre_get_alloc_count(bc), sizeof(capture[0]));

Why calloc() Solves This Problem

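The calloc() function takes two arguments (the number of elements and the size of each element) and performs the multiplication internally. It is specified to either return space for the whole array or fail, and mainstream implementations such as glibc and musl enforce this by checking the multiplication for overflow: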

  1. calloc() checks for overflow before allocating. If nmemb * size would overflow, calloc() returns NULL instead of allocating an undersized buffer.

  2. Zero-initialization provides defense in depth. Even if there were edge cases, the buffer is zeroed, preventing information leakage from uninitialized memory.

  3. Semantic clarity makes the intent obvious—we're allocating an array of lre_get_alloc_count(bc) elements, each of size sizeof(capture[0]).

The fix at line 3429 ensures that when an attacker provides a regex designed to trigger overflow, the allocation fails with a NULL return rather than succeeding with a corrupted size. This is only safe if the caller checks for NULL before using the buffer.
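A minimal sketch of the calloc() pattern with the NULL check made explicit. The helper name alloc_captures and the uint8_t * element type are assumptions for illustration, not the actual QuickJS harness code.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a zeroed array of alloc_count capture slots,
   or NULL when the request cannot be satisfied, which includes the case
   where alloc_count * sizeof(capture[0]) would overflow. */
static uint8_t **alloc_captures(size_t alloc_count)
{
    uint8_t **capture = calloc(alloc_count, sizeof(capture[0]));
    if (capture == NULL) {
        /* Report the failure; the caller must not run the regex engine. */
        fprintf(stderr, "capture allocation failed for %zu slots\n", alloc_count);
    }
    return capture;
}
```

A caller would test the result before passing it to the regex engine and release it with free() afterwards.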

Prevention & Best Practices

Safe Allocation Patterns in C

Always prefer calloc() for arrays:

// UNSAFE
ptr = malloc(count * element_size);

// SAFE
ptr = calloc(count, element_size);

When malloc() is required, check for overflow explicitly:

// Safe multiplication check
if (count > 0 && element_size > SIZE_MAX / count) {
    // Overflow would occur
    return NULL;
}
ptr = malloc(count * element_size);

Use compiler built-ins when available (GCC and Clang provide __builtin_mul_overflow):

size_t total;
if (__builtin_mul_overflow(count, element_size, &total)) {
    return NULL;
}
ptr = malloc(total);
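The two checks above can be combined into one reusable helper. This is a sketch under stated assumptions: size_mul_overflows and malloc_array are hypothetical names, and the built-in branch assumes a GCC or Clang version that provides __builtin_mul_overflow, with the manual division check as the fallback.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns nonzero if a * b does not fit in size_t.
   On success, *out holds the product. */
static int size_mul_overflows(size_t a, size_t b, size_t *out)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_mul_overflow(a, b, out);
#else
    if (a != 0 && b > SIZE_MAX / a) {
        return 1;
    }
    *out = a * b;
    return 0;
#endif
}

/* Hypothetical wrapper: overflow-checked malloc for arrays. */
static void *malloc_array(size_t count, size_t element_size)
{
    size_t total;
    if (size_mul_overflows(count, element_size, &total)) {
        return NULL;  /* refuse instead of allocating an undersized buffer */
    }
    return malloc(total);
}
```

Unlike calloc(), this wrapper leaves the memory uninitialized, so it fits cases where zeroing a large buffer is an unwanted cost.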

Static Analysis Integration

Configure your CI/CD pipeline to flag dangerous allocation patterns:
- Semgrep rules can detect malloc() calls with multiplication in the argument
- Compiler warnings like GCC's -Walloc-size-larger-than=<size> catch some cases
- AddressSanitizer (ASan) detects the resulting heap overflow at runtime

Code Review Checklist

When reviewing C code that handles dynamic allocation:
- [ ] Is the allocation size computed safely?
- [ ] Could any input influence the size calculation?
- [ ] Is calloc() used for array allocations?
- [ ] Is the return value checked for NULL?

Key Takeaways

  • Never use malloc(a * b) for array allocation—the multiplication can overflow silently, and calloc(a, b) handles this safely
  • The lre_get_alloc_count() return value is influenced by regex complexity, making this a user-controllable attack vector in any application that compiles untrusted regexes
  • QuickJS's test harness in main() was vulnerable, demonstrating that even test code can have security implications if it processes untrusted input
  • calloc() provides two protections: overflow-checked multiplication AND zero-initialization
  • Regex engines are high-value targets because they process complex, attacker-controlled input—extra scrutiny on memory operations is essential

How Orbis AppSec Detected This

  • Source: The bc (bytecode) parameter passed to lre_get_alloc_count(), derived from compiling a user-provided regex pattern via argv[1]
  • Sink: malloc(sizeof(capture[0]) * lre_get_alloc_count(bc)) at quickjs/libregexp.c:3429
  • Missing control: No overflow check on the multiplication before allocation
  • CWE: CWE-190 (Integer Overflow or Wraparound) leading to CWE-122 (Heap-based Buffer Overflow)
  • Fix: Replaced malloc() with calloc() which performs overflow-checked multiplication internally

Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.

Conclusion

Integer overflow vulnerabilities in memory allocation are among the most dangerous bugs in C programs. They're subtle—the code looks correct at first glance—but can lead to complete system compromise through heap corruption.

The fix in QuickJS's libregexp.c demonstrates the simplest and most effective mitigation: use calloc() instead of malloc() with multiplication. This single-line change transforms a high-severity vulnerability into a safe allocation that fails gracefully when given malicious input.

When working with C code that handles untrusted input—especially complex parsers like regex engines—always assume that any size calculation could be manipulated. Design your allocation strategy to fail safely rather than corrupt memory silently.

Frequently Asked Questions

What is integer overflow in malloc?

Integer overflow in malloc occurs when the arithmetic used to compute the allocation size exceeds the maximum value for the integer type, wrapping around to a small number and causing an undersized buffer allocation.

How do you prevent integer overflow in malloc in C?

Use calloc() instead of malloc() with multiplication, as calloc() performs overflow-checked multiplication internally. Alternatively, explicitly check for overflow before the multiplication or use safe integer arithmetic functions.

What CWE is integer overflow in malloc?

This vulnerability maps to CWE-190 (Integer Overflow or Wraparound) and CWE-122 (Heap-based Buffer Overflow), as the overflow leads to undersized allocation and subsequent heap corruption.

Is using size_t enough to prevent integer overflow in malloc?

No, size_t only ensures the type can hold valid allocation sizes, but multiplication of two size_t values can still overflow. You must either use calloc() or explicitly check for overflow before multiplying.

Can static analysis detect integer overflow in malloc?

Yes, static analysis tools like Semgrep can detect patterns where multiplication is used to compute allocation sizes without overflow checks, flagging them as potential vulnerabilities.

View the Security Fix

Check out the pull request that fixed this vulnerability

View PR #24
