Severity: critical · 9 min read

Heap Corruption via Integer Overflow in URI Parsing: A Deep Dive into CWE-190

A critical integer overflow vulnerability in `uri.c` allowed attackers to craft malicious URI strings that caused an undersized heap allocation followed by an out-of-bounds `memcpy`, leading to heap corruption. The fix adds mandatory bounds validation before any memory allocation, ensuring the `len + 1` calculation cannot silently wrap around to zero. Left unpatched, this vulnerability could enable remote code execution through carefully crafted URI inputs.

By orbisai0security
May 28, 2026


Introduction

Integer overflows are among the oldest and most dangerous classes of vulnerabilities in systems programming. They're subtle, they're silent, and when they occur in memory allocation paths, they can hand an attacker the keys to your process. This post examines a critical integer overflow vulnerability discovered and fixed in uri.c — a URI parsing component — that could allow a remote attacker to corrupt the heap and potentially achieve arbitrary code execution.

If you write C or C++, work with URI parsers, or simply care about memory safety, this one is worth understanding deeply.


The Vulnerability Explained

What Went Wrong

The vulnerable code pattern lives in uri.c around lines 211–215. It looks something like this:

// VULNERABLE CODE — DO NOT USE
char *out = (char *) malloc(len + 1);
memcpy(out, range->first, len);

At first glance, this looks reasonable. Allocate len + 1 bytes (one extra for the null terminator), then copy len bytes in. Simple. Classic. Broken.

The problem is the len + 1 expression. In C, len is typically a size_t — an unsigned integer type. On a 64-bit system, size_t can hold values up to SIZE_MAX, which is 18446744073709551615 (2⁶⁴ - 1). If an attacker can supply a len value equal to SIZE_MAX, then:

SIZE_MAX + 1 = 0   (integer overflow — wraps to zero)

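The result of malloc(0) is implementation-defined: the call returns either NULL or a unique pointer that must not be dereferenced. Many allocators, glibc among them, return a valid non-NULL pointer to a minimal chunk. In that case memcpy(out, range->first, SIZE_MAX) starts copying an astronomically large number of bytes into that tiny buffer and overwrites whatever follows it on the heap until the process faults.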
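
The wraparound is easy to confirm in isolation. The sketch below uses a hypothetical helper, naive_alloc_size, which is not part of uri.c and simply mirrors the unchecked len + 1 expression.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from uri.c): mirrors the unchecked
 * `len + 1` size computation used by the vulnerable code. */
size_t naive_alloc_size(size_t len)
{
    /* Unsigned arithmetic wraps modulo SIZE_MAX + 1, so this
     * yields 0 when len == SIZE_MAX. */
    return len + 1;
}
```

Calling naive_alloc_size(SIZE_MAX) yields 0, while every smaller input yields len + 1 as expected. That is why the bug stays invisible under ordinary inputs.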

The Attack Path

This vulnerability is reachable via crafted URI input. The exploitation scenario follows this chain:

  1. Attacker submits a crafted URI — for example, through an HTTP request, a SQL query that embeds a URI, or any other input surface that feeds data into the URI parser.
  2. The parser extracts a range — the range->first pointer and a len value derived from pointer arithmetic on the URI string.
  3. len is attacker-influenced — if the URI is constructed such that the computed range length equals SIZE_MAX (or any value where len + 1 overflows), the overflow is triggered.
  4. malloc(0) returns a tiny buffer — the allocator happily hands back a pointer.
  5. memcpy writes far beyond the buffer — heap metadata, adjacent allocations, function pointers — all overwritten.
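
One plausible way for step 3 to occur, assuming the length is computed as last - first (as in the "Before" listing later in this post), is an inverted range in which last ends up one byte before first. The helper name range_length is hypothetical and only illustrates the arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes a range length the way the parser is
 * assumed to, by subtracting two pointers into the same buffer. */
size_t range_length(const char *first, const char *last)
{
    /* last - first is a signed ptrdiff_t; converting a negative
     * result to size_t wraps it to a huge unsigned value. */
    return (size_t)(last - first);
}
```

Given const char buf[] = "http://x", range_length(buf, buf + 4) is 4, but range_length(buf + 1, buf) is SIZE_MAX, the exact value that makes len + 1 wrap to zero.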

Real-World Impact

Heap corruption is not just a crash. A skilled attacker can:

  • Overwrite heap metadata to hijack allocator behavior on the next free() or malloc() call
  • Overwrite adjacent objects containing function pointers or vtable pointers
  • Chain with a second vulnerability (e.g., a use-after-free or type confusion) to achieve arbitrary code execution
  • Cause a denial of service at minimum — the process will almost certainly crash

This is classified as CWE-190: Integer Overflow or Wraparound, and it's rated CRITICAL for good reason.

Why URI Parsers Are High-Risk

URI parsers are a particularly dangerous place for this class of bug because:

  • They process untrusted, attacker-controlled input by design
  • They perform extensive pointer arithmetic on the input string to extract components (scheme, host, path, query, fragment)
  • The computed lengths are directly used in memory operations
  • They are often called early in request processing, before other validation layers

The Fix

What Changed

The fix adds explicit integer overflow checks before any memory allocation occurs. Here is the safe version:

// SAFE CODE — after the fix
static char *safe_uri_range_copy(const char *first, size_t len)
{
    /* Guard 1: len must not be SIZE_MAX — len+1 would overflow to 0 */
    if (len == SIZE_MAX) {
        return NULL;
    }

    /* Guard 2: belt-and-suspenders overflow check */
    if (len + 1 < len) {
        return NULL;
    }

    /* Guard 3: NULL pointer with nonzero length is invalid */
    if (first == NULL && len > 0) {
        return NULL;
    }

    /* Now it is safe to allocate */
    char *out = (char *) malloc(len + 1);
    if (out == NULL) {
        return NULL;  /* Guard 4: always check malloc return value */
    }

    if (len > 0) {
        memcpy(out, first, len);
    }
    out[len] = '\0';

    return out;
}

How Each Guard Works

| Guard | Condition Checked | Why It Matters |
|---|---|---|
| Guard 1 | len == SIZE_MAX | Direct check for the exact overflow boundary |
| Guard 2 | len + 1 < len | Catches any overflow, even on unusual platforms |
| Guard 3 | first == NULL && len > 0 | Prevents NULL dereference in memcpy |
| Guard 4 | out == NULL after malloc | Prevents use of a failed allocation |

Guards 1 and 2 are both present intentionally as a "belt and suspenders" approach. Because len is an unsigned size_t, len + 1 can wrap only when len == SIZE_MAX, so the two guards reject exactly the same input. Guard 1 states the boundary explicitly, while Guard 2 expresses the same condition as a wraparound test that does not depend on the SIZE_MAX macro.
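
The relationship between the first two guards can be checked directly. The sketch below isolates each condition in a hypothetical helper (the names are illustrative, not from uri.c) so that both can be evaluated on the same inputs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: each returns 1 when its guard rejects len. */
int guard1_rejects(size_t len)
{
    return len == SIZE_MAX;   /* explicit boundary check */
}

int guard2_rejects(size_t len)
{
    return len + 1 < len;     /* wraparound check on unsigned size_t */
}
```

For a size_t argument both helpers return 1 only for SIZE_MAX. Neither limits len to a plausible URI size, which is a separate concern handled by a maximum length policy.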

Before vs. After

// BEFORE: No validation — one crafted URI causes heap corruption
char *uri_range_to_string(uri_range_t *range) {
    size_t len = range->last - range->first;
    char *out = (char *) malloc(len + 1);  // ← overflow possible here
    memcpy(out, range->first, len);         // ← heap corruption here
    out[len] = '\0';
    return out;
}

// AFTER: Validation gates all memory operations
char *uri_range_to_string(uri_range_t *range) {
    size_t len = range->last - range->first;

    if (len == SIZE_MAX || len + 1 < len) {
        return NULL;  // ← reject overflow before it happens
    }

    char *out = (char *) malloc(len + 1);
    if (out == NULL) {
        return NULL;  // ← always check allocation
    }

    memcpy(out, range->first, len);
    out[len] = '\0';
    return out;
}

The change is small. The security improvement is enormous.
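
Neither listing checks whether range->last lies before range->first, a case that could produce a wrapped length in the first place. A minimal sketch of a stricter variant follows. The uri_range_t layout (two const char * fields) and the name uri_range_to_string_strict are assumptions, since the post does not show the real definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout; the real uri_range_t in uri.c may differ. */
typedef struct {
    const char *first; /* start of the component */
    const char *last;  /* one past the end of the component */
} uri_range_t;

/* Hypothetical stricter variant: rejects NULL and inverted ranges
 * before any length is computed. The caller frees the result.
 * Both pointers are assumed to point into the same buffer. */
char *uri_range_to_string_strict(const uri_range_t *range)
{
    if (range == NULL || range->first == NULL || range->last == NULL) {
        return NULL;
    }
    if (range->last < range->first) {
        return NULL; /* inverted range: the subtraction would wrap */
    }

    size_t len = (size_t)(range->last - range->first);
    if (len == SIZE_MAX) {
        return NULL; /* len + 1 would wrap to 0 */
    }

    char *out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }
    if (len > 0) {
        memcpy(out, range->first, len);
    }
    out[len] = '\0';
    return out;
}
```

Rejecting the inverted range up front means the length is never computed from a negative pointer difference, so the SIZE_MAX check becomes a second line of defense rather than the only one.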


The Regression Test

The fix ships with a comprehensive regression test suite in tests/test_invariant_uri.c. This is worth highlighting because a fix without a test is an invitation for the bug to come back.

The test suite covers:

/* Test 1: Normal URIs — must parse correctly */
"http://example.com/path"
"https://user:pass@host:8080/path?query=val#frag"

/* Test 2: Adversarial — very long segments */
"http://aaaa...aaaa"  // 280+ 'a' characters

/* Test 3: Special characters and high bytes */
"http://evil.com/\xff\xfe\xfd\xfc"

/* Test 4: The exact overflow boundary */
safe_uri_range_copy(dummy, SIZE_MAX);  // Must return NULL

/* Test 5: NULL pointer with nonzero length */
safe_uri_range_copy(NULL, 10);  // Must return NULL

The key invariant the test enforces:

For any URI-like input, the length used in malloc must not overflow, and the allocated buffer must be large enough to hold len+1 bytes before any memcpy is performed.
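
The contents of tests/test_invariant_uri.c are not reproduced in this post, so the following is only a sketch of how checks like Tests 4 and 5 might be written. It carries its own copy of safe_uri_range_copy so that it compiles standalone, and count_range_copy_failures is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local copy of the fixed helper so the sketch compiles on its own. */
static char *safe_uri_range_copy(const char *first, size_t len)
{
    if (len == SIZE_MAX) return NULL;
    if (first == NULL && len > 0) return NULL;
    char *out = malloc(len + 1);
    if (out == NULL) return NULL;
    if (len > 0) memcpy(out, first, len);
    out[len] = '\0';
    return out;
}

/* Hypothetical test driver: returns the number of failed checks. */
int count_range_copy_failures(void)
{
    int failures = 0;
    const char dummy[] = "http://example.com/path";

    /* Overflow boundary: must be rejected before malloc is reached. */
    if (safe_uri_range_copy(dummy, SIZE_MAX) != NULL) failures++;

    /* NULL source with nonzero length must be rejected. */
    if (safe_uri_range_copy(NULL, 10) != NULL) failures++;

    /* Normal input: the copy must be exact and NUL-terminated. */
    char *copy = safe_uri_range_copy(dummy, 4);
    if (copy == NULL || strcmp(copy, "http") != 0) failures++;
    free(copy);

    /* Zero length: must yield an empty string, not NULL. */
    char *empty = safe_uri_range_copy(dummy, 0);
    if (empty == NULL || empty[0] != '\0') failures++;
    free(empty);

    return failures;
}
```

A return value of 0 means every check passed. Running such a driver under AddressSanitizer would additionally flag any write past the allocated buffer.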


Prevention & Best Practices

1. Always Validate Before Allocating

Any time you compute a size from external input and pass it to malloc, calloc, or realloc, validate it first:

// Pattern: check before compute
if (len > MAX_REASONABLE_URI_LENGTH) return NULL;
if (len == SIZE_MAX) return NULL;
if (len + 1 < len) return NULL;  // overflow check
char *buf = malloc(len + 1);
if (!buf) return NULL;

2. Prefer calloc for Array Allocations

When allocating n elements of size s, use calloc(n, s) instead of malloc(n * s). calloc performs the multiplication with overflow checking internally on most modern implementations:

// Risky:
char *buf = malloc(count * element_size);

// Safer:
char *buf = calloc(count, element_size);
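
Where zero-filled memory is not wanted, or relying on the calloc implementation is not acceptable, the multiplication can be checked explicitly before calling malloc. This is a minimal sketch, and alloc_array is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates count * elem_size bytes, or returns
 * NULL if the product would not fit in size_t. */
void *alloc_array(size_t count, size_t elem_size)
{
    /* Dividing SIZE_MAX avoids performing the overflowing multiply. */
    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        return NULL;
    }
    /* Note: a zero-sized request has implementation-defined results. */
    return malloc(count * elem_size);
}
```

A request such as alloc_array(SIZE_MAX, 2) is rejected without ever computing the wrapped product, while ordinary requests behave like malloc.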

3. Use Safe Integer Libraries

For C code, consider using helper macros or libraries designed for safe integer arithmetic:

  • safe-iop — Safe integer operations for C
  • IntegerLib — CERT-inspired safe integer library
  • __builtin_add_overflow (GCC/Clang) — Compiler built-in overflow detection:
size_t alloc_size;
if (__builtin_add_overflow(len, 1, &alloc_size)) {
    return NULL;  // overflow detected
}
char *buf = malloc(alloc_size);
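
Because __builtin_add_overflow is specific to GCC and Clang, a wrapper can fall back to a manual check on other compilers. The following is a sketch under that assumption, with size_add_checked as a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Detect the builtin where the compiler can report it. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_add_overflow)
#    define HAVE_BUILTIN_ADD_OVERFLOW 1
#  endif
#endif

/* Hypothetical helper: stores a + b in *out and returns 0 on success,
 * or returns -1 (leaving *out untouched) if the sum would wrap. */
int size_add_checked(size_t a, size_t b, size_t *out)
{
#ifdef HAVE_BUILTIN_ADD_OVERFLOW
    size_t sum;
    if (__builtin_add_overflow(a, b, &sum)) {
        return -1;
    }
    *out = sum;
    return 0;
#else
    /* Portable fallback: compare against the remaining headroom. */
    if (a > SIZE_MAX - b) {
        return -1;
    }
    *out = a + b;
    return 0;
#endif
}
```

With this wrapper the allocation path becomes: call size_add_checked(len, 1, &alloc_size), return NULL on a nonzero result, and only then pass alloc_size to malloc.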

4. Enable Compiler Sanitizers During Development

AddressSanitizer (ASan) and UndefinedBehaviorSanitizer (UBSan) catch these issues at runtime during testing:

# Compile with sanitizers
gcc -fsanitize=address,undefined -g -o myprogram myprogram.c

# Or with CMake
cmake -DCMAKE_C_FLAGS="-fsanitize=address,undefined" ..

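UBSan's default checks catch signed integer overflow. Unsigned overflow is defined behavior in C (it wraps), so -fsanitize=undefined does not report it. Clang offers a separate opt-in check, -fsanitize=unsigned-integer-overflow, but explicit checks like the ones in this fix remain the portable defense.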

5. Static Analysis

Run static analyzers as part of your CI pipeline:

| Tool | What It Catches |
|---|---|
| Coverity | Integer overflows, buffer overflows |
| CodeQL | CWE-190, CWE-122 (heap buffer overflow) |
| Clang Static Analyzer | Memory safety issues |
| Flawfinder | Dangerous function calls (memcpy, strcpy) |
| PVS-Studio | Arithmetic overflow, pointer arithmetic |

6. Adopt a Maximum Length Policy

URI components have well-defined maximum lengths in practice. Enforce them:

#define MAX_URI_LENGTH    8192   // RFC 7230 recommends supporting 8000+
#define MAX_PATH_LENGTH   4096
#define MAX_QUERY_LENGTH  4096

if (len > MAX_URI_LENGTH) {
    // Reject — no legitimate URI is this long
    return NULL;
}

This defense-in-depth approach means that even if the overflow check were somehow bypassed, absurdly large lengths would still be rejected.
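
As a sketch of how such limits might be applied per component, the following reuses the three limits defined above together with a hypothetical uri_component_t enum and component_length_ok function, neither of which comes from uri.c.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_URI_LENGTH    8192
#define MAX_PATH_LENGTH   4096
#define MAX_QUERY_LENGTH  4096

/* Hypothetical component tags, for illustration only. */
typedef enum { URI_WHOLE, URI_PATH, URI_QUERY } uri_component_t;

/* Returns 1 if len is within the policy limit for the component,
 * 0 otherwise. */
int component_length_ok(uri_component_t kind, size_t len)
{
    switch (kind) {
    case URI_WHOLE: return len <= MAX_URI_LENGTH;
    case URI_PATH:  return len <= MAX_PATH_LENGTH;
    case URI_QUERY: return len <= MAX_QUERY_LENGTH;
    }
    return 0; /* unknown component: reject */
}
```

Once a length has passed this check it is far below SIZE_MAX, so a subsequent len + 1 cannot wrap.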

7. Relevant Security Standards


Key Takeaways

The vulnerability fixed here is a textbook example of why input validation must happen before memory operations, not after. The fix is just a few lines of code, but those lines enforce a critical security invariant: the allocation size must always be large enough to hold the data being copied into it.

Here's what to take away from this fix:

  1. Integer overflow in malloc arguments is a heap corruption primitive — treat it as seriously as a direct buffer overflow.
  2. len + 1 is dangerous — always check that the addition doesn't overflow before passing the result to an allocator.
  3. URI parsers process attacker-controlled data — every length computed from URI input is a potential attack vector.
  4. Small fixes, big impact — four lines of validation eliminated a critical, potentially exploitable vulnerability.
  5. Tests make fixes permanent — the regression test ensures this vulnerability class cannot silently return in a future refactor.

Memory safety bugs don't announce themselves. They hide in arithmetic, waiting for the one input that makes the numbers lie. The defense is disciplined validation — check your sizes, check your pointers, and check your allocations. Every time.


This vulnerability was identified and fixed by automated security scanning. For more information on automated vulnerability detection and remediation, visit OrbisAI Security.

View the Security Fix

Check out the pull request that fixed this vulnerability

View PR #11
