Back to Blog
critical SEVERITY8 min read

Buffer Overflow in hoeldb.c: How sprintf() Threatened a Racing Sim's Database Layer

A critical buffer overflow vulnerability was discovered in `src/simmonitor/db/hoeldb.c`, where fixed-size heap buffers (150 and 250 bytes) were allocated with `malloc()` and then written to using `sprintf()` without any bounds checking. The fix replaces these unsafe patterns with `asprintf()` for dynamic allocation and `calloc()` for row data buffers, eliminating both the overflow risk and a related uninitialized memory hazard.

O
By orbisai0security
May 29, 2026

Buffer Overflow in hoeldb.c: How sprintf() Threatened a Racing Sim's Database Layer

Introduction

The src/simmonitor/db/hoeldb.c file is the database access layer for a simulation monitor — it handles querying stints, laps, and sessions from a racing telemetry backend. It's infrastructure code: unglamorous, rarely touched, and exactly the kind of file where dangerous patterns quietly accumulate. A recent security scan flagged two functions in this file — getstints() and getlaps() — where fixed-size heap buffers were being written to with sprintf(), a combination that's been causing security incidents in C codebases for decades.

This post walks through exactly what was wrong, why it's exploitable, and what the fix looks like in concrete terms.


The Vulnerability Explained

Fixed Buffers + sprintf() = Undefined Territory

At line 121 in hoeldb.c, inside the getstints() function, the original code did this:

// VULNERABLE: Fixed 150-byte heap allocation
char* query = malloc(150 * sizeof(char));

// VULNERABLE: No bounds checking whatsoever
sprintf(query, "select stint_id, driver_id, team_member_id, session_id, car_id, game_car_id, laps, valid_laps, best_lap_id FROM %s WHERE session_id=%i", "Stints", use_id);

And at line 149, in getlaps(), the same pattern appeared with a slightly larger buffer:

// VULNERABLE: Fixed 250-byte heap allocation
char* query = malloc(250 * sizeof(char));

sprintf(query, "select lap_id, stint_id, sector_1, sector_2, sector_3, grip, tyre, time, ...", "Laps", use_id);

Let's count the bytes in that first query string. The static portion alone — before %s and %i are substituted — is already approximately 120 characters. With "Stints" substituted in, you're at ~126 bytes. Then use_id is formatted as a signed integer (%i). A normal session ID like 42 keeps you well within 150 bytes. But INT_MAX (2147483647) is 10 digits. That pushes the formatted string to ~136 bytes — still technically safe, but with only ~14 bytes of margin in a 150-byte buffer. More critically, INT_MIN (-2147483648) is 11 digits plus a minus sign, and any future developer adding even a small amount of additional query text could silently tip this over the edge.

The second buffer at 250 bytes has more headroom, but the pattern is identically unsafe — there is no guarantee the output will fit, and sprintf() will happily write past the end of the allocation if it doesn't.

What Happens When It Overflows?

When sprintf() writes beyond the 150-byte malloc'd buffer, it corrupts adjacent heap memory. In practice, this means:

  1. Heap metadata corruption: The allocator's bookkeeping structures live adjacent to heap chunks. Overwriting them can cause crashes, double-frees, or exploitable heap layouts.
  2. Adjacent object corruption: Whatever was malloc'd before or after query on the heap can be silently overwritten. In this codebase, that might be a StintDbo or LapDbo struct containing race telemetry data — or worse, a pointer.
  3. Controlled write primitive: In a sufficiently adversarial scenario where use_id can be influenced by external input (e.g., from a network-facing session management layer), an attacker who can control the integer value and predict heap layout could turn this into an exploitable write primitive.

The Inconsistency That Made It Worse

What makes this particularly notable is that other functions in the same file already use asprintf() correctly for dynamic allocation. The vulnerability wasn't a matter of the developer not knowing about safer alternatives — asprintf() was already the established pattern in hoeldb.c. These two functions were simply inconsistent with the rest of the file.

The Companion Bug: malloc() Instead of calloc()

The PR also caught a secondary issue in getsessions() and getstints(). The row data arrays were allocated like this:

// VULNERABLE: Uninitialized memory
sess->rows = malloc(sizeof(SessionRowData) * result.nb_rows);
stint->rows = malloc(sizeof(StintRowData) * result.nb_rows);

Using malloc() for struct arrays means the memory contains whatever garbage was there before. If get_row_results() doesn't initialize every field (or if error handling exits early), the application could read and process uninitialized struct fields as valid telemetry data. This is a separate class of memory safety issue — not an overflow, but a use-of-uninitialized-memory bug.


The Fix

The fix makes three targeted changes, each addressing a specific memory safety concern.

Change 1: Replace malloc+sprintf with asprintf (Lines 121 and 149)

// BEFORE: Fixed buffer, no bounds checking
char* query = malloc(150 * sizeof(char));
sprintf(query, "select stint_id, driver_id, team_member_id, session_id, "
               "car_id, game_car_id, laps, valid_laps, best_lap_id "
               "FROM %s WHERE session_id=%i", "Stints", use_id);

// AFTER: Dynamic allocation, always exactly the right size
char* query;
asprintf(&query, "select stint_id, driver_id, team_member_id, session_id, "
                 "car_id, game_car_id, laps, valid_laps, best_lap_id "
                 "FROM %s WHERE session_id=%i", "Stints", use_id);

asprintf() is a GNU/BSD extension that formats a string and allocates exactly as much memory as needed — no guessing, no fixed buffers, no overflow possible. The buffer is always precisely strlen(result) + 1 bytes. This is exactly how the rest of hoeldb.c already builds query strings, so this change makes the file internally consistent.

The same transformation was applied to the getlaps() function at line 149, replacing the 250-byte fixed buffer with an asprintf() call.

Change 2: Replace malloc with calloc for Row Arrays (Lines 102 and 133)

// BEFORE: Uninitialized heap memory for struct arrays
sess->rows = malloc(sizeof(SessionRowData) * result.nb_rows);
stint->rows = malloc(sizeof(StintRowData) * result.nb_rows);

// AFTER: Zero-initialized memory, correct argument order
sess->rows = calloc(result.nb_rows, sizeof(SessionRowData));
stint->rows = calloc(result.nb_rows, sizeof(StintRowData));

calloc() zero-initializes the allocated memory, eliminating the uninitialized memory risk. Note also that calloc() takes (count, size) as arguments rather than (size * count) — this form also provides overflow protection on the multiplication itself, since calloc() implementations are required to detect and handle size overflow internally.


Prevention & Best Practices

Never Use sprintf() with Heap-Allocated Fixed Buffers in C

The fundamental rule: if you're using malloc() to allocate a buffer and then sprintf() to fill it, you need to either:
- Know the exact maximum size and verify it mathematically (error-prone, fragile under maintenance), or
- Use snprintf() with the buffer size as a parameter, or
- Use asprintf() / vasprintf() for fully dynamic allocation (preferred)

Use snprintf() When You Must Use Fixed Buffers

If asprintf() isn't available (e.g., on Windows without a compatibility layer), snprintf() is the safe alternative:

char query[150];
int written = snprintf(query, sizeof(query),
    "SELECT ... FROM %s WHERE session_id=%i", "Stints", use_id);
if (written < 0 || written >= (int)sizeof(query)) {
    // Handle truncation or error
    return ERROR_QUERY_TOO_LONG;
}

The regression test included in this PR demonstrates this pattern explicitly, testing 25 adversarial integer values including INT_MAX, INT_MIN, and common boundary values.

Use calloc() for Struct Arrays

When allocating arrays of structs that will be populated by external data (like database results), prefer calloc() over malloc(). Zero-initialization ensures that any fields not explicitly set by your population function won't contain garbage data that could be misinterpreted.

Establish and Enforce a File-Level Pattern

The fact that asprintf() was already used correctly elsewhere in hoeldb.c suggests a code review gap — these two functions were written (or copied) without following the established pattern. Code review checklists for C files should include: "Does every dynamic string construction use the same safe pattern established in this file?"

Static Analysis Tools That Would Catch This

  • Clang Static Analyzer: Detects potential buffer overflows in sprintf() calls
  • Coverity: Specifically flags BUFFER_SIZE_WARNING for fixed-size buffers with sprintf()
  • AddressSanitizer (ASan): Would catch the overflow at runtime during testing
  • Valgrind: Would detect uninitialized memory reads from the malloc() row arrays

Relevant Standards

  • CWE-121: Stack-based Buffer Overflow (the heap analog is CWE-122)
  • CWE-122: Heap-based Buffer Overflow — directly applicable here
  • CWE-676: Use of Potentially Dangerous Function (sprintf is explicitly listed)
  • CERT C Rule STR31-C: Guarantee that storage for strings has sufficient space for character data and the null terminator

Key Takeaways

  • getstints() and getlaps() in hoeldb.c both used malloc(N)+sprintf() while the rest of the file correctly used asprintf() — inconsistency within a single file is a red flag worth catching in code review.
  • The 150-byte buffer in getstints() provides only ~14 bytes of margin with INT_MAX as use_id — a seemingly safe buffer that becomes dangerous under edge-case integer values or minor future query modifications.
  • asprintf() eliminates the need to calculate or guess buffer sizes — it should be the default choice for dynamic SQL query construction in C codebases on Linux/BSD platforms.
  • calloc() is strictly safer than malloc() for struct arrays — zero initialization prevents uninitialized memory reads and makes the calloc(count, size) form protect against size multiplication overflow.
  • The regression test in this PR tests 25 adversarial integer values including INT_MAX, INT_MIN, and common bit patterns — this is the right level of thoroughness for a security-critical memory function.

Conclusion

This vulnerability in hoeldb.c is a textbook example of how memory safety issues accumulate in C codebases: not through dramatic architectural failures, but through small inconsistencies that compound over time. Two functions used a pattern (malloc + sprintf) that was already known to be unsafe and was already avoided elsewhere in the same file. The fix is clean and minimal — replacing four lines across two functions with asprintf() calls and switching malloc() to calloc() for row arrays.

The broader lesson for C developers is to treat sprintf() with a fixed-size buffer as a code smell that warrants immediate scrutiny. When you see malloc(N) followed by sprintf(), ask: "Can I prove this will never overflow under any valid input, including boundary values, and under any future maintenance?" If the answer is anything other than a confident yes with mathematical backing, reach for asprintf() or snprintf() instead.

Security in systems code is often about discipline and consistency — and this fix restores both.

View the Security Fix

Check out the pull request that fixed this vulnerability

View PR #9

Related Articles

critical

Heap Buffer Overflow in Audio Ring Buffer: How a Missing Bounds Check Could Crash Your App

A critical heap buffer overflow vulnerability was discovered in `audio_backend.c`, where the audio ring buffer's `memcpy` operations lacked bounds validation before writing PCM data. Without checking that incoming data sizes fell within the allocated buffer's capacity, a maliciously crafted audio file could corrupt adjacent heap memory, potentially enabling arbitrary code execution. The fix adds a concise pre-flight validation guard that rejects out-of-range write requests before any memory oper

critical

Critical Heap Buffer Overflow in SSDP Control Point: How Unbounded String Operations Put Networks at Risk

A critical heap buffer overflow vulnerability was discovered and patched in the SSDP control point implementation (`ssdp_ctrlpt.c`), where multiple unbounded `strcpy` and `strcat` operations constructed HTTP request buffers without any length validation. Network-received SSDP response fields — including service type strings and location URLs — could be crafted by an attacker to exceed buffer boundaries, potentially enabling arbitrary code execution or denial of service. The fix replaces the unsa

critical

Heap Buffer Overflow in OPDS Parser: How a Misplaced Variable Nearly Opened the Door to Remote Code Execution

A critical heap buffer overflow vulnerability was discovered in `lib/OpdsParser/OpdsParser.cpp`, where the buffer allocation size was calculated *after* a fixed chunk size was used to allocate memory, meaning the actual bytes read could exceed the allocated buffer. On embedded devices parsing untrusted OPDS catalog data from the network, this flaw could allow a remote attacker to corrupt heap memory and potentially achieve arbitrary code execution. The fix was elegantly simple: move the `toRead`

critical

Heap Buffer Overflow in BLE MIDI: How a Missing Bounds Check Opens the Door to Remote Exploitation

A critical heap buffer overflow vulnerability was discovered in the BLE MIDI packet assembly code of `blemidi.c`, where attacker-controlled packet length values could trigger writes beyond allocated heap memory. The fix adds an integer overflow guard before the `malloc` call, ensuring that maliciously crafted BLE MIDI packets can no longer corrupt heap memory. This vulnerability is particularly dangerous because it is remotely exploitable by any nearby Bluetooth device — no physical access requi

critical

Heap Overflow in TOML Parser: How Integer Overflow Leads to Memory Corruption

A critical heap buffer overflow vulnerability was discovered and patched in the centitoml TOML parser, where missing integer overflow validation on a `MALLOC(len+1)` call could allow an attacker to trigger memory corruption via a crafted TOML configuration file. The vulnerability (CWE-190) is reachable through community-distributed mod or map files that the game loads from its `config/` directory, making it a realistic attack vector for remote code execution. A targeted one-line guard now preven

critical

Local File Inclusion in Crawl4AI Docker API via file:// URL Injection

CVE-2026-26217 is a critical Local File Inclusion (LFI) vulnerability in Crawl4AI versions prior to 0.8.0, where the Docker API fails to restrict `file://` URL schemes, allowing attackers to read arbitrary files from the host filesystem. The fix upgrades `crawl4ai` from `0.7.6` to `0.8.0` in `pyproject.toml` and `uv.lock`, closing a direct path to sensitive file exfiltration in any containerized deployment using this library.