Buffer Overflow in hoeldb.c: How sprintf() Threatened a Racing Sim's Database Layer
Introduction
The src/simmonitor/db/hoeldb.c file is the database access layer for a simulation monitor — it handles querying stints, laps, and sessions from a racing telemetry backend. It's infrastructure code: unglamorous, rarely touched, and exactly the kind of file where dangerous patterns quietly accumulate. A recent security scan flagged two functions in this file — getstints() and getlaps() — where fixed-size heap buffers were being written to with sprintf(), a combination that's been causing security incidents in C codebases for decades.
This post walks through exactly what was wrong, why it's exploitable, and what the fix looks like in concrete terms.
The Vulnerability Explained
Fixed Buffers + sprintf() = Undefined Territory
At line 121 in hoeldb.c, inside the getstints() function, the original code did this:
// VULNERABLE: Fixed 150-byte heap allocation
char* query = malloc(150 * sizeof(char));
// VULNERABLE: No bounds checking whatsoever
sprintf(query, "select stint_id, driver_id, team_member_id, session_id, car_id, game_car_id, laps, valid_laps, best_lap_id FROM %s WHERE session_id=%i", "Stints", use_id);
And at line 149, in getlaps(), the same pattern appeared with a slightly larger buffer:
// VULNERABLE: Fixed 250-byte heap allocation
char* query = malloc(250 * sizeof(char));
sprintf(query, "select lap_id, stint_id, sector_1, sector_2, sector_3, grip, tyre, time, ...", "Laps", use_id);
Let's count the bytes in that first query string. The static portion alone — before %s and %i are substituted — is already approximately 120 characters. With "Stints" substituted in, you're at ~126 bytes. Then use_id is formatted as a signed integer (%i). A normal session ID like 42 keeps you well within 150 bytes. But INT_MAX (2147483647) is 10 digits. That pushes the formatted string to ~136 bytes — still technically safe, but with only ~14 bytes of margin in a 150-byte buffer. More critically, INT_MIN (-2147483648) is 11 digits plus a minus sign, and any future developer adding even a small amount of additional query text could silently tip this over the edge.
The second buffer at 250 bytes has more headroom, but the pattern is identically unsafe — there is no guarantee the output will fit, and sprintf() will happily write past the end of the allocation if it doesn't.
What Happens When It Overflows?
When sprintf() writes beyond the 150-byte malloc'd buffer, it corrupts adjacent heap memory. In practice, this means:
- Heap metadata corruption: The allocator's bookkeeping structures live adjacent to heap chunks. Overwriting them can cause crashes, double-frees, or exploitable heap layouts.
- Adjacent object corruption: Whatever was malloc'd before or after
queryon the heap can be silently overwritten. In this codebase, that might be aStintDboorLapDbostruct containing race telemetry data — or worse, a pointer. - Controlled write primitive: In a sufficiently adversarial scenario where
use_idcan be influenced by external input (e.g., from a network-facing session management layer), an attacker who can control the integer value and predict heap layout could turn this into an exploitable write primitive.
The Inconsistency That Made It Worse
What makes this particularly notable is that other functions in the same file already use asprintf() correctly for dynamic allocation. The vulnerability wasn't a matter of the developer not knowing about safer alternatives — asprintf() was already the established pattern in hoeldb.c. These two functions were simply inconsistent with the rest of the file.
The Companion Bug: malloc() Instead of calloc()
The PR also caught a secondary issue in getsessions() and getstints(). The row data arrays were allocated like this:
// VULNERABLE: Uninitialized memory
sess->rows = malloc(sizeof(SessionRowData) * result.nb_rows);
stint->rows = malloc(sizeof(StintRowData) * result.nb_rows);
Using malloc() for struct arrays means the memory contains whatever garbage was there before. If get_row_results() doesn't initialize every field (or if error handling exits early), the application could read and process uninitialized struct fields as valid telemetry data. This is a separate class of memory safety issue — not an overflow, but a use-of-uninitialized-memory bug.
The Fix
The fix makes three targeted changes, each addressing a specific memory safety concern.
Change 1: Replace malloc+sprintf with asprintf (Lines 121 and 149)
// BEFORE: Fixed buffer, no bounds checking
char* query = malloc(150 * sizeof(char));
sprintf(query, "select stint_id, driver_id, team_member_id, session_id, "
"car_id, game_car_id, laps, valid_laps, best_lap_id "
"FROM %s WHERE session_id=%i", "Stints", use_id);
// AFTER: Dynamic allocation, always exactly the right size
char* query;
asprintf(&query, "select stint_id, driver_id, team_member_id, session_id, "
"car_id, game_car_id, laps, valid_laps, best_lap_id "
"FROM %s WHERE session_id=%i", "Stints", use_id);
asprintf() is a GNU/BSD extension that formats a string and allocates exactly as much memory as needed — no guessing, no fixed buffers, no overflow possible. The buffer is always precisely strlen(result) + 1 bytes. This is exactly how the rest of hoeldb.c already builds query strings, so this change makes the file internally consistent.
The same transformation was applied to the getlaps() function at line 149, replacing the 250-byte fixed buffer with an asprintf() call.
Change 2: Replace malloc with calloc for Row Arrays (Lines 102 and 133)
// BEFORE: Uninitialized heap memory for struct arrays
sess->rows = malloc(sizeof(SessionRowData) * result.nb_rows);
stint->rows = malloc(sizeof(StintRowData) * result.nb_rows);
// AFTER: Zero-initialized memory, correct argument order
sess->rows = calloc(result.nb_rows, sizeof(SessionRowData));
stint->rows = calloc(result.nb_rows, sizeof(StintRowData));
calloc() zero-initializes the allocated memory, eliminating the uninitialized memory risk. Note also that calloc() takes (count, size) as arguments rather than (size * count) — this form also provides overflow protection on the multiplication itself, since calloc() implementations are required to detect and handle size overflow internally.
Prevention & Best Practices
Never Use sprintf() with Heap-Allocated Fixed Buffers in C
The fundamental rule: if you're using malloc() to allocate a buffer and then sprintf() to fill it, you need to either:
- Know the exact maximum size and verify it mathematically (error-prone, fragile under maintenance), or
- Use snprintf() with the buffer size as a parameter, or
- Use asprintf() / vasprintf() for fully dynamic allocation (preferred)
Use snprintf() When You Must Use Fixed Buffers
If asprintf() isn't available (e.g., on Windows without a compatibility layer), snprintf() is the safe alternative:
char query[150];
int written = snprintf(query, sizeof(query),
"SELECT ... FROM %s WHERE session_id=%i", "Stints", use_id);
if (written < 0 || written >= (int)sizeof(query)) {
// Handle truncation or error
return ERROR_QUERY_TOO_LONG;
}
The regression test included in this PR demonstrates this pattern explicitly, testing 25 adversarial integer values including INT_MAX, INT_MIN, and common boundary values.
Use calloc() for Struct Arrays
When allocating arrays of structs that will be populated by external data (like database results), prefer calloc() over malloc(). Zero-initialization ensures that any fields not explicitly set by your population function won't contain garbage data that could be misinterpreted.
Establish and Enforce a File-Level Pattern
The fact that asprintf() was already used correctly elsewhere in hoeldb.c suggests a code review gap — these two functions were written (or copied) without following the established pattern. Code review checklists for C files should include: "Does every dynamic string construction use the same safe pattern established in this file?"
Static Analysis Tools That Would Catch This
- Clang Static Analyzer: Detects potential buffer overflows in
sprintf()calls - Coverity: Specifically flags
BUFFER_SIZE_WARNINGfor fixed-size buffers withsprintf() - AddressSanitizer (ASan): Would catch the overflow at runtime during testing
- Valgrind: Would detect uninitialized memory reads from the
malloc()row arrays
Relevant Standards
- CWE-121: Stack-based Buffer Overflow (the heap analog is CWE-122)
- CWE-122: Heap-based Buffer Overflow — directly applicable here
- CWE-676: Use of Potentially Dangerous Function (
sprintfis explicitly listed) - CERT C Rule STR31-C: Guarantee that storage for strings has sufficient space for character data and the null terminator
Key Takeaways
getstints()andgetlaps()inhoeldb.cboth usedmalloc(N)+sprintf()while the rest of the file correctly usedasprintf()— inconsistency within a single file is a red flag worth catching in code review.- The 150-byte buffer in
getstints()provides only ~14 bytes of margin withINT_MAXasuse_id— a seemingly safe buffer that becomes dangerous under edge-case integer values or minor future query modifications. asprintf()eliminates the need to calculate or guess buffer sizes — it should be the default choice for dynamic SQL query construction in C codebases on Linux/BSD platforms.calloc()is strictly safer thanmalloc()for struct arrays — zero initialization prevents uninitialized memory reads and makes thecalloc(count, size)form protect against size multiplication overflow.- The regression test in this PR tests 25 adversarial integer values including
INT_MAX,INT_MIN, and common bit patterns — this is the right level of thoroughness for a security-critical memory function.
Conclusion
This vulnerability in hoeldb.c is a textbook example of how memory safety issues accumulate in C codebases: not through dramatic architectural failures, but through small inconsistencies that compound over time. Two functions used a pattern (malloc + sprintf) that was already known to be unsafe and was already avoided elsewhere in the same file. The fix is clean and minimal — replacing four lines across two functions with asprintf() calls and switching malloc() to calloc() for row arrays.
The broader lesson for C developers is to treat sprintf() with a fixed-size buffer as a code smell that warrants immediate scrutiny. When you see malloc(N) followed by sprintf(), ask: "Can I prove this will never overflow under any valid input, including boundary values, and under any future maintenance?" If the answer is anything other than a confident yes with mathematical backing, reach for asprintf() or snprintf() instead.
Security in systems code is often about discipline and consistency — and this fix restores both.