Severity: high · 7 min read

Thread-Safe Tokenization: Fixing strtok() Reentrancy in Game Script Parsing

A high-severity vulnerability was discovered in `lvl_script_commands.c` where the use of the non-reentrant `strtok()` function during level script parsing created conditions for memory corruption and potential arbitrary code execution. The fix replaces all `strtok()` calls with the thread-safe `strtok_r()` variant, eliminating shared global state that could be exploited through maliciously crafted level files. This change is part of a broader effort to harden the game's script parsing pipeline a

By orbisai0security
May 28, 2026

Introduction

At first glance, a call to strtok() looks harmless — it's a standard C library function taught in introductory programming courses. But lurking beneath its simple interface is a well-known trap: it uses global, shared state. In a game engine that parses user-supplied level scripts, this design flaw can become a serious security liability.

This post explores a high-severity vulnerability found in src/lvl_script_commands.c, where the use of strtok() in script command parsing created conditions for memory corruption. We'll walk through what went wrong, how it was fixed, and what every C developer should know about safe string tokenization.


The Vulnerability Explained

What is strtok() and Why Is It Dangerous?

strtok() is a C standard library function used to split strings into tokens based on a delimiter. Here's the catch: it maintains internal state using a static (global) pointer. This means:

  1. Only one tokenization operation can safely be "in progress" at any time.
  2. If a second call to strtok() occurs — from a nested function, a signal handler, or another thread — it corrupts the state of the first operation.
  3. In a complex parsing pipeline, this is almost impossible to reason about at a glance.

// Dangerous: strtok() stores state in a hidden global variable
char *flag = strtok(new_value, " ");
while (flag != NULL) {
    // If anything inside here calls strtok() again (directly or indirectly),
    // the outer loop's state is silently destroyed
    flag = strtok(NULL, " ");
}
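
To make the failure concrete, here is a minimal, self-contained sketch. The names count_parts and count_tokens_nested_strtok are hypothetical and do not come from the game's source; the sketch only shows that an inner strtok() loop ends the outer loop early.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the '='-separated parts of one token.
 * It uses strtok(), so it overwrites the tokenizer's hidden pointer. */
static int count_parts(const char *token)
{
    /* static so the hidden pointer never dangles in this demo */
    static char copy[64];
    int parts = 0;

    snprintf(copy, sizeof copy, "%s", token);
    for (char *p = strtok(copy, "="); p != NULL; p = strtok(NULL, "="))
        parts++;
    return parts;
}

/* Intended to count space-separated tokens in a writable string.
 * The nested strtok() loop above leaves the shared state exhausted,
 * so the outer loop stops after the first token. */
int count_tokens_nested_strtok(char *line)
{
    int tokens = 0;

    for (char *tok = strtok(line, " "); tok != NULL; tok = strtok(NULL, " ")) {
        tokens++;
        count_parts(tok);   /* inner tokenization clobbers the outer state */
    }
    return tokens;
}
```

Called on a writable copy of "a=1 b=2 c=3", the function reports one token rather than three. Nothing crashes here; the parser simply skips input without any error, which is what makes this class of bug easy to miss.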

The Attack Surface: Malicious Level Files

This vulnerability lives inside set_power_configuration_check(), a function responsible for parsing configuration flags from level script commands. The game loads .lvl files supplied by users (e.g., custom maps or mods), and these files are parsed directly by this code.

An attacker who crafts a malicious level file can:

  • Trigger reentrancy: Cause nested parsing calls that corrupt strtok()'s internal pointer mid-loop.
  • Cause use-after-free or out-of-bounds reads: The corrupted pointer can point to already-freed or unintended memory regions.
  • Chain with other vulnerabilities: When combined with the integer overflow in allocation (V-004) and a potential use-after-free (V-005) in the same codebase, this reentrancy bug becomes a stepping stone toward arbitrary code execution.

Real-World Impact

Consider this scenario:

  1. A player downloads a community-made level file from an untrusted source.
  2. The level file contains a specially crafted SET_POWER_CONFIGURATION command with a malformed flag string.
  3. During parsing, a nested call to strtok() corrupts the tokenizer's internal state.
  4. The loop reads a garbage pointer, accessing memory it shouldn't.
  5. Combined with heap layout manipulation from V-004, this becomes a controlled write primitive.

The result? A game mod that executes attacker-supplied code on the player's machine — a classic drive-by code execution scenario through a trusted-looking game file.


The Fix

Replacing strtok() with strtok_r()

The fix is clean and surgical: every call to strtok() is replaced with strtok_r(), the reentrant, thread-safe variant. Instead of relying on hidden global state, strtok_r() stores its progress in a caller-supplied pointer (saveptr), making the state explicit and local.

Before (Vulnerable)

// BEFORE: Global state, not reentrant
char *flag = strtok(new_value, " ");
while (flag != NULL)
{
    j = get_long_id(powermodel_castability_commands, flag);
    if (j < 0)
    {
        DEALLOCATE_SCRIPT_VALUE
        return;
    }
    flag = strtok(NULL, " ");  // Relies on hidden global pointer
}

After (Fixed)

// AFTER: Explicit local state, fully reentrant
char *saveptr = NULL;
char *flag = strtok_r(new_value, " ", &saveptr);
while (flag != NULL)
{
    j = get_long_id(powermodel_castability_commands, flag);
    if (j < 0)
    {
        DEALLOCATE_SCRIPT_VALUE
        return;
    }
    flag = strtok_r(NULL, " ", &saveptr);  // Uses our local saveptr
}

Why This Works

| Property | strtok() | strtok_r() |
|---|---|---|
| State storage | Hidden static variable | Caller-provided pointer (saveptr) |
| Reentrant | ❌ No | ✅ Yes |
| Thread-safe | ❌ No | ✅ Yes |
| Nested call safe | ❌ No | ✅ Yes |
| Standard | ISO C and POSIX | POSIX (not ISO C) |

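By making the tokenizer state explicit (saveptr), the code no longer depends on strtok()'s hidden pointer. Each parsing operation owns its own state, so a nested or concurrent tokenization of a different string cannot interfere with it. Note that strtok_r(), like strtok(), still writes NUL terminators into the string it tokenizes, so each buffer must be owned by a single parser at a time.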

The fix was applied in two separate locations within set_power_configuration_check() — one for castability flags and one for properties flags — ensuring complete coverage of the vulnerable code paths.
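
The effect of separate save pointers can be checked with a small sketch of nested tokenization. The function names below are hypothetical, and the feature-test macro is only there so that strict compiler modes declare strtok_r().

```c
#define _POSIX_C_SOURCE 200809L  /* declares strtok_r() under strict -std= modes */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the '='-separated parts of one token,
 * keeping its tokenizer state in its own local save pointer. */
static int count_parts_r(const char *token)
{
    char copy[64];
    char *inner_save = NULL;
    int parts = 0;

    snprintf(copy, sizeof copy, "%s", token);
    for (char *p = strtok_r(copy, "=", &inner_save); p != NULL;
         p = strtok_r(NULL, "=", &inner_save))
        parts++;
    return parts;
}

/* Counts space-separated tokens in a writable string and sums the
 * parts of every token. The outer loop has its own save pointer, so
 * the nested loop cannot disturb it. */
int count_tokens_nested_strtok_r(char *line, int *total_parts)
{
    char *outer_save = NULL;
    int tokens = 0;

    *total_parts = 0;
    for (char *tok = strtok_r(line, " ", &outer_save); tok != NULL;
         tok = strtok_r(NULL, " ", &outer_save)) {
        tokens++;
        *total_parts += count_parts_r(tok);
    }
    return tokens;
}
```

For "a=1 b=2 c=3" this visits all three tokens and counts six parts. The input buffer is still modified in place (the spaces become NUL bytes), so callers should pass a private, writable copy.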


Prevention & Best Practices

1. Ban strtok() in Security-Sensitive Code

The simplest rule: treat strtok() as deprecated in any code that handles untrusted input or operates in a multithreaded context. Add a linter rule or compiler warning to flag its use.

# Example: list strtok() usage; -w matches whole words only, so strtok_r() is not reported
grep -rnw 'strtok' src/
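
A compile-time ban is another option on GCC and Clang, which both support `#pragma GCC poison`. The sketch below assumes the pragma is placed in a project-wide header after the system includes; count_flags is a hypothetical function used only to show that strtok_r() remains usable.

```c
#define _POSIX_C_SOURCE 200809L  /* declares strtok_r() under strict -std= modes */
#include <stddef.h>
#include <string.h>              /* system headers must precede the pragma */

/* From this point on, any use of the identifier strtok is a compile error.
 * strtok_r is a different identifier and is unaffected. */
#pragma GCC poison strtok

/* Hypothetical example: counts space-separated flags in a writable string. */
size_t count_flags(char *flags)
{
    char *saveptr = NULL;
    size_t n = 0;

    for (char *f = strtok_r(flags, " ", &saveptr); f != NULL;
         f = strtok_r(NULL, " ", &saveptr))
        n++;
    return n;
}
```

Unlike a grep in CI, this fails the build at the offending line, but it only covers translation units that include the header carrying the pragma.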

2. Always Use Reentrant Alternatives

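| Avoid | Use instead |
|---|---|
| strtok() | strtok_r() (POSIX); strtok_s() on MSVC or C11 Annex K, whose signatures differ from each other |
| strerror() | strerror_r() |
| localtime() | localtime_r() |
| rand() | rand_r() (marked obsolescent in POSIX.1-2008) or a platform CSPRNG |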
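
The same pattern applies to the other entries: the caller supplies the storage that the non-reentrant function would otherwise keep in a static buffer. As an illustration, here is a sketch using localtime_r(); format_date is a hypothetical helper, not part of the game's code.

```c
#define _POSIX_C_SOURCE 200809L  /* declares localtime_r() under strict -std= modes */
#include <stdio.h>
#include <time.h>

/* Formats a timestamp as YYYY-MM-DD in the local time zone.
 * Returns 0 on success and -1 if conversion fails or the buffer is too small. */
int format_date(time_t when, char *out, size_t out_size)
{
    struct tm parts;   /* caller-owned result, not a shared static buffer */

    if (localtime_r(&when, &parts) == NULL)
        return -1;
    if (strftime(out, out_size, "%Y-%m-%d", &parts) == 0)
        return -1;
    return 0;
}
```

Because the struct tm lives on the caller's stack, two threads (or a nested call) can format dates at the same time without overwriting each other's result.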

3. Validate All Script/Config Input Before Parsing

Before tokenizing user-supplied strings, enforce constraints:

// Check length before parsing
size_t len = strlen(new_value);
if (len > MAX_CONFIG_VALUE_LENGTH) {
    SCRIPT_ERRORF("Configuration value too long");
    return;
}

// Allow-list permitted characters: strspn() returns the length of the
// leading run made up only of ALLOWED_FLAG_CHARS
if (strspn(new_value, ALLOWED_FLAG_CHARS) != len) {
    SCRIPT_ERRORF("Invalid characters in configuration value");
    return;
}
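
For a version that compiles on its own, the checks can be wrapped in a small predicate. The limit of 255 bytes, the allowed character set, and the name flag_string_is_valid are all assumptions made for this sketch, not values from the game's source.

```c
#define _POSIX_C_SOURCE 200809L  /* declares strnlen() under strict -std= modes */
#include <stdbool.h>
#include <string.h>

#define MAX_CONFIG_VALUE_LENGTH 255  /* assumed limit */
#define ALLOWED_FLAG_CHARS \
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ_0123456789 "  /* assumed allow-list */

/* Returns true only for a non-empty string within the length limit that
 * consists solely of allowed characters. */
bool flag_string_is_valid(const char *value)
{
    size_t len;

    if (value == NULL)
        return false;

    /* strnlen() stops scanning once the limit is exceeded */
    len = strnlen(value, MAX_CONFIG_VALUE_LENGTH + 1);
    if (len == 0 || len > MAX_CONFIG_VALUE_LENGTH)
        return false;

    /* the whole string must be one run of allowed characters */
    return strspn(value, ALLOWED_FLAG_CHARS) == len;
}
```

Running the predicate before tokenizing keeps malformed values away from the parsing loop entirely, so the loop only has to handle input that already fits the expected shape.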

4. Treat Level/Mod Files as Untrusted Input

Game engines often treat local files as implicitly trusted. This is a dangerous assumption in an era of:
- Modding communities sharing files on third-party platforms
- Malicious mods distributed through compromised accounts
- Social engineering attacks targeting gamers

Apply the same rigor to file parsing that you would to network input.

5. Use Static Analysis Tools

Tools that can catch strtok() misuse and related issues (the last entry is a runtime tool rather than a static analyzer):

  • Clang tooling: clang-tidy's concurrency-mt-unsafe check flags calls to functions that are not thread-safe, such as strtok()
  • Coverity: flags thread-safety issues
  • cppcheck: general C/C++ static analysis
  • Semgrep: custom rules for banning specific function calls
  • AddressSanitizer (ASan): runtime detection of memory corruption while tests or fuzzers run

6. Understand the CWE Landscape

This vulnerability relates to:

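  • CWE-663: Use of a Non-reentrant Function in a Concurrent Context (the closest match for the strtok() issue itself)
  • CWE-190: Integer Overflow or Wraparound (referenced in the PR for the broader vulnerability class)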
  • CWE-364: Signal Handler Race Condition (same class of reentrancy issues)
  • CWE-119: Improper Restriction of Operations within the Bounds of a Memory Buffer
  • CWE-416: Use After Free (chained vulnerability V-005)

Understanding how these CWEs chain together is critical for threat modeling complex parsers.

7. Fuzz Your Parsers

Level script parsers are a prime target for fuzzing. Tools like AFL++ or libFuzzer can generate malformed input that triggers exactly the kind of edge cases that lead to reentrancy corruption:

# Example: fuzz the script parser with AFL++
afl-fuzz -i seed_scripts/ -o findings/ -- ./game --parse-script @@
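
For in-process fuzzing, a libFuzzer harness needs only the LLVMFuzzerTestOneInput entry point. The sketch below is an assumption-laden stand-in: parse_flags is a hypothetical placeholder for the real script parser, which would be linked in instead.

```c
#define _POSIX_C_SOURCE 200809L  /* declares strtok_r() under strict -std= modes */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the parser under test: counts space-separated flags. */
static size_t parse_flags(char *value)
{
    char *saveptr = NULL;
    size_t count = 0;

    for (char *flag = strtok_r(value, " ", &saveptr); flag != NULL;
         flag = strtok_r(NULL, " ", &saveptr))
        count++;
    return count;
}

/* libFuzzer entry point: called once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    /* Fuzzer input is not NUL-terminated, so parse a terminated copy. */
    char *copy = malloc(size + 1);
    if (copy == NULL)
        return 0;
    if (size > 0)
        memcpy(copy, data, size);
    copy[size] = '\0';

    (void)parse_flags(copy);

    free(copy);
    return 0;
}
```

Assuming the file is saved as harness.c, it can be built with `clang -g -fsanitize=fuzzer,address harness.c -o harness`, which links the libFuzzer driver and enables AddressSanitizer so that out-of-bounds accesses abort the run with a report.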

Conclusion

The strtok() → strtok_r() migration might look like a minor code quality improvement, but in the context of a game engine parsing untrusted level files, it closes a real attack vector. The key lessons from this vulnerability are:

  • Global state is a security risk — functions that hide state in static variables are inherently dangerous in complex, reentrant codebases.
  • Parser code deserves adversarial scrutiny — any code that processes user-controlled files should be treated with the same care as a web application handling HTTP requests.
  • Vulnerability chaining is real — this reentrancy bug alone might seem low-risk, but combined with integer overflows and use-after-free conditions in the same file, it becomes a pathway to code execution.
  • The fix is simple; the discovery is hard — two-line changes like this one are easy to make once you know where to look. Automated security scanning and code review are essential to surface these issues before attackers do.

Secure coding in C isn't about avoiding the language — it's about knowing which functions carry hidden risks and choosing safer alternatives consistently. When in doubt, reach for the reentrant variant.


This vulnerability was identified and fixed by automated security scanning. Automated tools can catch subtle issues like non-reentrant function usage at scale — consider integrating security scanning into your CI/CD pipeline.

Security fix: see pull request #4800.
