
Critical Buffer Overflow in OJ's fast.c: How an Unsafe strcpy Nearly Opened the Door to RCE

A critical buffer overflow vulnerability was discovered and patched in the popular OJ Ruby JSON library's fast.c parser, where an unbounded strcpy call allowed attacker-controlled JSON input to overwrite adjacent memory. Left unpatched, this classic CWE-120 flaw could enable arbitrary code execution in any application parsing untrusted JSON with the affected library. The fix eliminates the unsafe copy operation, closing a potential remote code execution vector that affected countless Ruby applications.

By orbisai0security
•May 15, 2026

Severity: 🔴 Critical | CWE Class: CWE-120 (Buffer Copy without Checking Size of Input) | Affected Component: ext/oj/fast.c:92


Introduction

If your Ruby application parses JSON — and chances are it does — you may be using OJ (Optimized JSON), one of the most widely adopted JSON libraries in the Ruby ecosystem. OJ is beloved for its blazing-fast C-extension parser, but that speed comes with a responsibility: C code lives close to the metal, and a single unsafe memory operation can turn a JSON parser into an attacker's playground.

A critical vulnerability was recently discovered and patched in OJ's fast.c parser: an unbounded strcpy call at line 92 that blindly copies attacker-controlled JSON data into a fixed-size buffer with zero bounds checking. This is a textbook buffer overflow — the kind of bug that has haunted C codebases for decades and continues to be one of the most dangerous classes of vulnerabilities in existence.

This post breaks down exactly what went wrong, how an attacker could have exploited it, and what the fix looks like — so you can write safer code and understand why memory safety matters even in high-level language ecosystems.


The Vulnerability Explained

What Is a Buffer Overflow?

A buffer overflow occurs when a program writes more data into a memory buffer than it was allocated to hold. The excess data spills into adjacent memory regions, potentially overwriting critical data structures, return addresses, or function pointers.

In C, the strcpy function is a notorious offender:

// DANGEROUS: strcpy copies until it hits a null terminator.
// It has NO idea how big the destination buffer is.
strcpy(destination, source);

strcpy will copy every byte from source into destination until it encounters a null byte (\0). If source is longer than the space allocated for destination, it keeps writing anyway — straight into whatever memory comes next.
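To make that arithmetic concrete, here is a minimal sketch that computes how many bytes strcpy would write and how many of them would land past the end of a destination of a given size. The helper names strcpy_bytes_written and overflow_amount are hypothetical, introduced only for illustration, and the code never performs the unsafe copy itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes strcpy(dst, src) writes,
 * i.e. every character of src plus the terminating '\0'. */
size_t strcpy_bytes_written(const char *src) {
    return strlen(src) + 1;
}

/* Hypothetical helper: how many of those bytes would fall outside a
 * destination of dst_size bytes (0 means the copy would fit). */
size_t overflow_amount(const char *src, size_t dst_size) {
    size_t needed = strcpy_bytes_written(src);
    return needed > dst_size ? needed - dst_size : 0;
}
```

For a 4-byte destination, "abc" fits exactly (three characters plus the terminator), while "abcd" would write one byte past the end.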

The Vulnerable Code Pattern

At line 92 of ext/oj/fast.c, the parser contained an unsafe strcpy call where the source string was derived directly from attacker-controlled JSON input:

// VULNERABLE (simplified representation)
char dest_buffer[256];  // Fixed-size destination buffer

// 'source' comes from parsed JSON — attacker controls this!
strcpy(dest_buffer, source);  // ❌ No bounds check whatsoever

The critical problem here is the trust boundary violation: data flowing in from a JSON payload (which any external user can craft) is being copied directly into a fixed-size stack or heap buffer without any length validation.

How Could an Attacker Exploit This?

Let's walk through a concrete attack scenario:

Step 1 — Craft a malicious payload

An attacker constructs a JSON document with an abnormally long string value designed to overflow the buffer:

{
  "key": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[SHELLCODE_HERE]"
}

Step 2 — Trigger the vulnerable code path

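The attacker sends this payload to any endpoint in your application whose JSON handling reaches the code in fast.c. In Oj's source tree that file implements the Oj::Doc API, so calls such as Oj::Doc.open() are the most direct route; whether Oj.load(), Oj.safe_load(), or a wrapper around them ever reaches fast.c is worth confirming against the version you run rather than assuming.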

Step 3 — Memory corruption occurs

When strcpy copies the oversized string:

Memory layout (before overflow):
[dest_buffer: 256 bytes][saved_frame_pointer][return_address][...]

Memory layout (after overflow):
[dest_buffer: 256 bytes][AAAAAAAAAA...][ATTACKER_CONTROLLED_DATA]
                                       ↑
                              Return address overwritten!
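The layout above can be explored safely with a small simulation. The sketch below models the frame as a flat byte array and assumes 8-byte slots for the saved frame pointer and return address; real frames vary by compiler, ABI, and optimization level, and the function name is hypothetical. The copy is clamped to the array, so the simulation itself never overflows.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layout for illustration only. */
enum {
    BUF_SIZE   = 256,                   /* dest_buffer */
    SLOT_SIZE  = 8,                     /* one saved pointer */
    RET_OFFSET = BUF_SIZE + SLOT_SIZE,  /* start of the return address */
    FRAME_SIZE = BUF_SIZE + 2 * SLOT_SIZE
};

/* Returns 1 if a payload of payload_len 'A' bytes, copied without a
 * bounds check starting at dest_buffer, changes the simulated return
 * address; returns 0 otherwise. */
int overflow_reaches_return_address(size_t payload_len) {
    unsigned char frame[FRAME_SIZE];
    unsigned char payload[FRAME_SIZE];
    int i;

    memset(frame, 0x00, BUF_SIZE);                /* dest_buffer */
    memset(frame + BUF_SIZE, 0x11, SLOT_SIZE);    /* saved frame pointer */
    memset(frame + RET_OFFSET, 0x22, SLOT_SIZE);  /* return address */
    memset(payload, 'A', sizeof payload);

    /* Clamp so the simulation never leaves the frame array. */
    if (payload_len > FRAME_SIZE) {
        payload_len = FRAME_SIZE;
    }
    memcpy(frame, payload, payload_len);          /* the "unchecked" copy */

    for (i = 0; i < SLOT_SIZE; i++) {
        if (frame[RET_OFFSET + i] != 0x22) {
            return 1;
        }
    }
    return 0;
}
```

Under these assumptions, a 256-byte payload exactly fills the buffer, 264 bytes also overwrite the saved frame pointer, and the 265th byte is the first to touch the return address.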

Step 4 — Arbitrary code execution

By carefully crafting the overflow payload, a sophisticated attacker can:
- Overwrite the return address on the stack to redirect execution to attacker-supplied shellcode
- Corrupt heap metadata to manipulate future memory allocations
- Overwrite function pointers to hijack control flow
- Achieve Remote Code Execution (RCE) — running arbitrary commands on the server

Real-World Impact

The impact of this vulnerability is severe because:

  1. JSON parsing is ubiquitous — virtually every modern web application parses JSON from external sources (API requests, webhooks, file uploads)
  2. OJ is widely deployed — with tens of millions of downloads, the blast radius is enormous
  3. No authentication required — any unauthenticated endpoint that accepts JSON could be a vector
  4. Full system compromise — successful exploitation means the attacker runs code with the same privileges as your application process

Consider a typical Rails API endpoint:

# This innocent-looking code could trigger the vulnerability
class ApiController < ApplicationController
  def create
    # params parsed via OJ under the hood
    data = Oj.load(request.body.read)
    # process data...
  end
end

An attacker hitting this endpoint with a crafted payload could potentially own the entire server.


The Fix

What Changed

The fix in ext/oj/fast.c removes the unsafe strcpy call and replaces it with a bounds-aware alternative. The correct approach in C is to use strncpy or, better yet, strlcpy (where available), combined with explicit length validation:

// SAFE: Always validate length before copying
size_t src_len = strlen(source);

if (src_len >= sizeof(dest_buffer)) {
    // Input too long: rb_raise() unwinds to the Ruby caller and does not return
    rb_raise(rb_eArgError, "string value too long");
}

// Now safe — we've verified source fits in destination
strncpy(dest_buffer, source, sizeof(dest_buffer) - 1);
dest_buffer[sizeof(dest_buffer) - 1] = '\0';  // Guarantee null termination

Or using the even safer strlcpy pattern:

// strlcpy always null-terminates and returns the length
// that *would* have been copied — use this to detect truncation
size_t copied = strlcpy(dest_buffer, source, sizeof(dest_buffer));

if (copied >= sizeof(dest_buffer)) {
    // Truncation occurred: rb_raise() unwinds to the Ruby caller and does not return
    rb_raise(rb_eArgError, "string value exceeds maximum length");
}
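Because strlcpy is not available in every C library and rb_raise only exists inside a Ruby extension, the same check-then-copy idea can be sketched in portable C. The helper below, copy_if_fits, is a hypothetical name and not part of OJ; it is a minimal illustration of the pattern, not the code from the actual patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, including
 * the terminating '\0'. Returns 0 on success, -1 if src is too long
 * (dst is then set to the empty string when it has any capacity). */
int copy_if_fits(char *dst, size_t dst_size, const char *src) {
    size_t len = strlen(src);

    if (dst_size == 0) {
        return -1;
    }
    if (len >= dst_size) {
        dst[0] = '\0';           /* leave dst in a defined state */
        return -1;
    }
    memcpy(dst, src, len + 1);   /* length already verified above */
    return 0;
}
```

Returning an error instead of silently truncating is a deliberate choice here: a truncated JSON string is still wrong data, so the caller should get the chance to reject the input.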

Why This Fix Works

The fix addresses the root cause by establishing a trust boundary between external input and internal memory operations:

Aspect            | Before (Vulnerable)  | After (Fixed)
Length check      | ❌ None              | ✅ Explicit validation
Overflow possible | ✅ Yes               | ❌ No
Attacker control  | ✅ Full              | ❌ Bounded
Error handling    | ❌ Silent corruption | ✅ Raises exception

The key insight is simple but profound: never trust the length of externally supplied data. Always validate before copying.


Prevention & Best Practices

1. Ban Unsafe String Functions in C Code

Establish a coding standard that prohibits known-unsafe C functions. Many teams use compiler warnings or static analysis to enforce this:

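// ❌ NEVER use these without bounds checking:
strcpy(dst, src);
strcat(dst, src);
sprintf(buf, fmt, ...);
gets(buf);  // removed from the language in C11; no safe use exists

// ✅ USE these bounded alternatives (dst and buf must be arrays, not
// pointers, for sizeof to report the buffer size):
strncpy(dst, src, sizeof(dst) - 1);
dst[sizeof(dst) - 1] = '\0';  // strncpy does not terminate on truncation
strncat(dst, src, sizeof(dst) - strlen(dst) - 1);
snprintf(buf, sizeof(buf), fmt, ...);
fgets(buf, sizeof(buf), stdin);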
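The size arithmetic in the strncat call above is easy to get wrong by one byte, so some teams wrap it in a helper. The sketch below is one possible shape for such a wrapper; append_if_fits is a hypothetical name, and the choice to refuse rather than truncate is an assumption, not something the OJ fix prescribes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends suffix to the string already in buf.
 * Returns 0 on success, -1 if buf is not terminated within buf_size
 * bytes or the result would not fit (buf is left unchanged then). */
int append_if_fits(char *buf, size_t buf_size, const char *suffix) {
    const char *end = memchr(buf, '\0', buf_size);
    size_t used, add;

    if (end == NULL) {
        return -1;                     /* buf is not a valid string */
    }
    used = (size_t)(end - buf);
    add = strlen(suffix);

    /* Need add bytes plus one terminator in the remaining space. */
    if (add >= buf_size - used) {
        return -1;
    }
    memcpy(buf + used, suffix, add + 1);
    return 0;
}
```

With an 8-byte buffer holding "abcde", appending "fg" succeeds (5 + 2 + 1 = 8 bytes), while appending "fgh" is refused.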

2. Validate All External Input Before Processing

Every byte that crosses a trust boundary (network, file, user input) must be validated:

// Validate length BEFORE any memory operation
if (input_length > MAX_ALLOWED_LENGTH) {
    return ERROR_INPUT_TOO_LONG;
}
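Length checks have a pitfall of their own: when the check adds an offset and a length, the sum can wrap around and pass. The sketch below shows one way to write the check with subtraction instead. The function name range_is_valid and the value given to MAX_ALLOWED_LENGTH are assumptions for illustration.

```c
#include <stddef.h>

#define MAX_ALLOWED_LENGTH 255u   /* assumed limit for illustration */

/* Returns 1 if [offset, offset + len) lies inside an input of
 * input_size bytes and len does not exceed MAX_ALLOWED_LENGTH.
 * Uses subtraction so that offset + len is never computed and
 * therefore cannot wrap around. */
int range_is_valid(size_t input_size, size_t offset, size_t len) {
    if (len > MAX_ALLOWED_LENGTH) {
        return 0;
    }
    if (offset > input_size) {
        return 0;
    }
    return len <= input_size - offset;
}
```

A naive offset + len <= input_size test would accept an offset near SIZE_MAX, because the addition wraps to a small number; the subtraction form rejects it.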

3. Use Memory-Safe Languages Where Possible

Consider whether the performance-critical path actually needs to be in C. Languages like Rust provide memory safety guarantees at the language level:

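// Safe Rust bounds-checks every access, so this class of overflow
// cannot occur outside `unsafe` code.
const MAX_LENGTH: usize = 255; // hypothetical limit for illustration

#[derive(Debug)]
enum Error { InputTooLong }

fn process_json_string(input: &str) -> Result<String, Error> {
    if input.len() > MAX_LENGTH {
        return Err(Error::InputTooLong);
    }
    // Safe string operations, no manual memory management
    Ok(input.to_string())
}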

4. Enable Compiler Protections

Modern compilers and operating systems provide multiple layers of protection:

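# Enable stack canaries, fortified libc calls, PIE (so ASLR covers the
# binary), and full RELRO. _FORTIFY_SOURCE only takes effect with optimization on.
gcc -O2 -fstack-protector-strong \
    -D_FORTIFY_SOURCE=2 \
    -pie -fPIE \
    -Wl,-z,relro,-z,now \
    your_code.c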

These don't prevent the vulnerability, but they make exploitation significantly harder.

5. Use Static Analysis Tools

Integrate static analysis into your CI/CD pipeline to catch these issues automatically:

# Example: GitHub Actions with CodeQL
# (v2 of the action has been deprecated; pin to a currently supported major version)
- name: Initialize CodeQL
  uses: github/codeql-action/init@v3
  with:
    languages: cpp
    queries: security-extended

- name: Perform CodeQL Analysis
  uses: github/codeql-action/analyze@v3

Other excellent tools for C/C++ analysis:
- Coverity — deep static analysis for C/C++
- AddressSanitizer (ASan) — runtime memory error detection
- Valgrind — memory debugging and leak detection
- Clang Static Analyzer — built into the LLVM toolchain

6. Fuzz Test Your Parsers

Parsers are especially vulnerable to malformed input. Fuzzing automatically generates edge-case inputs:

# Using AFL++ for fuzzing
afl-fuzz -i input_corpus/ -o findings/ -- ./your_parser @@
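A fuzzer needs a small harness that hands each generated input to the code under test. The sketch below uses the libFuzzer entry point LLVMFuzzerTestOneInput rather than the file-based @@ mode shown above, so it is a swapped-in technique, not the AFL++ setup itself. parse_field and FIELD_MAX are hypothetical stand-ins for whatever parser routine you actually want to exercise.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FIELD_MAX 256   /* assumed destination size for illustration */

/* Hypothetical function under test: stands in for a real parser
 * routine. Returns the stored string length, or -1 if the input
 * does not fit. */
static int parse_field(const uint8_t *data, size_t size) {
    char field[FIELD_MAX];

    if (size >= sizeof field) {
        return -1;                 /* reject oversized input */
    }
    if (size > 0) {
        memcpy(field, data, size);
    }
    field[size] = '\0';
    return (int)strlen(field);
}

/* libFuzzer-style entry point: called once per generated input.
 * A crash or sanitizer report during the call marks a finding. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    (void)parse_field(data, size);
    return 0;
}
```

With clang, a harness like this is typically built with `clang -g -O1 -fsanitize=fuzzer,address harness.c -o harness`; pairing the fuzzer with AddressSanitizer is what turns a silent out-of-bounds write into a reported crash.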

Security Standards Reference

This vulnerability maps to several well-known security standards:

  • CWE-120: Buffer Copy without Checking Size of Input ('Classic Buffer Overflow')
  • CWE-121: Stack-based Buffer Overflow
  • OWASP A06:2021: Vulnerable and Outdated Components (the category that covers applications running an unpatched copy of the library)
  • CERT C Coding Standard STR31-C: Guarantee that storage for strings has sufficient space for character data and the null terminator

Conclusion

This vulnerability in OJ's fast.c is a stark reminder that security vulnerabilities don't respect language boundaries. You might write beautiful, safe Ruby code all day long, but if a C extension underneath you has an unsafe strcpy, your entire application's security posture is compromised.

The key takeaways from this vulnerability:

  1. Unsafe C functions like strcpy are never acceptable when handling externally-supplied data — full stop
  2. Trust boundaries matter: data from JSON payloads is attacker-controlled and must be treated with suspicion
  3. Defense in depth works: compiler protections like stack canaries and ASLR raise the bar for exploitation even when bugs slip through
  4. Automated tooling catches what humans miss: static analysis and fuzzing are well suited to flagging an unbounded strcpy like this one before it reaches production
  5. Memory safety is a first-class concern: when performance allows, prefer memory-safe languages for parsing untrusted input

The security community's ability to find, responsibly disclose, and patch vulnerabilities like this one is what keeps the open-source ecosystem trustworthy. If you maintain C extensions or native libraries, consider auditing your code for similar patterns today — your users are counting on you.


Found a security vulnerability? Practice responsible disclosure by contacting the project maintainers privately before going public. Most projects have a SECURITY.md or a security contact in their repository.

This post was generated as part of an automated security fix workflow by OrbisAI Security. Automated detection + human review = faster, safer software for everyone.

View the Security Fix

The pull request that fixed this vulnerability: PR #1011
