Critical Buffer Overflow in OJ's fast.c: How an Unsafe strcpy Almost Broke Everything

A critical buffer overflow vulnerability was discovered in the `fast.c` file of OJ, a popular Ruby JSON library, where an unbounded `strcpy` call allowed attacker-controlled JSON input to overwrite adjacent memory. Left unpatched, this flaw could enable arbitrary code execution on any system parsing untrusted JSON with the affected library. The fix eliminates the unsafe copy operation, closing a textbook CWE-120 vulnerability in the codebase.

By orbisai0security
May 12, 2026
#buffer-overflow #c-security #ruby-gems #cwe-120 #memory-safety #json-parsing #arbitrary-code-execution

Severity: 🔴 Critical | Weakness Type: CWE-120 (Buffer Copy without Checking Size of Input) | Affected File: ext/oj/fast.c:92


Introduction

If your Ruby application parses JSON — and let's be honest, nearly every modern Ruby app does — there's a good chance you've relied on OJ (Optimized JSON), one of the fastest and most widely-used JSON libraries in the Ruby ecosystem. That's what makes today's vulnerability disclosure particularly important.

A critical buffer overflow vulnerability was identified and patched in ext/oj/fast.c, a core C extension file responsible for OJ's high-performance JSON parsing. The flaw stems from a single, dangerous call to strcpy that blindly copies attacker-influenced data into a fixed-size buffer with absolutely no bounds checking.

This is not a theoretical edge case. This is the kind of vulnerability that security researchers and malicious actors actively hunt for — because when exploited successfully, it can lead to arbitrary code execution on the target server.

Whether you're a Ruby developer, a DevSecOps engineer, or just someone who cares about the security of the open-source software supply chain, understanding this vulnerability matters. Let's break it down.


The Vulnerability Explained

What Is a Buffer Overflow?

Before diving into the specifics, let's establish the foundation. A buffer overflow occurs when a program writes more data into a memory buffer than the buffer was designed to hold. The excess data spills into adjacent memory regions, potentially overwriting critical data structures, function pointers, or return addresses.

In C, this is dangerously easy to do — especially with functions like strcpy, which copies a source string to a destination buffer and stops only when it encounters a null terminator (\0), completely ignoring the size of the destination.
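
To make that behavior concrete, here is a minimal sketch of the loop that strcpy effectively performs. The name sketch_strcpy is hypothetical and this is not the libc implementation (real versions are heavily optimized), but the contract is the same: the function is never told how large the destination is, so it has no way to stop early.

```c
#include <stddef.h>

/* Hypothetical re-implementation for illustration only. */
size_t sketch_strcpy(char *dst, const char *src)
{
    size_t i = 0;

    /* Copy until the source's terminator; dst's capacity is never consulted. */
    while (src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';

    return i + 1; /* total bytes written, terminator included */
}
```

The only thing that bounds the loop is the content of src, which is exactly the part an attacker controls.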

The Vulnerable Code

The vulnerability lives at line 92 of ext/oj/fast.c:

// VULNERABLE CODE (simplified representation)
char dest_buffer[256];  // Fixed-size destination buffer
const char *src = extract_value_from_json(json_input); // Attacker-controlled!

strcpy(dest_buffer, src); // 💣 No bounds check — DANGEROUS

Here's exactly what's wrong:

  1. dest_buffer is a fixed-size stack or heap-allocated buffer (e.g., 256 bytes).
  2. src is derived from JSON input provided by an external caller — meaning an attacker controls its content and, critically, its length.
  3. strcpy copies bytes from src to dest_buffer until it hits a null byte — with no regard for whether dest_buffer has enough space.

If an attacker crafts a JSON payload where the relevant string value is 256 characters or longer, the string plus its null terminator no longer fits in the 256-byte buffer, so the copy operation overflows it and starts writing into adjacent memory.
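
The arithmetic is easy to get wrong by one, so a small sketch may help. The helper name overflow_bytes is an assumption made for illustration; it only computes how far past the end an unbounded copy would write and performs no copy itself.

```c
#include <stddef.h>
#include <string.h>

/* Returns how many bytes strcpy would write past the end of a destination
 * of dest_size bytes; 0 means the copy fits. */
size_t overflow_bytes(size_t dest_size, const char *src)
{
    size_t needed = strlen(src) + 1; /* characters plus the '\0' terminator */

    return needed > dest_size ? needed - dest_size : 0;
}
```

For a 256-byte destination, a 255-character string fits exactly, while a 256-character string already spills one byte (the terminator) into adjacent memory.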

How Could It Be Exploited?

The exploitation path depends on where the buffer lives (stack vs. heap) and what surrounds it in memory, but both scenarios are serious:

Stack-Based Exploitation

If dest_buffer is on the stack, an overflow can overwrite:
- Local variables in the same function
- Saved frame pointers
- Return addresses — the classic target for redirecting execution flow

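By carefully crafting the overflow payload, an attacker can redirect the program's execution to code of their choosing. Where the stack is executable, that can be directly injected shellcode; on modern systems with non-executable stacks, attackers typically chain together existing code fragments instead, a technique known as return-oriented programming (ROP). Mitigations such as stack canaries and ASLR raise the bar but do not remove the underlying bug.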

Heap-Based Exploitation

If the buffer is heap-allocated, an overflow can corrupt:
- Heap metadata (chunk headers used by malloc/free)
- Adjacent heap objects, including function pointers or vtables
- Other application data that influences security decisions

A Real-World Attack Scenario

Imagine a web application that accepts JSON payloads from users and processes them with OJ:

# A typical Rails controller endpoint
def process_data
  data = Oj.load(request.body.read)
  # ... do something with data
end

An attacker sends the following HTTP request:

POST /api/process HTTP/1.1
Content-Type: application/json

{
  "username": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[SHELLCODE_OR_ROP_CHAIN]"
}

When OJ's fast.c processes this payload and the vulnerable strcpy executes, the oversized username value overflows dest_buffer, corrupting adjacent memory. With enough knowledge of the target environment (obtainable through reconnaissance or information leakage), an attacker can craft the payload to achieve remote code execution (RCE).

The Real-World Impact

The consequences of successful exploitation include:

  • 🔓 Remote Code Execution (RCE) — Full control over the server process
  • 📂 Data Exfiltration — Access to databases, environment variables, secrets
  • 🔄 Lateral Movement — Using the compromised server as a pivot point
  • 💥 Denial of Service — Crashing the process with a malformed payload (even without code execution)
  • 🔑 Privilege Escalation — If the Ruby process runs with elevated privileges

Any application that parses untrusted JSON input using a vulnerable version of OJ is potentially exposed. This includes APIs, webhooks, data ingestion pipelines, and any endpoint that accepts user-supplied JSON.


The Fix

What Changed

The fix in this pull request targets ext/oj/fast.c directly, replacing the unsafe strcpy call with a bounds-aware alternative. The core principle: never copy data into a fixed buffer without first verifying — and enforcing — a size limit.

Before (Vulnerable):

// ❌ BEFORE: Unbounded copy — trusts that src fits in dest_buffer
char dest_buffer[256];
const char *src = get_json_string_value(node);

strcpy(dest_buffer, src);  // CWE-120: No length check!

After (Fixed):

// ✅ AFTER: Bounded copy — enforces maximum copy length
char dest_buffer[256];
const char *src = get_json_string_value(node);
size_t src_len = strlen(src);

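if (src_len >= sizeof(dest_buffer)) {
    // Handle error: input too large.
    // rb_raise does not return (it unwinds to the Ruby exception handler),
    // so no return statement is needed after it.
    rb_raise(rb_eArgError, "JSON string value exceeds maximum allowed length");
}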

strncpy(dest_buffer, src, sizeof(dest_buffer) - 1);
dest_buffer[sizeof(dest_buffer) - 1] = '\0';  // Ensure null termination

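Alternatively, strlcpy can be used where available. It always null-terminates but truncates silently, so check its return value if truncation must be treated as an error:

// ✅ ALTERNATIVE: strlcpy (BSD/macOS, glibc 2.38+) or a safe wrapper
if (strlcpy(dest_buffer, src, sizeof(dest_buffer)) >= sizeof(dest_buffer)) {
    // src did not fit and was truncated; handle as an error
}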
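
Where strlcpy is not available, the "safe wrapper" mentioned above could look something like the following. This is a minimal sketch under stated assumptions, not code from the OJ patch: the name copy_checked and its return convention are hypothetical, and it rejects oversized input rather than truncating it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, terminator
 * included. Returns 0 on success, -1 if src is too long (dst is set to ""). */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1;
    }

    len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0'; /* leave a valid empty string rather than stale data */
        return -1;
    }

    memcpy(dst, src, len + 1); /* len + 1 also copies the '\0' */
    return 0;
}
```

Because the length is already known after the check, memcpy is sufficient here; the caller decides how to report the -1 case.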

How Does This Fix the Problem?

The fix addresses the vulnerability in two ways:

  1. Explicit length validation: By checking src_len >= sizeof(dest_buffer) before copying, the code rejects inputs that would overflow the buffer, raising a Ruby exception instead of corrupting memory.

  2. Bounded copy operation: Even in the strncpy path, the copy is capped at sizeof(dest_buffer) - 1 bytes, with manual null termination ensuring the buffer is always properly terminated.

The result: no matter what an attacker puts in their JSON payload, the C extension will never write beyond the bounds of dest_buffer.
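
The manual null termination matters because strncpy on its own does not terminate the destination when the source is as long as, or longer than, the byte limit. The sketch below isolates that idiom; copy_truncating is a hypothetical name, and unlike the patched code it truncates instead of raising an error, purely to show the termination behavior.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper showing the strncpy-plus-terminator idiom.
 * Returns 1 if src had to be truncated, 0 otherwise. Requires dst_size > 0. */
int copy_truncating(char *dst, size_t dst_size, const char *src)
{
    /* strncpy writes at most dst_size - 1 bytes here and does NOT add a
     * terminator itself when src is that long or longer. */
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';

    return strlen(src) >= dst_size;
}
```

Dropping the explicit terminator line would leave dst unterminated for long inputs, which turns a fixed overflow into a later out-of-bounds read.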

Why This Matters Beyond the One Line

Fixing a single strcpy might seem trivial, but the implications are significant:

  • Trust boundary enforcement: The fix establishes a clear rule — attacker-supplied data cannot exceed a defined size limit before entering native code.
  • Defense in depth: Even if other validation layers fail, the C extension itself now rejects oversized inputs.
  • Predictable failure mode: Instead of undefined behavior (memory corruption), the code now fails safely with a catchable Ruby exception.

Prevention & Best Practices

This vulnerability is a textbook example of why C code handling untrusted input requires extreme care. Here's how to prevent similar issues in your own projects:

1. Never Use strcpy or strcat with External Data

These functions are fundamentally unsafe when the source is attacker-controlled. Prefer:

| Unsafe Function | Safer Alternative | Notes |
|---|---|---|
| strcpy | strncpy + manual null termination | Always specify buffer size |
| strcpy | strlcpy | BSD/macOS; copies at most n-1 chars |
| strcat | strncat | Limits appended bytes |
| gets | fgets | gets was removed in C11 |
| sprintf | snprintf | Always specify buffer size |
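
As an example of the last row, snprintf reports the length it would have needed, which makes truncation detectable. The helper below is a sketch; format_pair and its "key=value" format are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats "key=value" into out.
 * Returns 0 on success, -1 on truncation or encoding error. */
int format_pair(char *out, size_t out_size, const char *key, const char *value)
{
    int n = snprintf(out, out_size, "%s=%s", key, value);

    /* snprintf returns the length it wanted to write (excluding '\0'),
     * so n >= out_size means the output was cut short. */
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    return 0;
}
```

Ignoring the return value keeps the buffer safe but hides truncation, so checking it is usually worth the extra lines.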

2. Validate Input Length Before Processing

// Always check BEFORE copying
if (input_length > MAX_ALLOWED_SIZE) {
    // Reject the input — don't try to truncate silently
    return ERROR_INPUT_TOO_LARGE;
}

Fail loudly and early. Silent truncation can introduce logic vulnerabilities even when it prevents memory corruption.
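
A slightly fuller sketch of that check is shown below, for the case where the input arrives as a buffer with an explicit length and may not be null-terminated at all. The status names, the validate_input function, and the 255-character limit are all assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

enum { MAX_ALLOWED_SIZE = 255 }; /* hypothetical limit, excluding the '\0' */

typedef enum {
    INPUT_OK = 0,
    ERROR_INPUT_TOO_LARGE,
    ERROR_INPUT_UNTERMINATED
} input_status;

/* Checks a buffer of buf_len bytes that is expected to hold a C string. */
input_status validate_input(const char *buf, size_t buf_len)
{
    /* memchr never reads past buf_len, unlike strlen on unterminated data. */
    const char *end = memchr(buf, '\0', buf_len);

    if (end == NULL) {
        return ERROR_INPUT_UNTERMINATED;
    }
    if ((size_t)(end - buf) > MAX_ALLOWED_SIZE) {
        return ERROR_INPUT_TOO_LARGE;
    }
    return INPUT_OK;
}
```

Distinct error codes let the caller report exactly why the input was rejected instead of silently truncating it.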

3. Use Modern Compiler Protections

Enable these compiler flags for C/C++ projects:

CFLAGS += -O2 -D_FORTIFY_SOURCE=2 # Detects some buffer overflows at runtime (needs optimization enabled)
CFLAGS += -fstack-protector-strong # Stack canaries
CFLAGS += -Wall -Wextra           # Enable warnings (including -Wstringop-overflow)
LDFLAGS += -z relro -z now        # RELRO: makes GOT read-only

4. Use Static Analysis Tools

Integrate these tools into your CI/CD pipeline to catch strcpy and similar issues automatically:

  • Coverity — Industry-standard static analyzer, free for open source
  • Clang Static Analyzer — Built into LLVM, catches many memory issues
  • cppcheck — Lightweight, easy to integrate
  • Semgrep — Rule-based scanner with rules for dangerous C functions
  • CodeQL — GitHub's semantic code analysis engine

Example Semgrep rule to catch strcpy:

rules:
  - id: dangerous-strcpy
    patterns:
      - pattern: strcpy($DST, $SRC)
    message: "Unsafe strcpy call. Use strncpy or strlcpy with explicit size limits."
    languages: [c, cpp]
    severity: ERROR

5. Consider Memory-Safe Alternatives

For new projects or major rewrites, consider languages with built-in memory safety:

  • Rust — Zero-cost abstractions with compile-time memory safety guarantees
  • Go — Garbage collected, bounds-checked arrays and slices
  • Zig — Systems programming with explicit safety semantics

For Ruby C extensions specifically, consider whether the performance gain justifies the risk. Modern Ruby (3.x) has significantly improved performance, and sometimes a pure-Ruby implementation is fast enough while avoiding this class of memory-corruption bug in your own code.

6. Fuzz Test Your Parsers

JSON parsers are a prime target for fuzzing because they handle highly variable, attacker-influenced input. Use:

  • AFL++ — State-of-the-art coverage-guided fuzzer
  • libFuzzer — Built into LLVM, easy to integrate
  • OSS-Fuzz — Google's continuous fuzzing for open-source projects

A well-configured fuzzer would likely have discovered this vulnerability by generating oversized string values and observing the crash.
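
A libFuzzer harness for this kind of code can be quite small. In the sketch below, LLVMFuzzerTestOneInput is libFuzzer's standard entry point, while parse_value is a hypothetical stand-in for the code under test (a real harness would call the parser's actual entry point instead).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the code under test: a bounded copy into a
 * 256-byte buffer. */
static int parse_value(const char *value)
{
    char dest_buffer[256];
    size_t len = strlen(value);

    if (len >= sizeof(dest_buffer)) {
        return -1; /* reject oversized input */
    }
    memcpy(dest_buffer, value, len + 1);
    return 0;
}

/* libFuzzer entry point: called repeatedly with mutated inputs. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    char input[4096];

    if (size >= sizeof(input)) {
        return 0; /* skip inputs larger than the scratch buffer */
    }
    if (size > 0) {
        memcpy(input, data, size);
    }
    input[size] = '\0'; /* fuzzer data is not NUL-terminated */

    (void)parse_value(input);
    return 0;
}
```

Built with something like clang -g -O1 -fsanitize=fuzzer,address, AddressSanitizer would be expected to report the out-of-bounds write if the length check in parse_value were replaced by a bare strcpy.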

7. Reference Security Standards

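When auditing C code for memory safety issues, refer to the CWE-120 entry and the SEI CERT C Coding Standard listed under Further Reading at the end of this post.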


What You Should Do Right Now

If you use the OJ gem in your Ruby applications:

  1. Check your current version: bundle show oj or gem list oj
  2. Update immediately: bundle update oj
  3. Verify the fix is included in your updated version's changelog
  4. Audit your JSON parsing code for other potential issues
  5. Enable dependency scanning in your CI/CD pipeline (GitHub Dependabot, Snyk, etc.)

If you maintain C extensions for Ruby (or any other language):

  1. Audit all string operations — search your codebase for strcpy, strcat, gets, sprintf
  2. Run a static analyzer on your C code today
  3. Add fuzzing to your test suite for any code that handles external input
  4. Enable compiler warnings and treat them as errors in CI

Conclusion

This vulnerability is a stark reminder that a single line of C code can undermine the security of every application that depends on it. The OJ library is used by thousands of Ruby applications worldwide — a successful exploit of this vulnerability in a widely-deployed version could have had catastrophic consequences.

The key takeaways from this incident:

  • strcpy with external input is always dangerous — there are no exceptions
  • Input validation must happen at trust boundaries — before data enters native code
  • Automated tooling catches these issues — static analyzers and fuzzers exist precisely to find vulnerabilities like this before attackers do
  • The fix is simple; the consequences of not fixing it are not — a few lines of bounds-checking code prevent arbitrary code execution

The security community often talks about "shifting left" — catching vulnerabilities earlier in the development process. This fix is a perfect example: a static analysis tool flagged a dangerous function call, a security engineer understood the implications, and a targeted fix was applied before any known exploitation occurred.

Secure coding in C is hard. But with the right tools, practices, and awareness of common pitfalls like CWE-120, we can build safer software — one bounds check at a time.


This vulnerability was identified and remediated by OrbisAI Security. If you'd like to learn more about automated security scanning for your codebase, check out their platform for AI-powered vulnerability detection.


Further Reading:
- OWASP Buffer Overflow Attack
- SEI CERT C Coding Standard
- Smashing the Stack for Fun and Profit — Aleph One (1996) (The classic paper on stack buffer overflows)
- CWE-120: Buffer Copy Without Checking Size of Input

View the Security Fix: PR #1011
