Critical Buffer Overflow in OJ's fast.c: How an Unsafe strcpy Almost Broke Everything
Severity: 🔴 Critical | CWE Type: CWE-120 (Buffer Copy Without Checking Size of Input) | Affected File: ext/oj/fast.c:92
Introduction
If your Ruby application parses JSON — and let's be honest, nearly every modern Ruby app does — there's a good chance you've relied on OJ (Optimized JSON), one of the fastest and most widely-used JSON libraries in the Ruby ecosystem. That's what makes today's vulnerability disclosure particularly important.
A critical buffer overflow vulnerability was identified and patched in ext/oj/fast.c, a core C extension file responsible for OJ's high-performance JSON parsing. The flaw stems from a single, dangerous call to strcpy that blindly copies attacker-influenced data into a fixed-size buffer with absolutely no bounds checking.
This is not a theoretical edge case. This is the kind of vulnerability that security researchers and malicious actors actively hunt for — because when exploited successfully, it can lead to arbitrary code execution on the target server.
Whether you're a Ruby developer, a DevSecOps engineer, or just someone who cares about the security of the open-source software supply chain, understanding this vulnerability matters. Let's break it down.
The Vulnerability Explained
What Is a Buffer Overflow?
Before diving into the specifics, let's establish the foundation. A buffer overflow occurs when a program writes more data into a memory buffer than the buffer was designed to hold. The excess data spills into adjacent memory regions, potentially overwriting critical data structures, function pointers, or return addresses.
In C, this is dangerously easy to do — especially with functions like strcpy, which copies a source string to a destination buffer and stops only when it encounters a null terminator (\0), completely ignoring the size of the destination.
The Vulnerable Code
The vulnerability lives at line 92 of ext/oj/fast.c:
// VULNERABLE CODE (simplified representation)
char dest_buffer[256]; // Fixed-size destination buffer
const char *src = extract_value_from_json(json_input); // Attacker-controlled!
strcpy(dest_buffer, src); // 💣 No bounds check — DANGEROUS
Here's exactly what's wrong:
- dest_buffer is a fixed-size stack or heap-allocated buffer (e.g., 256 bytes).
- src is derived from JSON input provided by an external caller, meaning an attacker controls its content and, critically, its length.
- strcpy copies bytes from src to dest_buffer until it hits a null byte, with no regard for whether dest_buffer has enough space.
If an attacker crafts a JSON payload where the relevant string value is 256 bytes or longer (a 256-byte buffer holds at most 255 bytes plus the terminating null), the copy operation will overflow the buffer and start writing into adjacent memory.
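The boundary is easy to get wrong by one, so a minimal sketch may help. `fits_in_buffer` below is a hypothetical helper (it is not part of OJ) that reports whether a C string, together with its terminating null byte, fits into a destination of a given size. It only measures and never copies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of OJ: returns true when src plus its
 * terminating '\0' fits into a destination of dst_size bytes. */
bool fits_in_buffer(const char *src, size_t dst_size) {
    assert(src != NULL);
    /* strcpy would write strlen(src) + 1 bytes: the characters and the '\0'. */
    return strlen(src) < dst_size;
}
```

Under this check, a 255-character value fits in a 256-byte buffer and a 256-character value does not, which is exactly the point at which the unchecked strcpy starts writing out of bounds.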
How Could It Be Exploited?
The exploitation path depends on where the buffer lives (stack vs. heap) and what surrounds it in memory, but both scenarios are serious:
Stack-Based Exploitation
If dest_buffer is on the stack, an overflow can overwrite:
- Local variables in the same function
- Saved frame pointers
- Return addresses — the classic target for redirecting execution flow
By carefully crafting the overflow payload, an attacker can overwrite the saved return address and redirect execution, either to injected shellcode (where the stack is executable) or, more commonly on modern systems, to a chain of existing code fragments, a technique known as return-oriented programming (ROP).
Heap-Based Exploitation
If the buffer is heap-allocated, an overflow can corrupt:
- Heap metadata (chunk headers used by malloc/free)
- Adjacent heap objects, including function pointers or vtables
- Other application data that influences security decisions
A Real-World Attack Scenario
Imagine a web application that accepts JSON payloads from users and processes them with OJ:
# A typical Rails controller endpoint
def process_data
  data = Oj.load(request.body.read)
  # ... do something with data
end
An attacker sends the following HTTP request:
POST /api/process HTTP/1.1
Content-Type: application/json

{
"username": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[SHELLCODE_OR_ROP_CHAIN]"
}
When OJ's fast.c processes this payload and the vulnerable strcpy executes, the oversized username value overflows dest_buffer, corrupting adjacent memory. With enough knowledge of the target environment (obtainable through reconnaissance or information leakage), an attacker can craft the payload to achieve remote code execution (RCE).
The Real-World Impact
The consequences of successful exploitation include:
- 🔓 Remote Code Execution (RCE) — Full control over the server process
- 📂 Data Exfiltration — Access to databases, environment variables, secrets
- 🔄 Lateral Movement — Using the compromised server as a pivot point
- 💥 Denial of Service — Crashing the process with a malformed payload (even without code execution)
- 🔑 Privilege Escalation — If the Ruby process runs with elevated privileges
Any application that parses untrusted JSON input using a vulnerable version of OJ is potentially exposed. This includes APIs, webhooks, data ingestion pipelines, and any endpoint that accepts user-supplied JSON.
The Fix
What Changed
The fix in this pull request targets ext/oj/fast.c directly, replacing the unsafe strcpy call with a bounds-aware alternative. The core principle: never copy data into a fixed buffer without first verifying — and enforcing — a size limit.
Before (Vulnerable):
// ❌ BEFORE: Unbounded copy — trusts that src fits in dest_buffer
char dest_buffer[256];
const char *src = get_json_string_value(node);
strcpy(dest_buffer, src); // CWE-120: No length check!
After (Fixed):
// ✅ AFTER: Bounded copy — enforces maximum copy length
char dest_buffer[256];
const char *src = get_json_string_value(node);
size_t src_len = strlen(src);
if (src_len >= sizeof(dest_buffer)) {
    // Input too large: rb_raise does not return, so nothing after it runs
    rb_raise(rb_eArgError, "JSON string value exceeds maximum allowed length");
}
strncpy(dest_buffer, src, sizeof(dest_buffer) - 1);
dest_buffer[sizeof(dest_buffer) - 1] = '\0'; // Ensure null termination
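Alternatively, where strlcpy is available, a single call gives a bounded, always-terminated copy. Note that strlcpy truncates silently; it returns the length of src, so compare that return value against the buffer size if truncation must be treated as an error:
// ✅ ALTERNATIVE: strlcpy (BSD/macOS, glibc 2.38 and later) or a safe wrapper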
strlcpy(dest_buffer, src, sizeof(dest_buffer));
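Since strlcpy is not available on every platform, a small wrapper is one way to get the same guarantee portably. The sketch below is illustrative only: `bounded_copy` is a hypothetical name, not OJ's actual code, and it rejects oversized input rather than truncating it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper, not OJ's actual code.
 * Copies src into dst only if it fits, including the terminating '\0'.
 * Returns 0 on success, -1 if src is too long (dst is left untouched). */
int bounded_copy(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (len >= dst_size) {
        return -1; /* would overflow: reject instead of truncating */
    }
    memcpy(dst, src, len + 1); /* length is known, so copy the '\0' as well */
    return 0;
}
```

In a Ruby C extension, the caller would typically translate the -1 result into an exception via rb_raise, keeping the memory-safety check separate from the error reporting.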
How Does This Fix the Problem?
The fix addresses the vulnerability in two ways:
- Explicit length validation: By checking src_len >= sizeof(dest_buffer) before copying, the code rejects inputs that would overflow the buffer, raising a Ruby exception instead of corrupting memory.
- Bounded copy operation: Even in the strncpy path, the copy is capped at sizeof(dest_buffer) - 1 bytes, with manual null termination ensuring the buffer is always properly terminated.
The result: no matter what an attacker puts in their JSON payload, the C extension will never write beyond the bounds of dest_buffer.
Why This Matters Beyond the One Line
Fixing a single strcpy might seem trivial, but the implications are significant:
- Trust boundary enforcement: The fix establishes a clear rule — attacker-supplied data cannot exceed a defined size limit before entering native code.
- Defense in depth: Even if other validation layers fail, the C extension itself now rejects oversized inputs.
- Predictable failure mode: Instead of undefined behavior (memory corruption), the code now fails safely with a catchable Ruby exception.
Prevention & Best Practices
This vulnerability is a textbook example of why C code handling untrusted input requires extreme care. Here's how to prevent similar issues in your own projects:
1. Never Use strcpy or strcat with External Data
These functions are fundamentally unsafe when the source is attacker-controlled. Prefer:
| Unsafe Function | Safer Alternative | Notes |
|---|---|---|
| strcpy | strncpy + manual null termination | Always specify buffer size |
| strcpy | strlcpy | BSD/macOS; copies at most n-1 chars |
| strcat | strncat | Limits appended bytes |
| gets | fgets | gets is literally removed from C11 |
| sprintf | snprintf | Always specify buffer size |
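The snprintf row deserves a concrete look, because snprintf only protects you from truncation bugs if its return value is checked. The following is a minimal sketch under stated assumptions: `format_key` and its "prefix:id" format are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats "<prefix>:<id>" into dst.
 * Returns 0 on success, -1 on an encoding error or if the result did not fit. */
int format_key(char *dst, size_t dst_size, const char *prefix, unsigned id) {
    assert(dst != NULL && prefix != NULL);
    int needed = snprintf(dst, dst_size, "%s:%u", prefix, id);
    /* snprintf returns the length it wanted to write, excluding the '\0'. */
    if (needed < 0 || (size_t)needed >= dst_size) {
        return -1;
    }
    return 0;
}
```

snprintf never writes past dst_size, but when the output is too long it leaves a truncated string behind, so callers should treat a -1 result here as "do not use dst".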
2. Validate Input Length Before Processing
// Always check BEFORE copying
if (input_length > MAX_ALLOWED_SIZE) {
    // Reject the input; don't try to truncate silently
    return ERROR_INPUT_TOO_LARGE;
}
Fail loudly and early. Silent truncation can introduce logic vulnerabilities even when it prevents memory corruption.
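To make the logic-vulnerability point concrete, here is a deliberately contrived sketch. Both functions and the "admin" role check are hypothetical; the only difference between them is whether oversized input is truncated or rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical role check that silently truncates its input to 5 characters. */
bool is_admin_truncating(const char *name) {
    assert(name != NULL);
    char buf[6];
    strncpy(buf, name, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    return strcmp(buf, "admin") == 0; /* "admin_guest" was cut down to "admin" */
}

/* Same check, but oversized input is rejected before any comparison. */
bool is_admin_strict(const char *name) {
    assert(name != NULL);
    char buf[6];
    size_t len = strlen(name);
    if (len >= sizeof(buf)) {
        return false; /* too long to be "admin"; fail instead of truncating */
    }
    memcpy(buf, name, len + 1);
    return strcmp(buf, "admin") == 0;
}
```

Neither version corrupts memory, yet the truncating one accepts "admin_guest" as "admin". That is the kind of bug silent truncation can introduce.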
3. Use Modern Compiler Protections
Enable these compiler flags for C/C++ projects:
CFLAGS += -O2 -D_FORTIFY_SOURCE=2 # Detects some buffer overflows at runtime (only active when optimization is enabled)
CFLAGS += -fstack-protector-strong # Stack canaries
CFLAGS += -Wall -Wextra # Enable warnings (including -Wstringop-overflow)
LDFLAGS += -z relro -z now # RELRO: makes GOT read-only
4. Use Static Analysis Tools
Integrate these tools into your CI/CD pipeline to catch strcpy and similar issues automatically:
- Coverity — Industry-standard static analyzer, free for open source
- Clang Static Analyzer — Built into LLVM, catches many memory issues
- cppcheck — Lightweight, easy to integrate
- Semgrep — Rule-based scanner with rules for dangerous C functions
- CodeQL — GitHub's semantic code analysis engine
Example Semgrep rule to catch strcpy:
rules:
  - id: dangerous-strcpy
    patterns:
      - pattern: strcpy($DST, $SRC)
    message: "Unsafe strcpy call. Use strncpy or strlcpy with explicit size limits."
    languages: [c, cpp]
    severity: ERROR
5. Consider Memory-Safe Alternatives
For new projects or major rewrites, consider languages with built-in memory safety:
- Rust — Zero-cost abstractions with compile-time memory safety guarantees
- Go — Garbage collected, bounds-checked arrays and slices
- Zig — Systems programming with explicit safety semantics
For Ruby C extensions specifically, consider whether the performance gain justifies the risk. Modern Ruby (3.x) has significantly improved performance, and sometimes a pure-Ruby implementation is fast enough while removing manual memory management, and this whole class of bug, from the picture.
6. Fuzz Test Your Parsers
JSON parsers are a prime target for fuzzing because they handle highly variable, attacker-influenced input. Use:
- AFL++ — State-of-the-art coverage-guided fuzzer
- libFuzzer — Built into LLVM, easy to integrate
- OSS-Fuzz — Google's continuous fuzzing for open-source projects
A well-configured fuzzer would likely have discovered this vulnerability by generating oversized string values and observing the crash.
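As a rough idea of what such a harness looks like, here is a minimal libFuzzer sketch. `LLVMFuzzerTestOneInput` is libFuzzer's real entry point, but `copy_value` is a hypothetical stand-in for the code under test and is not OJ's API; a real harness would call into the parser instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the code under test; not OJ's real API.
 * Copies a length-delimited value into a fixed buffer, rejecting oversized input. */
int copy_value(const uint8_t *data, size_t size) {
    char dest_buffer[256];
    if (size >= sizeof(dest_buffer)) {
        return -1; /* bounded: oversized input is rejected */
    }
    if (size > 0) {
        memcpy(dest_buffer, data, size);
    }
    dest_buffer[size] = '\0';
    return 0;
}

/* libFuzzer entry point: called repeatedly with generated inputs. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    copy_value(data, size);
    return 0; /* libFuzzer expects 0 from the target */
}
```

Assuming the file is saved as harness.c, it can be built with `clang -g -O1 -fsanitize=fuzzer,address harness.c`. If the size check in copy_value were missing, AddressSanitizer would be expected to report a stack-buffer-overflow as soon as the fuzzer produced an input of 256 bytes or more.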
7. Reference Security Standards
When auditing C code for memory safety issues, refer to:
- CWE-120: Buffer Copy Without Checking Size of Input
- CWE-121: Stack-based Buffer Overflow
- CWE-122: Heap-based Buffer Overflow
- OWASP: Buffer Overflow
- SEI CERT C Coding Standard: STR31-C: Guarantee sufficient storage for strings
What You Should Do Right Now
If you use the OJ gem in your Ruby applications:
- Check your current version: bundle show oj or gem list oj
- Update immediately: bundle update oj
- Verify the fix is included in your updated version's changelog
- Audit your JSON parsing code for other potential issues
- Enable dependency scanning in your CI/CD pipeline (GitHub Dependabot, Snyk, etc.)
If you maintain C extensions for Ruby (or any other language):
- Audit all string operations: search your codebase for strcpy, strcat, gets, sprintf
- Run a static analyzer on your C code today
- Add fuzzing to your test suite for any code that handles external input
- Enable compiler warnings and treat them as errors in CI
Conclusion
This vulnerability is a stark reminder that a single line of C code can undermine the security of every application that depends on it. The OJ library is used by thousands of Ruby applications worldwide — a successful exploit of this vulnerability in a widely-deployed version could have had catastrophic consequences.
The key takeaways from this incident:
- strcpy with external input is always dangerous: there are no exceptions
- Input validation must happen at trust boundaries: before data enters native code
- Automated tooling catches these issues: static analyzers and fuzzers exist precisely to find vulnerabilities like this before attackers do
- The fix is simple; the consequences of not fixing it are not: a few lines of bounds-checking code prevent arbitrary code execution
The security community often talks about "shifting left" — catching vulnerabilities earlier in the development process. This fix is a perfect example: a static analysis tool flagged a dangerous function call, a security engineer understood the implications, and a targeted fix was applied before any known exploitation occurred.
Secure coding in C is hard. But with the right tools, practices, and awareness of common pitfalls like CWE-120, we can build safer software — one bounds check at a time.
This vulnerability was identified and remediated by OrbisAI Security.
Further Reading:
- OWASP Buffer Overflow Attack
- SEI CERT C Coding Standard
- Smashing the Stack for Fun and Profit — Aleph One (1996) (The classic paper on stack buffer overflows)
- CWE-120: Buffer Copy Without Checking Size of Input