Integer Overflow in Shared Memory Bounds Check: How a Missing Cast Opened the Door to Arbitrary Memory Writes
Introduction
There's a class of bugs that security researchers find particularly insidious: vulnerabilities that look correct at first glance. The code has a bounds check. The developer clearly intended to validate input. And yet, due to a subtle arithmetic property, that check can be made to silently fail — leaving the system wide open.
That's exactly what happened in lib/rpmi_shmem.c, a component of the RPMI (RISC-V Platform Management Interface) shared memory implementation. A missing type cast meant that a bounds check designed to prevent out-of-range memory access could be trivially bypassed using integer overflow arithmetic. The result? An OS-level or hypervisor-level caller could instruct firmware to write to any memory address — including interrupt vector tables, firmware code regions, and security-critical configuration structures.
This post breaks down the vulnerability, explains the exploit path, and walks through the elegant one-line fix that closes the door.
What Is This Vulnerability?
At its core, this is a 32-bit integer overflow vulnerability in a memory bounds check. The affected code lives in three functions — rpmi_shmem_read, rpmi_shmem_write, and rpmi_shmem_fill — all of which perform the same flawed comparison:
// Vulnerable code (before fix)
if ((offset + len) > shmem->size) {
return RPMI_ERR_BAD_RANGE;
}
Both offset and len are rpmi_uint32_t (32-bit unsigned integers). When two 32-bit unsigned integers are added in C, the result is also a 32-bit unsigned integer. If the true sum exceeds 0xFFFFFFFF, it wraps around modulo 2^32 to a small value, and the bounds check passes even though the actual access would be far outside the permitted range.
This is a classic CWE-190: Integer Overflow or Wraparound vulnerability, and it's one of the most common root causes of security bugs in low-level systems code.
The Vulnerability Explained
The RPMI Shared Memory Context
RPMI is a platform management interface used in RISC-V systems. It allows OS-level software and firmware to communicate through a shared memory window. The shared memory region has a defined size, and operations that read, write, or fill portions of it are supposed to be constrained to that window.
The write path is particularly sensitive. Looking at the broader context described in the vulnerability report, the write operation ultimately calls:
rpmi_env_memcpy((void *)(unsigned long)addr, out, len);
Here, addr is derived from values in the shared memory region — values that are writable by the OS or hypervisor. If the bounds check can be bypassed, an attacker controls both the destination address and the length of a memcpy into firmware memory.
How the Integer Overflow Works
Let's make this concrete. Suppose shmem->size is 0x1000 (4KB — a typical small shared memory window).
An attacker crafts the following values:
- offset = 0xFFFFF100 (a very large 32-bit value)
- len = 0x00000F00
Now observe what happens in the vulnerable check:
// 32-bit arithmetic:
// 0xFFFFF100 + 0x00000F00 = 0x100000000
// But in 32-bit: 0x100000000 mod 2^32 = 0x00000000
if ((0xFFFFF100 + 0x00000F00) > 0x1000) {
// Becomes: if (0x00000000 > 0x1000)
// Which is: if (false)
// Bounds check PASSES! No error returned.
}
The sum overflows back to 0x00000000, which is less than shmem->size. The bounds check passes, and the operation proceeds with an offset value that is nearly 4GB beyond the start of the shared memory window.
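The wraparound can be reproduced outside the firmware. The following is a minimal sketch, not the library's code: the typedef stands in for rpmi_uint32_t, the helper names are hypothetical, and the size field is assumed to be 32 bits wide (the excerpt does not show its actual type).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the library's rpmi_uint32_t is a plain 32-bit unsigned integer. */
typedef uint32_t rpmi_uint32_t;

/* Hypothetical helper: returns 1 when the ORIGINAL (flawed) check
 * would let the access through, 0 when it would reject it. */
static int vulnerable_check_accepts(rpmi_uint32_t offset, rpmi_uint32_t len,
                                    rpmi_uint32_t size)
{
    /* Both operands are 32-bit, so the sum is reduced modulo 2^32. */
    return !((offset + len) > size);
}

/* Walks through the values used in the attack scenario. */
static void demo_wraparound(void)
{
    rpmi_uint32_t sum = (rpmi_uint32_t)0xFFFFF100u + (rpmi_uint32_t)0x00000F00u;

    /* 0x100000000 does not fit in 32 bits, so the stored sum is 0. */
    assert(sum == 0u);

    /* The crafted pair is accepted although it lies far outside a 4KB window. */
    assert(vulnerable_check_accepts(0xFFFFF100u, 0x00000F00u, 0x1000u));

    /* Inputs that do not wrap are still classified correctly. */
    assert(vulnerable_check_accepts(0x100u, 0x200u, 0x1000u));
    assert(!vulnerable_check_accepts(0x800u, 0x900u, 0x1000u));
}
```

The last two assertions show why the bug is easy to miss in testing: the check behaves correctly for every input pair whose sum fits in 32 bits.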
Real-World Impact
In the context of firmware running on a RISC-V platform, this is severe:
- Interrupt Vector Table Corruption: Overwriting the trap and interrupt vector table (on RISC-V, the handlers reached through mtvec) could redirect all interrupts to attacker-controlled code, effectively hijacking the firmware's control flow.
- Firmware Code Overwrite: Overwriting firmware instructions allows persistent code execution at the highest privilege level.
- Security Configuration Tampering: Structures controlling secure boot, memory protection units (MPUs), or cryptographic key storage could be silently modified.
- Privilege Escalation: An OS-level caller (normally unprivileged relative to firmware) could escalate to firmware-level control.
The trust boundary violation here is significant: the shared memory interface is designed so that OS software cannot directly access firmware memory. This bug punches a hole straight through that boundary.
The Fix
The fix is a one-line change in each affected function, but its implications are profound. The solution is to widen the arithmetic to 64 bits before performing the addition, so the sum can no longer wrap.
Before (Vulnerable)
// rpmi_shmem_read
if ((offset + len) > shmem->size) {
// rpmi_shmem_write
if ((offset + len) > shmem->size) {
// rpmi_shmem_fill
if ((offset + len) > shmem->size) {
After (Fixed)
// rpmi_shmem_read
if (((rpmi_uint64_t)offset + len) > shmem->size) {
// rpmi_shmem_write
if (((rpmi_uint64_t)offset + len) > shmem->size) {
// rpmi_shmem_fill
if (((rpmi_uint64_t)offset + len) > shmem->size) {
Why This Works
By casting offset to rpmi_uint64_t (a 64-bit unsigned integer) before the addition, C's usual arithmetic conversions widen len to 64 bits as well, so the entire expression is evaluated in 64-bit arithmetic. The maximum possible sum of two 32-bit values is 0x1FFFFFFFE, well within the range of a 64-bit integer, so no overflow can occur.
Let's revisit the attack scenario with the fix applied:
// 64-bit arithmetic:
// (uint64_t)0xFFFFF100 + 0x00000F00 = 0x100000000
if ((0x100000000ULL) > 0x1000) {
// if (true) -> RPMI_ERR_BAD_RANGE is returned
// Attack blocked!
}
The check now correctly identifies that the access would be out of range and returns an error before any memory operation occurs.
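The placement of the cast matters: it has to apply to an operand, not to the finished sum. The sketch below contrasts the two forms. The typedefs and helper names are hypothetical stand-ins, and the size field is again assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: these mirror the library's fixed-width typedefs. */
typedef uint32_t rpmi_uint32_t;
typedef uint64_t rpmi_uint64_t;

/* Fixed form: offset is widened first, so the addition happens in 64 bits. */
static int fixed_check_accepts(rpmi_uint32_t offset, rpmi_uint32_t len,
                               rpmi_uint32_t size)
{
    return !(((rpmi_uint64_t)offset + len) > size);
}

/* Pitfall: the 32-bit sum wraps BEFORE it is widened, so this variant
 * is exactly as vulnerable as the original check. */
static int misplaced_cast_accepts(rpmi_uint32_t offset, rpmi_uint32_t len,
                                  rpmi_uint32_t size)
{
    return !((rpmi_uint64_t)(offset + len) > size);
}

static void demo_fixed_check(void)
{
    /* The attack values are now rejected. */
    assert(!fixed_check_accepts(0xFFFFF100u, 0x00000F00u, 0x1000u));

    /* Even the largest possible operands cannot wrap a 64-bit sum. */
    assert(!fixed_check_accepts(0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu));

    /* Boundary behaviour is unchanged: an access ending exactly at size is allowed. */
    assert(fixed_check_accepts(0x0u, 0x1000u, 0x1000u));
    assert(!fixed_check_accepts(0x1u, 0x1000u, 0x1000u));

    /* Casting the sum instead of an operand still accepts the attack values. */
    assert(misplaced_cast_accepts(0xFFFFF100u, 0x00000F00u, 0x1000u));
}
```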
The Full Diff
- if ((offset + len) > shmem->size) {
+ if (((rpmi_uint64_t)offset + len) > shmem->size) {
Applied consistently across all three functions (rpmi_shmem_read, rpmi_shmem_write, rpmi_shmem_fill), this change closes the vulnerability completely.
Prevention & Best Practices
This vulnerability is a textbook example of why integer arithmetic in security-sensitive code deserves extra scrutiny. Here's how to prevent similar issues:
1. Always Consider Arithmetic Width in Bounds Checks
When performing addition or multiplication on values that will be compared against a larger type or used as memory offsets, explicitly widen the operands first.
// Risky: result computed in 32-bit
if ((uint32_t_a + uint32_t_b) > limit) { ... }
// Safe: result computed in 64-bit
if (((uint64_t)uint32_t_a + uint32_t_b) > limit) { ... }
// Also safe: use a safe addition macro/function
if (safe_add_u32(a, b, &result) || result > limit) { ... }
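The safe_add_u32 helper used above is not a standard function. One possible shape for it, assuming it returns nonzero on overflow and writes the sum through the pointer only on success, is sketched here:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of any standard library.
 * Returns 1 if a + b would not fit in 32 bits (and leaves *result untouched),
 * otherwise stores the sum in *result and returns 0. */
static int safe_add_u32(uint32_t a, uint32_t b, uint32_t *result)
{
    /* Rearranged so the test itself can never wrap:
     * a + b > UINT32_MAX  <=>  a > UINT32_MAX - b */
    if (a > UINT32_MAX - b) {
        return 1;
    }
    *result = a + b;
    return 0;
}

static void demo_safe_add(void)
{
    uint32_t r = 0;

    assert(safe_add_u32(1u, 2u, &r) == 0 && r == 3u);
    assert(safe_add_u32(UINT32_MAX, 0u, &r) == 0 && r == UINT32_MAX);

    /* Overflow is reported and r keeps its previous value. */
    assert(safe_add_u32(UINT32_MAX, 1u, &r) == 1 && r == UINT32_MAX);
}
```

Because the result is left unwritten on overflow, callers should test the return value first, as the short-circuiting || in the example above does.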
2. Use Compiler Warnings and Sanitizers
Modern compilers and sanitizers can catch many integer overflow issues:
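# Enable overflow checks during testing (Clang only; GCC does not support the integer group)
clang -fsanitize=undefined,integer ...
# Enable relevant warnings
gcc -Wall -Wextra -Wconversion -Wsign-conversion ...
Unsigned wraparound is well-defined behavior in C, so -fsanitize=undefined on its own does not report it. Clang's -fsanitize=integer group adds the unsigned-integer-overflow check, which reports unsigned wraparound at runtime during testing and helps surface these bugs before they reach production.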
3. Use Safe Integer Libraries
For C/C++ projects, consider using safe integer arithmetic helpers:
- __builtin_add_overflow() (GCC/Clang): Returns true if overflow would occur
- intsafe.h (Windows): Provides safe arithmetic functions
- safe_iop: A portable safe integer operations library for C
// Using GCC/Clang built-in overflow detection
uint32_t result;
if (__builtin_add_overflow(offset, len, &result) || result > shmem->size) {
return RPMI_ERR_BAD_RANGE;
}
4. Validate All Externally-Supplied Values
Any value that crosses a trust boundary — from OS to firmware, from user to kernel, from network to application — must be treated as potentially adversarial. In this case:
- offset and len come from shared memory writable by the OS
- They should be validated individually before being combined
- Consider adding upper-bound checks on each value separately
// Defense in depth: validate individually AND combined
if (offset >= shmem->size || len > shmem->size ||
((rpmi_uint64_t)offset + len) > shmem->size) {
return RPMI_ERR_BAD_RANGE;
}
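Where a 64-bit type is unavailable or undesirable, the same range test can be written with a subtraction that cannot wrap. This is an alternative formulation, not the fix that was applied; the function name is hypothetical and size is assumed to be a 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when [offset, offset + len) lies inside
 * [0, size), using only 32-bit arithmetic. */
static int range_ok_u32(uint32_t offset, uint32_t len, uint32_t size)
{
    /* len <= size guarantees that size - len cannot wrap below zero;
     * the second comparison is then equivalent to offset + len <= size. */
    return len <= size && offset <= size - len;
}

static void demo_range_ok(void)
{
    /* Whole window and an empty access at the very end are both in range. */
    assert(range_ok_u32(0x0u, 0x1000u, 0x1000u));
    assert(range_ok_u32(0x1000u, 0x0u, 0x1000u));

    /* One byte past the end, an oversized length, and the attack values are rejected. */
    assert(!range_ok_u32(0x1u, 0x1000u, 0x1000u));
    assert(!range_ok_u32(0x0u, 0x1001u, 0x1000u));
    assert(!range_ok_u32(0xFFFFF100u, 0x00000F00u, 0x1000u));
}
```

One design difference is worth noting: this form accepts a zero-length access at offset == size, whereas the individual offset >= shmem->size test shown above rejects it. Which behavior is appropriate depends on how callers treat empty ranges.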
5. Code Review Checklist for Memory Operations
When reviewing code that performs memory reads, writes, or copies, always ask:
- [ ] Are all arithmetic operations on sizes/offsets performed in a type wide enough to hold the result?
- [ ] Can any combination of valid-looking inputs cause wraparound?
- [ ] Are all inputs from untrusted sources validated before use in arithmetic?
- [ ] Is the validation performed before any memory operation?
Relevant Security Standards
- CWE-190: Integer Overflow or Wraparound
- CWE-787: Out-of-bounds Write
- CWE-119: Improper Restriction of Operations within the Bounds of a Memory Buffer
- CERT C Rule INT30-C: Ensure that unsigned integer operations do not wrap
- OWASP: Input Validation, Memory Management
Conclusion
This vulnerability is a powerful reminder that the presence of a bounds check is not the same as the correctness of a bounds check. A developer clearly intended to prevent out-of-range access — the check was there. But a subtle arithmetic property meant that the check could be silently defeated with crafted input values.
The key takeaways:
- Integer overflow in bounds checks is a real and exploitable vulnerability class, particularly in firmware and low-level systems code where trust boundaries are enforced through software rather than hardware.
- The fix was minimal but meaningful: casting one operand to a wider type before addition costs nothing in performance and completely closes the attack surface.
- Consistency matters: the same pattern appeared in three functions. Finding and fixing all instances, not just the one initially reported, is essential for a complete remediation.
- Trust boundaries demand rigorous validation: any value that crosses from a less-trusted to a more-trusted context must be treated as potentially adversarial, even when it "looks like" a simple integer.
Security in firmware is particularly high-stakes: bugs at this level can undermine every security guarantee built on top of it. Taking the time to reason carefully about integer arithmetic in memory operations is not premature optimization — it's fundamental correctness.
This vulnerability was identified and fixed as part of an automated security scanning process. Automated tools are increasingly effective at finding this class of bug — but understanding why the fix works is what enables developers to write secure code from the start.