What is a buffer overflow in memcpy?

A buffer overflow via memcpy occurs when the number of bytes copied exceeds the bounds of the destination or source buffer, allowing reads or writes into adjacent memory regions.

How do you prevent buffer overflows in C++ serialization code?

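Always validate that `position + bytes_to_copy <= buffer_size` before calling memcpy, and throw or return an error if the check fails. Writing the check as `bytes_to_copy <= buffer_size - position` (with `position <= buffer_size`) keeps the addition from wrapping around.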

What CWE is a missing bounds check before memcpy?

CWE-125 (Out-of-bounds Read) for reads and CWE-787 (Out-of-bounds Write) for writes; both stem from CWE-119 (Improper Restriction of Operations within the Bounds of a Memory Buffer).

Is using `size_t` for buffer sizes enough to prevent this overflow?

No. Using `size_t` prevents signed integer issues but does not automatically enforce that the computed offset stays within the allocated buffer — an explicit bounds check is still required.

Can static analysis detect a missing bounds check before memcpy?

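Yes. Static analyzers such as Semgrep and Coverity can flag memcpy calls where the size argument is not validated against the available buffer length. AddressSanitizer (ASan) is a runtime tool rather than a static analyzer: it reports the out-of-bounds access when a test or fuzz input actually triggers it.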

How Buffer Overflow Happens in C++ MemStream.h and How to Fix It

Introduction

The MemStream class in src/avt/IVP/MemStream.h is a foundational serialization primitive used throughout the IVP (initial value problem) subsystem. It provides read() and write() template methods that move data in and out of an internal byte buffer, _data. Because virtually every serialized object in the IVP subsystem passes through this class, a flaw here doesn't stay local: it propagates to every caller.

The flaw? The read() method at line 125 called memcpy(pt, &_data[_pos], nBytes) without first checking whether _pos + nBytes exceeded the buffer's length _len. An attacker who could supply crafted serialized integral curve data — where encoded size fields specify nBytes values larger than the remaining buffer — could trigger out-of-bounds memory access, potentially corrupting the heap or exposing sensitive memory contents.

This post walks through exactly how the vulnerability works, what the fix does, and how to prevent the same pattern from appearing in your own serialization code.

The Vulnerability Explained

The Vulnerable Code

Here is the read() template method as it existed before the fix:

// src/avt/IVP/MemStream.h — BEFORE fix (line 122–128)
template <typename T> inline void MemStream::read(T *pt, const size_t &num)
{
    size_t nBytes = sizeof(T) * num;
    // ❌ No bounds check here!
    memcpy(pt, &_data[_pos], nBytes);
    _pos += nBytes;
}

nBytes is calculated as sizeof(T) * num, where num comes from the deserialized data stream — meaning it is ultimately attacker-controlled when the input comes from an untrusted source. There is no check that _pos + nBytes <= _len before the memcpy executes.

Why This Is Dangerous

When MemStream reads serialized integral curve data, it trusts the size fields embedded in the stream. If an attacker crafts a stream where a size field claims there are, say, 1024 bytes remaining but the actual buffer only has 8 bytes left starting at _pos, the memcpy will happily read 1016 bytes beyond the end of _data.

_data is a heap-allocated array. Reading past its end means reading whatever happens to follow it in memory, potentially:

- Heap metadata (allocator bookkeeping structures)
- Other objects' private data (passwords, keys, pointers)
- Unmapped memory (causing a segmentation fault / crash)

The write path had the same problem at line 169:

// src/avt/IVP/MemStream.h — BEFORE fix (line 165–171)
template <typename T> inline void MemStream::write(const T *pt, const size_t &num)
{
    size_t nBytes = sizeof(T) * num;
    // ❌ No bounds check here either!
    memcpy(&_data[_pos], pt, nBytes);
    _pos += nBytes;
}

An out-of-bounds write is typically worse than a read: it enables heap corruption, which sophisticated attackers can leverage for arbitrary code execution.

Concrete Attack Scenario

Consider a workflow where a user loads an integral curve dataset from a file or network source:

1. The attacker crafts a .ivp file where a record header claims num = 65536 elements of type double (8 bytes each = 512 KiB).
2. The actual buffer allocated for this record is only 64 bytes.
3. MemStream::read() computes nBytes = 524288 and calls memcpy(pt, &_data[14], 524288).
4. Only 50 bytes remain after offset 14 in the 64-byte _data, so the memcpy reads 524,238 bytes past the end of the allocation.
5. Depending on the platform and heap layout, this can crash the application, leak memory, or, in a write scenario, corrupt adjacent heap objects.
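The size of the overrun in this scenario can be worked out mechanically. The sketch below is illustrative only: bytes_past_end is a hypothetical helper, not part of MemStream, and it assumes the offset never exceeds the buffer length. With _pos = 14 and a 64-byte buffer, 50 bytes remain, so everything beyond those 50 bytes of a 524,288-byte copy is out of bounds.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of MemStream): number of bytes of a copy of
// nBytes starting at offset pos that fall outside a buffer of len bytes.
// Assumes pos <= len.
constexpr std::size_t bytes_past_end(std::size_t pos, std::size_t nBytes,
                                     std::size_t len)
{
    return nBytes > len - pos ? nBytes - (len - pos) : 0;
}

// Scenario numbers: 65536 elements of 8 bytes, 64-byte buffer, offset 14.
static_assert(bytes_past_end(14, 65536 * 8, 64) == 524238,
              "524288 requested - 50 remaining = 524238 bytes out of bounds");
```

A request that fits, such as bytes_past_end(14, 50, 64), yields 0.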

Because MemStream is described as "a fundamental serialization primitive used throughout the IVP subsystem," every deserialization path that calls read() or write() is affected.

The Fix

What Changed

The fix adds a single pre-condition guard immediately before the memcpy in read():

// src/avt/IVP/MemStream.h — AFTER fix
template <typename T> inline void MemStream::read(T *pt, const size_t &num)
{
    size_t nBytes = sizeof(T) * num;
    if (_pos + nBytes > _len)          // ✅ Bounds check added
        EXCEPTION0(ImproperUseException);
    memcpy(pt, &_data[_pos], nBytes);
    _pos += nBytes;
}

Before vs. After

| Aspect | Before | After |
| --- | --- | --- |
| Bounds check | None | `if (_pos + nBytes > _len)` |
| On overflow | Silent out-of-bounds memcpy | Throws `ImproperUseException` |
| Memory safety | ❌ Unsafe | ✅ Safe |

Why This Fix Works

The invariant that must hold for any safe memcpy from a bounded buffer is:

source_start + bytes_to_copy <= buffer_end

Translated to MemStream's fields:

_pos + nBytes <= _len

By checking _pos + nBytes > _len and throwing before the memcpy executes, the fix ensures that memcpy is only called when the entire operation fits within the allocated buffer, provided neither sizeof(T) * num nor _pos + nBytes wraps around in size_t arithmetic. The EXCEPTION0(ImproperUseException) macro propagates the error up the call stack, allowing callers to handle malformed input gracefully rather than silently corrupting memory.
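One caveat is worth spelling out. size_t arithmetic is unsigned and wraps modulo 2^N, so for a very large num either sizeof(T) * num or _pos + nBytes can wrap to a small value and slip past a check written as an addition. The following is a minimal sketch of a wraparound-resistant form of the same invariant. The helper names checked_mul and range_fits are hypothetical and are not part of MemStream or the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helpers, not part of MemStream.

// Computes elemSize * num into out; returns false if the product would wrap.
inline bool checked_mul(std::size_t elemSize, std::size_t num, std::size_t &out)
{
    if (elemSize != 0 &&
        num > std::numeric_limits<std::size_t>::max() / elemSize)
        return false;
    out = elemSize * num;
    return true;
}

// True if the byte range [pos, pos + nBytes) lies inside a buffer of len bytes.
// The comparison uses subtraction, so no intermediate value can wrap.
inline bool range_fits(std::size_t pos, std::size_t nBytes, std::size_t len)
{
    return pos <= len && nBytes <= len - pos;
}
```

With pos = 14 and len = 64, range_fits accepts 50 bytes and rejects 51. It also rejects a request of SIZE_MAX - 5 bytes, which the additive form would accept because 14 + (SIZE_MAX - 5) wraps around to 8.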

The PR also notes that src/avt/IVP/MemStream.h:127 and src/avt/IVP/MemStream.h:171 follow the same pattern and should receive equivalent treatment — specifically the write() path, which needs the analogous check:

// Recommended fix for write() at line 169
template <typename T> inline void MemStream::write(const T *pt, const size_t &num)
{
    size_t nBytes = sizeof(T) * num;
    if (_pos + nBytes > _len)          // ✅ Bounds check needed here too
        EXCEPTION0(ImproperUseException);
    memcpy(&_data[_pos], pt, nBytes);
    _pos += nBytes;
}

Prevention & Best Practices

1. Treat Deserialized Size Fields as Untrusted Input

Any num or nBytes value that originates from a file, network stream, or user-provided data is attacker-controlled. Validate it against the known buffer size before using it in a memory operation.

2. Centralize Buffer Bounds Enforcement

Because MemStream is used as a primitive throughout IVP, fixing the bounds check in read() and write() once protects all callers automatically. This is the right architectural approach: enforce invariants at the lowest level rather than asking every caller to remember to check.
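As an illustration of enforcing the invariant in one place, here is a minimal sketch of a reader whose only copy path is bounds-checked. BoundedReader is a hypothetical stand-in, not the MemStream class, and std::out_of_range stands in for the project's EXCEPTION0(ImproperUseException) macro.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <type_traits>

// Hypothetical stand-in for a MemStream-like reader; names are illustrative.
class BoundedReader
{
public:
    BoundedReader(const unsigned char *data, std::size_t len)
        : data_(data), len_(len), pos_(0) {}

    // Copies num elements of T out of the buffer, or throws before touching
    // memory if the request does not fit in the remaining bytes.
    template <typename T>
    void read(T *out, std::size_t num)
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "memcpy requires a trivially copyable type");
        const std::size_t remaining = len_ - pos_;  // pos_ <= len_ always holds
        if (num > remaining / sizeof(T))            // division cannot wrap
            throw std::out_of_range("read past end of buffer");
        const std::size_t nBytes = sizeof(T) * num;
        if (nBytes != 0)
            std::memcpy(out, data_ + pos_, nBytes);
        pos_ += nBytes;
    }

    std::size_t pos() const { return pos_; }

private:
    const unsigned char *data_;
    std::size_t len_;
    std::size_t pos_;
};
```

Callers never compute offsets themselves: every read goes through the one checked method, and a rejected request leaves the position unchanged.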

3. Use AddressSanitizer During Development and Testing

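Compile with -fsanitize=address during development and CI. MemStream.h is a header, so build the test or application source that includes it rather than the header itself (test_memstream.cpp is a hypothetical file name):

clang++ -fsanitize=address -g -O1 test_memstream.cpp ...

ASan reports an out-of-bounds memcpy access at run time, as soon as a test or fuzz input triggers it, even if no exception is thrown.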

4. Consider `std::span` or Bounded Buffer Wrappers (C++20)

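In modern C++, std::span<T> carries both a pointer and a size, making it harder to accidentally pass a pointer without its bounds. Note that std::span::operator[] and subspan() are not bounds-checked in C++20 (out-of-range use is undefined behavior), so wrapping _data in a std::span keeps the length next to the pointer and makes the explicit check easy to write, but does not replace it.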

5. Fuzz the Deserialization Path

Use a fuzzer (libFuzzer, AFL++) targeting MemStream::read() with randomly mutated size fields. The regression test included in the PR is an excellent starting point for a property-based test suite.
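A libFuzzer harness for this kind of code can be very small. The sketch below does not target MemStream itself, since the class is not reproduced here. Instead it fuzzes parse_record, a hypothetical parser with an assumed record layout (a 4-byte element count followed by that many doubles), to show the shape of the harness. LLVMFuzzerTestOneInput is libFuzzer's standard entry point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical parser standing in for a deserialization path: a 4-byte
// element count followed by that many doubles. Returns false on malformed input.
inline bool parse_record(const std::uint8_t *data, std::size_t size,
                         std::vector<double> &out)
{
    std::uint32_t num = 0;
    if (size < sizeof(num))
        return false;
    std::memcpy(&num, data, sizeof(num));
    const std::size_t remaining = size - sizeof(num);
    if (num > remaining / sizeof(double))  // count is untrusted: check before copying
        return false;
    out.resize(num);
    if (num != 0)
        std::memcpy(out.data(), data + sizeof(num), num * sizeof(double));
    return true;
}

// libFuzzer entry point. Build with: clang++ -g -O1 -fsanitize=fuzzer,address
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t *data, std::size_t size)
{
    std::vector<double> values;
    (void)parse_record(data, size, values);  // rejecting input is fine; reading out of bounds is not
    return 0;
}
```

Built with both the fuzzer and address sanitizers, any input that drives the parser out of bounds shows up as an ASan report rather than as silent corruption.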

Relevant Standards

CWE-125: Out-of-bounds Read — https://cwe.mitre.org/data/definitions/125.html
CWE-787: Out-of-bounds Write — https://cwe.mitre.org/data/definitions/787.html
CWE-119: Improper Restriction of Operations within the Bounds of a Memory Buffer — https://cwe.mitre.org/data/definitions/119.html
OWASP Memory Safety: https://cheatsheetseries.owasp.org/cheatsheets/Memory_Management_Cheat_Sheet.html

Key Takeaways

MemStream::read() trusted attacker-controlled num values without a bounds check — a single missing if statement exposed every IVP deserialization path to heap corruption.
_pos + nBytes <= _len is the exact invariant that must hold before any memcpy from _data; the fix enforces it by throwing when _pos + nBytes > _len.
The write() path at line 169 carries the same vulnerability and needs the same treatment — fixing only read() leaves half the attack surface open.
Serialization primitives are high-value targets because a single flaw in a foundational class like MemStream affects every caller throughout the subsystem.
Throwing an exception on bounds violation is the correct response — it surfaces malformed input explicitly rather than silently producing undefined behavior.

How Orbis AppSec Detected This

Source: Attacker-controlled size fields (num) embedded in serialized integral curve data fed into MemStream::read()
Sink: memcpy(pt, &_data[_pos], nBytes) at src/avt/IVP/MemStream.h:125, called with an unchecked nBytes derived from the untrusted num parameter
Missing control: No validation that _pos + nBytes <= _len before the memcpy executes
CWE: CWE-125 (Out-of-bounds Read) and CWE-787 (Out-of-bounds Write)
Fix: Added if (_pos + nBytes > _len) EXCEPTION0(ImproperUseException); immediately before the memcpy call in MemStream::read()

Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.

Conclusion

The MemStream buffer overflow is a textbook example of why serialization code deserves the same security scrutiny as network-facing code. The vulnerability was subtle — nBytes looks like an innocuous computed value, but it ultimately derives from data in the stream, which is attacker-controlled. A single missing bounds check before memcpy turned a trusted internal primitive into a potential heap corruption vector.

The fix is equally concise: one if statement and one exception throw. But its impact is broad, because MemStream underpins the entire IVP serialization subsystem. This is the power of fixing security invariants at the right abstraction level — protect the primitive, and every caller inherits the protection.

When writing C++ serialization code, make it a habit: every memcpy from a bounded buffer must be preceded by a bounds check. Treat size fields from external data the same way you'd treat user input in a web application — validate before use.

cwe	CWE-125 (Out-of-bounds Read) / CWE-787 (Out-of-bounds Write)
fix	Added `if (_pos + nBytes > _len) EXCEPTION0(ImproperUseException);` before the `memcpy` in `read()`; the same check is recommended for `write()`
risk	Heap corruption, memory disclosure, or crash when processing crafted serialized data
language	C++
root cause	`memcpy` called with `nBytes` derived from untrusted input without checking `_pos + nBytes <= _len`
vulnerability	Buffer Overflow (Out-of-Bounds Read/Write via memcpy)
