What is integer overflow in memory operations?

Integer overflow occurs when arithmetic on an integer size value exceeds the type's maximum representable value and wraps around to a small number. When the wrapped value sizes an allocation while `memcpy()` still copies the original, larger amount, the copy writes past the end of the buffer and corrupts heap structures.

How do you prevent integer overflow in C binary parsing?

Always validate size fields from untrusted input before arithmetic operations. Use safe integer arithmetic functions, check for overflow conditions explicitly, and verify that calculated sizes don't exceed buffer capacities before memory operations.

What CWE is integer overflow in memcpy?

CWE-190 (Integer Overflow or Wraparound) is the primary classification, often chained with CWE-680 (Integer Overflow to Buffer Overflow) when the overflow leads to memory corruption.

Is bounds checking on the destination buffer enough?

No. You must also validate the source size value itself before using it in calculations. An attacker can craft a size value that overflows integer arithmetic, so the small wrapped result passes the destination buffer check while the copy itself still uses the original, much larger amount and corrupts memory.
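To make this concrete, here is a minimal sketch of a destination check that a wrapped product defeats, next to one that rejects the same input. The names `naiveFits` and `checkedFits` and the count-times-element-size layout are hypothetical illustrations, not code from ZSign.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical parser helper: `count` and `elemSize` come from untrusted input.
// The 32-bit product can wrap, so a huge request may look small.
bool naiveFits(uint32_t count, uint32_t elemSize, size_t destCapacity) {
    uint32_t total = count * elemSize;   // wraps modulo 2^32
    return total <= destCapacity;        // passes for wrapped values
}

// Checks the inputs themselves: divide instead of multiply so nothing can wrap.
bool checkedFits(uint32_t count, uint32_t elemSize, size_t destCapacity) {
    if (elemSize == 0) {
        return true;                     // zero bytes always fit
    }
    return count <= destCapacity / elemSize;
}
```

With `count = 0x10000001` and `elemSize = 16`, the naive product wraps to 16 and is accepted for a 64-byte buffer, while the division-based check rejects it.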

Can static analysis detect this vulnerability?

Yes. Static analysis tools can identify `memcpy` calls where the size parameter comes from untrusted input without validation, and can flag potential integer overflow conditions in size calculations.

Integer Overflow & Memory Corruption in Binary Signing: A Deep Dive into CVE-Class Vulnerabilities in ZSign

Introduction

Binary signing tools occupy a uniquely dangerous position in the software supply chain. They frequently run with elevated privileges, they consume untrusted input (the binaries and metadata submitted by users), and their output, a signed binary, is usually treated as authoritative. When a vulnerability exists in a signing pipeline, the blast radius extends far beyond a single application: it can compromise the integrity of every artifact that pipeline touches.

This post examines a critical-severity memory corruption vulnerability (tracked as V-009, mapped to CWE-190: Integer Overflow or Wraparound) discovered in archo.cpp, a core component of the ZSign binary signing pipeline embedded in the Nyxian/LindChain/LiveContainer project. We'll walk through what made this vulnerability dangerous, how it could be exploited, and the principles behind fixing it — along with actionable guidance for developers writing similar low-level C++ code.

Note on severity discrepancy: The original scanner classified this as medium severity based on a surface-level description, but deeper analysis of the code path — chained integer overflows feeding unchecked memcpy calls — elevated the confirmed severity to critical. This is a common and important reminder: automated scanners provide a starting point, not a final verdict.

What Is This Vulnerability?

At its core, this is a chained memory corruption vulnerability rooted in two compounding weaknesses:

1. Unvalidated size fields from binary headers: The Mach-O binary format contains header fields (like segment sizes, section offsets, and load command lengths) that specify how much data to read or copy. If these fields are consumed directly without bounds checking, an attacker controls the parameters of memory operations.
2. Integer overflow in size calculations: When size values approach the maximum representable integer (SIZE_MAX for size_t, or UINT32_MAX for 32-bit fields), arithmetic operations like addition or multiplication can silently wrap around to a small value, causing subsequent allocations to be far too small for the data they receive.

When these two weaknesses combine — a wrapped-around allocation feeding an unchecked memcpy — the result is a classic heap buffer overflow, where attacker-controlled data is written beyond the bounds of an allocated buffer.
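The wraparound itself is easy to reproduce. The following sketch (the function name `uncheckedTotal` is made up for illustration) performs the same kind of unchecked 32-bit addition a parser might use to add header overhead to a size field.

```cpp
#include <cstdint>

// Adds a fixed overhead to a 32-bit size the way unchecked parser code might.
// Unsigned arithmetic is defined to wrap modulo 2^32, so no error is raised.
uint32_t uncheckedTotal(uint32_t dataSize, uint32_t overhead) {
    return dataSize + overhead;
}
```

Calling `uncheckedTotal(0xFFFFFFFF, 16)` returns 15: the true sum, 0x10000000F, does not fit in 32 bits, and only the low 32 bits survive. An allocation sized from that result would be 15 bytes long.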

The Vulnerability Explained

Mach-O Parsing: Trusting What You Shouldn't

Mach-O is Apple's binary format for executables, object files, and libraries. It's a structured format with a header, followed by a series of load commands; segment load commands describe the regions of the file that are mapped into memory. A segment load command contains fields like:

struct segment_command_64 {
    uint32_t    cmd;        // LC_SEGMENT_64
    uint32_t    cmdsize;    // size of this command
    char        segname[16];
    uint64_t    vmaddr;
    uint64_t    vmsize;
    uint64_t    fileoff;
    uint64_t    filesize;   // <-- attacker-controlled
    // ...
};

The filesize and cmdsize fields are read directly from the binary file. In a legitimate binary, these values are set by a linker and are consistent with the actual file contents. In a crafted malicious binary, these values can be set to anything — including values designed to trigger overflows or out-of-bounds writes.
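A parser usually meets `cmdsize` first, while stepping from one load command to the next. The sketch below shows one way to walk the command list so that an inconsistent `cmdsize` is rejected instead of followed. `LoadCommandPrefix` and `walkLoadCommands` are hypothetical names rather than ZSign's API, and the code assumes the file's byte order matches the host's.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Minimal stand-in for the generic load command prefix (cmd + cmdsize).
struct LoadCommandPrefix {
    uint32_t cmd;
    uint32_t cmdsize;
};

// Walks `ncmds` load commands stored in [cmds, cmds + cmdsLen).
// Returns false as soon as any cmdsize is inconsistent with the bytes remaining.
bool walkLoadCommands(const uint8_t* cmds, size_t cmdsLen, uint32_t ncmds) {
    size_t offset = 0;
    for (uint32_t i = 0; i < ncmds; ++i) {
        size_t remaining = cmdsLen - offset;          // offset <= cmdsLen holds on every iteration
        if (remaining < sizeof(LoadCommandPrefix)) {
            return false;                             // not enough bytes for the prefix
        }
        LoadCommandPrefix lc;
        std::memcpy(&lc, cmds + offset, sizeof(lc));  // copy out: input may be unaligned
        if (lc.cmdsize < sizeof(LoadCommandPrefix) || lc.cmdsize > remaining) {
            return false;                             // too small to advance, or runs past the end
        }
        offset += lc.cmdsize;                         // cannot exceed cmdsLen after the check
    }
    return true;
}
```

Comparing `cmdsize` against the bytes remaining, rather than adding it to the offset first, keeps the check itself free of wraparound.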

The Integer Overflow Path (CWE-190)

Consider a simplified version of vulnerable parsing logic:

// VULNERABLE: Before the fix
void processSegment(const uint8_t* fileData, segment_command_64* seg) {
    // filesize comes directly from the binary header — attacker controlled
    uint32_t dataSize = seg->filesize;  // Truncation: uint64_t -> uint32_t

    // If dataSize is 0xFFFFFFFF and we add any header overhead...
    uint32_t totalSize = dataSize + sizeof(SomeHeader);  // OVERFLOW: wraps to small value

    // Allocate based on the wrapped (tiny) size
    uint8_t* buffer = (uint8_t*)malloc(totalSize);  // Allocates e.g. 15 bytes if sizeof(SomeHeader) is 16

    // Copy based on the ORIGINAL large size: heap overflow!
    memcpy(buffer, fileData + seg->fileoff, dataSize);  // Writes ~4 GiB into that tiny buffer
}

This is the canonical integer overflow → heap overflow chain. The allocation is too small because the size calculation wrapped around, but the memcpy faithfully copies the full (large) amount, writing far beyond the allocated buffer.
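The example above also narrows `filesize` from 64 to 32 bits without looking at the high bits, which is a second way for a large value to turn into a small one. A checked narrowing helper along these lines (the name `narrowToU32` is hypothetical) makes that step explicit:

```cpp
#include <cstdint>
#include <limits>

// Converts a 64-bit header field to 32 bits only if no bits would be lost.
bool narrowToU32(uint64_t value, uint32_t* out) {
    if (value > std::numeric_limits<uint32_t>::max()) {
        return false;   // would truncate: reject instead of silently dropping high bits
    }
    *out = static_cast<uint32_t>(value);
    return true;
}
```

A plain cast would turn a `filesize` of 0x100000010 into 0x10; the helper reports failure for that input.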

The JSON Metadata Angle

The vulnerability description notes that json.cpp is also part of the attack surface. ZSign accepts JSON metadata alongside the binary (for specifying entitlements, bundle identifiers, etc.). If JSON-supplied numeric values — like buffer sizes, array lengths, or offsets — are used in similar arithmetic without validation, they represent a second injection point for the same class of overflow.

An attacker doesn't need to exploit both simultaneously; either path alone may be sufficient. But the existence of both dramatically lowers the bar for exploitation.
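The same discipline applies to numbers that arrive as text. How ZSign's json.cpp exposes numeric values is not shown here, so the following is only a sketch under the assumption that the numeric token is available as a string. `parseBoundedSize` and its `maxAllowed` policy limit are hypothetical names, and the code needs C++17 for `std::from_chars`.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Parses a decimal size from untrusted text and enforces an upper bound.
// `maxAllowed` is a caller-chosen policy limit (hypothetical), not a ZSign setting.
bool parseBoundedSize(std::string_view text, size_t maxAllowed, size_t* out) {
    size_t value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc() || ptr != last) {
        return false;   // not a number, out of range for size_t, or trailing garbage
    }
    if (value > maxAllowed) {
        return false;   // syntactically valid but larger than policy allows
    }
    *out = value;
    return true;
}
```

Because `std::from_chars` reports out-of-range input instead of wrapping, and does not accept a minus sign for unsigned types, inputs such as `"-1"` or a 30-digit number are rejected before they reach any size arithmetic.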

Real-World Attack Scenario

Here's how an attacker with access to the signing endpoint could weaponize this:

1. Attacker crafts a Mach-O binary with a segment_command_64 where:
      filesize  = 0xFFFFFFFF (or a value that causes overflow when added to header size)
      fileoff   = 0 (points to beginning of file)

2. Attacker submits this binary to the signing pipeline.

3. archo.cpp parses the binary, reads the attacker-controlled filesize,
   performs unchecked arithmetic → integer overflow → tiny allocation.

4. memcpy writes attacker-controlled data far beyond the heap buffer.

5. Heap metadata or adjacent objects are corrupted. With a carefully
   crafted binary, this can be turned into:
      - Arbitrary write primitive
      - Control flow hijack (overwriting a function pointer)
      - Code execution at the privilege level of the signing process

The signing process likely runs with code-signing entitlements or elevated privileges, making this a high-value target. Code execution here could mean:
- Signing malicious binaries as if they were legitimate
- Pivoting to other parts of the build infrastructure
- Exfiltrating signing certificates or private keys

The Fix

The fix centers on three defensive principles applied at every point where external data feeds into size calculations or memory operations:

1. Validate Before You Calculate

Before using any field from a binary header in arithmetic, validate that it falls within acceptable bounds:

// FIXED: Validate header fields before use
bool processSegment(const uint8_t* fileData, size_t fileSize, 
                    segment_command_64* seg) {
    // Reject offsets that start beyond the end of the file
    if (seg->fileoff > fileSize) {
        logError("Segment fileoff exceeds file bounds");
        return false;
    }

    // Compare against the remaining bytes; the subtraction cannot wrap,
    // whereas fileoff + filesize could
    if (seg->filesize > fileSize - seg->fileoff) {
        logError("Segment extends beyond file bounds");
        return false;
    }

    // Now safe to use
    size_t dataSize = (size_t)seg->filesize;
    // ...
}

2. Check for Overflow Before It Happens

Use explicit overflow-safe arithmetic rather than relying on the result being "obviously wrong":

// FIXED: Overflow-safe size calculation
bool safeSizeAdd(size_t a, size_t b, size_t* result) {
    if (b > SIZE_MAX - a) {
        return false;  // Would overflow
    }
    *result = a + b;
    return true;
}

// Usage
size_t totalSize;
if (!safeSizeAdd(dataSize, sizeof(SomeHeader), &totalSize)) {
    logError("Integer overflow in size calculation");
    return false;
}

uint8_t* buffer = (uint8_t*)malloc(totalSize);
if (!buffer) { return false; }

memcpy(buffer, fileData + seg->fileoff, dataSize);  // Safe: dataSize validated above

With GCC or Clang, you can also use the compiler's built-in overflow checks:

// GCC/Clang built-ins for overflow detection
size_t totalSize;
if (__builtin_add_overflow(dataSize, sizeof(SomeHeader), &totalSize)) {
    return false;  // Overflow detected
}
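Multiplication deserves the same treatment, since table sizes are often computed as a count times an element size. As a sketch (the function name and parameters are hypothetical, and the intrinsics are specific to GCC and Clang), the two built-ins can be chained:

```cpp
#include <cstddef>

// Computes count * elemSize + headerSize, reporting failure on any wraparound.
// Relies on the GCC/Clang __builtin_*_overflow intrinsics.
bool checkedTableSize(size_t count, size_t elemSize, size_t headerSize, size_t* out) {
    size_t body = 0;
    if (__builtin_mul_overflow(count, elemSize, &body)) {
        return false;   // count * elemSize does not fit in size_t
    }
    if (__builtin_add_overflow(body, headerSize, out)) {
        return false;   // adding the header wrapped
    }
    return true;
}
```

When the function returns false, `*out` may hold a wrapped value and should be ignored.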

3. Prefer Bounded Memory Operations

Where possible, replace unbounded operations with their bounded counterparts:

// Instead of:
memcpy(dest, src, untrustedSize);

// Prefer:
if (untrustedSize > destCapacity) {
    return false;  // Explicit bounds check
}
memcpy(dest, src, untrustedSize);  // Now safe

// Or use safer wrappers:
memcpy_s(dest, destCapacity, src, untrustedSize);  // MSVC / C11 Annex K (optional; many libcs, including glibc, omit it)
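Where `memcpy_s` is not available, a small project-local wrapper can serve the same purpose and also check the source side, which the destination-only check above leaves open. The following is one possible shape, with `boundedCopy` and its parameters being hypothetical names:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `n` bytes from src[srcOff..] into dest only if both sides can hold them.
bool boundedCopy(uint8_t* dest, size_t destCapacity,
                 const uint8_t* src, size_t srcLen, size_t srcOff, size_t n) {
    if (srcOff > srcLen || n > srcLen - srcOff) {
        return false;   // would read past the end of the source
    }
    if (n > destCapacity) {
        return false;   // would write past the end of the destination
    }
    if (n != 0) {
        std::memcpy(dest, src + srcOff, n);   // skipped for n == 0 so null pointers are never passed
    }
    return true;
}
```

Routing every copy from file data through one such function gives reviewers and static analysis a single place to audit.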

How the Fix Solves the Problem

By introducing validation gates at the point where binary header data enters the processing pipeline, the fix ensures that:

- No attacker-controlled value can flow into malloc() or memcpy() without being checked against the actual file size and known-safe bounds.
- Integer overflow is detected before it produces a dangerously small allocation.
- The attack chain is broken at its first link: a crafted binary is rejected early in parsing rather than triggering memory corruption deep in the call stack.

Prevention & Best Practices

For C/C++ Binary Parsers

Always treat parsed data as untrusted. Even if your tool is "internal" or "only used by developers," the input it consumes (binary files, JSON, configuration) may be attacker-controlled. Apply the same skepticism you'd apply to web form input.

Establish a validation layer at the parsing boundary. Create explicit functions whose sole job is to validate that header fields are internally consistent and within file bounds. Call these before any memory allocation or copy operation.

// Good pattern: Explicit validation function
struct ValidationResult {
    bool valid;
    const char* error;
};

ValidationResult validateSegmentCommand(
    const segment_command_64* seg,
    size_t fileSize
) {
    if (seg->cmdsize < sizeof(segment_command_64))
        return {false, "cmdsize too small for segment_command_64"};

    if (seg->fileoff > fileSize)
        return {false, "fileoff beyond end of file"};

    // Check for overflow before addition
    if (seg->filesize > fileSize - seg->fileoff)
        return {false, "segment extends beyond end of file"};

    return {true, nullptr};
}

Use integer-overflow-safe libraries. Consider SafeInt (C++) or compiler sanitizers during development:

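# Build with AddressSanitizer + UndefinedBehaviorSanitizer during development.
# Unsigned wraparound is defined behavior, so Clang needs the extra
# unsigned-integer-overflow check to report it.
clang++ -fsanitize=address,undefined,unsigned-integer-overflow -g archo.cpp -o archo_debug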

Fuzz your parser. Binary format parsers are excellent targets for fuzzing. Tools like libFuzzer or AFL++ can automatically generate malformed inputs and identify crash-inducing cases:

// Minimal libFuzzer harness for Mach-O parsing
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    // Feed data to your parser; any crash = potential vulnerability
    parseMachO(data, size);
    return 0;
}

Security Standards & References

Standard	Reference	Relevance
CWE-190	Integer Overflow or Wraparound	Root cause of this vulnerability
CWE-122	Heap-based Buffer Overflow	Resulting vulnerability class
CWE-20	Improper Input Validation	Underlying weakness
OWASP	Input Validation Cheat Sheet	Validation guidance
SEI CERT C	INT30-C	Unsigned integer wrap prevention
SEI CERT C	MEM35-C	Sufficient memory allocation

Tooling Recommendations

Tool	Purpose	When to Use
AddressSanitizer (ASan)	Detects heap overflows at runtime	Development & CI
UBSan	Detects signed integer overflow at runtime (unsigned wrap needs Clang's -fsanitize=unsigned-integer-overflow)	Development & CI
Valgrind	Memory error detection	Development
AFL++ / libFuzzer	Automated crash discovery	Pre-release testing
CodeQL	Static analysis for overflow patterns	CI/CD pipeline
Semgrep	Custom rules for unsafe patterns	Code review automation

A Note on Severity Assessment

This vulnerability was initially tagged as medium severity based on a surface-level description involving PDF.js rendering (likely a metadata mismatch in the tracking system). Deeper analysis of the actual code — unchecked memcpy fed by header-derived sizes, with integer overflow in the size calculation — correctly elevates this to critical.

This discrepancy is a useful reminder: automated severity scores are starting points, not conclusions. Always review the actual code path, consider the privilege context of the vulnerable process, and assess exploitability based on realistic attacker access. A "medium" in a privileged signing pipeline may be more dangerous than a "critical" in a sandboxed read-only viewer.

Conclusion

The V-009 vulnerability in archo.cpp is a textbook example of how low-level binary parsing, when combined with insufficient input validation, creates a critical attack surface. The chain — attacker-controlled header field → unchecked arithmetic → integer overflow → undersized allocation → heap overflow via memcpy — is well-understood, well-documented in CWE and CERT standards, and yet continues to appear in production code.

The fix is conceptually straightforward: validate before you calculate, check for overflow before you allocate, and never trust size fields from external data sources. What makes this hard in practice is discipline — maintaining validation rigor across every code path in a complex parser, especially as the codebase evolves.

Key takeaways:

🔒 Binary parsers are high-value attack surfaces — treat every header field as attacker-controlled
➕ Integer overflow is silent and dangerous — use explicit overflow checks before size arithmetic
🧪 Fuzz your parsers — automated fuzzing finds these issues faster than manual review
🛡️ Build with sanitizers — ASan and UBSan catch these bugs at runtime during development
📊 Don't trust automated severity scores blindly — always assess in context

Secure coding is not a one-time activity. It's a habit built through tools, code review practices, and a healthy skepticism toward any data that crosses a trust boundary. Keep validating, keep fuzzing, and keep the signing pipeline clean.

This post was generated as part of an automated security disclosure and education initiative by OrbisAI Security. The vulnerability described has been patched in the referenced pull request.

cwe	CWE-190 (Integer Overflow), CWE-680 (Integer Overflow to Buffer Overflow), CWE-120 (Buffer Copy without Checking Size)
fix	Implement strict input validation and integer overflow checks before all memory operations
risk	Arbitrary code execution at the privilege level of the signing process
language	C
root cause	Unvalidated size fields from Mach-O headers used directly in memcpy operations without bounds checking
vulnerability	Integer Overflow and Memory Corruption in Binary Header Processing
