Integer Overflow & Memory Corruption in Binary Signing: A Deep Dive into CVE-Class Vulnerabilities in ZSign
Introduction
Binary signing tools occupy a uniquely dangerous position in the software supply chain. They run with elevated privileges, they consume untrusted input (the binaries and metadata submitted by users), and they are often trusted implicitly because their output — a signed binary — is treated as authoritative. When a vulnerability exists in a signing pipeline, the blast radius extends far beyond a single application: it can compromise the integrity of every artifact that pipeline touches.
This post examines a critical-severity memory corruption vulnerability (tracked as V-009, mapped to CWE-190: Integer Overflow or Wraparound) discovered in archo.cpp, a core component of the ZSign binary signing pipeline embedded in the Nyxian/LindChain/LiveContainer project. We'll walk through what made this vulnerability dangerous, how it could be exploited, and the principles behind fixing it — along with actionable guidance for developers writing similar low-level C++ code.
Note on severity discrepancy: The original scanner classified this as medium severity based on a surface-level description, but deeper analysis of the code path — chained integer overflows feeding unchecked memcpy calls — elevated the confirmed severity to critical. This is a common and important reminder: automated scanners provide a starting point, not a final verdict.
What Is This Vulnerability?
At its core, this is a chained memory corruption vulnerability rooted in two compounding weaknesses:
- Unvalidated size fields from binary headers — The Mach-O binary format contains header fields (like segment sizes, section offsets, and load command lengths) that specify how much data to read or copy. If these fields are consumed directly without bounds checking, an attacker controls the parameters of memory operations.
- Integer overflow in size calculations — When size values approach the maximum representable integer (SIZE_MAX for size_t, or UINT32_MAX for 32-bit fields), arithmetic operations like addition or multiplication can silently wrap around to a small value, causing subsequent allocations to be far too small for the data they receive.
When these two weaknesses combine — a wrapped-around allocation feeding an unchecked memcpy — the result is a classic heap buffer overflow, where attacker-controlled data is written beyond the bounds of an allocated buffer.
The Vulnerability Explained
Mach-O Parsing: Trusting What You Shouldn't
Mach-O is Apple's binary format for executables, object files, and libraries. It's a structured format with a header, followed by a series of load commands, each describing segments of the binary. A typical load command contains fields like:
```cpp
struct segment_command_64 {
    uint32_t cmd;           // LC_SEGMENT_64
    uint32_t cmdsize;       // size of this command
    char     segname[16];
    uint64_t vmaddr;
    uint64_t vmsize;
    uint64_t fileoff;
    uint64_t filesize;      // <-- attacker-controlled
    // ...
};
```
The filesize and cmdsize fields are read directly from the binary file. In a legitimate binary, these values are set by a linker and are consistent with the actual file contents. In a crafted malicious binary, these values can be set to anything — including values designed to trigger overflows or out-of-bounds writes.
The Integer Overflow Path (CWE-190)
Consider a simplified version of vulnerable parsing logic:
```cpp
// VULNERABLE: Before the fix
void processSegment(const uint8_t* fileData, segment_command_64* seg) {
    // filesize comes directly from the binary header — attacker controlled
    uint32_t dataSize = seg->filesize;                  // Truncation: uint64_t -> uint32_t

    // If dataSize is 0xFFFFFFFF and we add any header overhead...
    uint32_t totalSize = dataSize + sizeof(SomeHeader); // OVERFLOW: wraps to small value

    // Allocate based on the wrapped (tiny) size
    uint8_t* buffer = (uint8_t*)malloc(totalSize);      // Allocates e.g. 16 bytes

    // Copy based on the ORIGINAL large size — heap overflow!
    memcpy(buffer, fileData + seg->fileoff, dataSize);  // Writes ~4 GB into 16 bytes
}
```
This is the canonical integer overflow → heap overflow chain. The allocation is too small because the size calculation wrapped around, but the memcpy faithfully copies the full (large) amount, writing far beyond the allocated buffer.
The JSON Metadata Angle
The vulnerability description notes that json.cpp is also part of the attack surface. ZSign accepts JSON metadata alongside the binary (for specifying entitlements, bundle identifiers, etc.). If JSON-supplied numeric values — like buffer sizes, array lengths, or offsets — are used in similar arithmetic without validation, they represent a second injection point for the same class of overflow.
An attacker doesn't need to exploit both simultaneously; either path alone may be sufficient. But the existence of both dramatically lowers the bar for exploitation.
Real-World Attack Scenario
Here's how an attacker with access to the signing endpoint could weaponize this:
1. Attacker crafts a Mach-O binary with a segment_command_64 where:
   - filesize = 0xFFFFFFFF (or a value that causes overflow when added to header size)
   - fileoff = 0 (points to beginning of file)
2. Attacker submits this binary to the signing pipeline.
3. archo.cpp parses the binary, reads the attacker-controlled filesize, performs unchecked arithmetic → integer overflow → tiny allocation.
4. memcpy writes attacker-controlled data far beyond the heap buffer.
5. Heap metadata or adjacent objects are corrupted. With a carefully crafted binary, this can be turned into:
   - Arbitrary write primitive
   - Control flow hijack (overwriting a function pointer)
   - Code execution at the privilege level of the signing process
The signing process likely runs with code-signing entitlements or elevated privileges, making this a high-value target. Code execution here could mean:
- Signing malicious binaries as if they were legitimate
- Pivoting to other parts of the build infrastructure
- Exfiltrating signing certificates or private keys
The Fix
The fix centers on three defensive principles applied at every point where external data feeds into size calculations or memory operations:
1. Validate Before You Calculate
Before using any field from a binary header in arithmetic, validate that it falls within acceptable bounds:
```cpp
// FIXED: Validate header fields before use
bool processSegment(const uint8_t* fileData, size_t fileSize,
                    segment_command_64* seg) {
    // Reject obviously malicious values
    if (seg->fileoff > fileSize) {
        logError("Segment fileoff beyond file bounds");
        return false;
    }
    // Subtraction instead of addition: the check itself cannot overflow
    if (seg->filesize > fileSize - seg->fileoff) {
        logError("Segment extends beyond file bounds");
        return false;
    }
    // Now safe to use
    size_t dataSize = (size_t)seg->filesize;
    // ...
}
```
2. Check for Overflow Before It Happens
Use explicit overflow-safe arithmetic rather than relying on the result being "obviously wrong":
```cpp
// FIXED: Overflow-safe size calculation
bool safeSizeAdd(size_t a, size_t b, size_t* result) {
    if (b > SIZE_MAX - a) {
        return false; // Would overflow
    }
    *result = a + b;
    return true;
}

// Usage
size_t totalSize;
if (!safeSizeAdd(dataSize, sizeof(SomeHeader), &totalSize)) {
    logError("Integer overflow in size calculation");
    return false;
}
uint8_t* buffer = (uint8_t*)malloc(totalSize);
if (!buffer) { return false; }
memcpy(buffer, fileData + seg->fileoff, dataSize); // Safe: dataSize validated above
```
On modern compilers, you can also use built-in overflow checks:
```cpp
// GCC/Clang built-ins for overflow detection
size_t totalSize;
if (__builtin_add_overflow(dataSize, sizeof(SomeHeader), &totalSize)) {
    return false; // Overflow detected
}
```
3. Prefer Bounded Memory Operations
Where possible, replace unbounded operations with their bounded counterparts:
```cpp
// Instead of:
memcpy(dest, src, untrustedSize);

// Prefer:
if (untrustedSize > destCapacity) {
    return false; // Explicit bounds check
}
memcpy(dest, src, untrustedSize); // Now safe

// Or use safer wrappers:
memcpy_s(dest, destCapacity, src, untrustedSize); // MSVC / C11 Annex K
```
How the Fix Solves the Problem
By introducing validation gates at the point where binary header data enters the processing pipeline, the fix ensures that:
- No attacker-controlled value can flow into malloc() or memcpy() without being checked against the actual file size and known-safe bounds.
- Integer overflow is detected before it produces a dangerously small allocation.
- The attack chain is broken at its first link — a crafted binary is rejected early in parsing rather than triggering memory corruption deep in the call stack.
Prevention & Best Practices
For C/C++ Binary Parsers
Always treat parsed data as untrusted. Even if your tool is "internal" or "only used by developers," the input it consumes (binary files, JSON, configuration) may be attacker-controlled. Apply the same skepticism you'd apply to web form input.
Establish a validation layer at the parsing boundary. Create explicit functions whose sole job is to validate that header fields are internally consistent and within file bounds. Call these before any memory allocation or copy operation.
```cpp
// Good pattern: Explicit validation function
struct ValidationResult {
    bool valid;
    const char* error;
};

ValidationResult validateSegmentCommand(
    const segment_command_64* seg,
    size_t fileSize
) {
    if (seg->cmdsize < sizeof(segment_command_64))
        return {false, "cmdsize too small for segment_command_64"};
    if (seg->fileoff > fileSize)
        return {false, "fileoff beyond end of file"};
    // Check for overflow before addition
    if (seg->filesize > fileSize - seg->fileoff)
        return {false, "segment extends beyond end of file"};
    return {true, nullptr};
}
```
Use integer-overflow-safe libraries. Consider SafeInt (C++) or compiler sanitizers during development:
```sh
# Build with AddressSanitizer + UndefinedBehaviorSanitizer during development
clang++ -fsanitize=address,undefined -g archo.cpp -o archo_debug
```
Fuzz your parser. Binary format parsers are excellent targets for fuzzing. Tools like libFuzzer or AFL++ can automatically generate malformed inputs and identify crash-inducing cases:
```cpp
// Minimal libFuzzer harness for Mach-O parsing
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    // Feed data to your parser; any crash = potential vulnerability
    parseMachO(data, size);
    return 0;
}
```
Security Standards & References
| Standard | Reference | Relevance |
|---|---|---|
| CWE-190 | Integer Overflow or Wraparound | Root cause of this vulnerability |
| CWE-122 | Heap-based Buffer Overflow | Resulting vulnerability class |
| CWE-20 | Improper Input Validation | Underlying weakness |
| OWASP | Input Validation Cheat Sheet | Validation guidance |
| SEI CERT C | INT30-C | Unsigned integer wrap prevention |
| SEI CERT C | MEM35-C | Sufficient memory allocation |
Tooling Recommendations
| Tool | Purpose | When to Use |
|---|---|---|
| AddressSanitizer (ASan) | Detects heap overflows at runtime | Development & CI |
| UBSan | Detects integer overflow at runtime | Development & CI |
| Valgrind | Memory error detection | Development |
| AFL++ / libFuzzer | Automated crash discovery | Pre-release testing |
| CodeQL | Static analysis for overflow patterns | CI/CD pipeline |
| Semgrep | Custom rules for unsafe patterns | Code review automation |
A Note on Severity Assessment
This vulnerability was initially tagged as medium severity based on a surface-level description involving PDF.js rendering (likely a metadata mismatch in the tracking system). Deeper analysis of the actual code — unchecked memcpy fed by header-derived sizes, with integer overflow in the size calculation — correctly elevates this to critical.
This discrepancy is a useful reminder: automated severity scores are starting points, not conclusions. Always review the actual code path, consider the privilege context of the vulnerable process, and assess exploitability based on realistic attacker access. A "medium" in a privileged signing pipeline may be more dangerous than a "critical" in a sandboxed read-only viewer.
Conclusion
The V-009 vulnerability in archo.cpp is a textbook example of how low-level binary parsing, when combined with insufficient input validation, creates a critical attack surface. The chain — attacker-controlled header field → unchecked arithmetic → integer overflow → undersized allocation → heap overflow via memcpy — is well-understood, well-documented in CWE and CERT standards, and yet continues to appear in production code.
The fix is conceptually straightforward: validate before you calculate, check for overflow before you allocate, and never trust size fields from external data sources. What makes this hard in practice is discipline — maintaining validation rigor across every code path in a complex parser, especially as the codebase evolves.
Key takeaways:
- 🔒 Binary parsers are high-value attack surfaces — treat every header field as attacker-controlled
- ➕ Integer overflow is silent and dangerous — use explicit overflow checks before size arithmetic
- 🧪 Fuzz your parsers — automated fuzzing finds these issues faster than manual review
- 🛡️ Build with sanitizers — ASan and UBSan catch these bugs at runtime during development
- 📊 Don't trust automated severity scores blindly — always assess in context
Secure coding is not a one-time activity. It's a habit built through tools, code review practices, and a healthy skepticism toward any data that crosses a trust boundary. Keep validating, keep fuzzing, and keep the signing pipeline clean.
This post was generated as part of an automated security disclosure and education initiative by OrbisAI Security. The vulnerability described has been patched in the referenced pull request.