Unsafe Dict Merge in Scapy: How __dict__.update() Opens the Door to Object Injection
Introduction
When building networked applications in Python, it's tempting to use convenient shortcuts to populate object attributes from parsed data. One such shortcut—self.__dict__.update(entries)—looks harmless at first glance. After all, it's just copying some keys and values into an object, right?
Wrong. When the source of those keys and values is an untrusted network packet or external input, this single line of code can become a critical security vulnerability. This post breaks down a real-world vulnerability discovered in scapy/scapy_pcp.py, explains how it could be exploited, and walks through what a proper fix looks like.
Whether you're a seasoned security engineer or a developer just beginning to think about secure coding practices, this vulnerability offers a powerful lesson: never merge untrusted data directly into your object's internal namespace.
The Vulnerability Explained
What Is __dict__.update() and Why Is It Dangerous?
In Python, every object has a __dict__ attribute—a dictionary that stores the object's instance attributes. When you write:
```python
self.__dict__.update(entries)
```
You are directly merging every key-value pair from entries into the object's attribute namespace. If entries comes from a trusted, controlled source (like a hardcoded config), this is fine. But if entries is derived from a parsed network packet or any other form of external input, you've just handed an attacker the keys to your object.
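To see the mechanics concretely, here is a minimal sketch (class and attribute names are hypothetical) of how an unfiltered merge lets external keys overwrite anything in an instance's namespace:

```python
class Session:
    def __init__(self):
        self.user = "guest"
        self.is_admin = False

# Simulated attacker-controlled dictionary, e.g. parsed from a packet
entries = {"user": "guest", "is_admin": True, "_token": "forged"}

s = Session()
s.__dict__.update(entries)  # no filtering: every key lands in the object
print(s.is_admin)           # True — attacker flipped an internal flag
print(s._token)             # "forged" — attacker created a private attribute
```

Nothing in `update()` distinguishes a legitimate field from an injected one; it is a raw dictionary merge.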
The Vulnerable Code
At line 37 of scapy/scapy_pcp.py, the vulnerable pattern looked something like this:
```python
# VULNERABLE CODE - Do not use
class PCPMessage:
    def __init__(self, entries):
        # Directly merging externally-supplied dictionary into object namespace
        self.__dict__.update(entries)  # ← Line 37: DANGEROUS
```
At first glance, this seems like a convenient way to initialize an object from parsed packet fields. In practice, it's a wide-open door for attackers.
How Could It Be Exploited?
Python objects have a number of special "dunder" (double-underscore) attributes and methods that control fundamental behavior:
| Attribute | What It Controls |
|---|---|
| `__class__` | The object's type/class |
| `__init__` | The constructor method |
| `__repr__` | String representation |
| `__reduce__` | Pickle serialization behavior |
| `__module__` | The module the class belongs to |
Because self.__dict__.update(entries) performs no filtering whatsoever, an attacker who can craft a malicious packet can inject any of these keys.
Consider a crafted packet payload that, when parsed, produces a dictionary like:
```python
malicious_entries = {
    "opcode": 1,                   # Legitimate field
    "lifetime": 3600,              # Legitimate field
    "__class__": SomeMaliciousClass,                                    # INJECTED
    "__init__": lambda self: exec("import os; os.system('rm -rf /')"),  # INJECTED
    "_internal_state": "corrupted",                                     # INJECTED internal attribute
}
```
When self.__dict__.update(malicious_entries) runs, all of these keys—including the dangerous ones—get written directly into the object.
Real-World Attack Scenario
Imagine a network service that:
1. Listens for incoming PCP (Port Control Protocol) packets
2. Parses each packet using Scapy
3. Creates a PCPMessage object from the parsed fields
4. Passes that object to downstream business logic
An attacker on the network sends a specially crafted packet. The parser extracts fields from it and builds a dictionary. That dictionary gets passed to PCPMessage.__init__(). Because of the unchecked __dict__.update(), the attacker's injected keys overwrite critical object attributes.
Depending on how the application uses the resulting object, this could lead to:
- Object state corruption: Internal counters, flags, or state variables get overwritten with attacker-controlled values
- Method hijacking: Overwriting callable attributes causes the application to execute attacker-supplied logic
- Denial of Service: Injecting oversized payloads or recursive structures exhausts memory/CPU (related to the input size constraints issue noted in V-008)
- Privilege escalation: In some application architectures, corrupting object state can bypass authorization checks
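The method-hijacking case deserves a concrete sketch (class and method names are hypothetical). Because plain functions defined on a class are non-data descriptors, an instance-dict entry of the same name shadows them for explicit attribute access:

```python
class PCPHandler:
    def validate(self):
        return "packet ok"

handler = PCPHandler()

# Attacker-supplied dictionary shadows the class method for this instance:
# handler.validate now resolves to the instance dict entry, not the class method.
malicious = {"validate": lambda: "attacker logic ran"}
handler.__dict__.update(malicious)

print(handler.validate())  # "attacker logic ran" — original method bypassed
```

Note the subtlety: implicit dunder invocation (e.g. `repr(obj)`) looks up methods on the type and skips the instance dict, but any code that calls `obj.method()` explicitly will hit the injected callable first.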
The Fix
What Needs to Change
The core problem is the complete absence of input validation and key filtering. A proper fix must ensure that:
- Only expected keys are accepted — Define an allowlist of valid attribute names
- Dunder attributes are explicitly blocked — Never allow `__`-prefixed keys from external input
- Values are validated — Check types and sizes before assignment
- Unexpected keys are rejected or logged — Don't silently ignore potentially malicious input
The Secure Pattern (After Fix)
Here is what a hardened version of this code should look like:
```python
# SECURE CODE - After fix
class PCPMessage:
    # Explicit allowlist of valid, expected fields
    ALLOWED_FIELDS = frozenset({
        "opcode",
        "lifetime",
        "result_code",
        "protocol",
        "internal_port",
        "external_port",
    })

    # Maximum allowed size for string/bytes fields
    MAX_FIELD_SIZE = 1024  # bytes

    def __init__(self, entries):
        if not isinstance(entries, dict):
            raise TypeError("entries must be a dictionary")

        for key, value in entries.items():
            # Keys must be strings before we can inspect them
            if not isinstance(key, str):
                raise TypeError(f"Field name must be a string, got {type(key).__name__}")

            # Block dunder and private attributes entirely
            if key.startswith("_"):
                raise ValueError(f"Illegal field name rejected: {key!r}")

            # Only accept explicitly allowlisted keys
            if key not in self.ALLOWED_FIELDS:
                raise ValueError(f"Unknown field rejected: {key!r}")

            # Enforce size constraints on string/bytes values
            if isinstance(value, (str, bytes)) and len(value) > self.MAX_FIELD_SIZE:
                raise ValueError(f"Field {key!r} exceeds maximum allowed size")

            # Safe to set — key is validated and allowlisted
            setattr(self, key, value)
```
Why This Fix Works
Let's walk through each defense layer:
1. Allowlist validation (ALLOWED_FIELDS)
Instead of accepting any key that arrives in the dictionary, we define exactly which keys are valid. Anything not on the list is rejected immediately. This is the classic allowlist over blocklist principle—far more robust than trying to enumerate all the bad things to block.
2. Dunder/private key blocking
The key.startswith("_") check ensures that even if someone somehow adds a new dunder attribute to Python in the future, it will still be blocked. Defense in depth.
3. setattr() instead of __dict__.update()
Using setattr(self, key, value) respects Python's attribute setting protocol, including any __setattr__ overrides you might add for additional validation. Direct __dict__ manipulation bypasses these safeguards entirely.
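This difference is easy to demonstrate with a small sketch (the `Guarded` class is hypothetical): a `__setattr__` guard intercepts `setattr()` calls, but a direct `__dict__` write never triggers it.

```python
class Guarded:
    def __setattr__(self, name, value):
        # Reject private/dunder attribute names at the protocol level
        if name.startswith("_"):
            raise ValueError(f"blocked: {name!r}")
        super().__setattr__(name, value)

g = Guarded()
setattr(g, "opcode", 1)          # goes through __setattr__: allowed
try:
    setattr(g, "_secret", "x")   # goes through __setattr__: rejected
except ValueError as e:
    print(e)

g.__dict__.update({"_secret": "x"})  # bypasses __setattr__ entirely
print(g._secret)                     # "x" — the guard never ran
```

Any validation you build into the attribute protocol is worthless if initialization code routes around it via `__dict__`.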
4. Size constraints
Enforcing MAX_FIELD_SIZE addresses the related Denial of Service vector (V-008) where oversized payloads could exhaust server resources during processing.
5. Type checking
isinstance(entries, dict) ensures we fail fast if something unexpected is passed in, rather than producing confusing errors downstream.
Prevention & Best Practices
1. Never Use __dict__.update() with Untrusted Data
This is the cardinal rule. If your data source is a network packet, a user-submitted form, an API request, or any other external input, never pass it directly to __dict__.update().
```python
# ❌ NEVER do this with external data
self.__dict__.update(untrusted_data)

# ✅ Always validate first
for key, value in untrusted_data.items():
    if key in ALLOWED_FIELDS:
        setattr(self, key, value)
```
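The validate-then-set loop can be factored into a small reusable helper; a minimal sketch (the `safe_update` name and field list are hypothetical) that rejects rather than silently skips bad keys:

```python
ALLOWED_FIELDS = frozenset({"opcode", "lifetime", "result_code"})

def safe_update(obj, entries, allowed=ALLOWED_FIELDS):
    """Copy only allowlisted, non-underscore keys onto obj via setattr()."""
    for key, value in entries.items():
        if not isinstance(key, str) or key.startswith("_") or key not in allowed:
            raise ValueError(f"rejected field: {key!r}")
        setattr(obj, key, value)

class Msg:
    pass

m = Msg()
safe_update(m, {"opcode": 1, "lifetime": 3600})
print(m.opcode)  # 1

try:
    safe_update(m, {"__class__": object})
except ValueError as e:
    print(e)  # rejected field: '__class__'
```

Raising on unexpected keys (instead of ignoring them) also gives you a natural hook for logging probe attempts.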
2. Use Data Validation Libraries
Libraries like Pydantic or marshmallow are purpose-built for this problem. They enforce schemas, validate types, and reject unexpected fields automatically:
```python
from pydantic import BaseModel, Field
from typing import Optional

class PCPMessage(BaseModel):
    opcode: int = Field(..., ge=0, le=255)
    lifetime: int = Field(..., ge=0, le=86400)
    result_code: Optional[int] = Field(None, ge=0, le=255)
    internal_port: int = Field(..., ge=0, le=65535)
    external_port: int = Field(..., ge=0, le=65535)

    class Config:
        # Reject any extra fields not defined in the model
        extra = "forbid"
```
With Pydantic, attempting to pass __class__ or any other unexpected key will raise a ValidationError automatically.
3. Apply the Principle of Least Privilege to Data
When parsing network packets, only extract and store the fields your application actually needs. Discard everything else at the parsing stage, before it ever reaches your business logic objects.
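A minimal sketch of this idea (field names are hypothetical): project the parsed dictionary down to exactly the fields downstream logic needs, so injected keys never survive the parsing stage.

```python
NEEDED_FIELDS = ("opcode", "lifetime", "internal_port")

def extract_fields(parsed):
    """Keep only the fields downstream logic needs; drop everything else."""
    return {k: parsed[k] for k in NEEDED_FIELDS if k in parsed}

raw = {"opcode": 1, "lifetime": 3600, "internal_port": 8080,
       "__class__": "junk", "extra_blob": "x" * 10000}
print(extract_fields(raw))
# {'opcode': 1, 'lifetime': 3600, 'internal_port': 8080}
```

Because the projection is an allowlist, anything an attacker adds to the packet is discarded by construction rather than filtered after the fact.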
4. Enforce Input Size Limits Early
Size validation should happen at the network/API boundary, not deep in your business logic:
```python
MAX_PAYLOAD_SIZE = 4096  # bytes

def handle_packet(raw_data: bytes):
    if len(raw_data) > MAX_PAYLOAD_SIZE:
        raise ValueError("Packet exceeds maximum allowed size")
    # ... proceed with parsing
```
5. Use Static Analysis Tools
Several tools can catch this class of vulnerability automatically:
- Bandit — Python security linter that flags dangerous patterns, including `__dict__` manipulation
- Semgrep — Highly configurable static analysis with rules for injection vulnerabilities
- PyLint with security plugins
- Snyk Code — AI-powered SAST that understands context
Run these tools in your CI/CD pipeline so vulnerabilities are caught before they reach production.
6. Know Your CWEs
This vulnerability maps to several well-documented weakness categories:
| CWE | Description |
|---|---|
| CWE-915 | Improperly Controlled Modification of Dynamically-Determined Object Attributes |
| CWE-20 | Improper Input Validation |
| CWE-400 | Uncontrolled Resource Consumption (DoS aspect) |
| CWE-94 | Improper Control of Generation of Code |
Familiarizing yourself with the CWE catalog is an excellent way to recognize vulnerability patterns before you accidentally introduce them.
7. OWASP References
This vulnerability is relevant to several OWASP categories:
- OWASP Top 10 A03:2021 – Injection: Attacker-controlled data influencing program logic
- OWASP Top 10 A04:2021 – Insecure Design: Lack of input validation at design level
- OWASP Top 10 A05:2021 – Security Misconfiguration: Overly permissive data handling
Conclusion
The self.__dict__.update(entries) pattern is a perfect example of how a single convenient line of code can introduce a serious security vulnerability. When entries comes from a network packet—as is the case in Scapy-based applications—you're essentially letting the network tell your object what it is and how it behaves.
The key takeaways from this vulnerability are:
- Treat all external data as hostile until proven otherwise
- Use allowlists, not blocklists, when validating input keys
- Never bypass Python's attribute protocol by writing directly to `__dict__`
- Enforce size constraints early to prevent resource exhaustion
- Use schema validation libraries like Pydantic to make safe-by-default data handling easy
Security vulnerabilities in network parsing code are particularly dangerous because they can often be triggered remotely without authentication. Taking the time to add proper input validation isn't just good practice—in network-facing code, it's essential.
The fix applied here is a great template for any Python code that needs to initialize objects from external data sources. Copy the pattern, adapt the allowlist to your domain, and your code will be significantly more resilient against this class of attack.
Found a security vulnerability in your codebase? Consider integrating automated security scanning into your CI/CD pipeline to catch issues like this before they reach production.