Supply Chain Attack via Unsafe `subprocess` in CI/CD Hooks: How It Works and How It Was Fixed

Introduction

Imagine merging what looks like a routine pull request — a small config tweak, perhaps a new hook script path — only to discover that it silently redirected your CI/CD runner to execute an attacker-controlled binary. No exploit kit required. No zero-day. Just a misconfigured subprocess.run() call and a YAML file that nobody thought to validate.

This is exactly the class of vulnerability that was discovered and patched in graphify/hooks.py. Rated high severity, it represents one of the most dangerous patterns in modern software development: a supply-chain attack vector baked directly into the developer tooling itself.

If you write Python that executes external scripts, runs shell commands, or processes configuration files from repositories — this post is for you.

The Vulnerability Explained

What Happened?

At line 154 of graphify/hooks.py, the application called subprocess.run() to execute external hook scripts. The path to those scripts was read from a configuration file — something like a .graphify/hooks.yaml in the repository root.

Here's the core problem: the script path was never validated.

In simplified terms, the vulnerable code looked something like this:

# VULNERABLE - Do not use
import subprocess
import yaml

def run_hook(config_path: str):
    with open(config_path) as f:
        config = yaml.safe_load(f)

    hook_script = config.get("hook_script")  # User-controlled value!

    # No validation — executes whatever path the config specifies
    subprocess.run([hook_script], check=True)

If an attacker can change hooks.yaml, for example through a pull request that the CI pipeline builds automatically, they can set hook_script to point to any executable on the filesystem:

# Malicious hooks.yaml
hook_script: "/tmp/malicious_payload.sh"

Or even more subtly:

hook_script: "../../.git/hooks/post-merge"

How Could It Be Exploited?

The attack chain is straightforward and alarmingly practical:

1. The attacker forks a repository that uses graphify with hook support enabled.
2. The attacker submits a pull request that modifies .graphify/hooks.yaml to point hook_script at a malicious binary they've also included in the PR (or one already present on the runner).
3. The CI/CD pipeline automatically checks out and processes the PR, as most modern pipelines do for automated testing and linting.
4. graphify reads the hooks config and calls subprocess.run() with the attacker-supplied path.
5. Arbitrary code executes on the CI/CD runner with whatever privileges the pipeline process holds.

From here, the attacker can:
- Exfiltrate secrets, API keys, and environment variables
- Tamper with build artifacts (injecting malicious code into your releases)
- Pivot to internal infrastructure accessible from the runner
- Establish persistence for future attacks

Why Is This Especially Dangerous in CI/CD?

CI/CD runners are high-value targets. They typically have access to:
- Repository secrets (deploy keys, signing certificates, cloud credentials)
- Package registries (the ability to publish releases)
- Internal networks (staging environments, databases, internal APIs)

A single malicious commit that gets processed by an automated pipeline can compromise all of the above — without ever requiring direct access to your infrastructure.

Compromised build pipelines have featured in real-world incidents. The SolarWinds attack (disclosed in 2020) and the Codecov Bash Uploader breach (2021) are high-profile examples of what happens when build pipeline trust is misplaced.

CWE and OWASP Classification

- CWE-78: Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')
- CWE-426: Untrusted Search Path
- OWASP A03:2021 – Injection
- OWASP A08:2021 – Software and Data Integrity Failures (Supply Chain)

The Fix

The patch to graphify/hooks.py introduces path allowlisting — validating the resolved, absolute path of any hook script against a set of permitted directories before passing it to subprocess.run().

Before (Vulnerable Pattern)

# BEFORE: No validation of the hook script path
import subprocess
import yaml

def run_hook(config_path: str):
    with open(config_path) as f:
        config = yaml.safe_load(f)

    hook_script = config.get("hook_script")
    subprocess.run([hook_script], check=True)  # ⚠️ Unvalidated user input

After (Secure Pattern)

# AFTER: Path validated against an allowlist before execution
import subprocess
import yaml
from pathlib import Path

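# Define permitted base directories for hook scripts.
# Anchor them to trusted locations rather than the current working directory:
# in CI the working directory is usually the checkout an attacker can modify.
ALLOWED_HOOK_DIRS = [
    Path("/opt/graphify/hooks").resolve(),
    Path(__file__).resolve().parent / "default_hooks",  # assumes hooks ship inside the package
]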

class HookPathViolation(Exception):  # Python has no built-in SecurityError
    """Raised when a hook script path fails allowlist validation."""
    pass

def _validate_hook_path(script_path: str) -> Path:
    """
    Resolve the script path and verify it falls within an allowed directory.
    Raises HookPathViolation if the path is outside permitted locations.
    """
    resolved = Path(script_path).resolve()  # Resolves symlinks and ".." traversal

    for allowed_dir in ALLOWED_HOOK_DIRS:
        try:
            resolved.relative_to(allowed_dir)  # Raises ValueError if not a subpath
            return resolved
        except ValueError:
            continue

    raise HookPathViolation(
        f"Hook script '{script_path}' (resolved: '{resolved}') is outside "
        f"permitted directories: {ALLOWED_HOOK_DIRS}"
    )

def run_hook(config_path: str):
    with open(config_path) as f:
        config = yaml.safe_load(f)

    raw_hook_script = config.get("hook_script")
    if not raw_hook_script:
        return  # No hook configured, nothing to do

    # Validate before executing — raises if path is not permitted
    safe_script_path = _validate_hook_path(raw_hook_script)

    subprocess.run([str(safe_script_path)], check=True)  # ✅ Safe to execute
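One way to gain confidence in a validator like the one above is to run it against a throwaway directory tree. The following is a minimal, self-contained sketch rather than the project's actual test suite: validate_hook_path and demo are hypothetical names, the allowed directory is a temporary folder, and the symlink case is skipped on platforms where creating symlinks is not permitted.

```python
import os
import tempfile
from pathlib import Path


class HookPathViolation(Exception):
    """Raised when a hook path resolves outside the allowed directories."""


def validate_hook_path(script_path, allowed_dirs):
    """Return the resolved path if it lies inside one of allowed_dirs."""
    resolved = Path(script_path).resolve()  # collapses ".." and follows symlinks
    for allowed in allowed_dirs:
        try:
            resolved.relative_to(allowed)  # ValueError if not inside `allowed`
            return resolved
        except ValueError:
            continue
    raise HookPathViolation(f"{script_path!s} resolves to {resolved}, which is not permitted")


def demo():
    """Build a temporary tree and report which candidate paths are accepted."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        hooks = root / "hooks"
        hooks.mkdir()
        (hooks / "lint.sh").write_text("#!/bin/sh\n")
        (root / "evil.sh").write_text("#!/bin/sh\n")

        candidates = {
            "inside": hooks / "lint.sh",
            "traversal": hooks / ".." / "evil.sh",
        }
        try:
            # A link that lives inside the allowed directory but points outside it.
            os.symlink(root / "evil.sh", hooks / "link.sh")
            candidates["symlink"] = hooks / "link.sh"
        except OSError:
            pass  # symlinks unavailable here; skip that case

        for name, candidate in candidates.items():
            try:
                validate_hook_path(candidate, [hooks])
                results[name] = "accepted"
            except HookPathViolation:
                results[name] = "rejected"
    return results


if __name__ == "__main__":
    print(demo())
```

Only the script that really lives inside the allowed directory should be accepted. The traversal path and the symlink both resolve to a file outside it and are rejected.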

Key Security Improvements

Aspect	Before	After
Path validation	None	Allowlist against permitted directories
Symlink resolution	Not handled	`Path.resolve()` prevents symlink escapes
Directory traversal	Vulnerable to `../../`	Resolved absolute path checked
Error handling	Silent failure	Explicit `HookPathViolation` exception
Auditability	None	Clear, loggable rejection of invalid paths

Why `Path.resolve()` Matters

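A naive check like script_path.startswith("/opt/graphify/hooks") can be bypassed with path traversal, or with a sibling directory that merely shares the prefix:

/opt/graphify/hooks/../../../etc/passwd
/opt/graphify/hooks_evil/payload.sh

Calling Path(script_path).resolve() first collapses all .. components and follows symlinks, giving you the true filesystem path before comparison. Comparing with relative_to() instead of a string prefix then matches whole path components, which closes the second case. Both steps are essential for any path-based security check.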
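The difference between a string-prefix check and a component-wise check can be shown without touching the filesystem. This sketch is illustrative only: it uses posixpath.normpath, which normalises paths lexically and does not follow symlinks, so that the output is deterministic on any machine. Real validation should use Path.resolve() as described above. The function names are hypothetical.

```python
import posixpath
from pathlib import PurePosixPath

ALLOWED = "/opt/graphify/hooks"


def naive_check(path: str) -> bool:
    """String-prefix comparison: looks reasonable, but is bypassable."""
    return path.startswith(ALLOWED)


def normalized_check(path: str) -> bool:
    """Collapse '..' lexically, then compare whole path components."""
    normalized = PurePosixPath(posixpath.normpath(path))
    try:
        normalized.relative_to(ALLOWED)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    for candidate in (
        "/opt/graphify/hooks/lint.sh",              # legitimate
        "/opt/graphify/hooks/../../../etc/passwd",  # traversal
        "/opt/graphify/hooks_evil/payload.sh",      # shared prefix, different directory
    ):
        print(candidate, naive_check(candidate), normalized_check(candidate))
```

The naive check accepts all three candidates, while the normalised, component-wise check accepts only the first.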

Prevention & Best Practices

1. Never Trust Configuration File Contents

Configuration files — especially those committed to a repository — should be treated as untrusted user input. Any value read from them that influences execution (file paths, command names, arguments) must be validated.

# Treat config values like HTTP request parameters: validate everything
hook_script = config.get("hook_script")
if not isinstance(hook_script, str):
    # Use an explicit check: assert statements are stripped when Python runs with -O
    raise TypeError("hook_script must be a string")
safe_path = _validate_hook_path(hook_script)  # Validate before use
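Type and shape checks can be gathered into one small function that runs before any path validation. The sketch below assumes the YAML has already been parsed into a Python object. HookConfigError and extract_hook_script are hypothetical names, not part of graphify.

```python
from typing import Any, Optional


class HookConfigError(ValueError):
    """Raised when the hooks configuration has an unexpected shape."""


def extract_hook_script(config: Any) -> Optional[str]:
    """Return the hook_script value, or None if no hook is configured."""
    if config is None:
        return None  # an empty YAML document parses to None
    if not isinstance(config, dict):
        raise HookConfigError("top-level config must be a mapping")

    value = config.get("hook_script")
    if value is None:
        return None
    if not isinstance(value, str):
        raise HookConfigError("hook_script must be a string")
    if not value.strip() or "\x00" in value:
        raise HookConfigError("hook_script must be a non-empty path without NUL bytes")
    return value


if __name__ == "__main__":
    print(extract_hook_script({"hook_script": "hooks/lint.sh"}))
    try:
        extract_hook_script({"hook_script": ["rm", "-rf", "/"]})
    except HookConfigError as exc:
        print("rejected:", exc)
```

This only checks the shape of the value. The returned string still has to pass the allowlist check before it is executed.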

2. Prefer Allowlists Over Denylists

It's tempting to block "dangerous" paths like /etc/ or /tmp/. Don't. Attackers are creative, and denylists are always incomplete. Instead, define exactly where scripts are permitted to live and reject everything else.

# ❌ Denylist — incomplete and bypassable
BLOCKED_PATHS = ["/etc", "/tmp", "/root"]

# ✅ Allowlist — explicit and safe
ALLOWED_DIRS = [Path("/opt/myapp/hooks").resolve()]
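To make the gap concrete, the sketch below runs a few writable locations that a denylist like the one above does not mention through both policies. It assumes the paths are already resolved, and it uses pure path objects so nothing touches the filesystem. The helper names and the candidate paths are hypothetical.

```python
from pathlib import PurePosixPath

BLOCKED = [PurePosixPath(p) for p in ("/etc", "/tmp", "/root")]
ALLOWED = [PurePosixPath("/opt/myapp/hooks")]


def _is_under(path: PurePosixPath, base: PurePosixPath) -> bool:
    """True if `path` is `base` itself or lies beneath it."""
    try:
        path.relative_to(base)
        return True
    except ValueError:
        return False


def denylist_permits(path: str) -> bool:
    """Permit anything that is not under a blocked directory."""
    candidate = PurePosixPath(path)
    return not any(_is_under(candidate, base) for base in BLOCKED)


def allowlist_permits(path: str) -> bool:
    """Permit only paths under an explicitly allowed directory."""
    candidate = PurePosixPath(path)
    return any(_is_under(candidate, base) for base in ALLOWED)


if __name__ == "__main__":
    for candidate in (
        "/var/tmp/payload.sh",
        "/dev/shm/payload.sh",
        "/home/runner/work/repo/evil.sh",
        "/opt/myapp/hooks/lint.sh",
    ):
        print(candidate, denylist_permits(candidate), allowlist_permits(candidate))
```

The denylist lets every candidate through, including the three attacker-friendly locations. The allowlist accepts only the script under /opt/myapp/hooks.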

3. Apply Principle of Least Privilege to CI/CD Runners

Even with the fix in place, defense-in-depth matters:
- Run CI/CD jobs in ephemeral, isolated containers or VMs
- Use read-only repository checkouts where possible
- Scope secrets to only the jobs that need them
- Enable branch protection rules to require reviews before merging config changes
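Least privilege can also be applied at the process level, inside the tool that launches the hook. As a rough sketch, assuming a POSIX runner: the hook is started with a minimal environment so that secrets exported to the pipeline are not inherited, and with a timeout so that it cannot hang the job. run_hook_isolated and minimal_env are hypothetical names, and the PATH value is an assumption that would need adjusting per runner.

```python
import subprocess
import sys


def minimal_env() -> dict:
    """Environment passed to the hook: no inherited secrets or tokens."""
    return {"PATH": "/usr/bin:/bin", "LANG": "C.UTF-8"}


def run_hook_isolated(argv, timeout_s: float = 60.0) -> subprocess.CompletedProcess:
    """Run an already-validated hook command with a scrubbed environment."""
    return subprocess.run(
        argv,
        env=minimal_env(),    # replaces the parent environment entirely
        timeout=timeout_s,    # raises subprocess.TimeoutExpired if exceeded
        check=True,           # raises CalledProcessError on non-zero exit
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # The child process reports whether it can see a variable set in the parent.
    probe = "import os; print('FAKE_SECRET' in os.environ)"
    print(run_hook_isolated([sys.executable, "-c", probe]).stdout.strip())
```

This complements the runner-level measures in the list and does not replace them: a hook that runs as the same user can still read files the pipeline user can read.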

4. Audit `subprocess` Usage Regularly

Search your codebase for dangerous patterns:

# Find every subprocess call for manual review.
# List-form calls without shell=True still need checking: the bug above was one.
grep -rnE "subprocess\.(run|call|Popen|check_call|check_output)" . --include="*.py"

Better yet, use static analysis tools:
- Bandit (pip install bandit): flags subprocess misuse (rules B603, B604, and B607)
- Semgrep with the python.lang.security.audit.subprocess-shell-true rule
- Safety for dependency vulnerability scanning

# Run Bandit on your project
bandit -r ./graphify -t B603,B604,B607
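Where a text search is too noisy and a full scanner is not yet wired in, a small AST pass can narrow the list to calls whose command is not a literal. This is a minimal sketch under stated limitations, not a replacement for Bandit or Semgrep: it only recognises calls written as subprocess.<name>(...), so names imported with from subprocess import run are missed. The function names are hypothetical.

```python
import ast

SUBPROCESS_FUNCS = {"run", "call", "Popen", "check_call", "check_output"}


def _is_literal_command(node: ast.AST) -> bool:
    """True for a string literal or a list/tuple made only of string literals."""
    if isinstance(node, ast.Constant):
        return isinstance(node.value, str)
    if isinstance(node, (ast.List, ast.Tuple)):
        return all(_is_literal_command(item) for item in node.elts)
    return False


def find_dynamic_subprocess_calls(source: str):
    """Return sorted (line, function) pairs for calls with a non-literal command."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_subprocess_call = (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "subprocess"
            and func.attr in SUBPROCESS_FUNCS
        )
        if not is_subprocess_call:
            continue
        # The command is the first positional argument or the `args` keyword.
        if node.args:
            command = node.args[0]
        else:
            command = next((kw.value for kw in node.keywords if kw.arg == "args"), None)
        if command is not None and not _is_literal_command(command):
            findings.append((node.lineno, func.attr))
    return sorted(findings)


if __name__ == "__main__":
    sample = (
        "import subprocess\n"
        "subprocess.run(['git', 'status'], check=True)\n"
        "subprocess.run([hook_script], check=True)\n"
    )
    print(find_dynamic_subprocess_calls(sample))
```

Each finding is a prompt for review, not proof of a bug: the variable may already have been validated upstream.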

5. Consider Sandboxing Hook Execution

For applications that genuinely need to run user-provided scripts, consider sandboxing:
- Docker containers with limited capabilities and no network access
- seccomp profiles to restrict syscalls
- firejail or bubblewrap for lightweight sandboxing on Linux
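As an illustration of the first option, the sketch below builds a docker run command line that mounts a single hook script read-only into a container with no network, no Linux capabilities, and a read-only root filesystem. It only constructs the argument list and leaves execution to the caller. The image name and the /hook/run mount point are placeholders, and build_sandboxed_command is a hypothetical helper.

```python
from pathlib import Path


def build_sandboxed_command(script, image="example/hook-runner:latest"):
    """Return an argv list that runs `script` inside a locked-down container."""
    script_path = Path(script).resolve()
    return [
        "docker", "run", "--rm",
        "--network", "none",                       # no network access
        "--cap-drop", "ALL",                       # drop every Linux capability
        "--read-only",                             # read-only root filesystem
        "--security-opt", "no-new-privileges",     # block setuid-style escalation
        "--volume", f"{script_path}:/hook/run:ro",  # mount only the hook, read-only
        image,
        "/hook/run",
    ]


if __name__ == "__main__":
    print(" ".join(build_sandboxed_command("/opt/graphify/hooks/lint.sh")))
```

The resulting list can be passed to subprocess.run() without a shell. The script path should still go through allowlist validation first, since the sandbox limits damage but does not decide what is allowed to run.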

6. Security Standards to Reference

- OWASP Command Injection Prevention Cheat Sheet
- CWE-78: OS Command Injection
- SLSA Supply Chain Security Framework
- OpenSSF Scorecard: automated supply chain risk assessment

A Note on the Related Credential Storage Issue

While this post focuses on the subprocess vulnerability (V-003), it's worth briefly acknowledging that a separate critical vulnerability (V-001) was also identified: OAuth tokens and API keys being stored in plaintext on the local filesystem in graphify/extract.py.

Plaintext credential storage is a serious companion risk — if an attacker achieves code execution via the hook injection described above, any plaintext credentials on disk become immediately accessible. Defense-in-depth means fixing both issues: prevent the execution, and encrypt the credentials so that even a successful breach yields less value to the attacker.

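The recommended remediation for V-001 is to encrypt stored credentials with a key derived through a key derivation function such as PBKDF2 (already available in the project's Rust dependencies). PBKDF2 only derives the key; the encryption itself still needs an authenticated cipher.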
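For illustration, the key-derivation step might look like the sketch below. It is written in Python for consistency with the rest of this post, although the note above refers to the project's Rust dependencies, and it covers only deriving a key from a passphrase. The iteration count is an assumption that should follow current guidance, derive_key is a hypothetical helper, and the derived key would then be handed to an authenticated cipher from a vetted cryptography library.

```python
import hashlib
import os


def derive_key(passphrase: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key from a passphrase using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256",
        passphrase.encode("utf-8"),
        salt,
        iterations,
        dklen=32,
    )


if __name__ == "__main__":
    salt = os.urandom(16)  # store the salt alongside the ciphertext; it is not secret
    key = derive_key("correct horse battery staple", salt)
    print(len(key))
```

A fresh random salt per stored credential keeps identical passphrases from producing identical keys.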

Conclusion

The vulnerability fixed in this PR is a textbook example of why input validation must happen at every trust boundary — not just at HTTP endpoints, but anywhere your application reads data that influences execution. A configuration file in a repository is just as dangerous as a form field on a website if its contents are blindly trusted.

The key takeaways:

✅ Always validate file paths read from configuration against an explicit allowlist
✅ Use Path.resolve() before any path comparison to prevent traversal and symlink attacks
✅ Treat CI/CD pipelines as high-value targets — they have access to your most sensitive secrets
✅ Apply defense-in-depth: input validation + least privilege + sandboxing + secret scoping
✅ Automate security scanning with tools like Bandit and Semgrep to catch these patterns before they reach production

Supply chain attacks are not theoretical. They are happening today, at scale, against real organizations. The cost of adding a 10-line path validation function is essentially zero. The cost of not adding it can be catastrophic.

Write the validation. Merge the fix. Sleep better.

This vulnerability was identified and patched by OrbisAI Security. Automated security scanning combined with LLM-assisted code review confirmed both the vulnerability and the effectiveness of the fix.
