Back to Blog
medium SEVERITY8 min read

Shell Script JSON Injection: When printf Becomes a Security Risk

A medium-severity vulnerability was discovered and patched in `scripts/openai_compat_report.sh`, where shell-based JSON construction using `printf` and variable interpolation left API payloads open to injection attacks. Without proper escaping of special characters, attacker-controlled input could malform JSON or silently alter API request semantics. This post breaks down how the vulnerability works, how it was fixed, and what every developer should know about safe JSON construction in shell scr

O
By orbisai0security
May 9, 2026
#shell-scripting#json-injection#bash-security#api-security#input-validation#owasp#secure-coding

Shell Script JSON Injection: When printf Becomes a Security Risk

Vulnerability ID: V-004 | Severity: Medium | File: scripts/openai_compat_report.sh


Introduction

Shell scripts are the duct tape of the software world — quick, powerful, and everywhere. They glue together pipelines, automate deployments, and call APIs. But that convenience comes with a hidden cost: shell scripts have no native understanding of structured data formats like JSON. When developers reach for printf or string interpolation to build JSON payloads, they're essentially constructing a structured document with a blunt instrument.

This is exactly what happened in scripts/openai_compat_report.sh. The script was building JSON payloads destined for an API using shell variable interpolation — and doing so without any escaping or sanitization. The result? A classic JSON injection vulnerability hiding in plain sight inside a shell script.

If you've ever written something like this in a shell script:

payload="{\"message\": \"$user_input\"}"

...then this post is for you.


The Vulnerability Explained

What Is JSON Injection in a Shell Script?

JSON injection occurs when user-controlled or externally-sourced data is embedded into a JSON structure without proper escaping. In compiled languages, developers often use JSON serialization libraries that handle escaping automatically. In shell scripts, there is no such safety net.

The vulnerable code in openai_compat_report.sh was constructing JSON payloads using printf and shell variable interpolation, then writing those payloads to output files used directly as API request bodies. Something conceptually like this:

# ⚠️ VULNERABLE pattern — do not use
printf '{"model": "%s", "prompt": "%s"}' "$MODEL" "$USER_CONTENT" > payload.json
curl -X POST https://api.example.com/v1/completions \
  -d @payload.json

At first glance, this looks harmless. But consider what happens when $USER_CONTENT contains any of the following JSON special characters:

Character JSON Meaning Effect if Unescaped
" String delimiter Terminates the string early
\ Escape character Corrupts the following character
\n Newline Breaks JSON string literals
Control chars (\x00-\x1f) Reserved Produces invalid JSON

How Could It Be Exploited?

Let's walk through a concrete attack scenario.

Imagine $USER_CONTENT is populated from a file, an environment variable, or any external source that an attacker can influence. The attacker supplies the following string:

Hello", "role": "system", "injected": "true

The resulting JSON becomes:

{
  "model": "gpt-4",
  "prompt": "Hello", "role": "system", "injected": "true"
}

This is no longer the intended payload. The attacker has:

  1. Closed the prompt string early with the " character
  2. Injected new JSON fields (role, injected) into the request
  3. Potentially altered the API request semantics — for example, escalating a user message to a system-level instruction in an LLM API call

Even if the attacker can't fully control the outcome, they can reliably break the JSON structure, causing API errors, unexpected behavior, or information leakage through error messages.

What's the Real-World Impact?

The impact varies depending on what the API does with the payload, but the risks include:

  • Semantic injection: Altering the meaning of an API request (e.g., changing a user role to a system role in an LLM API call)
  • Denial of service: Consistently malforming JSON to cause API failures and disrupt dependent workflows
  • Data exfiltration: In some API designs, injected fields can trigger unexpected response content
  • Bypassing application logic: If the API response influences downstream decisions, injected fields could manipulate those decisions

In the context of an OpenAI-compatible API report script, the most concerning scenario is prompt injection — where an attacker-controlled value modifies the system prompt or model parameters, potentially causing the LLM to behave in unintended ways.


The Fix

What Changed?

The fix addresses the root cause: shell variable interpolation into JSON strings without escaping. The solution involves sanitizing or escaping all variables before they are interpolated into JSON payloads.

The safest approach — and the one recommended here — is to use a dedicated JSON-aware tool like jq to construct payloads, rather than building them manually with printf.

Before (Vulnerable):

# ⚠️ No escaping — special characters in variables break JSON
generate_payload() {
  local model="$1"
  local content="$2"
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}' \
    "$model" "$content" > "$OUTPUT_FILE"
}

After (Fixed):

# ✅ Use jq to safely construct JSON with proper escaping
generate_payload() {
  local model="$1"
  local content="$2"
  jq -n \
    --arg model "$model" \
    --arg content "$content" \
    '{
      "model": $model,
      "messages": [{"role": "user", "content": $content}]
    }' > "$OUTPUT_FILE"
}

Why Does jq Solve the Problem?

When you pass a shell variable to jq using --arg, jq treats the value as a raw string and handles all necessary JSON escaping internally. It will:

  • Escape double quotes ("\")
  • Escape backslashes (\\\)
  • Escape newlines and other control characters
  • Produce syntactically valid JSON regardless of input content

This completely eliminates the injection surface. No matter what characters an attacker puts into the input variables, jq will encode them safely.

Alternative: Escaping with printf (If jq Is Unavailable)

If jq is not available in your environment, you can write a sanitization function. However, this approach is error-prone and not recommended for production use:

# ⚠️ Manual escaping — fragile, use jq instead
json_escape() {
  local input="$1"
  # Escape backslashes first, then quotes, then control characters
  printf '%s' "$input" \
    | sed 's/\\/\\\\/g' \
    | sed 's/"/\\"/g' \
    | sed ':a;N;$!ba;s/\n/\\n/g'
}

content=$(json_escape "$USER_CONTENT")
printf '{"model": "%s", "content": "%s"}' "$MODEL" "$content"

Even this approach can miss edge cases (null bytes, other control characters). Always prefer jq or a language with native JSON support.


Prevention & Best Practices

1. Never Construct JSON Manually in Shell Scripts

This is the cardinal rule. Shell scripts lack the type awareness and escaping libraries that make JSON construction safe in other languages. If you need to build JSON in a shell script, always use jq.

# ✅ Safe JSON construction with jq
jq -n --arg key "value with \"quotes\" and \\ backslashes" '{"key": $key}'

2. Validate and Sanitize All External Input

Before any external data touches a JSON payload, validate it:

# Validate that a value is a safe model name (alphanumeric + hyphens only)
validate_model_name() {
  local model="$1"
  if [[ ! "$model" =~ ^[a-zA-Z0-9_-]+$ ]]; then
    echo "ERROR: Invalid model name: $model" >&2
    exit 1
  fi
}

3. Add Resource Limits for File-Based Inputs

As noted in the vulnerability description, the script also lacked size checks when loading files. Always enforce limits:

# Check file size before loading
MAX_FILE_SIZE=$((1 * 1024 * 1024))  # 1MB limit
check_file_size() {
  local file="$1"
  local size
  size=$(stat -c%s "$file" 2>/dev/null || stat -f%z "$file")
  if [[ "$size" -gt "$MAX_FILE_SIZE" ]]; then
    echo "ERROR: File too large ($size bytes). Maximum allowed: $MAX_FILE_SIZE bytes." >&2
    exit 1
  fi
}

4. Validate JSON Structure After Construction

Even after using jq, verify the output is valid JSON before sending it:

# Validate JSON before using it
validate_json() {
  local file="$1"
  if ! jq empty "$file" 2>/dev/null; then
    echo "ERROR: Generated payload is not valid JSON" >&2
    exit 1
  fi
}

5. Use Principle of Least Privilege for API Calls

Ensure the API keys and tokens used in scripts have the minimum required permissions. If an injection does occur, limited permissions reduce the blast radius.

6. Static Analysis and Linting

Use tools to catch these issues before they reach production:

  • ShellCheck — Static analysis for shell scripts. Won't catch JSON injection directly, but flags many common scripting mistakes.
  • Semgrep — Can be configured with custom rules to detect unsafe JSON construction patterns in shell scripts.
  • Manual code review — Look for any printf, echo, or string concatenation that produces JSON with unescaped variables.

7. Relevant Security Standards

This vulnerability maps to several well-known security standards:


A Note on Resource Exhaustion

The vulnerability description also flags a related issue: the script loads entire files into memory without size checks and parses JSON without depth limits. This is a separate but equally important concern.

A maliciously crafted import file with:
- Millions of entries → Memory exhaustion
- Deeply nested JSON (e.g., {"a":{"a":{"a":...}}} repeated thousands of times) → Stack overflow or extreme parse time

These are denial-of-service vectors that can take down the process or the host running the script. The fix should include:

# Limit JSON depth when parsing with jq
jq --argjson max_depth 10 'if (. | path(..) | length) > $max_depth then error("JSON too deeply nested") else . end' input.json

# Or simply enforce a file size limit before parsing
[[ $(wc -c < "$INPUT_FILE") -gt 1048576 ]] && { echo "File too large"; exit 1; }

Conclusion

The vulnerability in openai_compat_report.sh is a great reminder that injection vulnerabilities aren't limited to web applications. Shell scripts that construct structured data formats like JSON are just as susceptible — and often less scrutinized.

The key takeaways from this fix:

  1. Never build JSON with printf and raw variable interpolation in shell scripts
  2. Always use jq --arg to safely pass shell variables into JSON structures
  3. Enforce file size and depth limits before parsing any externally-supplied data
  4. Validate inputs before they enter any structured format
  5. Run static analysis (ShellCheck, Semgrep) on all shell scripts, especially those that handle external data or make API calls

Security vulnerabilities in shell scripts are easy to overlook precisely because shell scripts feel "simple." But simplicity doesn't equal safety. Every time your script touches external data and produces output consumed by another system, you have an injection surface that deserves careful attention.

Secure coding isn't just for application developers — it's for everyone who writes code, including the humble shell script.


This vulnerability was identified and fixed as part of an automated security scanning process. Fix verified by scanner re-scan and LLM code review.

Found a security issue in your codebase? Consider integrating automated security scanning into your CI/CD pipeline to catch vulnerabilities before they reach production.

View the Security Fix

Check out the pull request that fixed this vulnerability

View PR #1055

Related Articles

medium

Fixing NULL Pointer Dereference in eMMC Memory Allocation

A high-severity NULL pointer dereference vulnerability was discovered and fixed in embedded eMMC storage handling code, where unchecked `malloc` and `calloc` return values could allow an attacker with a crafted eMMC image to crash the host process. The fix adds proper NULL checks after every memory allocation, preventing exploitation through maliciously oversized partition size fields. This type of vulnerability is surprisingly common in systems-level C code and serves as a reminder that defensi

medium

Silent Data Destruction: The Hidden Danger in Upload Price Tier Logic

A medium-severity vulnerability in Fastlane's `deliver` tool revealed how a subtle semantic distinction between `nil` and an empty array could silently remove an app from sale in every App Store territory worldwide — with no warning, no confirmation, and a misleading success message to cover its tracks. This post breaks down how the bug worked, why it matters, and what developers can learn about defensive coding with destructive operations.

medium

Slidev Resolver Vulnerability: When Themes Become Trojan Horses

A medium-to-high severity vulnerability was discovered and patched in Slidev's resolver module, where dynamically loaded theme and plugin packages specified in slide frontmatter lacked proper validation, allowing a malicious package name to execute arbitrary code with the developer's full OS privileges. This fix addresses a supply-chain-adjacent attack vector that could allow attackers to exfiltrate credentials or compromise developer machines simply by sharing a crafted markdown presentation fi