Introduction
SQL injection remains one of the most dangerous and prevalent web application vulnerabilities, consistently ranking in OWASP's Top 10 security risks. Despite being well-documented for decades, developers still occasionally fall into the trap of constructing SQL queries using string formatting or concatenation—a practice that opens the door to devastating attacks.
Recently, a medium-severity SQL injection vulnerability was identified and fixed in utils/github_app.py. While the specific code changes weren't detailed in the pull request, the vulnerability type is clear: formatted SQL queries were being used instead of parameterized queries. This blog post will explore why this matters, how SQL injection works, and why parameterized queries are your first line of defense.
The Vulnerability Explained
What is SQL Injection?
SQL injection occurs when an attacker can manipulate SQL queries by injecting malicious SQL code through user-supplied input. This happens when applications construct SQL queries by directly embedding user input into query strings without proper sanitization or parameterization.
The Dangerous Pattern: String Formatting
Here's a typical vulnerable pattern that might have existed in the codebase:
# VULNERABLE CODE - DO NOT USE
def get_user_repos(username):
query = f"SELECT * FROM repositories WHERE owner = '{username}'"
# or using older string formatting:
# query = "SELECT * FROM repositories WHERE owner = '%s'" % username
cursor.execute(query)
return cursor.fetchall()
This looks innocent enough, right? But here's the problem: the application trusts that username contains only a legitimate username.
How Could It Be Exploited?
An attacker could provide a malicious username like:
' OR '1'='1
The resulting query becomes:
SELECT * FROM repositories WHERE owner = '' OR '1'='1'
Since '1'='1' is always true, this query returns all repositories, regardless of ownership. This is a data breach waiting to happen.
But it gets worse. An attacker could escalate to:
'; DROP TABLE repositories; --
Resulting in:
SELECT * FROM repositories WHERE owner = ''; DROP TABLE repositories; --'
This would execute two statements: a harmless SELECT followed by deleting the entire repositories table. The -- comments out the remaining query, preventing syntax errors.
Real-World Impact
SQL injection vulnerabilities can lead to:
- Data breaches: Unauthorized access to sensitive information
- Data manipulation: Modifying or deleting critical data
- Authentication bypass: Logging in as any user without credentials
- Privilege escalation: Gaining administrative access
- Complete system compromise: In some cases, executing operating system commands
According to the OWASP Top 10, injection attacks (including SQL injection) affect approximately 19% of applications tested and can have severe business consequences including regulatory fines, reputation damage, and loss of customer trust.
The Fix: Parameterized Queries
The solution is straightforward: use parameterized queries (also called prepared statements). Here's how the vulnerable code should be rewritten:
Before (Vulnerable):
# VULNERABLE - String formatting
def get_user_repos(username):
query = f"SELECT * FROM repositories WHERE owner = '{username}'"
cursor.execute(query)
return cursor.fetchall()
After (Secure):
# SECURE - Parameterized query
def get_user_repos(username):
query = "SELECT * FROM repositories WHERE owner = ?"
cursor.execute(query, (username,))
return cursor.fetchall()
Or using named parameters (depending on your database library):
# SECURE - Named parameters
def get_user_repos(username):
query = "SELECT * FROM repositories WHERE owner = :username"
cursor.execute(query, {"username": username})
return cursor.fetchall()
How Does This Solve the Problem?
Parameterized queries work by:
- Separating SQL logic from data: The query structure is sent to the database separately from the parameter values
- Treating parameters as data only: The database engine knows that parameter values should be treated as literal data, not executable SQL code
- Automatic escaping: The database driver handles all necessary escaping and quoting
When an attacker tries to inject ' OR '1'='1, the parameterized query treats the entire string as a literal username value. The database looks for a user literally named ' OR '1'='1 (which doesn't exist), rather than executing it as SQL code.
Security Improvement
This fix provides:
- ✅ Complete protection against SQL injection for parameterized values
- ✅ No need for manual escaping or input validation for SQL-specific characters
- ✅ Performance benefits through query plan caching
- ✅ Cleaner, more maintainable code
Prevention & Best Practices
1. Always Use Parameterized Queries
Make parameterized queries your default approach for all database interactions:
# Python with sqlite3
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
# Python with psycopg2 (PostgreSQL)
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
# Python with SQLAlchemy (ORM)
session.query(User).filter(User.id == user_id).all()
2. Use ORMs Carefully
Object-Relational Mappers (ORMs) like SQLAlchemy, Django ORM, or Sequelize generally protect against SQL injection when used correctly. However, be cautious with:
# VULNERABLE - Raw SQL in ORM
User.objects.raw(f"SELECT * FROM users WHERE name = '{name}'")
# SECURE - Parameterized raw SQL
User.objects.raw("SELECT * FROM users WHERE name = %s", [name])
# SECURE - ORM methods
User.objects.filter(name=name)
3. Validate and Sanitize Input
While parameterized queries protect against SQL injection, you should still validate input for business logic reasons:
def get_user_repos(username):
# Validate input format
if not username or not re.match(r'^[a-zA-Z0-9_-]+$', username):
raise ValueError("Invalid username format")
# Use parameterized query
query = "SELECT * FROM repositories WHERE owner = ?"
cursor.execute(query, (username,))
return cursor.fetchall()
4. Apply Principle of Least Privilege
Database users should have minimal necessary permissions:
-- Don't give web application users DROP, CREATE, or GRANT permissions
GRANT SELECT, INSERT, UPDATE ON database.* TO 'webapp_user'@'localhost';
5. Use Static Analysis Tools
Implement tools to catch SQL injection vulnerabilities during development:
- Bandit (Python): Detects common security issues
- SonarQube: Multi-language security scanner
- Semgrep: Pattern-based code analysis
- SQLMap: Penetration testing tool for finding SQL injection
Example Bandit configuration:
# Install
pip install bandit
# Run scan
bandit -r . -f json -o bandit-report.json
6. Follow OWASP Guidelines
The OWASP SQL Injection Prevention Cheat Sheet recommends:
- Option 1: Use prepared statements (parameterized queries) - Primary Defense
- Option 2: Use stored procedures (with parameterization)
- Option 3: Whitelist input validation (for table/column names that can't be parameterized)
- Option 4: Escape all user input (last resort, error-prone)
7. Security Testing
Include SQL injection testing in your security testing strategy:
# Example security test
def test_sql_injection_protection():
malicious_inputs = [
"' OR '1'='1",
"'; DROP TABLE users; --",
"' UNION SELECT * FROM passwords --",
"admin'--",
"1' AND '1'='1"
]
for malicious_input in malicious_inputs:
# Should not cause errors or unexpected behavior
result = get_user_repos(malicious_input)
assert len(result) == 0 # No user with this name exists
Relevant Security Standards
- CWE-89: Improper Neutralization of Special Elements used in an SQL Command
- OWASP Top 10 2021: A03:2021 – Injection
- SANS Top 25: CWE-89 ranks consistently in the most dangerous software errors
Conclusion
SQL injection remains a critical threat, but it's also one of the most preventable vulnerabilities. The fix applied to utils/github_app.py—replacing formatted SQL queries with parameterized queries—demonstrates the right approach to database security.
Key Takeaways:
- Never construct SQL queries using string formatting or concatenation with user input
- Always use parameterized queries or prepared statements
- Parameterized queries separate SQL logic from data, preventing injection attacks
- Implement static analysis tools to catch these issues during development
- Follow defense-in-depth: combine parameterized queries with input validation and least privilege
Remember: secure coding isn't about being paranoid—it's about being responsible. Every SQL query you write is a potential entry point for attackers. By making parameterized queries your default practice, you protect not just your application, but also your users' data and your organization's reputation.
Take action today: Review your codebase for string-formatted SQL queries and refactor them to use parameterized queries. Your future self (and your security team) will thank you.
For more information, visit:
- OWASP SQL Injection Prevention Cheat Sheet
- CWE-89: SQL Injection
- Python DB-API 2.0 Specification