SQL Injection Prevention: Moving from String Formatting to Parameterized Queries

Introduction

SQL injection remains one of the most dangerous and prevalent web application vulnerabilities, consistently ranking in OWASP's Top 10 security risks. Despite being well-documented for decades, developers still occasionally fall into the trap of constructing SQL queries using string formatting or concatenation—a practice that opens the door to devastating attacks.

Recently, a medium-severity SQL injection vulnerability was identified and fixed in utils/github_app.py. While the specific code changes weren't detailed in the pull request, the vulnerability type is clear: formatted SQL queries were being used instead of parameterized queries. This blog post will explore why this matters, how SQL injection works, and why parameterized queries are your first line of defense.

The Vulnerability Explained

What is SQL Injection?

SQL injection occurs when an attacker can manipulate SQL queries by injecting malicious SQL code through user-supplied input. This happens when applications construct SQL queries by directly embedding user input into query strings without proper sanitization or parameterization.

The Dangerous Pattern: String Formatting

Here's a typical vulnerable pattern that might have existed in the codebase:

# VULNERABLE CODE - DO NOT USE
def get_user_repos(username):
    query = f"SELECT * FROM repositories WHERE owner = '{username}'"
    # or using older string formatting:
    # query = "SELECT * FROM repositories WHERE owner = '%s'" % username

    cursor.execute(query)
    return cursor.fetchall()

This looks innocent enough, right? But here's the problem: the application trusts that username contains only a legitimate username.

How Could It Be Exploited?

An attacker could provide a malicious username like:

' OR '1'='1

The resulting query becomes:

SELECT * FROM repositories WHERE owner = '' OR '1'='1'

Since '1'='1' is always true, this query returns all repositories, regardless of ownership. This is a data breach waiting to happen.

But it gets worse. An attacker could escalate to:

'; DROP TABLE repositories; --

Resulting in:

SELECT * FROM repositories WHERE owner = ''; DROP TABLE repositories; --'

This would execute two statements: a harmless SELECT followed by deleting the entire repositories table. The -- comments out the remaining query, preventing syntax errors.

Real-World Impact

SQL injection vulnerabilities can lead to:

Data breaches: Unauthorized access to sensitive information
Data manipulation: Modifying or deleting critical data
Authentication bypass: Logging in as any user without credentials
Privilege escalation: Gaining administrative access
Complete system compromise: In some cases, executing operating system commands

According to the OWASP Top 10, injection attacks (including SQL injection) affect approximately 19% of applications tested and can have severe business consequences including regulatory fines, reputation damage, and loss of customer trust.

The Fix: Parameterized Queries

The solution is straightforward: use parameterized queries (also called prepared statements). Here's how the vulnerable code should be rewritten:

Before (Vulnerable):

# VULNERABLE - String formatting
def get_user_repos(username):
    query = f"SELECT * FROM repositories WHERE owner = '{username}'"
    cursor.execute(query)
    return cursor.fetchall()

After (Secure):

# SECURE - Parameterized query
def get_user_repos(username):
    query = "SELECT * FROM repositories WHERE owner = ?"
    cursor.execute(query, (username,))
    return cursor.fetchall()

Or using named parameters (depending on your database library):

# SECURE - Named parameters
def get_user_repos(username):
    query = "SELECT * FROM repositories WHERE owner = :username"
    cursor.execute(query, {"username": username})
    return cursor.fetchall()

How Does This Solve the Problem?

Parameterized queries work by:

Separating SQL logic from data: The query structure is sent to the database separately from the parameter values
Treating parameters as data only: The database engine knows that parameter values should be treated as literal data, not executable SQL code
Automatic escaping: The database driver handles all necessary escaping and quoting

When an attacker tries to inject ' OR '1'='1, the parameterized query treats the entire string as a literal username value. The database looks for a user literally named ' OR '1'='1 (which doesn't exist), rather than executing it as SQL code.

Security Improvement

This fix provides:

✅ Complete protection against SQL injection for parameterized values
✅ No need for manual escaping or input validation for SQL-specific characters
✅ Performance benefits through query plan caching
✅ Cleaner, more maintainable code

Prevention & Best Practices

1. Always Use Parameterized Queries

Make parameterized queries your default approach for all database interactions:

# Python with sqlite3
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))

# Python with psycopg2 (PostgreSQL)
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))

# Python with SQLAlchemy (ORM)
session.query(User).filter(User.id == user_id).all()

2. Use ORMs Carefully

Object-Relational Mappers (ORMs) like SQLAlchemy, Django ORM, or Sequelize generally protect against SQL injection when used correctly. However, be cautious with:

# VULNERABLE - Raw SQL in ORM
User.objects.raw(f"SELECT * FROM users WHERE name = '{name}'")

# SECURE - Parameterized raw SQL
User.objects.raw("SELECT * FROM users WHERE name = %s", [name])

# SECURE - ORM methods
User.objects.filter(name=name)

3. Validate and Sanitize Input

While parameterized queries protect against SQL injection, you should still validate input for business logic reasons:

def get_user_repos(username):
    # Validate input format
    if not username or not re.match(r'^[a-zA-Z0-9_-]+$', username):
        raise ValueError("Invalid username format")

    # Use parameterized query
    query = "SELECT * FROM repositories WHERE owner = ?"
    cursor.execute(query, (username,))
    return cursor.fetchall()

4. Apply Principle of Least Privilege

Database users should have minimal necessary permissions:

-- Don't give web application users DROP, CREATE, or GRANT permissions
GRANT SELECT, INSERT, UPDATE ON database.* TO 'webapp_user'@'localhost';

5. Use Static Analysis Tools

Implement tools to catch SQL injection vulnerabilities during development:

Bandit (Python): Detects common security issues
SonarQube: Multi-language security scanner
Semgrep: Pattern-based code analysis
SQLMap: Penetration testing tool for finding SQL injection

Example Bandit configuration:

# Install
pip install bandit

# Run scan
bandit -r . -f json -o bandit-report.json

6. Follow OWASP Guidelines

The OWASP SQL Injection Prevention Cheat Sheet recommends:

Option 1: Use prepared statements (parameterized queries) - Primary Defense
Option 2: Use stored procedures (with parameterization)
Option 3: Whitelist input validation (for table/column names that can't be parameterized)
Option 4: Escape all user input (last resort, error-prone)

7. Security Testing

Include SQL injection testing in your security testing strategy:

# Example security test
def test_sql_injection_protection():
    malicious_inputs = [
        "' OR '1'='1",
        "'; DROP TABLE users; --",
        "' UNION SELECT * FROM passwords --",
        "admin'--",
        "1' AND '1'='1"
    ]

    for malicious_input in malicious_inputs:
        # Should not cause errors or unexpected behavior
        result = get_user_repos(malicious_input)
        assert len(result) == 0  # No user with this name exists

Relevant Security Standards

CWE-89: Improper Neutralization of Special Elements used in an SQL Command
OWASP Top 10 2021: A03:2021 – Injection
SANS Top 25: CWE-89 ranks consistently in the most dangerous software errors

Conclusion

SQL injection remains a critical threat, but it's also one of the most preventable vulnerabilities. The fix applied to utils/github_app.py—replacing formatted SQL queries with parameterized queries—demonstrates the right approach to database security.

Key Takeaways:

Never construct SQL queries using string formatting or concatenation with user input
Always use parameterized queries or prepared statements
Parameterized queries separate SQL logic from data, preventing injection attacks
Implement static analysis tools to catch these issues during development
Follow defense-in-depth: combine parameterized queries with input validation and least privilege

Remember: secure coding isn't about being paranoid—it's about being responsible. Every SQL query you write is a potential entry point for attackers. By making parameterized queries your default practice, you protect not just your application, but also your users' data and your organization's reputation.

Take action today: Review your codebase for string-formatted SQL queries and refactor them to use parameterized queries. Your future self (and your security team) will thank you.

For more information, visit:
- OWASP SQL Injection Prevention Cheat Sheet
- CWE-89: SQL Injection
- Python DB-API 2.0 Specification