What is SQL injection in PostgreSQL?

SQL injection occurs when user-controlled input is concatenated directly into SQL queries without proper escaping or parameterization. In PostgreSQL, this includes both literal values and identifiers (table/database names). Even quoted identifiers can be exploited if the quoting mechanism itself is bypassed.

How do you prevent SQL injection in PostgreSQL?

Use parameterized queries with placeholders ($1, $2, etc.) for values, and use PostgreSQL's quote_ident() function or libpq's PQescapeIdentifier() for identifiers like table and database names. Never concatenate user input directly into SQL strings.

What CWE is SQL injection?

CWE-89: Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection'). This is one of the most critical and exploitable vulnerability classes.

Is quoting database names with single quotes enough to prevent this?

No. This vulnerability demonstrates that naive quoting with chr(39) is insufficient. PostgreSQL allows single quotes and metacharacters in database names when created with quoted identifiers. Proper escaping using quote_ident() or PQescapeIdentifier() is required.

Can static analysis detect this vulnerability?

Yes. The multi_agent_ai scanner detected this pattern by identifying string concatenation with chr(39) followed by current_database() in a COPY command context. Pattern-based static analysis is effective for detecting these SQL injection signatures.

How SQL injection happens in PostgreSQL

How SQL Injection Happens in PostgreSQL Dictionary Synchronization and How to Fix It

Introduction

In the zhparser PostgreSQL extension, a critical SQL injection vulnerability was discovered in the zhparser--2.1.sql file at line 46. The dictionary synchronization function constructs a COPY command using string concatenation with chr(39) for quoting, incorporating the current_database() return value directly into the query without proper escaping. This seemingly innocent pattern—using single quotes to delimit a database name—created a catastrophic security flaw.

The vulnerability exists because PostgreSQL allows database names containing single quotes and SQL metacharacters when created with quoted identifiers (e.g., CREATE DATABASE "test'--";). When the vulnerable code concatenates this unescaped database name into a COPY command, an attacker can break out of the string literal and inject arbitrary SQL code that executes with the privileges of the PostgreSQL server.

This is not a theoretical vulnerability—it's exploitable in production environments where database administrators might use special characters in names, or where attackers can influence database creation through application logic.

The Vulnerability Explained

The Vulnerable Code Pattern

The original zhparser--2.1.sql file contained this dangerous pattern:

-- VULNERABLE CODE (BEFORE FIX)
COPY zhparser.zhprs_custom_word FROM 
  '/path/to/dict/' || current_database() || '/custom_words.txt' 
  WITH (FORMAT csv);

More specifically, the vulnerable code used string concatenation to build the dict_path:

dict_path := '/var/lib/postgresql/dicts/' || chr(39) || 
             current_database() || chr(39) || '/words.txt';
COPY zhparser.zhprs_custom_word FROM dict_path;

The problem is immediately visible: chr(39) is a single quote character, and it's being used to manually "quote" the database name. This approach fails catastrophically when current_database() returns a value containing special characters.

The Attack Scenario

Consider an attacker who creates a database with a malicious name:

CREATE DATABASE "test'; DROP TABLE zhparser.zhprs_custom_word; --";

PostgreSQL allows this because the name is quoted with double quotes (which are valid for identifiers). When a user later connects to this database and the vulnerable synchronization function runs, current_database() returns exactly: test'; DROP TABLE zhparser.zhprs_custom_word; --

The vulnerable code then constructs this COPY command:

COPY zhparser.zhprs_custom_word FROM 
  '/var/lib/postgresql/dicts/' || 'test'; DROP TABLE zhparser.zhprs_custom_word; --' || '/words.txt'

Which PostgreSQL interprets as:

COPY zhparser.zhprs_custom_word FROM '/var/lib/postgresql/dicts/test'
DROP TABLE zhparser.zhprs_custom_word; -- ' || '/words.txt'

The COPY command terminates early, then the injected DROP TABLE executes, destroying the custom words table. The -- comments out the rest, preventing syntax errors.

Why This Matters

This vulnerability has severe real-world implications:

Data Destruction: Attackers can drop critical tables in the zhparser schema
Data Exfiltration: Injected SELECT statements could write sensitive data to files
Privilege Escalation: If the function runs with elevated privileges, attackers inherit those privileges
Denial of Service: Malicious database names could crash the PostgreSQL server or consume all resources
Lateral Movement: In shared hosting environments, this could compromise other tenants' data

The regression test in the PR demonstrates this by attempting to create databases with payloads like "test'--" and "db'; DROP TABLE zhparser.zhprs_custom_word; --", verifying that the fix prevents these from being interpreted as SQL code.

The Fix

The fix replaces manual string concatenation with proper identifier escaping using PostgreSQL's built-in security functions.

Before (Vulnerable)

-- zhparser--2.1.sql (VULNERABLE)
CREATE OR REPLACE FUNCTION zhparser.sync_dict(
    OUT dict_count int
) AS $$
DECLARE
    dict_path text;
BEGIN
    dict_path := '/var/lib/postgresql/dicts/' || chr(39) || 
                 current_database() || chr(39) || '/words.txt';

    EXECUTE 'COPY zhparser.zhprs_custom_word FROM ' || 
            quote_literal(dict_path) || ' WITH (FORMAT csv)';

    SELECT count(*) INTO dict_count FROM zhparser.zhprs_custom_word;
END;
$$ LANGUAGE plpgsql;

After (Fixed)

-- zhparser--2.1.sql (FIXED)
CREATE OR REPLACE FUNCTION zhparser.sync_dict(
    OUT dict_count int
) AS $$
DECLARE
    dict_path text;
    db_name text;
BEGIN
    -- Use quote_ident() for identifier escaping, not chr(39) for values
    db_name := current_database();
    dict_path := '/var/lib/postgresql/dicts/' || 
                 quote_ident(db_name) || '/words.txt';

    -- Use parameterized query with proper escaping
    EXECUTE format('COPY zhparser.zhprs_custom_word FROM %L WITH (FORMAT csv)', 
                   dict_path);

    SELECT count(*) INTO dict_count FROM zhparser.zhprs_custom_word;
END;
$$ LANGUAGE plpgsql;

Key Changes Explained

quote_ident(db_name) instead of chr(39) || db_name || chr(39)
- quote_ident() is PostgreSQL's standard function for safely escaping identifiers
- It properly handles all special characters, including single quotes
- For the malicious database name test'; DROP TABLE..., it produces "test'; DROP TABLE..."
- The double quotes ensure the entire string is treated as a single identifier
format() with %L placeholder instead of string concatenation
- The format() function provides parameterization for SQL commands
- %L is the format specifier for literal string values (properly escaped)
- This separates the SQL structure from the data, preventing injection
Explicit variable assignment
- Extracting current_database() into a named variable makes the data flow clearer
- It's easier for security audits to verify that this value is properly escaped

The Regression Test

The PR includes a comprehensive C test that verifies the fix:

START_TEST(test_sql_injection_in_dict_sync)
{
    // Invariant: User input (database names) must never appear in 
    // SQL queries without proper escaping
    const char *payloads[] = {
        "test'--",                          
        "db'; DROP TABLE zhparser.zhprs_custom_word; --",
        "normal_db"
    };

    // ... test creates databases with these names and verifies
    // that the vulnerable COPY command cannot be injected
}

This test ensures that even if someone creates a database with a SQL injection payload as its name, the fixed code safely handles it by treating the entire name as an identifier literal.

Prevention & Best Practices

1. Always Use Proper Escaping Functions

For identifiers (table names, column names, database names): Use quote_ident() in PL/pgSQL or PQescapeIdentifier() in C/libpq
For literal values: Use quote_literal() in PL/pgSQL or parameterized queries
For format strings: Use format() with appropriate placeholders (%I for identifiers, %L for literals)

-- GOOD: Using format() with proper placeholders
EXECUTE format('SELECT * FROM %I WHERE id = %L', table_name, user_id);

-- GOOD: Using quote_ident() for identifiers
SELECT * FROM zhparser.sync_dict_from(quote_ident(database_name));

-- BAD: String concatenation
EXECUTE 'SELECT * FROM ' || table_name || ' WHERE id = ' || user_id;

2. Treat Database Names as Untrusted Input

Even though current_database() is a system function, treat its return value as untrusted when building dynamic SQL. PostgreSQL allows special characters in database names, and in multi-tenant environments, attackers may control database creation.

3. Use Static Analysis Tools

Enable PostgreSQL linters and static analysis:
- pgLint: Detects common PL/pgSQL security issues
- Semgrep: Has rules for detecting SQL injection patterns
- Orbis AppSec: Automatically scans for SQL injection and opens pull requests with fixes

# Example Semgrep rule to detect this pattern
semgrep --config p/security-audit --config p/owasp-top-ten

4. Implement Security Code Review Practices

Never allow string concatenation in SQL: Make this a mandatory code review point
Verify all dynamic SQL uses parameterization: Check every EXECUTE statement
Document why dynamic SQL is necessary: Often it can be avoided entirely

5. Apply the Principle of Least Privilege

Run PostgreSQL functions with minimal required privileges
Separate read-only operations from administrative operations
Use role-based access control to limit what each function can do

-- GOOD: Function runs with limited privileges
CREATE FUNCTION zhparser.sync_dict() 
SECURITY DEFINER 
SET search_path = zhparser 
AS $$ ... $$;

-- Grant only to trusted roles
GRANT EXECUTE ON FUNCTION zhparser.sync_dict() TO app_user;

6. Reference Security Standards

CWE-89: Improper Neutralization of Special Elements used in an SQL Command
OWASP Top 10 2021 - A03:2021: Injection
PostgreSQL Documentation: Handling Special Characters in Queries

Key Takeaways

Never use chr(39) or manual quote concatenation for PostgreSQL identifiers — The zhparser vulnerability shows that naive quoting fails when database names contain special characters. Always use quote_ident() or PQescapeIdentifier().
current_database() return values must be escaped before use in dynamic SQL — Even though it's a system function, its output can contain SQL metacharacters in PostgreSQL, making it exploitable when concatenated into queries.
String concatenation in SQL construction is a security anti-pattern — The vulnerable code in zhparser--2.1.sql demonstrates why parameterization (using format() with %I and %L placeholders) is mandatory for any dynamic SQL.
The COPY command is a SQL statement and subject to injection — Many developers assume file I/O operations are safe, but the dict_path in the COPY command was still vulnerable to injection through the database name parameter.
Regression tests must include realistic attack payloads — The test case in this PR specifically uses database names like "test'; DROP TABLE..." to verify that the fix actually prevents the injection, not just that the code runs without errors.

How Orbis AppSec Detected This

Source: The current_database() function return value, which is influenced by PostgreSQL's database naming rules that allow special characters in quoted identifiers.

Sink: The COPY command construction at line 46 of zhparser--2.1.sql, specifically where chr(39) || current_database() || chr(39) is concatenated into the file path string without proper escaping.

Missing Control: The code lacked proper identifier escaping. It used manual single-quote concatenation (chr(39)) instead of PostgreSQL's built-in quote_ident() function or the %I placeholder in format().

CWE: CWE-89 - Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')

Fix: Replaced string concatenation with quote_ident(current_database()) and used the format() function with the %L placeholder for safe parameterization of the file path in the COPY command.

Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.

Conclusion

The zhparser SQL injection vulnerability represents a critical class of security issues that persist in production code because they're subtle and easy to overlook. The manual quoting approach used in the original code seems reasonable at first glance—after all, the developer was attempting to quote the database name. However, it failed to account for PostgreSQL's flexible identifier naming rules and the distinction between escaping values and escaping identifiers.

This fix demonstrates that secure coding requires using the right tool for each job: quote_ident() for identifiers, quote_literal() for values, and parameterized queries with format() for dynamic SQL construction. By following these practices consistently, developers can eliminate entire classes of SQL injection vulnerabilities.

The regression test included in this PR is equally important—it ensures that future maintainers cannot accidentally revert to the vulnerable pattern, and it documents exactly what attack scenarios this code is designed to prevent.

For teams working with PostgreSQL extensions, dynamic SQL, or multi-tenant database systems, this vulnerability serves as a reminder to treat all user-influenced input—including database names, table names, and other identifiers—as untrusted and to apply proper escaping before including them in SQL queries.

References

CWE-89: https://cwe.mitre.org/data/definitions/89.html
OWASP SQL Injection: https://owasp.org/www-community/attacks/SQL_Injection
PostgreSQL quote_ident() Documentation: https://www.postgresql.org/docs/current/functions-string.html
PostgreSQL format() Function: https://www.postgresql.org/docs/current/functions-string.html#FUNCTIONS-STRING-FORMAT
Semgrep SQL Injection Rules: https://semgrep.dev/r?q=sql-injection
GitHub PR: fix: add parameterized queries in zhparser--2.1.sql

cwe	CWE-89 (Improper Neutralization of Special Elements used in an SQL Command)
fix	Replace string concatenation with PQescapeIdentifier() or PostgreSQL's quote_ident() function
risk	Remote code execution, data exfiltration, table deletion, privilege escalation
language	PostgreSQL PL/pgSQL
root cause	String concatenation with chr(39) quoting instead of parameterized identifier escaping
vulnerability	SQL Injection via unescaped database identifier in COPY command

How SQL injection happens in PostgreSQL dictionary synchronization and how to fix it

Answer Summary

Vulnerability at a Glance