How LDAP injection happens in Python Apache Airflow FAB security manager and how to fix it
Vulnerability at a glance: LDAP Injection (CWE-90) in Apache Airflow's FAB security manager — an unvalidated
AUTH_LDAP_SEARCH_FILTERconfiguration value was directly interpolated into an LDAP query string, enabling authentication bypass or directory data exfiltration for attackers who could influence that configuration. The fix validates filter structure before use.
Summary
A critical LDAP injection vulnerability was discovered in Apache Airflow's FAB (Flask-AppBuilder) security manager, specifically in the _search_ldap() method of override.py. The AUTH_LDAP_SEARCH_FILTER configuration value was interpolated directly into LDAP filter strings without validation, enabling attackers who could influence that configuration value to craft malicious filters that bypass authentication or exfiltrate directory data. The fix adds structural validation of the filter string before it is used, ensuring it has balanced parentheses and a proper LDAP filter format.
Introduction
The override.py file in Apache Airflow's FAB provider handles LDAP-based authentication — a common enterprise deployment pattern where Airflow delegates user login to a corporate directory service like Active Directory or OpenLDAP. Deep inside the _search_ldap() method, a subtle but dangerous flaw existed at line 2475: while the username parameter was correctly sanitized, the AUTH_LDAP_SEARCH_FILTER configuration value was dropped directly into the LDAP query string with no structural validation whatsoever.
This is a textbook example of a second-order injection risk: developers correctly defended against the obvious attack vector (user-supplied input) but overlooked a less obvious one (administrator-supplied configuration). In containerized deployments — which are the norm for Airflow — configuration values frequently arrive via environment variables, and those environment variables can be a target for attackers who have compromised CI/CD pipelines, Kubernetes secrets, or shared hosting environments.
The Vulnerability Explained
The Vulnerable Code
Here is the original code in _search_ldap() at line 2474–2478:
# build the filter string for the LDAP search
# escape username to prevent LDAP injection attacks
escaped_username = ldap.filter.escape_filter_chars(username)
if self.auth_ldap_search_filter:
filter_str = f"(&{self.auth_ldap_search_filter}({self.auth_ldap_uid_field}={escaped_username}))"
else:
filter_str = f"({self.auth_ldap_uid_field}={escaped_username})"
Notice the asymmetry: username goes through ldap.filter.escape_filter_chars() — a proper LDAP escaping function from the ldap3 library — but self.auth_ldap_search_filter is dropped into the f-string raw. No escaping, no structural check, no validation.
Why This Is Dangerous
LDAP filter strings have a well-defined grammar. A filter like:
(&(objectClass=person)(uid=alice))
means "find entries where objectClass is 'person' AND uid is 'alice'." Parentheses control grouping, and operators like & (AND), | (OR), and ! (NOT) control logic — exactly like SQL's WHERE clause.
If AUTH_LDAP_SEARCH_FILTER is set to a malicious value, the attacker can inject additional filter clauses. For example, if an attacker sets:
AUTH_LDAP_SEARCH_FILTER=)(uid=*
The resulting filter becomes:
(&)(uid=*(uid=alice))
This is structurally broken in a way that many LDAP servers interpret permissively — the & operator with no conditions may match all entries, effectively bypassing the intended search restriction. A more sophisticated payload like:
AUTH_LDAP_SEARCH_FILTER=(|(uid=*)(objectClass=*)
Could produce:
(&(|(uid=*)(objectClass=*)(uid=alice))
...which matches every user in the directory, not just the one being authenticated.
The Attack Path
The 2-step exploit chain works as follows:
-
Gain configuration influence: An attacker compromises a CI/CD pipeline, injects a malicious environment variable into a Kubernetes deployment manifest, exploits a misconfigured secrets manager, or abuses a shared hosting scenario where multiple teams share an Airflow instance.
-
Trigger authentication bypass: With
AUTH_LDAP_SEARCH_FILTERset to a crafted value, the next LDAP authentication attempt executes the injected filter. Depending on the payload, this could match all users (allowing login as any username), enumerate directory entries, or cause the LDAP server to return unexpected results that the Airflow session logic interprets as a successful login.
This is not a purely theoretical attack. Environment variable injection into containerized deployments is a well-documented real-world attack vector, and Airflow deployments are frequently targeted because they hold credentials to data pipelines, cloud services, and databases.
The Fix
What Changed
The fix adds a structural validation check for AUTH_LDAP_SEARCH_FILTER before it is ever interpolated into the query string. Here is the patched code:
# build the filter string for the LDAP filter injection
# escape username to prevent LDAP filter injection
escaped_username = ldap.filter.escape_filter_chars(username)
if self.auth_ldap_search_filter:
# validate the search filter has balanced parentheses
_sf = self.auth_ldap_search_filter
if not (_sf.startswith("(") and _sf.endswith(")") and _sf.count("(") == _sf.count(")")):
raise ValueError(
f"AUTH_LDAP_SEARCH_FILTER must be a valid LDAP filter with balanced parentheses, "
f"starting with '(' and ending with ')'. Example: '(objectClass=person)'. "
f"Got: {repr(_sf)[:100]}"
)
filter_str = f"(&{self.auth_ldap_search_filter}({self.auth_ldap_uid_field}={escaped_username}))"
else:
filter_str = f"({self.auth_ldap_uid_field}={escaped_username})"
Before vs. After
Before (vulnerable):
if self.auth_ldap_search_filter:
filter_str = f"(&{self.auth_ldap_search_filter}({self.auth_ldap_uid_field}={escaped_username}))"
After (fixed):
if self.auth_ldap_search_filter:
_sf = self.auth_ldap_search_filter
if not (_sf.startswith("(") and _sf.endswith(")") and _sf.count("(") == _sf.count(")")):
raise ValueError(
f"AUTH_LDAP_SEARCH_FILTER must be a valid LDAP filter with balanced parentheses, "
f"starting with '(' and ending with ')'. Example: '(objectClass=person)'. "
f"Got: {repr(_sf)[:100]}"
)
filter_str = f"(&{self.auth_ldap_search_filter}({self.auth_ldap_uid_field}={escaped_username}))"
Why This Fix Works
The validation enforces three structural properties that a legitimate LDAP filter must satisfy:
- Starts with
(— Every valid LDAP filter component begins with an opening parenthesis. - Ends with
)— Every valid LDAP filter component ends with a closing parenthesis. - Balanced parentheses —
_sf.count("(") == _sf.count(")")ensures no unmatched brackets that could break the outer query structure.
Any injection payload that attempts to close existing parentheses early (like )(uid=*) or add extra opening brackets will fail this check and raise a ValueError with a clear, actionable error message — including an example of a valid filter format.
Critically, the fix fails loudly: it raises an exception rather than silently allowing a malformed filter to be used. This means operators will immediately know something is wrong with their configuration, rather than unknowingly running with a broken or compromised LDAP filter.
Prevention & Best Practices
1. Treat Configuration Values as Untrusted Input
Configuration values — especially those sourced from environment variables, databases, or admin UIs — should be validated just like user input. The principle of "trust but verify" applies: even if you control the configuration source, validate its structure before using it in security-sensitive operations.
2. Validate LDAP Filter Syntax Before Use
For any string that will be interpolated into an LDAP filter, apply structural validation:
def validate_ldap_filter(filter_value: str) -> bool:
"""Validate that a string is a structurally valid LDAP filter."""
if not filter_value:
return True
if '\x00' in filter_value:
return False
if not (filter_value.startswith("(") and filter_value.endswith(")")):
return False
depth = 0
for char in filter_value:
if char == '(':
depth += 1
elif char == ')':
depth -= 1
if depth < 0:
return False
return depth == 0
3. Use Parameterized LDAP Queries Where Available
Some LDAP libraries support parameterized queries or filter builders that prevent injection by construction. Prefer these over manual string interpolation.
4. Always Escape User-Supplied Values
The fix correctly retains ldap.filter.escape_filter_chars(username) for the username. This is the right approach for any value that comes from user input. The ldap3 library's escape_filter_chars() escapes the five special LDAP filter characters: \, *, (, ), and the null byte.
5. Audit All LDAP Filter Construction Points
Search your codebase for patterns like:
f"(&{some_variable}({field}={value}))"
Any variable in that string that isn't passed through escape_filter_chars() or a structural validator is a potential injection point.
6. Apply Defense in Depth for Configuration
- Use read-only secrets management (e.g., HashiCorp Vault, AWS Secrets Manager) with audit logging.
- Restrict which environment variables can be set in containerized deployments.
- Validate all configuration at startup, not at runtime.
Relevant Standards
- OWASP: LDAP Injection Prevention Cheat Sheet
- CWE-90: Improper Neutralization of Special Elements used in an LDAP Query
- OWASP Top 10: A03:2021 – Injection
Key Takeaways
- Escaping user input is not enough: In
_search_ldap(), the username was correctly escaped withescape_filter_chars(), but theAUTH_LDAP_SEARCH_FILTERconfig value was not validated — proving that injection can enter through any unvalidated string in a query, not just direct user input. - Configuration values are an attack surface: In containerized Airflow deployments, environment variables like
AUTH_LDAP_SEARCH_FILTERcan be tampered with via CI/CD pipeline compromise, Kubernetes secret injection, or shared hosting abuse. - Fail loudly on invalid security-critical config: The fix raises a descriptive
ValueErrorwith an example of valid input — this is far better than silently using a malformed filter that could cause unpredictable authentication behavior. - Balanced parenthesis checking is a lightweight but effective LDAP filter guard: The three-condition check (
startswith("("),endswith(")"),count("(") == count(")")) catches the vast majority of injection payloads without requiring a full LDAP filter parser. - Second-order injection is easy to miss in code review: The vulnerable line looked safe because
escaped_usernamewas properly handled. Always ask "where does every variable in this string come from?" when reviewing query construction code.
How Orbis AppSec Detected This
- Source: The
AUTH_LDAP_SEARCH_FILTERconfiguration value, which can be set via environment variables in containerized Airflow deployments. - Sink: The f-string interpolation
f"(&{self.auth_ldap_search_filter}({self.auth_ldap_uid_field}={escaped_username}))"at line 2475 ofproviders/fab/src/airflow/providers/fab/auth_manager/security_manager/override.py. - Missing control: No structural validation or escaping of
self.auth_ldap_search_filterbefore it was incorporated into the LDAP filter string. - CWE: CWE-90 — Improper Neutralization of Special Elements used in an LDAP Query ('LDAP Injection').
- Fix: Added a pre-use validation check that enforces balanced parentheses and proper LDAP filter framing on
AUTH_LDAP_SEARCH_FILTER, raising aValueErrorwith an actionable message if the value is structurally invalid.
Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.
Conclusion
This vulnerability is a reminder that injection attacks don't always arrive through the front door. The _search_ldap() method in Airflow's FAB security manager correctly escaped user-supplied usernames — a security control developers often focus on — but left the door open through an administrator-supplied configuration value that received no scrutiny at all. In modern cloud-native deployments, configuration is just as much an attack surface as user input.
The fix is elegant in its simplicity: three conditions, one ValueError, and a clear error message that guides operators toward a valid configuration. It costs almost nothing in performance and eliminates the entire injection vector. This is the kind of targeted, minimal fix that security patches should aspire to be.
When you write code that builds LDAP queries, SQL queries, shell commands, or any other structured language from string components, ask yourself about every variable in the string: "Where does this come from? Can it be influenced by an attacker? Have I validated its structure?" If the answer to the second question is "maybe" and the answer to the third is "no," you have a vulnerability waiting to be exploited.