Introduction
In the Ultralytics repository, we discovered a high-severity reflected XSS vulnerability in ultralytics/solutions/templates/similarity-search.html at line 134. The similarity search feature lets users query images using natural language descriptions, but the template rendered the search query directly into an HTML input attribute without guaranteed escaping. Specifically, the code used value="{{ request.form['query'] }}" to populate the search box with the user's previous query, a common pattern that becomes dangerous when escaping isn't enforced.
This matters because the similarity search template is part of the production codebase serving a web interface. Without proper escaping, an attacker could craft a malicious request containing JavaScript that executes in the browser of any user tricked into submitting it, for example by following a link to an attacker-controlled page.
The Vulnerability Explained
Let's examine the vulnerable code from similarity-search.html line 134:
<input
  type="text"
  name="query"
  placeholder="Describe the scene (e.g., man walking)"
  value="{{ request.form['query'] }}"
  required
/>
The problem lies in how request.form['query'] is rendered directly into the value attribute. If Jinja2's autoescape isn't enabled globally or if the attribute context isn't properly handled, an attacker can break out of the attribute and inject event handlers.
How Could It Be Exploited?
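Because the template reads request.form (which, assuming a Flask application, holds only fields from the request body), the payload has to arrive as a submitted form field rather than in the URL's query string. An attacker could, for example, host a page with an auto-submitting form that posts this value as query:
" onfocus="alert(document.cookie)" autofocus="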
When the template renders this input, it would produce:
<input
  type="text"
  name="query"
  placeholder="Describe the scene (e.g., man walking)"
  value="" onfocus="alert(document.cookie)" autofocus=""
  required
/>
Notice how the double quote closes the value attribute prematurely, then onfocus="alert(document.cookie)" becomes a separate attribute that executes JavaScript when the input receives focus. The autofocus="" ensures the input automatically focuses when the page loads, triggering the attack immediately.
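The attribute breakout can be reproduced outside the browser. The following is a minimal sketch, not the project's code: TEMPLATE is a hypothetical stand-in for the template line, plain string formatting mimics rendering with no escaping at all, and Python's standard-library HTMLParser plays the role of the browser's parser.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the vulnerable template line. str.format()
# inserts the value verbatim, like a template rendered without escaping.
TEMPLATE = '<input type="text" name="query" value="{query}" required>'

PAYLOAD = '" onfocus="alert(document.cookie)" autofocus="'


class AttrCollector(HTMLParser):
    """Records the attributes of the first <input> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        if tag == "input" and self.attrs is None:
            self.attrs = dict(attrs)


def parse_input_attrs(html_text):
    """Return the attribute dict of the first <input> in html_text."""
    parser = AttrCollector()
    parser.feed(html_text)
    parser.close()
    return parser.attrs


attrs = parse_input_attrs(TEMPLATE.format(query=PAYLOAD))

# The value attribute ends up empty, and onfocus/autofocus are parsed as
# real attributes of the element instead of text inside the search box.
print(attrs)
```

A parser seeing onfocus as a separate attribute is the whole bug: the payload is no longer data inside value, it is markup.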
Real-World Impact
For the Ultralytics similarity search feature, this vulnerability could allow attackers to:
- Steal session cookies: The injected JavaScript can access document.cookie and send it to an attacker-controlled server
- Perform actions as the victim: Execute API calls using the victim's authenticated session
- Deface the interface: Modify the page content to phish for credentials
- Redirect to malicious sites: Change window.location to redirect users to phishing pages
Since this is a web service with a public-facing search interface, the attack surface is significant—any user clicking a malicious link becomes a potential victim.
The Fix
The fix is a single-line change that both closes the escaping gap and adds a defensive default:
Before (Vulnerable):
value="{{ request.form['query'] }}"
After (Secure):
value="{{ request.form.get('query', '') | e }}"
This change implements two security improvements:
- Explicit escaping with | e: The Jinja2 escape filter (| e) explicitly HTML-encodes the special characters <, >, ", ', and &. This ensures that even if autoescape is disabled or misconfigured, the output is still safe. When the malicious payload " onfocus="alert(1) passes through the escape filter, it becomes &#34; onfocus=&#34;alert(1), which renders as literal text instead of breaking the attribute boundary.
- Safe dictionary access with .get(): Replacing request.form['query'] with request.form.get('query', '') avoids a failed key lookup when the query parameter is missing. This defensive programming practice gives the template an explicit, safe default of an empty string.
The security improvement is concrete: the fix prevents attribute-breaking attacks by ensuring the user's query is HTML-encoded before it is written into the HTML response. An attacker's payload is now rendered as harmless text that displays in the search box rather than executing as code.
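To see the effect of escaping in isolation, here is a small sketch that uses Python's standard-library html.escape as a stand-in for Jinja2's | e filter. The two encode the same five characters but spell some entities differently (html.escape emits &quot; where Jinja2 emits &#34;); browsers decode both to a literal quote inside the attribute value. The plain dictionaries are hypothetical substitutes for request.form.

```python
import html


def render_value_attr(form):
    """Mimic the fixed template line: safe lookup, then HTML-escape."""
    query = form.get("query", "")  # empty default when the field is absent
    return f'value="{html.escape(query, quote=True)}"'


payload = '" onfocus="alert(1)'

# With the payload present, the quotes are encoded and stay inside value.
attacked = render_value_attr({"query": payload})

# With no query field at all, the attribute is simply empty.
empty = render_value_attr({})

print(attacked)
print(empty)
```

After escaping, the only literal double quotes left in the output are the two that delimit the attribute, so there is nothing for the payload to close early.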
Prevention & Best Practices
To avoid reflected XSS vulnerabilities in Jinja2 templates and similar templating engines:
1. Always Enable Autoescape Globally
Configure your Jinja2 environment with autoescape enabled:
from jinja2 import Environment, FileSystemLoader, select_autoescape

env = Environment(
    loader=FileSystemLoader('templates'),
    autoescape=select_autoescape(['html', 'htm', 'xml'])
)
2. Use Explicit Escape Filters for Defense-in-Depth
Even with autoescape enabled, use explicit | e filters for user-controlled data in security-critical contexts:
{{ user_input | e }}
3. Context-Aware Escaping
Different contexts require different escaping:
- HTML content: | e (HTML escape)
- JavaScript strings: | tojson (JSON encoding)
- URLs: | urlencode (URL encoding)
- CSS: Avoid user input entirely or use strict validation
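The same idea can be sketched with Python's standard library, as stand-ins rather than the Jinja2 filters themselves: html.escape for HTML, json.dumps for JavaScript string literals, and urllib.parse.quote for URL components. One difference worth noting: json.dumps leaves <, >, & and ' untouched, so the sketch escapes them by hand to keep a literal </script> from ending an inline script block; Jinja2's tojson filter handles this itself. The helper names are hypothetical.

```python
import html
import json
from urllib.parse import quote


def for_html(value):
    """Encode for HTML text or a quoted attribute value."""
    return html.escape(value, quote=True)


def for_js_string(value):
    """Encode as a JavaScript/JSON string literal safe inside <script>."""
    literal = json.dumps(value)
    # Replace HTML-significant characters with \uXXXX escapes, which are
    # still valid inside a JSON/JavaScript string literal.
    for ch in "<>&'":
        literal = literal.replace(ch, "\\u{:04x}".format(ord(ch)))
    return literal


def for_url_component(value):
    """Percent-encode for use as a single URL query value."""
    return quote(value, safe="")


user_input = '"/><script>alert(1)</script>'

print(for_html(user_input))
print(for_js_string(user_input))
print(for_url_component(user_input))
```

Each helper produces a different encoding of the same input, which is the point: an encoding that is safe in one context is not automatically safe in another.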
4. Content Security Policy (CSP)
Implement a strict CSP header to mitigate XSS impact:
Content-Security-Policy: default-src 'self'; script-src 'self'; object-src 'none'
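How the header gets attached depends on the framework and deployment. As one hedged illustration, the sketch below wraps any WSGI application (the interface Flask apps expose) in a small middleware. The names add_csp and demo_app are hypothetical and are not part of Ultralytics or Flask.

```python
CSP_POLICY = "default-src 'self'; script-src 'self'; object-src 'none'"


def add_csp(app, policy=CSP_POLICY):
    """Wrap a WSGI app so every response carries a CSP header."""

    def middleware(environ, start_response):
        def start_with_csp(status, headers, exc_info=None):
            # Drop any CSP header the app set itself, then append ours.
            kept = [
                (name, value)
                for name, value in headers
                if name.lower() != "content-security-policy"
            ]
            kept.append(("Content-Security-Policy", policy))
            return start_response(status, kept, exc_info)

        return app(environ, start_with_csp)

    return middleware


def demo_app(environ, start_response):
    """Hypothetical minimal WSGI app used only to exercise the wrapper."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<p>ok</p>"]


wrapped = add_csp(demo_app)
```

Because this policy omits 'unsafe-inline', browsers that enforce it refuse inline event handlers such as the injected onfocus attribute, which is what makes CSP a useful second layer for this particular bug.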
5. Input Validation
While not a substitute for output encoding, validate input format:
import re

def validate_query(query):
    # Allow only alphanumeric, spaces, and basic punctuation
    if not re.match(r'^[a-zA-Z0-9\s.,!?-]+$', query):
        raise ValueError("Invalid query format")
    return query
Security Standards References
- OWASP Top 10 2021: A03:2021 – Injection
- CWE-79: Improper Neutralization of Input During Web Page Generation
- OWASP XSS Prevention Cheat Sheet: Comprehensive guide for preventing XSS across different contexts
Key Takeaways
- Never trust template autoescape alone: The similarity-search.html vulnerability shows that relying on global autoescape settings without explicit filters creates risk. Always use | e for user input in HTML attributes.
- Attribute context is particularly dangerous: Breaking out of an HTML attribute with a double quote is trivial if escaping isn't applied. The value="{{ request.form['query'] }}" pattern is a common pitfall in search forms and input echo scenarios.
- Use .get() with defaults for form data: The change from request.form['query'] to request.form.get('query', '') prevents crashes and provides a safe fallback, demonstrating that security fixes often improve reliability too.
- A few characters can be the difference: Adding | e (just three characters) to the template neutralizes the attack. Small, focused fixes are often the most effective security improvements.
- Test with attribute-breaking payloads: The regression test includes '"><img src=x onerror=alert(1)>' and '" onmouseover="alert(1)' specifically to catch attribute-context XSS. Your test suites should include similar boundary-case attack vectors.
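The regression test itself is not reproduced here. As a rough illustration of what such a check might look like, the sketch below renders a hypothetical stand-in for the input element (html.escape substituting for Jinja2's | e) and uses the standard-library HTMLParser to confirm that each payload stays inside the value attribute. All function and class names are assumptions made for the example.

```python
import html
from html.parser import HTMLParser

PAYLOADS = [
    '"><img src=x onerror=alert(1)>',
    '" onmouseover="alert(1)',
]

EXPECTED_ATTRS = {"type", "name", "value", "required"}


def render_escaped(query):
    """Stand-in for the fixed template line (escaping applied)."""
    safe = html.escape(query, quote=True)
    return f'<input type="text" name="query" value="{safe}" required>'


def render_unescaped(query):
    """Stand-in for the vulnerable template line (no escaping)."""
    return f'<input type="text" name="query" value="{query}" required>'


class TagCollector(HTMLParser):
    """Collects every start tag with its attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


def payload_is_contained(payload, render=render_escaped):
    """True if the output is one <input> whose value equals the payload."""
    parser = TagCollector()
    parser.feed(render(payload))
    parser.close()
    if len(parser.tags) != 1:
        return False  # the payload created an extra element
    tag, attrs = parser.tags[0]
    return (
        tag == "input"
        and set(attrs) == EXPECTED_ATTRS  # no injected attributes
        and attrs["value"] == payload     # payload survived as plain data
    )


for p in PAYLOADS:
    print(p, payload_is_contained(p), payload_is_contained(p, render_unescaped))
```

Checking the parsed structure rather than searching the output for substrings keeps the test from passing by accident when a payload is merely reordered or partially encoded.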
How Orbis AppSec Detected This
- Source: User-controlled input from the HTTP request parameter query, accessed via request.form['query']
- Sink: Direct rendering into HTML attribute context in similarity-search.html:134 without guaranteed escaping
- Missing control: No explicit HTML escape filter applied, and no validation that autoescape is enabled for attribute contexts
- CWE: CWE-79 (Improper Neutralization of Input During Web Page Generation, 'Cross-site Scripting')
- Fix: Applied explicit Jinja2 escape filter (| e) and changed to safe dictionary access method (.get() with default)
Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.
Conclusion
This reflected XSS vulnerability in the Ultralytics similarity search template demonstrates how a single missing escape filter can create a high-severity security hole in production code. The fix (adding | e and using .get() with a default) is simple but crucial. By combining explicit escaping, safe dictionary access, and testing with attack payloads, the fix keeps user input from breaking out of the value attribute to execute malicious code.
Remember: output encoding is your last line of defense against XSS. When working with templates, always escape user data explicitly, enable autoescape globally, and test with boundary-breaking payloads. Security doesn't require complex solutions—often, it's the small, consistent practices that make the difference between a secure application and an exploitable one.