What is Server-Side Request Forgery (SSRF)?

SSRF is a vulnerability where an attacker tricks a server into making HTTP requests to unintended destinations, typically internal services or cloud metadata endpoints that are not directly accessible from the internet.

How do you prevent SSRF in Python?

Resolve the hostname to IP addresses using `socket.getaddrinfo()`, then validate each resolved address against private/internal ranges using Python's `ipaddress` module before making any HTTP request. Also disable automatic redirect following and re-validate every redirect target, so that a permitted URL cannot bounce the request to an internal address.

What CWE is SSRF?

CWE-918: Server-Side Request Forgery (SSRF). It falls under OWASP Top 10 2021 category A10:2021 – Server-Side Request Forgery.

Is URL pattern matching enough to prevent SSRF?

No. Hostname-based blocklists can be bypassed using attacker-controlled DNS names that point at internal addresses, IPv6 representations, decimal IP notation, or URL encoding tricks. You must resolve the hostname and validate the actual IP addresses.
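
To illustrate the gap, the sketch below compares a string blocklist with a check on the parsed address. The blocklist, both helper functions (`naive_is_blocked`, `literal_is_internal`), and the sample URLs are hypothetical and not part of YOLOv5 or the fix. The sketch handles IP literals only and performs no DNS lookup.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical blocklist of the kind the answer above warns against.
BLOCKED_HOSTS = {"localhost", "127.0.0.1", "169.254.169.254"}


def naive_is_blocked(url: str) -> bool:
    """String comparison only: misses other spellings of the same address."""
    return (urlparse(url).hostname or "") in BLOCKED_HOSTS


def literal_is_internal(host: str) -> bool:
    """Parse an IP literal (dotted, decimal integer, or IPv6) and classify it."""
    if host.isdigit():
        addr = ipaddress.ip_address(int(host))  # e.g. 2130706433 is 127.0.0.1
    else:
        addr = ipaddress.ip_address(host)
    mapped = getattr(addr, "ipv4_mapped", None)  # unwrap ::ffff:a.b.c.d
    if mapped is not None:
        addr = mapped
    return addr.is_private or addr.is_loopback or addr.is_link_local


for url in ("http://2130706433/", "http://[::ffff:127.0.0.1]/"):
    host = urlparse(url).hostname
    print(url, "blocklist hit:", naive_is_blocked(url),
          "internal:", literal_is_internal(host))
```

Both sample URLs slip past the string blocklist even though each refers to the loopback address. Many resolvers accept such spellings, which is why the check belongs on the resolved address.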

Can static analysis detect SSRF?

Yes. Static analysis tools can flag patterns where user-controlled input flows into HTTP request functions like `requests.get()` without intervening validation. Taint analysis traces data from API inputs to network call sinks.
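
As a rough illustration of the idea, the following toy checker uses Python's `ast` module to flag `requests.get()` calls whose first argument is a parameter of the enclosing function. It is a deliberately simplified sketch: the `RequestsGetFinder` class and the `SAMPLE` source are made up for this example, and real taint analysis also tracks assignments, sanitizers, and calls across functions.

```python
import ast

# Made-up source to scan; detect() mirrors the vulnerable pattern.
SAMPLE = '''
import requests

def detect(im):
    if str(im).startswith("http"):
        return requests.get(im)

def ping():
    return requests.get("https://example.com/health")
'''


class RequestsGetFinder(ast.NodeVisitor):
    """Flag requests.get(<name>) where <name> is a parameter of the enclosing function."""

    def __init__(self):
        self.findings = []  # (function name, line number, argument name)
        self._scopes = []   # stack of (function name, parameter names)

    def visit_FunctionDef(self, node):
        params = {a.arg for a in node.args.args}
        self._scopes.append((node.name, params))
        self.generic_visit(node)
        self._scopes.pop()

    def visit_Call(self, node):
        func = node.func
        is_requests_get = (
            isinstance(func, ast.Attribute)
            and func.attr == "get"
            and isinstance(func.value, ast.Name)
            and func.value.id == "requests"
        )
        if is_requests_get and node.args and self._scopes:
            name, params = self._scopes[-1]
            arg = node.args[0]
            # A bare parameter reaching the sink counts as tainted here.
            if isinstance(arg, ast.Name) and arg.id in params:
                self.findings.append((name, node.lineno, arg.id))
        self.generic_visit(node)


finder = RequestsGetFinder()
finder.visit(ast.parse(SAMPLE))
for func_name, lineno, arg_name in finder.findings:
    print(f"{func_name}(): requests.get({arg_name}) at line {lineno}")
```

The call in `ping()` is not reported because its argument is a constant string rather than caller-supplied data.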

Critical SSRF Fix in Python ML Code

How Server-Side Request Forgery (SSRF) happens in Python requests.get() and how to fix it

Introduction

The models/common.py file in the YOLOv5 object detection framework handles model inference including loading images from various sources. At line 882, when the input string starts with 'http', the code calls requests.get() to fetch the image — but without any validation of where that URL actually points. This created a critical SSRF vulnerability that could allow attackers interacting with the Flask REST API to probe internal infrastructure, access cloud metadata endpoints, and potentially exfiltrate sensitive credentials.

The vulnerability is particularly dangerous because YOLOv5 is widely deployed as a service behind REST APIs, meaning user-supplied image URLs are a standard input vector. An attacker doesn't need any special privileges — they just need to supply a crafted URL where an image URL is expected.

The Vulnerability Explained

Server-Side Request Forgery occurs when a server-side application fetches a resource from a URL that an attacker can control, and the application doesn't validate whether that URL targets internal resources.

In models/common.py, the vulnerable pattern looked like this:

# Vulnerable code - fetches from any URL without validation
if str(im).startswith('http'):
    response = requests.get(im)
    # Process image from response...

The problem is straightforward: if im is "http://169.254.169.254/latest/meta-data/iam/security-credentials/", the server dutifully makes that request from within the cloud environment, where the metadata endpoint is reachable. If the API returns the fetched body, or an error message that includes it, the IAM credentials reach the attacker.

Concrete attack scenarios against this code:

- AWS credential theft: An attacker sends http://169.254.169.254/latest/meta-data/iam/security-credentials/ as an image URL to the YOLOv5 inference API. The server fetches IAM temporary credentials and returns them (or an error containing the response body).
- Internal port scanning: By sending URLs like http://192.168.1.1:8080/, http://192.168.1.1:3306/, etc., an attacker can map internal services by observing response times and error messages.
- Accessing internal services: URLs like http://localhost:6379/ could interact with internal Redis instances, potentially allowing data exfiltration or command execution.

Additionally, the diff reveals a secondary issue on line 514 where eval(meta["names"]) was used to parse model metadata — an arbitrary code execution risk if an attacker can control model files:

# Also vulnerable - arbitrary code execution via eval()
stride, names = int(meta["stride"]), eval(meta["names"])

The Fix

The fix introduces two new functions and applies multiple security improvements:

1. DNS-resolution-based URL validation (_validate_ssrf_url)

def _validate_ssrf_url(url: str) -> None:
    """Raise ValueError if url resolves to any private/internal address."""
    hostname = urlparse(url).hostname or ""
    try:
        results = socket.getaddrinfo(hostname, None)
    except socket.gaierror as e:
        raise ValueError(f"Could not resolve hostname '{hostname}': {e}") from e
    for _family, _type, _proto, _canonname, sockaddr in results:
        addr = ipaddress.ip_address(sockaddr[0])
        if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_reserved or addr.is_multicast:
            raise ValueError(f"Blocked request to internal address: {addr}")

This function resolves the hostname to actual IP addresses using socket.getaddrinfo(), then checks every resolved address against Python's ipaddress module properties. This approach is superior to regex-based hostname matching because:

- It catches DNS names that resolve to private IPs (e.g., internal.attacker.com → 169.254.169.254)
- It handles alternative IP representations (IPv4, IPv6, IPv4-mapped IPv6 addresses)
- It validates the actual network destination, not just the URL string
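
As a quick check of this behavior, the sketch below restates the function from above and calls it with IP-literal URLs, which `socket.getaddrinfo()` can handle without a DNS query. The `is_blocked` helper and the sample URLs are assumptions added for illustration.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def _validate_ssrf_url(url: str) -> None:
    """Restatement of the validation function shown above."""
    hostname = urlparse(url).hostname or ""
    try:
        results = socket.getaddrinfo(hostname, None)
    except socket.gaierror as e:
        raise ValueError(f"Could not resolve hostname '{hostname}': {e}") from e
    for _family, _type, _proto, _canonname, sockaddr in results:
        addr = ipaddress.ip_address(sockaddr[0])
        if (addr.is_private or addr.is_loopback or addr.is_link_local
                or addr.is_reserved or addr.is_multicast):
            raise ValueError(f"Blocked request to internal address: {addr}")


def is_blocked(url: str) -> bool:
    """Hypothetical helper: True if validation rejects the URL."""
    try:
        _validate_ssrf_url(url)
    except ValueError:
        return True
    return False


samples = (
    "http://169.254.169.254/latest/meta-data/",  # cloud metadata endpoint
    "http://127.0.0.1:6379/",                    # loopback
    "http://192.168.1.1:8080/",                  # private range
    "http://8.8.8.8/",                           # public address
)
for url in samples:
    print(url, "blocked" if is_blocked(url) else "allowed")
```

Only the last URL passes, since its address falls outside every range the function rejects.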

2. Redirect-safe fetching (_request_ssrf_url)

def _request_ssrf_url(url: str, max_redirects: int = 5):
    """Fetch a URL after validating against SSRF..."""

This wrapper handles HTTP redirects safely: each redirect target is re-validated before it is followed, up to max_redirects hops. This prevents redirect-based bypasses, where the initial URL passes validation but the response redirects to an internal address.

3. Replacing eval() with ast.literal_eval()

# Before (dangerous):
stride, names = int(meta["stride"]), eval(meta["names"])

# After (safe):
stride, names = int(meta["stride"]), ast.literal_eval(meta["names"])

This change ensures that model metadata parsing only accepts Python literals (strings, numbers, tuples, lists, dicts, booleans, None) and cannot execute arbitrary code.
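
The difference is easy to see in isolation. In this minimal sketch the two metadata strings and the `parse_names` helper are hypothetical; real model files may store names in a different shape.

```python
import ast

# Hypothetical metadata values, for illustration only.
benign = "{0: 'person', 1: 'bicycle'}"
malicious = "__import__('os').system('id')"


def parse_names(raw: str):
    """Return the parsed literal, or None if the input is not a plain literal."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Function calls, attribute access, and similar expressions land here.
        return None


print(parse_names(benign))     # a plain dict of class names
print(parse_names(malicious))  # rejected instead of executed
```

With `eval()`, the second string would have run a shell command. `ast.literal_eval()` refuses it because a function call is not a literal.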

New imports added:

import ipaddress
import socket
from urllib.parse import urljoin, urlparse  # urljoin added

Prevention & Best Practices

1. Always validate URLs at the network level, not the string level

String-based validation (regex on hostnames) is consistently bypassed. Resolve the hostname and validate the IP:

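import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_url(url):
    """Return True only if every address the hostname resolves to is public."""
    hostname = urlparse(url).hostname
    if not hostname:
        return False  # no host to resolve, e.g. a relative or malformed URL
    try:
        results = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False  # unresolvable hosts are rejected rather than fetched
    for result in results:
        addr = ipaddress.ip_address(result[4][0])  # result[4] is the sockaddr tuple
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            return False
    return True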

2. Validate after every redirect

HTTP 301/302 redirects can point to internal addresses. Disable automatic redirect following and validate each hop:

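_validate_ssrf_url(url)  # Validate the initial URL too
response = requests.get(url, allow_redirects=False)
hops = 0
while response.is_redirect:
    hops += 1
    if hops > 5:  # Cap the redirect chain
        raise ValueError("Too many redirects")
    url = urljoin(url, response.headers['Location'])  # Location may be relative
    _validate_ssrf_url(url)  # Validate before following
    response = requests.get(url, allow_redirects=False)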

3. Never use eval() on external data

Replace eval() with ast.literal_eval() for parsing data structures, or use proper serialization formats (JSON, YAML with safe loading).

4. Apply network-level controls

Even with code-level validation, defense in depth matters:
- Use network policies to restrict outbound traffic from application servers
- Block metadata endpoint access via instance firewall rules
- Use IMDSv2 (token-required) on AWS to mitigate metadata theft
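
To show why the IMDSv2 token requirement helps, the sketch below builds (but does not send) the two requests a legitimate client makes. The helper names are hypothetical; the endpoint paths and header names follow AWS's documented IMDSv2 flow.

```python
import urllib.request

IMDS = "http://169.254.169.254"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT with a TTL header. A plain GET cannot obtain a token."""
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: every metadata GET must present the session token header."""
    return urllib.request.Request(
        f"{IMDS}/latest/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


token_req = build_token_request()
print(token_req.get_method(), token_req.full_url)
```

A call like the vulnerable `requests.get(im)` sends only a GET with no attacker-chosen headers, so it can neither obtain nor present the token. This is a mitigation and does not replace URL validation.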

Key Takeaways

- requests.get() on user-supplied URLs is an SSRF sink: any code path where external input reaches an HTTP client must validate the resolved IP addresses, not just the URL string.
- DNS resolution is the correct validation layer: socket.getaddrinfo() plus the ipaddress module catches bypasses that string matching misses, including hostnames that resolve to internal addresses and alternative IP representations.
- Redirect following must re-validate each hop: the _request_ssrf_url() function with max_redirects prevents attackers from using open redirects to reach internal services.
- eval() on model metadata was a hidden RCE: the ast.literal_eval() replacement eliminates arbitrary code execution from malicious model files, a supply-chain attack vector.
- ML inference APIs are high-value SSRF targets: because they naturally accept URLs as input (for images, models, datasets), they require explicit SSRF protection that general web frameworks don't provide by default.

How Orbis AppSec Detected This

Source: User-supplied image URL string passed through the Flask REST API inference endpoint, flowing into the im parameter in models/common.py
Sink: requests.get(im) at models/common.py:882 — an unvalidated HTTP request to an attacker-controlled destination
Missing control: No DNS resolution validation, no IP address range checking, no redirect validation before fetching arbitrary URLs
CWE: CWE-918 (Server-Side Request Forgery)
Fix: Added _validate_ssrf_url() function that resolves hostnames via socket.getaddrinfo() and blocks requests to any private, loopback, link-local, reserved, or multicast IP addresses

Orbis AppSec automatically detected this vulnerability and opened a pull request with the fix. Try Orbis AppSec on your repositories to find and fix issues like this automatically.

Conclusion

This SSRF vulnerability in YOLOv5's models/common.py demonstrates a pattern common in ML/AI services: image processing pipelines that accept URLs as input without considering that the server's network position grants access to internal resources an external attacker shouldn't reach. The fix — resolving hostnames to IPs and validating each address against private ranges — is the gold standard for SSRF prevention in Python. Combined with redirect validation and the bonus eval() → ast.literal_eval() hardening, this patch closes both a network-level and code-execution attack surface.

If your application fetches resources from user-supplied URLs, implement DNS-resolution-based validation before every request. Don't trust the URL string — trust the resolved IP address.

Vulnerability at a Glance

Vulnerability: Server-Side Request Forgery (SSRF)
CWE: CWE-918
Language: Python
Root cause: `requests.get()` called on user-supplied URLs without IP address validation
Risk: Attackers can probe internal infrastructure, access cloud metadata, and exfiltrate credentials via crafted image URLs
Fix: Added `_validate_ssrf_url()` that resolves hostnames and blocks private/internal IP addresses before fetching