Occasionally I used to get asked to look at web application architecture for companies. Companies that grow above a certain size or threat level often move to using inline caching proxies, inline cloud based WAF solutions (e.g. CloudFlare or Incapsula), or both. For a long time I’ve had a hard time explaining why this could be a problem but I finally ran into a confluence of problems that demonstrate why this is an issue. Let’s start with the major problem.
When you have a website that needs to use IP addresses, you’ll run into strange situations if you run an inline security proxy. The most important issue is that the IP address of the machine connecting to your web server is always that of the upstream proxy/ies and not that of the person connecting. The user connects to the proxy and the proxy connects to your website; therefore, your website always sees the same IP address. IP addresses are used for all kinds of security measures. They’re used for seeding secret strings in cookies in PHP. They’re used for doing flood detection. They’re used for brute force detection and lockouts. IPs are used all the time. But what happens when all the IPs look the same? To get around that, proxies have invented something called the X-Forwarded-For header, which can look like a lot of random things. It can look like any of the following:
X-Forwarded-For: 192.168.0.5
X-Forwarded-For: 192.168.1.2, 188.8.131.52
X-Forwarded-For: 184.108.40.206
X-Forwarded-For: localhost, 220.127.116.11
Because it’s an optional header, it can contain almost anything. Sometimes those things are real IP addresses (sometimes internal RFC 1918 address space and sometimes public) and sometimes it just contains garbage. Either way, most proxies have decided that the X-Forwarded-For header is the best header to use to tack on their information. So they tack the IP address of the user who is connecting to them onto the end of the string that they receive (or create a new string if there isn’t one already) and pass that to the web server. The web server then has to be smart enough to parse the string, grab the last IP address, and intelligently replace the IP address of the security proxy with the IP address listed in the X-Forwarded-For header. Inline devices that sit behind the security proxy have to be just as smart. This leads to all kinds of weird scenarios where an attacker who has already breached the network can spoof IP addresses by sending X-headers, but that is less likely.
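As a rough illustration of the parsing described above, here is a minimal Python sketch. It is not how rpaf or any particular proxy works internally; the function name `client_ip` and the `TRUSTED_PROXIES` address are made up for the example. The assumption is that the header is only believed when the TCP peer is one of your own proxies, and that garbage falls back to the peer address rather than to nothing.

```python
import ipaddress

# Hypothetical: addresses of the security proxies allowed to set the header.
TRUSTED_PROXIES = {ipaddress.ip_address("10.0.0.5")}


def client_ip(peer_addr, xff_header, trusted=TRUSTED_PROXIES):
    """Return the address a request should be attributed to."""
    peer = ipaddress.ip_address(peer_addr)
    # Only believe the header when the TCP peer is one of our own proxies.
    if peer not in trusted or not xff_header:
        return peer
    # The proxy appends the address it saw, so the last entry is the one
    # it vouches for; anything earlier was supplied by the client side.
    last = xff_header.split(",")[-1].strip()
    try:
        return ipaddress.ip_address(last)
    except ValueError:
        # Garbage such as "localhost": keep the peer address, never "nothing".
        return peer
```

The fallback is the design choice that matters here: whatever the header contains, the function always returns some address that actually appeared on the wire or was vouched for by a known proxy.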
To accomplish this goal of looking at the X-Forwarded-For header, many people turn to rpaf, which performs this task very easily. The problem is that if rpaf doesn’t see the header it doesn’t know what IP address to use, and it will instead default to nothing. So how do we get the inline proxy to send something that rpaf won’t understand? Simple: we use a null byte (shown below as %00 so you can visualize it, but normally it is not URL encoded):
GET / HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0
X-Forwarded-For%00: whatever
This will create a 400 error, because Apache doesn’t understand the request. However, the most important thing is what it looks like in the logs. Notice that the first log entry has an IP address and the second has none:
Mar 17 20:05:46 18.104.22.168 - - [17/Mar/2013:20:05:46 +0000] "GET / HTTP/1.1" 200 15 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0"
Mar 17 20:06:10 - - - [17/Mar/2013:20:06:10 +0000] "GET / HTTP/1.1" 400 56 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0"
An attacker’s mileage may vary depending on how the proxy treats the header with a null byte in it. The proxy may also do its own logging, which may render this attack useless. The most dangerous variant would be if an attacker can simply bypass the cloud-based WAF solution and go directly to the origin server. By bypassing the WAF the attacker doesn’t have to worry about how the proxy handles the null byte or any extra logging it may perform.
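One way to notice this kind of request is to check header names before anything tries to interpret them. The following Python sketch is only an illustration, and the function name `suspicious_header_names` is hypothetical. It assumes you have the raw request bytes available (for example in a proxy or a capture) and flags any header name that is not a valid token as defined by RFC 7230, which includes names containing a null byte.

```python
import re

# RFC 7230 "token" characters: the only bytes allowed in a header field name.
TOKEN = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def suspicious_header_names(raw_request: bytes):
    """Return header names in a raw request head that are not valid tokens."""
    head = raw_request.split(b"\r\n\r\n", 1)[0]
    lines = head.split(b"\r\n")[1:]  # skip the request line
    bad = []
    for line in lines:
        name, sep, _value = line.partition(b":")
        # Flag lines without a colon and names with bytes outside the token set.
        if not sep or not TOKEN.fullmatch(name):
            bad.append(name)
    return bad
```

Fed the request shown earlier, with a real null byte in place of %00, this returns the mangled X-Forwarded-For name, so the request can be logged with the peer address before it is rejected.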
Why would an attacker intentionally want to send a request that creates a 400 error? There are lots of potential reasons. A few of the fine folks on Twitter suggested the following:
- Fingerprinting the operating system
- Filling up the logs
- Using the user-agent to seed the system logs with a remote file include
- Using the user-agent to seed the system to create XSS attacks in log parsers
- Distraction from another attack
There may be many additional reasons that a request that creates a 400 error may be useful, but the point is that as a result there’s no IP address associated with the request in the logs in Apache.
Sometimes proxies may communicate very sensitive information to the server, so that the server knows that it’s talking to the right thing. These secrets can be just about anything. Let’s say for instance that knowing that secret would allow you to contact the server directly and it would believe you are the proxy. Then let’s say the proxy and the web server decide to use another X header instead of X-Forwarded-For, so that an attacker who doesn’t know the real header name will be unable to spoof another IP address. Here is where TRACE comes back to haunt us. The HTTP method TRACE comes back once every few years to cause problems, and for some reason it’s still enabled by default more often than not. With TRACE an attacker can see what they sent to the server. But because they are not connecting directly to the server but instead to the proxy, what the attacker really sees is what the proxy is sending to the web server. Here’s what it might look like:
TRACE / HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate

HTTP/1.1 200 OK
Date: Sat, 16 Mar 2013 22:05:22 GMT
Server: Apache
Content-Type: message/http
X-Secret-info: lkjfalkjsfoij2oif4oijalskdfjsecretstringgoeshere12342134
Obfuscated-Client-Ip: 22.214.171.124
Content-Length: 348
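If you want to audit your own setup for this kind of leak, the comparison is easy to automate. The Python sketch below is illustrative only: the function name `headers_added_upstream` is made up, and it assumes you have already captured both the request you sent and the request text that TRACE echoed back. It reports every header that appears in the echo but was not in what you sent, which is exactly what the proxy added on the way to the origin.

```python
def headers_added_upstream(sent_request: str, trace_echo: str):
    """Return headers present in the TRACE echo but absent from what was sent."""

    def parse(message):
        headers = {}
        for line in message.splitlines()[1:]:  # skip the request line
            if not line.strip():
                break  # blank line ends the header section
            name, sep, value = line.partition(":")
            if sep:
                # Header names are case-insensitive, so compare in lowercase.
                headers[name.strip().lower()] = value.strip()
        return headers

    sent = parse(sent_request)
    echoed = parse(trace_echo)
    return {name: value for name, value in echoed.items() if name not in sent}
```

Anything this returns for your own site is information an outside party can read the same way, which is a good argument for turning TRACE off.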
So relatively easily the attacker now knows the secret and the obfuscated header that the web server is using as a replacement for IP addresses. Assuming the web server allows inbound connections from the Internet and the real IP address of the web server can be found out, the attacker can now communicate with the web server as if they were coming from another IP address. This is not an ideal scenario. So at an absolute minimum, disabling TRACE is a really important and easy step to take. But doing forensic logging that doesn’t rely on rpaf or other tricks to figure out the IP address from alternate HTTP headers is also a good idea.
Special thanks to @gmanfunky @dan_crowley @ivanristic @dallendoug @cgkades @desi_juggaad and @therealsaumil for the help with uses for 400 errors!