In a previous blog, Kate Haworth described how she found a Path Traversal Attack vulnerability on one of our customers’ web applications. Kate and I created a webinar together describing her penetration test methodology and results, followed by my description of how Sentinel’s Dynamic scanning and Sentinel Source analysis would identify this vulnerability, as well as best practices in application security coding to avoid it.
For those of you who prefer words to pictures, here’s The Cure portion of the presentation.
How Dynamic Analysis finds path traversal opportunities
Sentinel looks for any parameter in the URL or in the body of the request that appears to contain a folder or take a file path. Example:
The scanner assumes that any file reference or directory function is dynamic. It is probably a function that loads and displays data from an external file, e.g.
echo file_get_contents("somedir/file.txt");
Because the site is calling this file dynamically, Sentinel will attempt to change the value to test for access to a hidden, sensitive file. This happens through a series of Injectors, each specific to the parameters and methods of a particular operating system. The test is black-box in style, with options for both OS types, based on attacker scenarios. The injectors check whether they can actually read the following root files. These two files are usually located in the root directory of their respective systems.
Once Sentinel confirms that it can see those files in a scan, it attempts to discover whether the root file can be altered, or whether a file can be inserted on the server outside the web document root. As you learned earlier in Part I of the path traversal attack, the ../ sequence tells an OS to go up one directory. Extraneous ../ sequences are ignored.
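To make the ../ behavior concrete, here is a minimal Java sketch of how an absolute path collapses. The class name, the method name, and the paths are hypothetical and for illustration only; a real operating system performs this resolution itself.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper that mimics how "." and ".." are resolved in an absolute path.
final class DotDotDemo {
    static String resolve(String path) {
        Deque<String> stack = new ArrayDeque<>();
        for (String segment : path.split("/")) {
            if (segment.isEmpty() || segment.equals(".")) {
                continue;             // skip empty segments and "current directory"
            }
            if (segment.equals("..")) {
                stack.pollLast();     // go up one level; a no-op once we are at the root
            } else {
                stack.addLast(segment);
            }
        }
        return "/" + String.join("/", stack);
    }

    public static void main(String[] args) {
        // Illustrative web root plus a value containing more "../" than needed.
        System.out.println(resolve("/var/www/html/" + "../../../../../etc/hostname"));
    }
}
```

Only three ../ sequences are needed to climb out of /var/www/html. The two extra ones are dropped at the root, which is why an attacker can pad a payload with many of them without knowing how deep the web root is.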
How Sentinel Static Code Analysis checks for path traversal vulnerabilities
Static source code analysis isn’t the black-box testing style of Dynamic. Because Source already has access to the source code, it can look deeper to discover where each language requests or allows input. Input in this case includes HTTP requests and responses, where the application exchanges data with an external service. It can also include calls to the database or filesystem. We look for inputs combined with a file system change request to find code vulnerable to path traversal and manipulation.
For your use, I’ve collected links to basic file manipulation methods for four of the most common languages. These are great sites from the language developers, and every ethical hacker should have them bookmarked:
Path Traversal Prone Functionality
We analyzed over 7,000 vulnerabilities across five languages and categorized the most dangerous functionality that might be vulnerable to path traversal with damaging results.
Vulnerable Patterns – Dynamic Template Inclusion
An attacker could gain control of the entire path via a request parameter (display). As you can see, the attack string here shows what could be used to read the request, get the value from the parameter, and embed a template file with that name.
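As a rough illustration of this pattern, here is a minimal Java sketch. It is not the code from the presentation: the class name is hypothetical, a plain Map stands in for the HTTP request parameters, and the template names are made up.

```java
import java.util.Map;

// Hypothetical stand-in for a controller; "params" models the HTTP request parameters.
final class TemplateInclusionDemo {
    // VULNERABLE: the template path comes straight from the "display" parameter.
    static String templateToInclude(Map<String, String> params) {
        String display = params.get("display");
        return display;   // a real application would now read and embed this file
    }

    public static void main(String[] args) {
        // Intended use: the page asks for one of its own templates.
        System.out.println(templateToInclude(Map.of("display", "home.tpl")));
        // Attack: the same parameter names a file the page was never meant to embed.
        System.out.println(templateToInclude(Map.of("display", "../../config/secrets.tpl")));
    }
}
```

Nothing sits between the parameter and the file name, so whoever controls display controls the whole path.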
Vulnerable Patterns – File Upload
This example is similar to Kate’s path traversal vulnerability, where the attacker is given partial control of the path via an uploaded file’s name. As you can see, I have commented all the steps: where the input comes into the application, how it is concatenated with strings, and how, on line 18, it is able to access the file system in a dangerous way.
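A minimal Java sketch of the same kind of flaw, under stated assumptions: the class, the uploads directory, and the file names are hypothetical, and the staysInsideUploadDir helper exists only to show the effect of the concatenation.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical upload handler fragment; names and directories are illustrative.
final class UploadPathDemo {
    static final Path UPLOAD_DIR = Paths.get("uploads");

    // VULNERABLE: the client-supplied file name is concatenated into the path.
    static Path destinationFor(String uploadedFileName) {
        return Paths.get(UPLOAD_DIR + "/" + uploadedFileName);
    }

    // Demonstration helper: does the composed path still live under uploads/?
    static boolean staysInsideUploadDir(Path destination) {
        return destination.normalize().startsWith(UPLOAD_DIR);
    }

    public static void main(String[] args) {
        System.out.println(staysInsideUploadDir(destinationFor("photo.png")));
        System.out.println(staysInsideUploadDir(destinationFor("../../app/config.xml")));
    }
}
```

The attacker only controls the last part of the path, but a file name that starts with ../ is enough to leave the upload directory.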
Vulnerable Patterns – File Management
A lot of applications need to move files around in a file system. In many cases, DevOps may not realize that the files are being touched by the user. This is an example of a generic file copy method that you might see in a PHP application. The difficulty is that it doesn’t protect itself: it takes whatever source and destination it is given and copies the file.
Generic functions that copy or delete resources without any validating controls are very problematic. They can lead developers who didn’t even write the underlying functionality to write features that are vulnerable.
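The presentation's example was in PHP. The following is a rough Java equivalent, offered as a sketch only: the class, the directory layout, and the file names are hypothetical, and the demo runs in a throwaway temporary directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

final class FileCopyDemo {
    // VULNERABLE: a generic helper that trusts both arguments completely.
    static void copy(String source, String destination) {
        try {
            Files.copy(Paths.get(source), Paths.get(destination),
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A feature built on copy() that passes a user-chosen name straight through.
    static String demo() {
        try {
            Path root = Files.createTempDirectory("copy-demo");
            Path publicDir = Files.createDirectory(root.resolve("public"));
            Files.write(root.resolve("secret.txt"),
                    "top secret".getBytes(StandardCharsets.UTF_8));

            String userChosenName = "../secret.txt";   // attacker-controlled value
            copy(publicDir + "/" + userChosenName, publicDir + "/export.txt");

            // The private file now sits in the public directory.
            return new String(Files.readAllBytes(publicDir.resolve("export.txt")),
                    StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The developer who wrote demo() never touched the file system directly, yet the feature is vulnerable because copy() performs no checks of its own.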
Vulnerable Patterns – Serving Files
Web servers are designed to serve files off disk. Writing your own functionality to do the same thing can be dangerous. In both examples the input is concatenated with a string before being used to read from the filesystem.
Vulnerable Patterns – Storing Files on Disk
Storing user content on the filesystem may not lead to the disclosure of filesystem contents, but it may instead allow the attacker to overwrite files on disk, which can be just as harmful in a different way.
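To show the overwrite case, here is a minimal Java sketch that runs in a temporary directory. The class name, the content directory, and the app.conf file are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class StoreFileDemo {
    // VULNERABLE: the user-chosen name decides where the content is written.
    static void store(Path contentDir, String userFileName, String content) {
        try {
            Files.write(Paths.get(contentDir + "/" + userFileName),
                    content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String demo() {
        try {
            Path root = Files.createTempDirectory("store-demo");
            Path contentDir = Files.createDirectory(root.resolve("content"));
            Path config = root.resolve("app.conf");
            Files.write(config, "debug=false".getBytes(StandardCharsets.UTF_8));

            // Nothing is read back to the attacker, but the config file is replaced.
            store(contentDir, "../app.conf", "debug=true");

            return new String(Files.readAllBytes(config), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```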
So how do you avoid path traversal opportunities?
Avoid composing file paths by concatenating untrusted data. You should be very wary whenever you see string + variable + string + variable in application code. In addition, a policy-based enforcement mechanism can really help when the user needs to manipulate the file system in some way, but you don’t want them to have direct control.
Indirect Object References
Developers may also need indirect object references. For example, if the application must choose among three files to use or reference (abc.txt, bcd.txt, cde.txt), create a numbered mapping to each file, as below:
1 -> /path/to/abc.txt
2 -> /path/to/bcd.txt
3 -> /path/to/cde.txt
The untrusted input will be the key (1, 2, or 3) so that the application can perform the lookup.
If concatenation with untrusted data cannot be avoided, validate the user input as well as the final composed path using a character whitelist.
In a more specific example, this is what an indirect object reference looks like in Java. I’ve used a simple hashmap to relate the input integer to the document. (Notice that I have changed that input from a string to an integer on Line 3.) I’ve mapped 1 to “numguess1” and so forth, so that when we read it there is zero chance of user input determining the path of the file.
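A minimal sketch of that idea follows. It is not the code from the presentation: the class name, the entries for 2 and 3, and the /path/to/ prefix are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class DocumentLookup {
    // The only file names the application will ever use; entries 2 and 3 are assumed.
    private static final Map<Integer, String> DOCUMENTS = new HashMap<>();
    static {
        DOCUMENTS.put(1, "numguess1");
        DOCUMENTS.put(2, "numguess2");
        DOCUMENTS.put(3, "numguess3");
    }

    // The untrusted input is only ever used as a key, never as part of the path.
    static Optional<String> pathFor(String rawInput) {
        int key;
        try {
            key = Integer.parseInt(rawInput);   // string -> integer
        } catch (NumberFormatException e) {
            return Optional.empty();            // not a number: no lookup at all
        }
        String name = DOCUMENTS.get(key);
        return name == null ? Optional.empty() : Optional.of("/path/to/" + name);
    }

    public static void main(String[] args) {
        System.out.println(pathFor("1"));
        System.out.println(pathFor("../../etc/hostname"));
    }
}
```

A traversal string fails at the integer conversion, and an unknown number fails at the lookup, so the path is always one the developer wrote.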
Validate your inputs
For validating the input alone, we need to make certain that the input contains only the name of a file or the name of a directory. This rarely requires characters other than alphanumerics and potentially a word separator (e.g. underscore, dash). A whitelist should be composed that matches your path and filename character set needs in the most restrictive manner possible.
On the first line here I have highlighted how to write a simple regular expression that sanitizes the input down to those alphanumeric values.
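As a sketch of what such a whitelist might look like in Java, assuming the allowed set is letters, digits, underscore, and dash (the class and method names are hypothetical):

```java
import java.util.regex.Pattern;

final class NameWhitelist {
    // Assumed character set: alphanumerics plus underscore and dash.
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9_-]+");

    // Strict option: reject anything that contains a character outside the whitelist.
    static boolean isValid(String input) {
        return input != null && ALLOWED.matcher(input).matches();
    }

    // Lenient option: strip every character that is not on the whitelist.
    static String sanitize(String input) {
        return input.replaceAll("[^A-Za-z0-9_-]", "");
    }

    public static void main(String[] args) {
        System.out.println(isValid("report_2024-01"));
        System.out.println(isValid("../report"));
        System.out.println(sanitize("../../etc/hostname"));
    }
}
```

Rejecting bad input is usually easier to reason about than silently rewriting it, but either way the dots and slashes that traversal depends on never reach the file system call.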
Validate the entire path
There are a couple of methods for validating the entire path.
“Safer File” approach
This is the safer file class in Java. I have extended the File class and added several patterns (lines 19 and 20) that set the valid character sets for file names and for each individual directory in the file path.
This is the actual constructor, which does the work when you instantiate this class. It creates a file object through super, checks the canonical path through super, does a directory check against the whitelist, and then checks the file name against the allowed character set.
These are the details of the directory check and the file name check. They are slightly different, as one needs to allow for an absolute path. A simple, easy-to-follow approach helps you avoid ‘holes’ in your application.
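The steps described above can be sketched roughly as follows. This is not the class from the presentation. It assumes a simpler design in which the caller supplies a base directory plus a relative path, and the class name, patterns, and isAccepted helper are all hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.util.regex.Pattern;

// Hypothetical sketch of a "safer file": a File subclass that validates its own path.
class SaferFile extends File {
    private static final long serialVersionUID = 1L;

    // Assumed whitelists; tighten them to match your own naming rules.
    private static final Pattern DIR_NAME = Pattern.compile("[A-Za-z0-9_-]+");
    private static final Pattern FILE_NAME =
            Pattern.compile("[A-Za-z0-9_-]+(\\.[A-Za-z0-9]+)?");

    SaferFile(File baseDir, String relativePath) throws IOException {
        super(baseDir, relativePath);                 // create the File through super

        // 1. Canonical path check: the resolved path must stay under the base directory.
        String base = baseDir.getCanonicalPath() + File.separator;
        if (!getCanonicalPath().startsWith(base)) {
            throw new IllegalArgumentException("Path escapes the base directory");
        }

        // 2. Directory check: every directory segment must match the whitelist.
        String[] parts = relativePath.split("[/\\\\]");
        for (int i = 0; i < parts.length - 1; i++) {
            check(DIR_NAME, parts[i], "directory");
        }

        // 3. File name check: the last segment must match the file name whitelist.
        check(FILE_NAME, parts[parts.length - 1], "file name");
    }

    private static void check(Pattern allowed, String value, String what) {
        if (!allowed.matcher(value).matches()) {
            throw new IllegalArgumentException("Invalid " + what + ": " + value);
        }
    }

    // Convenience helper for callers that prefer a boolean over an exception.
    static boolean isAccepted(File baseDir, String relativePath) {
        try {
            new SaferFile(baseDir, relativePath);
            return true;
        } catch (IllegalArgumentException | IOException e) {
            return false;
        }
    }
}
```

With the temp directory as the base, isAccepted(base, "reports/summary.txt") passes all three checks, while "../secret.txt" fails the canonical path check and "reports/bad name!.txt" fails the file name check. Because the checks run in the constructor, any code that receives a SaferFile can rely on the path having been validated.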
File name check: