Each year in Web security, new vulnerability classes are published, new variations of existing ones are documented, and every website must be tested for them. Likewise, each year the already mountainous pile of HTTP requests needed to test for these issues grows, which significantly increases scan times. Threading scans, that is, making simultaneous HTTP requests, can’t solve this problem alone. A smarter approach to scanning is required: we need solutions that drastically reduce the number of HTTP requests per scan while maintaining vulnerability identification performance.
To begin the discussion, let’s take a look at SiteX, an everyday e-commerce website. The attack surface of SiteX comprises 100 distinct URLs with a total of 400 unique name/value pairs, 20 Web forms with a total of 60 fields, and 3 cookies that add 12 more input points. This gives a grand total of 472 injection points (400 + 60 + 12), all of which must be checked for vulnerabilities in a dynamic scan (aka run-time testing, fuzz testing, fault-testing, black-box testing, etc.).
Let’s begin testing SiteX for Cross-Site Scripting (XSS) by starting with a simple payload like <* XSSTEST>. If “<* XSSTEST>” comes back in the response page without being HTML-encoded, that is a good indication a vulnerability exists. 472 HTTP requests later, having fully exercised the aforementioned attack surface, we might have some interesting bug-bounty-worthy results.
Scaling this out further, consider what happens if we need to test 5 XSS payloads, 10 payloads, 20, or maybe even 50! Racking up such a lengthy list of injections is trivial when attempting the myriad filter-bypass tricks documented over the years. For example, full URL hex encoding converts “<* XSSTEST>” into “%20%3C%58%53%53%54%45%53%54%3E”, which might even work! We can try Base64 encoding as well if we’d like: “PFhTU1RFU1Q+”.
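To make the encoding tricks concrete, here is a minimal sketch of how such variants could be generated. It assumes the plain marker `<XSSTEST>` as the base payload, and the helper names `full_hex_encode` and `base64_encode` are purely illustrative, not part of any scanner.

```python
import base64


def full_hex_encode(payload: str) -> str:
    """Percent-encode every byte, not only the reserved characters."""
    return "".join(f"%{byte:02X}" for byte in payload.encode("utf-8"))


def base64_encode(payload: str) -> str:
    """Base64-encode the payload and return it as ASCII text."""
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


# Assumed base payload for illustration.
payload = "<XSSTEST>"

# Each variant is one more HTTP request per injection point.
variants = [payload, full_hex_encode(payload), base64_encode(payload)]

for variant in variants:
    print(variant)
```

Every encoding added to the list multiplies against the full attack surface, which is how the payload count climbs so quickly.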
In our scenario, such exhaustive XSS testing may require up to 23,600 HTTP requests (50 payloads x 472 injection points), which could take a long while to complete. Next, think about similar testing for SQL Injection, Content-Spoofing, Command Execution, Path Traversal, HTTP Response Splitting, and other injection-style classes of attack. All of a sudden, the number of HTTP requests for a full website vulnerability scan climbs into six figures rather fast. This is why scans routinely take hours, or even multiple days, to finish.
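The arithmetic is easy to check with a back-of-the-envelope sketch. The SiteX counts come from the example above; the assumption that every attack class also carries 50 payloads is ours, made only to show how quickly the total reaches six figures.

```python
# SiteX attack surface, as described in the example.
url_params = 400      # name/value pairs across 100 URLs
form_fields = 60      # fields across 20 Web forms
cookie_inputs = 12    # input points contributed by 3 cookies

injection_points = url_params + form_fields + cookie_inputs


def requests_needed(points: int, payloads_per_class: int, classes: int = 1) -> int:
    """Worst case: every payload of every class is sent to every injection point."""
    return points * payloads_per_class * classes


xss_only = requests_needed(injection_points, payloads_per_class=50)

# Assumption for illustration: six attack classes (XSS plus the five named
# above), each with 50 payloads.
six_classes = requests_needed(injection_points, payloads_per_class=50, classes=6)

print(injection_points, xss_only, six_classes)
```

Under those assumptions the worst case is 141,600 requests for a single site, before counting any retries or crawling overhead.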
At WhiteHat Security, one way we’ve been counteracting this problem is by analyzing our historical vulnerability scan data. We’ve scanned tens of thousands of real-world websites of all shapes, sizes, and types, over and over again for years, and identified countless vulnerabilities. The data shows that certain payloads are far more likely to succeed than others. It makes sense, then, to attempt those most likely to succeed first. If one payload works, injecting subsequent payloads in that class at that injection point becomes unnecessary.
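As a rough sketch of the idea, payloads can be ranked by their observed success rate before a scan starts. The `PayloadStats` record, the `rank_payloads` helper, and all of the numbers below are hypothetical, invented for illustration; they are not real scan data or our actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PayloadStats:
    payload: str
    attempts: int  # how many times the payload was injected historically
    hits: int      # how many of those injections found a vulnerability

    @property
    def success_rate(self) -> float:
        return self.hits / self.attempts if self.attempts else 0.0


def rank_payloads(stats: List[PayloadStats]) -> List[PayloadStats]:
    """Order payloads so the historically most successful are tried first."""
    return sorted(stats, key=lambda s: s.success_rate, reverse=True)


# Made-up history, purely to exercise the ranking.
history = [
    PayloadStats("PFhTU1RFU1Q+", attempts=1000, hits=2),
    PayloadStats("<XSSTEST>", attempts=1000, hits=120),
    PayloadStats("%3C%58%53%53%54%45%53%54%3E", attempts=1000, hits=15),
]

for entry in rank_payloads(history):
    print(f"{entry.success_rate:.3f}  {entry.payload}")
```

A production ranking would likely weigh more than a single ratio, but even this simple ordering puts the most productive payload at the front of the queue.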
Figure 1 illustrates payload-by-payload performance using a simple graph. On the horizontal axis, each tick mark represents one of our payloads. The vertical axis shows their relative effectiveness by percentage of websites. Clearly some payloads succeed on a large number of websites, while others do not. Figure 2 is subtly different. Instead of measuring website percentage, the vertical axis shows the relative total quantity of vulnerabilities each payload is credited with identifying. Some payloads are definitely more productive than others.
Back to our previous example: if “<* XSSTEST>” works on the first shot, the remaining 49 payloads, whatever they are, don’t have to be sent for that one injection point. We save 49 HTTP requests and the time it takes to send them. By smartly ordering our injections we drastically increase scan efficiency without sacrificing overall vulnerability identification performance, and we can verify this through regression testing. This is one of the areas our Research team, the “R” in “TRC,” focuses on.
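The early exit can be sketched in a few lines. This is a simplified illustration, not our scanner: `probe_point` and `fake_check` are hypothetical names, and the check function stands in for sending a real HTTP request and inspecting the response.

```python
from typing import Callable, Optional, Sequence, Tuple


def probe_point(
    point: str,
    ordered_payloads: Sequence[str],
    is_vulnerable: Callable[[str, str], bool],
) -> Tuple[Optional[str], int]:
    """Try payloads in order and stop at the first one that succeeds.

    Returns the winning payload (or None) and the number of requests sent.
    """
    requests_sent = 0
    for payload in ordered_payloads:
        requests_sent += 1
        if is_vulnerable(point, payload):
            return payload, requests_sent
    return None, requests_sent


def fake_check(point: str, payload: str) -> bool:
    """Stand-in for a real request: pretend only the plain marker is reflected."""
    return payload == "<XSSTEST>"


# 50 payloads, most likely first (the 49 variants are placeholders).
payloads = ["<XSSTEST>"] + [f"variant-{i}" for i in range(1, 50)]

hit, sent = probe_point("search?q", payloads, fake_check)
saved = len(payloads) - sent
print(hit, sent, saved)
```

When nothing succeeds, all 50 requests are still sent, so the savings depend entirely on how well the ordering predicts the first hit.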
From here, scan efficiency in our technology gets even cooler, or well, more sophisticated at least. A while back we introduced data-backed conditional logic: how SiteX responds to one test determines what the next test will be. For example, if SiteX does not echo “>” and “<”, there is no need to inject any more payloads that depend on those characters. Doing this sharply cuts down the number of requests a scan might otherwise require.
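A toy version of that conditional step might look like the following. It is a sketch under stated assumptions: the `respond` callables simulate how an injection point reflects input, and the probe string, function names, and payload list are all made up for illustration.

```python
import html
from typing import Callable, List, Sequence


def echoes_angle_brackets(respond: Callable[[str], str]) -> bool:
    """Spend one request on a harmless probe to see if '<' and '>' come back raw."""
    probe = "zq<7>qz"
    return probe in respond(probe)


def select_payloads(respond: Callable[[str], str], payloads: Sequence[str]) -> List[str]:
    """Drop payloads that rely on angle brackets when the target does not echo them."""
    if echoes_angle_brackets(respond):
        return list(payloads)
    return [p for p in payloads if "<" not in p and ">" not in p]


def encoding_target(value: str) -> str:
    """Simulated injection point that HTML-encodes reflected input."""
    return f"<p>You searched for {html.escape(value)}</p>"


def raw_target(value: str) -> str:
    """Simulated injection point that reflects input unchanged."""
    return f"<p>You searched for {value}</p>"


candidates = [
    "<XSSTEST>",
    "<script>alert(1)</script>",
    '" onmouseover="alert(1)',
    "javascript:alert(1)",
]

print(len(select_payloads(encoding_target, candidates)))
print(len(select_payloads(raw_target, candidates)))
```

One probe request decides whether an entire branch of payloads is worth sending, which is the basic shape of a decision tree over test results.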
Figure 3 is a dynamically generated diagram of the logic flow of our payloads that illustrates a little of how this looks. At the risk of revealing some intellectual property, Figure 4 zooms in on a particular area of the decision tree and provides a somewhat clearer picture of what’s happening behind the scenes.
As a Software-as-a-Service vendor, we have the ability to see and measure payload performance and use it to our advantage, and to our customers’ advantage. That is a luxury the desktop scanner guys do not have. Every day, with each new scan, with each new website we test, with each new payload to test, we get just a little bit better. Every day, a little bit smarter.