The Future of ML at WhiteHat Security

In a previous blog post, we concentrated on one task that now involves machine learning (ML). It is the one most visible and valuable to our clients: verifying vulnerabilities and eliminating false positives from the results that we deliver to them.

While this is indeed an important task, there’s a lot of other hard work that our team does to lay the foundation for the vulnerability discovery process before we can deliver those results.

To fully assess a client website, we first need to map and scan it completely, discovering all of its links, forms, APIs, and so on. Currently, a lot of human time and resources are dedicated to these tasks.

For example, a substantial share of the sites under our management are protected by some form of authentication, which can be quite complicated at times. For these sites to be scanned automatically, ‘login handlers’ need to be developed for each site.

While we do have sophisticated systems in place that make this work as streamlined as possible, it often still requires a human touch. Using ML, we could potentially speed up this process even more, further decreasing time to value for our clients.

Neural networks, which are highly effective at pattern matching, could be used to categorize websites in advance, steering the ‘login handler’ work towards appropriate, predefined routes.
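To make the routing idea concrete, here is a minimal sketch of what pre-categorizing a login page might look like. Everything in it is an assumption for illustration: the feature names, the route labels (`sso_redirect`, `multi_factor`, `simple_form`, `needs_human_review`) and the hand-written rules are hypothetical, and the rules merely stand in for the prediction a trained neural network would make from the same kind of features.

```python
from html.parser import HTMLParser


class LoginFeatureParser(HTMLParser):
    """Counts a few hypothetical signals that hint at a page's login style."""

    def __init__(self):
        super().__init__()
        self.features = {
            "password_inputs": 0,
            "text_inputs": 0,
            "otp_inputs": 0,
            "sso_links": 0,
        }

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input":
            # Attributes without a value come back as None, hence the "or".
            input_type = (attrs.get("type") or "text").lower()
            name = (attrs.get("name") or "").lower()
            if input_type == "password":
                self.features["password_inputs"] += 1
            elif input_type != "hidden" and "otp" in name:
                self.features["otp_inputs"] += 1
            elif input_type in ("text", "email"):
                self.features["text_inputs"] += 1
        elif tag == "a":
            href = (attrs.get("href") or "").lower()
            if any(marker in href for marker in ("saml", "oauth", "sso")):
                self.features["sso_links"] += 1


def extract_features(html):
    """Parse a login page and return its feature counts."""
    parser = LoginFeatureParser()
    parser.feed(html)
    return parser.features


def route_login_handler(features):
    """Pick a predefined route; a trained classifier would replace these rules."""
    if features["sso_links"] > 0:
        return "sso_redirect"
    if features["otp_inputs"] > 0:
        return "multi_factor"
    if features["password_inputs"] > 0:
        return "simple_form"
    return "needs_human_review"
```

For instance, `route_login_handler(extract_features(page_html))` on a page with a single password field would pick the `simple_form` route, while a page that matches none of the signals falls through to a human engineer.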

Another task that currently takes a lot of effort is ‘form training.’ Naturally, WhiteHat Security is all about safe production testing. Our clients trust us to find the vulnerabilities in their sites without disrupting the functionality of their assets. This is why each and every HTML form that can potentially change a website's state is carefully considered, prefilled with the correct data, and approved by a team of security engineers.

This is an essential part of our business cycle, but it takes time and can contribute to a longer-than-desired time to value. We think ML can be of use here as well. One can think of a form and the values it needs to be filled with as a pair of different languages. ‘Form training’ then starts to resemble a translation task from one language to another. In recent years, machine translation has improved substantially, and we hope the same techniques and algorithms can be used for our purposes as well.
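As a rough illustration of the "two languages" framing, the sketch below turns a form and its approved values into a pair of token sequences, the shape of training example a sequence-to-sequence translation model typically consumes. The field dictionaries, the token markers (`<field>`, `<value>`, `<skip>`) and the function names are all hypothetical choices made for this example, not a description of an existing system.

```python
def form_to_source_tokens(fields):
    """Flatten a form description into a 'source sentence' of tokens."""
    tokens = []
    for field in fields:
        tokens.append("<field>")
        tokens.append("type=" + field["type"])
        tokens.append("name=" + field["name"])
    return tokens


def values_to_target_tokens(fields, values):
    """Build the 'target sentence': one value per field, in form order."""
    tokens = []
    for field in fields:
        tokens.append("<value>")
        # Fields the engineers chose not to fill are marked explicitly.
        tokens.append(values.get(field["name"], "<skip>"))
    return tokens


def make_training_pair(fields, values):
    """Return a (source, target) pair for one approved form."""
    return form_to_source_tokens(fields), values_to_target_tokens(fields, values)
```

Each form approved by a security engineer would yield one such pair, so the existing manual work could double as training data for a model that proposes values for new forms.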

In truth, the potential for ML at WhiteHat is vast, and we have only scratched the surface. We hope that one day we can offer a fully automated, client self-service scanning product. Thanks to its inherent scale and cost savings, it could be geared towards wider market adoption.

But such a product can only be a part of company strategy for now, because as powerful as ML currently is, it still needs to be designed, trained, and fed with new and ever-changing data and algorithms.

Our strategy is not to replace our engineers but to empower and elevate them with ML, to free them from menial tasks and let them work on hard, cognitively challenging problems worthy of their attention.