
Will Machines Replace Us?

Automatic detection vs. manual detection

By Andres Cuberos | February 13, 2018 | Category: Philosophy

More than 20 years have passed since Garry Kasparov, the chess world champion, was defeated by Deep Blue, the supercomputer designed by IBM. For many people, that event was proof that machines had managed to exceed human intelligence [1]. This belief raised many doubts and concerns about technological advance, ranging from workers worried about their jobs to beliefs, fueled by Hollywood, that the apocalypse was coming and that machines would conquer and oppress our world.

Leaving all fiction aside, the first concern was well grounded and made some sense. Every year we witness new machines on the market that can complete a task with a precision and speed capable of outdoing tens of experienced workers. By machine I am not talking about a robot with the appearance of a young Arnold Schwarzenegger wearing sunglasses. It could be any device programmed to complete a specific task.

Examples include self-driving cars, robotic arms that minimize costs and increase the efficiency of a process and, in the field of Information Security, tools that can detect errors and vulnerabilities in a web app. What can we, information security specialists, expect from all of this? Are we becoming more and more expendable?

Robot holding a firearm with an explosion in the background
Figure 1. The inevitable outcome that Hollywood shows us

Fortunately, our future does not look so bleak. Even though there are many powerful automated tools for vulnerability detection in web applications, the human role is still vital if a detailed and effective security analysis is desired. There are still situations in which we have the upper hand:

  • Tools may have a large database of vulnerabilities, know how to find them and what risk level they imply. However, they do not have that human factor and expertise that comes with experience, that instinct that allows a security analyst to identify which vulnerabilities can be combined in order to create a more critical attack vector. Expertise can also allow a person to find vulnerabilities a machine may overlook.

  • Yes, analyzers generate a report with all their findings and their criticality, but how complete is it? It tells you how many input fields are affected by a vulnerability, but does it tell you which of those inputs allow the extraction of sensitive data? Does it tell you how to take advantage of a form in order to modify a database? Sadly, the answer is no: automated tools only determine the existence of a flaw. How that flaw can be exploited and leveraged in an attacker's favor to affect a particular business scenario is strictly a human ability, acquired thanks to the expertise we previously mentioned.

  • False positives, perhaps an analyzer's greatest flaw, are reports of a vulnerability that, in reality, does not exist [2]. They are a very common problem among these tools, and they result from the tools' inability to exploit a given flaw and thereby confirm it. A tool that does not properly filter out false positives can do more harm than good. If it was used to avoid the cost of hiring a security professional, you now have X vulnerabilities reported. Out of those, you have no idea which ones are false positives. The task of filtering those out now falls on the developer, someone who has not necessarily mastered the topic of security. Was the remedy worse than the illness?

    False positives are also one of the reasons why these tools are not commonly used in Continuous Integration environments. If we program an integrator to check every change made to the source code, and to stop the deployment of the app if an error is found, false positives could make the deployment of an app a pain in the ass.

  • Netsparker, the developer of one of these tools, agrees with this position [3]: no analyzer is capable of detecting all the vulnerabilities classified among the Top 10 most critical. They reach the conclusion that an analyzer cannot determine whether an application is working as intended and aligned with the company's objectives, whether the sensitive information (which varies depending on the type of business) is being properly protected, or whether the users' privileges are being properly assigned, among many other cases where human reasoning must make the final decision.
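
The Continuous Integration concern from the list above can be illustrated with a small sketch. It is not the configuration of any real pipeline or scanner: the `Finding` fields, the severity labels and the `should_block_deploy` gate are all hypothetical names chosen for illustration. It only shows why a gate that trusts the tool blindly stops a deployment on a single unconfirmed finding.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One scanner finding. Hypothetical fields, not any tool's real report format."""
    name: str
    severity: str    # "low", "medium" or "high"
    confirmed: bool  # True only once someone has verified the flaw is real


def should_block_deploy(findings, require_confirmation=False):
    """Return True if the pipeline should stop the deployment.

    With require_confirmation=False the gate trusts the scanner blindly,
    so one high-severity false positive is enough to stop the build.
    """
    for finding in findings:
        if finding.severity != "high":
            continue
        if require_confirmation and not finding.confirmed:
            continue
        return True
    return False


# Hypothetical scan result in which the only high-severity entry is unconfirmed.
report = [
    Finding("Reflected XSS in search field", "high", confirmed=False),
    Finding("Missing security header", "low", confirmed=False),
]

print(should_block_deploy(report))                             # gate trusts the tool
print(should_block_deploy(report, require_confirmation=True))  # gate waits for triage
```

The second call only lets the build through because someone still has to set `confirmed`, which is exactly the human triage step the tool was supposed to save.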

Our goal is not to take merit away from these tools; we use many of them in our professional lives and they are very strong allies. What we want is to correct the mistaken belief that those tools are sufficient to decide whether a web app is secure or not.

To do this, we ran our own experiment. We used the insecure web application bWAPP and the automated analyzers W3af, Wapiti and OWASP ZAP. All three are open source and can be run from the command line, which makes it possible to use them in a Continuous Integration environment. For bWAPP, we assumed a total of 170 vulnerabilities, based on the results of the company that developed the app [4]. Let's see how our contestants performed:

Table 1. bWAPP Vulnerability Analysis
Tool Detected % Not Detected Time

In Table 1, ZAP-Short refers to the ZAP tool with only the XSS and SQLi plugins enabled, and ZAP-Full refers to the same tool with all of its plugins enabled. It is important to note that the application's authentication had to be disabled so that the tools could work properly from the command line. This not only takes the experiment further away from reality, but also leaves a layer of the web app unanalyzed.

Another important detail is that the analyzers were not aimed at the main site, as they would be in a real test. The target of the attack was a specific bWAPP page where links to all the other pages are listed, so that the tool could identify every page. bWAPP uses forms to reach all other pages, which is why aiming the attack at the main page would result in 0 sites of interest being found. There are tools such as Burp that solve this problem by evaluating the forms [5], but others fail in the same situation because they cannot navigate the main site.
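
A minimal sketch of this discovery problem, using only Python's standard `html.parser` module. The two HTML snippets are hypothetical and are not bWAPP's actual markup, and `TargetCollector` is an illustrative name, not part of any of the tools mentioned. The sketch assumes a form-aware crawler would read the options of a selection form, which is one possible way to evaluate forms.

```python
from html.parser import HTMLParser


class TargetCollector(HTMLParser):
    """Collect the pages a crawler could reach from a single HTML document."""

    def __init__(self, follow_forms):
        super().__init__()
        self.follow_forms = follow_forms
        self.targets = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            # Plain links are visible to every crawler.
            self.targets.append(attributes["href"])
        elif tag == "option" and self.follow_forms and attributes.get("value"):
            # Only a form-aware crawler looks at the choices a form offers.
            self.targets.append(attributes["value"])


def discover(html, follow_forms):
    collector = TargetCollector(follow_forms)
    collector.feed(html)
    return collector.targets


# Hypothetical main page: navigation happens through a form, not through links.
MAIN_PAGE = """
<form action="/portal" method="post">
  <select name="page">
    <option value="/xss.php">XSS</option>
    <option value="/sqli.php">SQL injection</option>
  </select>
  <button type="submit">Go</button>
</form>
"""

# Hypothetical index page: the same destinations listed as ordinary links.
INDEX_PAGE = '<a href="/xss.php">XSS</a> <a href="/sqli.php">SQL injection</a>'

print(discover(MAIN_PAGE, follow_forms=False))   # link-only crawler on the main page
print(discover(MAIN_PAGE, follow_forms=True))    # form-aware crawler on the main page
print(discover(INDEX_PAGE, follow_forms=False))  # link-only crawler on the index page
```

The link-only crawler finds nothing on the form-driven page, which is why the analyzers in our experiment were pointed at the page that lists every link instead.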

To facilitate the analysis of the results, let's take the best result (ZAP-Full) and the worst one (Wapiti), compare them against the whole surface of the application, and see what coverage was achieved.

Coverage comparison between two analyzers with the scope of an application
Figure 2. Visual representation of the best and worst results from Table 1: ZAP-Full and Wapiti

We can see that even the best of the tools we used left out over half of the vulnerabilities, and false positives may exist among the ones that were found. Also, it took an hour and a half to finish the analysis, a duration that is not appropriate for a Continuous Integration environment.
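
The arithmetic behind these two observations can be sketched as follows. Only the total of 170 vulnerabilities and the 90-minute scan time come from this article. The detected count of 80 and the 10-minute pipeline budget are placeholder assumptions for illustration, not figures from Table 1.

```python
ASSUMED_TOTAL = 170  # total bWAPP vulnerabilities assumed in this article


def percent_not_detected(detected, total=ASSUMED_TOTAL):
    """Percentage of the assumed vulnerabilities that a tool left out."""
    if not 0 <= detected <= total:
        raise ValueError("detected must be between 0 and total")
    return 100.0 * (total - detected) / total


def fits_ci_budget(scan_minutes, budget_minutes):
    """True if a scan finishes within the time a pipeline is willing to wait."""
    return scan_minutes <= budget_minutes


# Placeholder detected count (NOT taken from Table 1).
print(round(percent_not_detected(80), 1))

# A tool has to detect at least half of the total to miss no more than 50%.
print(percent_not_detected(85))

# 90 minutes is the scan time reported above; the 10-minute budget is an assumption.
print(fits_ci_budget(scan_minutes=90, budget_minutes=10))
```

Any tool that detects fewer than 85 of the 170 assumed vulnerabilities leaves out more than half of them, regardless of how long it runs.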

A developer who wishes to reduce costs by not hiring a security analyst, relying solely on the automated tools, would remediate the reported vulnerabilities and acquire a false sense of security. He would ignore that more than half of the flaws could still be present, just waiting to be found and exploited by a malicious user. This way, the resources saved during development will be spent, with interest, in the production stage.


Yes, the rivalry between humans and machines has been present for a long time now, and it will remain that way for a lot longer. However, it is not necessary to look at it as a rivalry in every aspect. In the field of Information Security, more than a rivalry, a complementary relationship can exist, where the tool helps the analyst perform repetitive tasks faster and the analyst adds his or her instinct and experience to detect the maximum number of vulnerabilities in an efficient manner, thereby giving a greater sense of security and satisfaction to the web application developers.

To paraphrase Garry Kasparov in his TED talk [6], where he uses a freestyle chess tournament as an example (in which amateur players with three common machines defeated chess grandmasters and supercomputers), the relationship between human and machine, through an effective process, is the perfect recipe to achieve our grandest dreams.

Handshake between a human and a robot
Figure 3. Alternative outcome to the human-machine relationship