Gherkin on steroids: How to document detailed attack vectors


Rafael Ballestas

Security analyst

Updated

Mar 13, 2018



8 min

In the field of information security, 'finding all vulnerabilities' is as important as 'reporting them as soon as possible'. For that, we need an effective means of communicating with all stakeholders. We have previously proposed using Gherkin for this. In that entry, we showed how to use Gherkin's syntax to document attack vectors, i.e., how to find and exploit vulnerabilities in an app. We also showed the basics of the language, so if you haven't read it already, we recommend that you take a look at it.

More keywords

Sometimes you need to specify a piece of text that is longer than fits on a reasonable-length line. For that, Gherkin has docstrings ("""):

Specifying long input.

You may write anything between the docstring delimiters, but each delimiter must be on its own line, and the indentation of the content is relative to the opening delimiter. Docstrings are particularly useful for citing code, output from CLI programs and unstructured plain text.
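To make the indentation rule concrete, here is a minimal Python sketch of how a reader of a .feature file could extract a docstring. The function name read_docstring and the sample step are made up for illustration; this is not the code of any real Gherkin parser.

```python
def read_docstring(lines):
    """Return the text between the first pair of triple-quote delimiter lines.

    Indentation is measured from the column of the opening delimiter, so
    anything indented further than the delimiter keeps its extra spaces.
    """
    delimiter = '"""'
    start = next(i for i, line in enumerate(lines) if line.strip() == delimiter)
    end = next(i for i in range(start + 1, len(lines))
               if lines[i].strip() == delimiter)
    indent = len(lines[start]) - len(lines[start].lstrip())

    body = []
    for line in lines[start + 1:end]:
        if line[:indent].strip() == "":
            body.append(line[indent:])   # drop only the delimiter's indentation
        else:
            body.append(line.lstrip())   # line starts left of the delimiter
    return "\n".join(body)


# Hypothetical step with a docstring, as it could appear in a .feature file.
step = [
    "    Then I see the following text:",
    '      """',
    "      first line",
    "        indented line",
    '      """',
]
print(read_docstring(step))
```

The second content line keeps two leading spaces because it sits two columns to the right of the opening delimiter.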

For 'structured' plain text, Gherkin has the Data Table syntax element (not to be confused with the Examples tables of Scenario Outlines):

Tabular data with tables.

You don’t have to align the pipes (|) as above, but it makes your .feature file look nicer. Gherkin doesn’t care about alignment, only that every row has the same number of columns.
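The column-count rule can be sketched in a few lines of Python. This is only an illustration: parse_table is a hypothetical helper, it ignores escaped pipes, and real Gherkin parsers do considerably more.

```python
def parse_table(lines):
    """Split Gherkin data-table rows into cells and check the column count."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if not (line.startswith("|") and line.endswith("|")):
            raise ValueError(f"not a table row: {line!r}")
        # Alignment is irrelevant: cells are trimmed after splitting on pipes.
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        rows.append(cells)
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError(f"rows have different column counts: {sorted(widths)}")
    return rows


# Aligned and unaligned rows are treated the same.
print(parse_table(["| user | shell     |", "|root|/bin/bash|"]))
```

A table whose rows have different numbers of cells raises a ValueError instead of being accepted silently.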

Speaking of Scenario Outlines, as seen in our previous entry, these are very useful to specify many cause-effect relations:

When I do <action>
Then I get a <result>

Examples:
 | action | result |
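What a Scenario Outline does with its Examples table can be sketched as a simple placeholder substitution. The Python below is an illustrative approximation with a hypothetical function name (expand_outline) and made-up example rows, not the algorithm of any specific Gherkin tool.

```python
import re


def expand_outline(steps, header, rows):
    """Yield one list of concrete steps per Examples row."""
    for row in rows:
        values = dict(zip(header, row))
        # Replace every <name> placeholder with the value from this row.
        yield [re.sub(r"<([^<>]+)>", lambda m: values[m.group(1)], step)
               for step in steps]


steps = ["When I do <action>", "Then I get a <result>"]
header = ["action", "result"]
rows = [["A1", "R1"], ["A2", "R2"]]  # placeholder values, for illustration only

for scenario in expand_outline(steps, header, rows):
    print(scenario)
```

Each row of the table produces one concrete scenario, which is why one Outline can document many cause-effect pairs.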

Detailed attack vectors

Let us put these into practice by documenting in detail a vulnerability from our good old friend bWAPP, whose page simply gives us a cryptic message:

A mysterious message.

No matter how dumb it might seem, this is the first thing we need to document: how the page, app or whatever we’re testing works at the moment we tested it. We might use a separate "Normal use case" scenario as we did before.

Background

Or we can just plug that behavior right into the Background. This must also include, in detail, everything needed to run the app. Our target, bWAPP, is a PHP web application. Maybe you’re running it inside a bee-box virtual machine? Or did you set up the LAMP server yourself? On what operating system? All of this must be in the Background in order to allow reproducibility.

I, for one, am running bWAPP inside a Docker container made by raesene, so let there be a record of that in our attack feature:

All programs and versions are explicitly listed, plus the URL and field where the vulnerability was found. Note how we can refer to external evidence files, too.

Dynamic detection and exploitation

Now, the cryptic message in the page might be trying to tell us something. Where can we climb? As it turns out, anywhere. The next hint is in the URL. The page takes a GET parameter page=message.txt. So the file message.txt is a simple text file that contains the words above, and what the page does is display it. What if we change it to another text file? Let’s try /commandi.php.

Abusing the website.

Notice two things here. First, the PHP code and its comments are shown, so we could theoretically access the PHP source of any page on this server. Second, the HTML part is actually rendered in the browser, which could lead to an XSS or CSRF attack.

But wait. The server is not just "floating" in space: it lives inside a GNU/Linux machine. And 'everything' in such an OS is a file, many of which are plain-text files. One of them is of particular importance: /etc/passwd, which stores information about users. Let us try to display it in this page, setting page=/etc/passwd:

Listing users in the bWAPP server.

We can document that using Gherkin data tables, in a scenario of its own, due to the importance of the finding:

Documenting a particular exploitation.
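Turning the dumped file into a data table by hand is tedious, so a small script can help. The following Python sketch relies on the standard seven colon-separated fields of /etc/passwd; the function name passwd_to_table, the chosen columns and the two sample lines are assumptions for illustration, not the actual output obtained from the bWAPP server.

```python
PASSWD_FIELDS = ["user", "password", "uid", "gid", "comment", "home", "shell"]


def passwd_to_table(passwd_text, columns=("user", "uid", "home", "shell")):
    """Render selected /etc/passwd fields as an aligned Gherkin data table."""
    rows = [list(columns)]
    for line in passwd_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = dict(zip(PASSWD_FIELDS, line.split(":")))
        rows.append([fields[name] for name in columns])
    # Pad every cell to the widest entry of its column so the pipes line up.
    widths = [max(len(row[i]) for row in rows) for i in range(len(columns))]
    return "\n".join(
        "| " + " | ".join(cell.ljust(w) for cell, w in zip(row, widths)) + " |"
        for row in rows
    )


# Made-up sample in passwd format, not the real file from the server.
sample = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin"
)
print(passwd_to_table(sample))
```

The result can be pasted under a step of the scenario, with the raw dump kept as a separate evidence file.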

Now we know how many users there are on the server, and which of them have passwords set. The passwords themselves are stored in /etc/shadow in the form of hashes, which can be cracked if the passwords are weak. However, the shadow file, unlike the passwd file, is protected:

A failure.

'Drat!' Well, we’ll find a way around it sooner or later. Now that we have the hang of it, we can try other files. Since we always do the same thing, namely change page=message.txt to page=desired-file.txt, we can use a Scenario Outline, with one column for the input we give and another for the result:

Documenting many cases in one Outline.

 Scenario Outline: Dynamic detection and exploitation
   Given the message and the page=message.txt GET parameter in the URL
   When I change the GET parameter page=message.txt to another page=<path>
   Then I see the file <printed> in the page, if it is a text file

   Examples:
     | path | printed | evidence |

It is only natural to make several attempts, some of which fail and some of which succeed. All of them should be reported, in the most scientific spirit.

Static detection and possible fixes

Let us see why passwd could be read and shadow couldn’t. From 'inside' the server, let us look at the permissions of both files:

Notice that passwd has three r’s: one for the owner (the user root), one for the owner’s group (again, just root), and a final one for everyone else. However, shadow doesn’t have that last r, so it can only be read by root.
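The same reasoning can be checked programmatically with Python's standard stat module. The two modes below (644 and 640) are assumed values chosen to match the description above, not readings taken from the actual server; on a real system the mode would come from os.stat(path).st_mode.

```python
import stat


def describe(mode):
    """Return the ls-style permission string and whether 'others' may read."""
    return stat.filemode(mode), bool(mode & stat.S_IROTH)


# Assumed modes for illustration: regular files with 644 and 640 permissions.
print(describe(stat.S_IFREG | 0o644))  # three r's: owner, group and others
print(describe(stat.S_IFREG | 0o640))  # no r for others
```

The boolean in each result tells whether an unprivileged account, such as the one the web server typically runs as, would be able to read the file.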

While we’re at static detection of problems, let us see what is wrong with that page so we can try to fix it. The source code for the page simply takes the GET parameter page, and displays it.

Adapted from bWAPP code. Some lines omitted for clarity.

$file = $_GET["page"];
show_file($file);

function show_file($file)
{
    // Only regular files are displayed
    if (is_file($file)) {
        $fp = fopen($file, "r") or die("Couldn't open $file.");
        // Print the file line by line
        while (!feof($fp)) {
            $line = fgets($fp, 1024);
            echo($line);
            echo "<br />";
        }
    }
}

We can include this exact snippet, numbers and all, between docstrings, while discussing code exploration in our feature file.

The main problem here is that, as seen before, we can pass any file as the GET parameter and it will be shown. That input should have been validated and cleaned before being passed to show_file.

To fix that, a good first step would be to clean out strings like .., ./ and ../, which is what you would generally use to "climb higher Spidy":
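As a rough illustration of such a first step, here is a Python sketch (the page itself is PHP, so this is a translation of the idea, not bWAPP's code). It rejects suspicious input instead of rewriting it, and the function name looks_like_traversal is hypothetical.

```python
def looks_like_traversal(path):
    """Flag inputs containing the relative-path sequences '..' or './'."""
    # '../' is covered too, since it contains both '..' and './'.
    return ".." in path or "./" in path


for candidate in ["message.txt", "./notes.txt", "../../etc/passwd", "/etc/passwd"]:
    print(candidate, looks_like_traversal(candidate))
```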

This would block attackers who do not know the file system hierarchy of the server, but it still allows us to give absolute paths as the parameter. An even better defense would be to forbid displaying files outside the current folder:
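One way to confine requests to a folder is to resolve the requested path and compare it with the folder itself. The Python sketch below assumes a POSIX system and a hypothetical web root /var/www/bWAPP; the function name is_inside is also an assumption, and the original page would need an equivalent check in PHP.

```python
import os


def is_inside(base_dir, requested):
    """True if `requested`, resolved against `base_dir`, stays within it."""
    base = os.path.realpath(base_dir)
    # An absolute `requested` replaces the base in os.path.join,
    # and realpath collapses any '..' components and symlinks.
    target = os.path.realpath(os.path.join(base, requested))
    return os.path.commonpath([base, target]) == base


web_root = "/var/www/bWAPP"  # hypothetical location of the app
for candidate in ["message.txt", "/etc/passwd", "../../../etc/passwd"]:
    print(candidate, is_inside(web_root, candidate))
```

Comparing resolved paths, rather than searching the raw string for patterns, handles both relative climbing and absolute paths with the same test.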

But this still allows us to display the file with the heroes' passwords. In fact, it would be better just not to allow users to display files at their will.
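A common way to take that choice away from the user is an allowlist: the parameter carries a page name, and the server decides which file, if any, that name maps to. The Python sketch below uses a hypothetical mapping and function name purely for illustration.

```python
# Hypothetical allowlist: public page name -> file the server may display.
ALLOWED_PAGES = {"message": "message.txt"}


def file_for_page(page):
    """Return the file for a known page name, or None for any other input."""
    return ALLOWED_PAGES.get(page)


print(file_for_page("message"))
print(file_for_page("/etc/passwd"))
```

Anything that is not a known page name yields None, so no user-supplied path ever reaches the file system.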

More details

So far, we’ve documented in Gherkin:

the background where we’re running the vulnerable app,
the dynamic detection and exploitation phase, with several examples and pieces of evidence,
the important records we were able to extract from the app,
the static detection part, with specific bad code snippets, issues and suggestions.

To finish a proper .feature file, we’re missing, well, the feature itself, which is the vulnerability, or rather, the finding and exploitation thereof.

Remember that we can document features and scenarios using 'descriptions'. After the keywords Feature, Scenario, Scenario Outline or Example we can write anything we like, as long as no line starts with a keyword (including comments - you can’t mix descriptions with comments, I learned that the hard way).

It is usual to describe features with the format As <type of user> I want to <do something> In order to <get some result>. We can take advantage of such a structure to document the 'Scenario' and 'Actor' of the vulnerability, the 'Threat' and what records can be 'compromised'. We can also use that space to document anything else we consider to be globally important:

For anything else, use comments. I will include details such as the vulnerability code, CWE, CVE if present, and computed metrics such as CVSS scores in comments (#) at the beginning of the file. See the full feature below.

And that is how we propose using this language to document attacks. You may ask: why Gherkin and not just plain text? Because it is line-oriented and has a light structure, we can define a template like the one discussed here, and we can enforce adherence to that format using the readily available parsers, linters and compilers for the language. We still need to work further on the template definition, so stay tuned.
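Because the format is line-oriented, even a very small script can check a file against a template. The Python sketch below is only an illustration: the list of required keywords is a hypothetical template, since the actual template is not yet defined, and a real setup would rather rely on an existing Gherkin parser or linter.

```python
# Hypothetical template: keywords that must start at least one line each.
REQUIRED_KEYWORDS = ("Feature:", "Background:", "Scenario")


def missing_sections(feature_text):
    """List the required keywords that never start a line in the feature text."""
    starts = [line.strip() for line in feature_text.splitlines()]
    return [kw for kw in REQUIRED_KEYWORDS
            if not any(line.startswith(kw) for line in starts)]


draft = """\
# Comments are ignored by this check
Feature: Local file inclusion
  Scenario Outline: Dynamic detection and exploitation
"""
print(missing_sections(draft))
```

For the draft above, the check reports that the Background section is missing.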

Appendix: full feature

local-file-inclusion.feature here.
