Search the history: Searching for credentials in a repository
Security analyst
Updated Apr 29, 2020
Today, virtually every company that develops its own product uses some form of source control management tool. Such a tool tracks modifications to a source code repository, protects developers from losing work when conflicting changes overwrite each other, and ensures that they are always working on the right version of the source code.
Version control systems come in two main forms: centralized, where the repository lives in one place and multiple clients access it, and distributed, where every client holds a full copy of the repository. Git is one of the most widely used; it is an open-source distributed source code management system that allows you to create an independent line of development known as a branch. With this branch, you can work on your code independently, and when you are ready with your changes, you can store them as a commit. Git can then compare your changes with the main branch (this is called a diff), and finally you can merge them into the master branch. It also allows you to revert changes and to work on different versions of the same source code. Used by millions of developers, it is the basis of many platforms such as GitHub, GitLab, and Bitbucket, among others.
As you know, storing clear text passwords on your machine, in your code, or anywhere else (yes, I mean the sticky notes too) is a huge hole in your security. OWASP and CWE mark this as a vulnerability, yet many developers make this mistake by creating configuration files that contain credentials and uploading them to a repository.
Maybe you are thinking, "Who in the world is going to do that?" But this practice is more common than it appears. In September 2019, it was discovered that a big bank was storing highly sensitive data in a publicly accessible repository on GitHub. Maybe your company is doing this right now.
Git disclosure lab
To set up our lab, we are going to create an empty repository. In it, we create a database file with some credentials and commit the change:
db.sql.
setting up the lab.
Now we have a repository with clear text credentials. What do developers usually do to solve the problem? Let’s delete the credentials and commit the change:
db.sql modified.
deleting the credentials.
If this change goes to production, the file no longer contains credentials, but anyone with access to the repository can still see them in the commit history. Also, the credentials often stay unchanged, because rotating them would break interconnected systems.
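The lab can be reproduced with a short script. The following is a minimal sketch, assuming `git` is installed and on the `PATH`; the file contents, the password, and the commit messages are hypothetical stand-ins, since the article's actual `db.sql` is not reproduced here.

```python
import pathlib
import shutil
import subprocess
import tempfile

# Hypothetical file contents; they only stand in for the article's db.sql.
WITH_SECRET = "CREATE USER 'app'@'localhost' IDENTIFIED BY 'S3cr3t-Lab-Passw0rd';\n"
WITHOUT_SECRET = "-- credentials removed\n"


def git(repo, *args):
    """Run a git command inside `repo` and return its standard output."""
    cmd = [
        "git",
        "-c", "user.name=lab",               # throwaway identity so commits succeed
        "-c", "user.email=lab@example.com",
        *args,
    ]
    result = subprocess.run(cmd, cwd=repo, check=True, capture_output=True, text=True)
    return result.stdout


def build_lab(repo):
    """Commit a file with a credential, then commit its removal."""
    db_file = pathlib.Path(repo) / "db.sql"
    git(repo, "init", "--quiet")
    db_file.write_text(WITH_SECRET)
    git(repo, "add", "db.sql")
    git(repo, "commit", "--quiet", "-m", "add database file")
    db_file.write_text(WITHOUT_SECRET)
    git(repo, "commit", "--quiet", "-am", "remove credentials")
    # Return the current file content and the full patch history of the file.
    return db_file.read_text(), git(repo, "log", "-p", "--", "db.sql")


def run_lab():
    """Build the lab in a temporary directory; returns None when git is missing."""
    if shutil.which("git") is None:
        return None
    with tempfile.TemporaryDirectory() as repo:
        return build_lab(repo)
```

Calling `run_lab()` returns the current file content and the output of `git log -p`. The password is absent from the first and still present in the second, which is the problem this lab illustrates.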
To get credentials from a Git repository, we can use several tools.
In this example, we are going to use truffleHog because it searches for keys based on entropy. To install it, we simply need to use PyPI:
installing truffleHog.
We are ready to go.
Getting the credentials
One simple way to get credentials from a repository is to run the grep command with a keyword like username, password, key, or admin:
using grep.
As we can see, it shows us the file, line, and content of that line of code. In a large code base, this is useful for locating files that could contain clear text credentials.
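The same keyword search can be sketched in a few lines of Python. This is an illustration of the idea rather than a replacement for grep: the keyword list comes from the text above, and the function name and the in-memory `files` mapping are assumptions made to keep the example self-contained.

```python
import re

# Keywords taken from the text; extend the list for your own code base.
KEYWORDS = ("username", "password", "key", "admin")
PATTERN = re.compile("|".join(KEYWORDS), re.IGNORECASE)


def find_keyword_lines(files):
    """Return (file name, line number, line) for every line containing a keyword.

    `files` maps a file name to its text content.
    """
    hits = []
    for name, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((name, number, line.strip()))
    return hits
```

Like grep, this reports many false positives (any line mentioning "key", for example), so the output is a list of places to inspect, not a list of confirmed credentials.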
Next, we can search the history of that file using git:
history git.
There is a more efficient way to do this: truffleHog. This tool automatically searches through the entire repository and prints the strings with high entropy:
history git.
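To make "high entropy" concrete, here is a minimal sketch of an entropy check based on Shannon entropy. It is not truffleHog's code: the function names, the threshold, and the minimum length are illustrative assumptions.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_tokens(line, threshold=4.0, min_length=20):
    """Return whitespace-separated tokens that look random.

    `threshold` and `min_length` are illustrative assumptions, not
    truffleHog's actual settings.
    """
    return [
        token
        for token in line.split()
        if len(token) >= min_length and shannon_entropy(token) > threshold
    ]
```

Randomly generated keys use many different characters with similar frequency, so they score high, while ordinary words and repeated characters score low. This is why an entropy search can find secrets that no keyword points to.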
Solution
As we have seen, if a developer puts sensitive data into a file and commits the changes, an attacker can get our credentials by searching the history of our source code. What can we do about that?
First of all, we can avoid storing credentials in the code at all by using environment variables and pipelines; every major source code management platform offers this feature. Pipelines are the top-level component of continuous integration, delivery, and deployment. With them, we can test, build, and deploy our projects, and by setting our credentials there as environment variables, we keep them out of the repository and support the principle of least privilege.
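On the application side, reading credentials from the environment might look like the sketch below. The variable names `DB_USER` and `DB_PASSWORD` and the function name are hypothetical; use whatever names your pipeline defines.

```python
import os


def load_db_credentials(env=None):
    """Read database credentials from environment variables.

    DB_USER and DB_PASSWORD are hypothetical names chosen for this example.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("DB_USER", "DB_PASSWORD") if not env.get(name)]
    if missing:
        # Fail early instead of falling back to a hard-coded default.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env["DB_USER"], env["DB_PASSWORD"]
```

Failing when a variable is missing is a deliberate choice here: a hard-coded fallback value would bring the credential back into the source code.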
Another thing we can do is delete them from the repository using tools like BFG Repo-Cleaner, which searches through the commit history and removes sensitive data. Using our example, we can put our credentials into a file:
passwords.txt.
Then run the BFG Repo-Cleaner in our repository:
running BFG.
Now if we check the history of our file, we will see that the credentials have been removed:
history git removed.
If, for whatever reason, we cannot avoid storing passwords in configuration files, we can store them encrypted with a strong cryptographic algorithm. Please avoid base64 for this purpose: it is an encoding, not encryption, so it can be detected and decoded easily.
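A short sketch shows how little protection base64 offers: anyone who finds the encoded value can reverse it with the standard library, and no key is involved. The function name and the sample password used in the note below are hypothetical.

```python
import base64


def try_base64_decode(value):
    """Return the decoded text if `value` is valid base64, else None."""
    try:
        return base64.b64decode(value, validate=True).decode("utf-8")
    except ValueError:
        # Raised for invalid base64 input and for bytes that are not UTF-8.
        return None
```

For example, encoding `hunter2` with `base64.b64encode` and passing the result to `try_base64_decode` returns `hunter2` again, so a scanner can run the same check on every suspicious string it finds.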
The last thing that we must do is to revoke any exposed credentials in order to minimize the damage done.
If you want more information about secure coding, you can check our Criteria on the subject.