3.3.4 Cleaning Git Commit History
Leaking secrets in code is a very common scenario. The problem stems from the way Git works: every commit and its contents remain available for viewing. Hence, simply masking your sensitive information in commit X+1 won't hide it in commit X!
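To see this concretely, here is a minimal sketch using a throwaway repository. The file name and the secret value are made up for the demonstration; only the two-commit pattern (leak, then mask) mirrors the situation described above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository; file name and secret value are assumptions for this demo.
repo="$(mktemp -d)"
git init -q "$repo"
g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }

# Commit X: the secret is committed by mistake.
echo "db.password=testpassword123" > "$repo/application.properties"
g add application.properties
g commit -q -m "add config"

# Commit X+1: the secret is masked in the latest version of the file.
echo "db.password=\${DB_PASSWORD}" > "$repo/application.properties"
g commit -q -am "mask password"

# The current version looks clean ...
latest="$(g show HEAD:application.properties)"
# ... but the previous commit still carries the original value.
previous="$(g show HEAD~1:application.properties)"

echo "HEAD:   $latest"
echo "HEAD~1: $previous"
```

Anyone with read access to the repository can run the same `git show` against the older commit, which is why masking alone is not enough.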
The best way to deal with this situation is to:
- Clean your Git history
- Rotate your credentials and store them in a secure manner (which we'll see in more detail in Chapter 5: Securing the Cluster)
Let's walk through the process of cleaning the Git history.
View the Git History
Before we begin, let's first view the sensitive information in the Git history on github.com.
- Click on Commits on the main repository page
- Switch to the gitleaks branch as shown below
- View the commit history and the sensitive information
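The same inspection can be done from the command line. The sketch below assumes a throwaway repository with a made-up `providers.tf`; it uses Git's pickaxe search (`git log -S`) to list every commit in which the number of occurrences of a given string changed, which covers both the commit that introduced the secret and the one that removed it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository; the file contents are assumptions for this demo.
repo="$(mktemp -d)"
git init -q "$repo"
g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }

# First commit leaks the key, second commit removes it.
echo 'access_key = "AKIAERKSDFASDFKASDMD"' > "$repo/providers.tf"
g add providers.tf
g commit -q -m "add provider"

echo 'access_key = var.access_key' > "$repo/providers.tf"
g commit -q -am "removing leaks"

# -S<string> selects commits that added or removed an occurrence of the string;
# --all searches every ref, not just the current branch.
hits="$(g log --all --format=%s -S"AKIAERKSDFASDFKASDMD")"
echo "$hits"
```

In a real clone, replacing `--format=%s` with `-p` prints the full diff of each matching commit.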
Prepare for Git Cleaning
- The commands below revert the changes that we made to the providers.tf and application.properties files and push the result to the gitleaks branch. This is what a developer would ideally do when they want to remove a secret from code.
cd ~/playground/
cp -r ~/s4cpcode/chapter3/3E/. ~/playground/
git status
git add .
git commit -m "removing leaks"
git push --set-upstream origin gitleaks
- Next, we need to create a new directory called leak.
cd ~
mkdir leak
cd leak
Git Clone the Repo
- Next, we need to make a mirror clone of the Git repository, selecting the specific branch (gitleaks) in which our secrets were leaked.
- Copy the mirrored repository to create a backup.
Please check your username and substitute it for <username> in the command below.
cd ~/leak
git clone --mirror -b gitleaks git@github.com:<username>/playground.git
cp -r playground.git playground.git.bak
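To get a feel for what a mirror clone contains, the following sketch clones a local throwaway repository instead of GitHub (the `source` repository and the `extra` branch are assumptions made for the demo). It checks that the result is a bare repository, that it is configured as a mirror, and that branches other than the one passed to `-b` are copied as well.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Local stand-in for the remote repository; names are assumptions for this demo.
work="$(mktemp -d)"
git init -q "$work/source"
g() { git -C "$work/source" -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }

g symbolic-ref HEAD refs/heads/gitleaks   # make gitleaks the initial branch
echo "hello" > "$work/source/README.md"
g add README.md
g commit -q -m "initial commit"
g branch extra                            # a second branch to prove all refs are mirrored

# Same clone and backup steps as above, against the local stand-in.
git clone -q --mirror -b gitleaks "$work/source" "$work/playground.git"
cp -r "$work/playground.git" "$work/playground.git.bak"

is_bare="$(git -C "$work/playground.git" rev-parse --is-bare-repository)"
is_mirror="$(git -C "$work/playground.git" config --get remote.origin.mirror)"
branches="$(git -C "$work/playground.git" for-each-ref --format='%(refname:short)' refs/heads | sort | tr '\n' ' ')"

echo "bare=$is_bare mirror=$is_mirror branches=$branches"
```

Because the clone is a mirror, a later push from it updates every ref on the remote, so the backup copy is worth keeping until the cleaned history has been verified.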
Create File Containing Sensitive Text
- Next, we need to create a file passwords.txt which contains the exact strings that were leaked. The commands below do just that and print the file to the terminal.
cd ~/leak
echo "AKIAERKSDFASDFKASDMD" >> passwords.txt
echo "CuNQE0DQBU1IrTX0K7HBuBTwBLyq0rp0Tm6J2dne" >> passwords.txt
echo "testpassword123" >> passwords.txt
echo "testpassword" >> passwords.txt
cat passwords.txt
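Note that `>>` appends, so running the commands above twice leaves duplicate lines in the file. A possible alternative, sketched below, writes the whole file in one go so that it can be re-run safely. It runs in a temporary directory for the demo, and the `write_patterns` helper is a hypothetical name; the original order of the strings (the longer `testpassword123` before `testpassword`) is kept as-is.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Temporary directory for the demo; in the walkthrough this would be ~/leak.
workdir="$(mktemp -d)"
cd "$workdir"

# Hypothetical helper: overwrite (not append) so re-running never duplicates entries.
write_patterns() {
  cat > passwords.txt <<'EOF'
AKIAERKSDFASDFKASDMD
CuNQE0DQBU1IrTX0K7HBuBTwBLyq0rp0Tm6J2dne
testpassword123
testpassword
EOF
}

write_patterns
write_patterns   # second run leaves the file unchanged

line_count="$(wc -l < passwords.txt | tr -d ' ')"
# Order-preserving de-duplication, used here only to count distinct lines.
unique_count="$(awk '!seen[$0]++' passwords.txt | wc -l | tr -d ' ')"

cat passwords.txt
echo "lines=$line_count unique=$unique_count"
```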
Clean Git History using BFG
- Next, we'll use the BFG Repo-Cleaner tool, which goes through the entire Git history, searches for the strings listed in the passwords.txt file created in the earlier step, and replaces each match with ***REMOVED***.
java -jar ~/tools/bfg.jar --replace-text passwords.txt playground.git
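By default, BFG leaves the contents of the latest commit untouched, which is why the "removing leaks" commit was pushed first. The sketch below shows one way to check, before running BFG, that the tip of a branch no longer contains any of the listed strings. It uses a throwaway repository with made-up file contents, and `ref_has_secret` is a hypothetical helper built on `git grep`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository and pattern file; contents are assumptions for this demo.
work="$(mktemp -d)"
repo="$work/playground"
git init -q "$repo"
g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }

printf '%s\n' "testpassword123" > "$work/passwords.txt"

# Older commit leaks the password, latest commit removes it.
echo "spring.datasource.password=testpassword123" > "$repo/application.properties"
g add application.properties
g commit -q -m "leak"
echo "spring.datasource.password=\${DB_PASSWORD}" > "$repo/application.properties"
g commit -q -am "removing leaks"

# Hypothetical helper: exit 0 when the given commit's tree contains any
# literal string (-F) from the pattern file (-f).
ref_has_secret() {
  g grep -q -F -f "$work/passwords.txt" "$1"
}

if ref_has_secret HEAD;   then tip_status="dirty"; else tip_status="clean"; fi
if ref_has_secret HEAD~1; then old_status="dirty"; else old_status="clean"; fi

echo "tip: $tip_status, previous commit: $old_status"
```

If the tip were reported as dirty, the secret would have to be removed in a new commit before BFG could clean the rest of the history.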
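Git Reflog and Update the Changes
Next, we need to expire the reflog and run garbage collection so that the old commits, which BFG has rewritten and which are no longer referenced, are actually removed from the repository. Once that is done, we'll force push the changes.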
cd playground.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push -f
Explanation of git reflog
- git reflog expire --expire=now --all: This part of the command expires all reflog entries immediately. The reflog is a history of where references (branches and other commit pointers) have pointed, which is useful for tracking changes and recovering lost commits. By expiring the entries "now", you clear the entire reflog, so the old commits can no longer be recovered through it.
- git gc --prune=now --aggressive: This part starts Git's garbage collection, which cleans up and optimizes the repository's object database. The options used here are:
  - --prune=now: prunes unreferenced objects immediately, regardless of their age.
  - --aggressive: performs a more thorough and resource-intensive garbage collection. It's useful when you want to maximize space reclamation.
In summary, the entire command clears the reflog and runs an aggressive garbage collection, so that the old objects still containing the secrets are removed from the repository and their space is reclaimed.
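The effect of the two commands can be observed on a small scale. The sketch below does not use BFG; it rewrites a commit with `git commit --amend` in a throwaway repository (file name and values are made up) and shows that the old commit object stays in the object database until the reflog is expired and garbage collection has run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository; file name and values are assumptions for this demo.
repo="$(mktemp -d)"
git init -q "$repo"
g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }

echo "password=testpassword123" > "$repo/app.conf"
g add app.conf
g commit -q -m "leak"
old_commit="$(g rev-parse HEAD)"

# Rewrite the commit so that no branch points at the leaking version any more.
echo "password=changeme" > "$repo/app.conf"
g commit -q --amend -am "no leak"

# The old commit is unreachable from any branch, but the reflog still keeps it alive.
if g cat-file -e "$old_commit" 2>/dev/null; then before="present"; else before="gone"; fi

# Drop all reflog entries, then prune every unreferenced object immediately.
g reflog expire --expire=now --all
g gc -q --prune=now --aggressive

if g cat-file -e "$old_commit" 2>/dev/null; then after="present"; else after="gone"; fi

echo "before cleanup: $before, after cleanup: $after"
```

This only cleans the local object database; the remote is updated by the force push that follows.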
View the Commits
- Again, switch to the gitleaks branch and open the commit with the message git leaks won't catch, or any other commit where sensitive information was committed.
- View the commit
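As a complement to checking in the browser, the cleaned repository can be scanned locally. The following sketch is one possible check, not part of the original walkthrough: the hypothetical `count_leaky_commits` helper walks every commit reachable from any ref and counts those whose files still contain a string from the pattern file. The repository and its contents are made up for the demo.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository standing in for the cleaned history; contents are assumptions.
work="$(mktemp -d)"
repo="$work/repo"
git init -q "$repo"
g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }

printf '%s\n' "AKIAERKSDFASDFKASDMD" "testpassword123" > "$work/passwords.txt"

# A commit as it might look after cleaning, followed by the "removing leaks" commit.
echo 'key = "***REMOVED***"' > "$repo/providers.tf"
g add providers.tf
g commit -q -m "cleaned commit"
echo 'key = var.access_key' > "$repo/providers.tf"
g commit -q -am "removing leaks"

# Hypothetical helper: count commits, across all refs, whose tree still
# contains any literal string from the pattern file.
count_leaky_commits() {
  local n=0 rev
  for rev in $(g rev-list --all); do
    if g grep -q -F -f "$work/passwords.txt" "$rev"; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

leaky="$(count_leaky_commits)"
echo "commits still containing a listed string: $leaky"
```

A result of 0 means none of the listed strings appear in any commit that is still reachable; walking every commit this way is slow on large repositories, so it is best suited to small ones like the playground.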
Merge the PR
Lastly, we need to close the PR by merging the gitleaks branch into the main branch.
- Merging the branches
- Close with comment [skip ci], as we do not want GHA to run this time.
It's important to add the string [skip ci] to the message, as we do not wish to run GitHub Actions upon the merge. The same can be seen in the screenshot below.
- First enter [skip ci] in the comment.
- Click on the Close with comment button.
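For reference, the same idea applies when merging from the command line instead of the GitHub UI: GitHub Actions skips push and pull request workflows when the commit message contains [skip ci]. The sketch below is an assumed CLI alternative, not the flow shown in the screenshots; it merges a gitleaks branch into main in a throwaway repository and confirms that the merge commit carries the marker.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository; branch contents are assumptions for this demo.
repo="$(mktemp -d)"
git init -q "$repo"
g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }

g symbolic-ref HEAD refs/heads/main       # make main the initial branch
echo "v1" > "$repo/README.md"
g add README.md
g commit -q -m "initial commit"

g checkout -q -b gitleaks
echo "v2" > "$repo/README.md"
g commit -q -am "removing leaks"

# Merge with an explicit merge commit whose message contains the skip marker.
g checkout -q main
g merge -q --no-ff -m "Merge branch 'gitleaks' [skip ci]" gitleaks

subject="$(g log -1 --format=%s)"
echo "$subject"
```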
That completes Chapter 3. Before you move to Chapter 4:
- Ensure that you've set up and integrated Semgrep in GitHub Actions for implementing Static Application Security Testing (SAST)
- Ensure that you've set up and integrated Dependency Checker in GitHub Actions for implementing Software Composition Analysis (SCA)
- Ensure that you've set up and integrated Gitleaks in GitHub Actions for implementing Secrets Detection
- Ensure that you've learnt how to practically remove secrets from Git history