How to use Ghidra for malware analysis, reverse-engineering
The Ghidra malware analysis tool helps infosec beginners learn reverse-engineering quickly. Get help setting up a test environment and searching for malware indicators.
Security researchers use reverse-engineering tools to examine how potentially malicious files and executables work. One such tool is the National Security Agency's Ghidra malware analysis framework, which has been publicly available since 2019.
In Ghidra Software Reverse Engineering for Beginners, author and senior malware analyst A.P. David introduces readers to the open source Ghidra and how to use it. While he focuses on reverse-engineering, penetration testing and malware analysis for beginners, experienced users will also find the book useful.
"I noticed from reviews and general feedback that advanced reverse-engineers found this book useful, especially when it comes to how to compile Ghidra, use PCode for scripting and more," David said.
Start learning how to reverse-engineer malware using Ghidra in this excerpt from Chapter 5 of David's book. Here, he explains how to set up an initial testing environment and search binary files for malware indicators.
Download a PDF of Chapter 5 to dive into dissecting malware sample components to determine their function.
In this chapter, we will introduce reverse engineering malware using Ghidra. By using Ghidra, you will be able to analyze executable binary files containing malicious code.
This chapter is a great opportunity to put into practice the knowledge about Ghidra's features and capabilities acquired in Chapter 1, Getting Started with Ghidra, and Chapter 2, Automating RE Tasks with Ghidra Scripts. To do so, we will analyze the Alina Point of Sale (PoS) malware, which scrapes the RAM of PoS systems to steal credit card and debit card information.
Our approach will start by setting up a safe analysis environment, then we will look for malware indicators in the malware sample, and, finally, we will conclude by performing in-depth malware analysis using Ghidra.
In this chapter, we're going to cover the following main topics:
- Setting up the environment
- Looking for malware indicators
- Dissecting interesting malware sample parts
Technical requirements
The requirements for this chapter are as follows:
- VirtualBox, an x86 and AMD64/Intel64 virtualization software: https://www.virtualbox.org/wiki/Downloads
- VirusTotal, an online malware analysis tool that aggregates many antivirus engines and online engines for scanning: https://www.virustotal.com/
The GitHub repository containing all the necessary code for this chapter can be found at https://github.com/PacktPublishing/Ghidra-Software-Reverse-Engineering-for-Beginners/tree/master/Chapter05.
Check out the following link to see the Code in Action video: https://bit.ly/3ou4OgP
Setting up the environment
At the time of writing this book, the public version of Ghidra has no debugging support for binaries. This limits the scope of Ghidra to static analysis, meaning files are analyzed without being executed.
But, of course, Ghidra static analysis can complement the dynamic analysis performed by any existing debugger of your choice (such as x64dbg, WinDbg, and OllyDbg). Both types of analysis can be performed in parallel.
Setting up an environment for malware analysis is a broad topic, so we will cover the basics of using Ghidra for this purpose. Keep in mind that the golden rule when setting up a malware analysis environment is to isolate it from your computer and network. Even if you are performing static analysis, it is recommended to set up an isolated environment because you have no guarantee that the malware won't exploit some Ghidra vulnerability and get executed anyway.
The CVE-2019-17664 and CVE-2019-17665 Ghidra vulnerabilities
I found two vulnerabilities in Ghidra that could lead to the unexpected execution of malware when it is named cmd.exe or jansi.dll. At the time of writing this book, CVE-2019-17664 is not fixed yet: https://github.com/NationalSecurityAgency/ghidra/issues/107.
In order to analyze malware, you can use a physical computer (restorable to a clean state via hard disk drive backups) or a virtual one. The first option is more realistic but slower when restoring the backup and more expensive.
You also have to isolate your network. A good example to illustrate the risk is ransomware encrypting the shared folders during analysis.
Let's use a VirtualBox virtualized environment with shared folders, set to read-only for safety, to transfer files from the host machine to the guest. The guest has no internet connection, as it is not necessary for static analysis.
Then, we follow these steps:
- Install VirtualBox by downloading it from the following link: https://www.virtualbox.org/wiki/Downloads
- Create a new VirtualBox virtual machine or download a prebuilt one from Microsoft: https://aka.ms/windev_VM_virtualbox
- Set up a VirtualBox read-only shared folder, allowing you to transfer files from the host machine to the guest: https://www.virtualbox.org/manual/ch04.html#sharedfolders.
- Transfer Ghidra and its required dependencies to the guest machine, install it, and also transfer the malware you are interested in analyzing.
Additionally, you can transfer your own arsenal of Ghidra scripts and extensions.
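The isolation settings described above can also be applied from the host's command line instead of the VirtualBox GUI. The following is a minimal sketch, assuming the VBoxManage tool that ships with VirtualBox is on the host's PATH. The VM name, share name and host directory are hypothetical placeholders, and the helper function names are ours. The script only prints the commands unless apply_isolation is called explicitly.

```python
import subprocess


def isolation_commands(vm_name, share_name, host_dir):
    """Build VBoxManage argument lists for a read-only share and no network.

    vm_name, share_name and host_dir are placeholders chosen by the caller.
    """
    return [
        # Expose host_dir to the guest as a read-only shared folder.
        ["VBoxManage", "sharedfolder", "add", vm_name,
         "--name", share_name, "--hostpath", host_dir, "--readonly"],
        # Detach the first virtual network adapter (no internet in the guest).
        ["VBoxManage", "modifyvm", vm_name, "--nic1", "none"],
    ]


def apply_isolation(vm_name, share_name, host_dir):
    """Run the commands against a powered-off VM; raises if VBoxManage fails."""
    for argv in isolation_commands(vm_name, share_name, host_dir):
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical names; printing rather than executing keeps this safe to try.
    for argv in isolation_commands("malware-lab", "transfer", "/srv/transfer"):
        print(" ".join(argv))
```

Only the first adapter is detached here. If the virtual machine has more adapters enabled, each one would need the same treatment.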
Looking for malware indicators
As you probably remember from previous chapters, Ghidra works with projects containing zero or more files. Alina malware consists of two components: a Windows driver (rt.sys) and a Portable Executable (park.exe). Therefore, a compressed Ghidra project (alina_ghidra_project.zip) containing both components can be found in the relevant GitHub project created for this book.
If you want to get the Alina malware sample as is instead of a Ghidra project, you can also find it in the GitHub project (alina_malware_sample.zip), compressed and protected with the password infected. It is quite common to share malware in this way so that nobody runs it by accident and gets infected.
Next, we will try to quickly guess what kind of malware we are dealing with in general terms. To do that, we will look for strings, which can be revealing in many cases. We will also check external sources, which can be useful if the malware has been analyzed or classified. Finally, we will analyze its capabilities by looking for Dynamic Linking Library (DLL) functions.
Looking for strings
Let's start by opening the Ghidra project and double-clicking on the park.exe file from the Ghidra project in order to analyze it using CodeBrowser. Obviously, do not click on park.exe outside of the Ghidra project as it is malware and your system can get infected. A good starting point is to list the strings of the file. We'll go to Search | For Strings... and start to analyze it:
As shown in the preceding screenshot, the user Benson seems to have compiled this malware. This information could be useful to investigate the attribution of this malware. There are a lot of suspicious strings here.
For instance, it is hard to imagine the reason behind a legitimate program making reference to windefender.exe. Also, SHELLCODE_MUTEX and System Service Dispatch Table (SSDT) hooking references are both explicitly malicious.
System Service Dispatch Table
SSDT is an array of addresses to kernel routines for 32-bit Windows operating systems or an array of relative offsets to the same routines for 64-bit Windows operating systems.
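To make the difference between the two layouts concrete, the sketch below resolves a table entry both ways. It assumes the commonly described encoding for 64-bit Windows, where each entry is a signed 32-bit value that holds the offset from the table base in its upper 28 bits and argument information in its low 4 bits. The table contents and base address are made up for illustration, and the function names are ours.

```python
import struct


def resolve_ssdt_x86(table, index):
    """32-bit Windows: each 4-byte entry is the routine's absolute address."""
    (address,) = struct.unpack_from("<I", table, index * 4)
    return address


def resolve_ssdt_x64(table, index, table_base):
    """64-bit Windows: each 4-byte entry encodes an offset from the table base.

    Assumed layout: signed 32-bit value, offset in the upper 28 bits
    (recovered with an arithmetic shift), argument info in the low 4 bits.
    """
    (entry,) = struct.unpack_from("<i", table, index * 4)
    return table_base + (entry >> 4)


if __name__ == "__main__":
    # Made-up values, not taken from a real kernel.
    base = 0xFFFFF80000100000
    x64_table = struct.pack("<ii", (0x2000 << 4) | 2, (-0x300 << 4) | 0)
    print(hex(resolve_ssdt_x64(x64_table, 0, base)))
    print(hex(resolve_ssdt_x64(x64_table, 1, base)))
```

A hook that rewrites an entry therefore has to store a full address on 32-bit systems but an encoded offset on 64-bit systems.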
A quick overview of the strings of the program can sometimes reveal whether it is malware or not without further analysis. Simple and powerful.
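The same kind of triage can be approximated outside Ghidra. The following is a rough sketch, not a reimplementation of Ghidra's string search: it pulls printable ASCII and UTF-16LE runs out of raw bytes and flags those that contain a term from a watch list. The watch list, the minimum length of 5 and the function names are our own assumptions, and the sample bytes are synthetic.

```python
import re

# Hypothetical watch list based on the strings discussed above.
SUSPICIOUS = ("windefender.exe", "shellcode_mutex", "ssdt")


def extract_strings(data, min_len=5):
    """Return printable ASCII and UTF-16LE strings of at least min_len chars."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found


def flag_suspicious(strings, watch_list=SUSPICIOUS):
    """Keep strings containing any watch-list term (case-insensitive)."""
    return [s for s in strings if any(w in s.lower() for w in watch_list)]


if __name__ == "__main__":
    # Synthetic bytes standing in for a binary; never open a live sample
    # outside the isolated analysis machine.
    sample = (b"\x01\x02SHELLCODE_MUTEX\x00\x01"
              + "windefender.exe".encode("utf-16-le")
              + b"\x00\x00Hello world\x00")
    for s in flag_suspicious(extract_strings(sample)):
        print(s)
```

Scanning for UTF-16LE as well as ASCII matters on Windows binaries, where many strings are stored as wide characters and would be missed by an ASCII-only pass.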
Intelligence information and external sources
It is also useful to investigate the information found using external sources such as intelligence tools. For instance, as shown in the following screenshot, we identified two domains when looking for strings, which can be investigated using VirusTotal:
To analyze a URL in VirusTotal, go to the following link, write the domain, and click on the magnifying glass icon to proceed: https://www.virustotal.com/gui/home/url:
Search results are dynamic and might change from time to time. In this case, both domains produce positive results in VirusTotal. The results can be viewed at https://www.virustotal.com/gui/url/422f1425108ae35666d2f86f46f9cf565141cf6601c6924534cb7d9a536645bc/detection:
Apart from that, VirusTotal can provide more useful information that you can find by browsing through the page. For instance, it detected that the javaoracle2.ru domain was also referenced by other suspicious files:
When analyzing malware, it is recommended to review public resources before starting the analysis because they can give you a lot of useful information to start from.
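Lookups like the one above can also be scripted. As a hedged sketch, the code below derives the identifier that the VirusTotal API v3 accepts for a URL, namely the URL-safe base64 encoding of the URL with the padding removed, and builds the corresponding lookup address. It performs no network request. The http:// scheme and trailing slash in the example, and the function names, are our own assumptions.

```python
import base64


def vt_url_id(url):
    """Return the unpadded URL-safe base64 identifier for a URL."""
    return base64.urlsafe_b64encode(url.encode()).decode().strip("=")


def vt_api_url(url):
    """Build the API v3 lookup address (querying it requires an API key)."""
    return "https://www.virustotal.com/api/v3/urls/" + vt_url_id(url)


if __name__ == "__main__":
    # Domain taken from the strings found in the sample; scheme is assumed.
    print(vt_api_url("http://javaoracle2.ru/"))
```

Because the analysis machine has no internet connection, a script like this would be run from a separate, connected machine using only the extracted indicator, never the sample itself.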
How to look for malware indicators
When looking for malware indicators, don't just try to look for strings used for malicious purposes, but also look for anomalies. Malware is usually easily recognized for multiple reasons: some strings will never be found in goodware files and the code could be artificially complex.
It is also interesting to check the imports of the file in order to investigate its capabilities.
Checking import functions
As the binary references some malicious servers, it must implement some kind of network communication. In this case, this communication is performed via the HTTP protocol, as shown in the following import functions located in Ghidra's CodeBrowser Symbol Tree window:
Looking at ADVAPI32.DLL, we can identify functions named Reg* that allow us to work with the Windows Registry, while others that mention the word Service or SCManager allow us to interact with the Windows Service Control Manager, which enables us to load drivers:
There are a lot of imports from KERNEL32.DLL. Among many other things, they allow us to interact with and perform actions related to named pipes, files, and processes:
We have identified a lot of things with a very quick analysis. If you are experienced, you will know malware code patterns, leading to mentally matching API functions with strings and easily inferring what the malware will try to do when given the previously shown information.
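The mental matching described above can be sketched as a small lookup that groups imported function names by the capability they suggest. The rules and the import list below are illustrative assumptions, not the actual import table of park.exe. In particular, WININET.DLL is only a guess at where HTTP functions might come from.

```python
import re

# Hypothetical rules: (pattern on the API name, capability label).
RULES = [
    (re.compile(r"^Reg(Open|Create|Set|Query|Delete|Enum|Close)"), "registry"),
    (re.compile(r"SCManager|Service"), "service control"),
    (re.compile(r"^(Http|Internet)"), "network (HTTP)"),
    (re.compile(r"NamedPipe"), "named pipes"),
    (re.compile(r"Process"), "processes"),
    (re.compile(r"File"), "files"),
]


def capabilities(imports):
    """Map {dll: [function names]} to {capability: sorted 'dll!name' list}."""
    result = {}
    for dll, names in imports.items():
        for name in names:
            for pattern, label in RULES:
                if pattern.search(name):
                    result.setdefault(label, set()).add(f"{dll}!{name}")
                    break  # first matching rule wins
    return {label: sorted(funcs) for label, funcs in result.items()}


if __name__ == "__main__":
    # Illustrative import list, not copied from the sample.
    example = {
        "ADVAPI32.DLL": ["RegOpenKeyExA", "OpenSCManagerA", "CreateServiceA"],
        "KERNEL32.DLL": ["CreateNamedPipeA", "CreateFileA", "CreateProcessA"],
        "WININET.DLL": ["InternetOpenA", "HttpSendRequestA"],
    }
    for label, funcs in capabilities(example).items():
        print(label, "->", ", ".join(funcs))
```

Name-based rules like these are only a first pass. A function being imported does not prove it is called, so the grouping is a prompt for where to look in the disassembly rather than a verdict.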
About the author
A.P. David is a senior malware analyst and reverse-engineer. He has more than seven years of experience in IT, having previously worked on his own antivirus product. He started working for a company to reverse-engineer banking malware and help automate the process. Afterward, David joined the critical malware department of an antivirus company. He is currently working as a security researcher at the Galician Research and Development Center in Advanced Telecommunications (Gradiant) while doing a malware-related Ph.D. In his free time, he has hunted vulnerabilities for some relevant companies, including in Microsoft's Windows 10 and the National Security Agency's Ghidra project.
https://packt.link/sGCLd