Reversing Malware: October 2010

Sunday, October 31, 2010

Intro to Static Analysis Part 2

This week I plan to go a bit more in detail on each of the steps from last week. I was going to get into deeper items such as using disassemblers and such, but I will go there in the next two posts. I really wanted everyone to see just how much you can learn from just doing the simple things I discussed on the last post.

Here are the steps from last post step by step with screen shots. For the sample, I just went to Malware Domain List and grabbed a suspected malware sample from that page.

I will warn you now as always. You should only be downloading and executing the following sample in a lab environment. If you don't know how to do that, watch for Jamy's next post where he will start you down that path.

This weeks sample can be downloaded here.

I don't know what this sample does right now. I'm looking at it for the first time just as you are :)

First I'm going to boot my lab environment and pull the sample into it. Then I'm going to run a hash application to get the digital fingerprint of the binary.

I just installed the Malcode Analyst Pack. This will give us a handy right click menu to obtain the MD5 Hash and Strings. So I right click the file and choose MD5 Hash. Here is the result:

As you can see, the hash value of our sample is 121340AA444B4D4153510C0BE58D4D61. We will jot this down in our notes.

Next we will take our fuzzy hash with SSDEEP. In this example we will use the SSDEEP Front End. This is a nice little GUI to make things easy. When we first open the application, we need to choose Create or Append Hash Value. We will choose Single File as our Input. Click the Choose Input button and navigate to where you have your sample, as seen below:

Next, we click the Choose Output button. This will open an Explorer window, where you can choose where you would like your output to go. Here we are going to choose to put it on the Desktop so we can find it easy. I generally name the file: <filename>_exe (or dll if it's a dll). This is just my method to know what the name is and what that sample it goes with. Click the Open button. You then need to click the Execute button.

A dialog box will pop up telling you that you are about to run a batch file, and that a DOS box is about to pop up. Click the yes button to continue.

A DOS box will pop up for a second and go away. You will be left back at the main screen of the SSDEEP Front End. You can choose Exit at this time.

You now have the file on your desktop. Double Click to open it. Windows will ask you what application you would like to use to open the file. Choose to select a program from a list and click OK. I normally use the Notepad application for these, but you can use any text editor or viewer that you want. I generally also choose to always use the selected program to open this type of file. That way I don't need to choose every time.

As you can see from the screen shot below. Our fuzzy hash is

3072:HPZJsRgSmHWii6X4/QDDu3vDTw/hkfSUJjLTJra:vZJkjiie44DC/A/hkfSUJjPJr,"mdktask.exe"

The next thing we are going to do is take our Hash value and search Virus Total to see if anyone has submitted this sample before. So we navigate to http://www.virustotal.com. We will click the Search link and paste our Hash value in and hit search.

As you can see from the following screen shot someone has submitted this sample and 25 our of 41 anti virus applications say it's a virus.

Next, we want to classify the file. To do this, we are going to use TRiD. Ensure you download the latest definition files from here. Once you run TriD, you will need to point to your definition xml file. Choose the Browse button on the bottom right of the application. Navigate to your definition files and choose the first listed xml file in the directory. I generally put the xml files in a folder called Defs in the TRiD application folder.

You will want to choose Browse on the top to choose our sample we want to analyze. After choosing the file, we click the Analyze button and get the results. As you can see below we have a match of 86% of a Windows 32 Executable file, potentially written in Visual Basic 6.

Normally at this point, we would run the sample through a few other similar applications to see if anything new is found. To keep this post relatively short, I'm going to bypass that. I'm also not going to upload the sample to any sites to be scanned for known virus signatures. Mainly because I've seen by our search on Virus Total that we can be pretty sure it is known.

The next step we will do is to open the file with BinText. Choose the Browse button and choose our sample. After that we hit the go button. This will show us some of the strings available in the binary file. In some cases you may not see what you like here due to packing or encryption. We will go into that more later. In the case of this sample we are able to see a good bit of detail as seen below:

Below are some of the things I saw that should be added to our notes as items of interest:

Form1. This tells us that potentially there is a visual form
Timer1. This lets us know there is a timer or countdown of some sort.
Winsock.
WinsockAPI. These two tell us there is some sort of network component.
modSMTP. This would let us know there could be an email component which starts to corroborate what we have found from our search on Virus Total earlier.
mod_Variaveis. A quick Google of this word looks like it translates to variables from Portuguese. We now have an idea of where it may have come from. Maybe from Brazil or somewhere like that.
getpeername. This function will retrieve a name of a socket that was created. This starts to show there are more facts to prove network connectivity.

I could go through each, but I'm going to shorten this to only pull out the remainder of very interesting points that I see.

C:\Arquivos de programas\Messenger\msmsgs.exe\3 (Microsoft Messenger...interesting)
DownloadFile. Looks like we are getting more malware.

Crypt. Looks like some hashing or encryption going on.
GetWinPlatform. Looking for a specific Windows version maybe?
strEmailTO
strEmailTO1
strEmailTO2
strEmailTO3. Now we see some email capabilities which shows us why VirusTotal called this an email worm possibly
D:\Programas Daniel\infect\infect Interlig - rato\Project1.vbp. Humm is this guy called Daniel? gatta love when people don't clean up :)
GOD DAMNIT, the internet doesn't work... Little bit of error checking?
catia@oticasopcao.com. Source Email maybe?
showbol2010.log. Log for us to see how things went maybe?
http://www.youtube.com/. Maybe a link used in the email?
www.youtube.com. A link to more malware sent in the email possibly.
InternetBanking.exe. The next executable file name to download if you click the link or when the executable file executes. InternetBanking is interesting. Maybe we have something trying to steal banking credentials?

There are plenty of other links in there as well as you will see if you go through this sample yourself.

atualizahook.cfg. A config file on how to set things up?
Subject:
From: more email functions
EHLO
AUTH LOGIN
MAIL FROM:<
RCPT TO:<. Email server communications.
Norton AntiVirus. Possibly to look like it's been scanned in transit?
lhe enviou o link do video no youtube. sent the video link on youtube in Portuguese.
This program cannot be run in DOS mode. We know this shows up at the beginning of PE files, but this is in the middle. Maybe this is our worm sending a copy of itself. Going through the rest of that section takes use through what we just went through previously. We may assume this is the case until we find out more.

Everything we have seen, and believe me there is a lot more in there, shows us that our results at Virus Total were pretty close so far. We could finish by looking at the PE format with PEiD or attempt to look for packing or encryption, but I think you can see we really don't need to at this time.

I will end this post here. As you can see we didn't need to know any programming to find out what this thing does so far. For the APIs we may not have known, a quick search on Google or MSDN gave use the capability of those pieces of code. Not all code is this simple and to be honest I'm happy I choose one so simple randomly as it made our day a bit easier.

In the next post I will take you into the debugger and disassembler. We will start by showing IDA Pro free version and OllyDBG. These are 2 of the more popular tools in this field. The fourth part of our introduction to static analysis will show some examples with a new sample much like we showed more details on the first post's tools.

I hoped you enjoyed the post. Look for the next post in a week or so. In the mean time, if you have any questions or comments, feel free to leave them on the blog. We will respond to all comments and questions that are reasonable. We are here to share the knowledge, so don't hesitate if you desire to understand something in more detail!

If you see a possible mistake, please let us know as well. We are not perfect and to be quite frank, I'm not a programmer and I don't play one on the Internet. Therefore, I may make assumptions or come to conclusions that might be debatable by others who may know more than I.

Dynamic Analysis part 2

Welcome to the second installment of the dynamic analysis section of out blog. In the last post, I discussed why you should use a VM solution and made some recommendations on choosing one. In this post, I will go over some information on building out a the VM's themselves, including recommended operating systems and tools to install.

When considering what virtual machines to install, you must consider what tools you will be running and what operating system the malware is targeted at. For this reason, you will most likely end up with various Microsoft Windows installations along with a few Unix variants, most likely Linux. I will not cover installations themselves as there are plenty of instructions out there for installing operating systems. I will also cover a few classes of tools and make some recommendations for each category.

For your Windows VM's, I recommend having both a current Windows 7 install and an older Windows XP installation. The reason for 7 is that malware is starting to directly target Windows 7, as it is starting to gain critical mass in enterprise and general user environments. For most situations, Windows XP will be sufficient though. Windows will be used both as an analysis platform and the place where you run most of your malware samples.

In regards to Linux, you will primarily be using the VM's for analysis or to provide target services for your malware samples. At this point you might be asking, what is a service for malware? A service in this context is just like any other IT service, such as an IRC or FTP service. Many types of malware use legitimate protocols such as FTP or IRC as transports for communication. While performing dynamic analysis, you want to offer up the services the malware is looking for, as a way to go deeper into your analysis. Pretty much any Linux will work well for these duties, as a result I recommend choosing the distribution that you are most familiar with.

Next I will cover recommended tools. The main classes of tools for dynamic analysis are process monitors, file monitors, and network monitors.

Process Monitors are used to monitor running processes on the system where you are running the malware. These tools tell you such things as memory in use, files open, CPU use, drivers used, and DLL's in use. Two of my favorite process monitor tools are System Internals Process Monitor and Process Hacker. Both tools provide very similar information, however I personally think Process Hacker does a better job than Process Monitor of showing you sub-processes spawned by execution of a file.

File Monitors are tools used to detect changes in files on disk. The primary use of file monitors as it relates to malware analysis is to detect changes to operating system files such as the windows registry or configuration files. The tools I generally recommend for this this task are Tripwire/AIDE and Capture Bat/Regshot. Tripwire and AIDE are general File Integrity scanners, they work by taking an MD5 has of all the files on a system, when a file changes they detect the change by comparing the new MD5 to the original. Capture Bat and Regshot work by taking an initial snap shot of the contents of specific files on the disk and comparing them to a later snapshot. Capture Bat and Regshot are both manual tools that require the user to take the first and second snapshots, and then require you to tell the tools to compare the snapshots.

The last class of tools I will cover are network monitors. Network monitors are tools that either capture packets on the wire or monitor TCP/IP sockets on the lcoal system. My favorite tools in this are are Wireshark and TCP View. Wireshark is a packet capture tool that runs in promiscuous mode on a network interface and captures all packets going across the wire. TCP View on the other hand sits on a local system and lists the TCP/IP ports that are open and the processes that have them open.

This post wraps up our basic analysis station configuration. In the next couple of weeks, I will show how these tools can be used to begin to perform dynamic analysis on sample malware. Stay tuned!

Sunday, October 17, 2010

Intro to Dynamic Analysis Part 1

Curt has covered static analysis quite well and briefly mentioned dynamic analysis. At this point you are probably wondering what is dynamic analysis? Simply put, it is the act of running the code and observing what happens.

Infecting a system with malware from the wild can be very dangerous. A malware infection on your system can cause everything from destruction of personal data to bot infection, to performance degradation, and all the way to complete data loss. At this point you might be saying, “I already know it is dangerous, but I need to analyze malware.” Many of us in the information security world have that same need whether it is for job duties or personal research to learn about threats in the wild, my goal is to give you some insight into building a malware analysis lab environment to start your dynamic analysis.

The first technology needed to start your own malware analysis, and I feel the most important, is a virtual machine (VM) environment. A VM environment provides several advantages over a physical environment including, configurable resources, advanced disaster recovery, and isolation.

Virtual environments are based on the concept of resource sharing and reutilization. This means that once a virtual environment is installed onto a physical system, you will have the ability to configure as many VM’s as you want by slicing up the physical systems resources. In addition, the VM environment kernel, also known as the hypervisor, allows all the VM’s to share memory and processor time. In practical use, this allows a research to for example have multiple Windows installations on a single system with only 2 GB of RAM, where as in a physical environment the same 2 GB system would only allow one installation.

Virtual environments provide several advantages over physical environments when it comes to disaster recovery. The most important advantage for malware analysis is the ability to snap shot. Snapshotting is the ability to capture the system configuration a specific point in time and to gracefully rollback to a snap shot. The advantage here is that a malware researcher can run malware in a live environment to determine what the malware does, then once done roll the system back to a clean state. Before virtual environments researchers had to rebuild their lab machines using install media to go back to a clean environment.

The last advantage that VM environments provide us is isolation. The resource sharing and control of virtual environments also gives us the capability to easily isolate machines from one another. With this capability we can easily take a machine we want to run malware on an isolate it from other systems. In the case of a bot net, we could add a system to the virtual environment to simulate the command and control function, while only allowing the command and control and our original infected host to communicate.

There are several good VM tools available both commercially and free. Which one you should use is completely up to you. My only recommendation is to look at either VMWare’s tools or Sun’s Virtual Box, both tools support the use of .vmdk files which means that you can use many of the pre-built virtual appliances available on the net.

look for part two, in which I will go over setting up your malware analysis VM.

Intro to Static Analysis Part 1

This is Part 1 of our Intro to Static Analysis.

There are a number of schools of thought on how to approach reversing malware. Some people jump right into dynamic analysis in effort to quickly learn what the specimen is doing so they can put rules in place on their network to stop it's functionality or see who else might be infected. Some of these people, will then perform static analysis to see if they may have missed something. Others leave it at the results found from their dynamic analysis.

Other people do static analysis first to fully understand the expected behavior so they know if something is happening with the sample when running it in a dynamic lab other than what is expected. They will then run dynamic analysis on it to see if their findings are correct.

Some people just submit the sample to places like Virus Total, Anubis, or CWSanbox, just to name a few . They then take action based on the results they get back from these tools. Finally there are some that just submit it to their Anti Virus vendor and wait for a signature to be released.

There is nothing wrong with any of these methods. Most are done because of either lack of time, skills or understanding of how to reverse malware. Some may think, why reinvent the wheel? This is all OK.

Here on this blog, we enjoy getting our hands dirty with looking at malware. We are not satisfied by just letting someone else do the work. There is nothing like the feeling you get when you have reversed a sample and know what it does and can stop it's functionality based on what you yourself have found. It is a great feeling we hope everyone reading will experience.

So what do I do? Speaking for myself, I tend to follow the method to do some static analysis first. I will then put the malware into a lab and see what it might do differently than what I have found. The personal reason that I do this is because sometimes you will find evidence that a sample is capable of realizing that it is being run in a virtual environment. I have seen this in the wild. I have seen samples that are capable of this which then either don't run at all, or run with fake results.

One sample I found that did this actually went through motions of downloading a new sample, executing it which would only lead to removing the old sample, putting the new one in place then downloading another sample that did the same thing. I only figured this out after I started to realize the samples it was grabbing started to cycle through the same names. This wasted a good bit of time while I chased that ghost. Had I looked at it statically first, then I possibly could have saved myself the time.

I'm not going to tell anyone which method to follow. Choose the path that you are most comfortable with and stick with it. I can only warn you of the things I have found in my time doing this.

This first post of mine is to get you ready to do static analysis. Jamy will be posting an introduction to dynamic analysis which will get into lab setup. I will refer you to his post to start there. If your doing dynamic or static analysis, it is a very good idea to start with creating a lab of non production systems to run your samples in. This could be virtual or physical or both.

Overall process of static analysis:

*****WARNING*******

I want to mention that I will be linking to tools on this blog. I cannot vouch for the validity of the copy you get. Some of these tools may include malware themselves. Try to get a copy of the tools from the original source. You will probably also want to scan the files and watch what your lab systems do after installing them. We will not be held responsible if you download a tool we mention that is infected with a virus. Be cautious and thorough in your investigation of each of these tools!

The first step in your process should be to start a log or collection of the details you are about to find. Some people do this is a text file, others may just jot things down in a notebook. I personally like to use a mind mapping software such as FreeMind. Lenny Zeltser, a SANS instructor and an overall excellent resource for reversing knowledge, has freely released a template for FreeMind specifically for analyzing malicious code. You can download it here.

If you are looking for very good training in this area, Lenny also offers a few courses with SANS that can be taken online or at a conference. He has specifically created, along with others, the Forensics 610: Reverse Engineering Malware , and the Security 569: Combating Malware in the Enterprise courses. I highly encourage you to take these courses if you are interested in malware analysis.

It doesn't matter which method you choose to write your notes down, but it is very important that you do. Documenting this will help you to keep on track and assist you in writing reports of your analysis if you do this professionally. Additionally if you choose a method that allows you to compile all of your findings centrally, you will be able to see trends or recognize similar behaviors of samples that could help in reversing future samples that exhibit similar characteristics.

This blog does not go into how to identify malicious code on your systems. We may do a post on that at a later date. In the mean time, there are plenty of good resources on the Internet to help you in this area.

With that aside, take note of the system you found the malware on. Take notice of the operating system, patch level, applications installed etc. Write down where you found the code (i.e. C:\windows\system32). Add any information that may be relevant on how the code was found (i.e. the system administrator noticed the system was running slowly, or found the system blue screened).

Next take a hash of the file or files found. A hash, in this case, is a mathematical computation on the bits of a specimen. This will help with identification of other copies of the malware even if the name is changed. You can think of this much like doctors and scientists look for specific characteristics of a virus in the human body. That way, other doctors or scientists can identify the same virus in another person.

It is generally accepted to perform a MD5 hash on the file. Some people will also do a SHA1 or other computations as well. There is also a newer method called fuzzy hashing or piecewise hashing that can be done. This actually hashes portions of a file, rather than the whole thing. This method allows for identification of portions of the code which may be useful in catching malware that has changed just a little in order to avoid detection by a person or Anti Virus application for example.

There are many tools out there to do this. WinMD5 is an easy to use tool which is freely available. SSDEEP can be used for fuzzy hashing. What tool you use is up to you. The mathematical computations such as MD5 or SHA are standard equations. All of the applications that perform these actions then should produce the same result. Some operating systems such as *nix varieties have these tools built right in on the command line such as md5 or shasum on the Macintosh I'm working from right now.

The next step that is good to take is to compare the hash values that you found with sites on the Internet to see if there are any matches or similar hashes found by others. This can go a long way in saving you time and effort if it is a known specimen. You can copy and paste your hash value into sites such as Virus Total, or Threat Expert to name a few. You can also run this through an internal database you might have to see if there are any hits. The purpose of this is to save yourself time and energy if this sample has already been analyzed and identified by someone else.

Next you will want to classify your specimen. This classification will be the file type, format, target architecture, language used, and compiler used. These characteristics will let you know what systems may be vulnerable and which may not. For example, if you find a PE file format, you can assume that *nix systems will most likely not be vulnerable to the sample. This is because the PE format is used on Microsoft Windows systems.

TrIDNet with the latest definition files is a good place to start this classification. Minidumper is another free tool which works well. One thing you will notice is that I generally run files though similar tools and compare the output. Sometimes one tool works better than other and will give you more information which may be helpful. I have also seen malware samples that are coded in such a way to recognize tools and change their behavior if they realize they are being analyzed.

GT2 is another good tool to run the sample through in this stage as well. This application attempts to identify the compiler, DLLs and other requirements for the code being analyzed.

You will want to scan the file next in attempt to see if it hits any known signatures. You can do this with multiple Anti Virus applications if you have multiple on your lab systems. You can also take this time to submit it to online sources which run the sample through a number of anti virus applications in effort to see if it's known. Depending on your organization, you may not want to submit the samples to these sources as they share their information freely with the public and anti virus vendors. If this sample is targeted in any way, this can blow your cover that you have found and are analyzing the sample.

If that is not a concern for you, some good examples of these services are Virus Total, Virscan, or Jotti. Again, this is just to name a few, there are many other good choices out there. Pick the ones you like and are comfortable with.

Another step to take is then to start to dig a little deeper. You will want to start looking for executable type, dll’s called, exports, imports, strings etc. This information will start to reveal characteristics of the sample which may lead you to get a better understanding of what it does. For example, if you run strings on the file you may reveal IP addresses used for call back and download components, you may find information that lets you know the file is packed or encrypted in some way, you may find other files that are needed or used in the infection. etc.

Some of the tools that you can use to see this information are BinText, dumpbinGUI, or DependencyWalker to name a few. Again, there are many more out there. These are just some ideas of tools that I like to use. For example if you are running on a *nix server, the command strings will work on most distributions right out of the box. Just open a terminal and run strings filename (where filename is the name of the file you are looking at) . You can also use the file command. Type file filename (again where filename is the name of the file you are looking at) to reveal the file type you are working with.

The next stage is to look for packing or encryption. This is often where the process becomes difficult. Some packers are easy to defeat. Others are not. Some encryption routines are easily seen and reversed. Others are custom and difficult. This is the time when a lot of analysts will enter into dynamic analysis. The reason for this is that they may not be able to unpack or decrypt the file by hand. In order for the file to run on a system, it needs to be unpacked or decrypted in memory to function.

Some tools to use to look for this type of information are PEiD, PE Detective, or Mandiant's Red Curtain. These tools are capable of detecting known packers, encryption or behaviors that are common to malicious files to make them difficult to analyze. You may have found this information out already from the output of previous tools.

That's all we have for Part 1 of the introduction to static analysis. Stay tuned for Part 2 comming soon!

Wednesday, October 13, 2010

What is this blog about?

Welcome to the internetopenurla.blogspot.com. The idea for this blog came out of a desire to show people, step by step, how to successfully reverse malware to fully understand it's capabilities and characteristics.

If you look around on the Internet, you will find little tips and tricks on what to look for to recognize malware. Some of the information will get a little into reversing malware as well. We have yet to find a source which will show how to reverse malware from start to finish with new samples in the wild.

We aim to do just that with this blog. Each month we will take a sample from the wild and reverse it in two ways. One of us will perform static analysis on the specimen and show step by step how we do that. Each step will contain the necessary information to deduce what the sample does one layer at a time, until we fully understand the capabilities for the code. This will include what tools were used, screen shots of the output of the tools and any custom settings used in the tools to get the necessary output.

The second way we will look at malware will be to put it through dynamic analysis. The results will include pcap captures of traffic that it generates, screen shots and discussions of what changes it made to the test system and any configuration changes that might have needed to be done to get the specimen to run properly in the lab.

At the end of the day, our aim is to help people understand the steps that it takes to successfully understand malware so you can reverse it yourselves. Up until now, it has seemed like this black art that only certain ninjas knew the secret of. We aim to clear that up so that others can start on the exciting journey of reversing malware.

What previous knowledge should you have?

Obviously it helps if you are familiar with programming in some language. C and assembly are very helpful to know, but you need not be able to write fully functional programs in each. You will find this a more necessary skill when doing static analysis.

Knowledge of how to use virtual environments such as VMWare or Virtual box, to name a few, will be helpful. We will show you how to build a lab utilizing these and other tools to do your analysis.

Other than that, we hope to teach you the things you need to know to become successful. Each post will explain each step carefully for the novice reverse engineer. This will include how to configure tools, how to change settings on tools to get expected results, and detailed explanation of the results so you understand what your looking at.

Who are we?

Curt Shaffer is an Enterprise Architect at Synaptek Corporation where he serves as a US Government contractor. His daily routine involves defining security architecture, incident response, IPS/IDS/DLP signature analysis, and malware reversing. Curt has over 12 years of professional experience in the industry and has consulted on wireless, network and systems infrastructure to a variety of markets including ISP's, federal and local government agencies and SMB's.

Jamy Klein is a Security Engineer at Qualcomm Inc. where he performs duties such as as security system design and operation including technologies such as encryption, DLP, web proxies, and processes such as malware investigation, and incident response. Jamy has 10 years of professional security experience working in various industries, including financial, government, insurance, medical, and high technology.