Reversing Malware: 2010

Thursday, December 30, 2010

Sample Analysis 1: Dynamic Analysis

For this portion of the analysis, I began by loading and taking a snapshot of my Windows XP VM. The snapshot will allow me to revert to a clean state later.

Next I started up Regshot to take a baseline of the system. For this I used the "1st shot and save" option.

I then started Process Explorer, TCPView, and Capture BAT, and launched the malware specimen. The screen shots below show the results from each tool.

Process Explorer shows us that this piece of malware spawns several processes: mshta.exe, cmd.exe, at.exe, and so on. The malware quickly spawned and terminated several at.exe processes over and over.

TCPView shows us the mshta process going out on UDP port 1052.

Capture BAT is a command line tool that runs and displays changes made to the system as they happen. You must either watch it actively the whole time or send its output to a text file in order to catch the changes. In this case it caught several changes, including the malware launching a cmd prompt and deleting the same cmd prompt.

Next I used Regshot to take another shot of the VM. I then compared it to the first shot to reveal the changes made. This showed several interesting things, including the malware adding itself to Internet Explorer as a Browser Helper Object (BHO).
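To make the before and after comparison a little more concrete, here is a minimal Python sketch of the idea behind a two-shot diff. It is not how Regshot itself is implemented; it assumes each shot has already been reduced to a simple mapping of key path to value, and the sample keys below are hypothetical, invented purely for illustration.

```python
def diff_snapshots(before, after):
    """Compare two {key_path: value} snapshots and report what changed."""
    # Keys that exist only in the second shot were added.
    added = {k: after[k] for k in after.keys() - before.keys()}
    # Keys that exist only in the first shot were removed.
    removed = {k: before[k] for k in before.keys() - after.keys()}
    # Keys present in both shots but with different values were modified.
    modified = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return added, removed, modified


# Hypothetical snapshot data, not taken from the real sample.
first_shot = {
    r"HKLM\Software\Example\Version": "1.0",
    r"HKLM\Software\Example\Path": r"C:\Example",
}
second_shot = {
    r"HKLM\Software\Example\Version": "1.1",
    r"HKLM\Software\Example\NewValue": "1",
}

added, removed, modified = diff_snapshots(first_shot, second_shot)
```

In a real comparison the "added" group is usually the first place to look, since that is where new autostart entries such as a BHO registration would show up.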

While I continued to explore this specimen, the program brought up a very official looking window labeled "Microsoft Security Essentials Alert" that reported that Process Monitor was a trojan. A similar message appeared when I attempted to run Task Manager. The malware would not let either utility actually load. This may fool some users, but in this case I do not have Security Essentials installed on my virtual machine. The malware also terminated Process Monitor for me, how very helpful ;-) The screen shot below shows this very official looking window.

Next I decided to see what the AT scheduled tasks were all about. There are 21 different AT jobs, and each one has mshta.exe go out to crazyraccoonshow.com. Each job is set to run at a different time, but on the same daily schedule. So, next I decided to try one of my favorite utilities, FileInsight. FileInsight is a free graphical utility from McAfee Labs that includes a lightweight debugger, a script editor/viewer, and a graphical version of wget for Windows.

FileInsight was unable to get any results from the site. Either the web site is looking for specific mshta properties or it has already been taken down, as I was unable to pull any content.

Since I had already tried to pull the web code with FileInsight, I next loaded up Wireshark on my VM and on my virtual machine copy of REMnux. I like REMnux a lot because it has most analysis tools for Linux bundled into a pre-built environment.

Note: REMnux is a custom Linux distribution maintained by Lenny Zeltser, available at zeltser.com.

After starting Wireshark and starting two of the AT tasks, I began to see a lot of DNS lookups to various hosts such as update.celtro.dns1.us. Since this still did not tell me anything particularly useful, I set my Windows VM's gateway to the IP address of my REMnux machine and then turned on the fakedns script on REMnux. I then went back to Wireshark and noticed that, now that the malware was able to resolve DNS entries, it was attempting to communicate outbound with HTTPS to the same URLs, only the destination ports were cycling upward, starting at 1045 and incrementing by 1 each time it did not get a response. I also noticed that the malware was attempting to create another session from a random TCP port to port 80. This was interesting, but again didn't yield a whole lot of information, so next I set up a netcat listener using the command: nc -l -p 80

For those unfamiliar with Netcat, it is a generic listener. In this case the command I issued launched netcat and set it to listen (-l) on TCP port 80 (-p 80).
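If netcat is not available, roughly the same one-shot listener can be sketched in Python with the standard socket module. This is only an approximation of what nc -l -p 80 does, not a replacement for it; the function names open_listener and receive_once are my own, chosen for illustration.

```python
import socket


def open_listener(port, host="0.0.0.0"):
    """Bind a TCP socket to host:port and start listening, like nc -l -p <port>."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow the port to be reused quickly between analysis runs.
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server


def receive_once(server, max_bytes=4096):
    """Accept a single connection and return the peer address and first data received."""
    conn, peer = server.accept()
    with conn:
        data = conn.recv(max_bytes)
    return peer, data


# Example use (binding to port 80 usually requires root or administrator rights):
#   srv = open_listener(80)
#   peer, data = receive_once(srv)
#   print(peer, data)
#   srv.close()
```

Printing the raw bytes rather than decoding them is deliberate, since whatever the malware sends may not be text at all.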

I did this to see if this traffic was really encrypted or if it was just using the source port of 443 as a way to hide in normal traffic. I continued to wait to see if the malware process would eventually reach my listener. After noticing nothing in my netcat window, I decided to connect to my listener via telnet. My netcat listener was in fact receiving my telnet transmission. After several minutes I was still not seeing data in Wireshark. At this point I realized that the malware was doing a series of HTTPS requests followed by a series of HTTP requests.
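One quick way to judge whether captured traffic is really SSL/TLS, rather than something else riding on port 443, is to look at the first few bytes the client sends. The Python sketch below checks for the record header of an SSL 3.0 or TLS ClientHello. It is a heuristic only, and it assumes you have already pulled the first payload bytes of the connection out of your capture; it will not recognize the older SSLv2 style hello.

```python
def looks_like_tls_client_hello(payload: bytes) -> bool:
    """Return True if payload begins like an SSL 3.0 / TLS ClientHello record."""
    if len(payload) < 6:
        return False
    content_type = payload[0]    # 0x16 means a handshake record
    major_version = payload[1]   # 0x03 for SSL 3.0 and every TLS version
    handshake_type = payload[5]  # first byte after the 5-byte record header; 0x01 is ClientHello
    return (
        content_type == 0x16
        and major_version == 0x03
        and handshake_type == 0x01
    )
```

For example, a payload that starts with the ASCII text GET would fail this check, which would suggest plain HTTP being sent over an unusual port.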

Unfortunately, since I was unable to decode what this particular malware was doing in its network communications, I stopped my analysis. I had found out that this particular malware was a fake AV tool that probably holds your system for ransom. This was sufficient information for me. One other thing to note about its behavior: this malware did not seem to be very persistent, as I was able to terminate it without it automatically restarting.

At this point, our dynamic analysis is essentially over. If you want to keep your results to further analyze the logs, etc., you will need to copy the log data from your analysis tools to your host machine. I suggest copying the text and pasting it into a new document on your host machine, as transferring any files off of your infected VM is a risk. After you have obtained any needed data, you should revert your VM to your snapshot to ensure that you are back to a clean state.

Sunday, December 5, 2010

Sample Analysis 1 Static Results

The following are the static analysis details that I found for the Sample Analysis 1 binary that we posted previously. If you have done static analysis of this file as well, follow along and see if you found similar details. We attempt to get as detailed as possible, but we do have day jobs, so there may be things inside the malware that we do not discuss. The goal is to analyze the sample until we are happy that we understand the basic functionality of the malware.

Sample Analysis 1 Report:

MD5: 49d7498e4543027046795d076e47f1ac
Fuzzy hash: 12288:AHlawHGMpk7lZWnIoWbq47TxC1+HK12XsfQJZUM0SsoSmjCbcZRcHPM:AHlnH47leIA4Y1D2XkmZ5dOaCHP
Virus Total Results: 37 out of 41 engines flag this as a malicious file. Most of the descriptions call this fakeAV, surprise :)
BinText: When opening the file in BinText, it gave an error saying there was a problem reading the string resource file. The file may be compressed or in a non standard format. Looking through the strings that did show didn't reveal much, though I did see references to the Delphi programming language. We also see some information mentioning use of the registry.

When I dropped this file into PEiD, I saw the possible reason for BinText's complaint. It looks like our sample is packed with UPX.
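PEiD uses its own signature database, but a much cruder check gives a similar hint: UPX normally names the sections of a packed file UPX0 and UPX1. The Python sketch below walks the PE section table and looks for those names. It is a heuristic under that assumption, not PEiD's method, and an author can rename the sections to defeat it. The function names are mine.

```python
import struct


def pe_section_names(data: bytes):
    """Return the section names listed in the section table of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the 'PE\0\0' signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF header follows the signature: NumberOfSections at +6,
    # SizeOfOptionalHeader at +20 (both relative to the signature).
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    table = pe_offset + 24 + opt_size
    names = []
    for i in range(num_sections):
        # Each section header is 40 bytes; the first 8 hold the name.
        raw = data[table + 40 * i: table + 40 * i + 8]
        names.append(raw.rstrip(b"\0").decode("ascii", errors="replace"))
    return names


def looks_upx_packed(data: bytes) -> bool:
    """Heuristic: True if any section name starts with 'UPX'."""
    return any(name.startswith("UPX") for name in pe_section_names(data))
```

To try it on a sample, read the file in binary mode and pass the bytes to looks_upx_packed. A False result does not prove the file is unpacked, only that the default UPX section names are absent.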

I then tried to unpack this using the standard UPX package available from SourceForge. The command line used is upx.exe -d <filename>. This appeared to unpack successfully. Note that when you unpack a UPX packed file this way, UPX unpacks the copy of the file you ran it against, which means you no longer have the packed version. If you want to keep the packed file for some other reason, make a copy before you do this.

Next I put the file back into PEiD. Now we see Borland Delphi 6.0 - 7.0, so it looks like we have removed the packing.

Just for kicks, I want to dump this unpacked file back into BinText to see if there are any new strings to be seen. This time when I dropped the file in BinText, no errors! There are a ton of readable strings now! This sample seems like it has a ton of options. Looking through the strings, one major thing I notice is there are a lot of functions with GUI context such as OnMouseActivate, OnMouseDown, OnMouseUp, and PopUpMenu.
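If you want to pull strings without BinText, the basic idea is easy to sketch in Python: scan the file for runs of printable ASCII characters above some minimum length. This is only an approximation (BinText also handles Unicode and resource strings), and the minimum length of 5 is my own arbitrary choice.

```python
import re


def ascii_strings(data: bytes, min_length: int = 5):
    """Return every run of printable ASCII in data that is at least min_length long."""
    # 0x20 through 0x7e covers space through tilde, the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Running this against the packed and the unpacked copies of a sample and comparing the counts is a quick way to see how much the packer was hiding.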

I also see a lot of references to web browser. This application seems to be very GUI driven. There are still some obfuscated strings in the unpacked version so at this time, I'll take the file into Ollydbg.

This sample has some protection schemes even though we have unpacked it. I have been jumping around in Ollydbg in order to find some way to bypass them. I have attempted to use the HideOD plugin. I also noticed some SEH calls that would terminate the application. To deal with these, I told Ollydbg to ignore exceptions. You do this by going into the Options menu and choosing Debugging Options. Click the Exceptions tab and put a check in all of the options. At the bottom, you will see a section called "Ignore also following custom exceptions or ranges." Click the Add range button and enter 00000000 as the beginning and FFFFFFFF as the ending address. Your screen should look like the following:

For the HideOD plugin, we navigate to the Plugins menu, select HideOD, and then choose Options. Enable all of the options. Your screen should look like the following:

You will want to restart Ollydbg for this to take effect. Once Ollydbg is open again, you should be able to get a little further into the program. I started stepping over instructions and keeping an eye on the stack for interesting data. At memory location 403CE6 I found myself stuck in a loop. I looked through the loop and found TEST EBX, EBX followed by a JMP SHORT. If the test comes out equal (that is, EBX is zero), then it will jump. I changed to stepping into (F7) until I reached this test. At that point I double clicked the EBX register in the registers window. For those of you who aren't familiar with Ollydbg, this is the window on the upper right of the screen. I changed the value of EBX to 00000000. This got me out of the loop.

After stepping further into the code, I noticed the ASCII text of Microsoft Security Essentials Alert on the stack.

After some time I kept finding myself at 7C92A2F5. The code here was decrementing EAX and then jumping if not zero. I noticed that EAX had a value of 1, so I changed this to 0. This seemed to get me past that loop as well.

This seems to have allowed me to get further along. While stepping into instructions I noticed a file created called agtyjkj.bat in the stack section. This file contained the following code:

:dsfgdfh
del "C:\Documents and Settings\installer\Desktop\adobeflashplayerv10.0.32.20.exe"
if exist "C:\Documents and Settings\installer\Desktop\adobeflashplayerv10.0.32.20.exe" goto dsfgdfh
del "C:\Documents and Settings\installer\Application Data\agtyjkj.bat"

This code looks like it tries to delete the original file, loops until that file no longer exists, and then removes the bat file. While in that directory, I noticed another new file named hotfix.exe. A quick hash of the file shows that it's the same as our original, just renamed.

This also goes to show that sometimes even when you are doing static analysis, it might be more helpful to do a little dynamic analysis as well. This is especially true when you have a sample like this that has protections and obfuscation.

I decided at this time to dig through the stack section in Ollydbg to see what else might be learned from there. For those of you that might not be familiar with Ollydbg this is the window in the lower right hand side. I found a reference to at.exe as can be seen below:

If you're not familiar with at.exe, it is the command line equivalent of the task scheduler. Depending on how it is called, these items may or may not show in the Scheduled Tasks folder in the control panel. If they don't, you can see them by issuing the at command on the command line. It turns out that these are showing in Scheduled Tasks. It looks like this sample created quite a few tasks (possibly 72).

Looking at these tasks, we see mshta.exe being used to call some random urls. Most look like http://funnyraccoonshow.com/gspwjg.php?fjfnsl=815400370451178. MSHTA.exe is used to allow execution of .hta files. It looks like these tasks are set to run just about every hour. Without going further in dynamic analysis I would assume this is where the html comes from for the fake AV application. It looks like our sources at Virus Total were probably correct in their categorization.
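When there are dozens of these tasks, it helps to pull the URLs out and break them into host, path, and parameters so they can be grouped or blocked. Here is a small Python sketch of that step. The exact command line stored in each task is not shown in this post, so the quoted mshta.exe command in the usage note is a hypothetical format; only the URL itself comes from the sample.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Matches an http or https URL up to the next whitespace or quote character.
URL_RE = re.compile(r"https?://[^\s\"']+")


def urls_in_command(command: str):
    """Return every URL found in a task's command line."""
    return URL_RE.findall(command)


def summarize_url(url: str):
    """Split a URL into the pieces that are useful as indicators."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        "path": parts.path,
        "params": parse_qs(parts.query),
    }


info = summarize_url("http://funnyraccoonshow.com/gspwjg.php?fjfnsl=815400370451178")
```

For instance, urls_in_command('mshta.exe "http://funnyraccoonshow.com/gspwjg.php?fjfnsl=815400370451178"') would return the single URL, which can then be passed to summarize_url. Collecting the distinct hosts across all of the tasks gives a short list of domains to watch for.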

That is about all I have time for today. I hope you saw from this analysis that it isn't always necessary to know assembly to statically analyze code. This is one of those samples where dynamic analysis would probably reveal more with less effort, but we were able to come to the same conclusion just by looking at the code. Sure, I used a little assembly to get out of some loops, but there were no groundbreaking techniques here, just simple register modification, thanks to Ollydbg for allowing us to do so.

You can look for Jamy's post on what he found from dynamic analysis to come soon. In the meantime, we are trying to come up with a way to get these samples to you guys if the sites are taken down before you get them. I hope to have a solution before the next sample post.

Monday, November 29, 2010

Sample Analysis 1

I apologize for not posting a primer on OllyDBG. Things are pretty busy with work and life. I know that's not a good excuse but it's all I've got :). In the meantime, here is a link to the sample we are currently analyzing. We will post our results, both static and dynamic, in the next week or so. Check back. Also, don't forget that we Tweet anytime there is a site update. If you want to follow us, it's @inetopenurla.

The latest sample can be found here.

Check back soon for the analysis. Also, if you have a sample you would like us to analyze you can email it to us at inetopenurla (at) gmail[dot]com. Just put it in a password protected zip file with a password of infected.

I promise to add a primer to OllyDBG soon as well.

Sunday, November 14, 2010

Intro to Static Analysis Part 3

In this post I'm going to introduce you to IDA Pro. This is a disassembler application that is commonly used in the reverse engineering field. There are many other applications like this, but if you plan to do this as a job, it would be a good idea to at least learn this interface in my opinion. Next week we will look at another good and similar tool called OllyDBG.

In my example I am actually running the latest version of IDA Pro and its sister product, Hex-Rays. Hex-Rays is a decompiler application which adds a nice feature set to IDA. You can download a free trial of IDA Pro to see what the newer version offers you. Alternately, they offer a free version which is a few features behind. You can download that here. You can do most of what we will go over in this post with the free version. If you are serious about reverse engineering, or do it for a living, I would highly recommend getting the Hex-Rays add-on. It really breaks out code in a nice readable format, especially for someone who may not be as strong in assembly programming. This is just my opinion, so take it for what it's worth.

Overall you're really looking at under $4,000 for a set of tools that is going to save you a ton of time once you become familiar with them.

So on to the analysis. As always, these links direct you to known malicious software. We accept no responsibility if your machine is infected to a point where you cannot recover, or if credentials are stolen, due to improper handling. Please only analyze this and all samples in a secured lab environment.

I went out to grab a new file from Malware Domain List. The malware I downloaded this week can be found here.

Let's open this in IDA Pro. You can do this in a few different ways. You can drag the file onto the IDA desktop shortcut, or you can open IDA Pro and choose to disassemble a new file by clicking the New button.

Navigate to the file you want to disassemble and choose open:

In most cases IDA will automatically recognize the processor type and options needed to open the file. You can modify these if you know that this is not correct from earlier inspection. A quick note is that normally I move on to IDA Pro or something similar after doing the steps we outlined previously. Therefore I may already know some things about the file such as what architecture the file is created for or if it is packed or not.

If the malware is packed or encrypted, which a lot of malware today is, there are many more steps which you may need to do before you can open the file in IDA and analyze it fully. This post does not go into these details. We may add a post on beating obfuscation at another time.

This specimen did appear to be packed with UPX. So we just unpacked it with the following command:

In this case, we will keep the default options and click OK:

Depending on how big the malware is, this may be quick or it may take some time. The first thing I do once this is done is take a look around at a few screens. The time spent here helps me understand how much work may be ahead of me, based on how much IDA has recognized automatically. I will let you know that my screen shots will be a little different from yours if you are using the Freeware version. Version 6 looks a bit different even though all of the windows and options are still there. You may just need to hunt around to see where your version is displaying the same information.

One of the first boxes I look at is the Names window. If you don't see the Names window, you can go to the view menu, click open subviews and choose names. Alternately you can hit Shift + F4. This Names window is going to show you names of APIs or type libraries it was able to recognize automatically. These may not be written with names that immediately tell their function, but sometimes they are. If they are pretty descriptive, then that may be less work that we need to do. Here is a screen shot of what this sample looks like when we first open it:

As you can see in the Names view, some of the names are pretty easy to distinguish, some are not. So you cannot always judge a function by that.

The next thing I generally look at is the Functions subview. The Functions window shows us the subroutines available in the sample. In some cases you will see real names for these functions, and in other cases they will have generic names such as sub_xxxx, where xxxx is the memory location of the routine. IDA shows a real name when the function matched a type library that IDA knows. The more named functions you have, the easier disassembly generally is.

In our case, IDA named a number of functions. This will help us significantly going forward, because we don't necessarily need to figure out what they do. Here is the screen shot of the Functions window of our sample:

The next thing we can look at is the Strings window. This is a listing of all of the strings that IDA was able to recognize in the binary. We used strings in our previous posting, so this may not be news to you. If you do not see the Strings window, you can go to View, click Open subviews and choose Strings. Alternately you can hit Shift + F12. Here is a screen shot of the strings of our current sample:

As we learned last week, we see some interesting things right away. In this sample there are some strings which can help us understand some of the functions, but unlike the last sample, there are only a few which immediately make sense. We can learn a lot from the strings in a file, but a word of warning is that some malware authors will also put Red Herring strings in a file to throw you off.

If you look across the top of the view window, but under the command menus, you see a multi colored line. This is your binary time line so to speak. This will show you where you are in the binary at the current time. You will notice a little yellow arrow. This arrow shows you exactly where you are at the moment. As you can see in the following screen shot, IDA dumped us off at a function called Start. This is because IDA recognized this function as a potential entry point into the binary. You can view the Exports sub view to see other possible entry points. In our case we only have one Export listed at this time.

The imports sub view shows us all of the functions or APIs that the binary is using. In our example, it looks like all of the imports are coming from the standard Microsoft libraries.

If we navigate to the Type Library sub view, we will see what IDA thinks was the compiler used to create the sample. In our case, it says MS SDK (Windows XP). This lets us know the binary file was created with the standard Microsoft SDK platform, which is used to compile applications for operating systems such as Windows XP, Windows Server 2003, etc.

Now on to the job at hand. We will navigate to the IDA View and start to figure out what this thing is doing. To keep this post short, I'm only going to show a few pieces.

The start function appears to be setting up the stack with a number of variables. It then calls a sub routine at 004F8C0A. The call is to sub_407FB0. If you hover over the sub routine, you will see a small information box appear which will show you the details of the function:

Alternately you can navigate to the sub routine by double clicking the value, which is what we will do here.

This sub routine appears to get a handle on an already loaded module. This module needs to be loaded before this call. The lpModuleName should contain the module we are trying to get a handle on. Here it appears to be 0. This appears to tell us that it returns the handle to the file used to create the process per the MSDN documentation on GetModuleHandleW.

Let's rename this sub routine get_handle_on_parent. We do this by right clicking on the sub routine name and choose rename.

We then rename the routine to what we want.

After clicking OK, you may get a warning that the name length exceeds the limit (15), asking whether you want to increase the limit. Click Yes. You will now see the more meaningful name in the code. Anywhere this sub routine is referenced will be changed automatically for you as well.

You will now run through the remaining sub routines and name them appropriately. This will help make more sense of the code. You may not be able to make sense of all of the routines. One word of advice I would give is to spend a few minutes trying to figure out what each one does. If you don't get it, don't sweat it. Move on to another and identify all of the routines you can. Maybe then the others will start to make sense and you will be able to figure them out.

I'm not going to go into detail on this sample. I was really using this as an introduction to IDA Pro. Next post, I will do the same thing but with OllyDBG. One last thing I will show is how the Hex-Rays decompiler helps make understanding functions a little easier. I have highlighted the sub routine that I want to understand in the following screen shot located at 00403D69:

I will now navigate to the view menu, open sub views and then choose Pseudocode. Alternately you can press F5. This will open a new window called Pseudocode-A for the first window and subsequently Pseudocode-B, Pseudocode-C etc. Below is what that window looks like for this routine:

As you can see, that looks a lot like standard programming, which you may be more comfortable with than assembly, as I am. The more you clean up your function and variable names, the easier this will be to read. If you know programming at all, though, you can get the gist of it without cleaning up much.

Again, you may be eager to learn what this sample does, but alas I am not going to fill that void for you this week. I just wanted to show some general options that are available in IDA Pro to help you understand what a binary does via static analysis.

If you have any questions, please let us know. In my next post, I will take you through some of the features of OllyDBG. After that post, we will begin our analysis only posts. I hope you have found this helpful.

Sunday, October 31, 2010

Intro to Static Analysis Part 2

This week I plan to go a bit more in detail on each of the steps from last week. I was going to get into deeper items such as using disassemblers and such, but I will go there in the next two posts. I really wanted everyone to see just how much you can learn from just doing the simple things I discussed on the last post.

Here are the steps from last post step by step with screen shots. For the sample, I just went to Malware Domain List and grabbed a suspected malware sample from that page.

I will warn you now as always. You should only be downloading and executing the following sample in a lab environment. If you don't know how to do that, watch for Jamy's next post where he will start you down that path.

This week's sample can be downloaded here.

I don't know what this sample does right now. I'm looking at it for the first time just as you are :)

First I'm going to boot my lab environment and pull the sample into it. Then I'm going to run a hash application to get the digital fingerprint of the binary.

I just installed the Malcode Analyst Pack. This will give us a handy right click menu to obtain the MD5 Hash and Strings. So I right click the file and choose MD5 Hash. Here is the result:

As you can see, the hash value of our sample is 121340AA444B4D4153510C0BE58D4D61. We will jot this down in our notes.
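If you don't have a right click hashing tool installed, the same fingerprint can be computed with a few lines of Python. This is just a sketch using the standard hashlib module; the function name md5_of_file and the chunk size are my own choices, and the result is upper-cased only to match the style of the value above.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hash of a file as an upper-case hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large samples do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()
```

Hashing the file without ever executing it keeps this step safe to do, though I would still only handle the sample inside the lab environment.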

Next we will take our fuzzy hash with SSDEEP. In this example we will use the SSDEEP Front End. This is a nice little GUI to make things easy. When we first open the application, we need to choose Create or Append Hash Value. We will choose Single File as our Input. Click the Choose Input button and navigate to where you have your sample, as seen below:

Next, we click the Choose Output button. This will open an Explorer window, where you can choose where you would like your output to go. Here we are going to put it on the Desktop so we can find it easily. I generally name the file <filename>_exe (or _dll if it's a dll). This is just my method for keeping track of which sample the output goes with. Click the Open button. You then need to click the Execute button.

A dialog box will pop up telling you that you are about to run a batch file, and that a DOS box is about to pop up. Click the yes button to continue.

A DOS box will pop up for a second and go away. You will be left back at the main screen of the SSDEEP Front End. You can choose Exit at this time.

You now have the file on your desktop. Double Click to open it. Windows will ask you what application you would like to use to open the file. Choose to select a program from a list and click OK. I normally use the Notepad application for these, but you can use any text editor or viewer that you want. I generally also choose to always use the selected program to open this type of file. That way I don't need to choose every time.

As you can see from the screen shot below, our fuzzy hash is:

3072:HPZJsRgSmHWii6X4/QDDu3vDTw/hkfSUJjLTJra:vZJkjiie44DC/A/hkfSUJjPJr,"mdktask.exe"
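The output line packs several fields together: a block size, a hash computed at that block size, a second hash computed at twice the block size, and the quoted file name. A small Python sketch for splitting such a line into its parts is below. It only parses the text; it does not compute or compare fuzzy hashes, and the field names are my own labels.

```python
def parse_ssdeep_line(line: str):
    """Split an ssdeep output line into block size, both hashes, and file name."""
    # The signature and the file name are separated by the first comma.
    signature, _, filename = line.strip().partition(",")
    # The signature itself is blocksize:hash:hash.
    block_size, chunk_hash, double_chunk_hash = signature.split(":", 2)
    return {
        "block_size": int(block_size),
        "chunk_hash": chunk_hash,
        "double_chunk_hash": double_chunk_hash,
        "filename": filename.strip('"'),
    }


parsed = parse_ssdeep_line(
    '3072:HPZJsRgSmHWii6X4/QDDu3vDTw/hkfSUJjLTJra:vZJkjiie44DC/A/hkfSUJjPJr,"mdktask.exe"'
)
```

Keeping the block size as a separate number is handy because two fuzzy hashes can generally only be compared when their block sizes are the same or differ by a factor of two.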

The next thing we are going to do is take our Hash value and search Virus Total to see if anyone has submitted this sample before. So we navigate to http://www.virustotal.com. We will click the Search link and paste our Hash value in and hit search.

As you can see from the following screen shot, someone has submitted this sample, and 25 out of 41 anti virus applications say it's a virus.

Next, we want to classify the file. To do this, we are going to use TRiD. Ensure you download the latest definition files from here. Once you run TRiD, you will need to point to your definition xml file. Choose the Browse button on the bottom right of the application. Navigate to your definition files and choose the first listed xml file in the directory. I generally put the xml files in a folder called Defs in the TRiD application folder.

You will want to choose Browse on the top to choose our sample we want to analyze. After choosing the file, we click the Analyze button and get the results. As you can see below we have a match of 86% of a Windows 32 Executable file, potentially written in Visual Basic 6.

Normally at this point, we would run the sample through a few other similar applications to see if anything new is found. To keep this post relatively short, I'm going to bypass that. I'm also not going to upload the sample to any sites to be scanned for known virus signatures, mainly because our search on Virus Total already shows that we can be pretty sure it is known.

The next step we will do is to open the file with BinText. Choose the Browse button and choose our sample. After that we hit the go button. This will show us some of the strings available in the binary file. In some cases you may not see what you like here due to packing or encryption. We will go into that more later. In the case of this sample we are able to see a good bit of detail as seen below:

Below are some of the things I saw that should be added to our notes as items of interest:

Form1. This tells us that potentially there is a visual form.
Timer1. This lets us know there is a timer or countdown of some sort.
Winsock.
WinsockAPI. These two tell us there is some sort of network component.
modSMTP. This would let us know there could be an email component which starts to corroborate what we have found from our search on Virus Total earlier.
mod_Variaveis. A quick Google of this word suggests it is Portuguese for variables. We now have an idea of where it may have come from. Maybe from Brazil or somewhere like that.
getpeername. This function retrieves the address of the peer that a socket is connected to. This is more evidence of network connectivity.

I could go through each, but I'm going to shorten this to only pull out the remainder of very interesting points that I see.

C:\Arquivos de programas\Messenger\msmsgs.exe\3 (Microsoft Messenger...interesting)
DownloadFile. Looks like we are getting more malware.

Crypt. Looks like some hashing or encryption going on.
GetWinPlatform. Looking for a specific Windows version maybe?
strEmailTO
strEmailTO1
strEmailTO2
strEmailTO3. Now we see some email capabilities, which possibly shows us why VirusTotal called this an email worm.
D:\Programas Daniel\infect\infect Interlig - rato\Project1.vbp. Hmm, is this guy called Daniel? Gotta love when people don't clean up :)
GOD DAMNIT, the internet doesn't work... Little bit of error checking?
catia@oticasopcao.com. Source Email maybe?
showbol2010.log. Log for us to see how things went maybe?
http://www.youtube.com/. Maybe a link used in the email?
www.youtube.com. A link to more malware sent in the email possibly.
InternetBanking.exe. The next executable file name to download if you click the link or when the executable file executes. InternetBanking is interesting. Maybe we have something trying to steal banking credentials?

There are plenty of other links in there as well as you will see if you go through this sample yourself.

atualizahook.cfg. A config file on how to set things up?
Subject:
From: more email functions
EHLO
AUTH LOGIN
MAIL FROM:<
RCPT TO:<. Email server communications.
Norton AntiVirus. Possibly to look like it's been scanned in transit?
lhe enviou o link do video no youtube. Portuguese for "sent you the link to the video on YouTube."
This program cannot be run in DOS mode. We know this shows up at the beginning of PE files, but this is in the middle. Maybe this is our worm sending a copy of itself. Going through the rest of that section takes us through what we just went through previously. We may assume this is the case until we find out more.
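With a long string listing it can help to sort the interesting entries into rough buckets before writing up notes. The Python sketch below does that with a simple keyword match. The categories and keyword lists are my own assumptions based on the strings discussed above, not any standard scheme, and a match is only a hint of capability, not proof.

```python
# Hypothetical keyword lists, chosen from the strings discussed in this post.
INDICATORS = {
    "email": ("smtp", "ehlo", "mail from", "rcpt to", "auth login", "email"),
    "network": ("winsock", "getpeername", "http://", "www."),
    "download": ("downloadfile", ".exe"),
}


def categorize(strings):
    """Group strings by the indicator categories whose keywords they contain."""
    hits = {category: [] for category in INDICATORS}
    for text in strings:
        lowered = text.lower()
        for category, keywords in INDICATORS.items():
            if any(keyword in lowered for keyword in keywords):
                hits[category].append(text)
    return hits


sample_strings = [
    "modSMTP", "getpeername", "strEmailTO", "DownloadFile",
    "InternetBanking.exe", "Form1", "RCPT TO:<",
]
buckets = categorize(sample_strings)
```

A string can land in more than one bucket, and anything that matches nothing (such as Form1 here) is simply left out, so the original listing is still worth reading in full.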

Everything we have seen, and believe me there is a lot more in there, shows us that our results at Virus Total were pretty close so far. We could finish by looking at the PE format with PEiD or attempt to look for packing or encryption, but I think you can see we really don't need to at this time.

I will end this post here. As you can see, we didn't need to know any programming to find out what this thing does so far. For the APIs we may not have known, a quick search on Google or MSDN gave us the capability of those pieces of code. Not all code is this simple, and to be honest I'm happy I randomly chose such a simple one, as it made our day a bit easier.

In the next post I will take you into the debugger and disassembler. We will start by showing IDA Pro free version and OllyDBG. These are 2 of the more popular tools in this field. The fourth part of our introduction to static analysis will show some examples with a new sample much like we showed more details on the first post's tools.

I hope you enjoyed the post. Look for the next post in a week or so. In the meantime, if you have any questions or comments, feel free to leave them on the blog. We will respond to all comments and questions that are reasonable. We are here to share the knowledge, so don't hesitate if you desire to understand something in more detail!

If you see a possible mistake, please let us know as well. We are not perfect and to be quite frank, I'm not a programmer and I don't play one on the Internet. Therefore, I may make assumptions or come to conclusions that might be debatable by others who may know more than I.

Dynamic Analysis part 2

Welcome to the second installment of the dynamic analysis section of our blog. In the last post, I discussed why you should use a VM solution and made some recommendations on choosing one. In this post, I will go over some information on building out the VM's themselves, including recommended operating systems and tools to install.

When considering what virtual machines to install, you must consider what tools you will be running and what operating system the malware is targeted at. For this reason, you will most likely end up with various Microsoft Windows installations along with a few Unix variants, most likely Linux. I will not cover the installations themselves, as there are plenty of instructions out there for installing operating systems. I will, however, cover a few classes of tools and make some recommendations for each category.

For your Windows VM's, I recommend having both a current Windows 7 install and an older Windows XP installation. The reason for 7 is that malware is starting to directly target Windows 7, as it is starting to gain critical mass in enterprise and general user environments. For most situations, Windows XP will be sufficient though. Windows will be used both as an analysis platform and the place where you run most of your malware samples.

With regard to Linux, you will primarily be using the VM's for analysis or to provide target services for your malware samples. At this point you might be asking, what is a service for malware? A service in this context is just like any other IT service, such as an IRC or FTP service. Many types of malware use legitimate protocols such as FTP or IRC as transports for communication. While performing dynamic analysis, you want to offer up the services the malware is looking for, as a way to go deeper into your analysis. Pretty much any Linux will work well for these duties, so I recommend choosing the distribution that you are most familiar with.
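
To make the idea of offering up a service concrete, here is a minimal Python sketch of a fake listener that greets one client with an FTP-style banner and records the first bytes the client sends. It is an illustration only: the function names, the banner text, and the loopback address are assumptions, and a real lab would more likely run an actual FTP or IRC daemon on the Linux VM.

```python
import socket

def make_listener(host="127.0.0.1", port=0):
    """Bind a TCP listener; port 0 lets the operating system pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener

def serve_once(listener, banner=b"220 lab FTP ready\r\n", timeout=5):
    """Accept one connection, send the banner, return (peer address, first bytes received)."""
    conn, peer = listener.accept()
    with conn:
        conn.sendall(banner)
        conn.settimeout(timeout)
        try:
            data = conn.recv(4096)
        except socket.timeout:
            # The client connected but never spoke within the timeout.
            data = b""
    return peer, data
```

In a lab, you would bind the listener to the address the sample tries to reach (for example by pointing the lab DNS at the Linux VM) and log whatever serve_once returns.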

Next I will cover recommended tools. The main classes of tools for dynamic analysis are process monitors, file monitors, and network monitors.

Process Monitors are used to monitor running processes on the system where you are running the malware. These tools tell you such things as memory in use, files open, CPU use, drivers used, and DLL's in use. Two of my favorite process monitor tools are Sysinternals Process Monitor and Process Hacker. Both tools provide very similar information; however, I personally think Process Hacker does a better job than Process Monitor of showing you sub-processes spawned by execution of a file.

File Monitors are tools used to detect changes in files on disk. The primary use of file monitors as it relates to malware analysis is to detect changes to operating system files such as the Windows registry or configuration files. The tools I generally recommend for this task are Tripwire/AIDE and Capture Bat/Regshot. Tripwire and AIDE are general file integrity scanners. They work by taking an MD5 hash of all the files on a system; when a file changes, they detect the change by comparing the new MD5 to the original. Capture Bat and Regshot work by taking an initial snapshot of the contents of specific files on the disk and comparing them to a later snapshot. Capture Bat and Regshot are both manual tools that require the user to take the first and second snapshots, and then require you to tell the tools to compare the snapshots.
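
The hash-and-compare idea behind file integrity scanners can be sketched in a few lines of Python. This is a simplified illustration, not how Tripwire or AIDE are actually implemented; the function names are made up, and unreadable files are simply skipped.

```python
import hashlib
import os

def snapshot(root):
    """Map each file path under root to the MD5 hex digest of its contents."""
    digests = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as handle:
                    digests[path] = hashlib.md5(handle.read()).hexdigest()
            except OSError:
                # Locked or unreadable files are skipped in this sketch.
                continue
    return digests

def compare(before, after):
    """Return (added, removed, changed) lists of paths between two snapshots."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(p for p in before.keys() & after.keys() if before[p] != after[p])
    return added, removed, changed
```

The workflow mirrors the snapshot tools: call snapshot before running the sample, call it again afterwards, and pass both results to compare.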

The last class of tools I will cover is network monitors. Network monitors are tools that either capture packets on the wire or monitor TCP/IP sockets on the local system. My favorite tools in this area are Wireshark and TCP View. Wireshark is a packet capture tool that runs in promiscuous mode on a network interface and captures all packets going across the wire. TCP View, on the other hand, sits on a local system and lists the TCP/IP ports that are open and the processes that have them open.

This post wraps up our basic analysis station configuration. In the next couple of weeks, I will show how these tools can be used to begin to perform dynamic analysis on sample malware. Stay tuned!

Sunday, October 17, 2010

Intro to Dynamic Analysis Part 1

Curt has covered static analysis quite well and briefly mentioned dynamic analysis. At this point you are probably wondering what is dynamic analysis? Simply put, it is the act of running the code and observing what happens.

Infecting a system with malware from the wild can be very dangerous. A malware infection on your system can cause everything from performance degradation and bot infection to destruction of personal data, all the way up to complete data loss. At this point you might be saying, “I already know it is dangerous, but I need to analyze malware.” Many of us in the information security world have that same need, whether it is for job duties or for personal research to learn about threats in the wild. My goal is to give you some insight into building a malware analysis lab environment to start your dynamic analysis.

The first technology needed to start your own malware analysis, and I feel the most important, is a virtual machine (VM) environment. A VM environment provides several advantages over a physical environment, including configurable resources, advanced disaster recovery, and isolation.

Virtual environments are based on the concept of resource sharing and reutilization. This means that once a virtual environment is installed onto a physical system, you will have the ability to configure as many VM’s as you want by slicing up the physical system's resources. In addition, the VM environment kernel, also known as the hypervisor, allows all the VM’s to share memory and processor time. In practical use, this allows a researcher to, for example, have multiple Windows installations on a single system with only 2 GB of RAM, whereas in a physical environment the same 2 GB system would only allow one installation.

Virtual environments provide several advantages over physical environments when it comes to disaster recovery. The most important advantage for malware analysis is the ability to snapshot. Snapshotting is the ability to capture the system configuration at a specific point in time and to gracefully roll back to that snapshot. The advantage here is that a malware researcher can run malware in a live environment to determine what the malware does, then once done roll the system back to a clean state. Before virtual environments, researchers had to rebuild their lab machines using install media to go back to a clean environment.

The last advantage that VM environments provide us is isolation. The resource sharing and control of virtual environments also gives us the capability to easily isolate machines from one another. With this capability we can easily take a machine we want to run malware on and isolate it from other systems. In the case of a botnet, we could add a system to the virtual environment to simulate the command and control function, while only allowing the command and control system and our original infected host to communicate.

There are several good VM tools available, both commercial and free. Which one you should use is completely up to you. My only recommendation is to look at either VMware’s tools or Sun’s VirtualBox. Both tools support the use of .vmdk files, which means that you can use many of the pre-built virtual appliances available on the net.

Look for part two, in which I will go over setting up your malware analysis VM.

Intro to Static Analysis Part 1

This is Part 1 of our Intro to Static Analysis.

There are a number of schools of thought on how to approach reversing malware. Some people jump right into dynamic analysis in an effort to quickly learn what the specimen is doing so they can put rules in place on their network to stop its functionality or see who else might be infected. Some of these people will then perform static analysis to see if they may have missed something. Others leave it at the results found from their dynamic analysis.

Other people do static analysis first to fully understand the expected behavior so they know if something is happening with the sample when running it in a dynamic lab other than what is expected. They will then run dynamic analysis on it to see if their findings are correct.

Some people just submit the sample to places like Virus Total, Anubis, or CWSandbox, just to name a few. They then take action based on the results they get back from these tools. Finally there are some that just submit it to their Anti Virus vendor and wait for a signature to be released.

There is nothing wrong with any of these methods. Most are done because of either lack of time, skills or understanding of how to reverse malware. Some may think, why reinvent the wheel? This is all OK.

Here on this blog, we enjoy getting our hands dirty with looking at malware. We are not satisfied by just letting someone else do the work. There is nothing like the feeling you get when you have reversed a sample and know what it does and can stop its functionality based on what you yourself have found. It is a great feeling we hope everyone reading will experience.

So what do I do? Speaking for myself, I tend to follow the method to do some static analysis first. I will then put the malware into a lab and see what it might do differently than what I have found. The personal reason that I do this is because sometimes you will find evidence that a sample is capable of realizing that it is being run in a virtual environment. I have seen this in the wild. I have seen samples that are capable of this which then either don't run at all, or run with fake results.

One sample I found that did this actually went through motions of downloading a new sample, executing it which would only lead to removing the old sample, putting the new one in place then downloading another sample that did the same thing. I only figured this out after I started to realize the samples it was grabbing started to cycle through the same names. This wasted a good bit of time while I chased that ghost. Had I looked at it statically first, then I possibly could have saved myself the time.

I'm not going to tell anyone which method to follow. Choose the path that you are most comfortable with and stick with it. I can only warn you of the things I have found in my time doing this.

This first post of mine is to get you ready to do static analysis. Jamy will be posting an introduction to dynamic analysis which will get into lab setup. I will refer you to his post to start there. Whether you're doing dynamic or static analysis, it is a very good idea to start by creating a lab of non-production systems to run your samples in. This could be virtual or physical or both.

Overall process of static analysis:

*****WARNING*******

I want to mention that I will be linking to tools on this blog. I cannot vouch for the validity of the copy you get. Some of these tools may include malware themselves. Try to get a copy of the tools from the original source. You will probably also want to scan the files and watch what your lab systems do after installing them. We will not be held responsible if you download a tool we mention that is infected with a virus. Be cautious and thorough in your investigation of each of these tools!

The first step in your process should be to start a log or collection of the details you are about to find. Some people do this in a text file, others may just jot things down in a notebook. I personally like to use mind mapping software such as FreeMind. Lenny Zeltser, a SANS instructor and an overall excellent resource for reversing knowledge, has freely released a template for FreeMind specifically for analyzing malicious code. You can download it here.

If you are looking for very good training in this area, Lenny also offers a few courses with SANS that can be taken online or at a conference. He has specifically created, along with others, the Forensics 610: Reverse Engineering Malware and the Security 569: Combating Malware in the Enterprise courses. I highly encourage you to take these courses if you are interested in malware analysis.

It doesn't matter which method you choose to write your notes down, but it is very important that you do. Documenting this will help you to keep on track and assist you in writing reports of your analysis if you do this professionally. Additionally if you choose a method that allows you to compile all of your findings centrally, you will be able to see trends or recognize similar behaviors of samples that could help in reversing future samples that exhibit similar characteristics.

This blog does not go into how to identify malicious code on your systems. We may do a post on that at a later date. In the mean time, there are plenty of good resources on the Internet to help you in this area.

With that aside, take note of the system you found the malware on. Take notice of the operating system, patch level, applications installed, etc. Write down where you found the code (e.g. C:\windows\system32). Add any information that may be relevant on how the code was found (e.g. the system administrator noticed the system was running slowly, or found the system blue screened).

Next take a hash of the file or files found. A hash, in this case, is a mathematical computation on the bits of a specimen. This will help with identification of other copies of the malware even if the name is changed. You can think of this much like doctors and scientists look for specific characteristics of a virus in the human body. That way, other doctors or scientists can identify the same virus in another person.

It is generally accepted to perform an MD5 hash on the file. Some people will also do a SHA1 or other computations as well. There is also a newer method called fuzzy hashing or piecewise hashing that can be done. This actually hashes portions of a file, rather than the whole thing. This method allows for identification of portions of the code, which may be useful in catching malware that has changed just a little in order to avoid detection by a person or Anti Virus application, for example.
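
To illustrate why hashing portions of a file helps, here is a deliberately simplified Python sketch that hashes fixed-size blocks and reports what fraction of them match. It is not the algorithm used by real fuzzy hashing tools such as SSDEEP, which choose block boundaries from the content itself; the function names and the 64-byte block size are assumptions for illustration.

```python
import hashlib

def piecewise_hashes(data, block_size=64):
    """Hash each fixed-size block of data separately."""
    return [hashlib.md5(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]

def similarity(a, b, block_size=64):
    """Fraction of block positions whose hashes match in both inputs."""
    hashes_a = piecewise_hashes(a, block_size)
    hashes_b = piecewise_hashes(b, block_size)
    longest = max(len(hashes_a), len(hashes_b))
    if longest == 0:
        return 1.0
    matches = sum(1 for x, y in zip(hashes_a, hashes_b) if x == y)
    return matches / longest
```

A one-byte change gives two files completely different whole-file MD5 values, yet in this sketch only the block containing the changed byte stops matching. Fixed blocks do break down when bytes are inserted or removed, since every later block shifts, which is one reason real tools use content-defined boundaries.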

There are many tools out there to do this. WinMD5 is an easy to use tool which is freely available. SSDEEP can be used for fuzzy hashing. What tool you use is up to you. Hash functions such as MD5 and SHA are standardized algorithms, so every application that implements them should produce the same result for the same file. Some operating systems, such as the *nix varieties, have these tools built right in on the command line, for example md5 or shasum on the Macintosh I'm working from right now.
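
If you would rather script the hashing step, Python's standard hashlib module computes the same digests as the tools above. A minimal sketch follows; the function name and chunk size are arbitrary choices for illustration.

```python
import hashlib

def hash_file(path, chunk_size=65536):
    """Return the MD5 and SHA-1 hex digests of the file at path."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large samples do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return {"md5": md5.hexdigest(), "sha1": sha1.hexdigest()}
```

The returned hex strings are what you would paste into your notes or into an online lookup.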

The next step that is good to take is to compare the hash values that you found with sites on the Internet to see if there are any matches or similar hashes found by others. This can go a long way in saving you time and effort if it is a known specimen. You can copy and paste your hash value into sites such as Virus Total, or Threat Expert to name a few. You can also run this through an internal database you might have to see if there are any hits. The purpose of this is to save yourself time and energy if this sample has already been analyzed and identified by someone else.

Next you will want to classify your specimen. This classification will be the file type, format, target architecture, language used, and compiler used. These characteristics will let you know what systems may be vulnerable and which may not. For example, if you find a PE file format, you can assume that *nix systems will most likely not be vulnerable to the sample. This is because the PE format is used on Microsoft Windows systems.
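
A first-pass file type check can be done by looking at the leading "magic" bytes. The Python sketch below is an illustration only: it recognizes a handful of well-known signatures, and dedicated classifiers know far more formats. The function name and the returned labels are made up for this example.

```python
import struct

def classify(data):
    """Return a coarse file type guess based on the leading bytes."""
    if data[:2] == b"MZ":
        if len(data) >= 0x40:
            # Offset 0x3C of the DOS header stores the offset of the PE header.
            (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
            if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
                return "PE executable (Windows)"
        return "DOS MZ executable"
    if data[:4] == b"\x7fELF":
        return "ELF executable (Unix/Linux)"
    if data[:4] == b"PK\x03\x04":
        return "ZIP archive"
    if data[:5] == b"%PDF-":
        return "PDF document"
    return "unknown"
```

Only the first few hundred bytes are usually needed, so reading the start of the sample in binary mode and passing it to classify is enough for this check.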

TrIDNet with the latest definition files is a good place to start this classification. Minidumper is another free tool which works well. One thing you will notice is that I generally run files through similar tools and compare the output. Sometimes one tool works better than another and will give you more information which may be helpful. I have also seen malware samples that are coded in such a way as to recognize tools and change their behavior if they realize they are being analyzed.

GT2 is another good tool to run the sample through in this stage as well. This application attempts to identify the compiler, DLLs and other requirements for the code being analyzed.

You will want to scan the file next in an attempt to see if it hits any known signatures. You can do this with multiple Anti Virus applications if you have several on your lab systems. You can also take this time to submit it to online sources which run the sample through a number of anti virus applications in an effort to see if it's known. Depending on your organization, you may not want to submit the samples to these sources as they share their information freely with the public and anti virus vendors. If this sample is targeted in any way, this can blow your cover by revealing that you have found and are analyzing the sample.

If that is not a concern for you, some good examples of these services are Virus Total, Virscan, or Jotti. Again, this is just to name a few, there are many other good choices out there. Pick the ones you like and are comfortable with.

Another step to take is then to start to dig a little deeper. You will want to start looking for the executable type, DLLs called, exports, imports, strings, etc. This information will start to reveal characteristics of the sample which may lead you to a better understanding of what it does. For example, if you run strings on the file you may reveal IP addresses used for call back and to download components, you may find information that lets you know the file is packed or encrypted in some way, or you may find other files that are needed or used in the infection.

Some of the tools that you can use to see this information are BinText, dumpbinGUI, or DependencyWalker to name a few. Again, there are many more out there. These are just some ideas of tools that I like to use. For example, if you are running on a *nix server, the command strings will work on most distributions right out of the box. Just open a terminal and run strings filename (where filename is the name of the file you are looking at). You can also use the file command. Type file filename (again where filename is the name of the file you are looking at) to reveal the file type you are working with.
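
For readers without the strings command handy, its basic behavior (printable ASCII runs of at least four characters) can be approximated in Python, together with a simple filter for IPv4-looking values. This is a rough sketch: the function names are made up, and it ignores the Unicode strings that the dedicated tools also find.

```python
import re

IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def extract_strings(data, min_length=4):
    """Return runs of printable ASCII characters at least min_length long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]

def find_ipv4(strings):
    """Return dotted-quad candidates whose four parts are all in the range 0-255."""
    found = []
    for text in strings:
        for candidate in IPV4_PATTERN.findall(text):
            if all(int(part) <= 255 for part in candidate.split(".")):
                found.append(candidate)
    return found
```

Note that version numbers such as 1.2.3.4 also look like addresses, so treat the output of find_ipv4 as leads to verify rather than confirmed call back hosts.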

The next stage is to look for packing or encryption. This is often where the process becomes difficult. Some packers are easy to defeat. Others are not. Some encryption routines are easily seen and reversed. Others are custom and difficult. This is the time when a lot of analysts will enter into dynamic analysis. The reason for this is that they may not be able to unpack or decrypt the file by hand. In order for the file to run on a system, it needs to be unpacked or decrypted in memory to function.

Some tools to use to look for this type of information are PEiD, PE Detective, or Mandiant's Red Curtain. These tools are capable of detecting known packers, encryption or behaviors that are common to malicious files to make them difficult to analyze. You may have found this information out already from the output of previous tools.
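
Besides signature-based detection, a common quick check for packing or encryption is byte entropy: compressed or encrypted data looks close to random, so its entropy approaches 8 bits per byte. The Python sketch below shows that heuristic. It is not how the tools above work internally, and the function names and the 7.0 threshold are assumptions for illustration.

```python
import math
from collections import Counter

def shannon_entropy(data):
    """Return the entropy of data in bits per byte, from 0.0 to 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def probably_packed(data, threshold=7.0):
    """Heuristic with an assumed threshold: high entropy suggests packing or encryption."""
    return shannon_entropy(data) >= threshold
```

Plain code and text usually score well below the threshold, but so can a packed file with a large unpacked resource section, so computing the entropy per section is more telling than a single number for the whole file.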

That's all we have for Part 1 of the introduction to static analysis. Stay tuned for Part 2 coming soon!

Wednesday, October 13, 2010

What is this blog about?

Welcome to internetopenurla.blogspot.com. The idea for this blog came out of a desire to show people, step by step, how to successfully reverse malware to fully understand its capabilities and characteristics.

If you look around on the Internet, you will find little tips and tricks on what to look for to recognize malware. Some of the information will get a little into reversing malware as well. We have yet to find a source which will show how to reverse malware from start to finish with new samples in the wild.

We aim to do just that with this blog. Each month we will take a sample from the wild and reverse it in two ways. One of us will perform static analysis on the specimen and show step by step how we do that. Each step will contain the necessary information to deduce what the sample does one layer at a time, until we fully understand the capabilities of the code. This will include what tools were used, screen shots of the output of the tools, and any custom settings used in the tools to get the necessary output.

The second way we will look at malware will be to put it through dynamic analysis. The results will include pcap captures of traffic that it generates, screen shots and discussions of what changes it made to the test system and any configuration changes that might have needed to be done to get the specimen to run properly in the lab.

At the end of the day, our aim is to help people understand the steps that it takes to successfully understand malware so you can reverse it yourselves. Up until now, it has seemed like this black art that only certain ninjas knew the secret of. We aim to clear that up so that others can start on the exciting journey of reversing malware.

What previous knowledge should you have?

Obviously it helps if you are familiar with programming in some language. C and assembly are very helpful to know, but you need not be able to write fully functional programs in each. You will find this a more necessary skill when doing static analysis.

Knowledge of how to use virtual environments such as VMware or VirtualBox, to name a few, will be helpful. We will show you how to build a lab utilizing these and other tools to do your analysis.

Other than that, we hope to teach you the things you need to know to become successful. Each post will explain each step carefully for the novice reverse engineer. This will include how to configure tools, how to change settings on tools to get expected results, and detailed explanation of the results so you understand what you're looking at.

Who are we?

Curt Shaffer is an Enterprise Architect at Synaptek Corporation where he serves as a US Government contractor. His daily routine involves defining security architecture, incident response, IPS/IDS/DLP signature analysis, and malware reversing. Curt has over 12 years of professional experience in the industry and has consulted on wireless, network and systems infrastructure to a variety of markets including ISP's, federal and local government agencies and SMB's.

Jamy Klein is a Security Engineer at Qualcomm Inc. where he performs duties such as security system design and operation, including technologies such as encryption, DLP, and web proxies, and processes such as malware investigation and incident response. Jamy has 10 years of professional security experience working in various industries, including financial, government, insurance, medical, and high technology.