Sunday, November 14, 2010

Intro to Static Analysis Part 3

In this post I'm going to introduce you to IDA Pro, a disassembler that is commonly used in the reverse engineering field. There are many other applications like it, but if you plan to do this as a job, in my opinion it would be a good idea to at least learn this interface. Next week we will look at another good and similar tool called OllyDBG.

In my example I am running the latest version of IDA Pro and its sister product, Hex-Rays. Hex-Rays is a decompiler which adds a nice feature set to IDA. You can download a free trial of IDA Pro to see what the newer version offers you. Alternately, they offer a free version which is a few features behind. You can download that here. You can do most of what we will go over in this post with the free version. If you are serious about reverse engineering, or do it for a living, I would highly recommend getting the Hex-Rays add-on. It really breaks out code in a nice readable format, especially for someone who may not be as strong in assembly programming. This is just my opinion, so take it for what it's worth.

Overall, you're really looking at under $4,000 for a set of tools that is going to save you a ton of time once you become familiar with them.

So on to the analysis. As always, these links direct you to known malicious software. We take no responsibility if your machine gets infected to a point where you cannot recover it, or if credentials are stolen due to improper handling. Please only analyze this and all samples in a secured lab environment.

I went out to grab a new file from Malware Domain List. The malware I downloaded this week can be found here.

Let's open this in IDA Pro. You can do this in a few different ways. You can drag the file onto the IDA desktop shortcut, or you can open IDA Pro and choose to disassemble a new file by clicking the New button.

Navigate to the file you want to disassemble and choose open:

In most cases IDA will automatically recognize the processor type and options needed to open the file. You can modify these if you know from earlier inspection that they are not correct. A quick note: I normally move on to IDA Pro or something similar only after doing the steps we outlined previously. Therefore I may already know some things about the file, such as what architecture it was built for or whether it is packed.

If the malware is packed or encrypted, which a lot of malware today is, there are many more steps which you may need to do before you can open the file in IDA and analyze it fully. This post does not go into these details. We may add a post on beating obfuscation at another time.
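One quick check for UPX specifically is to look at the PE section names, since UPX by default leaves sections named UPX0 and UPX1 behind. Below is a minimal Python sketch of that check. It assumes a well-formed PE file, the function names are made up for illustration, and a renamed section or a different packer will slip right past it, so treat a negative result as "unknown" rather than "not packed".

```python
import struct
from typing import List


def pe_section_names(data: bytes) -> List[str]:
    """Return the section names found in the PE section table."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    coff = pe_offset + 4  # the 20-byte COFF file header follows the signature
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_header_size,) = struct.unpack_from("<H", data, coff + 16)
    table = coff + 20 + opt_header_size  # section table follows the optional header
    names = []
    for i in range(num_sections):
        start = table + i * 40  # each section header is 40 bytes
        raw = data[start:start + 8]  # the name is the first 8 bytes, NUL padded
        names.append(raw.rstrip(b"\x00").decode("ascii", errors="replace"))
    return names


def looks_upx_packed(data: bytes) -> bool:
    """Heuristic only: True if any section name starts with 'UPX'."""
    return any(name.startswith("UPX") for name in pe_section_names(data))
```

To use it, read the sample with `open(path, "rb").read()` inside your lab environment and pass the bytes to `looks_upx_packed`.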

This specimen did appear to be packed with UPX. So we just unpacked it with the following command:

In this case, we will keep the default options and click OK:

Depending on how big the malware is, this may be quick or it may take some time. The first thing I do once this is done is take a look around at a few screens. This helps me understand how much work may be ahead of me, based on how much IDA has recognized automatically. Note that my screenshots will look a little different from what you see if you are using the freeware version. Version 6 looks a bit different, even though all of the windows and options are still there. You may just need to hunt around to see where your version displays the same information.

One of the first windows I look at is the Names window. If you don't see the Names window, you can go to the View menu, click Open subviews and choose Names. Alternately you can hit Shift + F4. The Names window shows you the names of APIs or library items that IDA was able to recognize automatically. These names do not always tell you their function immediately, but sometimes they do. If they are pretty descriptive, that may mean less work for us. Here is a screenshot of what this sample looks like when we first open it:

As you can see in the Names view, some of the names are pretty easy to distinguish and some are not, so you cannot always judge a function by its name.

The next thing I generally look at is the Functions subview. The Functions window shows us the subroutines available in the sample. In some cases you will see real names for these functions, and in other cases they will have generic names such as sub_xxxx, where xxxx is the address of the routine. IDA shows a real name only when the function matched a library that IDA knows. The more named functions you have, the easier disassembly generally is.
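If you copy the function list out of IDA as plain text, a few lines of Python can give you a rough feel for how much is left to name. This is only a sketch under my own assumptions: the helper names are invented, and apart from sub_407FB0 the sample names in the usage note are hypothetical.

```python
import re

# IDA's generic names look like sub_ followed by the hex address of the routine.
AUTO_NAME = re.compile(r"^sub_[0-9A-Fa-f]+$")


def split_function_names(names):
    """Split names into (auto_generated, recognized) lists, keeping order."""
    auto, recognized = [], []
    for name in names:
        if AUTO_NAME.match(name):
            auto.append(name)
        else:
            recognized.append(name)
    return auto, recognized


def address_of(auto_name):
    """Recover the address encoded in a generic sub_xxxx name."""
    if not AUTO_NAME.match(auto_name):
        raise ValueError("not a generic sub_xxxx name: %r" % auto_name)
    return int(auto_name[len("sub_"):], 16)
```

For example, `split_function_names(["start", "sub_407FB0", "_memcpy"])` puts sub_407FB0 in the first list and the other two in the second, and `address_of("sub_407FB0")` gives back the address as an integer.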

In our case, IDA named a number of functions. This will help us significantly later on, because we don't necessarily need to figure out what they do. Here is the screenshot of the Functions window of our sample:

The next thing we can look at is the Strings window. This is a listing of all of the strings that IDA was able to recognize in the binary. We used strings in our previous post, so this may not be news to you. If you do not see the Strings window, you can go to View, click Open subviews and choose Strings. Alternately you can hit Shift + F12. Here is a screenshot of the strings of our current sample:
As we learned last week, we see some interesting things right away. In this sample there are some strings which can help us understand some of the functions, but unlike the last sample, only a few immediately make sense. We can learn a lot from the strings in a file, but a word of warning: some malware authors will also put red herring strings in a file to throw you off.
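If you want to see roughly what a strings listing is doing under the hood, here is a minimal Python sketch that pulls runs of printable ASCII out of a blob of bytes. It is an illustration only, not how IDA builds its Strings window: it ignores Unicode strings, and the function name and the minimum length of 4 are my own choices.

```python
import re


def ascii_strings(data: bytes, min_len: int = 4):
    """Return runs of printable ASCII characters at least min_len long."""
    # 0x20 through 0x7e covers space through tilde, the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Run it over the bytes of a sample, for example `ascii_strings(open(path, "rb").read())`, and keep in mind that anything it prints could have been planted by the author.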

If you look across the top of the view window, just under the menus, you will see a multicolored bar. This is your binary timeline, so to speak. It shows you where you are in the binary at the current time. You will notice a little yellow arrow, which shows you exactly where you are at the moment. As you can see in the following screenshot, IDA dropped us off at a function called Start. This is because IDA recognized this function as a potential entry point into the binary. You can view the Exports subview to see other possible entry points. In our case we only have one export listed at this time.

The Imports subview shows us all of the functions or APIs that the binary imports. In our example, it looks like all of the imports are coming from the standard Microsoft libraries.

If we navigate to the Type Library subview, we will see what IDA thinks was the compiler used to create the sample. In our case, it says MS SDK (Windows XP). This lets us know the binary was created with the standard Microsoft SDK platform, which is used to compile applications for operating systems such as Windows XP, Windows Server 2003, etc.
Now on to the job at hand. We will navigate to the IDA View and start to figure out what this thing is doing. To keep this post short, I'm only going to show a few pieces.

The start function appears to set up the stack with a number of variables. It then makes a call at 004F8C0A to the subroutine sub_407FB0. If you hover over the subroutine name, a small information box will appear showing you the details of the function:

Alternately you can navigate to the subroutine by double clicking the value, which is what we will do here.
This subroutine appears to get a handle on an already loaded module. The module needs to be loaded before this call is made. The lpModuleName parameter should contain the name of the module we are trying to get a handle on. Here it appears to be 0. Per the MSDN documentation on GetModuleHandleW, this means the call returns a handle to the file used to create the calling process.
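If you want to see that behavior for yourself on a harmless process, here is a small Python sketch that makes the same call through ctypes. It is my own illustration, not code from the sample: the function name is made up, it only does anything on Windows, and the handle it returns belongs to the Python interpreter that runs it.

```python
import ctypes
import sys


def current_module_handle():
    """Call GetModuleHandleW(NULL); returns None when not running on Windows."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    get_module_handle = kernel32.GetModuleHandleW
    get_module_handle.argtypes = [ctypes.c_wchar_p]  # lpModuleName
    get_module_handle.restype = ctypes.c_void_p      # HMODULE, returned as an int
    # Passing None sends NULL for lpModuleName, just like the 0 in the disassembly.
    return get_module_handle(None)
```

On Windows the value that comes back is the base address where the executable of the calling process is loaded, which is why a name like get_handle_on_parent fits the routine.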

Let's rename this subroutine get_handle_on_parent. We do this by right clicking on the subroutine name and choosing Rename.

We then rename the routine to what we want.
After clicking OK, you may get a warning that the name length exceeds the limit (15), asking whether you want to increase the limit. Click Yes. You will now see the more meaningful name in the code. Anywhere this subroutine is referenced will be changed automatically for you as well.

You will now run through the remaining subroutines and name them appropriately. This will help make more sense of the code. You may not be able to make sense of all of the routines. One word of advice: spend a few minutes trying to figure out what a routine does. If you don't get it, don't sweat it. Move on to another and identify all of the routines you can. Maybe then the others will start to make sense and you will be able to figure them out.

I'm not going to go into detail on this sample. I was really using it as an introduction to IDA Pro. Next post, I will do the same thing but with OllyDBG. One last thing I will show is how the Hex-Rays decompiler makes understanding functions a little easier. In the following screenshot I have highlighted the subroutine that I want to understand, located at 00403D69:

I will now navigate to the View menu, click Open subviews and then choose Pseudocode. Alternately you can press F5. This will open a new window called Pseudocode-A for the first window, and subsequently Pseudocode-B, Pseudocode-C, etc. Below is what that window looks like for this routine:

As you can see, that looks a lot like standard programming, which you may be more comfortable with than assembly, as I am. The more you clean up your function and variable names, the easier this will be to read. If you know programming at all, though, you can get the gist of it without cleaning up much.

Again, you may be eager to learn what this sample does, but alas I am not going to fill that void for you this week. I just wanted to show some general options that are available in IDA Pro to help you understand what a binary does via static analysis.

If you have any questions, please let us know. In my next post, I will take you through some of the features of OllyDBG. After that post, we will begin our analysis only posts. I hope you have found this helpful.


  1. I'm not able to download the malware file now. Do you know of anywhere else I can get it? I'm a little late to the party and wanting to follow along with you, but can't seem to get the malware sample now.

  2. We do not keep copies of the malware. We are going to start posting them to Offensive Computing so they will maintain copies for us. Up until this point, we have just linked directly to the sites containing the infection. A number of people have requested this, and we feel the best way is to post a link to the sample uploaded to Offensive Computing. We are starting that with the next post.

  3. Excellent, thanks. Love this blog, so keep up the great work!