Saturday, July 03, 2010

Skillz Follow-up

Based on some of the events of the week, and in light of a follow-up post from Eric, I wanted to follow up on my last blog post.

Earlier this week, I spent an entire day talking with a great group of folks on the other side of the country...a whole day dedicated to just talking about forensics! Some of what we talked about was malware (specifically, 'bot) related, and as part of that, we (of course) talked about some of the issues related to malware characteristics and timeline analysis. One of the aspects that ties all of these topics together is timestomping...when some malware gets installed, the file times (more appropriately, the $STANDARD_INFORMATION attributes within the MFT) are purposely modified. MS provides open APIs (GetFileTime/SetFileTime) that allow for this, and in some cases, the file times for the malware are copied from a legitimate system file.
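One common way to look for this is to compare the creation time in the $STANDARD_INFORMATION attribute against the one in the $FILE_NAME attribute, since the documented APIs only touch the former. What follows is a minimal sketch in Python, not my own code; the function name is made up for illustration, and it assumes both times have already been parsed into datetime objects. A hit is a lead to follow up on, not proof, as legitimate file operations may also leave the two attributes out of step.

```python
from datetime import datetime

def looks_timestomped(si_created: datetime, fn_created: datetime) -> bool:
    """Hypothetical helper: flag a record whose $STANDARD_INFORMATION
    creation time is earlier than its $FILE_NAME creation time.

    This is a heuristic only; it does not prove the times were modified.
    """
    return si_created < fn_created

# Example: $SI says 2004, $FN says 2010, so the record is worth a closer look.
suspicious = looks_timestomped(datetime(2004, 8, 4, 12, 0, 0),
                               datetime(2010, 6, 28, 9, 15, 0))
```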

So, tying this back to my previous "skillz" post, I'd mentioned modifying my own code to extract the information I wanted from the MFT. Interestingly, Eric's post addressed the issue of having the information available, and I tend to agree with the first comment to his post...too many times, GUI analysis tools get over-crowded and there's just too much stuff around to really make sense of things. Rather than having a commercial analysis app into which I can load an image and have it tell me everything that's wrong, I tend to rely on the goals for the analysis that I work out with the customer...even if it means that I would use separate tools. I don't always need information from the MFT...but sometimes I might. So do I want to pay for a commercial application that's going to attempt to keep up with all of the possible wrong stuff that's out there, or can I use a core set of open-source tools that allow me to get the information I need, when I need it, because I know what I'm looking for?

So, what does the output of my code look like? Check out the image...does this make sense to folks? How about to folks familiar with the MFT? Sure, it makes sense to me...and because the code is open source, I can open the Perl script in an editor or Notepad and see what some of the information means. For example, on the first line, we see:

132315 FILE 2 1 0x38 4 1

What does that mean? The first number is the count, or number of the MFT record. The word "FILE" is the signature...the other possible entry is "BAAD". The number "2" is the sequence number, and the "1" is the link count. Where did I get this? From the code, and its documentation. I can add further comments to the code, if need be, that describe what various pieces of information mean or refer to, or I can modify the output so that everything's explained right there in the output.
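To give a feel for where those first few values come from, here's a rough sketch in Python (my own script is Perl, and this is not it). It assumes the commonly documented on-disk layout of an MFT record header: the signature in the first 4 bytes, the sequence number at offset 16, and the link count at offset 18, both little-endian 16-bit values. The function name and dictionary keys are made up for illustration, and the remaining fields on that output line are left alone.

```python
import struct

def parse_record_header(record: bytes) -> dict:
    """Hypothetical helper: pull the signature, sequence number, and
    link count out of a single raw MFT record."""
    # Bytes 0-3 hold the signature, e.g. "FILE" or "BAAD".
    signature = record[0:4].decode("ascii", errors="replace")
    # Two little-endian unsigned shorts starting at offset 16:
    # sequence number, then hard link count.
    sequence, link_count = struct.unpack_from("<HH", record, 16)
    return {"signature": signature,
            "sequence": sequence,
            "link_count": link_count}

# Build a fake 1024-byte record matching the example line above.
fake = bytearray(1024)
fake[0:4] = b"FILE"
struct.pack_into("<HH", fake, 16, 2, 1)
header = parse_record_header(bytes(fake))
```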

Or, because this stuff is open-source, another option is to just move everything to CSV output, so that it can be opened up as a spreadsheet.
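As a sketch of what that might look like (again in Python rather than the Perl I actually use), the snippet below writes one row per record with a header line on top. The column names and the dictionary layout are assumptions for the example, not the format of any existing tool.

```python
import csv
import io

def records_to_csv(records) -> str:
    """Hypothetical helper: render parsed MFT record headers as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    # Header row, so the columns explain themselves in a spreadsheet.
    writer.writerow(["record", "signature", "sequence", "link_count"])
    for r in records:
        writer.writerow([r["record"], r["signature"],
                         r["sequence"], r["link_count"]])
    return buf.getvalue()

csv_text = records_to_csv([
    {"record": 132315, "signature": "FILE", "sequence": 2, "link_count": 1},
])
```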

Again, because this is open-source, another option is to add to the output the offset within the MFT where the entry is found...that's not really necessary, as that offset can be easily computed from the record number, and that may even be intuitively obvious to folks who understand the format of the MFT.
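The arithmetic is just the record number times the record size. The sketch below assumes 1024-byte records, which is typical but not guaranteed (the real value comes from the volume's boot sector), and the result is an offset into the extracted $MFT file, not a location on disk. The names are made up for the example.

```python
RECORD_SIZE = 1024  # assumption: typical MFT record size, check the boot sector

def record_offset(record_number: int, record_size: int = RECORD_SIZE) -> int:
    """Hypothetical helper: byte offset of a record within the $MFT file."""
    return record_number * record_size

offset = record_offset(132315)  # 135490560
```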

Now, back to the image. The "0x0010" attribute is the $STANDARD_INFORMATION attribute, and the "0x0030" attribute is the $FILE_NAME attribute. Note the differences in the time stamps. Yet another option available...again, because this is open-source...is to convert the output, or just the $FILE_NAME attribute information, to the five-field TLN format.
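For anyone who hasn't seen it, the five TLN fields are time, source, host, user, and description, separated by pipes, with the time as a Unix epoch value. A rough Python sketch of the conversion follows; it assumes the time stamp is still in raw 64-bit FILETIME form (100-nanosecond intervals since January 1, 1601 UTC), and the function names and sample field values are invented for illustration.

```python
# Number of 100-nanosecond intervals between 1601-01-01 and 1970-01-01 (UTC).
EPOCH_DIFF = 116444736000000000

def filetime_to_unix(filetime: int) -> int:
    """Convert a 64-bit FILETIME value to whole seconds since the Unix epoch."""
    return (filetime - EPOCH_DIFF) // 10_000_000

def to_tln(filetime: int, source: str, host: str, user: str,
           description: str) -> str:
    """Hypothetical helper: build one five-field TLN line."""
    return "|".join([str(filetime_to_unix(filetime)),
                     source, host, user, description])

# One second past the Unix epoch, with made-up host and description values.
line = to_tln(EPOCH_DIFF + 10_000_000, "MFT", "HOST1", "",
              "$FILE_NAME created - example.exe")
```

The user field is left empty here since an MFT entry by itself doesn't identify a user.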

So, one way to approach the issue of analysis is to say, hey, I paid for this GUI application and it should include X, Y, and Z in the output. As you can imagine, after a while, you're going to have one crowded UI, or you're going to have so many layers that you're going to lose track of where everything is located, or how to access it. Another way to approach your analysis is to start with your goals, and go from there...identify what you need, and go get it. Does this mean that you have to be a programmer? Not at all. It just means that you have to have a personal or professional network of friends in the industry...a network that you contribute to and can go to get information, etc.


du212 said...

for the MFT code, are you pointing it to a specific MFT record number or parsing all MFT entries?

Personally, the output looks good to me, but I think headers are useful to people perhaps not as familiar with MFT structure.

I also tend to prefer CSV (or tab) linear output rather than top to bottom console output, but that's b/c I'm usually looking at more than one record @ a time.

good starts tho...I have been following these $filename threads for a while as a simpler way to view/correlate this data(and others) is needed.

Looking ahead to Win7, it'd also be nice to (in a more automated fashion) link Journaling entries to MFT entries, but that is a topic for another thread.

H. Carvey said...

for the MFT code, are you pointing it to a specific MFT record number or parsing all MFT entries?

What's displayed is an excerpt...the code goes through the entire MFT.

...i think headers are useful...

Headers? How so?

Also, I'd suggest that if you're not familiar with the MFT, then maybe that's not something you should be looking at...there's too much of a chance of misinterpretation.

Finally...the code is open source, much like Dave Kovar's code. I haven't posted mine yet, but open source means that someone can go in and add whatever headers they deem necessary.

Thanks for reading and commenting!

Stefan said...

to convert the output, or just the $FILE_NAME attribute information, to the five-field TLN format.

Now that'd be cool. I can already see Kristinn coding an MFT module for log2timeline... :-)