Saturday, July 31, 2010

Ugh

Sorry, I just don't know what other title to use...I wasn't able to come up with something witty or pithy, because all I kept thinking was "ugh".

The "ugh" comes from a question (two, actually, that are quite similar) that appear over and over again in the lists and online forums (forii??)...

I have an image of a system that may have been compromised...how do I prove that data was exfiltrated/copied from the system?

Admit it...we've all seen it. Some may have actually asked this question.

Okay, this is the part where, instead of directly answering the question, I tend to approach the answer from the perspective of getting the person who asked the question to reason through the process to the answer themselves. A lot of people really hate this, I know...many simply want to know which button to click in their forensic application that will give them a list of all of the files that had been copied from the system (prior to the image being acquired).

So, my question to you is...with just an image of the supposed victim system, how would you expect to demonstrate that data was copied or exfiltrated from that system?

Well, there are a couple of things I've done in the past. One is to look for indications of collected data, usually in a file. I've seen this in batch files, VBS scripts, and SQL injection commands...commands are run and the output is collected to a file. From there, you may see additional commands in the web server logs that indicate that the file was accessed...be sure to check the web server response code and the bytes sent by the server, if they're recorded.
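
Just to illustrate the idea (this is nothing more than a sketch), here's a bit of Perl that scans a W3C-format web server log for requests to a suspected staging file and pulls out the status code and bytes sent...the field positions are read from the "#Fields:" header line, and sc-bytes will only be there if it was enabled in the logging configuration:

#!/usr/bin/perl
# exfil_check.pl - minimal sketch: scan a W3C-format IIS log for requests to a
# suspected staging file and report the status code and bytes sent. Field
# positions are read from the "#Fields:" header line; sc-bytes is only present
# if that field was enabled in the logging configuration.
use strict;
use warnings;

my ($log, $target) = @ARGV;
die "Usage: $0 <web server log> <staging file name>\n" unless ($log && $target);

my %idx;
open(my $fh, '<', $log) or die "Cannot open $log: $!\n";
while (my $line = <$fh>) {
    chomp($line);
    if ($line =~ /^#Fields:\s+(.*)/) {
        my @names = split(/\s+/, $1);
        %idx = ();
        $idx{$names[$_]} = $_ foreach (0..$#names);
        next;
    }
    next if ($line =~ /^#/ || !%idx);
    my @f = split(/\s+/, $line);
    next unless (defined $idx{'cs-uri-stem'} && defined $f[$idx{'cs-uri-stem'}]);
    next unless ($f[$idx{'cs-uri-stem'}] =~ /\Q$target\E/i);
    my $status = defined $idx{'sc-status'} ? $f[$idx{'sc-status'}] : "n/a";
    my $bytes  = defined $idx{'sc-bytes'}  ? $f[$idx{'sc-bytes'}]  : "n/a";
    print "$line\n  => status: $status, bytes sent: $bytes\n";
}
close($fh);

A 200 response with a byte count in the neighborhood of the staging file's size is a pretty good indicator that the file actually went out the door...keeping in mind that it's still an indication, not proof of what was in the file.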

In other instances, I've found that the user had attached files to web-based email. Some artifacts left after accessing a GMail account indicated that a file was attached to an email and sent to another address. In several instances, this was a resume...an employee was actively looking for a job and interviewing for that position while on company time. Based on file names and sizes (which may or may not be available), used in conjunction with file last accessed times, we've been able to provide indications that files were sent off of the system.

What else? Well, there are P2P applications...you may get lucky, and the user will have installed one that clearly delineates which files are to be shared. Again, this may only be an indication...you may have to access the P2P network itself and see if the file (name, size, hash) is out there.

What about copying? Most analysts are aware of USB devices by now; however, there is still apparently considerable confusion over what indications of the use of such devices reside within an image. One typical scenario is that a user plugs such a device in and copies files to the device...how would you go about proving this? Remember, you only have the image acquired from the system. The short answer is simply that you can't. Yes, you can show when a device was plugged in (with caveats) and you may have file last access times to provide additional indications, but how do you definitively associate the two, and differentiate the file accesses from, say, a search, an AV scan, or other system activity?
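
For what it's worth, here's a minimal sketch (using the Parse::Win32Registry Perl module) of the kind of check you can do against a System hive extracted from the image...it simply lists the USBSTOR entries and the key LastWrite times. A real script would also read the Select key to get the current ControlSet, and you'd correlate with setupapi.log and MountedDevices...and even then, all it tells you is when the device keys were written, not what (if anything) was copied to the device:

#!/usr/bin/perl
# usbstor_sketch.pl - minimal sketch using Parse::Win32Registry to list
# USBSTOR device entries and key LastWrite times from a System hive extracted
# from an image. A production script would also read the Select key to
# determine the current ControlSet, and correlate with setupapi.log and the
# MountedDevices key; that's left out here for brevity.
use strict;
use warnings;
use Parse::Win32Registry;

my $hive = shift or die "Usage: $0 <System hive file>\n";
my $reg  = Parse::Win32Registry->new($hive) or die "Not a Registry hive: $hive\n";
my $root = $reg->get_root_key;

my $usbstor = $root->get_subkey("ControlSet001\\Enum\\USBSTOR")
    or die "No USBSTOR key found in ControlSet001\n";

foreach my $dev ($usbstor->get_list_of_subkeys) {
    print $dev->get_name, "\n";
    foreach my $inst ($dev->get_list_of_subkeys) {
        # The instance key name is typically the device serial number (or a
        # Windows-generated ID); the key LastWrite time is one data point,
        # not proof of when (or whether) files were copied.
        printf "  %-45s %s UTC\n", $inst->get_name, scalar gmtime($inst->get_timestamp);
    }
}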

I hope that this makes sense. My point is that contrary to what appears to be popular belief, Windows systems do not maintain a list of files copied off of the system, particularly not in the Registry. If your concern is data exfiltration (insider activity, employee takes data, intruder gets on the system and exfils data...), consider the possible scenarios and demonstrate why they would or wouldn't be plausible (i.e., exfil via P2P would not be plausible if no P2P apps are installed). Reason through the analysis process, provide clear explanations and documentation as to what you did and what you found, and justify your findings.

Thursday, July 29, 2010

Exploit Artifacts redux, plus

As a follow-up to my earlier post where I discussed Stuxnet, I wanted to take a moment to update some of what's come out on this issue in the past 12 days.

Claus wrote a great post that seems pretty comprehensive with respect to the information that's out there and available. He points to the original posts from VirusBlokAda, as well as two posts from F-Secure, the first of which points to the issue targeting SCADA systems. According to the second post:

...the Siemens SIMATIC WinCC database appears to use a hardcoded admin username and password combination that end users are told not to change.

Does that bother anyone?

Okay, all that aside, as always I'm most interested in how we (forensic analysts, incident responders) can use this information to our benefit, particularly during response/analysis activities. This PDF from VirusBlokAda, hosted by F-Secure, has some very good information on artifacts associated with this malware. For example, there are two digitally signed driver files, mrxnet.sys and mrxcls.sys (found in the system32\drivers directory), as well as several hidden (via attributes) LNK files and a couple of .tmp files. There are also two other files in the system32\inf directory, oem6c.pnf and oem7a.pnf, both of which reportedly contain encrypted data. This MS Threat Encyclopedia entry indicates that these files (and others) may be code that gets injected into lsass.exe. This entry points to online hosts reportedly contacted by the worm/downloader itself, so keep an eye on your DNS logs.

As the two .sys files are drivers, look for references to them both in the Services keys. This also means that entries appear in the Enum\Root keys (thanks to Stefan for the link to ThreatExpert).

This post from Bojan of the SANS ISC has some additional information, particularly that the LNK files themselves are specially-crafted so that the embedded MAC times within the LNK file are all set to 0 (ie, Jan 1, 1970). Outside of that, Bojan says that there is nothing special about the LNK files themselves...but still, that's something.
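
If you want to check LNK files for this yourself, the embedded times live at fixed offsets in the shortcut header...here's a minimal sketch (not a full LNK parser; the offsets are from the published ShellLink header layout):

#!/usr/bin/perl
# lnk_times.pl - minimal sketch: check whether the MAC times embedded in a
# shortcut (.lnk) header are all zero, as reported for the Stuxnet LNK files.
# Offsets are from the published ShellLink header layout; this is an
# illustration, not a full LNK parser.
use strict;
use warnings;

my $file = shift or die "Usage: $0 <file.lnk>\n";
open(my $fh, '<', $file) or die "Cannot open $file: $!\n";
binmode($fh);
read($fh, my $hdr, 0x4C) == 0x4C or die "Short read; not a complete LNK header\n";
close($fh);

# Creation, access and write times are 64-bit FILETIMEs at offsets
# 0x1C, 0x24 and 0x2C within the header.
my @times;
foreach my $ofs (0x1C, 0x24, 0x2C) {
    my ($lo, $hi) = unpack("VV", substr($hdr, $ofs, 8));
    push(@times, ($hi * (2**32)) + $lo);
}

if ($times[0] == 0 && $times[1] == 0 && $times[2] == 0) {
    print "$file: embedded MAC times are all zero\n";
}
else {
    print "$file: embedded MAC times are not zero\n";
}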

MS also has a work-around for disabling the display of icons for shortcuts.

So, in summary, what are we looking for? Run RegRipper and see if there are entries for the drivers under the Services and Enum\Root keys (I have a legacy.pl plugin, or you can write your own). Within the file system (in a timeline), look for the driver files, as well as the .tmp and .pnf files. Should you find those, check the Registry and the appropriate log file (setupapi.log on XP) for a recently connected USB device.
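
As a rough example of the Registry side of that check, here's a minimal sketch using the Parse::Win32Registry module against a System hive extracted from an image. The service names used (MRxCls, MRxNet) are assumptions based on the driver file names, so adjust them to match what your own analysis (or the AV write-ups) shows:

#!/usr/bin/perl
# svc_check.pl - minimal sketch: check a System hive for Services subkeys
# matching the two driver names. The service names used here (MRxCls, MRxNet)
# are assumptions based on the driver file names.
use strict;
use warnings;
use Parse::Win32Registry;

my $hive = shift or die "Usage: $0 <System hive file>\n";
my $reg  = Parse::Win32Registry->new($hive) or die "Not a Registry hive: $hive\n";
my $root = $reg->get_root_key;

foreach my $cs ("ControlSet001", "ControlSet002") {
    my $svc_key = $root->get_subkey($cs . "\\Services") or next;
    foreach my $name ("MRxCls", "MRxNet") {
        my $svc = $svc_key->get_subkey($name) or next;
        print "$cs\\Services\\$name  LastWrite: ", scalar gmtime($svc->get_timestamp), " UTC\n";
        if (my $img = $svc->get_value("ImagePath")) {
            print "  ImagePath: ", $img->get_data, "\n";
        }
    }
}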

Speaking of artifacts, Rebhip.A looks like a lot of fun...

For those interested, here's some PoC code from Exploit-DB.com.

Addendum: Anyone write a plugin for Rebhip.A yet? Also, I almost missed Pete's post on Stuxnet memory analysis...

Updates

I haven't posted in a while, as I was on a mission trip with a team of wonderful folks involved in Compassion International. There wasn't a lot of connectivity where I was, and to be honest, it was good to get away from computers for a while.

However, in the meantime, things haven't stopped or slowed down in my absence. Matt has added support to F-Response for the Android platform. Also, within 24 hours of the release, a customer had posted a video showing F-Response for Android running on an HTC Desire. I have an Android phone...Backflip...but I have read about how the Android OS is rolling out on more than just phones. Andrew Hoog (viaForensics) has a site on Android Forensics.

The folks at the MMPC site posted about a key logger (Win32/Rebhip.A) recently. The write-up for Rebhip.A contains information about artifacts...some very interesting and useful indicators...that forensic analysts and incident responders will find valuable.

Det. Cindy Murphy has a really nice paper out on cell phone analysis...not Windows-specific, I know, but very much worth a mention. You can find info linked on Eric's blog, as well as a link to the paper on Jesse's blog. I've read through the paper, and though I don't do many/any cell phone exams, the paper is a very good read. If you have a moment, or will have some down time (traveling) soon, I'd highly recommend printing it out and reading it, as well as providing feedback and comments.

Claus has a couple of great posts, like this one on network forensics. As Claus mentions in his post (with respect to wirewatcher), network capture analysis is perhaps most powerful when used in conjunction with system analysis.

In addition, Claus also has a post about a mouse jiggler...comedic/lewd comments aside, I've been asked about something like this by LE in the past, so I thought I'd post a link to this one.

Finally (for Claus's site, not this post...), be sure to check out Claus's Security and Forensics Linkfest: Weekend Edition post, as he has a number of great gems linked in there. For example, there's a link to PlainSight, a recent update to Peter Nordahl-Hagen's tools, WinTaylor from Caine (great for CSI fans), as well as a tool from cqure.net to recover TightVNC passwords. There's more and I can't do it justice...go check it out.

Ken Pryor had a great post over on the SANS Forensic blog entitled I'm here! Now what? The post outlines places you can go for test images and data, to develop your skills. One site that I really like (and have used) is Lance's practicals (go to the blog, and search for "practical"), especially the first one, which includes some great examples of time manipulation and has provided excellent material for timeline analysis.

Do NOT go to Windows-ir.com

All,

I get a lot of comments about links to a domain I used to own called windows-ir.com. I no longer have anything to do with that domain, but I know that there are some older links on this blog that still point there.

Don't go there, folks. And please...no more comments about the site. I'm aware of it, but there's nothing I can do about the content there. If you're going in search of 4-yr-old content, feel free to do so, but please...no more comments about what's there.

Thanks.

Saturday, July 17, 2010

Exploit Artifacts redux

I posted yesterday, and included some discussion of exploit artifacts...the traces left by an exploit itself, prior to the secondary download. When a system is exploited, something happens...in many cases, that something is an arbitrary command being run and/or something being downloaded. However, the initial exploit itself will have artifacts...they may exist only in memory, but they will be there.

Let's look at an example...from the MMPC blog post on Stuxnet:
What is unique about Stuxnet is that it utilizes a new method of propagation. Specifically, it takes advantage of specially-crafted shortcut files (also known as .lnk files) placed on USB drives to automatically execute malware as soon as the .lnk file is read by the operating system. In other words, simply browsing to the removable media drive using an application that displays shortcut icons (like Windows Explorer) runs the malware without any additional user interaction. We anticipate other malware authors taking advantage of this technique.

So the question at this point is, what is unique about the .lnk files that causes this to happen? A while back, there was an Excel vulnerability, wherein if you opened a specially-crafted Excel document, the vulnerability would be exploited, and some arbitrary action would happen. I had folks telling me that there ARE NO artifacts to this, when, in fact, there has to be a specially-crafted Excel document on the system that gets opened by the user. Follow me? So if that's the case, if I search the system for all Excel documents, and then look for those that were created on the system near the time that the incident was first noticed...what would I be looking for in the document that makes it specially-crafted, and therefore different from your normal Excel document?
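
Just to make that reasoning concrete, here's a minimal sketch of that first step...walking a mounted image and listing Excel documents created within a window around the incident. The window dates below are purely hypothetical, and per perlport, the ctime slot returned by stat() holds the creation time on Win32:

#!/usr/bin/perl
# xls_hunt.pl - minimal sketch: walk a mounted image and list Excel documents
# whose creation time falls within a window around the reported incident.
# On Win32, Perl's stat() returns the creation time in the ctime slot (per
# perlport); the window boundaries here are hypothetical.
use strict;
use warnings;
use File::Find;
use Time::Local;

my $dir = shift or die "Usage: $0 <mounted image path>\n";

# Hypothetical window: the week leading up to July 15, 2010 (UTC).
my $end   = timegm(0, 0, 0, 15, 6, 2010);
my $start = $end - (7 * 24 * 60 * 60);

find(sub {
    return unless (-f $_ && /\.xlsx?$/i);
    my $ctime = (stat(_))[10];
    if ($ctime >= $start && $ctime <= $end) {
        print scalar gmtime($ctime), "  $File::Find::name\n";
    }
}, $dir);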

So, we know that with Stuxnet, we can look for the two files in an acquired image (mrxcls.sys, mrxnet.sys) for indications of this exploit. But what do we look for in the malicious LNK file?

Why is this important? Well, when an incident occurs, most organizations want to know how it happened, and what happened (i.e., what data was exposed or compromised...). This is part of performing a root cause analysis, which is something that needs to be done...if you don't know the root cause of an incident and can't address it, it's likely to happen again. If you assume that the initial exploit is email-borne, and you invest in applying AV technologies to your email gateway, what effect will that have if it was really browser-borne, or the result of someone bringing in an infected thumb drive?

The MMPC site does provide some information as to the IIV (Initial Infection Vector) for this issue:
In addition to these attack attempts, about 13% of the detections we’ve witnessed appear to be email exchange or downloads of sample files from hacker sites. Some of these detections have been picked up in packages that supposedly contain game cheats (judging by the name of the file).

Understanding the artifacts of an exploit can be extremely beneficial in determining and addressing the root cause of incidents. In the case of Stuxnet, this new exploit seems to be coupled with rootkit files that are signed with once-legitimate certificates. This coupling may be unique, but at the same time, we shouldn't assume that every time we find .sys files signed by RealTek, the system has fallen victim to Stuxnet. Much like vulnerability databases, we need to develop a means by which forensic analysts can more easily determine the root cause of an infection or compromise; this is something that isn't so easily done.

Addendum
Another thing that I forgot to mention...we see in the MMPC post that the files are referred to, but nothing is said about the persistence mechanism. Do we assume that these files just sit there, or do we assume that they're listed as device drivers under the Services key? What about the directory that they're installed in? Do they get loaded into a "Program Files\RealTek" directory, or to system32, or to Temp? All of this makes a huge difference when it comes to IR and forensic analysis, and can be very helpful in resolving issues. Unfortunately, AV folks don't think like responders...as such, I really think that there's a need for an upgrade to vulnerability and exploit analysis, so that someone who suspects that they're infected has the information most useful for responding accordingly.

Thursday, July 15, 2010

Thoughts and Comments

Exploit Artifacts
There was an interesting blog post on the MMPC site recently, regarding an increase in attacks against the Help and Support Center vulnerability. While the post talks about an increase in attacks (as seen through detection signatures), one thing I'm not seeing is any discussion of the artifacts of the exploit. I'm not talking about a secondary or tertiary download...those can change, and I don't want people to think, "hey, if you're infected with this malware, you were hit with the hcp:// exploit...". Look at it from a compartmentalized perspective...exploit A succeeds and leads to malware B being downloaded onto the system. If malware B can be anything...how do we go about determining the Initial Infection Vector? After all, isn't that what customers ask us? Okay, raise your hand if your typical answer is something like, "...we were unable to determine that...".

I spoke briefly to Troy Larson at the recent SANS Forensic Summit about this, and he expressed some interest in developing some kind of...something...to address this issue. I tend to think that there's a great benefit to this sort of thing, and that a number of folks, including LE, would benefit from it.

Awards
The Forensic 4Cast Awards were...uh...awarded during the recent SANS Forensic Summit, and it appears that WFA 2/e received the award for "Best Forensics Book". Thanks to everyone who voted for the book...I greatly appreciate it!

Summit Follow-up
Chris and Richard posted their thoughts on the recent SANS Forensic Summit. In his blog post, Richard said:

I heard Harlan Carvey say something like "we need to provide fewer Lego pieces and more buildings." Correct me if I misheard Harlan. I think his point was this: there is a tendency for speakers, especially technical thought and practice leaders like Harlan, to present material and expect the audience to take the next few logical steps to apply the lessons in practice. It's like "I found this in the registry! Q.E.D." I think as more people become involved in forensics and IR, we forever depart the realm of experts and enter more of a mass-market environment where more hand-holding is required?

Yes, Richard, you heard right...but please indulge me while I explain my thinking here...

In the space of a little more than a month, I attended four events...the TSK/Open Source conference, the FIRST conference, a seminar that I presented, and the SANS Summit. In each of these cases, while someone was speaking (myself included), I noticed a lot of blank stares. In fact, at the Summit, Carole Newell stated at one point during Troy Larson's presentation that she had no idea what he was talking about. I think that we all need to get a bit better at sharing information and making it easier for the right folks to get (understand) what's going on.

First, I don't think for an instant that, from a business perspective, the field of incident responders and analysts is saturated. There will always be more victims (i.e., folks needing help) than there are folks or organizations qualified and able to really assist them.

Second, one of the biggest issues I've seen during my time as a responder is that regardless of how many "experts" speak at conferences or directly to organizations, those organizations that do get hit/compromised are woefully unprepared. Hey, in addition to having smoke alarms, I keep fire extinguishers in specific areas of my house because I see the need for immediate, emergency response. I'm not going to wait for the fire department to arrive, because even with their immediate response, I can do something to contain the damage and losses. My point is that if we can get the folks who are on-site on board, maybe we'll have fewer intrusions and data breaches that come to light via third-party notification weeks after the fact, and more cases where some preserved data is actually available when we (responders) show up.

If we, as "experts", were to do a better job of bringing this stuff home and making it understandable, achievable and useful to others, maybe we'd see the needle move just a little bit in the good guy's favor when it comes to investigating intrusions and targeted malware. I think we'd get better responsiveness from the folks already on-site (the real _first_ responders) and ultimately be able to do a better job of addressing incidents and breaches overall.

Tools
As part of a recent forensic challenge, Wesley McGrew created pcapline.py to help answer the questions of the challenge. Rather than focusing on the tool itself, what I found interesting was that Wesley was faced with a problem/challenge, and chose to create something to help him solve it. And then provided it to others. This is a great example of the "I have a problem and created something to solve it...and others might see the same problem as well" approach to analysis.

Speaking of tools, Jesse is looking to update md5deep, and has posted some comments about the new design of the tool. More than anything else, I got a lot out of just reading the post and thinking about what he was saying. Admittedly, I haven't had to do much hashing in a while, but when I was doing PCI forensic assessments, this was a requirement. I remember looking at the list and thinking to myself that there had to be a better way to do this stuff...we'd get lists of file names, and lists of hashes...many times, separate lists. "Here, search for this hash..."...but nothing else. No file name, path or size, and no context whatsoever. Why does this matter? Well, there were also time constraints on how long you had before you had to get your report in, so anything that would intelligently speed up the "analysis" without sacrificing accuracy would be helpful.

I also think that Jesse is one of the few real innovators in the community, and has some pretty strong arguments (whether he's aware of it or not) for moving to the new format he mentioned as the default output for md5deep. It's much faster to check the size of the file first...like he says, if the file size is different, you're gonna get a different hash. As disk sizes increase, and our databases of hashes increase, we're going to have to find smart ways to conduct our searches and analysis, and I think that Jesse's got a couple listed right there in his post.
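
To illustrate the point, here's a minimal sketch of the "check the size first" idea...only compute a hash when the file's size matches something in your known-file set. The known-file list format used here (size,md5,path per line) is just an assumption for the example, not md5deep's actual output format:

#!/usr/bin/perl
# size_then_hash.pl - minimal sketch of the "check the size first" idea: only
# compute an MD5 when a file's size matches a size in the known-file set.
# The known-file list format (size,md5,path per line) is an assumption.
use strict;
use warnings;
use File::Find;
use Digest::MD5;

my ($known, $dir) = @ARGV;
die "Usage: $0 <known file list> <directory>\n" unless ($known && $dir);

# Build a lookup of known hashes keyed by file size.
my %known_by_size;
open(my $fh, '<', $known) or die "Cannot open $known: $!\n";
while (<$fh>) {
    chomp;
    my ($size, $md5, $path) = split(/,/, $_, 3);
    next unless (defined $md5);
    $known_by_size{$size}{lc $md5} = $path;
}
close($fh);

find(sub {
    return unless (-f $_);
    my $size = -s _;
    return unless (exists $known_by_size{$size});   # cheap check before hashing
    open(my $f, '<', $_) or return;
    binmode($f);
    my $md5 = Digest::MD5->new->addfile($f)->hexdigest;
    close($f);
    if (exists $known_by_size{$size}{$md5}) {
        print "$File::Find::name matches known file ", $known_by_size{$size}{$md5}, "\n";
    }
}, $dir);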

New Attack?
Many times when I've been working an engagement, the customer wants to know if they were specifically targeted...did the intruder specifically target them based on data and/or resources, or was the malware specifically designed for their infrastructure? When you think about it, these are valid concerns. Brian Krebs posted recently on an issue where that's already been determined to be the case...evidently, there's an issue with how Explorer on Windows 7 processes shortcut files, and the malware in question apparently targets specific SCADA systems.

At this point, I don't know which is more concerning...that someone knows enough about Siemens systems to write malware for them, or that they know that SCADA systems are now running Windows 7...or both?

When I read about this issue, my first thoughts went back to the Exploit Artifacts section above...what does this "look like" to a forensic examiner?

Hidden Files
Not new at all, but here's a good post from the AggressiveVirusDefense blog that provides a number of techniques that you can use to look for "hidden" files. Sometimes "hidden" really isn't...it's just a matter of perception.

DLL Search Order as a Persistence Mechanism
During the SANS Forensic Summit, Nick Harbour mentioned the use of MS's DLL Search Order as a persistence mechanism. Now he's got a very good post up on the M-unition blog...I won't bother trying to summarize it, as it won't do the post justice. Just check it out. It's an excellent read.

Friday, July 09, 2010

SANS Forensic Summit Take-Aways

I attended the SANS Forensic Summit yesterday...I won't be attending today due to meetings and work, but I wanted to provide some follow-up, thoughts, etc.

The day started off with the conference intro from Rob Lee, and then a keynote discussion from Chris Pogue of TrustWave and Major Carole Newell, Commander of the Headquarters Division of the Broken Arrow Police Dept. This was more of a discussion and less of a presentation, and focused on communications between private sector forensic consultants and (local) LE. Chris had volunteered to provide his services, pro bono, to the department; Major Newell took him up on his offer, and they both talked about how successful that relationship has been. After all, Chris's work has helped put bad people in jail...and that's the overall goal, isn't it? Private sector analysts supporting LE has been a topic of discussion in several venues, and it was heartening to hear Maj Newell chime in and provide her opinion on the subject, validating the belief that this is something that needs to happen.

There were a number of excellent presentations and panels during the day. During the Malware Reverse Engineering panel, Nick Harbour of Mandiant mentioned seeing the MS DLL Search Order being employed as a malware persistence mechanism. I got a lot from Troy Larson's and Jesse Kornblum's presentations, and sat next to Mike Murr while he tweeted using the #forensicsummit tag to keep folks apprised of the latest comments, happenings, and shenanigans.

Having presented and been on a panel, I had a great opportunity to share my thoughts and experiences and to get comments and feedback not only from other panelists, but also from the attendees.

One of the things I really like about this conference is the folks that it brings together. I got to reconnect with friends, and talk to respected peers that I hadn't seen in a while (Chris Pogue, Matt Shannon, Jesse Kornblum, Troy Larson, Richard Bejtlich), or had never met face-to-face (Dave Nardoni, Lee Whitfield, Mark McKinnon). This provides a great opportunity for sharing and discussing what we're all seeing out there, as well as just catching up. Also, like I said, it's great to discuss things with other folks in the industry...I think that a lot of times, if we're only engaging with specific individuals time and again, we tend to lose sight of certain aspects of what we do, and what it means to others...other responders, as well as customers.

If someone asked me to name one thing that I would recommend as a change to the conference, it would be the venue. While some folks live and/or work close to downtown DC and it's easy to get to the hotel where the conference is held, there are a number of locations west of DC that are easily accessible from Dulles Airport (and folks from Arlington and Alexandria will be going against traffic to get there).

Other than that, I think the biggest takeaways, for me, were:

1. We need to share better. I thought I was one of the few who thought this, but from seeing the tweets on the conference and talking to folks who are there, it's a pretty common thread. Sharing between LE and the private sector is a challenge, but as Maj Newell said, it's one that everyone (except the bad guys) benefits from.

2. When giving presentations, I need to spend less time talking about what's cool and spend more time on a Mission Guide (a la Matt Shannon) approach to the material. Throwing legos on the table and expecting every analyst to 'get it' and build the same structure is a waste of time...the best way to demonstrate the usefulness and value of a tool or technique is to demonstrate how it's used.

Thanks to Rob and SANS for putting on another great conference!

Follow-ups
Foremost on Windows (Cygwin build)

Wednesday, July 07, 2010

More Timeline Stuff

I'll be at the SANS Forensic Summit tomorrow, giving a presentation on Registry and Timeline Analysis in the morning, and then participating on a panel in the afternoon. Over all, it looks like this will be another excellent conference, due to the folks attending, their presentations, and opportunities for networking.

I talk (and blog) a lot about timelines, as this is a very powerful technique that I, and others, have found to be very useful. I've given presentations on the subject (including a seminar last week), written articles about it, and used the technique to great effect on a number of investigations. In many instances, this technique has allowed me to "see" things that would not normally be readily apparent through a commercial forensic analysis tool, nor via any other technique.

One of the aspects of Windows systems is that there is a wide range of data sources that provide time stamped events and indicators. I mean, the number of locations within a Windows system that provide this sort of information is simply incredible.

To meet my own needs, I've updated my toolkit to include a couple of additional tools. For one, I've created a script that directly parses the IE index.dat files, rather than going through a third-party tool (pasco, Web Historian, etc.). This just cuts down on the steps required, and the libmsiecf tools, mentioned in Cory's Going Commando presentation, do not appear to be readily available to run on Windows systems.

Parsing EVT files is relatively straightforward using tools such as evtparse.pl, and Andreas provides a set of Perl-based tools to parse EVTX (Event Logs from Vista and above) files. As an alternative, I wanted to write something that could easily parse the output of LogParser (free from MS), when run against EVT or EVTX files, using a command such as the following:

logparser -i:evt -o:csv "SELECT * FROM D:\Case\File\SysEvent.EVT" > output.csv

Keep in mind that LogParser uses the native API on the system to parse the EVT/EVTX files, so if you're going to parse EVTX files extracted from a Vista, Windows 2008, or Windows 7 system, you should do so on a Windows 7 system or VM. The output from the LogParser command is easily read and output to a TLN format, and the output from the script I wrote is identical to that of evtparse.pl. This can be very useful, as LogParser can be installed on and run from a DVD or thumb drive, and used in live IR (change "D:\Case\File\SysEvent.EVT" to "System" or "Application"), as well as run against files extracted from acquired images (or files accessible via a mounted image). However, because LogParser relies on that native API, if sysevent.evt won't open in the Event Viewer because it is reportedly "corrupted" (which has been reported for EVT files from XP and 2003), then evtparse.pl would be the preferable approach.
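
As a rough illustration of what that conversion looks like (this isn't the script I use, just the general idea), here's a sketch that turns LogParser CSV output into five-field TLN lines (time|source|host|user|description). It assumes the query selected a restricted field list...TimeGenerated, EventID, SourceName, ComputerName, SID...so that no field contains embedded commas, and that the time stamps come out in LogParser's default yyyy-MM-dd hh:mm:ss format, treated here as local system time:

#!/usr/bin/perl
# evtcsv2tln.pl - minimal sketch: convert LogParser CSV output into the
# five-field TLN format (time|source|host|user|description). Assumes LogParser
# was run with a restricted field list, e.g.:
#   logparser -i:evt -o:csv "SELECT TimeGenerated,EventID,SourceName,ComputerName,SID
#     FROM D:\Case\File\SysEvent.EVT" > output.csv
# so that no field contains embedded commas.
use strict;
use warnings;
use Time::Local;

my $file = shift or die "Usage: $0 <logparser csv output>\n";
open(my $fh, '<', $file) or die "Cannot open $file: $!\n";
my $hdr = <$fh>;    # skip the header row
while (my $line = <$fh>) {
    chomp($line);
    my ($ts, $id, $src, $computer, $sid) = split(/,/, $line);
    next unless (defined $ts && $ts =~ /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})/);
    my $epoch = timelocal($6, $5, $4, $3, $2 - 1, $1);   # treated as local system time
    print join("|", $epoch, "EVT", $computer, $sid, "$src/$id"), "\n";
}
close($fh);

The actual field list and time handling will depend on how you run LogParser against the file, so treat this as a starting point rather than a finished tool.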

The next tool I'm considering working on is one to parse the MFT and extract the time stamps from the $FILE_NAME attribute into TLN format. This would undoubtedly provide some insight into the truth about what happened on a system, particularly where some sort of timestomping activity has occurred (a la Clampi). This will take some work, as the full paths need to be reassembled, but it should be useful nonetheless.

Tuesday, July 06, 2010

Links

Malware in PDFs
As a responder and forensic analyst, one of the things I'm usually very interested in (in part, because customers want to know...) is determining how some malware (or someone) was first able to get on a system, or into an infrastructure...what was the Initial Infection Vector? I've posted about this before, and the SANS ISC had an interesting post yesterday, as well, regarding malware in PDF files. This is but one IIV, but it's one worth understanding.

Does this really matter beyond simply determining the IIV for malware or an intrusion? I'd say...yes, it does. But why is that? Well, consider this...this likely started small, with someone getting into the infrastructure, and then progressed from there.

PDF files are one way in...Brian Krebs pointed out another IIV recently, which apparently uses the hcp:// protocol to take advantage of an issue in the HextoNum function and allow an attacker to run arbitrary commands. MS's solution/workaround for the time being is to simply delete a Registry key. More information on exploiting the vulnerability can be seen here (the fact that the vulnerability is actively being exploited is mentioned here)...this is a very interesting read, and I would be interested to see what artifacts there may be to the use of an exploit as described in the post. Don's mentioned other artifacts associated with exploiting compiled HTML Help (CHM) files, in particular how CHM functionality can be turned into a malware dropper. But this is a bit different, so I'd be interested to see what analysts may be able to find out.

Also, if anyone knows of a tool or process for parsing hh.dat files, please let me know.

Free Tools
For those interested, here's a list of free forensic tools at ForensicControl.com. I've seen where folks have looked for this sort of thing, and the disadvantage of having lists like this out there is that...well...they're out there, and not in one centralized location. I know some folks have really liked the list of network security tools posted at InSecure.org, and it doesn't take much to create something like that at other sites. For example, consider posting something on the ForensicsWiki.

Speaking of tools, Claus has a great post from the 4th that mentions some updates to various tools, including ImDisk, Network Monitor, and some nice remote control utilities. If you're analyzing Windows 2008 or Windows 7 systems, you might want to take a look at AppCrashView from Nirsoft...I've been able to find a good deal of corroborating data in Dr. Watson logs on Windows XP/2003 systems, and this looks like it might be just as useful, if not more so.

Shadow Analyzer
There's been some press lately about a tool called "Shadow Analyzer", developed by Lee Whitfield and Mark McKinnon, which is to be used to access files in Volume Shadow Copies. I also see that this has been talked about on the CyberCrime101 podcast...should be a good listen!

On that note, ShadowExplorer is at version 0.7.

Parsing NTFS Journal Files
Seth recently posted a Python script for parsing NTFS Journal Transaction Log files (ie, $USNJRNL:$J files). I don't know about others, but I've been doing a lot of parsing of NTFS-related files, whether it's the MFT itself, or running $LogFile through BinText.

I'm sure that one of the things that would help folks adopt tools like this, particularly those that require Python or Perl to be installed, is an explanation or examples of how the information can be useful to an examiner/analyst. So, if folks do find that these tools are useful, post something that lets others know why/how you used them, and what you found that supported your examination goals.

Saturday, July 03, 2010

Skillz Follow-up

Based on some of the events of the week, and in light of a follow-up post from Eric, I wanted to follow-up on my last blog post.

Earlier this week, I spent an entire day talking with a great group of folks on the other side of the country...a whole day dedicated to just talking about forensics! Some of what we talked about was malware (specifically, 'bot) related, and as part of that, we (of course) talked about some of the issues related to malware characteristics and timeline analysis. One of the aspects that ties all of these topics together is timestomping...when some malware gets installed, the file times (more appropriately, the time stamps in the $STANDARD_INFORMATION attribute within the MFT) are purposely modified. MS provides open APIs (GetFileTime/SetFileTime) that allow for this, and in some cases, the file times for the malware are copied from a legitimate system file.

So, tying this back to "skillz"...in my previous post, I'd mentioned modifying my own code to extract the information I wanted from the MFT. Interestingly, Eric's post addressed the issue of having the information available, and I tend to agree with the first comment to his post...too many times, some GUI analysis tools get over-crowded and there's just too much stuff around to really make sense of things. Rather than having a commercial analysis app into which I can load an image and have it tell me everything that's wrong, I tend to rely on the goals for the analysis that I work out with the customer...even if it means that I would use separate tools. I don't always need information from the MFT...but sometimes I might. So do I want to pay for a commercial application that's going to attempt to keep up with all of the possible wrong stuff that's out there, or can I use a core set of open-source tools that allow me to get the information I need, when I need it, because I know what I'm looking for?

So, what does the output of my code look like? Check out the image...does this make sense to folks? How about to folks familiar with the MFT? Sure, it makes sense to me...and because the code is open source, I can open the Perl script in an editor or Notepad and see what some of the information means. For example, on the first line, we see:

132315 FILE 2 1 0x38 4 1

What does that mean? The first number is the count, or number of the MFT record. The word "FILE" is the signature...the other possible entry is "BAAD". The number "2" is the sequence number, and the "1" is the link count. Where did I get this? From the code, and its documentation. I can add further comments to the code, if need be, that describe what various pieces of information mean or refer to, or I can modify the output so that everything's explained right there in the output.
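
For those curious where those values come from, here's a minimal sketch (not the script that produced the output above) that reads a single record from an extracted $MFT file and unpacks the header fields. The offsets are from the published FILE record header layout, and I'm assuming the 0x38 in the sample line is the offset to the first attribute:

#!/usr/bin/perl
# mft_hdr.pl - minimal sketch: read one 1024-byte record from an extracted
# $MFT file and unpack the header fields shown in the sample output line.
# This is an illustration of the FILE record header layout, not my script.
use strict;
use warnings;

my ($mft, $recnum) = @ARGV;
die "Usage: $0 <MFT file> <record number>\n" unless (defined $mft && defined $recnum);

open(my $fh, '<', $mft) or die "Cannot open $mft: $!\n";
binmode($fh);
seek($fh, $recnum * 1024, 0);           # records are assumed to be 1024 bytes
read($fh, my $record, 1024) == 1024 or die "Short read at record $recnum\n";
close($fh);

my ($sig, $usa_ofs, $usa_count, $lsn, $seq, $link_count, $attr_ofs, $flags) =
    unpack("a4vva8vvvv", substr($record, 0, 24));

# $sig        -> "FILE" or "BAAD"
# $seq        -> sequence number (the "2" in the sample line)
# $link_count -> hard link count (the "1")
# $attr_ofs   -> offset to the first attribute (presumably the "0x38")
# $flags      -> 0x0001 = record in use, 0x0002 = directory
printf "%d %s %d %d 0x%x flags=0x%x\n",
    $recnum, $sig, $seq, $link_count, $attr_ofs, $flags;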

Or, because this stuff is open-source, another option is to just move everything to CSV output, so that it can be opened up as a spreadsheet.

Again, because this is open-source, another option is to add the offset within the MFT where the entry is found...though that's not really necessary, as the offset can be easily computed from the record number, and may even be intuitively obvious to folks who understand the format of the MFT.

Now, back to the image. The "0x0010" attribute is the $STANDARD_INFORMATION attribute, and the "0x0030" attribute is the $FILE_NAME attribute. Note the differences in the time stamps. Yet another option available...again, because this is open-source code...is to convert the output, or just the $FILE_NAME attribute information, to the five-field TLN format.
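
As a quick illustration of that last option, the conversion boils down to turning a 64-bit FILETIME into Unix epoch time and wrapping it in the five TLN fields...a minimal sketch, assuming a 64-bit Perl build for the arithmetic:

#!/usr/bin/perl
# fn2tln.pl - minimal sketch: convert a 64-bit FILETIME (as found in the
# $FILE_NAME attribute) to Unix epoch time and emit a five-field TLN line
# (time|source|host|user|description). The host, user and source strings here
# are just placeholders; in practice they would come from the case data.
use strict;
use warnings;

my ($ft, $host, $user, $descr) = @ARGV;
die "Usage: $0 <64-bit FILETIME value> <host> <user> <description>\n"
    unless (defined $ft && defined $descr);

# FILETIME counts 100-nanosecond intervals since Jan 1, 1601; Unix epoch time
# counts seconds since Jan 1, 1970. The difference is 11644473600 seconds.
my $epoch = int($ft / 10_000_000) - 11644473600;
$epoch = 0 if ($epoch < 0);     # a zeroed FILETIME stays at (or before) the epoch

print join("|", $epoch, "MFT_FN", $host, $user, $descr), "\n";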

So, one way to approach the issue of analysis is to say, hey, I paid for this GUI application and it should include X, Y, and Z in the output. As you can imagine, after a while, you're going to have one crowded UI, or you're going to have so many layers that you're going to lose track of where everything is located, or how to access it. Another way to approach your analysis is to start with your goals, and go from there...identify what you need, and go get it. Does this mean that you have to be a programmer? Not at all. It just means that you have to have a personal or professional network of friends in the industry...a network that you contribute to and can go to for information, etc.