
Saturday, January 23, 2016

Analysis

A bit ago, I posted about doing analysis, and that post didn't really seem to get much traction at all.  What was I trying for?  To start a conversation about how we _do_ analysis.  When we make statements to a client or to another analyst, on what are we basing those findings?  Somewhere between the raw data and our findings is where we _do_ analysis; I know what that looks like for me, and I've shared it (in this blog, in my books, etc.), and what I've wanted to do for some time is go beyond the passivity of sitting in a classroom, and start a conversation where analysts engage and discuss analysis.

I have to wonder...is this even possible?  Will analysts talk about what they do?  For me, I'm more than happy to.  But will this spark a conversation?

I thought I'd try a different tack this time around.  In a recent blog post, I mentioned that two Prefetch parsers had recently been released.  While it is interesting to see these tools being made available, I have to ask...how are analysts using these tools?  How are analysts using these tools to conduct analysis, and achieve the results that they're sharing with their clients?

Don't get me wrong...I think having tools is a wonderful idea.  We all have our favorite tools that we tend to gravitate toward or reach for under different circumstances.  Whether it's commercial or free/open source tools, it doesn't really matter.  Whether you're using a dongle or a Linux distro...it doesn't matter.  What does matter is, how are you using it, and how are you interpreting the data?

Someone told me recently, "...I know you have an issue with EnCase...", and to be honest, that's simply not the case.  I don't have an issue with EnCase at all, nor with FTK.  I do have an issue with how those tools are used by analysts, and the issue extends to any other tool that is run auto-magically and expected to spit out true results with little to no analysis.

What do the tools really do for us?  Well, basically, most tools parse data of some sort, and display it.  It's then up to us, as analysts, to analyze and interpret that data...either within the context of that and other data, or by providing additional context, incorporating additional data from the same source, or data from external sources.

RegRipper is a great example.  The idea behind RegRipper (as well as the other tools I've written) is to parse and display data for analysis...that's it.  RegRipper started as a bunch of scripts I had sitting around...every time I'd work on a system and have to dig through the Registry to find something, I'd write a script to do the actual work for me.  In some cases, a script was simply to follow a key path (or several key paths) that I didn't want to have to memorize. In other cases, I'd write a script to handle ROT-13 decoding or binary parsing; I figured, rather than having to do all of that again, I'd write  a script to automate it.

For a while, that's all RegRipper did...parse and display data.  If you had key words you wanted to "pivot" on, you could do so with just about any text editor, but that's still a lot of data.  So then I started adding "alerts"; I'd have the script (or tool) do some basic searching to look for things that were known to be "bad", in particular, file paths in specific locations.  For example, an .exe file in the root of the user profile, or in the root of the Recycle Bin, is a very bad thing, so I wanted those to pop out and be put right in front of the analyst.  I found...and still find...this to be an incredibly useful functionality.
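Just to illustrate the sort of logic I'm talking about...this isn't RegRipper's actual code (RegRipper is written in Perl), and the path patterns below are simply examples...a rough sketch in Python might look something like this:

import re

# Example "suspicious path" patterns, loosely modeled on the alerts described
# above; the patterns themselves are illustrative, not RegRipper's actual list.
ALERT_PATTERNS = [
    re.compile(r"^[a-z]:\\users\\[^\\]+\\[^\\]+\.exe$", re.I),    # .exe in the root of a user profile
    re.compile(r"^[a-z]:\\\$?recycle\.bin\\[^\\]+\.exe$", re.I),  # .exe in the root of the Recycle Bin
    re.compile(r"^[a-z]:\\windows\\temp\\[^\\]+\.exe$", re.I),    # .exe in C:\Windows\Temp
]

def check_path(path):
    """Return True if the path matches a known-suspicious location."""
    return any(p.match(path) for p in ALERT_PATTERNS)

if __name__ == "__main__":
    for p in [r"C:\Users\bob\svchost.exe", r"C:\Windows\system32\svchost.exe"]:
        tag = "ALERT" if check_path(p) else "ok"
        print(f"[{tag}] {p}")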

Here's an example of what I'm talking about with respect to analysis...I ran across this forensics challenge walk-through recently, and just for sh*ts and grins, I downloaded the Registry hive (NTUSER.DAT, Software, System) files.  I ran the appcompatcache.pl RegRipper plugin against the system hive, and found the following "interesting" entries within the AppCompatCache value:

C:\dllhot.exe  Tue Apr  3 18:08:50 2012 Z  Executed
C:\Windows\TEMP\a.exe  Tue Apr  3 23:54:46 2012 Z  Executed
c:\windows\system32\dllhost\svchost.exe  Tue Apr  3 22:40:25 2012 Z  Executed
C:\windows\system32\hydrakatz.exe  Wed Apr  4 01:00:45 2012 Z  Executed
C:\Windows\system32\icacls.exe  Tue Jul 14 01:14:21 2009 Z  Executed

Now, the question is, for each of those entries, what do they mean?  Do they mean that the .exe file was "executed" on the date and time listed?

No, that's not what the entries mean at all.  Check out Mandiant's white paper on the subject.  You can verify what they're saying in the whitepaper by creating a timeline from the shim cache data and file system metadata (just the $MFT will suffice); if the files that had been executed were not deleted from the system, you'll see that the time stamp included in the shim cache data is, in fact, the last modification time from the file system (specifically, the $STANDARD_INFORMATION attribute) metadata.
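If you want to see this for yourself, a minimal sketch of that correlation might look like the following...I'm assuming here that you've already exported the shim cache entries and the $MFT metadata to simple CSV files; the file names and column names are made up for the example:

import csv
from datetime import datetime, timezone

# Hypothetical inputs: shimcache.csv (path,timestamp) exported from a shim cache
# parser, and mft.csv (path,si_modified) exported from an MFT parser.  The column
# names and timestamp format are assumptions for this sketch.
FMT = "%Y-%m-%d %H:%M:%S"

def load(path, time_col):
    rows = {}
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):
            rows[row["path"].lower()] = datetime.strptime(row[time_col], FMT).replace(tzinfo=timezone.utc)
    return rows

shim = load("shimcache.csv", "timestamp")
mft = load("mft.csv", "si_modified")

# If the file still exists on disk, the shim cache timestamp should line up with
# the $STANDARD_INFORMATION last-modified time, not with a time of execution.
for path, ts in shim.items():
    if path in mft:
        delta = abs((ts - mft[path]).total_seconds())
        status = "matches $SI modified" if delta < 2 else f"differs by {delta:.0f}s"
        print(f"{path}: shim cache {ts:%Y-%m-%d %H:%M:%S}Z {status}")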

I use this as an example, simply because it's something that I see a great deal of; in fact, I recently experienced a "tale of two analysts", where I reviewed work that had previously been conducted by two separate analysts.  The first analyst did not parse the Shim Cache data, and the second parsed it, but assumed that the data meant that the .exe files of interest had been executed at the time displayed alongside the entry.

Again, this is just an example, and not meant to focus the spotlight on anyone.  I've talked with a number of analysts, and in just about every conversation, they've either known someone who's made the same mistake misinterpreting the Shim Cache data, or they've admitted to misinterpreting it themselves.  I get it; no one's perfect, and we all make mistakes.  I chose this one as an example, because it's perhaps one of the most misinterpreted data sources.  A lot of analysts who have attended (or conducted) expensive training courses have made this mistake.

Pointing out mistakes isn't the point I'm trying to make...it's that we, as a community, need to engage in a community-wide conversation about analysis.  What resources do we have available now, and what do we need?  We can't all attend training courses, and when we do, what happens most often is that we learn something cool, and then don't see it again for 6 months or a year, and we forget the nuances of that particular analysis.  Dedicated resources are great, but they (forums, emails, documents) need to be searched.  What about just-in-time resources, like asking a question?  Would that help?


Wednesday, January 20, 2016

Resources, Link Mashup

Monitoring
MS's Sysmon was recently updated to version 3.2, with the addition of capturing opens for raw read access to disks and volumes.  If you're interested in monitoring your infrastructure and performing threat hunting at all, I'd highly recommend that you consider installing something like this on your systems.  While Sysmon is not nearly as fully-featured as something like Carbon Black, employing Sysmon along with centralized log collection and filtering will provide you with a level of visibility that you likely hadn't even imagined was possible previously.

This page talks about using Sysmon and NXLog.

The fine analysts of the Dell SecureWorks CTU-SO recently had an article posted  that describes what the bad guys like to do with Windows Event Logs, and both of the case studies could be "caught" with the right instrumentation in place.  You can also use process creation monitoring (via Sysmon, or some other means) to detect when an intruder is living off the land within your environment.

The key to effective monitoring and subsequent threat hunting is visibility, which is achieved through telemetry and instrumentation.  How are bad guys able to persist within an infrastructure for a year or more without being detected?  It's not that they aren't doing stuff, it's that they're doing stuff that isn't detected due to a lack of visibility.

MS KB article 3004375 outlines how to improve Windows command-line auditing, and this post from LogRhythm discusses how to enable Powershell command line logging (another post discussing the same thing is here).  The MS KB article gives you some basic information regarding process creation, and Sysmon provides much more insight.  Regardless of which option you choose, however, all are useless unless you're doing some sort of centralized log collection and filtering, so be sure to incorporate the necessary and appropriate logs into your SIEM, and get those filters written.
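As far as what those filters might look like, here's a very small sketch...not tied to any particular SIEM...that assumes you've exported Sysmon process creation events (event ID 1) as JSON lines with "EventID", "CommandLine", "Image", and "UtcTime" fields; that input format is an assumption for the example:

import json
import re

# Flag process-creation events whose command lines suggest "living off the land"
# activity.  The patterns are illustrative examples, not a complete rule set.
SUSPICIOUS = [
    re.compile(r"\bpowershell\b.*-enc", re.I),                   # encoded PowerShell
    re.compile(r"\bwmic\b.*process\s+call\s+create", re.I),      # remote/local process creation via WMI
    re.compile(r"\bnet\s+use\b.*\\\\", re.I),                    # mapping remote shares
]

def scan(path):
    with open(path) as fh:
        for line in fh:
            evt = json.loads(line)
            if evt.get("EventID") != 1:          # Sysmon process creation
                continue
            cmd = evt.get("CommandLine", "")
            if any(p.search(cmd) for p in SUSPICIOUS):
                print(f'{evt.get("UtcTime", "?")}  {evt.get("Image", "?")}  {cmd}')

if __name__ == "__main__":
    scan("sysmon_events.jsonl")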

Windows Event Logs
Speaking of Windows Event Logs, sometimes it can be very difficult to find information regarding various event source/ID pairs.  Microsoft has a great deal of information available regarding Windows Event Log records, and I very often can easily find the pages with a quick Google search.  For example, I recently found this page on Firewall Rule Processing events, based on a question I saw in an online forum.

From Deus Ex Machina, you can look up a wide range of Windows Event Log records here or here.  I've found both to be very useful.  I've used this site more than once to get information about *.evtx records that I couldn't find any place else.

Another source of information about Windows Event Log records and how they can be used can often be one of the TechNet blogs.  For example, here's a really good blog post from Jessica Payne regarding tracking lateral movement...

With respect to the Windows Event Logs, I've been looking at ways to increase instrumentation on Windows systems, and something I would recommend is putting triggers in place for various activities, and writing a record to the Windows Event Log.  I found this blog post recently that discusses using PowerShell to write to the Windows Event Log, so whatever you trap or trigger on a system can launch the appropriate command or run a batch file that contains the command.  Of course, in a networked environment, I'd highly recommend a SIEM be set up, as well.
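The post I linked to uses PowerShell's Write-EventLog cmdlet; as a rough alternative, whatever your trigger launches could simply shell out to the built-in eventcreate.exe utility.  The source name, event ID, and message below are arbitrary examples:

import subprocess

def log_trigger(message, event_id=999, source="DFIR-Trigger"):
    """Write a custom record to the Application Event Log via the built-in
    eventcreate.exe utility (Windows only)."""
    subprocess.run(
        ["eventcreate", "/T", "INFORMATION", "/L", "APPLICATION",
         "/SO", source, "/ID", str(event_id), "/D", message],
        check=True,
    )

if __name__ == "__main__":
    # Whatever you trap or trigger on the system can call something like this,
    # so the activity shows up in centralized log collection.
    log_trigger(r"Trigger fired: raw access to \\.\PhysicalDrive0 observed")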

One thought regarding filtering and analyzing Windows Event Log records sent to a SIEM...when looking at various Windows Event Log records, we have to look at them in the context of the system, rather than in isolation, as what they actually refer to can be very different.  A suspicious record related to WMI, for example, when viewed in isolation may end up being part of known and documented activity when viewed in the context of the system.

Analysis
PoorBillionaire recently released a Windows Prefetch Parser, which is reportedly capable of handling *.pf files from XP systems all the way up through Windows 10 systems.  On 19 Jan, Eric Zimmerman did the same, making his own Prefetch parser available.

Having tools available is great, but what we really need to do is talk about how those tools can be used most effectively as part of our analysis.  There's no single correct way to use the tool, but the issue becomes, how do you correctly interpret the data once you have it?

I recently encountered a "tale of two analysts", where both had access to the same data.  One analyst did not parse the ShimCache data at all as part of their analysis, while the other did and misinterpreted the information that the tool (whichever one that was) displayed for them.

So, my point is that having tools to parse data is great, but if the focus is tools and parsing data, but not analyzing and correctly interpreting the data, what have the tools really gotten us?

Creating a Timeline
I was browsing around recently and ran across an older blog post (yeah, I know it's like 18 months old...), and in the very beginning of that post, something caught my eye.  Specifically, a couple of quotes from the blog post:

...my reasons for carrying this out after the filesystem timeline is purely down to the time it takes to process.

...and...

The problem with it though is the sheer amount of information it can contain! It is very important when working with a super timeline to have a pivot point to allow you to narrow down the time frame you are interested in.

The post also states that timeline analysis is an extremely powerful tool, and I agree, 100%.  What I would offer to analysts is a more deliberate approach to timeline analysis, based on what Chris Pogue coined as Sniper Forensics.
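To make the "pivot point" idea a bit more concrete, here's a minimal sketch that windows a TLN-format timeline (epoch time|source|system|user|description) around a pivot time; the pivot value itself is just a placeholder:

import sys
from datetime import datetime, timezone

def window_tln(path, pivot, minutes=30):
    """Print TLN-format events that fall within +/- `minutes` of the pivot time."""
    lo = pivot - minutes * 60
    hi = pivot + minutes * 60
    with open(path) as fh:
        for line in fh:
            fields = line.rstrip("\n").split("|")
            try:
                ts = int(fields[0])
            except (ValueError, IndexError):
                continue
            if lo <= ts <= hi:
                stamp = datetime.fromtimestamp(ts, tz=timezone.utc)
                print(f"{stamp:%Y-%m-%d %H:%M:%S}Z  {'|'.join(fields[1:])}")

if __name__ == "__main__":
    # Placeholder pivot time; in practice this would come from an alert, a shim
    # cache entry, an AV log entry, etc.
    pivot = int(datetime(2014, 6, 1, 12, 0, 0, tzinfo=timezone.utc).timestamp())
    window_tln(sys.argv[1] if len(sys.argv) > 1 else "events.txt", pivot)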

Speaking of analysis, the folks at RSA released a really good look at analyzing carrier files used during a phish.  The post provides a pretty thorough walk-through of the tool and techniques used to parse through an old (or should I say, "OLE") style MS Word document to identify and analyze embedded macros.

Powershell
Not long ago, I ran across an interesting artifact...a folder with the following name:

C:\Users\user\AppData\Local\Microsoft\Windows\PowerShell\CommandAnalysis\

The folder contained an index file, and a bunch of files with names that follow the format "PowerShell_AnalysisCacheEntry_GUID".  Doing some research into this, I ran across this BoyWonder blog post, which seems to indicate that this is a cache (yeah, okay, that's in the name, I get it...), and possibly used for functionality similar to auto-complete.  It doesn't appear to illustrate what was run, though.  For that, you might want to see the LogRhythm link earlier in this post.

As it turned out, the folder path I listed above was part of legitimate activity performed by an administrator.
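For triage purposes, a quick way to scope when that cache was populated (keeping in mind that, as noted above, it doesn't show what was run) is to simply list the entries and their file system timestamps; a sketch, with the path from this case hard-coded:

import glob
import os
from datetime import datetime, timezone

# Adjust the user profile/drive letter to the image or system you're examining.
cache_dir = r"C:\Users\user\AppData\Local\Microsoft\Windows\PowerShell\CommandAnalysis"

for entry in sorted(glob.glob(os.path.join(cache_dir, "PowerShell_AnalysisCacheEntry_*"))):
    st = os.stat(entry)
    mtime = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
    print(f"{mtime:%Y-%m-%d %H:%M:%S}Z  {st.st_size:>8}  {os.path.basename(entry)}")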


Sunday, January 12, 2014

Malware RE - IR Disconnect

Not long ago, I'd conducted some analysis that I had found to be...well, pretty fascinating...and shared some of the various aspects of the analysis that were most fruitful.  In particular, I wanted to share how various tools had been used to achieve the findings and complete the analysis.

Part of that analysis involved malware known as PlugX, and as such, a tweet that pointed to this blog post recently caught my attention.  While the blog post, as well as some of the links in the post, contains some pretty fascinating information, I found that in some ways, it illustrates a disconnect between the DFIR and malware RE analysis communities.
Caveat
I've noticed this disconnect for quite some time, going back as far as at least this post...however, I'm also fully aware that AV companies are not in the business of making the job of DFIR analysts any easier.  They have their own business model, and even if they actually do run malware (i.e., perform dynamic analysis), there is no benefit to them (the AV companies) if they engage in the detailed analysis of host-based artifacts.  The simple fact and the inescapable truth is that an AV vendor's goals are different from those of a DFIR analyst.  The AV vendor wants to roll out an updated .dat file across the enterprise in order to detect and remove all instances of the malware, whereas a DFIR analyst is usually tasked with answering such questions as "...when did the malware first infect the system/infrastructure?", "...how did it get in?", and "...what data was taken?"

These are very different questions that need to be addressed, and as such, have very different models for the businesses/services that address them.  This is not unlike the differences between the PCI assessors and the PCI forensic analysts.

Specifically, what some folks on one side find to be valuable and interesting may not be useful to folks on the other side.  As such, what's left is two incomplete pictures of the overall threat to the customer, with little (if any) overlap between them.  In the end, both sides are left with an incomplete view of what happened, and the customer...the one with questions that need to be answered...isn't provided the value that could potentially be there.

I'd like to use the Cassidian blog post as an example and walk through what I, as a host-based analysis guy, see as some of the disconnects.  I'm not doing this to highlight the post and say that something was done wrong or incorrectly...not at all.  In fact, I greatly appreciate the information that was provided; however, I think that we can all agree that there are disconnects between the various infosec sub-communities, and my goal here is to see if we can't get folks from the RE and IR communities to come together just a bit more.  So what I'll do is discuss/address the content from some of the sections of the Cassidian post.

Evolution
Seeing the evolution of malware, in general, is pretty fascinating, but to be honest, it really doesn't help DFIR analysts understand the malware, to the point where it helps them locate it on systems and answer the questions that the customer may have.  However, again...it is useful information and is part of the overall intelligence picture that can be developed of the malware, its use, and possibly even lead to (along with other information) attribution.

Network Communications
Whenever an analyst identifies network traffic, that information is valuable to SOC analysts and folks looking at network traffic.  However, if you're doing DFIR work, many times you're handed a hard drive or an image and asked to locate the malware.  As such, whenever I see a malware RE analyst give specifics regarding network traffic, particularly HTTP requests, I immediately want to know which API was used by the malware to send that traffic.  I want to know this because it helps me understand what artifacts I can look for within the image.  If the malware uses the WinInet API, I know to look in index.dat files (for IE versions 5 through 9), and depending upon how soon after some network communications I'm able to obtain an image of the system, I may be able to find some server responses in the pagefile.  If raw sockets are used, then I'd need to look for different artifacts.
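As a simple example of what that kind of artifact hunting can look like, here's a sketch that scans a pagefile (or any raw data) for HTTP response headers that a WinInet-based client may have left behind in memory; it doesn't tell you which process made the request, only that response data was present, so interpretation still falls to the analyst:

import re
import sys

RESPONSE = re.compile(rb"HTTP/1\.[01] \d{3} [^\r\n]{0,64}")

def scan(path, chunk_size=64 * 1024 * 1024):
    # Read in large chunks; a match split across a chunk boundary could be
    # missed, which is acceptable for a quick triage sweep like this.
    with open(path, "rb") as fh:
        offset = 0
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            for m in RESPONSE.finditer(chunk):
                print(f"offset {offset + m.start():#x}: {m.group().decode('ascii', 'replace')}")
            offset += len(chunk)

if __name__ == "__main__":
    scan(sys.argv[1] if len(sys.argv) > 1 else "pagefile.sys")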

Where network communications data has proven to be very useful in host-based analysis is during memory analysis, such as locating open network connections in a memory capture or hibernation file.  Also, sharing information between malware RE and DFIR analysts has really pushed an examination to new levels, as in the case where I was looking at an instance where Win32/Crimea had been used by a bad guy.  That case, in particular, illustrated to me how things could have taken longer or possibly even been missed had the malware RE analyst or I worked in isolation, whereas working together and sharing information provided a much better view of what had happened.

Configuration
The information described in the post is pretty fascinating, and can be used by analysts to determine or confirm other findings; for example, given the timetable, this might line up with something seen in network or proxy logs.  There's enough information in the blog post that would allow an accomplished programmer to write a parser...if there were some detailed information about where the blob (as described in the post) was located.

Persistence
The blog post describes a data structure used to identify the persistence mechanism of the malware; in this case, that can be very valuable information, specifically if the malware creates a Windows service for persistence.  This tells me where to look for artifacts of the malware, and even gives me a means for determining specific artifacts in order to nail down when the malware was first introduced on to the system.  For example, if the malware uses the WinInet API (as mentioned above), that would tell me where to look for the index.dat file, based on the version of Windows I'm examining.

Also, as the malware uses a Windows service for persistence, I know where to look for other artifacts associated (Registry keys, Windows Event Log records, etc.) with the malware, again, based on the version of Windows I'm examining.
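For example, a quick pass over the Services key in an exported System hive...using the python-registry module here, though a RegRipper plugin will get you much the same data...can help surface a service created during the time frame of interest.  Note that this sketch just uses ControlSet001 rather than checking the Select key for the current control set:

from Registry import Registry  # python-registry (pip install python-registry)

# "SYSTEM" here is a System hive exported from the image being examined.
reg = Registry.Registry("SYSTEM")
services = reg.open("ControlSet001\\Services")

# Print each service subkey's LastWrite time, name, and ImagePath, so that a
# service created around the time frame of interest stands out.
for svc in services.subkeys():
    image_path = ""
    for val in svc.values():
        if val.name().lower() == "imagepath":
            image_path = val.value()
            break
    print(f"{svc.timestamp().isoformat()}  {svc.name():30}  {image_path}")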

Unused Strings
In this case, the authors found two unused strings, set to "1234", in the malware configuration.  I had seen a sample where that string was used as a file name.

Other Artifacts
The blog post makes little mention of other (specifically, host-based) artifacts associated with the malware; however, this resource describes a Registry key created as part of the malware installation, and in an instance I'd seen, the LastWrite time for that key corresponded to the first time the malware was run on the system.

In the case of the Cassidian post, it would be interesting to hear if the FAST key was found in the Registry; if so, this might be good validation, and if not, this might indicate either a past version of the malware, or a branch taken by another author.

Something else that I saw that really helped me nail down the first time that the malware was executed on the system was the existence of a subkey beneath the Tracing key in the Software hive.  This was pretty fascinating and allowed me to correlate multiple artifacts in order to develop a greater level of confidence in what I was seeing.

Not specifically related to the Cassidian blog post, I've seen tweets that talk about the use of Windows shortcut/LNK files in a user's Startup folder as a persistence mechanism.  This may not be particularly interesting to an RE analyst, but for someone like me, that's pretty fascinating, particularly if the LNK file does not contain a LinkInfo block.

Once again, my goal here is not to suggest that the Cassidian folks have done anything wrong...not at all.  The information in their post is pretty interesting.  Rather, what I wanted to do is see if we, as a community, can't agree that there is a disconnect, and then begin working together more closely.  I've worked with a number of RE analysts, and each time, I've found that in doing so, the resulting analysis is more complete, more thorough, and provides more value to the customer.  Further, future analysis is also more complete and thorough, in less time, and when dealing with sophisticated threat actors, time is of the essence.

Wednesday, December 18, 2013

Shellbags

Dan recently posted what has to be one of the most thorough/comprehensive blog articles regarding the Windows shellbags artifacts.  His post specifically focuses on shellbag artifacts from Windows 7, but the value of what he wrote goes far beyond just those artifacts.

Dan states at the very beginning of his post that his intention is to not focus on the structures themselves, but instead address the practical interpretation of the data itself.  He does so, in part through thorough testing, as well as illustrating the output of two tools (one of which is commonly used and endorsed in various training courses) side-by-side.

Okay, so Dan has this blog post...so why am I blogging about his blog post?  First, I think that Dan's post...in both the general and specific sense...is extremely important.  I'm writing this blog post because I honestly believe that Dan's post needs attention.

Second, I think that Dan's post is just the start.  I opened a print preview of his post, and with the comments, it's 67 pages long.  Yes, there's a lot of information in the post, and admittedly, the post is as long as it is in part due to the graphic images Dan includes in his post.  But this post needs much more attention than "+1", "Like", and "Good job!" comments.  Yes, a number of folks, including myself, have retweeted his announcement of the post, but like many others, we do this in order to get the word out.  What has to happen now is that this needs to be reviewed, understood, and most importantly, discussed.  Why?  Because Dan's absolutely correct...there are some pretty significant misconceptions about these (and admittedly, other) artifacts.  Writing about these artifacts online and in books, and discussing them in courses will only get an analyst so far.  What happens very often after this is that the analyst goes back to their office and doesn't pursue the artifacts again for several weeks or months, and by the time that they do pursue them, there are still misconceptions about these artifacts.

Shell Items
This discussion goes far beyond simply shellbags, in part because the constituent data structures, the shell items, are much more pervasive on Windows systems than I think most analysts realize, and they're becoming more so with each new version.  We've known for some time that Windows shortcut/LNK files can contain shell item ID lists, and with Windows 7, Jump Lists were found to include LNK structures.  Shell items can also be found in a number of Registry values, as well, and the number of locations has increased from Vista to Windows 7, and again with Windows 8/8.1.

Consider a recent innovation to the Bebloh malware...according to the linked article, the malware deletes itself when it's loaded in memory, and then waits for a shutdown signal, at which point it writes a Windows shortcut/LNK file for persistence.  There's nothing in the article that discusses the content of the LNK file, but if it contains only a shell item ID list and no LinkInfo block (or if the two are not homogeneous), then analysts will need to understand shell items in order to retrieve data from the file.

These artifacts specifically need to be discussed and understood to the point where an analyst sees them and stops in their tracks, knowing in the back of their mind that there's something very important about them, and that the modification date and time don't necessarily mean what they think.  It would behoove analysts greatly to take the materials that they have available on these (and other) artifacts, put them into a format that is most easily referenced, keep it next to their workstation and share it with others.

Publishing Your Work
A very important aspect of Dan's post is that he did not simply sit back and assume that others, specifically tool authors and those who have provided background on data structures, have already done all the work.  He started clean, by clearing out his own artifacts, and walking through a series of tests without assuming...well...anything.  For example, he clearly pointed out in his post that the RegRipper shellbags.pl plugin does not parse type 0x52 shell items; the reason for this is that I have never seen one of these shell items, and if anyone else has, they haven't said anything.  Dan then made his testing data available so that tools and analysis processes can be improved.  The most important aspect of Dan's post is not the volume of testing he did...it's the fact that he pushed aside his own preconceptions, started clean, and provided not just the data he used, but a thorough (repeatable) write-up of what he did.  This follows right in the footsteps of what others, such as David Cowen, Corey Harrell, and Mari DeGrazia, have done to benefit the community at large.

Posts such as Dan's are very important, because very often artifacts don't mean what we may think they mean, and our (incorrect) interpretation of those artifacts can lead our examination in the wrong direction, resulting in the wrong answers being provided as a result of the analysis.

Monday, May 13, 2013

Understanding Data Structures

Sometimes at conferences or during a presentation, I'll provide a list of tools for parsing a specific artifact (i.e., MFT, Prefetch files, etc.), and I'll mention a tool or script that I wrote that presents specific data in a particular format.  Invariably when this happens, someone asks for a copy of the tool/script.  Many times, these scripts may not be meant for public consumption, and are only intended to illustrate what data is available within a particular structure.  As such, I'll ask why, with all of the other available tools, someone would want a copy of yet another tool, and the response is most often, "...to validate the output of the other tools."  So, I'm left wondering...if you don't understand the data structure that is being accessed or parsed, how is having another tool to parse it beneficial?

Tools provide a layer of abstraction over the data, and as such, while they allow us access to information within these data structures (or files) in a much more timely manner than if we were to attempt to do so manually, they also tend to separate us from the data...if we allow this to happen.  For many of the more popular data structures or sources available, there are likely multiple tools that can be used to display information from those sources.  But the questions then become, (a) do you understand the data source(s) being parsed, and (b) do you know what the tool is doing to parse those data structures?  Is the tool using an MS API to parse the data, or is it doing so on a binary level? 

A great example of this is what many of us will remember seeing when we have extracted Windows XP Event Logs from an image and attempted to open them in the Event Viewer on our analysis system.  In some cases, we'd see a message that told us that the Event Log was corrupted.  However, it was very often the case that the file wasn't actually corrupted, but instead that our analysis system did not have the appropriate message DLLs installed for some of the records.  Microsoft does, however, provide very clear and detailed definitions of the Event Log structures, and as such, tools that do not use the Windows API to parse the Event Log files can be used to much greater effect, to include parsing individual records from unallocated space.  This could not be done without an understanding of the data structures.
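As a concrete example of working at the binary level, each record in a Windows 2000/XP/2003 .evt file carries the "LfLe" signature at offset 4 of the record, which means records can be located by signature in the file itself or in unallocated space.  A minimal sketch that pulls just a few of the documented header fields:

import struct
import sys
from datetime import datetime, timezone

MAGIC = b"LfLe"   # signature at offset 4 of each Windows 2000/XP/2003 .evt record

def carve(path):
    """Scan raw data (an .evt file, unallocated space, etc.) for event records by
    signature and print a few header fields (record number, generated time)."""
    data = open(path, "rb").read()
    pos = data.find(MAGIC)
    while pos != -1:
        start = pos - 4                          # the record length precedes the signature
        if start >= 0 and start + 20 <= len(data):
            length, _sig, rec_num, time_gen, _time_written = struct.unpack_from("<I4sIII", data, start)
            if 0 < time_gen < 0x7FFFFFFF:        # simple sanity check on the Unix time
                gen = datetime.fromtimestamp(time_gen, tz=timezone.utc)
                print(f"offset {start:#x}  record #{rec_num}  len {length}  generated {gen:%Y-%m-%d %H:%M:%S}Z")
        pos = data.find(MAGIC, pos + 1)

if __name__ == "__main__":
    carve(sys.argv[1])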

Not long ago, Francesco contacted me about the format of  automaticDestinations Jump List files, because he'd run a text search across an image and found a hit "in" one of these files, but parsing the file with multiple tools gave no indication of the search hit.  It turned out that understanding the format of MS compound file binary files provides us with a clear indication of how to map unallocated 'sectors' within the Jump List file itself, and determine why he'd seen a search hit 'in' the file, but that hit wasn't part of the output of the commonly-used tools for parsing these files.

Another great example of this came to my attention this morning via the SQLite: Hidden Data in Plain Sight blog post from the Linuxsleuthing blog.  This blog post further illustrates my point; however, in this case, it's not simply a matter of displaying information that is there but not displayed by the available tools.  Rather, it is also a matter of correlating the various information that is available in a manner that is meaningful and valuable to the analyst.

The Linuxsleuthing blog post also asks the question, how do we overcome the shortcomings of the common SQLite Database analysis techniques?  That's an important question to ask, but it should also be expanded to just about any analysis technique available, and not isolated simply to SQLite databases.  What we need to consider and ask ourselves is, how do we overcome the shortcomings of common analysis techniques?

Tools most often provide a layer of abstraction over available data (structures, files, etc.), allowing for a modicum of automation and allowing the work to be done in a much more timely manner than using a hex editor.  However, much more is available to us than simply parsing raw data structures and providing some of the information to the analyst.  Tools can parse data based on artifact categories, as well as generate alerts for the analyst, based on known-bad or known-suspicious entries or conditions.  Tools can also be used to correlate data from multiple sources, but to really understand the nature and context of that data, the analyst needs to have an understanding of the underlying data structures themselves.

Addendum
This concept becomes crystallized when looking at any shell item data structures on Windows systems.  Shell items are not documented by MS, and yet are more and more prevalent on Windows systems as the versions progress.  An analyst who correctly understands these data structures and sees them as more than just "a bunch of hex" will reap the valuable rewards they hold.

Shell items and shell item ID lists are found in the Registry (shellbags, itempos* values, ComDlg32 subkey values on Vista+, etc.), as well as within Windows shortcut artifacts (LNK files, Win7 and 8 Jump Lists, Photos artifacts on Windows 8, etc.).  Depending upon the type of shell item, they may contain time stamps in DOSDate format (usually found in file and folder entries), or they may contain time stamps in FILETIME format (found in some variable type entries).  Again, tools provide a layer of abstraction over the data itself, and as such, the analyst needs to understand the nature of the time stamp, as well as what that time stamp represents.  Not all time stamps are created equal...for example, DOSDate time stamps within the shell items are created by converting the file system metadata time stamps from the file or folder that is being referred to, reducing the granularity from 100 nanoseconds to 2 seconds (i.e., the seconds value is multiplied times 2).
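For reference, decoding a DOSDate/DOSTime pair is straightforward; a small sketch, which also shows the 2-second granularity at work:

from datetime import datetime

def parse_dosdate(date_word, time_word):
    """Decode a DOSDate/DOSTime pair (as found in many shell items) into a
    datetime.  Note the 2-second granularity: the stored seconds value is doubled."""
    day = date_word & 0x1F
    month = (date_word >> 5) & 0x0F
    year = ((date_word >> 9) & 0x7F) + 1980
    seconds = (time_word & 0x1F) * 2
    minutes = (time_word >> 5) & 0x3F
    hours = (time_word >> 11) & 0x1F
    return datetime(year, month, day, hours, minutes, seconds)

if __name__ == "__main__":
    # Example: 2012-10-29 17:24:26 round-trips cleanly, while 17:24:27 would be
    # stored as 17:24:26...precision is lost, not just shifted.
    date_word = ((2012 - 1980) << 9) | (10 << 5) | 29
    time_word = (17 << 11) | (24 << 5) | (26 // 2)
    print(parse_dosdate(date_word, time_word))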

Resources
Windows Shellbag Forensics - Note: the first colorized hex dump includes a reported invalid SHITEM_FILEENTRY, in green; it's not actually invalid, it's just a different type of shell item.

Saturday, January 19, 2013

BinMode: Parsing Java *.idx files

One of the Windows artifacts that I talk about in my training courses is application log files, and I tend to sort of gloss over this topic, simply because there are so many different kinds of log files produced by applications.  Some applications, in particular AV, will write their logs to the Application Event Log, as well as a text file.  I find this to be very useful because the Application Event Log will "roll over" as it gathers more events; most often, the text logs will continue to be written to by the application.  I talk about these logs in general because it's important for analysts to be aware of them, but I don't spend a great deal of time discussing them because we could be there all week talking about them.

With the recent (Jan, 2013) issues regarding a Java 0-day vulnerability, my interest in artifacts of compromise were piqued yet again when I found that someone had released some Python code for parsing Java deployment cache *.idx files.  I located the *.idx files on my own system, opened a couple of them up in a hex editor and began conducting pattern analysis to see if I could identify a repeatable structure.  I found enough information to create a pretty decent parser for the *.idx files to which I have access.

Okay, so the big question is...so what?  Who cares?  Well, Corey Harrell had an excellent post to his blog regarding Finding (the) Initial Infection Vector, which I think is something that folks don't do often enough.  Using timeline analysis, Corey identified artifacts that required closer examination; using the right tools and techniques, this information can also be included directly into the timeline (see the Sploited blog post listed in the Resources section below) to provide more context to the timeline activity.

The testing I've been able to do with the code I wrote has been somewhat limited, as I haven't had a system that might be infected come across my desk in a bit, and I don't have access to an *.idx file like what Corey illustrated in his blog post (notice that it includes "pragma" and "cache control" statements).  However, what I really like about the code is that I have access to the data itself, and I can modify the code to meet my analysis needs, much the way I did with the Prefetch file analysis code that I wrote.  For example, I can perform frequency analysis of IP addresses or URLs, server types, etc.  I can perform searches for various specific data elements, or simply run the output of the tool through the find command, just to see if something specific exists.  Or, I can have the code output information in TLN format for inclusion in a timeline.
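As an example of the frequency analysis mentioned above, something along these lines works against the parser's output; the input format here (a CSV with "url", "ip", and "server" columns) is an assumption...adjust it to whatever your parser actually emits:

import csv
from collections import Counter

urls, ips, servers = Counter(), Counter(), Counter()

# Tally URLs, IP addresses, and server types from the parser's CSV output.
with open("idx_output.csv", newline="") as fh:
    for row in csv.DictReader(fh):
        urls[row.get("url", "")] += 1
        ips[row.get("ip", "")] += 1
        servers[row.get("server", "")] += 1

for label, counter in (("URLs", urls), ("IP addresses", ips), ("Server types", servers)):
    print(f"\n{label}:")
    for value, count in counter.most_common(10):
        print(f"  {count:4}  {value}")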

Regardless of what I do with the code itself, I now have automatic access to the data, and I have references included in the script itself; as such, the headers of the script serve as documentation, as well as a reminder of what's being examined, and why.  This bridges the gap between having something I need to check listed in a spreadsheet, and actually checking or analyzing those artifacts.

Resources
ForensicsWiki Page: Java
Sploited blog post: Java Forensics Using TLN Timelines
jIIr: Almost Cooked Up Some Java, Finding Initial Infection Vector


Interested in Windows DF training?  Check it out: Timeline Analysis, 4-5 Feb; Windows Forensic Analysis, 11-12 Mar.

Saturday, January 12, 2013

There Are Four Lights: The Analysis Matrix

I've talked a lot in this blog about employing event categories when developing, and in particular, when analyzing timelines, and the fact is that we can use these categories for much more than just adding analysis functionality to our timelines.  In fact, using artifact and event categories can greatly enhance our overall analysis capabilities.  This is something that Corey Harrell and I have spent a great deal of time discussing.

For one, if we categorize events, we can raise our level of awareness of the context of the data that we're analyzing.  Having categories for various artifacts can help us increase our relative level of confidence in the data that we're analyzing, because instead of looking at just one artifact, we're going to be looking at various similar, related artifacts together.

Another benefit of artifact categories is that they help us remember what various artifacts relate to...for example, I developed an event mapping file for Windows Event Log records, so as a tool parses through the information available, it can assign a category to various event records.  This way, you no longer have to search Google or look up on a separate sheet of paper what that event refers to...you have "Login" or "Failed Login Attempt" right there next to the event description.  This is particularly useful, as of Vista, Microsoft began employing a new Windows Event Log model, which means that there are a LOT more Event Logs than just the three main ones we're used to seeing.  Sometimes, you'll see one event in the System or Security Event Log that will have corresponding events in other event logs, or there will be one event all by itself...knowing what these events refer to, and having a category listed for each, is extremely valuable, and I've found it to really help me a great deal with my analysis.
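The mapping itself doesn't need to be anything fancy; in script form, it can be as simple as a lookup keyed on the event source and ID (the entries below are just examples):

# Minimal sketch of an event source/ID -> category mapping, along the lines of
# the mapping file described above.
EVENT_MAP = {
    ("Microsoft-Windows-Security-Auditing", 4624): "Login",
    ("Microsoft-Windows-Security-Auditing", 4625): "Failed Login Attempt",
    ("Service Control Manager", 7045): "Service Installed",
    ("Microsoft-Windows-TerminalServices-LocalSessionManager", 21): "RDP Session Logon",
}

def categorize(source, event_id):
    return EVENT_MAP.get((source, event_id), "")

if __name__ == "__main__":
    # Tag records as they're parsed, so the category rides along with the event
    # description in the timeline.
    for source, event_id, desc in [
        ("Microsoft-Windows-Security-Auditing", 4624, "An account was successfully logged on"),
        ("Application Error", 1000, "Faulting application ..."),
    ]:
        print(f"[{categorize(source, event_id) or 'uncategorized'}] {source}/{event_id}: {desc}")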

One way to make use of event categories is to employ an analysis matrix.  What is an "analysis matrix"?  Well, what happens many times is that analysts will get some general (re: "vague") analysis goals, and perhaps not really know where to start.  By categorizing the various artifacts on a Windows system, we can create an analysis matrix that provides us with a means for at least beginning our analysis.

An analysis matrix might appear as follows:

Malware Detection Data Exfil Illicit Images IP Theft
Malware X X
Program Execution X X X
File Access X X X
Storage Access X X X
Network Access X

Again, this is simply a notional matrix, and is meant solely as an example.  However, it's also a valid matrix, and something that I've used.  Consider "data exfiltration"...the various categories we use to describe a "data exfiltration" case may often depend upon what you learn from a "customer" or other source.  For example, I did not put an "X" in the row for "Network Access", as I have had cases where access to USB devices was specified by the customer...they felt confident that with how their infrastructure was designed that this was not an option that they wanted me to pursue.  However, you may want to add this one...I have also conducted examinations in which part of what I was asked to determine was network access, such as a user taking their work laptop home and connecting to other wireless networks.

The analysis matrix is not intended to be the "be-all-end-all" of analysis, nor is it intended to be written in stone.  Rather, it's intended to be something of a living document, something that provides analysts with a means for identifying what they (intend to) do, as well as serve as a foundation on which further analysis can be built.  By using an analysis matrix, we have case documentation available to us immediately.  An analysis matrix can also provide us with pivot points for our timeline analysis; rather than combing through thousands of records in a timeline, we now not only have a means of going after that information which may be most important to our examination, but it also helps us avoid those annoying rabbit holes that we find ourselves going down sometimes.
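If it helps to keep the matrix with your case documentation in a form a script can use, it can be expressed as a simple structure; the category assignments below are illustrative examples, not a transcription of the table above:

# Notional analysis matrix: examination type -> artifact categories to review.
# Which categories apply to a given exam type is a judgment call (see the
# "Network Access" discussion above), so treat these entries as a starting point.
ANALYSIS_MATRIX = {
    "Data Exfil": ["Malware", "Program Execution", "File Access", "Storage Access"],
    "Illicit Images": ["Program Execution", "File Access", "Storage Access"],
    "Malware Detection": ["Malware", "Program Execution"],
    "IP Theft": ["File Access", "Storage Access", "Network Access"],
}

def plan(exam_type):
    return ANALYSIS_MATRIX.get(exam_type, [])

print("Artifact categories to review for a data exfil case:", ", ".join(plan("Data Exfil")))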

Finally, consider this...trying to keep track of all of the possible artifacts on a Windows system can be a daunting task.  However, it can be much easier if we were to compartmentalize various artifacts into categories, making it an easier task to manage by breaking it down into smaller, easier-to-manage pieces.  Rather than getting swept up in the issues surrounding a new artifact (Jump Lists are new as of Windows 7, for example...) we can simply place that artifact in the appropriate category, and incorporate it directly into our analysis.

I've talked before in the blog about how to categorize various artifacts...in fact, in this post, I talked about the different ways that Windows shortcut files can be categorized.  We can look at access to USB devices as storage access, and include sub-categories for various other artifacts.


Interested in Windows DFIR training?  Check it out...Timeline Analysis, 4-5 Feb; Windows Forensic Analysis, 11-12 Mar.

Wednesday, October 31, 2012

Shellbag Analysis, Revisited...Some Testing

I blogged previously on the topic of Shellbag Analysis, but I've found that in presenting on the topic and talking to others, there may be some misunderstanding of how these Registry artifacts may be helpful to an analyst.  With Jamie's recent post on the Shellbags plugin for Volatility, I thought it would be a good idea to revisit this information, as sometimes repeated exposure is the best way to start developing an understanding of something.  In addition, I wanted to do some testing in order to determine the nature of some of the metadata associated with shellbags.

In her post, Jamie states that the term "Shellbags" is commonly used within the community to indicate artifacts of user window preferences specific to Windows Explorer.  MS KB 813711 indicates that the artifacts are created when a user repositions or resizes an Explorer window.

ShellItem Metadata
As Jamie illustrates in her blog post, many of the structures that make up the SHELLITEMS (within the Shellbags) contain embedded time stamps, in DOSDate format.  However, there's still some question as to what those values mean (even though the available documentation refers to them as MAC times for the resource in question) and how an analyst may make use of them during an examination.

Having some time available recently due to inclement weather, I thought I would conduct a couple of very simple tests to begin to address these questions.

Testing Methodology
On a Windows 7 system, I performed a number of consecutive, atomic actions and recorded the system time (visible via the system clock) for when each action was performed.  The following table lists the actions I took, and the time (in local time format) at which each action occurred.

Action Time
Create a dir: mkdir d:\shellbag 12:54pm
Create a file in the dir: echo "..."  > d:\shellbag\test.txt 1:03pm
Create another dir: mkdir d:\shellbag\test 1:08pm
Create a file in the new dir: echo "..." > d:\shellbag\test\test.txt 1:16pm
Delete a file: del d:\shellbag\test.txt 1:24pm
Open D:\shellbag\test via Explorer, reposition/resize the window 1:29pm
Close the Explorer window opened in the previous step 1:38pm

The purpose of having some time pass between actions is so that they can be clearly differentiated in a timeline.

Once these steps were completed, I restarted the system, and once it came back up, I extracted the USRCLASS.DAT hive from the relevant user account into the D:\shellbag directory for analysis (at 1:42pm).  I purposely chose this directory in order to determine how actions external to the shellbags artifacts affect the overall data seen.

Results
The following table lists the output from the shellbags.pl RegRipper plugin for the directories in question (all times are in UTC format):

Desktop\My Computer\D:\shellbag
MRU Time: 2012-10-29 17:29:25  Modified: 2012-10-29 17:24:26  Accessed: 2012-10-29 17:24:26  Created: 2012-10-29 16:55:00

Desktop\My Computer\D:\shellbag\test
MRU Time: 2012-10-29 17:29:29  Modified: 2012-10-29 17:16:20  Accessed: 2012-10-29 17:16:20  Created: 2012-10-29 17:08:18

Let's walk through these results.  First, I should remind you that the MRU Time is populated from Registry key LastWrite times (FILETIME format, granularity of 100 ns) while the MAC times are embedded within the various shell items (used to reconstruct the paths) in DOSDate time format (granularity of 2 seconds).

First, we can see that the Created dates for both folders correspond approximately to when the folders were actually created.  We can also see that the same thing is true for the Modified dates.  Going back to the live system and typing "dir /tw d:\shell*" shows me that the last modification time for the directory is 1:42pm (local time), which corresponds to changes made to that directory after the USRCLASS.DAT hive file was extracted.

Next, we see that MRU Time values correspond approximately to when the D:\shellbag\test folder was opened and then resized/repositioned via the Explorer shell, and not to when the Explorer window was actually closed.

Based on this limited test, it would appear that the DOSDate time stamps embedded in the shell items for the folders correspond to the MAC times of that folder, within the file system, at the time that the shell items were created.  In order to test this, I deleted the d:\shellbag\test\test.txt file at 2:14pm, local time, and then extracted a copy of the USRCLASS.DAT and parsed it the same way I had before...and saw no changes in the Modified times listed in the previous table.

In order to test this just a bit further, I opened Windows Explorer, navigated to the D:\shellbag folder, and repositioned/resized the window at 2:21pm (local time), waited 2 minutes, and closed the window.  I extracted and parsed the USRCLASS.DAT hive again, and this time, the MRU Time for the D:\shellbag folder had changed to 18:21:48 (UTC format).  Interestingly, that was the only time that had changed...the Modified time for the D:\shellbag\test folder remained the same, even though I had deleted the test.txt file from that directory at 2:14pm local time ("dir /tw d:\shellbag\te*" shows me that the last written time for that folder is, indeed, 2:14pm).

Summary
Further testing is clearly required; however, it would appear that based on this initial test, we can draw the following conclusions with respect to the shellbag artifacts on Windows 7:

1.  The embedded DOSDate time stamps appear to correspond to the MAC times of the resource/folder at the time that the shell item was created.  If the particular resource/folder was no longer present within the active file system, an analyst could use the Created date for that resource in a timeline.

2.  Further testing needs to be performed in order to determine the relative value of the Modified date, particularly given that events external to the Windows Explorer shell (i.e., creating/deleting files and subfolders after the shell items have been created) may have limited effect on the embedded dates.

3.  The MRU Time appears to correspond to when the folder was resized or repositioned.  Analysts should keep in mind that (a) there are a number of ways to access a folder that do not require the user to reposition or resize the window, and (b) the MRU Time is a Registry key LastWrite time that only applies to one folder within the key...the Most Recently Used folder, or the one listed first in the MRUListEx value.

I hope that folks find this information useful.  I also hope that others out there will look at this information, validate it through their own testing, and even use it as a starting point for their own research.

Monday, September 03, 2012

Links, Tools, Etc.

Windows 8 Forensics
There is some great information circulating about the Interwebs regarding Windows 8 forensics.  There's this YouTube video titled A Forensic First Look, this blog post that addresses reset and refresh artifacts, the Windows 8 Forensics Guide (this PDF was mentioned previously on this blog), this blog post on the Windows 8 TypedURLsTime Registry key, and Kenneth Johnson's excellent PDF on Windows 8 File History...in addition to a number of other available resources.  Various beta and pre-beta versions of Windows 8 have been out for some time, and with each release there seems to be something new...when I went from the first version available for developers to the Consumer Preview, one of the first things I noticed was that I was no longer able to disable the Metro interface.

So what does all this mean?  Well, just like when Windows XP was released, there were changes that would affect how we within the digital analysis community would do our jobs, and the same thing has been true since then with every new OS release.  While our overall analysis process wouldn't change, there are aspects of the new operating system and its included technologies that require us to update the specifics of those processes.

Timeline Analysis
Over at the Sploited blog, there's an excellent post on how to incorporate Java information into your TLN-format timeline, in order to help determine the exploit used to compromise a system.  In addition to the information available in the two previous posts (here, and here, respectively), this post includes code for parsing .idx files, and incorporating log entries into a TLN-format timeline.

Just to be clear, this is NOT a RegRipper plugin (there is oftentimes confusion about this...), but is instead a file parser that you can use to incorporate data into your timeline, similar to parsing Prefetch file metadata.  As such, it can very often add some much-needed detail and context to your analysis.

Posts such as this go hand-in-hand with the excellent work that Corey Harrell has done in determining exploit footprints on compromised systems.

PList Parser
If you do forensics on iDevices, or you get access to iDevice backups via iTunes on a system, you might want to take a look at Maria's PList Parser.  Parsing these files can provide you with a great deal of insight into the user's behavior while using the device.  Maria said that she used RegRipper as the inspiration for her tool, and it's great to see tools like this become available.

ScheduledTask File Parser
Jamie's released a .job file parser, written in Python.  These files, on WinXP and 2003 systems, are in a binary format (in later versions of Windows, they're XML) and like other files (ie, Prefetch files) can contain some significant metadata.  In the past, I've found analysis of these artifacts to be particularly useful when responding to incidents involving certain threat actors that moved laterally within the compromised infrastructure...one way of doing so was to schedule tasks on remote systems.

Not only does Jamie provide an explanation of what a .job file "looks like", but she also provides references so that folks can look this information up themselves, and develop a deeper understanding of what the tool is doing, should they choose to do so.  Also, don't forget the great work Jamie has done with her MBR parser, particularly if you're performing some sort of malware detection on an acquired image.

Registry Analysis
I ran across this write-up on Wiper recently via Twitter.

In the write-up, the authors state:

"...we came up with the idea to look into the hive slack space for deleted entries."

Hhhmm...okay.  My understanding of "slack space", with respect to the file system, is that it's usually considered to be what's left over between the logical and physical space consumed by a file.  Let's say that there's a file that's 892 bytes; in order to save it to disk, the system will allocate two 512-byte sectors, or 1024 bytes.  As such, the slack space would be the 132 bytes that remain between the logical end of the file and the end of the second physical sector.

Now, this can be true for the hive files themselves, as some data may exist between the logical end of the hive, and the end of the last physical sector.  This may also be true for value data, as well...if the 1024 bytes are allocated for a value, but only 892 bytes are actually written to the allocated space, there may be slack space available.

However, if you look at the graphic associated with the comment (excellent use of Yaru, guys!), the first 4 bytes (DWORD) of the selected data are a positive value, indicating that the key was deleted.  As such, the key becomes part of the unallocated space of the hive file, just like the sectors of a deleted file become part of the unallocated space of a volume or disk.  So, the value appears to have been part of unallocated space of the hive file, rather than slack space.
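For those interested in checking this for themselves, the allocation status of a cell comes down to the sign of the 32-bit size value at the start of the cell; a small sketch (the hive name and offset in the example are hypothetical):

import struct

def cell_is_allocated(hive_data, cell_offset):
    """A registry hive cell begins with a signed 32-bit size: negative means the
    cell is in use (allocated), positive means it's free (unallocated/deleted)."""
    (size,) = struct.unpack_from("<i", hive_data, cell_offset)
    return size < 0

if __name__ == "__main__":
    with open("NTUSER.DAT", "rb") as fh:
        data = fh.read()
    # The first hbin starts at offset 0x1000; cells follow the 0x20-byte hbin header.
    offset = 0x1000 + 0x20
    print("allocated" if cell_is_allocated(data, offset) else "unallocated (deleted)")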

With respect to overall Registry analysis, perhaps "...we came up with the idea..." isn't the most systematic approach to that analysis.  Admittedly, the authors found something very interesting, but I'd be interested to know if the authors found an enum\Root\Legacy_RAHDAUD64 key in the Registry hive they were looking at, or if they found a Windows Event Log record with source "Service Control Manager" and an ID of 7035 (indicating a service start message had been sent), and then opted to check for deleted keys in the hive after determining that there were no corresponding visible keys for a service of that name in the System hive.

Looking for Suspicious EXEs
Adam wrote an interesting blog post on finding suspicious PE files via clustering...in short, assuming PE files may have been subject to timestomping (ie, intentional modification of MFT $STANDARD_INFORMATION attribute time stamps), and attempting to detect these files by "clustering" the PE file compile times.
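A rough sketch of that idea, using the pefile module to pull the compile time from each executable extracted from an image and printing them sorted so that outliers stand out...this is my simplification of the approach, not Adam's code:

import os
import sys
from datetime import datetime, timezone

import pefile  # pip install pefile

results = []
for root, _, files in os.walk(sys.argv[1]):
    for name in files:
        if not name.lower().endswith((".exe", ".dll", ".sys")):
            continue
        path = os.path.join(root, name)
        try:
            pe = pefile.PE(path, fast_load=True)
            stamp = pe.FILE_HEADER.TimeDateStamp
            pe.close()
        except pefile.PEFormatError:
            continue
        results.append((stamp, path))

# Sorted by compile time; zeroed or wildly different timestamps stand out.
for stamp, path in sorted(results):
    if stamp:
        when = datetime.fromtimestamp(stamp, tz=timezone.utc)
        print(f"{when:%Y-%m-%d %H:%M:%S}Z  {path}")
    else:
        print(f"(no compile time)     {path}")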

You can read more about methods for detecting malicious files by reading Joel Yonts' GIAC Gold Paper, Attributes of Malicious Files.

Wednesday, August 15, 2012

ShellBag Analysis

What are "shellbags"?
To get an understanding of what "shellbags" are, I'd suggest that you start by reading Chad Tilbury's excellent SANS Forensic blog post on the topic.  I'm not going to try to steal Chad's thunder...he does a great job of explaining what these artifacts are, so there's really no sense in rehashing everything.

Discussion of this topic goes back well before Chad's post, with this DFRWS 2009 paper.  Before that, John McCash talked about ShellBag Registry Forensics on the SANS Forensics blog.  Even Microsoft mentions the keys in question in KB 813711.

Without going into a lot of detail, a user's shell window preferences are maintained in the Registry, and the hive and keys being used to record these preferences will depend upon the version of the Windows operating system.  Microsoft wants the user to have a great experience while using the operating system and applications, right?  If a user opens up a window on the Desktop and repositions and resizes that window, how annoying would it be to shut the system down, and have to come back the next day and have to do it all over again?  Because this information is recorded in the Registry, it is available to analysts who can parse and interpret the data. As such, "ShellBags" is sort of a colloquial term used to refer to a specific area of Registry analysis.

Tools such as Registry Decoder, TZWorks sbag, and RegRipper are capable of decoding and presenting the information available in the ShellBags.

How can ShellBags help an investigation?
I think that one of the biggest issues with ShellBags analysis is that, much like other lines of analysis that involve the Windows Registry, they're poorly understood, and as such, underutilized.  Artifacts like the ShellBags can be very beneficial to an examiner, depending upon the type of examination they're conducting.  Much like the analysis of other Windows artifacts, ShellBags can demonstrate a user's access to resources, often well after that resource is no longer available.  ShellBag analysis can demonstrate access to folders, files, external storage devices, and network resources.  Under the appropriate conditions, the user's access to these resources will be recorded and persist well after the accessed resource has been deleted, or is no longer accessible via the system.

If an organization has an acceptable use policy, ShellBags data may demonstrate violations of that policy, by illustrating access to file paths with questionable names, such as what may be available via a thumb drive or DVD.  Or, it may be a violation of acceptable use policies to access another employee's computer without their consent, such as:

Desktop\My Network Places\user-PC\\\user-PC\Users

...or to access other systems, such as:

Desktop\My Network Places\192.168.23.6\\\192.168.23.6\c$

Further, because of how .zip files are handled by default on Windows systems, ShellBag analysis can illustrate that a user not only had a zipped archive on their system, but that they opened it and viewed subfolders within the archive.

This is what it looks like when I accessed Zip file subfolders on my system:

Desktop\Users\AppData\Local\Temp\RR.zip\DVD\RegRipper\DVD

Access to devices will also be recorded in these Registry keys, including access to specific resources on those devices.

For example, from the ShellBags data available on my own system I was able to see where I'd accessed an Android system:


Desktop\My Computer\F:\Android\data


...as well as a digital camera...

Desktop\My Computer\Canon EOS DIGITAL REBEL XTi\CF\DCIM\334CANON

...and an iPod.

Desktop\My Computer\Harlan s iPod\Internal Storage\DCIM

Another aspect of ShellBags analysis that can be valuable to an examination is developing an understanding of the actual data structures, referred to as "shell item ID lists", used within the ShellBags.  It turns out that these data structures are not only used in other values within the Registry, but also in other artifacts, such as Windows shortcut/LNK files and Jump List files.  Understanding, recognizing, and being able to parse these structures lets an analyst get the most out of the available data.
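
At the byte level, a shell item ID list is fairly simple to walk, even if decoding each item type is not.  Here's a rough sketch, assuming the documented layout (a two-byte little-endian item size that includes the size field itself, a type indicator at offset 2, and a two-byte zero terminating the list); interpreting each item is left to a full parser:

# shell_items.py - split a shell item ID list into its individual items
# A rough sketch of the structure only; assumes the documented layout
# (2-byte little-endian item size including the size field, type byte at
# offset 2, list terminated by a 2-byte zero).
import struct

def split_idlist(data):
    items = []
    offset = 0
    while offset + 2 <= len(data):
        (size,) = struct.unpack_from("<H", data, offset)
        if size == 0:                     # terminator
            break
        item = data[offset:offset + size]
        type_id = item[2] if size > 2 else None
        items.append((type_id, item))     # (type indicator, raw bytes)
        offset += size
    return items

# Usage: for type_id, raw in split_idlist(idlist_bytes): print(type_id, len(raw))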

Locating Possible Data Exfil/Infil Paths via ShellBags
As information regarding access to removable storage devices and network resources can be recorded in the ShellBags, this data may be used to demonstrate infiltration/infection or data exfiltration paths.

For example, one means of getting data off of a system is via FTP.  Many Windows users aren't aware that Windows ships with a command line FTP client; in my experience, it's more often intruders who are conversant in its use.  One way that analysts look for the use of the FTP client (i.e., ftp.exe) is via Prefetch files, as well as via the MUICache Registry key.

However, another way to access FTP on a Windows system is via the Windows Explorer shell itself.  I've worked with a couple of organizations that used FTP for large file transfers and had us use this process rather than the command line client.  A couple of sites provide simple instructions for using FTP via Windows Explorer:

MS FTP FAQ
HostGator: Using FTP via Windows Explorer

Here's what a URI entry looks like when parsed (obfuscated, of course):

Desktop\Explorer\ftp://www.site.com

One of the data types recorded within the ShellBags keys is a "URI", and this data structure includes an embedded time stamp, as well as the protocol (i.e., ftp, http, etc.) used in the communications.  The embedded time stamp appears (via timeline analysis) to correlate with when the attempt was made to connect to the FTP site.  If the connection is successful, you will likely find a corresponding entry for the site in the NTUSER.DAT hive, in the path:

HKCU/Software/Microsoft/FTP/Accounts/www.site.com
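
For inclusion in a timeline, that embedded time stamp has to be converted to something readable; assuming it's a standard 64-bit FILETIME (100-nanosecond intervals since January 1, 1601 UTC), as most of the time stamps found in the Registry and in shell items are, the conversion is straightforward:

# filetime.py - convert a 64-bit FILETIME to a UTC datetime for timeline use
# A minimal sketch; assumes the embedded time stamp is a standard FILETIME,
# i.e., 100-nanosecond intervals since January 1, 1601 UTC.
from datetime import datetime, timedelta, timezone

EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(filetime):
    return EPOCH_1601 + timedelta(microseconds=filetime // 10)

if __name__ == "__main__":
    print(filetime_to_datetime(0x01CD32D57D36C7B0))   # hypothetical value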

Much like access via the keyboard, remote access that provides shell-based control of the system, such as via RDP, will often facilitate the use of graphical tools, including the Windows Explorer shell, for off-system communications.  ShellBag analysis may lead to some very interesting findings, not only with respect to what a user may have done, but also with respect to other resources an intruder may have accessed.

Summary
Like other Windows artifacts, ShellBags persist well after the accessed resources are no longer available.  Knowing how to parse and interpret this Registry data can add a great deal of depth and context to an examination.

Parsing ShellBags data can provide you with indications of access to external resources, potentially revealing one avenue of off-system communications.  If the concern is data infiltration ("how did that get here?"), you may find indications of access to an external resource, followed by indications of access to Zip file subfolders.  ShellBags can be used to demonstrate access to resources where no other indications are available (because they weren't recorded somewhere else, or because they were intentionally deleted), or they can be used in conjunction with other sources to build a stronger case.  Incorporating ShellBag data into timelines can also provide valuable insight that might not otherwise be available to the analyst.

Resources
ForensicsWiki Shell Item page

Thursday, June 14, 2012

Timeline Analysis, and Program Execution

I mentioned previously that I've been preparing for an upcoming Timeline Analysis course offered through my employer.  As part of that preparation, I've been using the tools to walk through the course materials, and in particular one of the hands-on exercises that we will be doing in the course.

One of the things I'd mentioned in my previous post is that Rob Lee has done a great deal of work for SANS, particularly in providing an Excel macro to add color-coding of different events to log2timeline output files.  I've had a number of conversations and exchanges with Corey Harrell and others (but mostly Corey) regarding event categorization, and the value of adding these categories to a timeline in order to facilitate analysis.  This can be particularly useful when working with Windows Event Log data, as there are a good number of events recorded by default, and all of that information can be confusing if you don't have a quick visual reference.

As I was running through the exercises, I noticed something very interesting in the timeline with respect to the use of the Autoruns tool from SysInternals; specifically, that there were a good number of artifacts associated with both the download and use of the tool.  I wanted to extract just those artifacts directly associated with Autoruns from the timeline events file, in order to demonstrate how a timeline can illustrate indications of program execution.  To do so, I ran the following command:

type events.txt | find /i "autoruns" > autoruns_events.txt

...and then to get my timeline...

parse -f autoruns_events.txt > autoruns_tln.txt

...and got the following:

Tue May 29 12:56:02 2012 Z
  FILE                       - ..C. [195166] C:/Windows/Prefetch/AUTORUNS.EXE-1CF578DD.pf
  FILE                       - ..C. [44056] C:/Windows/Prefetch/AUTORUNSC.EXE-C5802224.pf

Tue May 15 21:14:55 2012 Z
  REG      johns-pc         john - M... HKCU/Software/Sysinternals/AutoRuns
  REG      johns-pc         john - [Program Execution] Software\SysInternals\AutoRuns (EulaAccepted)

Tue May 15 21:14:07 2012 Z
  FILE                       - MA.B [195166] C:/Windows/Prefetch/AUTORUNS.EXE-1CF578DD.pf

Tue May 15 21:13:57 2012 Z
  PREF     johns-PC          - [Program Execution] AUTORUNS.EXE-1CF578DD.pf last run (1)
  REG      johns-pc         john - [Program Execution] UserAssist - C:\tools\autoruns.exe (1)

Tue May 15 21:13:53 2012 Z
  FILE                       - M.C. [640632] C:/tools/autoruns.exe
  FILE                       - M.C. [26] C:/tools/autoruns.exe:Zone.Identifier
  REG      johns-pc     - M... [Program Execution] AppCompatCache - C:\tools\autoruns.exe

Tue May 15 21:13:42 2012 Z
  FILE                       - MAC. [877] C:/Users/john/AppData/Roaming/Microsoft/Windows/Recent/Autoruns.lnk
  JumpList johns-pc         john - C:\Users\john\Downloads\Autoruns.zip

Tue May 15 21:13:32 2012 Z
  FILE                       - MA.B [44056] C:/Windows/Prefetch/AUTORUNSC.EXE-C5802224.pf

Tue May 15 21:13:28 2012 Z
  PREF     johns-PC          - [Program Execution] AUTORUNSC.EXE-C5802224.pf last run (1)
  REG      johns-pc         john - [Program Execution] UserAssist - C:\tools\autorunsc.exe (1)

Tue May 15 21:13:23 2012 Z
  FILE                       - M.C. [49648] C:/tools/autoruns.chm
  FILE                       - M.C. [26] C:/tools/autoruns.chm:Zone.Identifier
  FILE                       - M.C. [559736] C:/tools/autorunsc.exe
  FILE                       - M.C. [26] C:/tools/autorunsc.exe:Zone.Identifier
  REG      johns-pc     - M... [Program Execution] AppCompatCache - C:\tools\autorunsc.exe

Tue May 15 21:12:10 2012 Z
  FILE                       - ...B [877] C:/Users/john/AppData/Roaming/Microsoft/Windows/Recent/Autoruns.lnk
  FILE                       - ..C. [535772] C:/Users/john/Downloads/Autoruns.zip
  FILE                       - ..C. [26] C:/Users/john/Downloads/Autoruns.zip:Zone.Identifier

Tue May 15 21:11:59 2012 Z
  FILE                       - MA.B [535772] C:/Users/john/Downloads/Autoruns.zip
  FILE                       - MA.B [26] C:/Users/john/Downloads/Autoruns.zip:Zone.Identifier

Wed May  9 15:08:16 2012 Z
  FILE                       - .A.B [640632] C:/tools/autoruns.exe
  FILE                       - .A.B [26] C:/tools/autoruns.exe:Zone.Identifier
  FILE                       - .A.B [559736] C:/tools/autorunsc.exe
  FILE                       - .A.B [26] C:/tools/autorunsc.exe:Zone.Identifier

Sat Nov  5 17:52:32 2011 Z
  FILE                       - .A.B [49648] C:/tools/autoruns.chm
  FILE                       - .A.B [26] C:/tools/autoruns.chm:Zone.Identifier

What I find most interesting about this timeline excerpt is that it illustrates a good deal of interaction associated with the download and launch of the tool within its ecosystem, clearly demonstrating Locard's Exchange Principle.  There are also a number of things that you don't see...this timeline is comprised solely of those lines that included the word "autoruns" (irrespective of case) somewhere in the line; as such, we won't see things such as the query to the "Image File Execution Options" key to determine if a debugger has been assigned to the tool, nor do we see ancillary events or those that might be encoded.  However, what we do see will clearly allow us to "zoom in" on a specific time window within the overall timeline, and see what other events may be listed there.
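
The type/find pipeline above is native to Windows; purely for illustration, a rough, cross-platform equivalent of the filter-and-sort steps might look like the following, assuming the intermediate events file uses the pipe-delimited TLN format of time|source|system|user|description, with the time field as a Unix epoch value:

# mini_timeline.py - filter a TLN events file and build a mini-timeline
# A rough equivalent of the "type | find" plus parse steps above, for
# illustration only; assumes the pipe-delimited TLN intermediate format
# time|source|system|user|description, with the time as a Unix epoch value.
import sys
import time
from collections import defaultdict

def mini_timeline(events_file, keyword):
    buckets = defaultdict(list)
    with open(events_file, "r", errors="replace") as f:
        for line in f:
            if keyword.lower() not in line.lower():
                continue
            fields = line.rstrip("\n").split("|", 4)
            if len(fields) < 5:
                continue
            epoch, source, system, user, desc = fields
            buckets[int(epoch)].append(f"  {source:8} {system:16} {user:8} {desc}")
    for epoch in sorted(buckets, reverse=True):       # most recent first
        print(time.strftime("%a %b %d %H:%M:%S %Y Z", time.gmtime(epoch)))
        print("\n".join(buckets[epoch]))
        print()

if __name__ == "__main__":
    mini_timeline("events.txt", sys.argv[1] if len(sys.argv) > 1 else "autoruns")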

The timeline is clearly very illustrative.  We can see the download of the tool (in this case, via Chrome to a Windows 7 platform), and the assignment of the ":Zone.Identifier" ADSs, something that with XP SP2 was done only via IE and Outlook.  Beyond the file system metadata, we start to see even more context, simply by adding additional data sources such as the Registry AppCompatCache value data, UserAssist value data, information derived from the SysInternals key in the user's Registry hive, Jump Lists, etc.  In this case, the Jump List info in the timeline was extracted from the DestList stream found in the Jump List for the Windows Explorer shell, as zipped archives will often be treated as if they were folders.

Another valuable aspect of this sort of timeline data is that it is very useful in the face of counter-forensics techniques, even those that may be unintentional (i.e., performed by an administrator, not to hide data, but to "clean up" the system).  Let's say that this tool had been run, and then deleted; remove all of the "FILE" entries that point to C:/tools from the above timeline, and what do you have left?  You have those artifacts that persist beyond the deletion of files and programs, and that provide clear indicators that the tools had been used.  We can apply this same sort of analysis to other situations where tools had been run (programs executed) on a system, and then steps taken to remove or hide the data.

M... [Program Execution] AppCompatCache - C:\tools\autorunsc.exe

The "M..." refers to the fact that, as pointed out by Mandiant, when the tool is run, the file modification time for the tool is recorded in the data structure within the AppCompatCache value.  The "[Program Execution]" category identifier, in this case, indicates that the CSRSS flag was set (you'll need to read Mandiant's white paper).  The existence of the application prefetch file for the tool, as well as the UserAssist entry, help illustrate that the program had been executed.

One of the unique things about the SysInternals tools is that after they were taken over by Microsoft, they began to have EULA acceptance dialogs added to them.  Now, there is a command line switch that you can use to run the CLI versions of the tools and accept the EULA, but the tools will still create their own subkey beneath the Sysinternals key in the Software path of the user's hive (NTUSER.DAT), and set the "EulaAccepted" value.  Even if the tool is renamed, these same artifacts will be left on the system.
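
A quick sweep for those artifacts in an offline copy of a user's NTUSER.DAT hive might look something like this (again, assuming the python-registry module); the script simply lists the subkeys beneath Software\Sysinternals, along with their LastWrite times and any EulaAccepted values:

# sysinternals_eula.py - list SysInternals EULA-acceptance artifacts
# A quick sketch; assumes the third-party "python-registry" module and an
# offline copy of a user's NTUSER.DAT hive.
import sys
from Registry import Registry

def list_sysinternals(ntuser_path):
    reg = Registry.Registry(ntuser_path)
    try:
        si_key = reg.open("Software\\Sysinternals")
    except Registry.RegistryKeyNotFoundException:
        print("No Sysinternals key found")
        return
    for tool in si_key.subkeys():
        accepted = ""
        for value in tool.values():
            if value.name() == "EulaAccepted":
                accepted = f"EulaAccepted = {value.value()}"
        print(f"{tool.name():20} LastWrite: {tool.timestamp()}  {accepted}")

if __name__ == "__main__":
    list_sysinternals(sys.argv[1])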

File system metadata was extracted from the acquired image using TSK fls.exe.  As such, we know that the MACB times are from the $STANDARD_INFORMATION attribute within the MFT, which are highly mutable; that is to say, easily modified to arbitrary values.  We can see from the timeline that Autoruns.zip was downloaded on 15 May, and according to the SysInternals web site, an updated version of the tool was posted on 14 May.  The files were extracted from the zipped archive, carrying with them some of their original file times, which is why we see ".A.B" times prior to the date that the archive was downloaded.  Had the file times been modified to arbitrary values (i.e., "stomped"), rather than the files being deleted, we would still see the other artifacts listed in the timeline, in that order.  In essence, we'd have a "signature" for program execution.

Other sources of data that would not appear in a timeline can include, for example, the user's MUICache key.  This key simply holds a list of values, and in a number of exams, I've found references to malware that was run on the system, even after the actual files had been removed.  Also, if the AutoRuns files had been deleted, I could parse the AutoRuns.lnk Windows shortcut file to get the path to, as well as the MA.B times for, the target file.  In order to illustrate that, what follows is the raw output of an LNK file/stream parser:

atime              Tue May 15 21:11:59 2012
basepath           C:\Users\
birth_obj_id_node  08:00:27:dd:64:d1
birth_obj_id_seq   9270
birth_obj_id_time  Tue May 15 21:09:27 2012
birth_vol_id       2C645C57D81C5047B7DDE13C2834AAD2
commonpathsuffix   john\Downloads\Autoruns.zip
ctime              Tue May 15 21:11:59 2012
filesize           535772
machineID          john-pc
mtime              Tue May 15 21:11:59 2012
netname            \\JOHN-PC\Users
new_obj_id_node    08:00:27:dd:64:d1
new_obj_id_seq     9270
new_obj_id_time    Tue May 15 21:09:27 2012
new_vol_id         2C645C57D81C5047B7DDE13C2834AAD2
relativepath       ..\..\..\..\..\Downloads\Autoruns.zip
vol_sn             F405-DAC1
vol_type           Fixed Disk

The "mtime","atime", and "ctime" values correspond to the MA.B times, respectively, of the target file, which in this case is the Autoruns.zip archive.  As such, I could either go back and add the LNK info to my timeline, or automatically have that information added during the initial process of collecting data for the timeline.  In this case, what I would expect to see would be MA.B times from both the file system and the LNK file metadata at exactly the same time.  Remember, the absence of an artifact where we expect to find one is itself an artifact, and as such, if the Autoruns.zip file system metadata was not available, that would tell me something and perhaps take my analysis in another direction.

[Note: I know you're looking at the above output and thinking, "wow, that looks like a MAC address in the output!"  You're right, it is.  In this case, looking up the OUI leads us to Cadmus Systems, and yes, the system was from a VM running in VirtualBox.  Also, there's a good deal of additional information available in the LNK file metadata, to include the fact that the target file was on a fixed disk, as opposed to a removable or network drive.]
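
If all you need are the target file's MA.B times and size, they're sitting in the fixed-size header at the front of the shortcut file; per the MS-SHLLINK layout, the creation, access, and write FILETIMEs are at offsets 28, 36, and 44, with the 32-bit target size at offset 52.  A minimal sketch (not a full LNK parser):

# lnk_times.py - pull the target file times out of a shortcut's header
# A minimal sketch, not a full LNK parser; per the MS-SHLLINK layout, the
# ShellLinkHeader holds the target's creation, access, and write FILETIMEs
# at offsets 28, 36, and 44, and the 32-bit target size at offset 52.
import struct
import sys
from datetime import datetime, timedelta, timezone

EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)

def to_dt(filetime):
    return EPOCH_1601 + timedelta(microseconds=filetime // 10) if filetime else None

def lnk_header_times(lnk_path):
    with open(lnk_path, "rb") as f:
        header = f.read(76)               # the ShellLinkHeader is 0x4C bytes
    ctime, atime, mtime, size = struct.unpack_from("<QQQI", header, 28)
    return to_dt(ctime), to_dt(atime), to_dt(mtime), size

if __name__ == "__main__":
    created, accessed, modified, size = lnk_header_times(sys.argv[1])
    print(f"B: {created}\nA: {accessed}\nM: {modified}\nsize: {size}")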

The Value of Multiple Data Sources
Regarding the value of data from multiple sources (or even additional locations within the same source), in a comment to a post regarding a RegRipper plugin that he'd written, Jason Hale points out, quite correctly:

I didn't think there was a whole lot of value in the information from the TypedURLsTime key itself (other than knowing that computer activity was occurring at that time) without correlating it with the values in TypedURLs.

Jason actually wrote more than one plugin to extract the TypedURLsTime value data (this key is specific to Windows 8 systems).  I've looked at the plugin that outputs in TLN format, for inclusion in a timeline...I use a different source identifier in the version I wrote (I use "REG", for consistency...Jason uses "NTUSER.DAT").  However, we both reached point B, albeit via different routes.  This will definitely be something I'll be including in my Windows 8 exams.
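
The correlation Jason describes is straightforward to sketch out (again assuming the python-registry module, and that each TypedURLsTime value is an eight-byte FILETIME whose name...url1, url2, and so on...mirrors the corresponding value beneath TypedURLs):

# typedurls_times.py - pair TypedURLs entries with their TypedURLsTime stamps
# A sketch only; assumes the third-party "python-registry" module, a Windows 8
# NTUSER.DAT hive, and that each TypedURLsTime value is an 8-byte FILETIME
# whose name (url1, url2, ...) mirrors the corresponding TypedURLs value.
import struct
import sys
from datetime import datetime, timedelta, timezone
from Registry import Registry

EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)
IE_PATH = "Software\\Microsoft\\Internet Explorer"

def typed_urls_with_times(ntuser_path):
    reg = Registry.Registry(ntuser_path)
    urls = {v.name(): v.value() for v in reg.open(IE_PATH + "\\TypedURLs").values()}
    times = {v.name(): v.value() for v in reg.open(IE_PATH + "\\TypedURLsTime").values()}
    for name in sorted(urls):
        raw = times.get(name)
        when = ""
        if raw and len(raw) == 8:
            (filetime,) = struct.unpack("<Q", raw)
            when = str(EPOCH_1601 + timedelta(microseconds=filetime // 10))
        yield name, urls[name], when

if __name__ == "__main__":
    for name, url, when in typed_urls_with_times(sys.argv[1]):
        print(f"{name:6} {when:25} {url}")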

Key Concepts
1. Employing multiple data sources to develop a timeline of system activity provides context, as well as increases our relative confidence in the data itself.
2. Employing multiple data sources can demonstrate program execution.
3. Employing multiple data sources can illustrate and overcome the use of counter-forensics activities, however unintentional those activities may be.