Monday, July 30, 2012

Adding Value to Timelines

Timeline analysis is valuable to an analyst, in that a timeline of system events provides context, situational awareness, and an increased relative confidence in the data with which the analyst is engaged.

We can increase the value of a timeline by adding events to that timeline, but adding events for its own sake isn't what we're particularly interested in.  Timeline analysis is a form of data reduction, and adding events to our timeline, for its own sake, is moving away from that premise.  What we want to do is add events of value, and we can do that in a couple of ways.

Categorizing Events
Individual events within a timeline, in and of themselves, can have little meaning, particularly if we're unfamiliar with those specific events.  We try to minimize the amount of information that's in an event, so that we can get as many events as possible on our screen and within our field of vision, and thereby get some context or situational awareness around that particular event.  As we see events over and over again, we develop something of an "expert" or "experience" recognition system in our minds...we recognize that some events, or groups of events, are most often associated with various system or user activities.  For example, we begin to recognize, through repetition and research, that one event (or a series of events) indicates that a USB device was connected to a system, or a program was installed, or that a user accessed a file with a particular program.  In our minds, we begin to group these events into categories.

Consider this...given the myriad of events listed in the Windows Event Log, particularly on Windows 7 and 2008 R2 systems, having the ability to map events to categories, based on event source and ID pairs, can be extremely valuable to an analyst.  An analyst can do the research regarding an event once, and then add the event source/ID pair to the event mapping file, along with an event category and a credible reference.  From that point on, the event mapping file gets used over and over again, automatically mapping event source/ID pairs to the category that the analyst identified.  If there's any question about the meaning or context of a particular event, the reference is right there and available in the event mapping file.

As an example of this event mapping, we may find through analysis and research that the event source WPD-ClassInstaller with the ID 24576 within the System Event Log refers to a successful driver installation, and as such, we might give this event a category ID of "[Driver Install]" for easy reference.  We might also then know to look for events with source UserPnp and IDs 20001 and 20003 in order to identify the USB device that was installed.  This event mapping also allows us to identify specific events of interest, events that we may want to focus on in our exams.
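The mapping described above might be sketched as a small lookup keyed on event source/ID pairs.  This is only an illustration: the CSV layout, the column names, and the function names are hypothetical, and the reference column is a placeholder.  Only the WPD-ClassInstaller/24576 pairing and its "[Driver Install]" category come from the text.

```python
import csv
import io

# Hypothetical event mapping file: source, event ID, category, reference.
# Only the WPD-ClassInstaller/24576 pairing comes from the text; the
# reference column is a placeholder for the citation the analyst records.
MAPPING_TEXT = """\
source,event_id,category,reference
WPD-ClassInstaller,24576,[Driver Install],<analyst's reference goes here>
"""


def load_event_map(text):
    """Build a {(source, event_id): (category, reference)} lookup."""
    event_map = {}
    for row in csv.DictReader(io.StringIO(text)):
        # Lower-case the source so lookups are case-insensitive.
        key = (row["source"].lower(), int(row["event_id"]))
        event_map[key] = (row["category"], row["reference"])
    return event_map


def categorize(event_map, source, event_id):
    """Return the category for an event, or an empty string if unmapped."""
    category, _reference = event_map.get((source.lower(), event_id), ("", ""))
    return category


event_map = load_event_map(MAPPING_TEXT)
print(categorize(event_map, "WPD-ClassInstaller", 24576))  # mapped pair
print(categorize(event_map, "UserPnp", 20001))             # not yet mapped
```

Because the research lives in the mapping file rather than in the code, adding a newly researched source/ID pair is a one-line change to the file.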

We can then codify this "expert system" (perhaps a better term is an "experience system") by adding category IDs to events.  One benefit of this is quicker recognition; we're no longer relying on memory, but instead adding our experience to our timeline analysis process, thereby adding value to the end result.

Note: In the above paragraph, I am not referring to adding category information to an event after the timeline has been generated.  Instead, I am suggesting that category IDs be added to events, so that they "live" with the event.

Another benefit is that by sharing this "experience system" with others, we reduce their initial cost of entry into analyzing timelines, and increase the ability of the group to recognize patterns in the data. By adding the ability to recognize patterns to the group as a whole, we then provide a greater capability for processing the overall data.

Now, some events may fit into several categories at once.  For example, the simple fact that we have an *.automaticDestinations-ms Jump List on a Windows 7 or 2008 R2 system indicates an event category of "Program Execution"; after all, the file would not exist unless an application had been executed.  Depending upon which application was launched, other event categories may also apply to the various entries found within the DestList stream of the Jump List file.  For MS Word, the various entries refer to files that had been accessed; as such, each entry might fall within a "File Access" event category.  As Jump Lists are specific to a user, events extracted from the DestList stream or from the corresponding LNK streams within the Jump List file may also fall within a more general "User Activity" event category.
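One way to picture an event that belongs to several categories at once is to let each event carry a set of category tags.  The sketch below is a hypothetical structure, not a description of any existing tool; the field names, the sample timestamps, and the file paths are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TimelineEvent:
    timestamp: int        # Unix epoch seconds (sample values only)
    source: str           # where the event came from, e.g. "JumpList"
    description: str
    categories: set = field(default_factory=set)


def in_category(events, category):
    """Return the events tagged with the given category, in original order."""
    return [event for event in events if category in event.categories]


# Invented sample data: one Jump List file and one DestList entry within it.
events = [
    TimelineEvent(
        1343606400, "JumpList",
        "Jump List file present for an application",
        {"Program Execution", "User Activity"},
    ),
    TimelineEvent(
        1343606460, "JumpList",
        "DestList entry: C:\\Users\\example\\report.docx",
        {"File Access", "User Activity"},
    ),
]

for event in in_category(events, "User Activity"):
    print(event.timestamp, event.description)
```

Filtering on "User Activity" returns both events, while filtering on "File Access" returns only the DestList entry, so the same timeline can be viewed through whichever category suits the goals of the exam.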

Incorporating Metadata
One of the things missing from the traditional approach to creating timelines is the incorporation of file metadata into the timeline itself.

Let's say that we run the TSK tool fls.exe against an image in order to get the file system metadata for files in a particular volume.  Now we have what amounts to the time stamps from the $STANDARD_INFORMATION attribute (yes, we're assuming NTFS) within the MFT.  This is clearly useful, but depending upon our goals, we can potentially make this even more useful by accessing each of the files themselves and (a) determining what metadata may be available, and (b) providing the results of filtering that metadata.

Here's an example...let's say that you're analyzing a system thought to have been infected with malware of some kind, and you've already run an AV scan or two and not found anything conclusive.  What are some of the things that you could look for beyond simply running an AV scan (or two)?  If there are multiple items that you'd look for, what's the likelihood that you'll remember all of those items, for every single case?  How long does it take you to walk through your checklist by hand, assuming you have one?  Let's take just one potential step in that checklist...say, scanning users' temporary directories.  You open the image in FTK Imager, navigate in the tree view to the appropriate directory, and you see that the user has a lot of files in their temp directory, all with .tmp extensions.  So you start accessing each file via the FTK Imager hex view and you see that some of these files appear to be executable files.  Ah, interesting.  Wouldn't it be nice to have that information in your timeline, to have something that says, "hey, this file with the .tmp extension is really an executable file!"?
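The manual hex-view check described above could be automated along the following lines.  This is a minimal sketch that assumes the files are reachable through the local file system (for example, exported from the image or read from a mounted copy); the function names are hypothetical.  It tests only for the two-byte "MZ" signature at the start of the file, which is a quick heuristic rather than full validation of the PE format.

```python
import os
import tempfile


def looks_like_executable(path):
    """Return True if the file begins with the 'MZ' signature."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"MZ"


def flag_disguised_executables(directory, extension=".tmp"):
    """List files with the given extension whose content starts with 'MZ'."""
    flagged = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if (os.path.isfile(path)
                and name.lower().endswith(extension)
                and looks_like_executable(path)):
            flagged.append(name)
    return flagged


# Demonstration against throwaway files with invented contents.
with tempfile.TemporaryDirectory() as temp_dir:
    samples = {
        "a.tmp": b"MZ\x90\x00 invented content",  # .tmp name, MZ header
        "b.tmp": b"just some text",               # .tmp name, no MZ header
        "c.exe": b"MZ\x90\x00",                   # MZ header, not a .tmp name
    }
    for name, content in samples.items():
        with open(os.path.join(temp_dir, name), "wb") as handle:
            handle.write(content)
    print(flag_disguised_executables(temp_dir))  # prints ['a.tmp']
```

The flagged names could then be attached to the corresponding file system events when the timeline is built, so that the mismatch between extension and content is visible alongside the time stamps.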

Let's say you pick a couple of those files at random, export them, and after analysis using some of your favorite tools, determine that some of them are packed or obfuscated in some way.  Wouldn't it be really cool to have this kind of information in your timeline in some way, particularly within the context that you're using for your analysis? 
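The post doesn't say which tools or checks are used to decide that a file is packed or obfuscated.  As one commonly used heuristic (my substitution, not the author's stated method), the byte-level Shannon entropy of a file can be computed and noted when it is unusually high, since compressed or encrypted content tends toward 8 bits per byte.  The 7.0 threshold and the function names below are assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Return the Shannon entropy of a bytes object, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_note(data, threshold=7.0):
    """Return a short note for the timeline, or '' if nothing stands out.

    The threshold is an assumed cutoff for illustration, not an
    established standard; high entropy is a hint, not proof of packing.
    """
    entropy = shannon_entropy(data)
    if entropy >= threshold:
        return "high entropy (%.2f); possibly packed or obfuscated" % entropy
    return ""


uniform = bytes(range(256)) * 4      # every byte value equally likely
repetitive = b"hello world " * 100   # small alphabet, low entropy

print(entropy_note(uniform))
print(repr(entropy_note(repetitive)))
```

A note like this is a pointer for where to look next; legitimately compressed files (archives, images) will also score high, so the result still needs the analyst's judgment.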

For an example of why examining the user's temporary folder might be important, take a look at Corey Harrell's latest Malware Root Cause Analysis blog post.

Benefits
Some benefits of adding these two practices to our timeline creation and analysis process are that we automate the collection and presentation of low-hanging fruit, increasing the efficiency with which we do so, and reduce the potential for mistakes (forgetting things, following the wrong path to various resources, etc.).  As such, root cause analysis becomes something that we no longer have to forgo because "it takes too long".  We can achieve that "bare metal analysis".

Summary
When creating timelines, we want to look at adding value, not volume (particularly not for volume's sake).  Yes, there is something to be said regarding the value of seeing as much activity as possible that is related to an event, particularly when external sources of information regarding certain aspects of an event may fall short in their descriptions and technical details.  Having all of the possible information may allow you to find a unique artifact that will allow you to better monitor for future activity, to find indications of the incident across your enterprise, or to increase the value of the intelligence you share with other organizations.


2 comments:

dfirfpi said...

I agree entirely with you. Categorizing Events is invaluable for the analysis. Moreover do once use many times is always a winning approach.

"Timeline analysis is a form of data reduction, and adding events to our timeline, for it's own sake, is moving away from that premise"

Probably it's a defensive approach but I prefer to create timelines with all information I can get. Then by using an entry point (a timestamp, an indicator of something) I start analyzing the timeline filtering out (thus reducing) useless entries. I fear that getting data reduction when creating timelines (and not when analyzing them) could make me missing something during events reconstruction.

More precisely I usually create a global timeline (all in) and "local" timelines (file system, log files, logged user, etc.) which I mix if there is the need.

H. Carvey said...

francesco,

"Probably it's a defensive approach..."

I have no problem whatsoever with having a reason to add events to a timeline.

For example, let's say that I have a system with three user accounts, and I know that the event in question occurred three months ago. If one of the user accounts was used to set up the system, and had not been logged into nor accessed in another way in three years, I might not want to add data from that profile to the timeline.

"...could make me missing something during events reconstruction."

I'd suggest that it's more important to look at your toolset, rather than the overall amount of data that you're adding. Are your tools capable of extracting the data that is available?

When I said that it's a data reduction technique, what I meant was, you can go from a 500GB hard drive to about 1 GB (or less) of overall data to look at.

Thanks for your comments.