Wednesday, August 17, 2011

Jump List Analysis

Every now and again, I see questions about Windows forensic analysis such as "what's new/different in Windows 7?"  There are a number of things that are different about Windows 7, some of which may significantly impact how analysts approach an examination involving Windows systems.  While some aspects of Windows systems are simply different (Windows Event Logs, Registry, etc.), others are entirely new technologies.

One of those new technologies is Jump Lists.  Windows 7 Jump Lists (see the "Jump Lists" section of this post) are a new and interesting artifact of system usage that may have significant value during forensic analysis where user activities are of interest.  Jump Lists consist primarily of two file types; the first is the *.automaticDestinations-ms (autodest) files, which are created by the operating system when the user performs certain actions, such as opening files, using the Remote Desktop Connection tool, etc.  The specific Jump List produced appears to be tied to the application associated with the file extension...if one user double-clicks a text file on one system, it may open in Notepad, whereas on another system it may open in another editor (I like UltraEdit).  The contents of these Jump Lists appear in application context menus on the TaskBar, as well as on the Start Menu.  According to Troy Larson, senior forensic dude for MS, these files follow the OLE/compound document format, with individual numbered streams following the LNK file format.  The autodest files also contain a DestList stream, which, according to research performed by Jimmy Weg, appears to be an MRU list of sorts.
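As a quick triage step, the OLE/compound document format that Troy describes can be recognized by its fixed 8-byte signature.  Here's a minimal, illustrative sketch (shown in Python for brevity...my own code is Perl) that does nothing more than check a file for that magic number:

```python
import os, tempfile

# The 8-byte magic number that opens every OLE/compound document,
# including *.automaticDestinations-ms Jump List files.
OLE_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"

def is_ole_file(path):
    """Return True if the file begins with the OLE compound document signature."""
    with open(path, "rb") as f:
        return f.read(8) == OLE_SIGNATURE

# Quick demonstration against a fabricated file; in practice you'd point
# this at the files in a user's AutomaticDestinations folder.
fd, demo = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(OLE_SIGNATURE + b"\x00" * 504)   # a bare, fabricated 512-byte "header"
print(is_ole_file(demo))   # prints: True
os.remove(demo)
```

This won't tell you whether the streams inside are intact (more on that in the "Issues" section), but it's a cheap first check.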

There are tools available to view the contents of the autodest files.  For example, you can use MiTeC's SSViewer to open the files and see the various streams.  From here, you would then need to save the numbered streams and use an LNK file viewer to see the contents of the streams.  There's also Mark Woan's JumpLister, which allows you to view the contents of the numbered streams right there in the tool, automatically parsing the LNK formats.  Chris Brown also added this capability to ProDiscover, including a Jump List Viewer in the tool that parses the contents of the numbered streams.

There are also custom Jump Lists, *.customDestinations-ms (customdest) files, which are created when a user "pins" a file to an application, such as via the TaskBar.  Per Troy, these files appear to consist of stacked segments (not in an OLE container) that are LNK file formats.

Both types of files have names that begin with a series of hex characters that make up the application identifier, or AppID.  This is an identifier that refers to the specific application that the user was using.  While I've found some short lists of AppID references, I haven't yet found a comprehensive list.  Most of what I have found refers to "fixing" Jump Lists by deleting the appropriate files and starting over.
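As an aside, since the AppID is simply the leading portion of the Jump List file name, pulling it out programmatically is trivial; a quick illustrative sketch (Python for brevity):

```python
def appid_from_name(filename):
    """The AppID is the hex string that leads a Jump List file's name,
    e.g. '1bc392b8e104a00e.automaticDestinations-ms'."""
    return filename.split(".")[0]

print(appid_from_name("1bc392b8e104a00e.automaticDestinations-ms"))
# prints: 1bc392b8e104a00e
```

From there, the extracted AppID can be looked up against whatever reference list you've been able to assemble.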

Addendum: Mark McKinnon recently updated the ForensicsWiki page for Jump List IDs.

In an effort to develop a better understanding of the autodest files, I began digging into the Jump List file structure, and wrote some Perl code that parses the *.automaticDestinations-ms (autodest) Jump List files on a binary level.  This parsing capability consists of two Perl modules; the first parses the autodest Jump List files (maintained in MS OLE/Compound File format) and the DestList stream within those files.  The second module parses the numbered streams, which are maintained in the Windows shortcut/LNK file format.  By combining these two modules, I'm able to parse the autodest Jump List files, correlate the DestList stream entries to the numbered streams, and present the available information in any format (TLN, CSV, XML, etc.) I choose.

So far, this is the only tool that I'm aware of that parses the DestList streams.  I had done some research into the format, and it appears that I was able to figure out at least part of the structure of these streams.  I've also found that various applications maintain different information within the contents of the streams...some maintain file names, others maintain string identifiers that appear to be used much like a GUID.  One thing of interest, and perhaps of significant value to an analyst, is that there's a FILETIME object embedded within each structure, and based on Jimmy Weg's research and input, this appears to be an MRU time.  Each individual structure within the DestList stream has a number that is associated with a numbered stream, so the information can easily be correlated to develop a complete picture of what the Jump List contains.
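For reference, the FILETIME objects found in these structures are standard 64-bit Windows timestamps...a count of 100-nanosecond intervals since midnight, 1 Jan 1601 (UTC).  A minimal sketch of the conversion (Python for brevity; my Perl code does the equivalent with unpack() and some arithmetic):

```python
from datetime import datetime, timedelta, timezone

def filetime_to_utc(ft):
    """Convert a 64-bit Windows FILETIME (100-ns ticks since 1601-01-01 UTC)
    to a datetime, so a DestList 'MRU time' can be read by humans."""
    epoch_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return epoch_1601 + timedelta(microseconds=ft // 10)

# A FILETIME of 0 is the 1601 epoch itself
print(filetime_to_utc(0).isoformat())   # prints: 1601-01-01T00:00:00+00:00
```

Note that FILETIME values are already UTC, so no time zone translation is needed before dropping them into a timeline.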

Here's an interesting example of how the information in Jump Lists can be useful; when a user uses the Remote Desktop Connection tool, the "1bc392b8e104a00e.automaticDestinations-ms" Jump List file is created.  The DestList stream of the Jump List file contains the "MRU Time" for each connection, as well as an identifier string.  However, we can correlate each DestList entry to the corresponding numbered stream within the Jump List file, which is itself maintained in the Windows shortcut/LNK file format; as such, we can extract information such as the basename and command line (if it exists) of the shortcut.  If we combine the two, this would appear as:

C:\Windows\System32\mstsc.exe /v:"10.1.1.23"

The information that is available depends upon how the connection was made; for example, rather than an IP address, the command line element of the LNK stream may contain a system name.  However, what we do have is an action associated with a specific user, that occurred at a specific time.  As this is a Windows 7 system, we may also be able to find additional, historic MRU data in Jump Lists accessed via Volume Shadow Copies.

The code is Perl-based and doesn't use any proprietary or platform-specific modules; while it does make heavy use of seek(), read(), substr(), and unpack(), all of these functions are available in all versions of Perl.  Ideally, this code should run on Windows, Linux, and Mac systems equally well (I don't have a Mac to use for testing).

I opted to create Perl modules for this capability because it is a much more flexible method that allows me to incorporate it into other tools.  For example, I can incorporate the modules into a Perl script (which I have done) that will parse through either individual autodest Jump List files or all such files found in a directory, and list the information they contain in any manner that I choose.  Or, I can write a ProDiscover ProScript.  Or, I can (will) include this in my forensic scanner.  Or, to paraphrase Beyonce, "If you like it then you better put a GUI on it!"

Output formats are also a matter of personal choice now.  I'm focusing on TLN and CSV formats for the time being, but there's nothing that restricts me to these formats; XML is a possibility (I simply don't have a style sheet format in mind, so I may not pursue this output format).
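For those not familiar with it, a TLN record is just a pipe-delimited line with five fields...time (as a Unix epoch value), source, host, user, and description.  Here's an illustrative sketch of emitting a single Jump List event in both TLN and CSV (Python for brevity; the host and user values are made up):

```python
import csv, io

def to_tln(epoch, source, host, user, desc):
    """Build one five-field TLN record: Time|Source|Host|User|Description."""
    return "|".join([str(epoch), source, host, user, desc])

# A hypothetical DestList entry, already reduced to a Unix epoch time
event = (1313625600, "JumpList", "WORKSTATION1", "harlan",
         "RDP connection - mstsc.exe /v:10.1.1.23")

print(to_tln(*event))

# CSV output of the same event is just as simple
buf = io.StringIO()
csv.writer(buf).writerow(event)
print(buf.getvalue().strip())
```

Because the parsing modules hand back the data itself rather than a formatted report, switching between output formats is just a matter of swapping the last few lines.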

Issues
Jump Lists are fairly new...although Windows 7 has been out for a while now, I haven't seen a great deal of discussion or questions in public forums or lists looking for more information about these artifacts.  However, some issues have already come up.  For example, I was contacted recently by someone who indicated that one of the available tools for parsing Jump Lists "didn't work".  Initial correspondence indicated that at least one Jump List may have been recovered from unallocated space, but it turned out that the three "problem" Jump Lists were from a live acquisition image, and the applications in question could have been open on the desktop during the acquisition.

This presents an interesting and valid issue...how do you deal with Jump Lists from live acquisition images, where the apps were open during the acquisition (live acquisition may be required for a number of reasons, such as whole disk encryption, etc.)?  Or, what about Jump Lists carved from unallocated space?

The answer is that you need to understand the binary format of the Jump Lists (or know someone who does), because that's really the only way to resolve these issues.  When a tool "doesn't work", you need to either have the understanding of the formats to troubleshoot the issue yourself, or go to the tool author for assistance, or go to another resource for that assistance.  If you're squeamish about sharing information about the issue, or the "problem" Jump List file, even with confidentiality agreements in place, then you're really limiting yourself, and by extension, your analysis.  However, this applies to every facet of an examination (Registry, Event Log, USB device analysis, etc.), not just Jump Lists.  So, the answer is to develop the capability internally, or develop trusted resources that you can reach out to for assistance.

Summary
From an analyst's perspective, Jump Lists are a new technology and artifact that need to be better understood.  However, at this point, we have considerable information that clearly indicates that these artifacts have value and should be parsed, and the embedded information included in timelines for analysis.  In many ways, Jump Lists contain analytic attributes similar to the Registry and also to Prefetch files, and are tied to specific user actions.  Further research is required, but it appears at this point that Jump Lists also represent a persistent artifact that remains after files and applications are deleted.  In one test, I installed iTunes 10 on my system, and listened to two CyberSpeak podcasts via iTunes.  The Jump Lists persisted even after I removed the application from my system.

Resources
Code Project: Jump Lists
AppID list 1
ForensicsWiki Jump Lists page

Sunday, August 14, 2011

Updates and Links

ECSAP
I had a great time speaking on timeline analysis at an event last week...it was a great opportunity to get out in front of some folks and talk about this very important and valuable analysis technique.  My impression was that most of the folks in the room hadn't really done this sort of analysis beyond perhaps entering some interpreted times and data into a spreadsheet.

One take-away for me from the conference speaking is that people like to get free stuff.  In this case, I had one DiskLabs Find Evidence keyboard key left, and as I tend to do with conferences where I speak, I also gave away copies of my books...I gave away one copy of DFwOST and one of WRF (both of which were signed).  I hope that the continuing promises of free stuff kept folks coming back into the room from breaks... 

Along those lines, something I would offer back to conference attendees is that speakers are people just like you, and they like to get free stuff, too...in particular, feedback.  Did what they say make sense?  Was the presentation material something that you feel you can use?  A simple "yes" doesn't really constitute feedback.  Some of us (not just me) have also written books or tools, which we may refer to...and getting feedback on those is always a bonus.  But again..."cool" isn't really "feedback".

I can't speak for every presenter, but I value honest, considered feedback, even if it's negative, over a positive albeit empty statement.  If what I talked about simply isn't useful, please...let me know.  If it's too easy or too hard to understand...let me know.  I think that most folks who present would welcome some honest feedback on what they covered.

Investigation Plans
Chris posted recently on the need to develop an investigation plan prior to doing any analysis.  Chris even outlines an exercise to clarify that, and to keep it firmly planted in your mind while you conduct your analysis.  I tend to do something very similar...I copy what I'm supposed to do (the goals) from the statement of work or contract to the top of the MSWord document I use for case notes, usually right below my description of the exhibits I've received.  From there, I also write a description of my initial approach to the analysis...keyword searches I may want to run, as well as anything of note that I may have available, such as a time frame to work with (i.e., "online banking fraud was found to have begun on 20 March, so begin timeline analysis of the system prior to that date...").

Analysis plans are not set in stone...they are not rigid scripts that you need to follow lock-step, beginning to end.  We all know that no plan survives first contact...the idea of an analysis plan is to get us started and keep us focused on the end goal, what we hope to achieve and what question(s) we need to answer for our customer.  Too many times, we won't have a plan and we'll find something "interesting" and begin running ourselves down that rabbit hole, and by the time we take a breath to look around, the original goals of the exam are nowhere in sight, but we've consumed considerable time getting there.

Timelines
Having presented recently on the topic of timelines, and working on some code to more fully exploit Windows 7 Jump Lists as forensic resources, the creation and use of timelines have been on my mind a lot recently.  I prepared and delivered a 2-day, hands-on course in timelines in June, and recently (ECSAP) presented a 2-hr compressed version of the class...which really doesn't do the subject justice (next time I'll push for at least a 4 hr slot).  One of the things I've been thinking about is how useful timelines can be from both an investigative and an analytic perspective, and how a timeline can be used to answer a wide range of questions.

One example involves artifacts from the use of USB devices on Windows systems; I've seen a number of questions in forums and lists in which the original poster (OP) identifies an anomaly that is either interesting, or needs to be explained as part of the examination...something appears odd with respect to the observed artifacts.  Often the question is, "what could have caused this?", and the answer may be found by developing a timeline of system activity, and identifying surrounding events and context to the observed artifacts.

Malware
Here's a great write-up on the Malware FreakShow 3 presentation provided by two TrustWave SpiderLabs researchers at DefCon19.  The presentation addresses malware found on Windows-based point-of-sale ("POS"...take that any way you like...) devices.

Tools
Need to find Facebook artifacts?  Take a look over on the TrustedSignal blog...there's a post indicating that a Python script has been updated and is available.  You never know when you're going to need something like this...

Resources and Links
Ken's (re)view of GFIRST...what I really like about this post is the amount of his perspective that Ken encapsulates in his post, giving his views and insights of what he saw and experienced.  Too many times in the community when someone talks about an event they attended or a book they read, the review is very mechanical..."the speaker talked about..." or "the book contains eight chapters; chapter 1 covers...".  I think this is odd, in a way, because when I talk to folks about what they want to see in a presentation or book, very often what they're looking for is the author's insights...so, in a way, this sort of goes back to what I was saying in the ECSAP section of this post.

Here's an interesting post regarding not just a trick used by malware to confuse a potential victim, but the post also describes the use of MoonSol's DumpIt and the Volatility Framework.

The DFS guys posted their materials from GFIRST and OMFW...thanks to Andrew and Golden for doing that.  There are a couple of great slide decks available; if you want to see how they investigated a data exfil incident (speaking of analysis plans), take a look at their slide pack from GFIRST...it's like reading their case notes from an exam.  You'll have to excuse a misspelling or two (slide 19 mentions the setup.api log file; spelled correctly on slide 40), but for the most part, their examination of the USB history, et al, from an XP system is a very good view into an actual investigation, and well worth writing into a process or checklist.

A couple of thoughts from the presentation:
slide 38 - instead of writing a "wrapper script" to import the information into Excel, it might be easier to modify the usbstor.pl plugin (use of a "wrapper script" mentioned again in slide 79)
slide 40 - the LastWrite times of the USBStor subkeys are not used to determine the last time the devices were plugged into the system; this is further indicated by the USB Analysis Process illustrated in slide 43
slide 90 - path should read HKLM\System\CurrentControlSet\Services\lanmanserver\Shares

Speaking of OMFW, gleeda posted to her blog and included links to her, MHL, and Moyix's slides.

DFwOST
Richard Bejtlich posted his impressions (not a full-on review) of DFwOST.  Thanks for the vote of confidence on a second edition, Richard...also, I do agree with what he mentioned with respect to the images, but as with WRF, there's not a lot that the authors can do about what the publisher or printer does with images.

Monday, August 08, 2011

Links and Updates

Working Remotely
Thanks to a tweet from Richard Bejtlich, I ran across this very interesting post titled, "Working Remotely".  The post makes a great deal of sense to me, as I joined ISS (now part of IBM) in Feb, 2006, and that's how we rolled at the time.  My boss lived about 2 miles from me, and there was an office in Herndon (with a REALLY great, albeit unused, classroom facility), but we had team members all over...Atlanta, Kansas City, Norfolk, and then as we expanded, Chicago, Corpus Christi, and Tulsa.  We lived near airports (our job was to fly out to perform emergency incident response), and FedEx (or insert your favorite shipment vendor) rounded out our "offices".

Even when we weren't flying, many of us were constantly in touch...so much so that when one person needed assistance with an engagement, it was easy for us to provide whatever support was needed.  Encryption made it very easy to send data for analysis, or for someone to provide insight to, or to write a script to parse a much wider sample of data.  Imagine being on an engagement and needing support...so you send someone a sample of data, and when you wake up, there's a parsing tool in your inbox. 

Something that the article points out is that it takes a certain kind of person to work remotely, and that's very true...but when you find them, you need to do everything you can to not just keep them, but grow them.  The article also points out that if you want the best of the best, don't restrict yourself to your local area, or to those who are willing to relocate.  And in today's age, remote communication is relatively easy...if you don't want to bring everyone together once a year (more or less) due to the cost of gas and air fare, Skype and a $20 web cam can do a LOT!

Jump Lists
Jimmy Weg has done some testing of Windows 7 Jump Lists (and shared his findings on the Win4n6 group list), and found (thus far) that the DestList stream structure within the Automatic Destination (autodest) Jump List does appear to be an MRU of sorts.  In his testing using Notepad to open text files, the FILETIME object written to the structure for each file correlated to when he opened the files.

When testing Windows Media Player, Jimmy found that there were no MRU entries for the application in the user's Registry hive, nor were any Windows shortcuts/LNK files created in the user's Recent folder.  Jimmy also found that applications such as OpenOffice (AppID: 1b4dd67f29cb1962) created Jump Lists, as well. 

Jimmy mentions Mark Woan's JumpLister application in his post for viewing the numbered stream information found within the autodest Jump Lists; this is a very good tool, as is the MiTeC Structured Storage Viewer (SSView), although SSView doesn't parse the contents of each stream.  I like to use SSView at this point, although I have written Perl code that will parse the "autodest" Jump List files (those ending in "*.automaticDestinations-ms"), as they are based on the MS OLE format, and each numbered stream is based on the LNK file format.  I have also written code for parsing the DestList stream structure, and thanks to Jimmy's testing, the validity and usefulness of that code is beginning to come to light.  My hope is that by having shared what I've found with respect to the DestList structure thus far, others will continue the research, identify other structure elements that can be of value to an analyst, and share that information.  I've also found some deprecation issues with Perl 5.12, with respect to some of the current Perl modules that handle parsing OLE documents; as such, I've taken a look at the MS documentation on the compound document binary specification, and I'm working on writing a platform-independent Jump List parser.

Troy Larson, senior forensic analyst at Microsoft, added that the DestList stream entries are either an MRU or MFU (most frequently used) list, depending upon the application, and that the order of activities in the DestList stream is reflected when you right-click on a pinned application (to the TaskBar).  The order of items in the DestList stream is apparently determined by how recently/frequently the activity (document opened, etc.) is performed.  Troy went on to mention that as of Windows 7, other methods of tracking files have been deprecated in favor of the API used to create Jump Lists.

CyberSpeak
Ovie's posted a new CyberSpeak podcast, this one addressing the launch of CDFS, which I mentioned in my last blog post.  If you have any questions about this organization, I'd recommend that you download the podcast, and give it a listen. Ovie interviews Det. Cindy Murphy, who's been a member of LE since 1985, and invited me to WACCI last year.

If you want to learn more about CDFS, give this podcast a listen.

Ovie, it's good to have you back, my friend.

Hostile Forensics
Mark Lachniet has released a whitepaper through the SANS Forensics blog site titled, "Hostile Forensics". This is the name given to "penetration-based forensics", in which the forensic analyst uses penetration techniques in order to gain access to a computer system in order to further exploit that system through forensic analysis techniques.

The PDF whitepaper, currently in version 1.0, is available online here.  The paper is 43 pages long, but if this is something that you're interested in, it's well worth the time it takes to read it.  Mark lays out the structure for his proposal, which he states is the result of a "thought experiment". 

Tools

It looks as if x0ner has released PDF X-RAY, an API for static analysis of PDF documents for malicious code.


On a similar note, Cuckoo is a freely available sandbox for analyzing PDF files and malware that runs in VirtualBox.  Cuckoo has its own web site, as well.  If you're performing malware analysis, this may be something that you'd like to take a look at, along with Yara.  These are all great examples of the use of open-source and free tools for solving problems. 

Friday, August 05, 2011

Friday Updates

Meetup
This past Wed was a great NoVA Forensics Meetup, thanks to Sam Pena's efforts in putting the presentation together.  Sam put the effort into pulling together some information about the background and exploits of LulzSec and Anonymous, and then put forth some great questions for discussion.  After the background material slides, we moved the chairs into a circle and carried on from there!  A great big thanks to Sam for stepping up and giving the presentation, and for everyone who attended.  Also, thanks to ReverseSpace and Richard Harman for hosting.

Next month's meeting will feature a presentation on botnets from Mitch Harris, and I've already received two offers for presentations on mobile devices, so stay tuned!

For those interested in attending, here's the FAQ:
- Anyone can attend...you don't need to be part of an organization or anything like that
- There are no fees
- We meet the first Wed of each month, starting at 7pm, at the ReverseSpace location; if you need more information, please see the NoVA Forensics Meetup page off of this blog

CDFS
"CDFS" stands for the "Consortium of Digital Forensics Specialists", a group dedicated to serving the DF community and providing leadership to guide the future of the profession.  Find out more about the focus and goals of the group by checking out the FAQ.  Also, see Eric Huber's commentary, as well (Eric's on the board).

Eric went on to describe the organization recently on G+:
CDFS isn't another organization offering certification, training, conferences and the like. It's an attempt by the various organizations and individuals to essentially act as a trade organization for the industry.

If you're like me and looking around the site, you're probably wondering, okay, I can become a member for $75 (for an individual) a year, but what does that get me?  Well, apparently, there are efforts afoot to yoke our profession with licensing...I say "yoke" because it sounds as if the licensing is being pursued without a great deal of involvement from our community, sort of like "taxation without representation".  Like 99.9999% of the community, I'm sure I have no idea what's going on in that regard, but as I think about it, I do think that I'd like to have a vote in how that goes.  I'm not sure that I want to sit back and wait for someone else to make that decision for me, and then follow along (or not) with whatever licensing requirements are put in place, however arbitrarily. 

If you're curious about how you can be involved as a member, I'm sure that the Objectives page offers some insight as to where efforts will likely be directed.

OMFW
The 2011 OMFW was held recently, ahead of the DFRWS conference in New Orleans.  I had the great fortune of attending the original OMFW in 2008, and from what I hear, this one was just as good if not better.  OMFW pulls together the leaders in memory analysis, and brings them together in one place.  I can't speak to the format of this year's workshop, but if it was anything like the one in 2008, I'm sure that it was fast-paced and full of great information.

Speaking of information, MHL's presentation information (and Prezi) can be accessed here (ignore the publication date of the blog post), and Moyix's presentation can be found here.

Gleeda has graciously made her slides available, as well...she covered timelines, the Registry, Volatility and memory analysis all in one presentation!  What's not to love about that!

Let's not forget that Volatility 2.0 is now available (and Rob has added it to the recently updated SIFT appliance).

Tools
Ever been looking for malware in an image, only to find Symantec AV logs indicating that the malware had been detected and quarantined?  Well, check out the Security Braindump blog post on carving the Symantec VBN files.  Based on what BugBear has provided in the post, it should be pretty straightforward for anyone with a modicum of coding skill to write a decoder for this, if it's something that they need.

If you do any work at all with network traffic captures (i.e., capturing data, analyzing that data, analyzing data captured by others, etc.), then you must be sure to look at NetworkMiner.  Along with Wireshark, this is a very valuable (and free) component to your network traffic analysis arsenal. 

PFIC
I've mentioned before that I'll be speaking at PFIC 2011, along with Chad Tilbury.  It turns out that not only will I be speaking, I'll also be giving a lab, as well.  My talk will be on "Scanning for Low-hanging Fruit during an Investigation", and my lab will be "Intro To Windows Forensics", which will be geared toward first responders.  I'm really looking forward to this opportunity to engage with other practitioners from across the DFIR spectrum...I had a great time at PFIC last year, and had a great dinner one night thanks to Chad.

Timelines
I'm sure that at one point during the conference, the topic of timelines will come up (BTW...I'm doing a lecture/demo next week on timelines).  I think that understanding the "why" and "how" of creating timelines is very important for any analyst or examiner, in part because I have seen a number of exams where malware on the system has taken a number of steps to avoid detection and to foil the responder's investigation.  For example, file names and Registry keys are created with random names, file MAC times ($STANDARD_INFORMATION attribute in the MFT) are "stomped", and there are even indications that the malware attempted to "clean up" its activity by deleting files.  In most cases, on-board AV never detected the infection, although in a few instances the AV alerted on files being executed from a temp directory (but there was only a detection event; no action was taken) rather than detecting the malware based on a file signature.  In all cases, the AV was up-to-date at the time of infection, although MRT wasn't.  Often, the malware itself isn't detected when the analyst mounts and scans the image; rather, a secondary or tertiary file is detected instead.

In every case, a timeline allowed the analyst to "see" a number of related events grouped together, and based on the types of events, evaluate the relative level of confidence and context of that data and determine what is missing.  For example, finding a Prefetch file for an executable, or a reference to an oddly-named file in a Registry autostart location often leads the analyst to ask, "what's missing?" and go looking for it.
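One simple way to "see" the $STANDARD_INFORMATION time stomping mentioned above is to compare the $SI times against the $FILE_NAME attribute times for the same MFT record; $SI times that predate the $FN creation time are a classic red flag.  A simplified, illustrative sketch (Python; the timestamp values below are made up, not parsed from a real MFT):

```python
def si_predates_fn(si_times, fn_created):
    """Flag an MFT record whose $STANDARD_INFORMATION times fall before the
    $FILE_NAME creation time -- a common indicator of time stomping."""
    return any(t < fn_created for t in si_times.values())

# Illustrative Unix-epoch values for a suspect file
si = {"modified": 1041379200, "accessed": 1041379200, "created": 1041379200}  # 2003
fn_created = 1313625600  # 2011...when the file actually landed on the volume
print(si_predates_fn(si, fn_created))   # prints: True -> worth a closer look
```

This check alone isn't conclusive (there are legitimate reasons for such differences), but in a timeline it's exactly the sort of anomaly that prompts the "what's missing?" question.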

Tuesday, August 02, 2011

Updates and Links

Meetup
Just a reminder to everyone who wasn't able to make it to any of the big conferences going on in New Orleans or Las Vegas this week (or if you returned in time)...the NoVA Forensics Meetup for Aug 2011 will be Wed, 3 Aug, starting at 7pm. 

Be sure to check out the NoVA Forensics Meetup page to see what's going on.

Remember, anyone can come, and you don't need to be part of a group or anything.  There are no fees or anything like that.

All Things Open Source
Sergio Hernando posted some Perl code for performing Chrome forensics, specifically processing the history file via Perl.  For me, it's not so much that Sergio wrote this in Perl, because I can follow instructions and get Python or whatever else installed...no, what I like about this is that not only did Sergio take the time to explain what he was doing, but he shows it through an open-source mechanism.

I really like solutions to DFIR problems that use free or open-source tools, because in most cases, they also don't add so many layers of abstraction that ultimately, all you really know that went on was, "I pushed a button."  Solutions such as what Sergio has provided give us more than just that abstract view into what was done...in this case, it's more along the lines of "...I accessed this SQLite database because it contained this information, and this is what was found/determined, in the context of this other data over here...".

The script can be found at Sergio's Google Code site.

Also, be sure to take a look at Sergio's blog post on using Perl to parse the Firefox Download Manager database.
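Keep in mind that Chrome's history file is itself a SQLite database, and its timestamps are a count of microseconds since 1 Jan 1601 (UTC)...the same epoch as a FILETIME, at microsecond rather than 100-nanosecond granularity.  Here's an illustrative sketch of the same basic idea that Sergio implements in Perl (shown here with Python's bundled sqlite3 module for brevity; the sample row is fabricated):

```python
import sqlite3
from datetime import datetime, timedelta, timezone

def webkit_to_utc(us):
    """Chrome/WebKit timestamps are microseconds since 1601-01-01 UTC."""
    return datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=us)

# Build a tiny stand-in for Chrome's History database; in practice you'd
# open a copy of the user's History file instead of :memory:
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE urls (url TEXT, title TEXT, last_visit_time INTEGER)")
db.execute("INSERT INTO urls VALUES (?, ?, ?)",
           ("http://windowsir.blogspot.com/", "Windows Incident Response",
            12958012800000000))  # fabricated visit time: 2011-08-17 00:00:00 UTC

for url, title, ts in db.execute("SELECT url, title, last_visit_time FROM urls"):
    print(webkit_to_utc(ts).strftime("%Y-%m-%d %H:%M:%S"), url)
```

Once the timestamps are normalized this way, browser history drops straight into a timeline alongside everything else.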

Techniques
For those of you who weren't able to make it to any of the conferences going on about this time of the year (OMFW/DFRWS, BlackHat, etc.), a look out across the landscape of presentations shows that there were definitely some very interesting topics and titles.  While actually being at the conference affords you the opportunity to experience the flavor of the moment, and to mingle with others in the community, many of the conferences do provide copies of the presentations afterward, and there's always supporting information available from additional sources.

For example, take this presentation on document exploitation attacks...this sounds like a very interesting presentation.  However, there's also some other information available, as well...for example, take a look at this post from the Cisco Security blog; I found this to be a very interesting open-source solution for extracting EXEs from (in this case, MS Word) documents.  Let's also not forget that Didier Stevens has done considerable work on detecting and extracting suspicious elements from PDF documents.

RegRipper
Speaking of open source and techniques, Corey Harrell put together a great post on how he uses RegRipper to gather information about the operating system he's analyzing.  This is a great use of the tool, and another great example of how an analyst can use the tools that are available to get the job done.

Volatility
For those of you who may not have known, the Open Memory Forensic Workshop (OMFW) was held recently, just prior to DFRWS in New Orleans.  Perhaps one of the most exciting things to come out of the conference (for those of us who couldn't attend) is Volatility 2.0!  If you notice, under Downloads, there's a standalone Win32 executable available.

Volatility is one of the best of the open source projects out there.  Not only is the framework absolutely amazing, providing the capability to analyze Windows physical memory in ways that aren't available anywhere else, but it's also a shining example of how a small community of dedicated folks can come together and make this into the project that it is.  If you have any questions at all, start by checking out the Wiki, and if you do use this framework, consider contributing back to the project.

Thursday, July 28, 2011

WFA 3/e

I've mentioned a couple of times, in this blog as well as in a couple of lists, that I'm working on completing Windows Forensic Analysis 3/e, and I thought it would be a good time to give a little bit of information regarding the book. 

First off, while the title includes "third edition", this edition is not one where if you purchased the second edition, you're out of luck.  Rather, the third edition is a companion book to the second edition, so you'll want to have both of them on your shelf (or Kindle).  Where a great deal of 2/e was focused on Windows XP, in 3/e I'm focusing primarily on Windows 7.

The third edition has 8 chapters, as follows:

1. Analysis Concepts - Seeing comments from those who've read DFwOST thus far, and seeing the mileage that Chris Pogue is getting from his Sniper Forensics presentations, it appears that there are a lot of analysts out there who like to hear about the concepts that drive analysis.  It's one thing to tell someone to do timeline analysis and talk about how; I think that it's something else entirely to discuss why we do timeline analysis, as that's the difference between an analyst who creates a timeline, and one who has a reason, justification and analysis goal for creating a timeline.

2.  Live Response - With this chapter, I wanted to take something of a different approach; rather than writing yet another chapter that gives consultants hints on doing IR, I wanted to provide some thoughts as to how organizations can better prepare for those inevitable DFIR activities.  If 2011 thus far hasn't been enough of an example, maybe it's worth saying again...it's not a matter of if your organization will face a compromise, but when.  I would take that a step further and suggest that if you don't have visibility into your systems and infrastructure, you may have already been compromised.  As a consultant, the biggest issue I've seen during IR is the level of preparedness...there's a huge difference between companies that accept that incidents will occur and take steps to prepare, and those who have a "not me" culture/attitude; the latter usually end up paying out much more, in terms of fees, fines, and court costs.  This is something consultants talk about with their customers, but it's a whole new world when you actually see it in action.

3.  Volume Shadow Copies - This chapter is somewhat self-explanatory.  I was doing some research that involved accessing VSCs, and found that pretty much the only way to do what I wanted required significant resources (ie, $$).  What I did with this chapter is show how VSCs can be accessed within an acquired image without using expensive solutions, as well as provide some insight into how accessing the VSCs can really provide some very valuable information to an analyst.

4.  File Analysis - This chapter is very similar to the corresponding chapter in WFA 2/e, but focuses on some of the files you're likely to see on Windows 7 systems.  I also reference some of the files that you'll find on Windows XP systems, but are different on Windows 7 systems (ie, format, content).  I cover Jump Lists in this chapter, and not just the LNK-like streams, but also the DestList streams (which appear to be some sort of MRU listing for shortcuts).

5.  Registry Analysis - I know, a lot's been said about Registry analysis, particularly in WRF, but this time, instead of doing a break down of what can be found in the Registry, on a hive-by-hive basis, I'm taking more of a solutions-based approach. For example, I see a LOT of folks in the forums and lists who don't understand the role that the USBStor subkeys play in USB device analysis, so what I've done is take the more common analysis processes (that I see, based on questions asked in lists...) and I'm trying to provide solutions across all hives.

One of the things I face with writing chapters such as this one is that folks will say things like, "I want to know about blah...", and very often, there's already information out there on the subject.  One great example is the Registry ShellBags...Chad Tilbury recently posted on this topic to the SANS Forensic Blog, so given that, I have to wonder, "what do you want to know?" and "how much are you willing to support the effort to present/share that topic?"  Now, by "support", I mean through such efforts as providing example hives, or just taking a few minutes to elaborate on your thoughts or questions.

6. Malware Detection - I have had a good number of "here's a hard drive that we think is infected with malware..." exams, and given that there are a number of folks out there who likely get similar cases (LE gets CP cases that evolve into the "Trojan Defense", etc.), I wanted to put together a good resource to help address this issue.  This is not a malware analysis chapter...MHL et al did a fantastic job with this topic in the Malware Analyst's Cookbook, and I'm not about to try to parrot what they've done.  Instead, this chapter addresses the topic of detecting malware within an acquired image, and I even provide a checklist of steps you can use.

Note: Many of the tools mentioned in the book are available online, and those items that are not specifically available now (the malware detection checklist, etc.) will be provided online, as well.  I really don't like the idea of providing a DVD with the book, because there are simply too many issues with getting the materials to people who purchase only the ebook, or leave their DVD at home when they go to work...

7.  Timeline Analysis - In this chapter, I not only present how to create a timeline, but I also discuss the concepts behind why we'd want to create a timeline, as well as some of the uses of timelines folks may not be too familiar with.  I presented these concepts and use case scenarios during a course I taught recently, and they seemed to be very well received.

8.  Application Analysis - Another class of question I see a lot of in the lists has to do with application artifacts; when you think about it, there isn't too terribly much difference between some classes of dynamic malware analysis, and what you'd do to analyze an application for artifacts.

Now, there are some things that I don't cover in the book, in part because they're covered or addressed through other media or resources.  One example is memory analysis...there are a number of resources already available that cover how to capture physical memory, as well as perform analysis of a Windows memory dump, using the freely available tools. 

I wanted to provide something of a preview, because I do get a lot of those, "...does it cover...??" questions, most often from people at conferences, who are holding a copy of the book while they're asking the question.  The simple fact is that no book can cover everything, and it's especially difficult when analysts don't communicate their needs or desires beforehand.  I've done the best I can to collect up those sorts of things from lists, forums, as well as people I've talked to at conferences...but I know that the question is still going to come up, even after the book is printed.

One thing I would like to add is that, as with my other books, the focus is almost exclusively on free and open source tools to get the job done.  Like I said earlier, many of the tools are already available online, and those other items I've developed and mentioned in the book will be posted to the web when the book goes final.

From the lists and forums, I see a lot of questions regarding Windows 7, specifically, "What has changed from Windows XP?"  Truthfully, this is the WRONG question to ask, albeit a popular one.  But if you really want to know the answer from an analyst's perspective, WFA 3/e goes final (manuscript submitted) in October, so it should be available around the beginning of 2012.

HTH

Monday, July 25, 2011

Updates

WRF Review
Andrew Hay posted a glowing review of Windows Registry Forensics on Amazon recently.  I greatly appreciate those who have purchased the book and taken the time to read it, and I especially appreciate those who have taken the time to write a review.

DFwOST
Speaking of books, it looks as if DFwOST has been picked up as a text book!  Pretty cool, eh?  I certainly hope Cory's proud of this...this is a great testament to the efforts that he put into the book, as he was the lead on this...I was just along for the ride.

One of the interesting things about this is that I've heard that other courses may be picking this book up as a resource, in part due to the focus on open source...many of the digital forensics courses out there are held at community colleges that simply cannot afford to purchase any of the commercial forensic analysis applications.  Also, I do appreciate the "tool monkey" comment from the blog post linked above...let's start folks out with an understanding of what's going on under the hood, and progress from there.  The age of Nintendo forensics is over, folks!

If that's the case for you, either as an instructor or individual practitioner, consider my other books, as well...I focus on free and open source tools almost exclusively, because...well...I simply don't have access to the commercial tools.

NoVA Forensics Meetup
Just a reminder...our next meetup is Wed, 3 Aug, starting at 7pm.  One of our members who attended our last meetup has offered to facilitate a discussion regarding some recent cyber activity and how it affects what we do.  I'm really looking forward to this, as I think that it's a great way for everyone to engage.

For location information, be sure to check out the NoVA Forensics Meetup page on the right-hand side of this blog.

PFIC
The agenda for PFIC 2011 has been posted, and I'll be presenting on Tuesday afternoon.  My presentation will be (hopefully) taking the "Extending RegRipper" presentation a bit further.  It works as it is now, but one of the things I want to do is provide a means for the analyst (via both the UI and CLI) to select which user profiles to include in scans.

Bank Fraud
Yet another bank is being sued by a small business following online banking fraud.  Brian Krebs has done considerable work in blogging about other victims (most recently, the town of Eliot, ME).  What should concern folks about this is that once the victim is breached and the money transfers complete, a battle ensues between the victim and the bank.  What's troubling about this equation is that even with all the press surrounding these cases, there continue to be victims, and instead of focusing on better security up front, efforts are expended toward suing the bank for "inadequate security measures".  Should the bank have had some sort of anomaly detection in place that said, "hey, this connection isn't from an IP address we recognize..."?  Sure.  Should there be some other sort of authentication mechanism that isn't as easily subverted?  Sure.  There are a lot of things that should have been in place...just ask anyone who does PCI forensic assessments, or even just IR work.

One of the things Brian has recommended in his blog is to do all online transactions via a bootable live CD.  I think that this is a great idea...say your Windows system gets infected with something...if you boot the system to a live Linux distribution, the live environment won't even "see" the malware.  Conduct your transactions and shut the system down, and you're done.

Another measure to consider is something like Carbon Black.  Seriously.  Give the guys at Kyrus a call and ask them about their price point.

Cell Phones As Evidence
Christa Miller recently had a Cops2.0 article published regarding how LEOs should approach cell phones/smart phones.  Reading the article, I think that all of it is excellent advice...but you're probably wondering, "what does this have to do with Windows IR or DF work?"  Well, something for analysts to consider is this...if you're analyzing a Windows computer (ie, laptop) confiscated as part of a search warrant, be sure to look to see if a phone has been sync'd to the system.  Did the user install iTunes, download music, and then load the music on their iPhone?  If so, the phone was likely synced/backed up, as well.  Is the Blackberry Desktop Manager installed?  Did the user back their phone up?  If so, the backup files may prove to be significant and valuable resources during an investigation.

Did you map all of the USB removable storage devices that had been connected to the system?  You don't need to have the management software installed to copy images and videos (hint, hint) off of a phone...just connect it via a USB cable and copy the images (which will likely have some very useful EXIF data available).
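If you just want a quick triage check for EXIF data before reaching for a full parser, you can look for the APP1 segment marker and the "Exif" identifier near the front of the JPEG.  This is a rough sketch of my own (it assumes the APP1 segment sits right near the start of the file, which is typical but not guaranteed):

```python
def has_exif(jpeg_bytes):
    """Rough triage check: does this JPEG appear to carry an EXIF segment?"""
    # JPEGs start with the SOI marker (0xFFD8); EXIF data lives in an APP1
    # segment (marker 0xFFE1) whose payload begins with b"Exif\x00\x00".
    if not jpeg_bytes.startswith(b"\xff\xd8"):
        return False
    head = jpeg_bytes[:64]  # APP1 normally follows SOI immediately
    return b"\xff\xe1" in head and b"Exif\x00\x00" in head
```

Files that pass this check are the ones worth feeding to a real EXIF parser to pull out camera make/model, timestamps, and (if you're lucky) GPS coordinates.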

analyzemft 2.0 Released!
Matt Sabourin updated David Kovar's analyzemft.py to make it completely OO!  David has done some great work putting the tool together, and Matt's extended it a bit by making it OO, so that it can be called from other Python scripts.

The project is now hosted on Google Code.

Thursday, July 21, 2011

Evading Investigators and Analysts

I recently had an opportunity to spend two days with some great folks in the Pacific Northwest, talking about timeline creation and analysis.  During this time, we talked about a couple of ancillary topics, such as critical thinking, questioning assumptions, looking at what might be unusual on a system, and we even touched briefly on anti-forensics.  If you've read my blog for any considerable length of time, you know that my thoughts on anti-forensics are that most tactics are meant to target the analyst, or the analyst's training.

Along the lines of anti-forensics, a good deal of reading can be found online.  For example, in 2005, the Grugq gave a presentation at BlackHat that addresses anti-forensics.  James Foster and Vincent Lui gave a presentation on a similar subject.  Also, I recently ran across a very interesting article regarding evading the forensic investigator.  One of the most important statements in the article is:

"Here it is important to note that the software makes it possible to store..."

Why is this statement important?  Well, if you consider the statement critically for a moment, you'll see that it points out that this technique for hiding data from an analyst requires additional software to be added to the system.  That's right...this isn't something that can be done using just tools native to the system...something must be added to the system for this technique to be used.  What this means is that while the analyst may not necessarily be able to immediately determine that data may have been hidden using these techniques, they would likely be able to determine the existence of this additional software.  After all, what would be the purpose of hiding data with no way for the person hiding it to access it?  If you wanted to deny the owner access to the data, why not simply wipe it to a level at which it is prohibitively expensive to retrieve?

What are some ways that an analyst might go about finding the additional software?  Well, because many times I deal solely with acquired images, I use a malware detection checklist for those times where my analysis calls for it, and one of the things I check is the MUICache keys for the users, which can provide me with an indication of software that may have been executed, but perhaps not via the shell.  This is just one item on the checklist, however...there are others, and because the checklist is a living document (I can add to it, and note when some things work better than others), there may be additional items added in the future.

Another way to address this is through incident preparation.  For example, if the audit capability for the Windows system includes Process Tracking (and the Event Logs are of sufficient size), then you may find an indication of the software having been executed there (using evtrpt.pl against Windows XP and 2003 .evt files works like a champ!).  Another possible proactive approach is to use something like Carbon Black from Kyrus Technology; installing something like this before an incident occurs will provide you (as well as your responders) with considerable, definitive data.  To see how effective a proactive approach involving Carbon Black can be, take a look at this video.

Beyond this, however, it's critical that analysts have the training and knowledge to do their jobs in a quick, efficient, and accurate manner, even in the face of dedicated attempts to obfuscate or hinder that analysis.  One of the things we talked a great deal about in the class is the use of multiple data sources to add context to the data, as well as to increase the relative level of confidence in that data.  Chris Pogue talks about this a bit in one of his recent blog posts.  Rather than looking at a single data point and wondering (or, as is done in many cases, speculating) "...what could have caused this?", it's better to surround that data point with additional data from other sources in order to see the context in which the event took place; we also have to keep in mind that some data is more mutable (more easily changed) than other data.

When I teach a class like this, I learn a lot, not just in putting the course materials together, but also from engaging with the attendees.  At one point, I was asked about how many of my cases involve creating a timeline...my response was, "all of them", and that's fairly accurate.  Now, timeline analysis may not always be my primary analysis technique...sometimes, I may just have something available (an Event Log record, a file name, etc.) to get me started, and the goals of my analysis simply dictate that a timeline be created.  Looking back over my recent examinations, I've created a timeline in just about every instance...either to have a starting point for my analysis, or to provide context or validation to my findings, or even to look for secondary or tertiary artifacts to support my findings.  However, the important thing to keep in mind here is that I'm not letting the technique or tool drive my analysis...quite the opposite, in fact.  I'm using the technique for a specific purpose and because it makes sense.


Another analyst told me not long ago, "...I've been doing timelines for years."  I'm sure that this is the case, as the concepts behind timeline analysis have been around for some time and are used in other areas of analysis, as well.  However, I'm willing to bet that most of the analysts who have created timelines have done so by manually entering events that they discover into a spreadsheet, and that the events that they've added are based on limited knowledge of the available data sources.  Also, it has been clear for some time that the value of timelines as an analysis tool isn't completely recognized or understood, in part based on the questions asked in online forums; many of these questions could be answered by creating a timeline.  So, while many are talking about timeline analysis, I think that it's imperative that more of us do it.

Another thing I have learned through engaging with other analysts over the years is that a lot of this stuff that some of us talk about (timeline or Registry analysis) is great, but in some cases, someone will go to training and then not use what they learned for 6 - 9 (or even more) months.  By then, what they learned has become a fog.  These techniques are clearly perishable skill sets that need to be exercised and developed, or they will just fade away.

An example of this that I've seen a great deal of recently in online forums has to do with tracking USB devices on Windows systems.  In the spring of 2005, Cory Altheide and I published some research that we'd conducted regarding USB device artifacts on Windows systems.  Since then, more has been written about this subject, and Rob Lee has posted (and provided worksheets) to the SANS Forensics blog, not only covering thumb drives, but also drive enclosures.  USB artifacts have been discussed in books such as Windows Forensic Analysis (1/e, 2/e), and Windows Registry Forensics, and I'm including yet another discussion in the upcoming third edition of WFA.  I think that this is important, because with all of the information available (including this page on the Forensics Wiki), there continues to be a misunderstanding of the artifacts and analysis process regarding these devices.  For example, variations on the same question have appeared in multiple forums recently, specifically asking why all (in one case, 20 or more) device keys listed beneath the USBStor subkey have the same LastWrite time, and how could the user have plugged all of the devices into the system at the same time.  The problem with this line of analysis is that the LastWrite times for the subkeys beneath the USBStor key are NOT used to determine when the devices were last connected to the system!  What I normally suggest is that analysts engage with the various resources available, and if they want to know what could be responsible for the keys all having the same LastWrite times, generate a timeline.  Seriously.  Timelines aren't just for analysis anymore...they're a great testing tool, as well.

As a side note, RegRipper has all of the necessary plugins to make  USB device analysis pretty straightforward and simple.  I've been working on a chapter on Registry Analysis for my upcoming Windows Forensic Analysis 3/e, and I haven't had to produce any new plugins.  So, even if you're using Rob's SANS checklists, you can get the data itself using RegRipper.

Resources
IronGeek: Malicious USB Devices

Matthieu recently released DumpIt, which is a fusion of the 32- and 64-bit Windows memory dumping utilities that will create a memory dump in the current working directory (which makes this a great utility to run from a thumb or wallet drive).

Wednesday, July 20, 2011

Carbon Black

I attended a Carbon Black (Cb) demo recently, at the invitation of the great folks of Kyrus.  The demo was intended to show some of the improvements to Cb, in particular the GUI available to quickly and easily mine through the available logs.

For those of you who haven’t heard of Cb…where’ve ya been???  Cb is a sensor that monitors the execution process on Windows systems, and reports on processes and file writes.  A coming update to Cb will also report on writes to the Registry, as well as network connections (source/dest IPs and ports, with a time stamp).

In the demo, Mike Viscuso demonstrated how the GUI can be used to quickly track down a FakeRean installation, and even track it back to a Java “issue” (more analysis would be needed to determine if it were an exploit) delivered via Firefox.  Mike went through this slowly; had he gone through it at full speed, it would have only taken a couple of seconds.  In the demo, Mike identified three stages of the infection, and identified the executable files associated with each.  The first stage executable was identified by 14 of 42 AV engines on VirusTotal (for each stage, Mike submitted the hash of the file, not the file itself).  The second stage executable was not identified by any of the AV engines, and in fact was reported to not have been previously submitted.  Finally, the third stage executable was identified by 5 of 42 AV engines.
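Submitting the hash rather than the file itself is worth highlighting...you get the AV detection data without disclosing a potentially sensitive binary.  Computing the hash to query with is straightforward with Python's hashlib; a generic sketch (nothing Cb-specific here):

```python
import hashlib

def file_sha256(path, chunk_size=65536):
    """Hash a file in chunks so large binaries don't have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

The same pattern works for MD5 or SHA-1 by swapping the constructor, if the service you're querying indexes on those.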

Now, compare that to the “normal” IR approach, and how long it would take to dump memory from that system…and this is assuming that you’ve got your IR toolkit prepped and ready to go, and you have personnel trained in the proper collection and analysis techniques.  How about obtaining a copy of the executable?  Cb does this for you; the traditional approach to doing this could take several hours, under the best conditions.

Finally, Mike could have issued a query to determine if the particular files in question had been “seen” on any other systems in the enterprise, an answer to which would be available within seconds.

Deployment, or “How this beats the current IR model”
The current model used for IR is that someone gets hacked or hit with malware of some kind and calls for help, or someone gets notified by an external third party that their data has been compromised (see annual reports from Verizon, TrustWave, or Mandiant) and calls for help.  Sometime after that, personnel and/or equipment are sent on-site, and all the while, data may continue to be exfiltrated from the infrastructure.  At some point after the call for help, network and host data, and possibly the contents of memory from hosts may be collected and analyzed…all of which takes considerable time.

Now, what if you deploy Cb before an incident?  If you were to do this, starting with a testbed of systems, and possibly some non-production systems, you could monitor that subset of your infrastructure, and once you become familiar and comfortable with the tool (check with Kyrus for licensing to get the log collection server within your infrastructure), progressively roll the sensor out to more systems.  Once you get Cb rolled out and the server installed, it’s simply a matter of reviewing the data.  Should you suspect that something has occurred, you now have considerable data available, albeit easily managed and viewed.  You can even set up a scheduled task on the server that queries for new executables having been launched, and have this task run every day (or even every six hours).  You may initially get a lot of data, but over time, you’ll notice that the set that you receive back should be reduced.  You can even have the task email the list of new executables to you.
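The "new executables" query is really just a diff against a running baseline.  Conceptually, it looks something like this (my own sketch...the function name and paths are hypothetical, not Cb's API):

```python
def triage_new_executables(baseline, todays_report):
    """Return executables seen today that aren't in the baseline, plus the
    updated baseline to carry forward to tomorrow's run."""
    new = sorted(set(todays_report) - baseline)
    return new, baseline | set(todays_report)

# Hypothetical example paths:
# baseline = {r"c:\windows\system32\svchost.exe"}
# new, baseline = triage_new_executables(baseline,
#     [r"c:\windows\system32\svchost.exe",
#      r"c:\users\bob\appdata\local\temp\evil.exe"])
```

The first runs will produce a lot of "new" entries; once the baseline settles, anything that shows up in the diff is worth a look.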

Now, even if you were to query the logs every 24 hrs (via a scheduled task or manually), the fact is that you’d know about the incident within 24 hrs (at the most), rather than hearing about it 3 months later from someone else.  Since many of these notifications come well after the actual data theft occurred, when deployed proactively, Cb is capable of providing a level of context that simply isn’t evident or available via more traditional means of IR, such as memory or even disk analysis.  Further, once something is “seen”, you can query the infrastructure for other affected systems, quickly scoping your incident.  Again, through traditional means of IR, scoping the incident can often take considerable time and be very expensive (in time, money, resources, etc.) to the already-compromised environment.

And Cb answers more than just questions related to security and IR.   One of the use cases that the Kyrus guys like to use involves addressing budgeting issues, and going out across the enterprise to determine how many employees were running all components of an office suite of tools.  With the returned information, the organization was able to drastically reduce licensing costs.  Cb can also be used to enforce acceptable use policies, among other things.

If you haven't done so, I'd recommend taking a look at Cb...it doesn't have huge overhead or a "big" footprint, and what it can save you in terms of much more than just IR and security is immense.  

Saturday, July 09, 2011

More Links, Updates

Meetup
Our 6 July NoVA Forensic Meetup went very well, and I wanted to thank Tom Harper for putting in the time and effort that it takes to provide his presentation, and to also thank the ReverseSpace folks for hosting our meetings!  This time, we had about 20 or so folks show up...some new...and I wanted to thank everyone who took the time out to come by and take part in our event.

What I liked about Tom's presentation was the fact that Tom's a practitioner, and he approaches solutions from that perspective.  Tom was at OSDFC, and after the conference mentioned that some of the presentations we saw that day were from folks from the academic side of DFIR, and not so much the practitioner side.  I agree with that sentiment...and Tom's approach is practical and all about gettin' the job done.

I should also note that the ReverseSpace location doesn't say "ReverseSpace" on the outside, nor is there a sign by the road.  The address is 13505 Dulles Technology Drive, Suite 3, in Herndon, VA, and the facility says "Cortona Academy" on the outside.  Don't worry about writing this address down...simply check out the "NoVA Forensics Meetup" page associated with this blog.

Our next meetup is scheduled for 3 August (same time, 7-8:30pm) and we're looking for presentation or discussion ideas, as well as presenters.  @sampenaiii suggested via Twitter that we have a discussion on LulzSec and the impact of their activities on security and forensics; this sounds like a great idea, and I think that we'll need to have a slide or two with some background for those who may not be familiar with what happened. 

Cory-TV
Cory Altheide recently provided a link to the video of a presentation he gave in Italy on 2 May 2011.  The presentation is titled, "The death of computer forensics: digital forensics after the singularity", and is very interesting to watch.  Cory is a very smart guy, and as Chris Pogue recently tweeted, Cory is the bacon of the DFIR community, because he makes everything better!

Cory presents some very interesting thoughts regarding what were once thought to be "forensics killers", and the future of digital forensic analysis.  One of the things that Cory mentioned was the existence of metadata, which is not often removed, simply because the user doesn't know about it.  There have been some pretty interesting instances where metadata has played a very important role (here, and here), and I agree with Cory that it will continue to do so, particularly because we have new formats, devices, and applications coming out all the time.

Clearly, Cory talked about much more than just metadata in his presentation, and I simply can't do justice to it through any sort of concise description.  Instead, I highly recommend that you take the hour+ to sit down and watch it.  I think that the overall point that Cory makes is that as available drive space increased (every time it did) and decreased in price, and as platforms to be analyzed have become more complex and varied, there have been those who've claimed that these things would be the "death of digital forensics"; instead, analysts have adapted. 

MBR Analysis
Speaking of Chris Pogue, he recently posted to his blog regarding MBR Analysis, referring to a discussion he and I had had not long ago.  What I like about Chris's post is that he's talking about it, and by him doing so, my hope is that the things we talked about reach more analysts.  Chris is the author of the Sniper Forensics presentations (here's the one he gave at DefCon18), and he's been getting a lot of mileage from the series, and presents to a lot of people.  As such, my hope is that more people will hear about MBR analysis, how it can be used, and start looking at this as a viable part of a malware detection process/checklist.

When you read the post, don't get caught up in the terminology.  To clarify a couple of things: the TSK mmls tool reads the partition table within an image of a physical disk (you don't have one of these if you acquired a logical image of, say, the C:\ volume), and provides you the offsets to the partitions.  The offset to the active partition is often 63 sectors (indexed at 0), not "0x63".
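To make that concrete, here's a minimal sketch (my own, not from Chris's post) of what mmls is doing under the hood...reading the four 16-byte partition table entries that start at offset 446 of the MBR:

```python
import struct

def parse_mbr_partition_table(mbr: bytes):
    """Parse the four primary partition table entries from a 512-byte MBR."""
    assert len(mbr) >= 512 and mbr[510:512] == b"\x55\xaa", "not a valid MBR"
    entries = []
    for i in range(4):
        # entry layout: boot flag (1), CHS start (3), type (1), CHS end (3),
        # starting LBA (4, little-endian), number of sectors (4)
        boot_flag, ptype, lba_start, num_sectors = struct.unpack_from(
            "<B3xB3xII", mbr, 446 + i * 16)
        if ptype != 0:  # 0x00 = empty table entry
            entries.append({
                "active": boot_flag == 0x80,
                "type": ptype,
                "start_sector": lba_start,  # sector offset, indexed at 0
                "sectors": num_sectors,
            })
    return entries
```

Run against the first 512 bytes of a physical image, this is where that "63 sectors" offset to the active partition comes from.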

Speaking of the Sniper Forensics presentations, Chris also has another post up on the SpiderLabs Anterior blog that's worth a read.

Additional Thoughts
I recently posted some thoughts regarding how the structure that data is stored in (as well as where within that structure) can provide context to the data that you're looking at, and an email exchange with James, who commented on that post, led to some thoughts regarding timelines.

James (I hope you don't mind me pointing this out...) stated in a comment that log2timeline doesn't provide enough context.  I was curious about this, and in the ensuing email exchange, at least part of what James was referring to was the idea that if, for example, an image file had been created on the system, the timeline could provide a link to that file, which could then be opened in a viewer.  In this way, the analyst would have access to additional context.

I thought that this was an interesting idea...and I still do.  Having my own process for creating timelines (taking nothing away from Kristinn's efforts with log2timeline, of course) I thought about how you could implement something like this...and got stuck.  Here's why...let's say that you completely restructured timeline creation and had the timeline itself available with either the image itself mounted as a volume, or files extracted from the image...would you provide links to all files that were created on the system?  I'm thinking not.  Also, if a file is modified, what do you link to?  What value or context is there if you don't know what was modified in the file?  The same would be true for Registry keys.

However, I do think that there are some benefits to this idea, but it really depends upon how you implement it.  For example, let's say that you found some time stamped data in the pagefile or in unallocated space, and wanted to include it in your timeline.  You could do that, and then include a link to the data...but not to the offset within the image; instead, extract the data (plus 100+ bytes on either side of it) into a separate file, and provide a link to that file.  The same might be true for other files...for example, if your timeline analysis leads you to determine that the infection vector was spear phishing via a PDF file, rather than copying the PDF file out of the image and linking to it, maybe what you could do is copy the PDF file out of the image, parse the offending portions out, emasculate them (i.e., extract them to a text format, etc.), and then link to that.  You might extract images, and link to those...it all depends on how you want to present the data.  But my point is, this may not be something that you would want to completely automate; instead, use the timeline to perform your analysis, and once you've isolated the appropriate entries in your timeline, provide links to relevant data (either the file, a portion of a file, or your analysis of the file) in order to add context and value to your timeline.
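As a hypothetical sketch of the "hit plus 100+ bytes on either side" idea (the function name and parameters are mine, purely for illustration):

```python
def extract_hit_context(image_path, offset, hit_len, out_path, context=128):
    """Carve a hit plus surrounding context bytes out of an image into its
    own small file, so a timeline entry can link to that artifact rather
    than to a raw offset within the image."""
    with open(image_path, "rb") as f:
        start = max(0, offset - context)  # don't seek before byte 0
        f.seek(start)
        data = f.read((offset - start) + hit_len + context)
    with open(out_path, "wb") as out:
        out.write(data)
    # the timeline entry can now carry a link to out_path
    return out_path
```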

Wednesday, July 06, 2011

Structure Adds Context

A while ago, I was talking to Cory Altheide and he mentioned something about timeline analysis that sort of clarified an aspect of the analysis technique for me...he said that creating a timeline from multiple data sources added context to the data that you were looking at.  This made a lot of sense to me, because rather than just using file system metadata and displaying just the MACB times of the files and directories, if we added Event Log records, Prefetch file metadata, Registry data, etc., we'd suddenly see more than just that a file was created or accessed.  We'd start to see things like, user A had logged in, launched an application, and the result of those actions was the file creation or modification in which we were interested.
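A minimal sketch of that idea: normalize events from several sources into one five-field format (time|source|host|user|description, along the lines of the TLN convention) and sort them, so the logon, the application launch, and the file creation fall into sequence:

```python
def build_timeline(*sources):
    """Merge events from multiple data sources (file system, Event Log,
    Prefetch, Registry, etc.) into one timeline sorted by time.  Each
    event is a tuple: (unix_time, source, host, user, description)."""
    events = [ev for src in sources for ev in src]
    return sorted(events, key=lambda ev: ev[0])

def render(events):
    """Emit pipe-delimited time|source|host|user|description lines."""
    return "\n".join("%d|%s|%s|%s|%s" % ev for ev in events)
```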

Lately, I've been looking at a number of data structures used by Windows systems...for example, the DestList stream within Windows 7 jump lists.  What this got me thinking about is this...as analysts, we have to understand the structure in which data is stored, and correspondingly, how it's used by the application.  We need to understand this because the structure of the data can provide context to that data.

Let's look at an example...once, in a galaxy far, far away, I was working on a PCI forensic assessment, which included scanning every acquired image for potential credit card numbers (CCNs).  When the scan had completed, I found that I had a good number of hits in two Registry hive files.  So my analysis can't stop there, can it?  After all, what does that mean, that I found CCNs in the Registry?  In and of itself, that statement is lacking context.  So, I need to ask:

Are the hits key names?  Value names?  Value data, or embedded in value data?  Or, are the hits located in unallocated space within the hive files?

The answers to any of these questions would significantly impact my analysis and the findings that I report.
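As an aside, the scanning itself usually amounts to a digit-run regex paired with a Luhn check; here's a minimal sketch (mine, not the tool used on that engagement).  Note that it produces candidates only...whether a hit is a key name, value data, or sits in unallocated space within the hive still has to be determined from the structure it's found in:

```python
import re

def luhn_valid(digits: str) -> bool:
    """Standard Luhn check: double every second digit from the right."""
    total, parity = 0, len(digits) % 2
    for i, ch in enumerate(digits):
        d = int(ch)
        if i % 2 == parity:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_ccn_candidates(data: str):
    """Return 15-16 digit runs that pass the Luhn check."""
    return [m.group(0) for m in re.finditer(r"\b\d{15,16}\b", data)
            if luhn_valid(m.group(0))]
```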

Here's another example...I remember talking with someone a while back who'd "analyzed" a Windows PE file by running strings on it, and found the name of a DLL.  I don't remember the exact conclusions that they'd drawn from this, but what I do remember is thinking that had they done some further analysis, they might have reached different conclusions.  After all, finding a string in a 200+ KB file is one thing...but what if that DLL name had appeared in the PE file's import table?  Wouldn't that have a different impact on the analysis than if the DLL was instead the name of the file where stolen data was stored before being exfil'd?
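For illustration, a naive strings-style scan (a rough sketch of what that "analysis" amounted to) shows exactly why presence alone carries no context:

```python
import re

def extract_strings(data: bytes, min_len: int = 4):
    """Naive ASCII 'strings' -- tells you a DLL name is present somewhere
    in the file, but nothing about whether it sits in the import table or
    in, say, a hard-coded path used to stage stolen data."""
    return [m.group(0).decode("ascii")
            for m in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, data)]
```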

So, much like timeline analysis, understanding the structure in which data is stored, and how that data is used by an application or program, can provide context to the data that will significantly impact your analysis and findings.

Addendum, 7 July
I've been noodling this over a bit more and another thought that I had was that this concept applies not just to DF analysis, but also to the work that often goes on beyond just analysis, particularly in the LE field, and that is developing intelligence.

In many cases, and particularly for law enforcement, there's more to DF analysis than simply running keyword searches or finding an image.  In many instances, the information found in one examination is used to develop intelligence for a larger investigation, either directly or indirectly.  So, it's not just about, "hey, I found an IP address in the web logs", but what verb was used (GET, POST, etc.), what were the contents of the request, who "owns" the IP address, etc.
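For example, pulling those pieces out of a web log entry (a quick sketch; the regex covers the common/combined Apache log format):

```python
import re

# common log format: ip ident user [timestamp] "method path proto" status size
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) \S+'
)

def parse_access_log_line(line):
    """Extract not just the client IP, but the HTTP method (verb) and
    what was actually requested."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None
```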

So how is something like this implemented?  Well, let's say you're using Simson's bulk_extractor, and you find that a particular email address that's popped up in your overall investigation was located in an acquired image.  Just the fact that this email address exists within the image may be a significant finding, but at this point, you don't have much in the way of context, beyond the fact that you found it in the image.  It could be in an executable, or part of a chat transcript, or in another file.  Regardless, where the email address is located within the image (i.e., which file it's located in) will significantly impact your analysis, your findings, and the intel you derive from these.

Now, let's say you take this a step further and determine, based on the offset within the image where the email address was located, that the file that it is located in is an email.  Now, this provides you with a bit more context, but if you really think about it, you're not done yet...how is the email-address-of-interest used in the file?  Is it in the To:, CC:, or From: fields?  Is it in the body of the message?  Again, where that data is within the structure in which it's stored can significantly impact your analysis, and your intel.
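A quick sketch of that last step, using Python's stdlib email parser (the function name is mine; this assumes a simple, non-multipart message):

```python
from email import message_from_string

def locate_address(raw_email: str, addr: str):
    """Report where an address of interest appears within an email:
    in the From:/To:/Cc: headers, in the body, or not at all."""
    msg = message_from_string(raw_email)
    locations = [h for h in ("From", "To", "Cc")
                 if addr in (msg.get(h) or "")]
    body = msg.get_payload()
    if isinstance(body, str) and addr in body:
        locations.append("body")
    return locations
```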

Consider how your examination might be impacted if the email address were found in unallocated space or within the pagefile, as opposed to within an email.

More Links

Meetup
Just a reminder about tonight's meetup:

Location: ReverseSpace (this is our location, unless stated otherwise)
Time: 7-8:30pm (this will be the time that we'll meet, unless stated otherwise)

Tonight, Tom Harper will be presenting...you can get a copy of his slides here.

Also, please notice that I've created a "NoVA Forensics Meetup" page, linked on the right-hand side of this blog. 

Mobius
I ran across the Mobius Forensic Framework this morning (because it had been updated), and found it very interesting.  Mobius is a Python-based framework "...that manages cases and case items, providing an abstract interface for developing extensions. Cases and item categories are defined using XML files for easy integration with other tools."  It seems that this framework has been around for some time...the main link indicates that the last update was near the end of 2009.  The framework appears to have a Hive Report capability, as well.

This appears to be very different in function from the Digital Forensics Framework, now at version 1.1.0, and is definitely worth a look.

For/Sec LinkFest
Klaus has updated his blog again, and posted an expansive set of links regarding forensic and security tools.

I'm always looking to improve the work that I do, and I very often find some interesting links in what Klaus provides.  One was a reference to the eScan AV toolkit, from TinyApps.org, in Klaus' RSS feed.  If you work cases that involve detecting suspected malware ("Trojan defense"), this may be a tool that you'll want to employ as part of your malware detection process/checklist.

Xanda
Speaking of links, I ran across this page at Xanda, and found a number of very interesting links, such as an emulator for the PDP-11.  As with many other sites that provide lists of free/open-source/(some commercial) forensics tools, there will be a considerable amount of overlap, but there are also some links on this page that I haven't seen before, and I'm not about to discount anything at this point.  I mean, while I haven't been asked to analyze an Atari system, when I was at IBM our team was asked to perform analysis of mainframe systems more than once.  The Xanda page also has an entire section on steg tools.

Reading
I ran across this interesting bit of reading on the CERIAS blog, authored by Gene Spafford.  Beyond the mention of historically famous names in the DFIR community (from before there was really a DFIR community....) were the statements in the first paragraph regarding deployment of DFIR countermeasures.

As interesting (and immensely helpful) as these countermeasures may be, having performed a number of incident response engagements and analyzed even more drives and images, I think that the reality is that we have to just file this under "ain't gonna happen".  Now, don't get me wrong...I do believe that such measures are good security and would prove to be immensely useful; however, who's going to implement and monitor them, given the state of security to begin with?  What good is any of this going to do when the bad guys have already been through your infrastructure?

Now, would countermeasures such as those Gene describes be useful?  Sure...if they were properly deployed.

Monday, July 04, 2011

Links

Independence Day
Before anything else, Happy 4th!  I hope that everyone takes a moment to remember those who have fought and sacrificed for our freedoms...that includes not only those who have given the ultimate sacrifice, but those who have lost loved ones in the fight for freedom.  Also remember our public servants (cops, firefighters, EMTs), as well as our service members who are fighting to give others freedom.  May God bless them all.

e-Evidence
There's been an update over at the e-Evidence web site, with the addition of some good reading...take a look.

APT
I posted recently regarding an article Jason Andress had written for ISSA, regarding APT.  Shortly thereafter, my friend Russ contacted me to let me know that he'd co-authored a similar paper, and that I might want to take a look at it.  The blog post is here, and the paper can be found here.  The paper was written as a requirement for the SANS Technology Institute MSISE program, and while it touches on some of the same themes as Jason's paper, this one takes a bit more of a tactical approach...and that's one of the things I really like about this paper.  The approach taken in the paper is not just tactical...it's "here are some of the things that are seen on the network, and here's a cheap or free way to go about detecting it."

The paper also points out some interesting aspects of tactics used by the threat actors, particularly getting into the infrastructure via some method (spear phishing), gaining a foothold with PI-RAT, and then moving laterally within the infrastructure.

Another aspect of this paper is that it provides additional insight into the threat itself; anyone unfamiliar with the threat should read this paper, Jason's article, and others in order to develop a better understanding of the threat.  Much of what I've read out there covers the general flow of these threats, and this paper provides some insight into a specific implementation, and should be considered as such.  Not every incident of this type is going to include the same persistence mechanism, use of the same RAT, or the same network traffic.  However, the paper does a very good job of pointing out some of what can be done in response to this threat, both in initial detection and then response.

So, again...some great information in the paper, and it is easy to follow; if you're trying to get a better understanding of the threat overall, be sure to include this in your reading, along with additional credible, authoritative sources.

Malware
In the past, I've talked about the four malware characteristics I'd developed to help DFIR folks understand and explain malware, and over time, those characteristics have served me pretty well.  One of those characteristics is the initial infection vector...how the malware gets on the system.  Well, I ran across this InformationWeek article this morning, which talks about Facebook being the "new" malware vector.  Okay, the meaning of "new" aside, I think that this is interesting, in part because it makes complete sense.  Look at the statistics in the article regarding users and the clients they use to access Facebook...pretty telling, if you ask me.

As an analyst, I'd like to hear from other analysts...have you seen incidents where Facebook was the delivery mechanism for malware?  If so, what are the artifacts on a PC or laptop, as opposed to a smartphone?

Also, Cory started a drinking game at OSDFC, because apparently, I pronounce malware "mall-ware"...so for every time I wrote "malware", you need to drink!



WFA 2/e Review
Mike Ahrendt posted a review of WFA 2/e recently; it's great to see that this book is still active and making its rounds, and that people who are reading it are finding something useful.  I tend to reference it myself now and again for my own needs, and sometimes will make notes of new, additional information that I've found with respect to a particular topic.  I think it's great that folks are still picking it up for the first time and finding it useful.

Bootkits
There was a post over on the SANS ISC site recently regarding the resurgence of bootkits, in which MS's Win32/Popureb.E (which is still short of any information useful to analysts) was specifically mentioned.  The post goes on to take a look at AV products that detect and/or clean MBR infectors, and indicates which are more successful than others.  I still think that one of the biggest issues surrounding this sort of thing is that most analysts I've spoken with appear to not look for this sort of thing when it comes to determining if there is malware (drink!) in an image acquired from an infected system.  I'm not sure if this is an awareness issue, or a training/understanding issue; I have a checklist that I use (and try to keep up to date) for engagements such as this, so when I receive an image and the statement that, "we think it was infected with malware", I run through this process, which includes checking for indications of MBR infectors.

My thoughts on this subject aren't so much that I think that MBR infectors are more pervasive than most analysts think; not at all.  I think that it's more of a knowledge or "engaging with your peers" issue than anything else.  I don't think that available courses (whether for training, or ultimately ending in a certification) are necessarily going to cover the topic of malware detection within an acquired image, but I do think that the issue is one that needs to be understood (i.e., the "Trojan defense").  As such, where do analysts go to get this sort of information or education?

What are your thoughts?