Wednesday, December 28, 2011

Jump List Analysis

I recently spoke with a couple of analysts I know, and during the course of these conversations, I was somewhat taken aback by how little seems to be known or available with respect to Jump Lists.  Jump Lists are artifacts that are new to Windows 7 (...not new as of Vista...), and are also available in Windows 8.  This apparent lack of attention to Jump Lists is most likely due to the fact that many analysts simply haven't encountered Windows 7 systems, or that Jump Lists haven't played a significant role in their examinations.  I would suggest, however, that any examination that includes analysis of user activity on a system will likely see some significant benefit from understanding and analyzing Jump Lists.

I thought what I'd try to do is consolidate some information on Jump Lists and analysis techniques in one location, rather than having it spread out all over.  I should also note that I have a section on Jump Lists in the upcoming book, Windows Forensic Analysis 3/e, but keep in mind that one of the things about writing books is that once you're done, you have more time to conduct research...which means that the information in the book may not be nearly as comprehensive as what has been developed since I wrote that section.

In order to develop a better understanding of these artifacts, I wrote some code to parse these files.  This code consists of two Perl modules, one for parsing the basic structure of the *.automaticDestinations-ms Jump List files, and the other to parse LNK streams.  These modules not only provide a great deal of flexibility with respect to what data is parsed and how it can be displayed (TLN format, CSV, table, dumped into a SQLite database, etc.), but also the depth to which the data parsing can be performed.

Jump List Analysis
Jump Lists are located within the user profile, and come in two flavors; automatic and custom Jump Lists.  The automatic Jump Lists (*.automaticDestinations-ms files located in %UserProfile%\AppData\Roaming\Microsoft\Windows\Recent\AutomaticDestinations) are created automatically by the shell as the user engages with the system (launching applications, accessing files, etc.).  These files follow the MS-CFB compound file binary format, and each of the numbered streams within the file follows the MS-SHLLINK (i.e., LNK) binary format.
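
As a rough illustration of the point above, here is a minimal Python sketch that lists the automatic Jump List files under a profile directory from a mounted image and checks each one for the 8-byte MS-CFB header signature.  The function names are my own, and the glob pattern assumes the file names keep the case shown above; treat it as a sketch, not a validated tool.

```python
from pathlib import Path

# Header signature defined at the start of every MS-CFB compound file.
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")

# Relative location of automatic Jump Lists inside a user profile.
AUTO_DEST = Path("AppData", "Roaming", "Microsoft", "Windows",
                 "Recent", "AutomaticDestinations")


def is_compound_file(data: bytes) -> bool:
    """Return True if the buffer begins with the MS-CFB signature."""
    return data[:8] == CFB_SIGNATURE


def list_automatic_jump_lists(profile_root):
    """Return (file name, looks_like_cfb) pairs for one user profile.

    profile_root is a hypothetical path to a profile in a mounted image.
    """
    folder = Path(profile_root) / AUTO_DEST
    if not folder.is_dir():
        return []
    results = []
    for path in sorted(folder.glob("*.automaticDestinations-ms")):
        with path.open("rb") as handle:
            results.append((path.name, is_compound_file(handle.read(8))))
    return results
```

A file in that folder that does not start with the signature would be worth a closer look before handing it to a parser.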

The custom Jump Lists (*.customDestinations-ms files located in %UserProfile%\AppData\Roaming\Microsoft\Windows\Recent\CustomDestinations) are created when a user "pins" an item (see this video for an example of how to pin an item).  The *.customDestinations-ms files are apparently just a series of LNK format streams appended to each other.
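
If the custom Jump Lists really are LNK streams appended to each other, one quick way to test that idea is to scan the file for the fixed start of the MS-SHLLINK header (the 4-byte header size 0x4C followed by the 16-byte link class identifier).  The sketch below does only that; it makes no assumption about what sits between the streams, and the function name is mine.

```python
# First 20 bytes of every MS-SHLLINK header: HeaderSize (0x0000004C,
# little-endian) followed by the LinkCLSID
# 00021401-0000-0000-C000-000000000046 in its on-disk byte order.
LNK_HEADER_MAGIC = bytes.fromhex("4C000000" "0114020000000000C000000000000046")


def find_lnk_offsets(data: bytes):
    """Return every offset at which an LNK header appears to begin."""
    offsets = []
    pos = data.find(LNK_HEADER_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(LNK_HEADER_MAGIC, pos + 1)
    return offsets
```

Each offset can then be handed to an LNK parser; the distance between consecutive offsets gives an upper bound on the size of each embedded stream.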

Each of the Jump List file names starts with a long string of characters that is the application ID, or "AppID", that identifies the specific application (and in some cases, version) used to access specific files or resources.  There is a list of AppIDs on the ForensicsWiki, as well as one on the ForensicArtifacts site.
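
Pulling the AppID out of a file name is simple enough to script.  The sketch below splits a Jump List file name into the AppID and the list type, and looks the AppID up in a table the analyst supplies; the AppID in the usage note is a made-up placeholder, not an entry from either published list.

```python
# Map the file extension (lower-cased) to the Jump List flavor.
KINDS = {
    "automaticdestinations-ms": "automatic",
    "customdestinations-ms": "custom",
}


def split_jump_list_name(filename: str):
    """Return (appid, kind) for a Jump List file name."""
    stem, dot, suffix = filename.rpartition(".")
    kind = KINDS.get(suffix.lower())
    if not dot or not stem or kind is None:
        raise ValueError(f"not a Jump List file name: {filename!r}")
    return stem.lower(), kind


def describe(filename: str, known_appids: dict):
    """Return (appid, kind, application name or 'unknown')."""
    appid, kind = split_jump_list_name(filename)
    return appid, kind, known_appids.get(appid, "unknown")
```

For example, describe("0123456789abcdef.automaticDestinations-ms", {}) reports the AppID, the "automatic" type, and "unknown", since the lookup table is empty.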

From an analysis perspective, the existence of automatic Jump Lists is an indication of user activity on the system, and in particular interaction via the shell (Windows Explorer being the default shell).  This interaction can be via the keyboard/console, or via RDP.  Jump Lists have been found to persist after an application has been deleted, and can therefore provide an indication of the use of a particular application (and version of that application), well after the user has removed it from the system.  Jump Lists can also provide indications of access to specific files and resources (removable devices, network shares). 

Further, the binary structure of the automatic Jump Lists provides access to additional time stamp information.  For example, the structures for the compound binary file directory entries contain fields for creation and modification times for the storage object; while writing and testing code for parsing Jump Lists, I have only seen the creation dates populated.
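
To make those directory entry time stamps concrete, here is a minimal Python sketch that decodes a single 128-byte MS-CFB directory entry, using the field offsets as I read them in the MS-CFB specification (name at 0, name length at 64, object type at 66, creation and modified FILETIMEs at 100 and 108, starting sector at 116, stream size at 120).  Locating the directory sectors within the file is not shown, and the dictionary keys are my own naming.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(value: int):
    """Convert a 64-bit FILETIME to a UTC datetime; 0 means 'not set'."""
    if value == 0:
        return None
    return FILETIME_EPOCH + timedelta(microseconds=value // 10)


def parse_directory_entry(entry: bytes) -> dict:
    """Decode one 128-byte compound file directory entry."""
    if len(entry) != 128:
        raise ValueError("a directory entry is exactly 128 bytes")
    # The length field counts bytes, including the terminating null.
    name_len = struct.unpack_from("<H", entry, 64)[0]
    name = entry[:max(name_len - 2, 0)].decode("utf-16-le")
    created, modified = struct.unpack_from("<QQ", entry, 100)
    return {
        "name": name,
        "object_type": entry[66],  # 1 = storage, 2 = stream, 5 = root
        "created": filetime_to_datetime(created),
        "modified": filetime_to_datetime(modified),
        "start_sector": struct.unpack_from("<I", entry, 116)[0],
        "size": struct.unpack_from("<Q", entry, 120)[0],
    }
```

Returning None for a zero FILETIME makes it easy to see which of the two time fields is actually populated in a given entry.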

Digging Deeper: LNK Analysis
Within the automatic Jump List files, all but one of the streams (i.e., the DestList stream) are comprised of LNK streams.  That's right...the various numbered streams are comprised of binary streams following the MS-SHLLINK binary format.  As such, you can either use something like MiTeC's SSV to view and extract the individual streams, and then use an LNK viewer to view the contents of each stream, or you can use Mark Woan's JumpLister to view and extract the contents of each stream (including the DestList stream).  The numbered streams do not have specific MAC times associated with them (beyond time stamps embedded in MS-CFB format structures), but they do contain MAC time stamps associated with the target file. 
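
The target file time stamps mentioned above live in the fixed 76-byte header at the start of each LNK stream.  The following is a minimal sketch of reading them from an extracted stream, based on the header layout in the MS-SHLLINK specification; it is not the Perl code described in this post, and the dictionary keys are my own.

```python
import struct
from datetime import datetime, timedelta, timezone

LNK_CLSID = bytes.fromhex("0114020000000000C000000000000046")
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(value: int):
    """Convert a 64-bit FILETIME to a UTC datetime; 0 means 'not set'."""
    if value == 0:
        return None
    return FILETIME_EPOCH + timedelta(microseconds=value // 10)


def parse_shell_link_header(stream: bytes) -> dict:
    """Decode the fixed header at the start of an LNK stream."""
    if len(stream) < 76:
        raise ValueError("stream is shorter than an LNK header")
    header_size = struct.unpack_from("<I", stream, 0)[0]
    if header_size != 0x4C or stream[4:20] != LNK_CLSID:
        raise ValueError("stream does not begin with an LNK header")
    # LinkFlags, FileAttributes, three FILETIMEs, FileSize (offsets 20-55).
    flags, attrs, created, accessed, written, size = struct.unpack_from(
        "<IIQQQI", stream, 20)
    return {
        "link_flags": flags,
        "file_attributes": attrs,
        "target_created": filetime_to_datetime(created),
        "target_accessed": filetime_to_datetime(accessed),
        "target_modified": filetime_to_datetime(written),
        "target_size": size,
    }
```

Keep in mind that these values describe the target file as recorded in the stream, not the stream itself.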

Most any analyst who has done LNK file analysis is aware of the wealth of information contained in these files/streams.  My own testing has shown that various applications populate these streams with different contents.  One thing that's of interest...particularly since it was pointed out in Harry Parsonage's The Meaning of LIFE paper...is that some LNK streams (I say "some" because I haven't seen all possible variations of Jump Lists yet, only a few...) contain ExtraData (defined in the binary specification), including a TrackerDataBlock.  This structure contains a machineID (name of the system), as well as two "Droids", each of which consists of a VolumeID GUID and a version 1 UUID (ObjectID).  These structures are used by the Link Tracking Service; the first applies to the new volume (where the target file resides now), and the second applies to the birth volume (where the target file was when the LNK stream was created).  As demonstrated in Harry's paper, this information can be used to determine if a file was moved or copied; however, this analysis is dependent upon the LNK stream being created prior to the action taking place.  The code that I wrote extracts and parses these values into their components, so that checks can be written to automatically determine if the target file was moved or copied.
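
For readers who want to see what parsing these values into their components might look like, here is a minimal Python sketch (not the Perl code described above).  It assumes you have already located the 96-byte TrackerDataBlock within the ExtraData, and it uses the layout from the MS-SHLLINK specification: block size, signature 0xA0000003, length, version, a 16-byte machine ID, then the two droids.  The comparison function only reports which components differ; how to interpret the differences is covered in Harry's paper.

```python
import struct
import uuid

TRACKER_SIGNATURE = 0xA0000003
TRACKER_BLOCK_SIZE = 0x60


def parse_tracker_block(block: bytes) -> dict:
    """Decode a TrackerDataBlock that has already been carved out."""
    if len(block) < TRACKER_BLOCK_SIZE:
        raise ValueError("block is shorter than a TrackerDataBlock")
    size, signature, _length, _version = struct.unpack_from("<IIII", block, 0)
    if size != TRACKER_BLOCK_SIZE or signature != TRACKER_SIGNATURE:
        raise ValueError("not a TrackerDataBlock")
    # Machine ID is a null-padded name in a fixed 16-byte field.
    machine_id = block[16:32].split(b"\x00", 1)[0].decode("ascii", "replace")
    # Four GUIDs: (VolumeID, ObjectID) for the droid, then the birth droid.
    guids = [uuid.UUID(bytes_le=block[off:off + 16]) for off in (32, 48, 64, 80)]
    return {
        "machine_id": machine_id,
        "droid": (guids[0], guids[1]),
        "droid_birth": (guids[2], guids[3]),
    }


def droid_differences(parsed: dict) -> dict:
    """Report which droid components differ from their birth values."""
    return {
        "volume_changed": parsed["droid"][0] != parsed["droid_birth"][0],
        "object_changed": parsed["droid"][1] != parsed["droid_birth"][1],
    }
```

A check built on droid_differences could flag streams for review, with the analyst applying the interpretation from the paper to each flagged case.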

There's something specific that I wanted to point out here that has to do with LNK and Jump List analysis.  The format specification for the ObjectID found in the TrackerDataBlock is based on UUID version 1, defined in RFC 4122.  Parsing the second half of the "droid" should provide a node identifier in the last 6 bytes of the stream.  Most analysts simply seem to think that this is the MAC address (or a MAC address) for the system on which the target file was found.  However, there is nothing that I've found thus far that states emphatically that it MUST be the MAC address; rather, all of the resources I've found indicate that this value can be a MAC address.  Given that a system's MAC address is not stored in the Registry by default, analysis of an acquired image makes this value difficult to verify.  As such, I think that it's very important to point out that while this value can be a MAC address, there is nothing to specifically and emphatically state that it must be a MAC address.
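
One small check that can inform this question: RFC 4122 (section 4.5) says that when a node ID is generated randomly rather than taken from an IEEE 802 address, the multicast bit (the least significant bit of the first octet) is set.  The Python sketch below, using the standard uuid module, breaks an ObjectID into its version 1 components and reports that bit.  A set bit suggests the node is not an address taken from a network card; a clear bit is consistent with a MAC address but does not prove it is one.  The function and key names are mine.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 UUID time stamps count 100-nanosecond intervals from this date.
UUID_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def describe_object_id(raw16: bytes) -> dict:
    """Break a 16-byte ObjectID (as stored in the droid) into components."""
    value = uuid.UUID(bytes_le=raw16)
    if value.version != 1:
        # Not a version 1 UUID, so there is no time stamp or node to report.
        return {"version": value.version}
    node = value.node.to_bytes(6, "big")
    return {
        "version": 1,
        "timestamp": UUID_EPOCH + timedelta(microseconds=value.time // 10),
        "clock_seq": value.clock_seq,
        "node": ":".join(f"{octet:02x}" for octet in node),
        "node_multicast_bit": bool(node[0] & 0x01),
    }
```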

DestList Stream
The DestList stream is found only in the automatic Jump Lists, and does not follow the MS-SHLLINK binary format (go here to see the publicly documented structure of this stream).  Thanks to testing performed by Jimmy Weg, it appears that not only is the DestList stream a most-recently-used/most-frequently-used (MRU/MFU) list, but some applications (such as Windows Media Player) appear to be moving their MRU lists to Jump Lists, rather than continuing to use the Registry.  As such, the DestList streams can be a very valuable component of timeline analysis.

What this means is that the DestList stream can be parsed to see when a file was most recently accessed.  Unlike Prefetch files, Jump Lists do not appear (at this point) to contain a counter of how many times a particular file (MSWord document, AVI movie file, etc.) was accessed or viewed, but you may be able to determine previous times that a file was accessed by parsing the appropriate Jump List file found in Volume Shadow Copies. 
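
The Volume Shadow Copy idea can be sketched without committing to any particular parser.  The Python below assumes some other tool has already parsed the DestList stream from the current Jump List file and from each VSC copy into (path, last access time as Unix epoch) pairs; the snapshot labels and the input layout are hypothetical.  It collects the distinct access times for one file and writes them as five-field, pipe-delimited TLN lines (time, source, host, user, description), which is my shorthand for the TLN format.

```python
def access_history(snapshots: dict, target: str):
    """Return the sorted distinct access times recorded for one file.

    snapshots maps a label ("current", "vsc1", ...) to a list of
    (path, epoch_seconds) pairs already parsed from a DestList stream.
    """
    wanted = target.lower()
    seen = set()
    for entries in snapshots.values():
        for path, epoch in entries:
            if path.lower() == wanted:
                seen.add(int(epoch))
    return sorted(seen)


def to_tln(epoch, source, host, user, description) -> str:
    """Format one event as a pipe-delimited TLN line."""
    return f"{int(epoch)}|{source}|{host}|{user}|{description}"


def history_as_tln(snapshots, target, host, user):
    """Return one TLN line per distinct access time for the target file."""
    return [to_tln(epoch, "JumpList", host, user, f"DestList access - {target}")
            for epoch in access_history(snapshots, target)]
```

Because identical time stamps from different snapshots collapse into one entry, the output shows only the times at which the recorded value actually changed.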

Summary
Organizations are moving away from Windows XP and performing enterprise-wide rollouts of Windows 7.  More and more, analysts will encounter Windows 7 (and before too long, Windows 8) systems, and need to be aware of the new artifacts available for analysis.  Jump Lists can hold a wealth of information, and understanding these artifacts can provide the analyst with a great deal of clarity and context.

Resources
ForensicsWiki: Jump Lists
Jump List Analysis pt. I, II, III
DestList stream structure documented
Harry Parsonage's The Meaning of LIFE paper - a MUST READ for anyone conducting LNK analysis
RFC 4122 - UUID description; sec 4.1.2 describes the structure format found in Harry's paper; section 4.1.6 describes how the Node field is populated
Perl UUID::Tiny module - Excellent source of information for parsing version 1 UUIDs

19 comments:

Rob Lee said...

Harlan, you said "I wrote some code to parse these files. This code consists of two Perl modules, one for parsing the basic structure of the *.automaticDestinations-ms Jump List files, and the other to parse LNK streams."

Trying to compare the tools (Jumplister, WFA, and SSV) to see if the tools are accurate and to provide a research baseline. Very interested how you went about parsing it, but I cannot seem to find the code modules online at your usual code location. Have you released them? Thanks for a great post.

H. Carvey said...

Trying to compare the tools (Jumplister, WFA, and SSV) to see if the tools are accurate and to provide a research baseline.

How would you determine accuracy? If one tool lists a value as decimal and the other presents it as hex, is one more accurate than another? If a value is located at an offset, and one tool displays it and another doesn't, is the first tool more accurate?

I have to say, I don't really follow what you're trying to achieve. I think that with the available tools, there's already more than enough for a "research baseline".

Very interested how you went about parsing it...

By following the binary format specifications available from MS.

...but I cannot seem to find the code modules online at your usual code location. Have you released them?

No, I haven't released the code yet...I may do so after I add some POD to the modules. I will likely release versions of the modules, and then maintain and keep developing versions with more debugging capabilities (dumping structures, etc.). As I mentioned before, however, these are modules...what gets displayed and the way it gets displayed is totally dependent upon the script you write.

Also, I still have some of the same concerns as our last conversation; specifically:

1. I'm not sure that the vast majority of analysts really understand Jump Lists at this point, so what would be the point of releasing yet another tool, particularly one like this, if the analyst doesn't understand enough about the Jump Lists to understand what's being displayed?

2. I've seen too many times how someone will figure that something doesn't work (usually through operator error), and post to a list but NOT say anything to the tool author. I've seen folks say that they couldn't get SIFT to work, and my response has always been, "...did you contact Rob?", to which they usually respond, "no".

Upon further reflection, add to that some additional thoughts:

3. I don't want the code absorbed into another project, after which I have no access to any of the research you mention, or to continued development.

4. I'm aware of tools that parse Jump Lists, including the DestList stream, but the author of at least one tool does not give credit (by name) for their source for the DestList structure. I've already published a great deal of my Jump List research on my blog as well as to the ForensicsWiki, and there doesn't seem to be any further development of that research, at least none that's available publicly.

5. RegRipper is used in several training courses (some colleges, and in other for-pay training courses), and for whatever reason, I can't even see/review the materials of those courses, either fully or in part.

From what I've seen with respect to other open source tools that I've released, as well as the information I've released on Jump Lists, all of these materials are being used within the community. That's great. However, what seems to be happening is that at least some of those who are using this information are benefiting from it, but very little is making it back into the community.

Rob Lee said...

One additional thought related to determining if a file is copied or moved in a filesystem. You can tell if a file is copied or moved on Win7 machines simply by looking at the MACB times. The information should match up with what you uncovered in the LNK files.

For a quick reference, the information can be found here: http://computer-forensics.sans.org/blog/2010/04/12/windows-7-mft-entry-timestamp-properties/

Rob Lee said...

Yeah, I truly understand how you feel. I have felt that way on similar projects and research that show up in other locations. I have seen the SIFT workstation in many locations... the great thing is that many of them asked to use it in those locations. Others did not. Still, thank you for all the hard work you are trying to do; I just would have liked to see the PERL code, as that is what I'm most familiar with.

I'm glad that you and I were able to sit down for lunch several months ago, and that I was able to show you the SANS books and each slide where we mention RegRipper in the SANS courses earlier this year. Sorry others have not had that opportunity.

H. Carvey said...

Rob,

Thanks for that opportunity. Too bad I never saw the NDA that we discussed.

As we discussed in our last exchange, I did reach out to Kristinn and make an offer of the code I had available at the time, and never heard back. I know folks get busy...we all do. I've also offered to assist with other items on his roadmap...and I don't see Jump Lists on that roadmap at the moment.

Any thoughts on the topics of accuracy and research that you mentioned in your first comment?

Rob Lee said...

I cannot speak for Kristinn, but I'd try again... he has had a lot going on. Kristinn moved himself and his family from Iceland to start a new job in California. That is a lot if you ask me.

As for the accuracy, yes I have seen multiple tools parse structures incorrectly. Even structures as simple as index.dat files routinely had misinterpreted data. Others assume the parsing is correct and they blindly build that into their own tools, testing only to ensure that it matches the output of the other tool. So accuracy would be to perform my own tests to ensure that the tool spits out the right time and the right artifacts, knowing exactly when and how I created them. Seeing which structures you pulled your data from makes it easier as I personally know PERL.

Jimmy_Weg said...

First, thank you for commenting on my rather small part in a team effort to bring jump lists into a mainstream discussion. I agree that their importance remains unrecognized by many of my peers. It almost never comes up on most of the lists to which I subscribe.

Concerning validation, I can say that the one case that you were kind enough to parse for me produced entirely accurate results. I realize that it was a while ago and involved only one data set, but I imagine that your code hasn't varied with respect to that function. I still want to test a little further with respect to the MAC times in the numbered streams, as to whether, or how closely, they parallel the target MACs. However, the importance of that question hasn't been significant in regard to my particular caseload. I've found variances, but that fact could be logical after further study.

There's little available when it comes to parsers. I pretty much use Mark Woan's and XWF for comparison, and I sometimes triple check with SSV. Mark has been extraordinarily gracious in adopting suggestions, such as simply offering an option to list the stream numbers in hex or decimal. I haven't had a need for a timeline-ready (in the format discussed here) export. Going back through shadows to determine frequency seems like a daunting task, but that's a great point. Often, we can achieve similar results by carving index records, but that depends on whether the desired records are recoverable.

(On an unrelated subject, thanks very much for your card! It was very thoughtful.)

davehull said...

Harlan,

Thanks for your research and this post. There's a wealth of information in the post and the references you've listed. Interesting about the MAC address in lnk files. I wonder where that originated. I went back and looked at the old MS-SHLLINK.pdf I'd downloaded almost 2 years ago when I was porting your lslnk.pl to Metasploit and it says nothing about MAC addresses, nor does the latest version of that document. I know it's in FTK 1.8's parser. I see Parsonage's paper, which I hadn't read until tonight, also mentions the MAC address.

H. Carvey said...

Jimmy,

There's little available when it comes to parsers.

It depends on your perspective. MiTeC's SSV + a LNK file parser make a good combination, and Mark Woan's JumpLister does a great job of parsing the data. My own tools are really more for an analyst who understands Jump Lists and what's available within them.

Going back through shadows to determine frequency seems like a daunting task...

Not at all. Corey Harrell has done a fantastic job of posting batch scripts that he uses to automate the ripping of data out of VSCs...having a tool that can go back through the VSCs and parse previous versions of a specific Jump List file would be a pretty simple way of doing this.

Dave,

Be sure to take a look at RFC 4122, as well.

Thanks.

Jimmy_Weg said...

I did mention SSV and Jumplister, though I don't think it's very efficient to use SSV+LNK parser when one tool will meet my needs. I do, however, have a copy of Paul Sanderson's LinkAlyzer, which does a very thorough job of presenting the data.

I haven't tried Corey's ripper. However, as you know, I have played with ProDiscover in regard to shadows and jump lists. It actually would be a rather handy way to extract anything that I need. In fact, if PD brings its jump list parser up to the level of your and Mark's tools, it would be remarkably fast and easy. (I'm awaiting the next release that should fix a few issues.)

H. Carvey said...

Jimmy,

That's an interesting analysis technique...let's say that you were able to parse the contents of a Jump List's DestList stream, and output the contents in TLN format. If you were interested in a particular file (say, a PDF or MSWord document), you could run that tool against the "current" Jump List file and pipe the output through find to see just the file you were interested in.

Then, using the technique Corey's discussed, you could automatically mount available VSCs, and run the same command against the previous versions of the Jump List file. This would (hopefully) show you all (or just some) of the previous times that the file had been accessed.

The same could be done with other tools.

Corey Harrell said...

@Harlan,

Your comment about analysts not knowing about Jump Lists because they don't encounter Windows 7 systems rings true for my neck of the woods. Most analysts I talk to locally deal with organizations who haven't upgraded to 7 yet (they also skipped the Vista upgrade). I even fall into the same category of not encountering too many 7 boxes. However, I think that this will change over the next year since organizations are going to be upgrading their enterprises from XP. Thanks for putting this post together; it contains a wealth of information. It's a great reference for not only learning but should come in handy when parsing jump lists.

@Jimmy

As Harlan already mentioned parsing the jump lists stored in volume shadow copies would be fairly easy. The process would work as you described; run a command to parse the lists from a mounted image followed by running the same command against any VSCs of interest. A cool thing is that you could parse other artifacts at the same time such as registry keys and Windows link files. Then you could either grep all the data looking for specific file types or review the output for leads. It would only take a minute or two to put together a batch script to do this. At some point during the month I'm putting together a post explaining the technique and will release a few batch scripts showing the capability.

H. Carvey said...

Corey,

I think you're right about analysts and access to Win7 boxes. In some cases, I've chatted with analysts who have analyzed Windows 7 boxes, but the cases involved malware, not specifically looking at user activity. I also agree with you that as organizations transition from XP, everyone will be seeing more Win7 systems come across their work bench.

Jimmy_Weg said...

Here in the hinterlands, most of our boxes are Win 7 or leftover Vista. That's probably because we rarely see enterprise machines, so the stuff is off-the-shelf from Staples. Corey, thanks for the tips, and I'll look forward to your post.

Anonymous said...

Hi Harlan,

I came across a handy trialware utility called "Jumplist File Extract 1.2". Life is soo easy when everything's "automated" lol :)

H. Carvey said...

Does it parse the DestList stream?

Anonymous said...

I think yes. It parses customdestination and automaticdestination and provides the results under the following columns: Long Name, Short Name, Modified, Created, Accessed, Attributes, File Size, Arguments, File Size, Title, Description, Vol. GUID and more!

I want you to use the tool and share your thoughts about it.

H. Carvey said...

I want you to use the tool and share your thoughts about it.

Thanks, but I'm not really interested in trying the tool, as I can't find too much information about it other than it's available at a number of download sites.

Also, I don't find anything that indicates that it parses the DestList stream.

Anonymous said...

Which is the difference between the information contained in *.automaticDestinations-ms and in lnk files %UserProfile%\AppData\Roaming\Microsoft\Windows\Recent?