Saturday, December 28, 2013
RegRipper
At the end of this past summer and into the fall, I was working on the print matter for Windows Forensic Analysis 4/e, and I'm now in the process of getting the extra, downloadable materials (I decided a while back to forgo the included DVD...) compiled and ready to post. During that entire process, and while conducting my own exams, I have updated a number of aspects of RegRipper...some of the code to RegRipper itself has been updated, and I've written or updated a number of plugins. Some recently posted blogs have provided information that has led to updates, or at least to a better understanding of the artifacts themselves (how they're created or modified, etc.).
I figure it will soon be time for an update to RegRipper. To that end, Brett has graciously provided me access to the Wordpress dashboard for the RegRipper blog, so this will be THE OFFICIAL SITE for all things RegRipper.
Now, I know that not everyone who uses RegRipper is entirely familiar with the tool, how to use it, and what really constitutes "Registry analysis". My intention is to have this site become the clearing house for all information related to RegRipper, from information about how to best use the tool to new or updated plugins.
I think that one of the biggest misconceptions about RegRipper is that it does everything right out of the box. What people believe RegRipper does NOT do has been a topic of discussion, to my knowledge, since a presentation at the SANS Forensic Summit in the summer of 2012. Unfortunately, in most cases, folks have used presentations and social media to state what they think RegRipper does not do, rather than ask how to get it to do those things. Corey has done a fantastic job of getting RegRipper to do things that he's needed done. From the beginning, RegRipper was intended to be community-based, meaning that if someone needed a plugin created or modified, they could go to one resource with the request and some sample data for testing, and that's it. That model has worked pretty well, when it's been used. For example, Corey posted a great article discussing PCA, Yogesh posted about another aspect of that topic (specifically, the AmCache.hve file), and Mari shared some data with me so that I could get a better, more thorough view of how the data is maintained in the file. Now, there's a RegRipper plugin that parses this file. The same thing is true with shellbags...thanks to the data Dan provided along with his blog post, there have been updates to the shellbags.pl plugin.
So, expect to see posts to the RegRipper site in 2014, particularly as I begin working on the updates to Windows Registry Forensics.
USB Devices
Speaking of the Registry...
Thanks to David, I saw that Nicole recently posted some more testing results, this time with respect to USB device first insertion. She also has a post up regarding directory traversal artifacts for those devices; that's right, another shellbag artifact post! Add this one to Dan's recent comprehensive post regarding the same artifacts, and you've got quite a bit of fascinating information between those two posts!
Reading through the posts, Nicole's blog is definitely one that you want to add to your blogroll.
Yogesh posted to his blog recently on USB Registry artifacts on Windows 8, specifically with respect to some Registry values that are new to Windows 8.
Wednesday, December 18, 2013
Shellbags
Dan recently posted what has to be one of the most thorough/comprehensive blog articles regarding the Windows shellbags artifacts. His post specifically focuses on shellbag artifacts from Windows 7, but the value of what he wrote goes far beyond just those artifacts.
Dan states at the very beginning of his post that his intention is to not focus on the structures themselves, but instead address the practical interpretation of the data itself. He does so, in part through thorough testing, as well as illustrating the output of two tools (one of which is commonly used and endorsed in various training courses) side-by-side.
Okay, so Dan has this blog post...so why am I blogging about his blog post? First, I think that Dan's post...in both the general and specific sense...is extremely important. I'm writing this blog post because I honestly believe that Dan's post needs attention.
Second, I think that Dan's post is just the start. I opened a print preview of his post, and with the comments, it's 67 pages long. Yes, there's a lot of information in the post, and admittedly, the post is as long as it is in part due to the graphic images Dan includes in his post. But this post needs much more attention than "+1", "Like", and "Good job!" comments. Yes, a number of folks, including myself, have retweeted his announcement of the post, but like many others, we do this in order to get the word out. What has to happen now is that this needs to be reviewed, understood, and most importantly, discussed. Why? Because Dan's absolutely correct...there are some pretty significant misconceptions about these (and admittedly, other) artifacts. Writing about these artifacts online and in books, and discussing them in courses will only get an analyst so far. What happens very often after this is that the analyst goes back to their office and doesn't pursue the artifacts again for several weeks or months, and by the time that they do pursue them, there are still misconceptions about these artifacts.
Shell Items
This discussion goes far beyond simply shellbags, in part because the constituent data structures, the shell items, are much more pervasive on Windows systems than I think most analysts realize, and they're becoming more so with each new version. We've known for some time that Windows shortcut/LNK files can contain shell item ID lists, and with Windows 7, Jump Lists were found to include LNK structures. Shell items can also be found in a number of Registry values as well, and the number of locations has increased from Vista to Windows 7, and again with Windows 8/8.1.
Consider a recent innovation to the Bebloh malware...according to the linked article, the malware deletes itself when it's loaded in memory, and then waits for a shutdown signal, at which point it writes a Windows shortcut/LNK file for persistence. There's nothing in the article that discusses the content of the LNK file, but if it contains only a shell item ID list and no LinkInfo block (or if the two are not homogeneous), then analysts will need to understand shell items in order to retrieve data from the file.
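As an aside, checking whether a given LNK file even contains a shell item ID list or a LinkInfo block is straightforward; here's a minimal Python sketch (not any particular tool's code), reading the LinkFlags field at the offset given in the MS-SHLLINK specification:

import struct
import sys

def lnk_flags(path):
    with open(path, "rb") as f:
        header = f.read(76)                            # ShellLinkHeader is 76 bytes
    if len(header) < 76 or struct.unpack("<I", header[:4])[0] != 0x4C:
        raise ValueError("not a shell link (LNK) file")
    flags = struct.unpack("<I", header[20:24])[0]      # LinkFlags field
    return {"HasLinkTargetIDList": bool(flags & 0x01), # shell item ID list present
            "HasLinkInfo":         bool(flags & 0x02)} # LinkInfo block present

for name, present in lnk_flags(sys.argv[1]).items():
    print("%-20s : %s" % (name, present))

If HasLinkTargetIDList is set and HasLinkInfo is not, the shell items are all you have to work with.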
These artifacts specifically need to be discussed and understood to the point where an analyst sees them and stops in their tracks, knowing in the back of their mind that there's something very important about them, and that the modification date and time don't necessarily mean what they think. It would behoove analysts greatly to take the materials that they have available on these (and other) artifacts, put them into a format that is most easily referenced, keep that reference next to their workstation, and share it with others.
Publishing Your Work
A very important aspect of Dan's post is that he did not simply sit back and assume that others, specifically tool authors and those who have provided background on data structures, have already done all the work. He started clean, by clearing out his own artifacts, and walking through a series of tests without assuming...well...anything. For example, he clearly pointed out in his post that the RegRipper shellbags.pl plugin does not parse type 0x52 shell items; the reason for this is that I have never seen one of these shell items, and if anyone else has, they haven't said anything. Dan then made his testing data available so that tools and analysis processes can be improved. The most important aspect of Dan's post is not the volume of testing he did...it's the fact that he pushed aside his own preconceptions, started clean, and provided not just the data he used, but a thorough (repeatable) write-up of what he did. This follows right in the footsteps of what others, such as David Cowen, Corey Harrell and Mari DeGrazia, have done to benefit the community at large.
Posts such as Dan's are very important, because very often artifacts don't mean what we may think they mean, and our (incorrect) interpretation of those artifacts can lead our examination in the wrong direction, resulting in the wrong answers being provided as a result of the analysis.
Monday, December 16, 2013
Updates
Things tend to move fast in the DFIR world sometimes, and since my last post, there have been some updates to some of the things that were mentioned/discussed, and those updates were important enough that I thought that they needed to be visible.
Page_brute
Robert has updated his blog post on restoring Windows CMD sessions from the pagefile; the update includes an updated Yara rule.
AmCache
Thanks to some data shared by a friend, I was able to see that not all of the File subkeys in the AmCache.hve file will have a SHA-1 hash listed. This does nothing to obviate Yogesh's work in this area; in fact, I would suggest that it opens the question of what data is recorded and under which circumstances, making this an even more important resource. I'd sent Mari a copy of the RegRipper amcache.pl plugin to try out, and she found that it crashed...it turns out this was due to the fact that not all of the subkeys had a value named "101" (see Yogesh's blog post with the values listed); I got that fixed right away.
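To illustrate the point, here's a sketch of the defensive approach (using Willi Ballenthin's python-registry module rather than the plugin's Perl; the key and value names follow Yogesh's post, so treat them as assumptions to check against your own data):

from Registry import Registry

reg = Registry.Registry("AmCache.hve")
for volume in reg.open("Root\\File").subkeys():
    for entry in volume.subkeys():      # subkeys are named for MFT file references
        try:
            path = entry.value("15").value()    # "15" = full path
        except Registry.RegistryValueNotFoundException:
            path = "(no path recorded)"
        try:
            sha1 = entry.value("101").value()   # "101" = SHA-1...not always present
        except Registry.RegistryValueNotFoundException:
            sha1 = "(no SHA-1 recorded)"
        print("%s | %s | %s" % (entry.name(), path, sha1))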
Yogesh has since posted part 2 of his series of blog posts discussing the AmCache.hve file. Yogesh addresses other subkeys within the AmCache.hve file, as well as some other files within the same folder. If you're at all interested in seeing some of what's new in Windows 8/8.1, specifically for DFIR analysts, take a look.
Shellbags
Thanks to Dan's research regarding shellbags artifacts, as well as his willingness to share his test data, I've been updating the RegRipper shellbags.pl plugin. As of now, it's capable of parsing the type 0x52 shell items that Dan found, and it provides the key path for the resources. So, instead of just:
Control Panel\User Accounts\Create Your Password
Control Panel\User Accounts\Change Your Picture
...now you'll see:
Control Panel\User Accounts\Create Your Password [Desktop\0\3\0\0\]
Control Panel\User Accounts\Change Your Picture [Desktop\0\3\0\1\]
The purpose of this is to allow analysts to validate what they're seeing much more easily. Using this additional information, analysts can validate the embedded DOSDate MAC times, among other things.
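If you want to go get the raw shell item behind one of those bracketed paths yourself, the numbers map to the BagMRU key structure; each BagMRU key holds its children's shell item blobs in numbered values. A minimal sketch, assuming python-registry and a Windows 7 USRCLASS.DAT hive:

import binascii
from Registry import Registry

BAGMRU = "Local Settings\\Software\\Microsoft\\Windows\\Shell\\BagMRU"

def shell_item_bytes(hive, bag_path):
    # "Desktop\0\3\0\1" -> key BagMRU\0\3\0, value "1"
    indices = bag_path.strip("\\").split("\\")[1:]   # drop the "Desktop" root
    key = Registry.Registry(hive).open(BAGMRU + "".join("\\" + i for i in indices[:-1]))
    return key.value(indices[-1]).value()

raw = shell_item_bytes("USRCLASS.DAT", "Desktop\\0\\3\\0\\1")
print(binascii.hexlify(raw))

From there, you can verify the plugin's parsing against the hex dump by hand.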
I hope that this blog post helps analysts understand the embedded DOSDate time stamps within shell items.
With respect to shellbags, something "interesting" I've found is that the MFT file reference that can be found in some shell items that comprise the shellbags artifacts can be a bit misleading; well, maybe not "misleading" but not easily validated. While I've been able to validate the accuracy of this information for folders on local hard drives (I have the MFT as well as the USRCLASS.DAT hive in some data sets), for removable drives and other resources, the data cannot be validated without the MFT from that external resource. As such, at this point, I'm debating whether to provide the MFT file reference in the plugin output, or comment out that code (which would allow someone to un-comment it).
Part of the reason why I'm debating (with myself, mostly) whether or not to provide this information in the output of the plugin is that right now, there's a great deal of information being displayed, and I'm afraid that not all of it is all that well understood. I've seen analysts try to fit the data they're seeing in the tool output to their theory by assuming (albeit incorrectly) what the data means, and that just doesn't work. In fact, it can be disastrous to the examination.
For example, here's an extract from the output of the current version of the shellbags.pl plugin, run against a sample data set:
| 83/8 |My Computer\D:\VM [Desktop\1\2\5\]
| 275/2 |My Computer\D:\VM\Win2008 [Desktop\1\2\5\0\]
| 84/6 |My Computer\D:\VM\Win7 [Desktop\1\2\5\1\]
| 5402/3 |My Computer\D:\VM\Win2003 [Desktop\1\2\5\2\]
| 5410/2 |My Computer\D:\VM\XP2 [Desktop\1\2\5\3\]
| 5422/6 |My Computer\D:\VM\XP3 [Desktop\1\2\5\5\]
| 404/43 |My Computer\D:\test [Desktop\1\2\6\]
| 405/43 |My Computer\D:\vsc [Desktop\1\2\7\]
Now, I have the image of the system from which this data was extracted, so I've been able to extract and parse the MFT in order to verify the MFT file references, which are listed as the MFT record number, followed by the sequence number. So, for an analyst who has an image, or even just the hive file and the MFT available, it's a simple matter to determine whether the path listed in the output of the tool can be referenced to a current record in the MFT, or to an older, historic record. An example of this would be where the MFT file reference for the D:\test folder (as illustrated above) is "404/43" in the shellbags, but MFT record 404 in the current MFT has a higher sequence number (44 or above) and a different name.
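That check is easily scripted; a quick sketch, assuming an extracted $MFT with the usual 1024-byte FILE records (the 16-bit sequence number sits at offset 0x10 of the record header):

import struct

def mft_sequence(mft_path, record_number, record_size=1024):
    with open(mft_path, "rb") as f:
        f.seek(record_number * record_size)
        record = f.read(record_size)
    if record[:4] != b"FILE":
        raise ValueError("record %d: bad signature" % record_number)
    return struct.unpack("<H", record[0x10:0x12])[0]   # sequence number

rec, seq = 404, 43                       # MFT file reference from the plugin output
current = mft_sequence("mft.raw", rec)
print("current" if current == seq else "historic (sequence is now %d)" % current)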
Okay, as if that's not confusing enough, what happens if you have MFT file references for paths that are NOT on a hard drive local to the system? Consider the following:
| 136/5 |My Computer\F:\seattle\Tools [Desktop\1\1\36\0\]
| 556/37 |My Computer\F:\seattle\Tools\case2 [Desktop\1\1\36\0\0\]
| 560/3 |My Computer\F:\seattle\Tools\case1 [Desktop\1\1\36\0\1\]
In this case, this information was pulled from one of my own systems, and I know that the F:\ volume was, in fact, an external USB-connected wallet drive. Without the MFT from the wallet drive, this information cannot be validated. Is this data then useful?
How Valuable Is Data?
All of this brings up a very important question...specifically, what data is valuable, and how valuable is it? Can value be assessed differently by different analysts, or is data value somewhat universal?
What is the value of the data if it is misunderstood by the analyst? Is the value of the data diminished if the analyst follows it down the wrong path, based on misunderstood data and incorrect assumptions?
Here's an example - below are the embedded MAC times taken from the D:\VM\Win2003 entry above:
M: 2011-08-18 23:55:46
A: 2011-08-18 23:55:46
C: 2011-08-18 23:48:32
What do these time stamps mean? What is the "value" of this data?
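For reference, these embedded values are 32-bit DOSDate time stamps...a date word and a time word with two-second resolution. Decoding them is simple; here's a small sketch:

import datetime

def parse_dosdate(date_word, time_word):
    return datetime.datetime(((date_word >> 9) & 0x7F) + 1980,  # year
                             (date_word >> 5) & 0x0F,           # month
                             date_word & 0x1F,                  # day
                             (time_word >> 11) & 0x1F,          # hour
                             (time_word >> 5) & 0x3F,           # minute
                             (time_word & 0x1F) * 2)            # 2-sec increments

print(parse_dosdate(0x3F12, 0xBEF7))     # -> 2011-08-18 23:55:46

Note that the two-second granularity means the seconds value can never be odd, and these time stamps are generally recorded in local system time...both points matter before you treat them as equivalent to file system time stamps.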
SAM Hive
Lance posted an EnScript that parses the SAM Registry hive file and displays the users in each security group.
This has been an integral part of the RegRipper samparse.pl plugin for some time now, and it's great to see the need for this information being recognized and brought to other tools.
Addendum, 17 Dec: On the heels of Corey's RecentFileCache.bcf post, Lance has written an EnScript to parse the file. I told you...sometimes you can go weeks without anything going on in the DFIR world, and then all of a sudden, things start to happen fast!
Addendum, 18 Dec: Corey posted another PCA article, this one addressing Registry keys that maintain data (not configuration information) regarding indicators of program execution. Great stuff!
Wednesday, December 04, 2013
Links and News
There have been some exciting developments recently on the Windows digital forensic analysis front, and I thought it would be a good idea to bring them all together in one place.
Recover CMD sessions from the pagefile
If you perform analysis of Windows systems at all, be sure to check out Robert's blog post that discusses how to use page_brute (which I'd mentioned previously here) to recover command prompt sessions from the Windows pagefile. In the post, the author mentions quite correctly that grabbing a memory image still isn't something that's part of standard incident response procedures. If you receive a laptop system (or an image thereof) you may find a hibernation file, which you can then analyze, if doing so is something that will help you attain your goals.
Page_brute is based on Yara rules, and Robert shares the rule that he wrote...if you look at it, and follow his reasoning in the post, it's amazingly simple AND it works!
This sort of analysis can be very valuable, particularly if you don't have a memory dump available. As we learned at OMFW 2013, Volatility is moving in the direction of incorporating the pagefile into analysis, which is fantastic...but that's predicated on the responder's ability to capture a memory dump prior to shutting the system down.
I got yara-python installed (with some help...thanks!), and then extracted the pagefile from an image I have available. I copied the rule out of Robert's blog post, pasted it into the default_signatures.yar file that is part of page_brute, and ran the script. In fact, page_brute.py worked so well that as it was running through the pagefile and extracting artifacts, MS Security Essentials "woke up" and quarantined several extracted blocks identified as Exploit:js/Blacole, specifically the KU and MX variants. I then opened a couple of the output files from the CMDscan_Optimistic_Blanklines folder, and I wasn't seeing any of the output that Robert showed in his blog post, at least not in the first couple of files. So, I ran strings across the output files, using the following command:
D:\Tools>strings -n 5 H:\test\output\CMDscan_Optimistic_Blanklines\*.page | find "[Version"
I didn't get anything, so I ran the command again, this time without the "[", and I got a number of strings that looked like Registry key paths. In the end, this took some setup, downloading a script and running two commands, but you know what...even with that amount of effort, I still got 'stuff' that I would not have gotten as quickly. Not only has page_brute.py proved to be very useful, it also illustrates what can be done when someone wants to get a job done.
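By the way, if you're curious about what page_brute is actually doing under the hood, the core loop is simple enough to sketch in Python (this is my approximation, not Robert's code, and the file names are placeholders):

import yara

PAGE = 4096
rules = yara.compile(filepath="default_signatures.yar")

with open("pagefile.sys", "rb") as f:
    idx = 0
    while True:
        chunk = f.read(PAGE)
        if not chunk:
            break
        for match in rules.match(data=chunk):   # check each page against the rules
            with open("page_%08d.%s.page" % (idx, match.rule), "wb") as out:
                out.write(chunk)                # save matching pages for review
        idx += 1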
Resources
Excellent Yara post; look here to get the user manual and see how to write rules.
Registry Forensics Class
If you're interested in an online course in analyzing the Windows Registry, Andrew Case, Vico Marziale, and Joe Sylve put together the Registry Analysis Master Class over at The Hacker Academy; take a look at Ken Pryor's review of the class to see if this is something for you.
Windows Application Experience and Compatibility
Corey's got a new blog post up where he discusses the Windows Application Experience and Compatibility feature, and how the RecentFileCache.bcf file can serve as a data source indicating program execution. As usual, Corey's post is thorough, referencing and building on previous work.
Corey shared a link to his blog post over on the Win4n6 Yahoo group, and Yogesh responded that he's doing some research along the same lines, as well, with a specific focus on Windows 8 and the AmCache.hve file, which follows the same file format as Windows Registry hives. Yogesh's blog post regarding the AmCache.hve file can be found here. Why should you care about this file? Well, from the post:
This file stores information about recently run applications/programs. Some of the information found here includes Executable full path, File timestamps (Last Modified and Created), File SHA1 hash, PE Linker Timestamp, some PE header data and File Version information (from Resource section) such as FileVersion, ProductName, CompanyName and Description.
This information can be very valuable during analysis; for example, using the SHA-1 hash, an analyst could search VirusTotal for information regarding a suspicious file. The file reference number from the key name could possibly be used to locate other files that may have been written to the system around the same time.
More Stuff
As I was working on a RegRipper plugin for parsing and presenting the data in the AmCache.hve file, I ran across something interesting, although I have only one sample file to look at at the moment. Beneath the Root key is a Programs subkey, and that appears to contain subkeys for various programs. The values within each of these subkeys do not appear to correspond to what Yogesh describes in his post, but there is some very interesting value data available. For example, the Files value is a multi-string value that appears to reference various files beneath the Root\Files subkey (as described in Yogesh's post) that may be modules loaded by the program. This can provide for some very interesting correlation, particularly if it's necessary for your analysis.
Yogesh has been posting some great information over on his blog recently, specifically with respect to Registry and Windows Event Log artifacts associated with USB devices connected to Windows 8 systems. Be sure to add it to your daily reading, or to your blog roll, in order to catch updates.
Wednesday, November 20, 2013
Sniper Forensics, Memory Analysis, and Malware Detection
I conducted some analysis recently where I used timeline analysis, Volatility, and the Sniper Forensics concepts shared by Chris Pogue to develop a thorough set of findings in relatively short order.
I was analyzing an image acquired from a system thought to have been infected with Poison Ivy. All I had to go on were IPS alerts of network traffic originating from this system on certain dates...but I had no way to determine how "close" the clocks were for the system and the monitoring device.
I started by creating a timeline, and looking for the use of publicly documented persistence mechanisms employed by the malware in question (Windows service, StubPath values within "Installed Components" Registry keys, etc.), as well as looking for the use of NTFS alternate data streams. All of these turned up...nothing. A quick review of the timeline showed me that McAfee AV was installed on the system, so I mounted the image via FTK Imager and scanned it with ClamWin 0.98 and MS Security Essentials (both of which were updated immediately prior to the scan). MSSE got a hit on a file in the system32 folder, which later turned out to be the initial installer. I searched the timeline, as well as the Software and System Registry hives for the file name, and got no hits beyond just the existence of that file within the file system.
The system was a laptop and had a hibernation file, which was last modified the day that the system itself had been shut down. I exported the hibernation file for analysis, and then downloaded the Win32 standalone version of Volatility version 2.3.1, and used that to convert the hibernation file to raw format, then identify the profile (via the imageinfo plugin) and get a process list (via the pstree plugin).
I got in touch with Jamie Levy (@gleeda), who pointed me to this post, which was extremely helpful. Based on the output of the malfind plugin, I was able to identify a possible process of interest, and I searched the timeline for the executable file name, only to find nothing. However, I did have a process ID to correlate to the output of the pstree plugin, and I could see that the process in question was a child of services.exe, which told me not only where to look, but also that the installation process required at least Administrator-level privileges. Correlating information between the RegRipper samparse.pl and profilelist.pl plugins, I was able to see which user profiles (both local and domain) on the system were members of the appropriate group. This also provided me with the start time for both processes, which correlated to the system being started.
A fellow team member had suggested running strings across the memory dump, and then searching the output for specific items; doing so, I found what appeared to be the configuration information used by Poison Ivy (which was later confirmed via malware RE performed by another team member). As a refinement to that approach, I ran the Volatility vaddump plugin, targeting the specific process, and dumped all of the output files to a folder. I then ran strings across all of the output folders, and found a similar list of items as when I'd run strings across the entire memory dump.
I validated the configuration item findings via the Volatility handles plugin, looking for mutants within the specific process. I also dumped the process executable from the memory dump via the procmemdump plugin; the first time I did it, MSSE lit up and quarantined the file, so I disabled MSSE and re-ran the plugin (I set my command prompts with QuickEdit and Insert modes, so re-running the command line was as simple as hitting the up-arrow once, and then hitting Enter...). I was able to ship the dumped PE file and the initial wrapper I found to a specialist for analysis while I continued my analysis.
I then ran the svcscan plugin and searched the output for the executable file name, and found the Windows service settings, to include the service name. I also ran the dlllist plugin to see if there were any DLLs that I might be interested in extracting from the image.
I also ran the netscan plugin, and reviewing the output gave no indication of network connections associated with the process in question. Had there been a connection established associated with the process of interest, I would've run Jamaal's ethscan module, which he talked about (however briefly) at the recent OMFW.
I had downloaded the Poison Ivy plugin, which according to the headers, is specific to one version of the PIvy server. The plugin ran for quite a while, and because I had what I needed, I simply halted it before it completed.
Timing
Something important to point out here is that the analysis that I've described here did not take days or weeks to complete; rather, the research and memory analysis was all completed in just a few hours on one evening. It could have gone a bit quicker, but until recently, I haven't had an opportunity to do a lot of this type of analysis. However, incorporating a "Sniper Forensics" thought process into my analysis, and in turn making the best use of available tools and processes, allowed me to develop findings in an efficient and thorough manner.
In short, the process (with Volatility plugins) looks like this:
- convert, imageinfo, pstree
- malfind - get PID
- run vaddump against the PID, then run strings against the output
- run handles against the PID
- run procmemdump against the PID (after disabling AV)
- run dlllist against the PID
- run netscan; if connections are found associated with the PID, run ethscan
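For anyone who wants to follow along at the command line, the sequence looks something like the following; the file names, profile, and PID (868) are illustrative, from my example, and the standalone executable name may vary with the version you download:

D:\Tools>volatility-2.3.1.standalone.exe -f hiberfil.sys imagecopy -O mem.raw
D:\Tools>volatility-2.3.1.standalone.exe -f mem.raw imageinfo
D:\Tools>volatility-2.3.1.standalone.exe -f mem.raw --profile=Win7SP1x86 pstree
D:\Tools>volatility-2.3.1.standalone.exe -f mem.raw --profile=Win7SP1x86 malfind
D:\Tools>volatility-2.3.1.standalone.exe -f mem.raw --profile=Win7SP1x86 vaddump -p 868 -D vads
D:\Tools>volatility-2.3.1.standalone.exe -f mem.raw --profile=Win7SP1x86 handles -p 868 -t Mutant
D:\Tools>volatility-2.3.1.standalone.exe -f mem.raw --profile=Win7SP1x86 procmemdump -p 868 -D dump
D:\Tools>volatility-2.3.1.standalone.exe -f mem.raw --profile=Win7SP1x86 dlllist -p 868
D:\Tools>volatility-2.3.1.standalone.exe -f mem.raw --profile=Win7SP1x86 netscan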
At this point, I had a good deal of information that I could add to my timeline in order to provide additional context, granularity, and relative confidence in the data I was viewing. File system metadata can be easily modified, but Windows Event Log records indicating service installation and service start can be correlated with process start times, assisting you in overcoming challenges imposed by the use of anti-forensics techniques.
It's also important to point out that Registry, timeline, and memory analysis are all tools, and while they're all useful, each of them in isolation is nowhere near as powerful as using them all together. Each technique can provide insight that can either augment or be used as pivot points for another.
Resources
A couple of interesting items have popped up recently regarding memory acquisition and analysis, and I thought it would be a good idea to provide them here:
- Jamie also pointed me to this post; while it wasn't directly helpful (the image I was working with was from a Win7SP1 system), it does show the thought process and work flow in solving the challenge
- From MoonSols, some new features in DumpIt
- Mark Baggett's ISC Diary posts on winpmem and its uses: here, and here
- Page_brute - check out how you can incorporate the pagefile into your analysis, until Volatility v3 hits the streets
Wednesday, November 13, 2013
Tools, Malware, and more conference follow-up
Tools
Arsenal Recon - Image Mounter utility: the web site says that the tool is "currently geared towards digital forensics and incident response developers, but we have released a sample GUI and command line mount/management utility known as "MountTool" as a proof of concept."
The web site goes on to say, "We have heard that others in the DFIR community are already building better GUIs and extending the functionality of Arsenal Image Mounter..."; this is great to hear, and I hope that this is something that becomes public. Having a proof of concept tool is great to test things out, but seeing something that extends the capabilities of the utility is even better, and I think, far more useful.
Willi Ballenthin's EVXTract: Here is Willi's presentation from the SANS Summit in Prague, on 6 Oct 2013. In the sixth slide, Willi mentions the 6 steps to carving EVT records from unallocated space or other unstructured data. Somewhere in there, perhaps step 4a, should be "given the record size, read that number of bytes into the buffer, and the last 4 bytes should be the same as the first four bytes". This is the "sanity check" I've used as an initial step for validating EVT records.
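That sanity check translates directly into code; here's a Python sketch, keying on the "LfLe" magic number at offset 4 of each record and the 56-byte fixed header size:

import struct

def valid_evt_record(buf, offset):
    size = struct.unpack_from("<I", buf, offset)[0]    # record length, first DWORD
    if size < 56 or offset + size > len(buf):
        return False
    if buf[offset + 4:offset + 8] != b"LfLe":          # EVT record magic number
        return False
    closing = struct.unpack_from("<I", buf, offset + size - 4)[0]
    return closing == size                             # last DWORD repeats the length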
Anyway, what Willi's presentation and materials really demonstrate for me is the move to carving for records rather than full files. Yes, others have been doing this for a while, in particular the MagnetForensics folks with their IEF product, but that's not an open source tool. Understanding data structures and going after the individual records can be far more fruitful than going after full files.
Mari has written up an excellent post detailing how to retrieve deleted data from a SQLite database; she's even provided a Python-based parser. This is yet another aspect of data retrieval that is based on a detailed knowledge and understanding of data structures, similar to what was discussed here (IE index.dat deleted records), and here (MFT resident file residue).
Lance's post regarding the tools he turns to during examinations, broken down by tiers. His list is very useful in that it illustrates a thought process that others can look at for comparison. I can't say that I have a similar breakdown, which probably has more to do with the types of analysis I tend to get engaged in, and the prior information that each of us may have available.
Of course, how a tool is used can also be very valuable information. For example, from Lance's post, as well as what I saw at OSDFC, folks use RegRipper, but apparently not the way I use it. It seems that the predominant means of using RegRipper is to just run all the plugins against the available hives. That may be good for some folks, but like Corey Harrell, I tend to take something of a more targeted approach. Often during my own examinations, I don't need everything...in fact, getting everything can often be an expenditure of time and effort that I can't afford, as I have to sort through a lot of data that isn't particularly useful or beneficial. While in some instances it is useful to take a broader approach, there are times and circumstances when it's far more beneficial to take something of a targeted approach, if for no other reason than to do a quick check and get something of value quickly, leaving the "everything" for later. At times, I've found it very beneficial to use a Registry viewer (WRR is pretty good, although it doesn't handle "big data" values) for viewing and searches, then run specific RegRipper plugins, and then have a plugin open in an editor for tweaking. That's kind of how I went about this work involving shell items.
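For example, targeted use is as simple as pointing rip.pl at a hive with a specific plugin, rather than running a full profile (the paths here are illustrative):

D:\Tools\rr>rip.pl -r D:\case\NTUSER.DAT -p userassist
D:\Tools\rr>rip.pl -r D:\case\USRCLASS.DAT -p shellbags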
Based on recent conferences, I'm going to have to look at adding TSK Autopsy to one of my own tiers.
Malware Response and Analysis
Claus has a new post up regarding his Anti-Malware Response Go Kit; in this case, Claus discusses more about his requirements and process, whereas some of his previous posts have been admittedly valuable lists of tools. The one correction I would offer up, however, is in the section where he briefly discusses RegRipper: "Plugins are developed by the community..." - actually, not so much, really. Yes, there are a number of plugins developed by other folks, for which I am very thankful, and I offer a huge thanks to them for doing so. Other plugins were developed or extended because someone (Corey Harrell) saw the need and reached out to me, but for the most part, the majority of plugin development has not been something that's sought after by the community at large. I have updated a couple of plugins (appcompatcache.pl) to support Windows 8/8.1 thanks to the suggestion and submission of additional test data by Eric Z. Also, I've written a number of malware-specific, "tactical" plugins based on something I've seen, usually within just a couple of minutes of reading the material...but again, these are not something that have been requested or sought by the community.
From the Handler's Diary blog, Analyzing Malicious Processes: very cool use case for Volatility. Not only does the post illustrate how to use Volatility, but it also clearly illustrates what an artifact (in this case, of lateral movement) "looks like", which is something that we often don't see addressed. Even though artifacts are mentioned, many times, there's very little information about what to look for on disk or in memory with respect to those artifacts.
Also from the Handler's Diary - jackcr's DFIR challenge summary; Jamie Levy (@gleeda) referred to this challenge in her "Every Step You Take: Profiling the System" talk at OMFW2013.
From Lenny Zeltser, a list of XOR obfuscation examination tools.
Win32/Trooti is an example of malware that "parks" on a vacant SvcHost value. [TrendMicro, M86 Security] It's interesting how the different write-ups address the particular value on which the malware "parks"...some tell us that it happens, others share the results of their particular analysis. More than anything else, I think that this information can help us in reviewing and if necessary, updating our own malware detection techniques.
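One quick way to review a system for this technique is to walk the netsvcs list from the Software hive and see which names actually resolve to a service with a ServiceDll in the System hive. A sketch, assuming python-registry and exported hives (for brevity, I hard-code ControlSet001 here; check the Select key to find the control set that was current):

from Registry import Registry

soft = Registry.Registry("SOFTWARE")
syst = Registry.Registry("SYSTEM")

svchost = soft.open("Microsoft\\Windows NT\\CurrentVersion\\Svchost")
for name in svchost.value("netsvcs").value():    # REG_MULTI_SZ list of group members
    try:
        svc = syst.open("ControlSet001\\Services\\%s" % name)
    except Registry.RegistryKeyNotFoundException:
        print("%-24s : vacant (no Services key)" % name)
        continue
    try:
        dll = svc.subkey("Parameters").value("ServiceDll").value()
    except (Registry.RegistryKeyNotFoundException, Registry.RegistryValueNotFoundException):
        dll = "(no ServiceDll)"
    print("%-24s : %s" % (name, dll))

A "vacant" name that suddenly has a Services key pointing to an odd DLL path is worth a closer look.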
Conference Followup
There's another write up available regarding recent conferences, in this case, the same ones I was able to attend. As such, the write-up provides something of a different perspective. You can find the OMFW & OSDFC Recap post over at the HiddenIllusion blog site. This post is more of a blow-by-blow of the conferences, and it does give an interesting perspective on the conferences. A couple of things I wanted to comment on (and this doesn't mean that I disagree...):
From the general notes: "I was told that my tweets and recap post of last years activities was helpful to those who couldn't attend..." - this is much more of a truism than most really recognize, I think. Not everyone can attend conferences, but very often we're able to learn more about which conferences or events would be most beneficial to us by what others share about their experiences; what was good, what wasn't so good, what there should be more or less of, etc. This sort of feedback can not only be valuable to folks who are planning out their conference attendance, but it can also be extremely valuable to conference organizers and presenters, as well. For example, in the OSDFC section, there was this quote:
On the disappointing side - I did feel like I was seeing a noticeable amount of people doing the same things as others have already done.
That's extremely valuable to know. Yes, you're right...there was content that would make folks think, "wow, I've seen this stuff before", and there could be any number of reasons for that. Sometimes, I think folks pursue something because they're seeing it for the first time, and they don't do a comprehensive literature search prior to kicking off their research. Sometimes, this is good, because it's done for educational purposes, and the researcher may find something new. Remember though...this year, the conference program was crowd-sourced, so it wasn't Brian and the BasisTech staff who decided which presentations would be given.
Under the "Memoirs of a Hidsight[sic] Hero" section, the author says, "Don't try and write a book about Mac rootkits..."; it wasn't so much that Cem was out to disprove the authors because they were writing a book. Rather, my take was that Cem heard the claims of near-complete anonymity and thought, "that can't be right", and ended up disproving the claims. Maybe the take away from this one would be, "...if you're going to write a book on Mac rootkits, get Cem as a co-author or tech editor." ;-)
With respect to the MantaRay presentation: "...maybe useful for others but doesn't fit into my process flow." This is exactly right...depending upon the types of issues you face, you may not need to run every tool every time. However, one of the useful things about tools like MantaRay and its predecessor, TapeWorm, is that very often, they're configurable. That is, you can trim them down or add to them in order to meet your needs. The guys who developed MantaRay have provided the tools for use by others, which is great, particularly for folks with similar use-cases, or those new to the issue at hand.
Arsenal Recon - Image Mounter utility: the web site says that the tool is "currently geared towards digital forensics and incident response developers, but we have released a sample GUI and command line mount/management utility known as "MountTool" as a proof of concept."
The web site goes on to say, "We have heard that others in the DFIR community are already building better GUIs and extending the functionality of Arsenal Image Mounter..."; this is great to hear, and I hope that this is something that becomes public. Having a proof of concept tool is great to test things out, but seeing something that extends the capabilities of the utility is even better, and I think, far more useful.
Willi Ballenthin's EVXTract: Here is Willi's presentation from the SANS Summit in Prague, on 6 Oct 2013. In the sixth slide, Willi mentions the 6 steps to carving EVT records from unallocated space or other unstructured data. Somewhere in there, perhaps step 4a, should be "given the record size, read that number of bytes into the buffer, and the last 4 bytes should be the same as the first four bytes". This is the "sanity check" I've used as an initial step for validating EVT records.
Anyway, what Willi's presentation and materials really demonstrate for me is the move to carving for records rather than full files. Yes, others have been doing this for a while, in particular the MagnetForensics folks with their IEF product, but that's not an open source tool. Understanding data structures and going after the individual records can be far more fruitful than going after full files.
Mari has written up an excellent post detailing how to retrieve deleted data from a SQLite database; she's even provided a Python-based parser. This is yet another aspect of data retrieval that is based on a detailed knowledge and understanding of data structures, similar to what was discussed here (IE index.dat deleted records), and here (MFT resident file residue).
Lance's post regarding the tools he turns to during examinations, broken down by tiers. His list is very useful in that it illustrates a thought process that others can look at for comparison. I can't say that I have a similar breakdown, which probably has more to do with the types of analysis I tend to get engaged in, and the prior information that each of us may have available.
Of course, how a tool is used can also be very valuable information. For example, from Lance's post, as well as what I saw at OSDFC, folks use RegRipper, but apparently not the way I use it. It seems that the predominant means of using RegRipper is to just run all the plugins against the available hives. That may be good for some folks, but like Corey Harrell, I tend to take a more targeted approach. Often during my own examinations, I don't need everything...in fact, getting everything can often be an expenditure of time and effort that I can't afford, as I have to sort through a lot of data that isn't particularly useful or beneficial. While in some instances it is useful to take a broader approach, there are times and circumstances when it's far more beneficial to take a targeted approach, if for no other reason than to do a quick check and get something of value quickly, leaving the "everything" for later. At times, I've found it very beneficial to use a Registry viewer (WRR is pretty good, although it doesn't handle "big data" values) for viewing and searches, then run specific RegRipper plugins, and then have a plugin open in an editor for tweaking. That's kind of how I went about this work involving shell items.
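As a concrete example of what the targeted approach looks like, rather than running all plugins via a profile, I'll run individual plugins against specific hives, something along these lines (the plugin names here are simply examples):

rip.pl -r NTUSER.DAT -p userassist
rip.pl -r SYSTEM -p mountdev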
Based on recent conferences, I'm going to have to look at adding TSK Autopsy to one of my own tiers.
Malware Response and Analysis
Claus has a new post up regarding his Anti-Malware Response Go Kit; in this case, Claus discusses more about his requirements and process, whereas some of his previous posts have been admittedly valuable lists of tools. The one correction I would offer up, however, is in the section where he briefly discusses RegRipper: "Plugins are developed by the community..." - actually, not so much, really. Yes, there are a number of plugins developed by other folks, for which I am very thankful, and offer a huge thanks to them for doing so. Other plugins were developed or extended because someone made a request and provided some sample data for testing.
From the Handler's Diary blog, Analyzing Malicious Processes: very cool use case for Volatility. Not only does the post illustrate how to use Volatility, but it also clearly illustrates what an artifact (in this case, of lateral movement) "looks like", which is something that we often don't see addressed. Even though artifacts are mentioned, many times, there's very little information about what to look for on disk or in memory with respect to those artifacts.
Also from the Handler's Diary - jackcr's DFIR challenge summary; Jamie Levy (@gleeda) referred to this challenge in her "Every Step You Take: Profiling the System" talk at OMFW2013.
From Lenny Zeltser, a list of XOR obfuscation examination tools.
Win32/Trooti is an example of malware that "parks" on a vacant SvcHost value. [TrendMicro, M86 Security] It's interesting how the different write-ups address the particular value on which the malware "parks"...some simply tell us that it happens, while others share the results of their particular analysis. More than anything else, I think that this information can help us in reviewing and, if necessary, updating our own malware detection techniques.
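For reference, the key in question lives in the Software hive; each value beneath it names a svchost group, and its data lists the services loaded into that instance. A notional illustration (not taken from an actual infected system) of what a "parked" entry might look like:

Key: HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Svchost
  netsvcs             (REG_MULTI_SZ) = ...list of legitimate service names...
  <vacant group name> (REG_MULTI_SZ) = <malware service name>

A service name added to an otherwise unused group can be very easy to overlook.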
Conference Followup
There's another write-up available regarding recent conferences, in this case, the same ones I was able to attend. As such, the write-up provides something of a different perspective. You can find the OMFW & OSDFC Recap post over at the HiddenIllusion blog site. The post is more of a blow-by-blow of the two events, and it gives an interesting perspective on each. A couple of things I wanted to comment on (and this doesn't mean that I disagree...):
From the general notes: "I was told that my tweets and recap post of last years activities was helpful to those who couldn't attend..." - this is much more of a truism than most really recognize, I think. Not everyone can attend conferences, but very often we're able to learn which conferences or events would be most beneficial to us by what others share about their experiences; what was good, what wasn't so good, what there should be more or less of, etc. This sort of feedback is not only valuable to folks who are planning out their conference attendance, but it can also be extremely valuable to conference organizers and presenters. For example, in the OSDFC section, there was this quote:
On the disappointing side - I did feel like I was seeing a noticeable amount of people doing the same things as others have already done.
That's extremely valuable to know. Yes, you're right...there was content that would make folks think, "wow, I've seen this stuff before", and there could be any number of reasons for that. Sometimes, I think folks pursue something because they're seeing it for the first time, and they don't do a comprehensive literature search prior to kicking off their research. Sometimes, this is good, because it's done for educational purposes, and the researcher may find something new. Remember though...this year, the conference program was crowd-sourced, so it wasn't Brian and the BasisTech staff who decided which presentations would be given.
Under the "Memoirs of a Hidsight[sic] Hero" section, the author says, "Don't try and write a book about Mac rootkits..."; it wasn't so much that Cem was out to disprove the authors because they were writing a book. Rather, my take was that Cem heard the claims of near-complete anonymity and thought, "that can't be right", and ended up disproving the claims. Maybe the take away from this one would be, "...if you're going to write a book on Mac rootkits, get Cem as a co-author or tech editor." ;-)
With respect to the MantaRay presentation: "...maybe useful for others but doesn't fit into my process flow." This is exactly right...depending upon the types of issues you face, you may not need to run every tool every time. However, one of the useful things about tools like MantaRay and its predecessor, TapeWorm, is that very often, they're configurable. That is, you can trim them down or add to them in order to meet your needs. The guys who developed MantaRay have provided the tools for use by others, which is great, particularly for folks with similar use cases, or those new to the issue at hand.
Thursday, November 07, 2013
Conferences
I recently had the opportunity (and honor) of attending the Open Memory Forensics Workshop (OMFW) and the Open Source Digital Forensics Conference (#OSDFCon), held in Chantilly, VA. I've attended both conferences in the past, had a great time, and this time around was no different.
OMFW
I've always enjoyed the format that Aaron has used for the OMFW, going back to the very first one. That first time, there was a short presentation followed by a panel, and back and forth, with breaks. It was fast-moving, the important stuff was shared, and if you wanted more information, there was usually a web site that you could visit in order to download the tools, etc.
This time around, there was greater focus on things like upcoming updates to Volatility, and the creation of the Volatility Foundation. Also, a presentation by George M. Garner, Jr., was added, so there were more speakers, more variety in topics discussed, and a faster pace, all of which worked out well.
The presentations that I really got the most out of were those that were more akin to use cases.
Sean and Steven did a great job showing how they'd used various Volatility plugins and techniques to get ahead of the bad guys during an engagement, by moving faster than the bad guys could react and getting inside their OODA loop.
Cem's presentation was pretty fascinating, in that it all seemed to have started with a claim by someone that they could hide via a rootkit on Mac OSX systems. Cem's very unassuming, and disproved the claim pretty conclusively, apparently derailing a book (or at least a chapter of the book) in the process!
Jamie's presentation involved leveraging CybOX with Volatility, and was very interesting, as well as well-received.
There was more build-up and hype to Jamaal's presentation than there was actual presentation! ;-) But that doesn't take anything at all from what Jamaal talked about...he'd developed a plugin called ethscan that will scan a memory dump (Windows, Linux, Mac) and produce a pcap. Jamaal pointed out quite correctly that many times when responding to an incident, you won't have access to a pcap file from the incident; however, it's possible that you can pull the information you need out of the memory buffer from the system(s) involved.
What's really great about OMFW is that not only does Aaron get some of the big names that are really working hard (thanks to them!) to push the envelope in this area of study to present, but there are also a lot of great talks in a very short time period. I'll admit that I wasn't really interested in what goes into the framework itself (that's more for the developers), but there were presentations on Android and Linux memory analysis; there's something for everyone. You may not be interested in one presentation, but wait a few minutes...someone will talk about a plugin or a process, and you'll be glued to what they're saying.
Swag this year was a cool coffee mug and Volatility stickers.
Here's a wrap-up from last year's conference. You can keep up on new developments in Volatility, as well as the Volatility training schedule, at the Volatility Labs blog.
OSDFCon
I've attended this conference before, and just as in the past, there was a lot of great information shared, with something for everyone. Personally, I'm more interested in the talks that present how a practitioner used open source tools to accomplish something, solve a problem, or overcome a challenge. I'm not so much interested in academic presentations, nor in talks that simply describe open source tools that folks have developed. As in the past, I'd suggest yet again that there be multiple tracks for this conference...one for academics and developers, and another for practitioners, by practitioners.
As part of full disclosure, I did not attend any of the training or tutorials, and I could not attend all of the presentations.
You can see the program of talks here.
Thoughts and Take Aways
Visualization in DFIR is a sticky point...in some ways, it may be a solution without a problem. Okay, so the recommendation is, "don't use pie charts"...got it. But how does one use visualization techniques to perform analysis, when malware and intrusions follow the principle of least frequency of occurrence? How can a histogram show an analyst when the bad guy or the malware compromised a system, when normal user activity, software and system updates, and the like make up the overwhelming majority of the available activity? Maybe there is a way to take a bite out of this, but I'm not sure that academics can really start to address it until there is a crossover into the practitioner's end of the pool. I only mention this because it's a recurring thought that I have each time I attend this conference.
As Simson pointed out, much of the current visualization occurs after the analyst has completed their examination and is preparing a report, either for a customer or for presentation in court. Maybe that's just the nature of the beast.
Swag this year was a plastic coffee cup for the car with the TSK logo, TSK stickers, and a DVD of Autopsy.
Resources
Link to Kristinn's stuff
Thanks
We should all give a great, big Thank You to everyone involved in making both of these conferences possible. It takes a lot of work to organize a conference...I can only imagine that it's right up there with herding cats down a beach...and providing a forum to bring folks together. So, to the organizers and presenters, to everyone who worked so hard on making these conferences possible, to those who sat at tables to provide clues to the clueless ("...where's the bathroom?")...thank you.
What else
There is another thing that I really like about DFIR-related conferences: interacting with other DFIR folks whom I don't get to see very often, and even those who are not directly involved with what we do on a day-to-day basis. Unfortunately, it seems that few folks who attend these conferences want to engage and talk about DFIR topics, but now and again I find someone who does.
In this case, a good friend of mine wanted to discuss "...is what we do a 'science' or an 'art'?" at lunch. And when I say "discuss", I don't mean stand around and listen to others, I mean actively engaging in discussion. That's what a small group of us...there were only four of us at the table...did during lunch on Tuesday. Many times, finding DFIR folks at DFIR conferences that want to actively engage in discussion and sharing of DFIR topics...new malware autostart/persistence mechanisms seen, new/novel uses of tools, etc...is hard to do. I've been to conferences before where, for whatever reason, you just can't find anyone to discuss anything related to DFIR, or to what they do. In this instance, that wasn't the case, and some fascinating discussion ensued.
Monday, October 28, 2013
Links
This is not the persistence mechanism you were looking for...
First off, thanks to David Cowen for taking the time to address a question regarding the use of the SvcHost key in the Software hive as a persistence mechanism.
It all started with a response to one of David's Sunday Funday challenges, where the SvcHost key was held out as a persistence mechanism in and of itself. This got me to thinking because it wasn't something I was familiar with, based on my own research...so I wanted to know more and posted a question. David took it on as an exercise for himself, and posted his findings. Thanks to Dave for doing the work and sharing his findings with the community.
The purpose of this isn't to prove someone wrong...not at all. Rather, it is to challenge popularly-held beliefs about DFIR artifacts and ask, is this correct? Very often, something we've read or heard persists (no pun intended) long enough that it becomes accepted as fact. One example that I see time and time again is malware that creates persistence in the HKCU\..\Run key, and the report stating that the malware does this to start again when the system boots. This simply isn't the case...entries beneath that key are launched when that user logs in, not when the system boots...but it's stated on the Microsoft MMPC site, as well as by other malware analysts, often enough that it's simply accepted as "fact".
Interesting Reading
I saw that Corey Harrell linked to a couple of interesting articles on Twitter, one being a report indicating that security professionals in enterprise environments lack malware detection skills. If you follow the links to the report, you'll see that it's based on a survey; even so, I think that as a responder, it correlates pretty closely to my experience, as well as that of others who've responded to incidents such as these, either as a consultant or as an FTE. The report lists three "root causes for the darkness", but I would suggest that they missed the most important one...management. Folks in the IT industry focus on where their management tells them to focus, as well as on what management supports. If management's focus is to keep email up and running, that's where IT will focus. If management sends IT staff off to training, but there's no requirement to use any of that training when they return, the new skills atrophy.
I have a couple of posts that may be of use to folks, most of which are linked via this one; WFA 3/e (and the upcoming 4/e) has an entire chapter that covers this material, as well. Also, be sure to read Corey's blog post on malware root cause analysis, which provides a great deal of valuable information, as well as his Triaging Malware Incidents post.
The other article points to a new post-exploitation wiki that is available from the folks at the NoVA InfoSec group. This effort appears to be just getting up and running, but it's a pretty cool idea to take a great deal of the knowledge and experience in responders' heads and post it in a way that can be shared with others. This is definitely a site to keep an eye on.
Malware Persistence
While we're on the topic of malware, I ran across this very interesting write-up on the Terminator RAT. What I found most fascinating about the malware is the persistence mechanism; specifically, the malware 'sits' in a user's Startup folder, but can modify the Registry to redirect where that Startup folder is within the file system. That is, it can "tell" the OS to use another folder besides the user's Start Menu\Programs\StartUp folder. After reading this write-up, it took me only about 6 minutes to write a RegRipper plugin to extract this information, so now I can automatically search for this item. It didn't take too much more to add a check to the plugin, so that if the path is anything other than what is expected, the plugin will flag it for me. Interestingly, this is one of the suggested locations to look based on Symantec's list of common Registry key locations used by malware.
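For those curious, the core of the check is pretty simple; this isn't the plugin itself, just a minimal sketch using Parse::Win32Registry (the module RegRipper is built on), with the expected path being my assumption based on a default Windows 7 profile:

use Parse::Win32Registry;

# Check the user's Startup folder path for redirection
my $reg  = Parse::Win32Registry->new("NTUSER.DAT");
my $root = $reg->get_root_key();
my $path = "Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\User Shell Folders";
if (my $key = $root->get_subkey($path)) {
  if (my $val = $key->get_value("Startup")) {
    my $data = $val->get_data();
    my $expected = "%USERPROFILE%\\AppData\\Roaming\\Microsoft\\Windows\\Start Menu\\Programs\\Startup";
    print "ALERT: Startup folder redirected to ".$data."\n" if (lc($data) ne lc($expected));
  }
}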
That's not the only interesting persistence mechanism I've seen recently; I ran across a description of Win32/KanKan not too long ago and wrote up a RegRipper plugin to help detect anomalies in under 20 minutes.
Tools
Lance recently posted on tools he uses during exams; it's a great list, and probably indicative of what many DFIR folks use. I hadn't really thought about it before.
I don't have the tier 1 tools that Lance has, so my list is a bit different. More often than not, I've found that the tiers of tools that I use depend heavily on the goals of the examination...I base my analysis process on what I'm asked to show, demonstrate, or discover. For example, I've had analysis engagements where the goal was to answer questions based on the Windows XP/2003 Event Logs, which had been cleared. As such, my tier 1 tools become mmls.exe and blkls.exe from the TSK tools, strings from MS, and Perl. In one case, I was able to retrieve over 330 event records, to include the "smoking gun", but in another case, I found over 73K valid event records, none of which was the one I was looking for (parsing the Security hive with the RegRipper auditpol.pl plugin showed me why this was the case).
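For that sort of work, the overall flow looks something like this (the offset below is notional; it comes from the mmls output for the volume in question):

mmls image.dd                         (get the partition layout)
blkls -o 63 image.dd > unalloc.bin    (extract unallocated blocks from the volume)

...and from there, a Perl script scans the extracted data for candidate event records, applying sanity checks to each before parsing.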
In a more general sense, I usually start off using FTK Imager for image verification. Like Lance, I use LogParser, a great tool from MS for extracting records from Windows Event Logs; it's useful for a number of other things, as well.
Also like Lance, I write a lot of my own tools and utilities in Perl, or add to ones that I already have (RegRipper, etc.). Some of the code that I use isn't so much full tools but subroutines and functions that I can cut-and-paste into other tools, such as a function that I have for dumping binary data in hex editor-format.
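As an example of what I mean, here's a small sketch of such a function; my own version differs a bit, but the idea is the same:

# Dump binary data in hex editor-style format: offset, hex bytes, ASCII
sub hexDump {
  my $data = shift;
  my $out = "";
  for (my $i = 0; $i < length($data); $i += 16) {
    my $chunk = substr($data, $i, 16);
    my $hex   = join(" ", map { sprintf "%02x", ord($_) } split(//, $chunk));
    (my $ascii = $chunk) =~ s/[^\x20-\x7e]/./g;
    $out .= sprintf "0x%08x  %-47s  %s\n", $i, $hex, $ascii;
  }
  return $out;
}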
Windows 8 Forensics
I've been working on updating WFA to the fourth edition, and as such, I've been attempting to address for DFIR analysts the recurring question of "...what's new that we need to know in Windows ___?" (insert whichever version, starting with XP...). I thought I'd share the links I've found to be very useful in helping to answer that question:
Amanda Thompson's Windows 8 Forensic Guide
Ken Johnson's SANS DFIR Summit 2012 presentation: Windows 8 Recovery Forensics
ForensicsWiki: Prefetch - the page links to research conducted by Jared Atkinson and Yogesh Khatri, both of whom have documented changes in the Windows 8 .pf file format.
Digital Forensics Stream: Windows 8 and 8.1 Search Charm History
Windows 8 Forensics: Recycle Bin
CyberArms: Windows 8 Forensics: Internet History Cache; also, David Cowen posted the answer to his blog challenge regarding how to analyze the IE10 web history database.
Wednesday, October 09, 2013
Shell Item Artifacts, Reloaded
I spoke on David Cowen and crew's Forensic Lunch a bit ago, on the topic of shell item artifacts. I put together some slides at the last moment to use as visual references for the discussion, to illustrate what I'd done with respect to digging into the topic a bit more.
In short, what I'd done is this: on a previous Forensic Lunch, Joachim Metz discussed the existence of MFT file reference numbers within some of the shell item structures. Specifically, starting with Vista, shell items pointing to files and folders appear to contain MFT file reference numbers. This is mentioned not only in Joachim's Windows Shell Item format specification, but it's also described on Willi Ballenthin's Shellbags analysis page.
Accepting this, I wanted to validate the information and see what it looks like in the real world. Using FTK Imager Lite, I extracted the MFT and USRCLASS.DAT hive file from my own Windows 7 system. Parsing the shellbags entries from the USRCLASS.DAT hive (using a customized RegRipper plugin), I was able to get a hex dump of specific shell items as all of the shellbags were being parsed. I redirected the output of the plugin to a file, and selected specific entries for analysis. Figure 1 illustrates one of those examples.
The example that we're using is for a shell item that points to the D:\cases folder, with the key path Shell\BagMRU\1\2\18. As you can see in figure 1, I've boxed the DOSDate times in yellow; the MFT file reference number is marked in colored text, with the MFT record entry (i.e., 45) in red, and the sequence number (i.e., 13) in green. I highlighted these values, as we'll be using them in the rest of our examination.
Fig. 1: Sample Shell Item
The translated DOSDate times are as follows:
M: 2011-11-29 20:27:44
A: 2011-11-29 20:27:44
B: 2011-11-29 20:27:44
The times listed above are the last modified, last accessed, and creation/born dates for the folder, extracted and translated from the shell item that points to the folder (see fig. 1). Per Joachim's format specification, there doesn't seem to be an entry modified time value available. All of the times are listed in UTC format.
It's important to keep in mind where these times come from...the time stamps are read from the MFT (in FILETIME format) and translated to DOSDate time stamps via the FileTimeToDosDateTime() function.
I had also extracted the MFT from the D:\ volume and parsed it via a custom Perl script. I searched the output for record number 45, and it still had the sequence number of 13.
The times from the $STANDARD_INFORMATION attribute:
M: Mon Sep 23 16:17:36 2013 Z
A: Mon Sep 23 16:17:36 2013 Z
C: Mon Sep 23 16:17:36 2013 Z
B: Tue Nov 29 20:27:43 2011 Z
The times from the $FILE_NAME attribute:
M: Tue Nov 29 20:27:43 2011 Z
A: Tue Nov 29 20:27:43 2011 Z
C: Tue Nov 29 20:27:43 2011 Z
B: Tue Nov 29 20:27:43 2011 Z
Now, the big question is...so what? Well, the fact is that this information can be extremely useful to an analyst.
The time stamps can be extremely telling. In this case, we see that the times from the shell item are relatively close to those within the $FILE_NAME attribute; this tells us that the shell item captured the folder's time stamps as they were when the entry was recorded, preserving historical values that predate the more recent activity reflected in the $STANDARD_INFORMATION attribute.
Sebastien pointed out something important with respect to shellbags on Windows 7 systems a bit ago over on the Win4n6 Yahoo group, specifically:
"Go in your Windows 7 help file and search for "Folders: frequently asked questions". Then select: "Why doesn't Windows remember a folder window's size and location on the desktop?". You will find why you are not able to find something similar on Win7:
"In Windows Vista, a folder window opens at the same size and location on the desktop that it did the last time you closed it, based on the location where the folder is stored. For example, if you resize the Music folder window and then close it, it'll be the same size the next time you open it.
Windows 7 remembers one size and location setting for all your folders and libraries. So each time you open Windows Explorer, it'll open at the same size and location on the desktop that it did the last time you closed it, regardless of which folder or library you open.""
The MFT file reference number tells us something about the MFT record; specifically, MFT records are reused, not deleted, when a file is deleted, and the sequence number is incremented each time the record is reused. In our example, the folder still exists, and the MFT file reference number from the shell item points to an existing record in the MFT. If this wasn't the case...if the sequence number in the MFT was 14 or more, then the information from the shell item would provide us with historical information from the system, showing us what existed on the system at some point in the past. In this case, the LastWrite time for the key in question (which, based on the MRU values, applies to the item of interest), is 2012-05-22 14:34:27 UTC, which tells us when the folder in question actually existed on the system. So, in addition to VSCs and hives in the RegBack folder, we have yet another potential source of historical information on systems, and it's simply a matter of how this information will be applied during an investigation.
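Splitting a file reference number into its two components is straightforward; here's a quick sketch, using the record and sequence numbers from the example above (note that this assumes a 64-bit build of Perl):

# MFT file reference: low 48 bits = record number, high 16 bits = sequence number
my $ref    = (13 << 48) | 45;         # i.e., record 45, sequence 13
my $record = $ref & 0xFFFFFFFFFFFF;   # 45
my $seq    = ($ref >> 48) & 0xFFFF;   # 13
printf "Record: %d  Sequence: %d\n", $record, $seq;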
Where this becomes really valuable, particularly if it pertains to the goals of your exam, is that you can develop a partial reconstruction of what a system "looked like" in the past...it would be more of a smear than a snapshot, given the various sources. What I mean by that is that shell items exist within many more locations and artifacts on a Windows system than simply the shellbags...as previously mentioned, shell items exist in shortcut/LNK files, Jump Lists (Win7/8), as well as a number of locations within the Registry. In short, there are a lot of locations within a Windows system where these particular artifacts can be found, they seem to get more numerous as the versions of Windows increase, and they're created behind the scenes by the operating system. As such, they're not only useful for seeing what the system "looked like" in the past, but they can also be valuable if the use of anti-forensics techniques is suspected.
Finally, I'm sure that you noticed the slight discrepancy between the times listed in the shell item, and what's in the MFT. This most likely has to do with the translation that occurs via the API function...a 64-bit value with 100 nanosecond granularity is reduced to a 32-bit value with second granularity. Also, per the API function, the seconds are divided by 2, and there's no sub-second granularity to the value.
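To illustrate where that loss of granularity comes from, here's a rough approximation of the packing that occurs; this is my own sketch of the DOSDate format, not the API code, and the API's exact rounding behavior may differ (the :43-to-:44 difference above suggests rounding up rather than truncation):

# Pack a Unix epoch time into the 16-bit DOS date and time values
sub unixToDosDate {
  my $t = shift;
  my ($sec, $min, $hour, $mday, $mon, $year) = (gmtime($t))[0..5];
  my $dosdate = (($year + 1900 - 1980) << 9) | (($mon + 1) << 5) | $mday;
  my $dostime = ($hour << 11) | ($min << 5) | int($sec / 2);  # seconds kept in 2-second increments
  return ($dosdate, $dostime);
}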
Tuesday, September 24, 2013
Links - Malware Edition
Malware
Bromium Labs posted some pretty interesting information regarding another Zeus variant; the information in this post is very useful. For example, the phishing attack(s) reportedly targeted folks in the publishing industry.
What I found most interesting is that the variant, with all of its new capabilities, still uses the Run key for persistence.
Looking at other artifacts that can be used to detect malware on systems, see this Zeus write up from SecureWorks...one of the listed capabilities is that it can modify the hosts file on the system. MS KB 172218 illustrates why this is important...but it's also something that can be checked with a very quick query.
Speaking of write ups, I really enjoyed reading this one from System Forensics, for a couple of reasons:
First, the author documents the setup so that you can not only see what they're doing, but so that the analysis can be replicated.
Second, the author uses the tools available to document the malware being analyzed. For example, they use PEView to determine information about the sample, including the fact that it's compiled for 32-bit systems. This is pretty significant information, particularly when it comes to where one (DF analyst, first responder) will look for artifacts. Fortunately, the system on which the malware was run is also 32-bit, so analysis will be pretty straightforward. It does seem very interesting to me that most malware analysts/RE folks appear to use 32-bit Windows XP when they conduct dynamic analysis.
Again, we see that this variant uses the Run key (in this case, in the user context) for persistence.
Finally, they performed Prefetch file analysis to gather some more information regarding what the malware actually does on a system.
A couple of thoughts about the analysis:
Had the author run the malware and then copied off the MFT, they might have recovered the batch files as they're very likely resident files. Even if the files had been marked as not in use (i.e., deleted), had they responded quickly enough, they might have been able to retrieve the MFT before those records were overwritten.
The author states in the article, "The stolen data is sent via the C&C using the HTTP protocol."; I tend to believe that it can be valuable to know how this is done. For example, if the WinInet API is used, there may be a set of artifacts available that can assist analysts in gaining some information regarding the malware.
IR
Corey released his Tr3Secure data collection script. Open the batch file up in Notepad to see what it does...Corey put a lot of work into the script, and apparently got some great input from others in the community - see what happens when you share what you're doing, and someone else actually takes the time to look at it, and comment on it? Stuff just gets better.
Memory
If you're grabbing memory during IR, you might want to take a look at SketchyMoose's Total Recall script. Corey's script gives you the capability to dump memory, and SketchyMoose's script can help make analysis a bit easier, as well.
Persistence
Adam has posted another in a series (four total, thus far) of blogs regarding persistence mechanisms. There is some pretty interesting stuff so far, and I was recently looking at a couple of his posts, in particular #3 in the series. I tried out some of the things he described under the App Paths section of the post, and I couldn't get them to work. For example, I tried typing "pbrush" and "pbrush.exe" at the command prompt, and just got the familiar, "'pbrush' is not recognized as an internal or external command, operable program or batch file." I also added calc.exe as a key name, and for the default value, added the full path to Notepad.exe (in the system32 folder) and tried launching calc.exe...each time, the Calculator would launch. I have, however, seen the AppCertDlls key reportedly used for persistence in malware write ups, which is why I wrote the RegRipper plugin to retrieve that information.
Update: Per an exchange that I had with Adam via Twitter (good gawd, Twitter is a horrible way to try to either explain things, or get folks to elaborate on something...people, don't use it for that...), apparently, the App Paths "thing" that Adam pointed out only works if you try to run the command via the Run box ("through the shell"), and doesn't work if you try to run it from the command prompt.
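If you'd like to repeat the test, the setup amounts to a single command (the paths here are assumed; adjust as appropriate), after which you can try launching calc.exe from both the Run box and a command prompt:

reg add "HKLM\Software\Microsoft\Windows\CurrentVersion\App Paths\calc.exe" /ve /d "C:\Windows\system32\notepad.exe"

Per the update above, the redirection should take effect via the Run box, but not at the command prompt.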
Monday, September 23, 2013
Shell Item Artifacts
I was watching the 9/20 Forensic Lunch with David Cowen and crew recently, and when Jonathan Tomczak of TZWorks was initially speaking, there was a discussion of MFT file reference numbers found in shellbags artifacts. Jonathan pointed out that these artifacts are also found in Windows shortcut/LNK files and Jump Lists. From there, Dave posed a question (which I think was based off of the mention of Jump Lists), asking if this was an artifact specifically related to Windows 7. As it turns out, this isn't so much a function of Windows 7, as how shell items are crafted on different versions of Windows; if you remember this post, shell items are becoming more and more prominent on Windows platforms. They've existed in shellbags and LNK files since XP, and as of Windows 7, they can be found in Jump Lists (streams in Jump Lists are, with the exception of the DestList stream, LNK format). Windows 8 has Jump Lists, as well, and thanks to Jason's research, we know that LNK-formatted data can also be found in the Registry.
Willi mentions the existence of these artifacts here, in his description of the ItemPos* section of the post; look for the section of code that reads:
if (ext_version >= 0x0007) {
FILEREFERENCE file_ref;
....
What this says, essentially, is that for certain types of shell items, when a specific ext_version value is found (in this case, 7 or greater, which indicates Vista or later...), there may be a file reference available within the shell item. I say "may be" to reiterate Jonathan's comments; I have only looked at a very limited set of artifacts, and Jonathan made no specific reference to the types of shell items that did or did not contain file reference numbers.
This is also mentioned in Joachim Metz's Windows Shell Item format specification, specifically on pg 25, within the discussion of the extension block. Joachim has put a lot of effort into documenting a great deal of information regarding the actual structure of a good number of shell items; in his documentation, if the ext_version is 7 or greater, certain types of shell items appear to contain the MFT file reference.
So, again...this is not something that you should expect to see in all types of shell items...many types of shell items simply will not contain this information. However, those shell items that point to files and folders...type 0x31, 0x32, 0xB1, etc...and those on Vista systems and beyond...may contain MFT file reference numbers.
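As a point of reference, the class type of a shell item can be checked via the indicator byte at offset 2 within the item; here's a minimal sketch, with offsets per the publicly available shell item format documentation:

# $item holds the raw bytes of a single shell item (the first two bytes are the item size)
my $type = unpack("C", substr($item, 2, 1));
# 0x31 = folder, 0x32 = file; these are the sorts of items that may carry
# the extension block containing an MFT file reference (Vista and beyond)
printf "Type: 0x%02x\n", $type;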
I had a quick chat with David, and he pointed out that making use of the MFT file reference number from within the shellbags artifacts can show you what existed on the system at some point in the past, as the file reference number is essentially the MFT record number concatenated with the sequence number for the record. This ties in very well with David's TriForce analysis methodology, and can be extremely valuable to an examiner.
The only shortcoming I can see here is that the time stamps embedded within these shell items are not of the same granularity as the time stamps found within the MFT; see this MS API for translating FILETIME time stamps to DOSDate format, which is how the time stamps are stored in the shell items. As such, the time values will be different from what's found in the MFT.
Shell Items in the Registry
There are a number of RegRipper plugins that parse shell items; menuorder.pl, comdlg32.pl (for Vista+ systems), itempos.pl, shellbags.pl, photos.pl (for Windows 8 systems). This simply illustrates how pervasive shell items are on the different versions of Windows.
Thursday, September 12, 2013
Forensic Perspective
We all have different perspectives on events, usually based on our experiences. When I was a member of the ISS ERS team, I tried to engage the X-Force Vulnerability folks in discussions regarding the exploits they developed. I figured that they needed to test them, and that they used virtual systems to do so...what I wanted to do was get access to the images of those virtual systems after an exploit had successfully been developed, so that I could examine the image for artifacts directly associated with the exploit. The perspective of the folks who wrote the exploit seemed to be that if the exploit worked, it worked. As a DFIR analyst, my perspective was, how can I be better prepared to serve and support my customers?
We know that when a Windows system is in use (by a user or attacker), there is stuff that goes on while other stuff goes on, and this will often result in indirect artifacts, stuff that folks that are not DFIR analysts might not consider. For example, I ran across this post a bit ago regarding NetTraveler (the post was written by a malware analyst); being a DFIR analyst, I had submitted the link to social media, along with the question of whether the download of "new.jar" caused a Java deployment cache index (*.idx) file to be created. From my perspective, and based on my experience, I may respond to a customer that had been infected with something like this that is perhaps a newer version, and in the face of AV not detecting this malware, I would be interested in finding other artifacts that might indicate an infection...something like an *.idx file.
Forensic Perspective
I ran across this post on the Carnal0wnage blog, which describes a method for modifying a compromised system (during a pen test) so that passwords will be collected as they are changed. A couple of things jumped out at me from the post...
First, the Registry modification would be picked up by the RegRipper lsa_packages.pl plugin. So, if you're analyzing an acquired image, or a Registry file extracted from a live system, or if you've deployed F-Response as part of your response, you're likely to see something amiss in this value, even if AV doesn't detect the malware itself.
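If you're curious about what the plugin is keying on, here's a minimal sketch using Parse::Win32Registry (the same module RegRipper is built on) that pulls the "Notification Packages" value from the Lsa key in a System hive extracted from an image. The hard-coded ControlSet001 path is an assumption made for brevity; a more complete script would read the Select key to determine the current control set.

#!/usr/bin/perl
# Minimal sketch: list the Lsa "Notification Packages" value from a
# System hive; an unfamiliar DLL name here merits a closer look.
use strict;
use warnings;
use Parse::Win32Registry;

my $hive = shift @ARGV or die "Usage: lsa_check.pl <System hive>\n";
my $reg  = Parse::Win32Registry->new($hive)
    or die "Could not parse $hive as a Registry hive\n";
my $root = $reg->get_root_key();

# Assumes ControlSet001 is current; check the Select key in a
# production script.
my $lsa = $root->get_subkey("ControlSet001\\Control\\Lsa")
    or die "Lsa key not found\n";

if (my $val = $lsa->get_value("Notification Packages")) {
    # REG_MULTI_SZ data is returned as a list of strings in list
    # context, per the Parse::Win32Registry documentation.
    my @pkgs = $val->get_data();
    print "Notification Packages:\n";
    print "  $_\n" foreach (@pkgs);
}
else {
    print "Notification Packages value not found.\n";
}

On a default installation you'd expect to see something like "scecli" listed; the technique described in the Carnal0wnage post works by appending the name of the password-collecting DLL to this value.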
Second, the code provided for the associated DLL not only writes the captured passwords to a file, but also uses the WinInet API to send the information off of the system. This results in an entry being made into the IE history index.dat file for the appropriate account. By "appropriate", I mean whichever privilege level the code runs under; on XP systems, I've seen infected systems where the malware ran with System-level privileges and the index.dat file in the "Default User" profile was populated. I analyzed a Windows 2008 R2 system not long ago that was infected with ZeroAccess, and the click-fraud URLs were found in the index.dat file in the NetworkService profile.
If you haven't seen it yet, watch Mudge's comments at DefCon21...he makes a very good point regarding understanding perspectives when attempting to communicate with others.
Links
Artifacts
Jason Hale has a new post over on the Digital Forensics Stream blog, this one going into detail regarding the Search History artifacts associated with Windows 8.1. In this post, Jason points out a number of artifacts, so it's a good idea to read it closely. Apparently, with Windows 8.1, LNK files are used to maintain records of searches. Jason also brought us this blog post describing the artifacts of a user viewing images via the Photos tile in Windows 8 (which, by the way, also makes use of LNK streams...).
Claus is back with another interesting post, this one regarding Microsoft's Security Essentials download. One of the things I've always found useful about Claus's blog posts is that I can usually go to his blog and see links to some of the latest options with respect to anti-virus applications, including portable options.
Speaking of artifacts, David Cowen's Daily Blog #81 serves as the initiation of the Encyclopedia Forensica project. David's ultimate goal with this project is to document what we know, from a forensic analysis perspective, about major operating systems so that we can then determine what we don't know. I think that this is a very interesting project, and one well worth getting involved in, but my fear is that it will die off too soon, from nothing more than lack of involvement. There are a LOT of folks in the DFIR community, many of whom would never contribute to a project of this nature.
Perhaps one of the biggest issues regarding knowledge and information sharing within the community, one that I've heard going back as far as WACCI 2010 and beyond, is that too many practitioners simply feel that they don't have any means of contributing. Some want to, but can't be publicly linked to what they share. Whatever the reason, there are always ways to contribute. For example, if you don't want to request login credentials on the ForensicsWiki and actually write something, how about suggesting content (or clarity or elaboration on content) or modifications via social media (Twitter, G+, whatever...even directly emailing someone who has edited pages)?
Challenges
Like working forensic challenges, or just trying to expand your skills? I caught this new DFIR challenge this morning via Twitter, complete with an ISO download. This one involves a web server, and comes with 25 questions to answer. I also have some links to other resources on the FOSS Tools page for this blog.
Speaking of challenges, David Cowen's been continuing his blog-a-day challenge, keeping with the Sunday Funday challenges that he posts. These are always interesting, and come with prizes for the best, most complete answers. These generally don't include images, and are mostly based on scenarios, but they can also be very informative. It can be very beneficial to read winning answers come Monday morning.
Academia
I ran across this extremely interesting paper authored by Dr. Joshua James and Pavel Gladyshev, titled Challenges with Automation in Digital Forensics Investigations. It's a bit long, with the Conclusions paragraph on pg. 14, but it is an interesting read. The paper starts off by discussing "push-button forensics" (PBF), then delves into the topics of training, education, licensing, knowledge retention, etc., all issues that are an integral part of the PBF topic.
I fully agree that there is a need for intelligent automation in what we do. Automation should NOT be used to make "non-experts useful"...any use of automation should be accompanied with an understanding of why the button is being pushed, as well as what the expected results should be so that anomalies can be recognized.
It's also clear that some of what's in the paper relates back to Corey's post about his journey into academia, where he points out the difference between training and education.
Video
I ran across a link to Mudge's comments at DefCon21. I don't know Mudge, and have never had the honor of meeting him...about all I can say is that a company I used to work for used the original L0phtCrack...a lot. Watching the video and listening to the stories he shared was very interesting, in part because one of the points he made was about getting out and engaging with others, so that you can see their perspectives.
Monday, September 02, 2013
Data Structures, Revisited
A while back, I wrote this article regarding understanding data structures. The importance of this topic has not diminished with time; if anything, it deserves much more visibility. Understanding data structures provides analysts with insight into the nature and context of artifacts, which in turn provides a better picture of their overall case.
First off, what am I talking about? When I say, "data structures", I'm referring to the stuff that makes up files. Most of us probably tend to visualize files on a system as being either lines of ASCII text (*.txt files, some log files, etc.), or an amorphous blob of binary data. We may sometimes even visualize these blobs of binary data as text files, because of how our tools present the information found in those blobs. However, as we've seen over time, there are parts of these blobs that can be extremely meaningful to us, particularly during an examination. For example, in some of these blobs, there may be an 8-byte sequence that is the FILETIME format time stamp that represents when a file was accessed, or when a device was installed on a system.
A while back, as an exercise to learn more about the format of the IE (version 5 - 9) index.dat file, I wrote a script that would parse the file based on the contents of the header, which includes a directory table that points to all of the valid records within the file, according to information available on the ForensicsWiki (thanks to Joachim Metz for documenting the format, the PDF of which can be found here). Again, this was purely an exercise for me, and not something monumentally astounding...I'm sure that we're all familiar with pasco. Using what I'd learned, I wrote another script that I could use to parse just the headers of the index.dat as part of malware detection, the idea being that if a user account such as "Default User", LocalService, or NetworkService has a populated index.dat file, this would be an indication that malware on the system is running with System-level privileges and communicating off-system via the WinInet API. I've not only discussed this technique on this blog and in my books, but I've also used this technique quite successfully a number of times, most recently to quickly identify a system infected with ZeroAccess.
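If you'd like to experiment with that sort of header check yourself, here's a minimal sketch along the same lines. The field offsets (signature at 0x00, file size at 0x1c, first hash table offset at 0x20) are taken from Joachim's format documentation, and the notion that a hash table offset of 0 indicates an unpopulated file is an assumption you should verify against your own test data.

#!/usr/bin/perl
# Minimal sketch: parse just the header of an IE 5-9 index.dat file,
# as part of checking service account profiles for WinInet activity.
use strict;
use warnings;

my $file = shift @ARGV or die "Usage: idx_header.pl <index.dat>\n";
open(my $fh, "<", $file) or die "Cannot open $file: $!\n";
binmode($fh);
my $hdr;
read($fh, $hdr, 0x28) == 0x28 or die "File too small to be an index.dat\n";
close($fh);

# Signature: "Client UrlCache MMF Ver 5.2" (null-terminated, 28 bytes)
my $sig = unpack("Z28", $hdr);
die "Signature mismatch; not an index.dat?\n"
    unless ($sig =~ m/^Client UrlCache MMF/);

my ($size, $hash_off) = unpack("VV", substr($hdr, 0x1c, 8));
print  "Signature        : $sig\n";
printf "File size        : 0x%x bytes\n", $size;
printf "First hash table : 0x%x\n", $hash_off;

# Assumption: no hash table means no records have been written.
print "No hash table; file appears to be unpopulated.\n" if ($hash_off == 0);

Run something like this against the index.dat files in the "Default User", LocalService, and NetworkService profiles; anything that looks populated is worth a much closer look.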
More recently, I was analyzing a user's index.dat, as I'd confirmed that the user was using IE during the time frame in question. I parsed the index.dat with pasco, and did not find any indication of a specific domain in which I was interested. I tried my script again...same results. Exactly. I then mounted the image as a read-only volume and ran strings across the user's "Temporary Internet Files" subfolders (with the '-o' switch), looking specifically for the domain name...that command looked like this:
C:\tools>strings -o -n 4 -s <path to Temporary Internet Files> | find /i "domain"
Interestingly enough, I got 14 hits for the domain name in the index.dat file. Hhhhmmmm....that got me to thinking. Since I had used the '-o' switch in the strings command, the output included the offsets within the file to the hits, so I opened the index.dat in a hex editor and manually scrolled on down to one of the offsets; in the first case, I found full records (based on the format specification that Joachim had published). In another case, there was only a partial record, but the string I was looking for was right there. So, I wrote another script that would parse through the file, from beginning to end, and locate records without using the directory table. When the script finds a complete record, it will parse it and display the record contents. If the record is not complete, the script will dump the bytes in a hex dump so that I could see the contents. In this way, I was able to retrieve 10 complete records that were not listed in the directory table (and were essentially deleted), and 4 partial records, all of which contained the domain that I was looking for.
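A minimal sketch of that carving approach might look like the following. It walks the file in 0x80-byte blocks (records are allocated on block boundaries, per the format specification), looks for the "URL ", "REDR", and "LEAK" record signatures, and rather than parsing every field, simply extracts the printable strings from each hit. Note that records whose first block has been overwritten won't be found this way.

#!/usr/bin/perl
# Minimal sketch: locate index.dat records without using the
# directory table, by scanning for record signatures.
use strict;
use warnings;

my $file = shift @ARGV or die "Usage: idx_carve.pl <index.dat>\n";
open(my $fh, "<", $file) or die "Cannot open $file: $!\n";
binmode($fh);
my $data = do { local $/; <$fh> };   # slurp; fine at index.dat sizes
close($fh);

my $block = 0x80;
for (my $ofs = 0; $ofs + 8 <= length($data); $ofs += $block) {
    my $sig = substr($data, $ofs, 4);
    next unless ($sig eq "URL " || $sig eq "REDR" || $sig eq "LEAK");

    # Each record starts with a signature and a count of 0x80-byte blocks.
    my $nblocks = unpack("V", substr($data, $ofs + 4, 4));
    my $partial = ($nblocks == 0 || $ofs + ($nblocks * $block) > length($data));
    my $len     = $partial ? $block : $nblocks * $block;

    printf "0x%08x  %s  %s\n", $ofs, $sig,
        ($partial ? "[partial record]" : sprintf("%d blocks", $nblocks));

    # Dump printable strings (4 chars or longer) found within the record.
    my $rec = substr($data, $ofs, $len);
    while ($rec =~ m/([\x20-\x7e]{4,})/g) {
        print "    $1\n";
    }
}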
Microsoft refers to the compound file binary file format as a "file system within a file", and if you dig into the format document just a bit, you'll start to see why...the specification details sectors of two sizes, not all of which are necessarily allocated. This means that you can have strings and other data buried within the file that are not part of the file when viewed through the appropriate application.
MS Office documents no longer use this file format specification, but it is used in *.automaticDestinations-ms Jump Lists on Windows 7 and 8. The Registry is similar, in that the various "cells" that comprise a hive file can allow for a good bit of unallocated or "deleted" data...either deleted keys and values, or residual information in sectors that were allocated to the hive file as it continued to grow in size. MS does a very good job of making the Windows XP/2003 Event Log record format structure available; as such, not only can Event Logs from these systems be parsed on a binary basis (to locate valid records within the .evt file that are "hidden" by the information in the header), but records can also be recovered from unallocated space and other unstructured data. MFT records have been shown to contain useful data, particularly as a file moves from being resident to non-resident (specific to the $DATA attribute), and that can be particularly true for systems on which MFT records are 4K in size (rather than the 1K that most of us are familiar with).
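As an example of just how approachable the XP/2003 Event Log format is, here's a minimal sketch that locates event records in arbitrary binary data (a .evt file, unallocated space, whatever) by searching for the "LfLe" magic number and validating the record length, which appears as both the first and last DWORD of every record.

#!/usr/bin/perl
# Minimal sketch: carve Windows XP/2003 Event Log records from
# unstructured binary data using the documented record layout.
use strict;
use warnings;

my $file = shift @ARGV or die "Usage: evt_carve.pl <binary file>\n";
open(my $fh, "<", $file) or die "Cannot open $file: $!\n";
binmode($fh);
my $data = do { local $/; <$fh> };
close($fh);

my $pos = 0;
while (($pos = index($data, "LfLe", $pos)) != -1) {
    my $start = $pos - 4;    # the length DWORD precedes the magic number
    if ($start >= 0) {
        my $len = unpack("V", substr($data, $start, 4));
        # Sanity checks: at least the 0x38-byte fixed portion, the record
        # fits within the data, and the trailing length value matches.
        if ($len >= 0x38 && $start + $len <= length($data)) {
            my $tail = unpack("V", substr($data, $start + $len - 4, 4));
            if ($tail == $len) {
                my ($recnum, $timegen) = unpack("VV", substr($data, $start + 8, 8));
                printf "0x%08x  record %d, generated %s UTC\n",
                    $start, $recnum, scalar gmtime($timegen);
            }
        }
    }
    $pos += 4;
}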
Understanding data structures can help us develop greater detail and additional context with respect to the available data during an examination. We can recover data from within files that is not "visible" through the API by going beyond it. Several years ago, I was conducting a PCI forensic audit, and found several potential credit card numbers "in" a Registry hive...understanding the structures within the file, and taking a bit of a closer look, revealed that what I was seeing wasn't part of the Registry structure, but instead part of the sectors allocated to the hive file as it grew...they simply hadn't been overwritten with key and value cells yet. This information had a significant impact on the examination. In another instance, I was trying to determine which files a user had accessed, and found that the user did not have a RecentDocs key within their NTUSER.DAT; I found this to be odd, as even a newly-created profile will have a RecentDocs key. Using regslack.exe, I was able to retrieve the deleted RecentDocs key, as well as several subkeys and values.
CFB Format
The Compound File Binary Format document available from MS specifies the use of a sector allocation table, as well as a small sector allocation table. For Jump Lists in particular, these structures specify which sectors are in use; mapping the ones that are in use, and targeting just those sectors within the file that are not in use can allow you to recover potentially deleted information.
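Here's a minimal sketch of that mapping in Perl; it walks the FAT sectors listed in the DIFAT array in the header and reports any FREESECT (unallocated) entries. It only handles the 109 DIFAT entries stored in the header itself...larger compound files chain additional DIFAT sectors, which this sketch ignores...so treat it as a starting point rather than a complete implementation.

#!/usr/bin/perl
# Minimal sketch: map unallocated (FREESECT) sectors in a compound
# file, such as an *.automaticDestinations-ms Jump List.
use strict;
use warnings;

my $file = shift @ARGV or die "Usage: cfb_free.pl <compound file>\n";
open(my $fh, "<", $file) or die "Cannot open $file: $!\n";
binmode($fh);
my $data = do { local $/; <$fh> };
close($fh);

die "Not a compound file\n"
    unless (substr($data, 0, 8) eq "\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1");

my $shift   = unpack("v", substr($data, 0x1e, 2));
my $secsize = 1 << $shift;    # typically 512 bytes (2^9)

# The header holds up to 109 DIFAT entries, each of which is the
# number of a sector containing part of the FAT.
my @difat = unpack("V109", substr($data, 0x4c, 109 * 4));

my $sec_num = 0;    # running sector number across the whole FAT
foreach my $fat_sec (@difat) {
    next if ($fat_sec == 0xffffffff);    # unused DIFAT slot
    # Sector n begins one sector-length past offset 0 (the header
    # occupies the first sector).
    my @fat = unpack("V*", substr($data, ($fat_sec + 1) * $secsize, $secsize));
    foreach my $entry (@fat) {
        my $sec_ofs = ($sec_num + 1) * $secsize;
        if ($entry == 0xffffffff && $sec_ofs < length($data)) {
            printf "free sector %d at offset 0x%x\n", $sec_num, $sec_ofs;
        }
        $sec_num++;
    }
}

The sectors flagged as free aren't part of any stream within the file, but their contents may still hold residual data worth running strings (or your carving tool of choice) across.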
Summary
Understanding the nature of the data that we're looking at is critical, as it directs our interpretation of that data. That interpretation will not only direct subsequent analysis, but will also significantly impact our conclusions; if we don't understand the underlying data structures, our interpretation (and everything built on it) suffers. Is that credit card number, which we found via a search, actually stored in the Registry as value data? Just because our search utility located it within the physical sectors associated with a particular file name, do we understand enough about the file's underlying data structures to understand the true nature and context of the data?