Saturday, July 06, 2013

HowTo: Determine Program Execution

Sometimes during an examination, it can be important to determine what programs have been executed on a
system, and more specifically, when and by which user. Some of the artifacts on a system will provide us with indications of programs that have been executed, while others will provide information about which user launched the program, and when.  As such, some of this information can be included in a timeline.

Hopefully, something that will become evident throughout this post, as well as other HowTo posts, is that rather than focusing on individual artifacts, we're going to start putting various artifacts into "buckets" or categories.  The purpose for doing this is so that analysts don't get lost in a sea of artifacts, and are instead able to tailor their initial approach to an examination, possibly using an analysis matrix.

Okay, let's get started...

AutoStart Locations
Before we begin to look at the different artifacts that can be directly tied to a user (or not), I wanted to briefly discuss autostart locations.  These are locations within the system...file system, Registry...where references to programs can reside that allow programs to be executed automatically, without any interaction from the user beyond booting the system or logging in.  There are a number of such locations and techniques that can be used...Registry autostart locations, including the ubiquitous Run key, Windows services, the StartUp folder on the user's Program Menu, and even the use of the DLL Search Order functionality/vulnerability.  Each of these can be (and have been) discussed in multiple blog posts, so for now, I'm simply going to present them here, under this "umbrella" heading, for completeness.

Scheduled Tasks can be, and are, used as an autostart location.  Many of us may have QuickTime or iTunes installed on our system; during installation, a Scheduled Task to check for software updates is created, and we see the results of this task now and again.  Further, on Windows 7 systems, a Scheduled Task creates backups of the Software, System, Security, and SAM hive files into the C:\Windows\system32\config\RegBack folder every 10 days.  When considering autostart locations, be sure to check the Scheduled Tasks folder.

Tip
On a live system, you need to use both the schtasks.exe and at.exe commands to get a complete listing of all of the available Scheduled Tasks.

Tools: RegRipper plugins, MS/SysInternals AutoRuns; for XP/2003 Scheduled Task *.job files, jobparse.pl; on Vista+ systems, the files are XML

User
There are a number of artifacts within the user context that can indicate program execution.  This can be very useful, as it allows analysts to correlate program execution to the user context in which the program was executed.

UserAssist
The contents of value data within a user's UserAssist subkeys can provide an excellent view into what programs the user has launched via the Explorer shell...by double-clicking icons or shortcuts, as well as by navigating via the Program Menu.  Most analysts are aware that the value names are Rot-13 encoded (and hence, easily decoded), and folks like Didier Stevens have gone to great lengths to document the changes in what information is maintained within the value data as the operating system has progressed from Windows 2000 to Windows 8.

Tools: RegRipper userassist.pl and userassist_tln.pl plugins
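
Just to illustrate the Rot-13 decoding piece of what the plugins do, here's a minimal Perl sketch using the Parse::Win32Registry module (the same module RegRipper is built on); it only decodes the value names, leaving the parsing of the binary value data (run count, last run time, etc.) to the plugins:

use strict;
use Parse::Win32Registry;

# Minimal sketch: Rot-13 decode the UserAssist value names from an NTUSER.DAT hive.
my $reg = Parse::Win32Registry->new(shift || 'NTUSER.DAT') or die "Cannot open hive\n";
my $ua  = $reg->get_root_key->get_subkey('Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\UserAssist')
    or die "UserAssist key not found\n";

foreach my $guid ($ua->get_list_of_subkeys) {
    my $count = $guid->get_subkey('Count') or next;
    foreach my $val ($count->get_list_of_values) {
        my $name = $val->get_name;
        $name =~ tr/A-Za-z/N-ZA-Mn-za-m/;     # Rot-13 is its own inverse
        print $guid->get_name, ": ", $name, "\n";
    }
}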

RunMRU
When a user clicks on the Start button on their Windows XP desktop, and then types a command into the Run box that appears, that command is added to the RunMRU key.

Interestingly, I have not found this key to be populated on Windows 7 systems, even though the key does exist.  For example, I continually use the Run box to launch tools such as RegEdit and the calculator, but when I dump the hive file and run the runmru.pl RegRipper plugin against it, I don't see any entries.  I have found the same to be true for other hives retrieved from Windows 7 systems.

Tools: RegRipper runmru.pl plugin

ComDlg32\CIDSizeMRU Values
The binary values located beneath this key appear to contain names of applications that the user recently launched.  From my experience, the majority of the content of these values, following the name of the executable file, is largely zeros, with some remnant data (possibly window position/size settings?) at the end of the value data.  As one of the values is named MRUListEx, we can not only see (via a timeline) when the most recent application was launched, but we can also see when other applications were launched by examining available VSCs.
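
If you just want to eyeball the executable names, the Unicode string at the front of each value's data is enough; a quick Perl sketch (assuming the Vista+ key path shown below, and skipping the MRUListEx value) might look like this:

use strict;
use Encode;
use Parse::Win32Registry;

# Minimal sketch: list the executable names held in the CIDSizeMRU value data.
my $reg = Parse::Win32Registry->new(shift || 'NTUSER.DAT') or die "Cannot open hive\n";
my $key = $reg->get_root_key->get_subkey('Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\ComDlg32\\CIDSizeMRU')
    or die "CIDSizeMRU key not found\n";

foreach my $val ($key->get_list_of_values) {
    next if ($val->get_name eq 'MRUListEx');
    my $str = decode('UCS-2LE', $val->get_data);   # exe name is a null-terminated Unicode string
    $str =~ s/\x00.*//s;                           # keep everything up to the first null
    print $val->get_name, ": ", $str, "\n";
}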

AppCompatFlags
According to MS, the Program Compatibility Assistant is used to determine if a program needs to be run in XP Compatibility Mode.  Further, "PCA stores a list of programs for which it came up...even if no compatibility modes were applied", under the Software\Microsoft\Windows NT\CurrentVersion\AppCompatFlags\Compatibility Assistant\Persisted key in the user's NTUSER.DAT hive.  As such, we can query these values and retrieve a list of programs run by the user.

Tools: RegRipper appcompatflags.pl plugin (I updated the plugin, originally written by Brendan Cole, to include retrieving the values beneath the Persisted key, on 6 July 2013; as such, the plugin will be included in the next rollout)
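
For those who want to see the sort of data the plugin is pulling, a minimal Perl sketch that lists the value names (which are the paths to the executables) beneath the Persisted key might look like the following:

use strict;
use Parse::Win32Registry;

# Minimal sketch: list the programs recorded beneath the PCA "Persisted" key.
my $reg = Parse::Win32Registry->new(shift || 'NTUSER.DAT') or die "Cannot open hive\n";
my $key = $reg->get_root_key->get_subkey('Software\\Microsoft\\Windows NT\\CurrentVersion\\AppCompatFlags\\Compatibility Assistant\\Persisted');
if ($key) {
    print "LastWrite: ", $key->get_timestamp_as_string, "\n";
    print $_->get_name, "\n" foreach ($key->get_list_of_values);
}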

MUICache
The contents of this key within the user hives (NTUSER.DAT for XP/2003, USRCLASS.DAT for Win7) often contain references to applications that were launched within the user context.  Oftentimes, these applications will include command line interface (CLI) utilities.

Windows shortcuts/LNK files and Jump Lists
You're probably thinking..."huh?"  Most analysts are familiar with how shortcuts/LNK files (and Jump Lists) can be used to demonstrate access to files or external storage devices, but they can also be used to demonstrate program execution within the context of a user.

Most of us are familiar with the LNK files found in the ..\Windows\Recent and ..\Office\Recent folders within the user profile...so, think about how those shortcuts are created.  What usually happens is that the user double-clicks a file, the OS will read the file extension from the file, and then query the Registry to determine which application to launch in order to open the file.  Windows will then launch the application...and this is where we have program execution.

Many times when a user installs an application on their system, a desktop shortcut may be created so that the user can easily launch the application.  The presence of an icon on the desktop may indicate that the user launched an installer application.

Tools: custom Perl script, tools to parse LNK files

Java Deployment Cache Index (*.idx) Files
The beginning of 2013 saw a lot of discussion about vulnerabilities in Java, as well as reports of 0-days, and as a result, there were a small number of folks within the community looking into the use of Java deployment cache index (*.idx) files during analysis.  The use of these files as artifacts during an investigation goes back to well before then, thanks to Corey Harrell.  These files provide indications of downloads to the system via Java, and in some cases, those downloads might be malicious in nature.  These artifacts are related specifically to Java being executed, and may lead to indications of additional programs being executed.  Further, given that the path to the files is within the user profile folder, we can associate the launch of Java with a specific user context.

Tools: idxparse.pl parser
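
If all you're after is a quick inventory of a user's *.idx files (leaving the actual parsing to idxparse.pl), a short Perl sketch along these lines will do; the cache path shown is what I'd expect for Java 6+ under a Vista/Win7 profile, so treat it as an assumption and adjust as needed:

use strict;
use File::Find;

# Minimal sketch: list Java deployment cache index files under a user profile,
# along with their last modification times (useful data points for a timeline).
# The profile path is just an example; pass in the one you're interested in.
my $profile = shift || 'C:\\Users\\harlan';
my $cache   = $profile . '\\AppData\\LocalLow\\Sun\\Java\\Deployment\\cache';

find(sub {
    return unless (/\.idx$/i);
    my $mtime = (stat($File::Find::name))[9];
    print scalar gmtime($mtime), " Z  ", $File::Find::name, "\n";
}, $cache) if (-d $cache);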

Browser History
A user's browser history not only indicates that they were browsing the web (i.e., executing the browser program), but the history can also be correlated to the *.idx files discussed above in order to determine which site they were visiting that caused Java to be launched.

System
There are a number of artifacts on the system that can provide indications of program execution.

Prefetch File Analysis
Most analysts are aware of some of the metadata found within Prefetch files.  Application prefetch files include metadata indicating when the application was last launched, as well as how many times it has been launched.  This can provide some excellent information.

Tools: pref.pl, or any other tools/scripts that parse the embedded module strings.  Recent versions of scripts I've written and use incorporate an alerting mechanism to identify items within the strings and string paths found to be "suspicious" or "unusual".
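
As a rough illustration of where that metadata lives, here's a minimal Perl sketch that pulls the executable name, last run time, and run count from a single .pf file; the offsets come from publicly available format documentation for XP/2003 (version 0x11) and Vista/Win7 (version 0x17) prefetch files, so verify them against your own test data before relying on the output:

use strict;

# Minimal sketch: extract the exe name, last run time, and run count from a .pf file.
my $file = shift or die "prefetch file?\n";
open(my $fh, '<', $file) or die "Cannot open $file\n";
binmode($fh);
my $data; { local $/; $data = <$fh>; }
close($fh);

my $ver  = unpack('V', substr($data, 0, 4));        # 0x11 = XP/2003, 0x17 = Vista/Win7
my $name = substr($data, 0x10, 60);
$name =~ s/\x00//g;                                 # crude Unicode-to-ASCII conversion

my ($run_off, $cnt_off) = ($ver == 0x11) ? (0x78, 0x90) : (0x80, 0x98);
my ($lo, $hi) = unpack('VV', substr($data, $run_off, 8));
my $epoch = ($hi * 4294967296 + $lo) / 10000000 - 11644473600;   # FILETIME -> Unix epoch
my $count = unpack('V', substr($data, $cnt_off, 4));

print "$name  last run: ", scalar gmtime($epoch), " Z  run count: $count\n";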

AppCompatCache
This value within the System hive in the Registry was first discussed publicly by Mandiant, and has proven to be a treasure trove of information, particularly when it comes to malware detection and determining program execution, in general.

Tools: Mandiant's shim cache parser, RegRipper appcompatcache.pl plugin (appcompatcache_tln.pl plugin outputs in TLN format, for inclusion in timelines).

Legacy_* Keys
Within the System hive, most of us are familiar with the Windows services keys.  What you may not realize is that there is another set of keys that can be very valuable when it comes to understanding when Windows services were run...the Enum\Root\Legacy_* keys.  Beneath the ControlSet00n\Enum\Root key in the System hive, there are a number of subkeys whose names begin with LEGACY_, and include the names of services.

There are a number of variants of malware (Win32/Alman.NAD, for example) that install as a service, or driver, and when launched, the operating system will create the Enum\Root\Legacy_* key for the service/driver.  Also, these keys persist after the service or driver is no longer used, or even removed from the system.  Malware writeups by AV vendors will indicate that the keys are created when the malware is run (in a sandbox), but it is more correct to say that the OS creates the key(s) automatically as a result of the execution of the malware.  This can be an important distinction, which is better addressed in another blog post.

Tools: RegRipper legacy.pl plugin
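
A minimal Perl sketch to list those keys and their LastWrite times from a System hive might look like this; I'm assuming ControlSet001 here, where a complete tool (like the plugin) would check the Select key to determine the current control set:

use strict;
use Parse::Win32Registry;

# Minimal sketch: list LEGACY_* subkeys beneath Enum\Root, with their LastWrite times.
my $reg  = Parse::Win32Registry->new(shift || 'System') or die "Cannot open hive\n";
my $root = $reg->get_root_key->get_subkey('ControlSet001\\Enum\\Root')
    or die "Enum\\Root key not found\n";

foreach my $sk ($root->get_list_of_subkeys) {
    next unless ($sk->get_name =~ /^LEGACY_/i);
    print $sk->get_timestamp_as_string, "  ", $sk->get_name, "\n";
}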

Direct* and Tracing Keys
These keys within the Software hive can provide information regarding program execution.

The "Direct*" keys are found beneath the Microsoft key, and are keys whose names start with "Direct", such as Direct3D, DirectDraw, etc.  Beneath each of these keys, you may find a MostRecentApplication key, which contains a value named Name, the data of which indicates an application that used the particular graphics functionality.  Many times during an exam, I'll see "iexplore.exe" listed in the data, but during one particular exam, I found "DVDMaker.exe" listed beneath the DirectDraw key.  In another case, I found "mmc.exe" listed beneath the same key.

I've found during exams that the Microsoft\Tracing key contains references to some applications that appear to have networking capabilities.  I do not have any references to provide information as to which applications are subject to tracing and appear beneath this key, but I have found references to interesting applications that were installed on systems, such as Juniper and Kiwi Syslog tools (during incident response engagements, this can be very helpful, as it may allow you to collect Event Log records from the system that have since been overwritten, and include them in a timeline...).  Unfortunately, these artifacts have nothing more than the EXE name (no path or other information is included or available), but adding the information to a timeline can provide a bit of context and granularity for analysis.

Tip
When examining these and other keys, do not forget to check the corresponding key beneath the Wow6432Node key within the Software hive.  The RegRipper plugins address this automatically.

Tools: RegRipper direct.pl and tracing.pl plugins
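
For reference, here's a rough Perl sketch of the sort of data the plugins pull...the MostRecentApplication\Name value from each Direct* key, and the subkey names beneath Microsoft\Tracing, checking both the Microsoft and Wow6432Node\Microsoft paths within a Software hive:

use strict;
use Parse::Win32Registry;

# Minimal sketch: pull MostRecentApplication\Name from the Direct* keys, and list
# the subkeys beneath Microsoft\Tracing, checking the Wow6432Node path as well.
my $reg  = Parse::Win32Registry->new(shift || 'Software') or die "Cannot open hive\n";
my $root = $reg->get_root_key;

foreach my $base ('Microsoft', 'Wow6432Node\\Microsoft') {
    my $ms = $root->get_subkey($base) or next;
    foreach my $sk ($ms->get_list_of_subkeys) {
        next unless ($sk->get_name =~ /^Direct/);
        my $mra = $sk->get_subkey('MostRecentApplication') or next;
        my $val = $mra->get_value('Name') or next;
        print $base, "\\", $sk->get_name, ": ", $val->get_data, "\n";
    }
    my $tracing = $ms->get_subkey('Tracing') or next;
    print "Tracing ($base): ", $_->get_name, "\n" foreach ($tracing->get_list_of_subkeys);
}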

Event Logs

Service Control Manager events within the System Event Log, particularly those with event IDs 7035 and 7036, provide indications of services that were successfully sent controls, for either starting or stopping the service.  Most often within the System Event Log, you'll see these types of events clustered around a system start or shutdown.  During DFIR analysis, you're likely going to be interested in oddly named services, services that first appear only recently, or services that are started well after a boot or system startup.  Also, you may want to pay close attention to services such as "PSExeSvc", "XCmdSvc", "RCmdSvc", and "AtSvc", as they may indicate lateral movement within the infrastructure.

On Windows 2008 R2 systems, I've found indications of program execution in the Application Experience Event Logs; specifically, I was examining a system that had been compromised via an easily-guessed Terminal Services password, and one of the intruders had installed Havij (and other tools) on the system.  The Application-Experience/Program-Inventory Event Log contained a number of events associated with program installation (event IDs 903 and 904), application updates (event ID 905), and application removal (event IDs 907 and 908).  While this doesn't provide a direct indication of a program executing, it does illustrate that the program was installed, and that an installer of some kind was run.

On my own Windows 7 system, I can open the Event Viewer, navigate to that Event Log, and view the records that illustrate when I have installed various programs knowingly (FTK Imager) and unknowingly (Google+ Chat).  There are even a number of application updates to things like my ActiveState Perl and Python installations.

Tools: LogParser, evtxparse.pl
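
As an example, a LogParser command line along the following lines will pull the Service Control Manager events from an exported System Event Log into a CSV file (run this on a Vista or later analysis system so that the EVT input format can read the .evtx file; treat the exact field list as a starting point and adjust to suit your needs):

logparser -i:EVT -o:CSV "SELECT TimeGenerated, EventID, SourceName, Strings FROM System.evtx WHERE EventID IN (7035; 7036) ORDER BY TimeGenerated" > scm_events.csv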

Other Indirect Artifacts
Many times, we may be able to determine program execution through the use of indirect artifacts, particularly those that persist well after the application has finished executing, or even been deleted.  Many of the artifacts that we've discussed are, in fact, indirect artifacts, but there may still be others available, depending upon the program that was executed.

A number of years ago, I was...and I don't like to admit this...certified to perform PCI forensic audits.  On one case, I ran into my first instance of a RAM scraper...this was a bit of malware that was installed on a point-of-sale (POS) back office server (running Windows) as a Windows service.  After the system was booted, this instance of the malware would read the contents of a register, do some math, and use that value as a seed to wait a random amount of time before waking up and dumping the virtual memory from one of eight named (the names were listed in the executable file) processes.  The next step was to parse the memory dump for track data, and this was accomplished via the use of a Perl script that had been "compiled" via Perl2Exe.

I'm somewhat familiar with such executables, and one of the artifacts we found to validate our findings with respect to the actual execution of the malicious code was the temporary directories created by the "compiled" script.  When executables "compiled" with Perl2Exe are run, any of the Perl modules (including the runtime) packed into the executable are extracted as DLLs into a temporary directory, at which time they are "available" to the running code.  As the code was launched by a Windows service, the "temp" directories were found in the C:\Windows\Temp folder.  The interesting thing that we found was that the temp directories used to hold the modules/DLLs are not deleted after the code completes, and they persist even if the program itself is removed from the system.  In short, we had a pretty good timeline for each time the parsing code was launched.

On my own Windows 7 system, because I run a number of Perl scripts that were "compiled" with Perl2Exe within the context of my user account, the temp directories are found in the path C:\Users\harlan\AppData\Local\Temp...the subdirectories themselves are named "p2xtmp-", followed by an integer, and they in turn contain subdirectories that represent the Perl runtime namespace.  The time stamps (creation dates) for these subdirectories provide indications of when I executed scripts that had been "compiled" via Perl2Exe.
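
A quick way to sweep a Temp folder for these artifacts is a few lines of Perl; note that on Windows builds of Perl, the "ctime" field returned by stat() generally reflects the creation time, which is the data point we're after (the path below is just an example):

use strict;

# Minimal sketch: list p2xtmp-* directories beneath a Temp folder, with creation times.
my $temp = shift || 'C:\\Users\\harlan\\AppData\\Local\\Temp';
opendir(my $dh, $temp) or die "Cannot open $temp\n";
foreach my $dir (grep { /^p2xtmp-/ && -d "$temp\\$_" } readdir($dh)) {
    my $ctime = (stat("$temp\\$dir"))[10];    # creation time on Win32 builds of Perl
    print scalar gmtime($ctime), " Z  ", $dir, "\n";
}
closedir($dh);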

Memory Dumps
During dead box analysis, memory dumps can be an excellent source of information.  When an application crashes, a memory dump is created, and a log file containing information, including a process list, is also created.  When another application crash occurs, the memory dump is overwritten, but the log file is appended to, meaning that you can have a number of crash events available for analysis.  I have found this historical information to be very useful during examinations because, while the information is somewhat limited, it can illustrate whether or not a program was running at some point in the past.

We're not going to discuss hibernation files here, as once you access a hibernation file and begin analysis, there really is very little difference between analyzing the hibernation file and analyzing a memory dump for a live system.  Many of the techniques that you'd use, and the artifacts that you would look for, are pretty much the same.

Tools: text viewer

Malware Detection
Another use of this artifact category is that it can be extremely valuable in detecting the presence of malware on a system.  However, malware detection is a topic that is best addressed in another post, as there is simply too much information to limit the topic to just a portion of a blog post.

Resources
This idea of determining program execution has been discussed before:
Timeline Analysis, and Program Execution
There Are Four Lights: Program Execution


Friday, July 05, 2013

HowTo: Correlate Files To An Application

Not long ago, I ran across a question on the ForensicFocus forum, in which the original poster (OP) said that
a number of files had been found in a user profile during an examination, and they wanted to know which application was "responsible for" these files.  There wasn't much of a description of the files (extension, content), so it wasn't as if someone could say, "oh, yeah...I know what those files are from."

There are number of analysis techniques that you can use in an effort to determine the origin of a file.  My hope in sharing this information is to perhaps provide something you may not have seen or thought of before.  Also, I'm hoping that others will share their thoughts and experiences, as well.

What's in a name?
Some applications have a naming convention for their files.  For example, when you open MS Word and work on a document, there are temp files saved along the way while you edit the document that have a particular naming convention; using this naming convention, MS has advice for recovering lost MS Word documents.

Another example that I find to be useful is the naming convention used by digital cameras.  We see this many times when our friends post pictures to social media without changing the names of the files, and we'll recognize the naming convention of the files (i.e., file name starts with "IMG" or "DSC", or something similar) and know that the files were uploaded directly from a digital camera or smartphone.  This may also be true if the files were copied directly from the storage medium of the device to the computer system that you're examining.

Location
Some applications will save various files in specific locations, which are not usually changed by the user.  However, in other instances, applications simply use the user or system %Temp% folder as a temporary storage location.  MS Office, as mentioned above, uses the current working directory to store its temp files, which are created (by default) at regular intervals while the application is open.  If you have an MS Word document open on your desktop, and you're editing it, you can see these files being created.

Content
Try opening the file in question in a viewer or editor of some kind.  Sometimes, a viewer like Notepad might be enough to see the contents of the file, and the file may contain contents that provide insight as to its origin.

Tip
I remember working on a case a long time ago, assisting another analyst.  They'd sent me a file that contained several lines, including an IP address, and what looked like a user name and password.  I asked where the file had been located on the system, but that wasn't much help to either of us.  As we dug into the examination, it turned out that the system had been subject to a SQL injection attack, and what we were looking at was an FTP batch script; we found the commands used to create the script embedded within the web server logs, and we found the file downloaded to the system, as well.

One aspect of file contents is the file signature.  File signature analysis is still in use, and most seasoned analysts are aware of the uses and limitations of this analysis technique.  A good place to start may be to open the file in a hex editor, view the first 20 or so bytes of the file, and compare what you see to the file extension.
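
If you'd rather script that first look than eyeball a hex editor, a small Perl sketch can dump the first 20 bytes and flag a few well-known signatures; the signature list below is illustrative, not exhaustive:

use strict;

# Minimal sketch: dump the first 20 bytes of a file in hex and flag common signatures.
my %magic = (
    "\xFF\xD8\xFF" => 'JPEG image',
    "\x89PNG"      => 'PNG image',
    '%PDF'         => 'PDF document',
    "PK\x03\x04"   => 'ZIP archive (incl. DOCX/XLSX/PPTX)',
    'MZ'           => 'DOS/Windows executable',
);

my $file = shift or die "file?\n";
open(my $fh, '<', $file) or die "Cannot open $file\n";
binmode($fh);
read($fh, my $hdr, 20);
close($fh);

print join(' ', map { sprintf "%02X", ord } split(//, $hdr)), "\n";
foreach my $sig (keys %magic) {
    print "Possible match: $magic{$sig}\n" if (index($hdr, $sig) == 0);
}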

Another aspect of content is metadata.  Many file types...PDF, DOCX/PPTX, JPG, etc...have the capacity to store metadata within the file.  Metadata stays with the file, regardless of where the file goes, or what the file name is changed to...as long as the format isn't modified (e.g., a .jpg file opened in MS Paint and saved in .gif format) and the file isn't somehow manipulated, the metadata will remain.

Here's an excellent post that can provide some insight into where certain, specific files may have come from. This is a great example of how a file may be created as a result of a simple command line, rather than a full-blown GUI application.

While not specific to the contents of the file itself, look to see if the file has an associated alternate data stream (ADS).  When XP SP2 was rolled out, any file downloaded via IE or Outlook had a specific ADS associated with it (the Zone.Identifier stream), which was referred to as the file's "zoneID".  In many instances, I've seen the same sort of thing on Windows 7 systems, even though the browser was Firefox or Chrome.  If a file has an associated ADS, document the name and contents of the ADS, as it may provide a very good indication of the origin of the file, regardless of location.  Also, keep in mind that it is trivial to fake these ADSs.

Timelines
Timeline analysis is a fantastic analytic tool for determining where files "came from".  Timelines provide both context and granularity, and as such, can provide significant insight into what was happening on the system when the files were created (or modified).

Consider this...with just a file that you're curious about, you don't have much.  Sure, you can open the file in an editor, but what if the contents are simply a binary mess that makes no sense to you?  Okay, you check the creation date of the file, and then compare that to information you were able to pull together regarding the users logged on to the system, and you see that "cdavis" was logged on at the time in question.  What does that tell you?  I know...not a lot.  However, if you were to create a timeline of system and user activity, you would see who was logged into the system, what they were doing, and possibly even additional details about what may have occurred "near" the file being created.  For example, you might have information about a user logging in, and then sometime later, their UserAssist data shows that they launched an application; this is followed by a Prefetch file being modified, which is followed by other activity, and then the file in question was created on the system.

If you're performing timeline analysis and suspect that the time stamps on the file in question may have been modified (this happens quite often, simply because it's so easy to do...), open the MFT and compare the creation date from the $FILE_NAME attribute to that of the $STANDARD_INFORMATION attribute; it may behoove you to include the $FILE_NAME attribute information in your timeline, as well.

Wednesday, July 03, 2013

HowTo: Determine Users on the System

Now and again, I see questions in various forums that are related to (or flat out asking) about how to go about determining users on a Windows system.  In several instances, I've seen the question posted to the EnCase User Forum, asking why the results of the "Initialize Case" EnScript are different from what is retrieved by other tools.  There are several locations within the Windows system that can contain information about accounts on the system.

SAM hive - The SAM hive maintains information about user accounts local to that system.  In a corporate environment, many times you won't find the user account for the active user listed in the SAM hive, as the user account was set up in Active Directory and managed from a domain controller.  In home environments, you're likely to see multiple user accounts listed in the SAM hive.

Tool: RegRipper samparse.pl plugin; the samparse_tln.pl plugin will parse the SAM hive and output various items (account creation date, last login date, etc.) in TLN format for inclusion in a timeline.

Software hive - Within the Software hive is a key named "ProfileList" that maintains a list of...you guessed it...user profiles on the system.  This information can then be correlated against what you find within the file system (see below).

Tool: RegRipper profilelist.pl and winlogon.pl plugins.  The winlogon.pl plugin checks for "Special Accounts", which are accounts that do not appear on the Welcome screen.  This technique is used by intruders in order to "hide" accounts from administrators.
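
A minimal Perl sketch to dump the SIDs and profile paths from the ProfileList key, for correlation against the file system, might look like this:

use strict;
use Parse::Win32Registry;

# Minimal sketch: list the SIDs and profile paths beneath the ProfileList key.
my $reg = Parse::Win32Registry->new(shift || 'Software') or die "Cannot open hive\n";
my $pl  = $reg->get_root_key->get_subkey('Microsoft\\Windows NT\\CurrentVersion\\ProfileList')
    or die "ProfileList key not found\n";

foreach my $sid ($pl->get_list_of_subkeys) {
    my $path = $sid->get_value('ProfileImagePath');
    printf("%-50s %s\n", $sid->get_name, $path ? $path->get_data : '');
}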

File system - User information is maintained and "recorded" in the user's profile within the file system...on Vista+ systems, in the C:\Users folder.  There should be a correlation between what's in the ProfileList key, and what can be observed within the file system.

Tool: Any file viewer, or mount the image as a volume and use the dir command

Note
The LocalService or NetworkService accounts having a populated IE index.dat (web history) may be an indication of a malware infection.  I've examined systems infected with malware that is used for click-fraud and found an enormous index.dat file for one of these accounts.

Now, most analysts are aware that you can have an account listed in the SAM hive, but not have a user profile folder within the file system.  What this can indicate is that the user account was set up, but has not been used to access the system yet.  User profiles are not created until the user logs into the system using the account credentials.

Changing Settings
In order to determine if someone was accessing user settings (changing user account information, or modifying account information), there are two places you can look.  First, examine the Windows Security Event Log for indications of events that pertain to user account management (see MS KB 977519 for a list of pertinent event IDs).

Tools: LogParser, evtxparse.pl

Second, look to the shellbags.  What?  That's right...look to the shellbag artifacts.  Event Logs can roll over, and sometimes quickly, depending upon activity and Event Log settings (events being audited, Event Log size, etc.).  If you suspect that a user has been creating user accounts, or if you just want to determine if that has been the case, check the shellbags artifacts, and you might see something similar to the following in the artifact listing:

Desktop\Control Panel\User Accounts
Desktop\Control Panel\User Accounts\Create Your Password
Desktop\Control Panel\User Accounts\Change Your Picture

The above listing is an extract that was pulled out of the shellbags artifacts from my own system, but I should note that while investigating a system that had been compromised via Terminal Services, I parsed the shellbags artifacts for the compromised user account, and found similar entries related to those above, with the exception that they indicated that a user account had been created.   The intruder had then attempted to "hide" the account from the Welcome screen by making the new account a "Special Account", but they had misspelled one of the keys in the path, so the functionality was not enabled for the account.

Tool: RegRipper shellbags.pl plugin

Note
If you have thoughts as to how to expand these "HowTo" posts, or questions regarding how to take the analysis further, please let me know. Also, if there's anything specific that you'd like to see addressed, please comment here or contact me at keydet89 at yahoo dot com.

Monday, July 01, 2013

HowTo: Correlate an Attached Device to a User

Not long after my previous post on correlating LNK files to an external device, I received a question regarding correlating a device to a particular user.  Some may look at this and think, well, that's easy...the LNK files in question are located in the user profile, so correlating the user to the device is actually pretty easy.

Okay, but what happens if there are no LNK files that point to the device?  After all, for a LNK file to be available, the user must either create it manually, or perform some action where the operating system will create it automatically, right?  Usually, this means that the user has opened a folder on the external device and double-clicked a non-executable file of some kind, such as a .txt file or a Word or PowerPoint document.  The resulting action is that the appropriate application is launched based on the file extension, and the file is opened in that application, and an LNK file is created.

So, if you're examining a system and you suspect (or can show) that a USB device had been connected to the system, then how would you go about associating the device with a particular user, in the absence of LNK files?

MountPoints2 Key
As part of your USB device discovery process, one of the places that you're going to look is in the MountedDevices key within the System hive, in order to map the devices you've found in the USBStor subkeys to the volume globally unique identifier (GUID).  Beneath the MountedDevices key, some of the value names will be the volume GUIDs, and their binary data will contain the device information, in Unicode format.  Parsing the data and mapping back to the value name will provide you with the volume GUID.  You can then use this information to search the subkeys beneath the MountPoints2 key in the NTUSER.DAT hive files for the volume GUID.

The path to the MountPoints2 key is:
\Software\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2

Volume GUID key LastWrite time
It's commonly accepted that the LastWrite time for the volume GUID subkey beneath the MountPoints2 key indicates when that device was last connected to the system.
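
A minimal Perl sketch to list the volume GUID subkeys beneath a user's MountPoints2 key, along with their LastWrite times, might look like the following; the output can then be correlated against the MountedDevices values from the System hive:

use strict;
use Parse::Win32Registry;

# Minimal sketch: list the volume GUID subkeys beneath MountPoints2, with LastWrite times.
my $reg = Parse::Win32Registry->new(shift || 'NTUSER.DAT') or die "Cannot open hive\n";
my $mp2 = $reg->get_root_key->get_subkey('Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\MountPoints2')
    or die "MountPoints2 key not found\n";

foreach my $sk ($mp2->get_list_of_subkeys) {
    next unless ($sk->get_name =~ /^\{/);    # volume GUIDs only
    print $sk->get_timestamp_as_string, "  ", $sk->get_name, "\n";
}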

Shellbags Artifacts
Another artifact that allows us to correlate an attached device to a user is the shellbags artifacts, which on Windows 7 systems are found in the USRCLASS.DAT hive file within the user profile.  This is how we can tie a device to a particular user.

The shellbags artifacts are simply paths to resources composed of shell items, the same types of data structures that can be found in LNK file shell item ID lists, Jump Lists, as well as other locations within the Registry.  On Vista and Windows 7 systems, the first time that the user opens an Explorer window to a folder on an attached USB device (external hard drive or thumb drive), one of the shell items in the path will represent the drive letter to which the device was mapped, followed by the folder paths.  The drive letter can be correlated to a particular USB device by mapping to the contents of the MountedDevices key values (or, for a device previously connected to the system, you may want to dig a bit into the MountedDevices key contents available in VSCs), or to the "Windows Portable Devices\Devices" subkeys in the Software hive (again, be sure to check VSCs, as well).

Historical Registry Information
If you're looking for historical information from within the Registry, be sure to check the contents of the C:\Windows\system32\config\RegBack folder for backed-up copies of the System and Software hives.

However...and this is very important...the shellbags artifacts may contain information about attached devices that are not immediately obvious via other means of analysis.  For example, USB external drives and thumb drives are usually represented on the system as a volume (i.e., F:\, G:\) or drive letter, whereas smartphones, digital cameras, and MP3 players, while storage devices, usually appear in the shellbags artifacts as a different type (type == 0x2e) of shell item.  Not all of the tools available for parsing shellbags will parse these types of shell items; in fact, those that don't seem to simply skip parsing the entire path, so shellbags with type 0x2e shell items at their root are not displayed.

Further, depending upon the version of Windows and the type of device, the shell items that comprise the path to the resource might be of type 0x00, or a variable type.  As the name implies, the routine for parsing these types of shell items varies, and in many instances, a great deal of information can be retrieved via knowledgeable manual analysis.

Devices connected via Bluetooth
Devices can be connected to a Windows system in other ways, including via Bluetooth. What's interesting about this type of connection is that smartphones are pretty ubiquitous at this point, and once the initial connection has been made, reconnecting to the device is trivial, and the device doesn't even have to be in view. The device can be reconnected as long as it is in range, and can be on a belt, or in a backpack or purse.

In my research, I found that a lot of users will connect to their smartphone via Bluetooth and use that connection to play music via their computer.  I also found out that MS provides a file called fsquirt.exe, which is loaded on a system during installation if the system is found to have a Bluetooth radio.

Use the bthport.pl RegRipper plugin to get some information about devices connected to a Windows system via Bluetooth.

Saturday, June 29, 2013

HowTo: Tie LNK Files to a Device

Based on commentary I've seen in a couple of online forums, I thought I'd resurrect the "HowTo" label from some previous blog posts, and share (for commentary, feedback and improvement) some of the analysis processes that I've used while examining images of Windows systems.  There is a good deal of information available regarding various Windows artifacts, and one of perhaps the most difficult aspects of analysis is to tie various disparate bits of information together, correlating the artifacts, and building a complete picture so that your analysis can be used to answer questions and provide solutions.

This particular topic was previously discussed in this blog (and here's another, much older post), but sometimes processes like this need to be revisited.  Before we start, however, it's important to point out that this process will work only on Windows Vista systems and above, due to the information that is required for the process to work properly.

LNK Files
A Windows shortcut/LNK file can contain a volume serial number, or VSN.  This is intended to be a unique 4-byte (DWORD) value that identifies the volume, and is changed when the volume is reformatted.  Many tools that parse LNK files will display the VSN in their output, if one exists.

Note: Prefetch files include a volume information block which also contains a VSN.  If this information is different from the local system...that is, if a user launched an application from an external storage device...you can also use this process to correlate the VSN to the particular device.  You can view the VSN for a volume on a live system by navigating to the volume via the command prompt and typing the 'vol' command.

Registry
The EMDMgmt key (within the Software hive) contains information about USB external devices connected to the system.  This information is generated and used by the ReadyBoost service, at least in part to determine the suitability of the device for use as external RAM.

The path to the key in question is:
HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\EMDMgmt

This key will contain subkeys that pertain to and describe external storage media.  The subkeys that we're interested in are those that begin with "_??_USBSTOR#".  These subkey names are very similar to artifacts found in the System hive, particularly in the USBStor subkeys.  These subkey names include the device serial number, as well as a volume name (if one exists) and a VSN in decimal format.

An example of such a subkey name, with the VSN in bold, appears as follows:
HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\EMDMgmt\_??_USBSTOR#Disk&Ven_Best_Buy&Prod_Geek_Squad_U3&Rev_6.15#0C90195032E36889&0#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}TEST_1677970716

For those subkeys that pertain to USB thumb drives, the emdmgmt.pl RegRipper plugin will parse the subkey name, and display the VSN formatted in a usable, understandable manner.  That is to say that the plugin will translate the decimal value for the VSN into a hexadecimal format, and display it in the same manner as the VSN seen in LNK and Prefetch files, as well as what is displayed by the vol command on live systems.
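
The conversion itself is only a couple of lines of Perl; using the decimal value from the example subkey name above, 1677970716 translates to 6403-CD1C:

use strict;

# Minimal sketch: convert the decimal VSN appended to an EMDMgmt subkey name into
# the hex format shown by the 'vol' command and by LNK/Prefetch parsing tools.
my $vsn = shift || 1677970716;                 # value from the example subkey name above
printf("VSN: %04X-%04X\n", ($vsn >> 16) & 0xFFFF, $vsn & 0xFFFF);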

Again, it is important to note the EMDMgmt key exists on Vista systems and above, but not on XP systems. As such, this technique will not work for XP/2003 systems.

Now that we have these two pieces of information, we can correlate LNK files (or Prefetch files, if necessary) to a particular device, based on the VSNs.  I've used this technique a number of times, most recently in an attempt to determine a user's access to a particular device (remember, LNK files are most often associated with a user, as they are often located within the user's profile).  If you know what it is that you're attempting to determine or demonstrate...that is, the goals of your analysis...then the tools and artifacts tend to fall right into place.  When I've had to perform this type of correlation of artifacts, because of the tools I have available, this analysis is complete in just a few minutes.

As a final note, do not forget the value of historical information on the system, particularly for the Registry.  The RegBack folder should contain a backed-up copy of the Software hive, and there is additional information available in VSCs.  Corey Harrell has a number of excellent posts on his blog that demonstrate how to use simple tools and processes...batch files...to exploit the information available in VSCs.

Resources
MS-SHLLINK file format specification
Description of EMDMgmt RegRipper plugin

Thursday, June 20, 2013

Crossing Streams

Sometimes, crossing the streams can be a good thing.  I was checking out some of the new posts on my RSS feed recently, and saw SQLite on the Case over on the LinuxSleuthing blog.

I'm not an anti-Linux, Windows-only guy.  I'm just a guy who's used Windows, done vulnerability assessments of Windows systems, and been asked to do IR and forensic analysis of Windows systems for a long time.  I also like looking in other places for something I can use to make my job easier, progress more smoothly, and allow me to perform a more comprehensive analysis of Windows systems.  Because you can find SQLite databases on Windows systems (usually associated with a third-party application, such as Firefox or Chrome, or found in iDevice backups), I like to see what's out there in the DFIR community with respect to this database structure, and the LinuxSleuthing blog has been both generous and valuable in that regard.

The blog as a whole, and this post specifically, contains a lot of great info, but this time, it wasn't the technical info within the post that caught my attention...it was something of a crossing of the streams.  What I mean by that is that I saw a couple of statements in the post that reminded me of things I'd said in my own blog, and it was as if two different people, with different backgrounds, interests, etc., were following the same (or very similar) reasoning.  

The first statement:

It is very common in SQLite databases for integers to represent a deeper meaning than their numeric value. 

This sentence took me back to my post on understanding data structures.  Much like many of the data structures on Windows systems, the SQLite table in question has a column that contains integers, which must be translated in order to be meaningful to the analyst.  Either the analyst does this automatically, or a software tool, which provides a layer of abstraction over the raw data, does it for us.  In this case, "4" refers to an "incoming" call...that's important because doing a text-based keyword search for "incoming" won't reveal anything directly pertinent to the table or column in question.  

In the case of Windows systems, a 4 byte DWORD value might be an identifier, providing information regarding the type of something, or it might be a flag value, containing multiple settings AND'd together.  Our tools tend to provide a layer of abstraction over the data, and many will translate the integers to their corresponding human-readable values.  As such, it is important that we understand the data structure and its constituent elements, rather than simply relying on tools to spit out the data for us, as this helps us better understand the context of the data that we're looking at.

Consider the DOSDate time stamps found embedded in some shell items...what do they represent, and where do they come from?  Okay, we have something of an understanding of what they represent - the MAC times of the target resource (folder, file usually).  If the file system in which the target resource resides is NTFS, we know that the values start as FILETIME objects with 100 nanosecond granularity, are truncated to the second, and then (per the publicly available MS API) the seconds value is divided by two for storage, to be multiplied by two again when the value is displayed.  So, an NTFS time value of "23:15:05.657" ends up being reported as "23:15:04", and we have a considerable loss of granularity.  We also know that the embedded time stamps are a snapshot in time, and that the resources can be impacted by actions outside the purview of the shell items.  For example, after a shell item ID list within an LNK file is created, files can be added to or deleted from one of the constituent folders, updating the last modification time.  Finally, of what value is the last accessed time on Vista and above systems?
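
To make the loss of granularity concrete, here's the seconds-field round trip in a couple of lines of Perl (assuming the conversion truncates rather than rounds):

use strict;

# DOSDate seconds field: stored as seconds/2, expanded as field*2 when displayed.
my $ntfs_seconds = 5;                       # from 23:15:05.657, truncated to the second
my $stored       = int($ntfs_seconds / 2);  # 2
my $displayed    = $stored * 2;             # 4 -> the time is reported as 23:15:04
print "stored: $stored, displayed: $displayed\n";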

So, my (rather long winded) point is that when we see a date/time stamp labeled "last access time" in the output of a tool, do we really understand the context of that integer value?

Okay, on to the second statement...

If you thought use another tool and see what it says then go outside and drag your knuckles on the concrete for a bit.

This statement reminded me of my thoughts on what I refer to as the tool validation "myth-odology".

Consider another recent blog post regarding LNK file parsing tools; in that post, I described the issue I wanted to test for, and then after querying a forum for the tools folks currently used to parse these files, I ran several of them against some test data.  In this case, my goal was to see which of the tools correctly parsed shell item ID lists within the LNK files.  Given this goal, would you download a tool that does NOT parse/display the shell item ID list in an LNK file, and use it to validate the output of other tools?

Here's the issue...many of the tools recommended for parsing LNK files parse only the header and LinkInfo block, and not the shell item ID list.  Some that do parse the shell item ID lists do not appear to do so correctly.  The shell item data structures are not isolated to just LNK files, they're actually pretty pervasive on Windows systems, but I chose to start with LNK files as they've been around for a lot longer than other artifacts, and tools for parsing them might be more mature.  

So, why the interest in shell item ID lists?  Well, in most normal cases, the shell item ID lists might simply provide redundant information, replicating what's seen in the LinkInfo block.  However, there are legitimate cases where Windows will create an LNK file for a device (smart phone, digital camera, etc.) that consists solely of a header and a shell item ID list, and does not contain a LinkInfo block.  Not being able to accurately parse these LNK files can present an issue to IP theft (devices are great at storing files) and illicit images cases (might mean the difference between possession and production).  Also, there is malware out there that uses specially-crafted LNK files (i.e., "target.lnk") as a propagation mechanism.  

Given all this, if someone wants to use the output of a tool that does not parse the shell item ID list of an LNK file to validate the output of another tool, I think the above imagery of "drag your knuckles" is appropriate.  ;-)  Just sayin'.  

Wednesday, June 19, 2013

Reading

I wanted to share some of the interesting items I've read over the past couple of weeks, and in doing so, I think that it's important to share not just that it was read, but to also share any insights gleaned from the material.  It's one thing to provide a bunch of links...it's another thing entirely to share the impact that the material had on your thinking, and what insights you might have had.  I see this a good deal, not just in the DFIR community...someone posts links to material that they have read, as if to say, "hey, click on this and read it...", but doesn't share their insights as to why someone should do that.  If something is of value, I think that a quick synopsis of why it's of value would be useful to folks.

I look at it this way...have you ever looked up a restaurant on Yelp to read the reviews, and used what you saw to decide whether you wanted to go or not?  Or have you ever looked at reviews of movies in order to decide if you wanted to spend the money to see it in the theater now, or just wait until it hits the cable system, where you can see it for $5?  That's the approach I'm taking with this post...

Anyway, onward...

The Needs of the Many - this is an excellent blog post that presents and discusses the characteristics of a servant security leader.

This is an excellent read, not just for those seeking to understand how to be a servant leader, but also for any Star Trek fan, as Andrew uses not just quotes from the series and movies, but also scenes as metaphors for the topic.  It's one thing to write a paragraph and add a Wikipedia link for reference, but it's another thing entirely to use iconic movie characters and scenes to illustrate a point, such as three-dimensional thinking, giving the reader that "oh, yeah" moment.

Survivorship Bias - this blog post was an excellent read that really opened my eyes to how we tend to view our efforts in tool testing, as well as in analysis.

A quote from the article that really caught my attention is:

Failure to look for what is missing is a common shortcoming, not just within yourself but also within the institutions that surround you.

This is very true in a lot of ways, especially within the DFIR community, which is the "institution" in the quote.  Training courses (vendor-specific and otherwise) tend to talk a lot about tools, and as such, many analysts focus on tools, rather than the analysis process.  Some analysts will use tools endorsed by others, never asking if the tools have been tested or validated, and simply trusting that they have been.  In other cases, analysts will use one tool to validate the output of another tool, without ever understanding the underlying data structures being parsed...this is what I refer to as the tool validation "myth-odology".

This focus on tools is taking analysts away from where they need to be focused...which should be on the analysis process.  I've seen analysts say that the tools allow non-experts to be useful, but how useful are they if there is no understanding of what the tool is parsing, or how the tool does what it does?  A non-expert finding a piece of "evidence" at a physical crime scene will not know to provide the context of that evidence, and the same is true in the digital realm, as well.  Tools should not be viewed as "making non-experts useful".  Again, this is part of the "institution".

What I've seen this lead to is the repeated endorsement and use of tools that do not completely parse data structures, and do not provide any indication that they have not parsed those structures.  As the tools are endorsed by "experts", and analysts find just those things that they are looking for (that "survivorship bias", where failures are not considered) the tools continue to be used, and this is something that appears to be institutional rather than an isolated, individual problem.

If you have the time, I highly recommend reading through other posts at the YouAreNotSoSmart blog.  There are some really good posts there that are definitely worth your time to not just read, but consider and ingest.  I'm a firm believer that anyone who wants to progress in a field needs to regularly seek knowledge outside of that field, and this is a good place to spend some time doing just that.

Paul Melson: GrrCON 2012 Forensic Challenge Write-up - Folks seem to really like stories about how others accomplished something...Paul provides how he answered the GrrCON 2012 Forensic Challenge.  This is actually from the end of last year, but that's fine...everything he did still holds. Paul walks through the entire process used, describing the tools used, providing command lines, and illustrating the output.  If you attempted this challenge, compare what you did to what Paul provided.

Within the DFIR community especially, I've found that analysts really tend to enjoy reading about or hearing how others have gone about solving problems.  However, one of the shortcomings of the community is that not a lot of folks within it like to share how they've gone about solving problems.  I know that last year, a friend of mine tried to set up a web site where folks could share those stories, but it never got off the ground.  It's unfortunate...none of us alone is as smart as all of us together, and there is a lot of great information sitting out there, untapped and unused, because we aren't willing or able to share it.

Digital Forensics Stream: Amazon Cloud Drive Forensics, pt I - Similar to Paul's post, this DFStream blog post wasn't about a challenge, but it did provide an end-to-end walk through of an important aspect of analysis...testing.  With the rise of cloud services, and an increased discussion of their use, one of the aspects that analysts will have to contend with is the use of desktop applications for accessing these services.  Access to these services is being built into platforms, and their use is going to become more transparent (to the user) over time.  As such, analysts are going to need to have an understanding of the effect of these applications on the systems being analyzed, and many analysts are in fact already asking those questions...and this post provides some answers.

GSN: Computer forensics experts discover how to determine how many times a hard drive had been turned on, and for how many hours it had run.  Very interesting find within the SMART info of some drives, this can definitely be useful.

Tuesday, June 18, 2013

There Are Four Lights: LNK Parsing tools

Based on the content of my last post regarding shell items, I wanted to take a look at some of the available tools for parsing Windows shortcuts/LNK files.  I started by asking folks what tools they used to parse LNK files, and then went looking for those, and other, tools.

The purpose of this blog post is to take a look at how effective some of the various tools used by analysts are at parsing shell item ID lists within Windows shortcut/LNK files.

This blog post contains an excellent description of what we're looking for and trying to achieve with the testing.

My previous post includes sample output from the tool I use to parse LNK files; one of the files I used for testing is an LNK file for a device that does not contain a LinkInfo block, but instead contains ONLY a shell item ID list.  I did not specifically craft this LNK file...it was taken from a Windows 2008 R2 system.  I copied this file to my desktop, and changed the extension from ".lnk" to ".txt".

For comparison purposes, the script I wrote parses the device test file as follows:

File: c:\users\harlan\desktop\camera.txt
shitemidlist       My Computer/DROID2/Removable Storage/dcim/Camera

The other file I used in this testing is an LNK file created by the installation of Google Chrome.  All of the tools tested handled parsing this LNK file just fine, although not all of them parsed the shell item ID list.

Now, on to the tools themselves.  Some of the things I'm most interested in when looking at tools for parsing LNK files include completeness/correctness of output, ease of use, the ease with which I can incorporate the output into my analysis processes, etc.  I know that some of these aspects may mean different things to different people...for example, if you're not familiar with parsing shell item ID lists, how do you determine completeness/correctness?  Also, "ease of use" may mean "GUI" to some, but it may mean "CSV output" to others.  As such, I opted to not give any recommendations or ratings, but to instead just provide what I saw in the output from each tool.

TZWorks lp64 v0.55 - lp64 (the 64-bit version of the tool) handled the Google Chrome LNK file easily, as did the other tools included in this test.  Unlike some of the other tools, lp64 parsed the shell item ID list from the LNK file:

ID List:  {CLSID_UsersFiles}\Local\Google\Chrome\Application\chrome.exe

For the device test file described above, lp64 provided the following output:

ID List:  {CLSID_MyComputer}\{2006014e-0831-0003-0000-000000000000}

I'm not sure what the GUID refers to...I did a look up via Google and didn't find anything that would really give me an indication of what that meant.  Looking at the file itself in a hex editor (i.e., UltraEdit), I can see from where that data originated, and I can tell that the shell item was not parsed correctly; that is to say, the 16 bytes extracted from the file are NOT a GUID, yet lp64 parses them as such.

WoanWare LnkAnalyser v1.01 - This tool is a CLI utility that took me a couple of attempts to run, first because I had typed "lnkanalyzer" instead of "lnkanalyser".  ;-)  I then pointed it at the camera.txt file from the previous post (renamed from camera.lnk) and it did not display any shell item contents.  In fact, the tool listed several sections (i.e., Target Metadata, Volume ID, TrackerDataBlock, etc.), all of which were empty, with the exception of the time stamps, which were listed as "1/1/0001 12:00:00 AM".

LnkAnalyser did handle the Google Chrome LNK file just fine, but without parsing the shell item ID list.

Lnk_parser - The Google Code page states that this tool is "in beta" and should not be rehosted...I opted to include it in testing.  It turns out that this tool is very interactive (which I could have avoided, had I read the command line usage instructions), posting a list of questions to the console for the analyst to answer, with respect to where the target file is located, the type of output that you want, and where you want the output to go.  I chose CSV output, going to the current working directory, as well as to the console.  The output of the tool did include:

[Link Target ID List]
CLSID:    20d04fe0-3aea-1069-a2d8-08002b30309d = My Computer

This was followed by a number of "[Property Store]" entries that made little sense; that is to say, I am familiar with what these entries might represent from my research, but the data that they contain doesn't look as if it would be meaningful or usable to an analyst.  I did find a reference to one of the PROPERTYSTORAGE values from the lnk_parser output listed in the Cloud Storage Forensic Analysis PDF, reportedly as part of the output from XWays 16.5, but I'm not clear as to what it refers to.

Lnk_parser did not handle the Google Chrome LNK file at all.  I used the same settings/choices as I did for the previous file, and got no output at the console.  The resulting CSV file in the working directory had only one entry, and it was just some garbled data.

MiTeC Windows File Analyzer (WFA) - LNK files are just one of the file formats that WFA is capable of parsing.  WFA is GUI-based and works on directories (rather than individual files), so I had to rename the camera.txt file to camera.lnk.  WFA did not parse any data from the camera.lnk file, although it handled the Google Chrome LNK file just fine.  WFA did not, however, parse the shell item ID list from the Google Chrome LNK file.

Log2Timeline - a user over on the Win4n6 forum mentioned that log2timeline parses shell item ID lists in LNK files, but I verified with Kristinn that at the moment, it does not.  As such, log2timeline was not included in the test.  I am including it in this listing simply due to the fact that someone had mentioned that it does parse shell item ID lists.

Other tools - some others have mentioned using EnCase 6 and/or 7 for parsing LNK files; I do not have access to either one, so I cannot test them.

Results
The overall results of my (admittedly limited) testing indicate that the TZWorks lp64 tool does the best job of the available tools when it comes to parsing shell item ID lists within LNK files.  That being said, however, some shell items do not appear to be parsed correctly.

On a side note, something that I do like about lp64 is that it lists its output in an easy-to-parse format.  Each element is listed on a single line with an element ID, a colon, and then the data...this makes it easy to parse using Perl or Python.
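To illustrate (this is just a sketch; the "ID List" element name is taken from the lp64 output shown earlier in this post, and the exact names may vary between versions of the tool), pulling that kind of output into a hash in Perl takes only a few lines:

use strict;
use warnings;

# Read lp64-style "element: value" lines on STDIN and load them into a
# hash keyed on the element name.
my %elements;
while (my $line = <STDIN>) {
    chomp $line;
    next unless ($line =~ /^\s*([^:]+?)\s*:\s*(.*)$/);
    $elements{$1} = $2;
}

# For example, report the ID List element, if one was present
print "ID List: ".$elements{"ID List"}."\n" if (exists $elements{"ID List"});

Pipe the tool's output into the script, or redirect it to a file first and feed that in.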

So What?
So, why is this important?  After all, who cares?  Well, to be honest, every analyst should, for the simple fact that shell items can be found in a number of artifacts besides just Windows LNK files.  They exist in shellbags artifacts and within the MenuOrder subkeys, they're embedded within Windows 7 and 8 Jump Lists and ComDlg32 subkey values (Vista+), and they can even be found in the Windows 8 Photo artifacts.  Being able to understand shell items can be important, and being able to correctly parse device shell items can be even more important; in CP cases, the use of devices may indicate production, and in IP theft cases, a device may have been used as the target of a copy or move operation.  Also, there is malware that is known to use specially-crafted LNK files (i.e., target.lnk) as part of its infection/propagation mechanism, so being able to accurately parse these files will be valuable to malware analysis.

Resources
ForensicsWiki page - LNK
LinuxSleuthing blog post

Saturday, June 01, 2013

There Are Four Lights: Shell Items

There's a good bit of information available on artifacts referred to as "shellbags", but not much information, nor discussion, on the underlying data structures within shellbags...shell items.

Shell items are data structures used to identify various elements within the Windows folder hierarchy.  Where a simple ASCII listing of the path to a folder or file would suffice, we instead have shell items, and paths reconstructed by parsing shell item ID lists.  Some of these shell items are 22 bytes in length and contain just a GUID, which needs to be translated into something that the analyst can recognize, such as "My Computer" or "Control Panel".  Other shell items refer to other resources, including folders, and need to be parsed differently.  More on that later.
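For the GUID-only shell items, that translation is nothing more than a lookup against a table of known class identifiers.  A minimal sketch is below; the table shows only a few well-known entries (a usable table would be considerably longer), and the GUID string is assumed to have already been extracted and formatted by the parser:

use strict;
use warnings;

# Map a few well-known folder CLSIDs to friendly names; a real table
# would contain many more entries.
my %folder_guids = (
    "20d04fe0-3aea-1069-a2d8-08002b30309d" => "My Computer",
    "21ec2020-3aea-1069-a2dd-08002b30309d" => "Control Panel",
    "450d8fba-ad25-11d0-98a8-0800361b1103" => "My Documents",
);

sub guid_to_name {
    my $guid = lc(shift);
    return exists $folder_guids{$guid} ? $folder_guids{$guid} : "Unknown GUID: {".$guid."}";
}

print guid_to_name("20D04FE0-3AEA-1069-A2D8-08002B30309D"), "\n";   # prints "My Computer"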

LNK files
Shell items have been part of Windows systems since well before talk of shellbags first came up.  Shell items are included in Windows shortcut/LNK files, which have been available on Windows systems for quite some time.  However, it's only been within the past 12 to 18 months that there's been much real recognition of the fact that LNK files contain shell items, and this recognition has been due, in part, to some of the popular tools used to parse LNK files actually parsing this information.  Even today, there are a number of tools available and in common use for parsing LNK files that do not parse the shell items.  In just the past year alone, I've examined a number of Windows systems on which LNK files were created for devices (in most cases, digital cameras) that consisted solely of a header and a shell item ID list, and did not contain a LinkInfo block.  What this means is that many of the commonly used tools simply show nothing in the output.  Why does this matter at all?  Well, take a look at this blog post regarding what we're doing wrong with respect to LNK parsing, and when you're done, read the follow-up blog post, found here.

Do I expect to see intruders manipulating LNK files in a manner similar to what is described here?  No, I don't...that doesn't mean that it won't happen, however.  What I have seen are LNK files comprised solely of a header and a shell item ID list, with no LinkInfo block, which means that most of the tools in common use within the community will show no data.

A while back, Sophos released a tool to help protect users from the exploitation of the CVE-2010-2568 vulnerability.  Given that this vulnerability is almost three years old, it makes me wonder how often the analysis of shell item ID lists within LNK files is missed.

Time stamps
Many of the shell item structures include DOSDate format time stamps, which correlate to the modified, accessed, and created dates of the object resource (usually, a file or a folder).  A couple of things to keep in mind with respect to these times, particularly when the platform you're analyzing is formatted NTFS:
  • The DOSDate time stamps within the shell items, particularly for resources located on the system itself, were originally stored within the MFT as FILETIME objects.  What this means is that we have a significant loss in granularity, going from 64 bits based on 100-nanosecond intervals to a 32-bit format in which the seconds are stored divided by 2, giving a two-second granularity.  If the seconds value of the original FILETIME time stamp is 5, what gets recorded in the DOSDate format will come back out as 4 or 6, depending on how the conversion rounds...and a difference of even a couple of seconds can be significant, particularly in timeline analysis, if you don't know enough about the data structures to explain it (see the sketch following this list).
  • Systems from Vista on up do not, by default, update last accessed times through normal user activity, such as opening files.
  • Target resources listed in shellbags can be modified by activity and processes outside the purview of the shellbags artifacts.
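Regarding the first bullet above, here's a minimal sketch of the granularity loss; whether the conversion truncates or rounds odd seconds values depends on the API or code doing the conversion, so this sketch simply truncates:

use strict;
use warnings;

# Illustrate the loss of granularity going from a FILETIME-derived
# seconds value to the 5-bit DOSDate seconds field (stored as seconds/2).
sub dosdate_seconds {
    my $seconds = shift;
    my $stored  = int($seconds / 2);   # what goes into the 5-bit field
    return $stored * 2;                # what you get back out
}

foreach my $sec (0 .. 7) {
    printf "FILETIME seconds %d => DOSDate seconds %d\n", $sec, dosdate_seconds($sec);
}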
While digging a bit deeper into parsing XP shellbags, I saw a number of structures that included either FILETIME objects or strings specifying a date and time embedded within the shell items; however, without additional documentation and resources, I really have no way of determining to what these time stamps refer or correlate.  Suffice to say, those time stamps are there and likely pertain to something.

Many types of shell items also include a good deal of embedded information within the structure itself.  Some include file sizes, and can be used to demonstrate changes in file size over time (on Vista and Windows 7 systems, you may be able to demonstrate the changes in contents via analysis of the files in VSCs).  I've also seen some shell items that contain a lot of information, with each section marked by an individual GUID that I needed to look up on the MS site to determine what it meant...in one case, one of the embedded GUIDs marked the last modification time of the resource, while another marked the creation date.

Where are shell items used?
Artifacts that include shell items include:
  • LNK files
  • Jump Lists (both auto* and custom*, on both Win7 and 8)
  • Shellbags
  • MenuOrder subkeys
  • ComDlg32 subkey values (Vista+)
  • Windows 8 USRCLASS.DAT (Photos artifacts)
What's clear to me is that shell items appear to be increasing in use within Windows systems as the versions increase.  Shell items are also used to refer to resources other than files and folders.  Some shell items refer to network resources, building out a path to other systems that the user accessed.  In a corporate environment, it's not unusual to see paths to file servers, but many times it may be an HR issue when there are a number of paths that lead to the C$ share on other employee systems.  Some shell items refer to devices, such as digital cameras, smart phones and iPods, while others refer to web-based resources.  I examined a compromised system a while back and found that the intruder had used FTP through Windows Explorer; I found this very interesting because the shellbags were the only artifacts of this activity, and it would have been missed if I had not examined these artifacts.

Usefulness
Overall, what is the usefulness of understanding these artifacts?  One of the things that I've seen throughout my time as an analyst is that if we don't know about something, we're not likely to incorporate it into our analysis process.  The purpose of this blog post is to raise awareness of these artifacts, and get folks looking at them in more than just training courses.

With respect to shellbags artifacts, things changed drastically between XP and Win7.  With XP, the shellbags artifacts are located in the NTUSER.DAT hive, and the NodeSlot value within each BagMRU subkey points to a corresponding numbered Bags\{number}\Shell subkey, which may have an ItemPos* value (i.e., a value whose name starts with "ItemPos", followed by what looks like it might be a screen resolution setting).  If so, very often this value contains a number of concatenated shell items that provide a directory listing...yes, that's exactly right, the contents of the folder.  I know of one analyst who has used this information to demonstrate the contents of encrypted volumes.
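For those who want to poke at this themselves, a minimal sketch using the Parse::Win32Registry module is below.  It assumes the XP-era key path (Software\Microsoft\Windows\ShellNoRoam) and only follows the NodeSlot value from the top-level BagMRU key; a real parser would recurse through the BagMRU subkeys and actually parse the shell items rather than just reporting how much data is there:

use strict;
use warnings;
use Parse::Win32Registry;

# Sketch: follow a BagMRU NodeSlot value to its Bags\{n}\Shell key and
# list any ItemPos* values found there (XP, NTUSER.DAT).
my $reg  = Parse::Win32Registry->new("NTUSER.DAT") or die "Cannot open hive\n";
my $root = $reg->get_root_key;

my $bagmru = $root->get_subkey("Software\\Microsoft\\Windows\\ShellNoRoam\\BagMRU")
    or die "BagMRU key not found\n";

if (my $slot = $bagmru->get_value("NodeSlot")) {
    my $n     = $slot->get_data();
    my $shell = $root->get_subkey("Software\\Microsoft\\Windows\\ShellNoRoam\\Bags\\".$n."\\Shell");
    if ($shell) {
        foreach my $val ($shell->get_list_of_values) {
            next unless ($val->get_name =~ /^ItemPos/);
            # The data is a series of concatenated shell items; parsing
            # them is left to a shell item parser.
            printf "%s : %d bytes of shell item data\n", $val->get_name, length($val->get_data);
        }
    }
}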

With Vista (and subsequently, Win7), these artifacts were moved to the USRCLASS.DAT hive, and no longer make use of the NodeSlot value to correlate additional information from the hive.  However, there are ItemPos* values in the NTUSER.DAT hive that can provide you with an indication of the files on a user's Desktop at a specific point in time.

And now for the Ugly...
While Microsoft provides documentation for a number of formats, the shell item format is not one of them.  It has taken the work of a small number of dedicated folks within the community, sometimes with support from a small number of other folks who have provided sample data, to put together initial documentation and subsequent tools for parsing these artifacts.

As of now, there are very few tools for parsing LNK files that will parse the shell item ID list (SHITEMIDLIST).  And at this point, we're only talking about parsing the information, not incorporating it into other analysis methodologies.

There are a few tools available that parse shellbags, and of those, most do not parse all of the available shell items.  In particular, many of the available tools do not parse shell item structures that point to devices.  For some of those tools, this can be verified through the source code, while for others, you would need to run those tools and compare the output with other resources.  IMHO, I'd think that something like this would be a significant issue, not just in cases involving illicit images (may show production over possession or distribution), but also in cases of IP theft, harassment, etc.  But this illustrates why it's so important for analysts to understand the underlying data structures that are being parsed.

For example, here is the output of a script for parsing LNK files, including the shell item ID list, run against a Google Chrome LNK file on my desktop:

C:\Perl\jl>lnk.pl "c:\users\harlan\desktop\Google Chrome.lnk"
File: c:\users\harlan\desktop\Google Chrome.lnk
mtime              Fri May 17 21:35:44 2013 UTC
atime              Wed May 22 11:33:48 2013 UTC
ctime              Wed Apr 13 19:37:47 2011 UTC
workingdir         C:\Users\harlan\AppData\Local\Google\Chrome\Application
basepath           C:\Users\harlan\AppData\Local\Google\Chrome\Application\chrome.exe
description        Access the Internet
machineID          enzo
birth_obj_id_node  00:50:56:c0:00:08
shitemidlist       Users/AppData/Local/Google/Chrome/Application/chrome.exe
vol_sn             22D3-06AE
vol_type           Fixed Disk

Here's the output of the same script, run against a completely legit LNK file taken from another system (extension changed):

File: c:\users\harlan\desktop\camera.txt
shitemidlist       My Computer/DROID2/Removable Storage/dcim/Camera

That's it...there's nothing else to display, no LinkInfo block, no string data, nothing beyond the header and the shell item ID list.  In fact, the flags in the header specifically state that there is no LinkInfo block.  Again, it is critical to understand here that the LNK file was not specifically crafted as an exercise; rather, it was created through the normal, legitimate use of the operating system.  However, few of the available tools will parse the shell item ID list, and those that do not will provide no output at all for this file.
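The header check involved is straightforward; the sketch below (Perl, field offsets per the publicly available MS-SHLLINK documentation) reads the 76-byte header and reports the two flags of interest here:

use strict;
use warnings;

# Read the 76-byte LNK header and report whether the HasLinkTargetIDList
# and HasLinkInfo flags are set (offsets per MS-SHLLINK).
my $file = shift || die "Usage: lnkflags.pl <file>\n";
open(my $fh, "<", $file) or die "Cannot open $file: $!\n";
binmode($fh);
read($fh, my $hdr, 0x4C) == 0x4C or die "Short read\n";
close($fh);

my $size = unpack("V", substr($hdr, 0, 4));
die "Not a LNK file\n" unless ($size == 0x4C);

my $flags = unpack("V", substr($hdr, 0x14, 4));
printf "HasLinkTargetIDList : %s\n", ($flags & 0x01) ? "yes" : "no";
printf "HasLinkInfo         : %s\n", ($flags & 0x02) ? "yes" : "no";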

Resources
ForensicsWiki: LNK
ForensicsWiki: Shell Item

Special thanks to Joachim Metz for correlating and providing a great deal of format information when it comes to a variety of data structures on Windows systems.  I have to say that I fully agree with his philosophy on analysis, as listed in his ForensicsWiki bio.  I also want to thank Kevin Moore for his work in supporting Andrew Case, et al, by writing the shellbag parsing code for Registry Decoder.  Finally, I would like to thank all of those who have provided sample data for me to use in developing some parsing tools.

Wednesday, May 29, 2013

Good Reading, Tools


Reading
Cylance Blog - Uncommon Event Log Analysis - some great stuff here showing what can be found with respect to indirect or "consequential" artifacts, particularly within the Windows Event Logs on Vista systems and above.  The author does a pretty good job of pointing out how some useful information can be found in some pretty unusual places within Windows systems.  I'd be interested to see where things fall out when a timeline is assembled, as that's how I most often locate indirect artifacts.

Cylance Blog - Uncommon Handle Analysis - another blog post by Gary Colomb, this one involving the analysis of handles in memory.  I liked the approach taken, wherein Gary explains the why, and provides a tool for the how.  A number of years ago, I wrote a Perl script that would parse the output of the MS SysInternals tool handle.exe (run as "handle -a") and sort the handles found based on least frequency of occurrence, in order to do something similar to what's described in the post.
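Something along these lines would do it (a sketch only, not the original script; handle.exe output formats vary between versions, so the parsing here is deliberately loose and would need to be adjusted):

use strict;
use warnings;

# Count how often each handle entry appears in "handle -a" output (piped
# in on STDIN) and list the entries by least frequency of occurrence.
my %count;
while (my $line = <STDIN>) {
    chomp $line;
    # Very loose parse: take everything after the handle number and colon;
    # adjust the regex to suit the handle.exe version in use.
    next unless ($line =~ /:\s+(\S.*)$/);
    $count{$1}++;
}

foreach my $name (sort { $count{$a} <=> $count{$b} } keys %count) {
    printf "%-6d %s\n", $count{$name}, $name;
}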

Security BrainDump - Bugbear found some interesting ZeroAccess artifacts; many of the artifacts are similar to what is seen in other variants of ZA, as well as in other malware families (i.e., file system tunneling), but in this case, the click fraud appeared in the systemprofile folder...that's very interesting. 

SpiderLabs Anterior - The White X - this was an interesting and insightful read, in that it fits right along with Chris Pogue's Sniper Forensics presentations, particularly when he talks about 'expert eyes'.  One thing Chris is absolutely correct about is that we, as a community, need to continue to shift our focus away from tools and more toward methodologies and processes.  Corey Harrell has said the same thing, and I really believe this to be true.  While others have suggested that the tools help to make non-experts useful, I would suggest that the usefulness of these "non-experts" is extremely limited.  I'm not suggesting that one has to be an expert in mechanical engineering and combustion engine design in order to drive a car...rather, I'm simply saying that we have to have an understanding of the underlying data structures and what the tools are doing when we run those tools.  We need to instead focus on the analysis process.

Java Web Vulnerability Mitigation on Windows - Great blog post that is very timely, and includes information that can be used in conjunction with RegRipper in order to determine the initial infection vector (IIV) during analysis.

ForkSec Blog - "new" blog I saw referenced on Twitter one morning, and I started my reading with the post regarding the review of the viaExtract demo.  I don't do any mobile forensics at the moment, but I did enjoy reading the post, as well as seeing the reference to Santoku Linux.

Tools
win-sshfs - ssh(sftp) file system for Windows - I haven't tried this one but it does look interesting.

4Discovery recently announced that they'd released a number of tools to assist in forensic analysis.  I downloaded and ran two of the tools...LinkParser and shellbagger.  I ran LinkParser against a legit LNK file that I'd pulled from a system that contained only a header and a shell item ID list (it had no LinkInfo block), and LinkParser didn't display anything.  I also ran LinkParser against a couple of LNK files that I have been using to test my own tools, and it did not seem to parse the shell item ID lists.  I then ran shellbagger against some test data I've been working with, and found that, similar to other popular tools, it missed some shell items completely.  I did notice that when the tool found a GUID that it didn't know, it said so...but it didn't display the GUID in the GUI so that the analyst could look it up.  I haven't yet had a chance to run some of the other tools, and there are reportedly more coming out in the future, so keep an eye on the web site.

ShadowKit - I saw via Chad Tilbury on G+ recently that ShadowKit v1.6 is available.  Here's another blog post that talks about how to use ShadowKit; the process for setting up your image to be accessed is identical to the process I laid out in WFAT 3/e...so, I guess I'm having a little difficulty seeing the advantages of this tool over native tools such as vssadmin + mklink, beyond the fact that it provides a GUI.

Autopsy - Now has a graphical timeline feature; right now, this feature only appears to include the file system metadata, but the approach certainly has potential.  Based on my experience, I do not see the immediate value in bringing graphical features to the front end of timeline analysis.  There are other tools that utilize a similar approach, and as with those, I don't see the immediate value, as most often I'm not looking for where or when the greatest number of events occur; I'm usually instead looking for the needle in a stack of needles.  However, I do see the potential for the use of this technique in timeline analysis.  Specifically, adding Registry, Windows Event Log, and other events will only increase the amount of data, but one means for addressing this would be to include alerts in the timeline data, and then show all events as one color, and alerts as another.  Alerts could be based on either direct or indirect/consequential artifacts, and can be extremely valuable in a number of types of cases, directing the analyst's attention to critical areas for analysis.

NTFS TriForce - David Cowen has released the public beta of his NTFS TriForce tool.  I didn't see David's presentation on this tool, but I did get to listen to the recording of the DFIROnline presentation - the individual artifacts that David describes are very useful, but the real value is obtained when they're all combined.

Auto-rip - Corey has unleashed auto-rip; Corey's done a great job of automating data collection and initial analysis, with the key to this automation being that Corey knows and understands EXACTLY what he's doing and why when he launches auto-rip.  This is really the key to automating any DFIR task...while some will say that "it goes without saying", too often there is a lack of understanding with respect to the underlying data structures and their context when automated tools are run.

WebLogParser - Eric Zimmerman has released a log parser with geolocation, DNS lookups, and more.

Tuesday, May 21, 2013

Plugin: SAMParse

I thought I'd take a moment to discuss the samparse.pl plugin.  This plugin parses the SAM hive file for information regarding user accounts local to the system itself, as well as their group membership, both of which can be very valuable and provide a good amount of insight for the analyst, depending upon the case.  The information retrieved by this plugin should be correlated against the output of the profilelist.pl plugin, as well as the user profiles found within the file system.

One of the initial sources for parsing the binary data maintained within the SAM hive is the Offline Windows Password and Registry Editor.  There is also a good deal of useful information in this AccessData PDF document.

An interesting piece of information displayed by this plugin, if available, is the user password hint.  This capability was added to the plugin on 20 Oct 2009 (password hints themselves go back to XP), and was discussed by SpiderLabs almost 3 years later.  This may provide useful information for an analyst...I have actually seen what turned out to be the user's password here!

Perhaps one of the most confusing bits of information in the output of the samparse.pl plugin is the "Password not required" entry.  This is based on a check of a flag value, and means just that...that a password is not required.  It does NOT mean that the account does not have a password...it simply means that one is not required.  As such, you may find that the account does, indeed, have a password.  I've seen posts to various forums and lists that either ask about this setting, or simply state that the output of RegRipper is incorrect.  I am always glad to entertain and consider issues where the interpretation of a Registry value or data flag setting is incorrect, particularly if it is supported with solid data.
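For reference, the kind of flag test involved looks like the sketch below; it assumes the ACB flags value has already been extracted from the account's binary F value (samparse.pl handles that extraction itself), and uses the commonly documented ACB_PWNOTREQ bit (0x0004) for "Password not required":

use strict;
use warnings;

# Sketch: interpret a few of the ACB flag bits from a SAM user account.
# Note that "Password not required" means just that...it does NOT mean
# that no password is set on the account.
my %acb = (
    0x0001 => "Account Disabled",
    0x0004 => "Password not required",
    0x0010 => "Normal user account",
);

sub list_acb_flags {
    my $flags = shift;
    return grep { $flags & $_ } sort { $a <=> $b } keys %acb;
}

my $flags = 0x0214;   # example value only
print "$acb{$_}\n" foreach (list_acb_flags($flags));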

If you're analyzing a Vista or Windows 7 system and run across something suspicious regarding the local user accounts, remember that you will have a copy of the SAM hive in the Windows\system32\config\RegBack folder that you can incorporate into your analysis, and that you may also have older SAM hives in available VSCs.

Finally, there's a version of this plugin that provides timeline (TLN) output for various bits of time stamped data, to include the account creation date, the password reset date, the last password failure date, and the last login.  Incorporating this into your timeline, along with the historical information available in other Registry resources (such as those mentioned in the above paragraph), can provide considerable insight into user activity on the system.
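The TLN format itself is just five pipe-delimited fields...time (Unix epoch), source, host, user, and description...so generating a line for, say, the account creation date looks like the sketch below (the values shown are illustrative only):

use strict;
use warnings;

# Sketch: emit a single TLN (time|source|host|user|description) line for
# an account creation date; the values here are illustrative only.
sub tln_line {
    my ($epoch, $source, $host, $user, $desc) = @_;
    return join("|", $epoch, $source, $host, $user, $desc);
}

my $acct_created = 1369094400;   # example Unix epoch time
print tln_line($acct_created, "SAM", "HOSTNAME", "", "Account Created - jdoe [1000]"), "\n";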

Resources
MS KB305144 -
Scripting Guy blog, 7/7/2006


Thursday, May 16, 2013

The Tool Validation "Myth-odology"

I posted recently about understanding data structures, and I wanted to continue with that thought process and line of reasoning into the area of the current state of tool validation.

What we have seen in the community for some time is that a new tool is announced or mentioned, and members of the community begin clamoring for their copy of that tool. Many times, one of the first questions is, "where can I download a copy of the tool?"  The reasons most give for wanting to download a copy of the tool is so that they can "test" it, or use it to validate the output of other tools.  To that, I would pose this question - if you do not understand what the tool is doing, what it is designed to do, and you do not understand the underlying data structures being parsed, how can you then effectively test the tool, or use that tool to validate other tools?

As such, the current state of tool validation, for the most part, isn't so much a methodology as it is a myth-odology.  Obviously, this isn't associated with testing and validation processes such as those used by NIST and other organizations, and applies more to individual analysts.

There are tools out there right now that are being recommended as being THE tool for parsing a particular artifact or set of artifacts.  The tools are, in fact, very good at what they do, but the fact is that some of them do not parse all of the data structures available within the set of artifacts, nor do they identify the fact that they're missing these structures in their output.  I'm aware of analysts who, in some cases, have stated that the fact that the tool doesn't parse and display specific artifacts isn't an issue for them, because the tool showed them what they were looking for.  I think what's happening is that someone will run a tool against a data set, see a lot of data in the output, and deem it "good".  They may then run another tool against the same data set, see different output, and deem one of the tools "not good", or at the very least "questionable".  What I don't think is happening is that analysts are testing the tools against the data structures themselves; instead, they're treating the data as a 'blob' and relying on the tools to provide that layer of abstraction I mentioned in my previous post.

Consider the parsing of shell items, and shell item ID lists.  These artifacts abound on Windows systems, more so as the versions of Windows increase.  One place that they've existed for some time is in the Windows shortcuts (aka, LNK files).  Some of the tools that we've used for years parse both the headers and LinkInfo blocks of these files, but it's only been in the past 12 - 18 months or so that tools have parsed the shell item ID lists.  Why is this important?  These blog posts do a great job of explaining why...give them a read.  Another reason is that over the past year or so, I've run across several LNK files that consisted solely of the header and the shell item ID list...there was no LinkInfo block to parse.  As such, some of the tools that were available at the time would simply return blank output.

There is also the issue of understanding how a tool performs its function.  Let's take a look at the XP Event Log example again.  Tools that use the MS API for parsing these files are likely going to return the "corrupted file" message that we're all used to seeing, but tools that parse the files on a binary level, going record-by-record, will likely work just fine.
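A sketch of the record-by-record approach appears below; it simply scans the file for the "LfLe" magic number that marks each EVT event record and reads the 4-byte record length that precedes it.  A real parser would go on to unpack and sanity-check the rest of each record:

use strict;
use warnings;

# Sketch: locate EVT event records by scanning for the "LfLe" magic number
# rather than relying on the API; each hit is preceded by a 4-byte record
# length.
my $file = shift || die "Usage: evtscan.pl <file>\n";
open(my $fh, "<", $file) or die "Cannot open $file: $!\n";
binmode($fh);
my $data = do { local $/; <$fh> };
close($fh);

my $pos = 0;
while (($pos = index($data, "LfLe", $pos)) > -1) {
    if ($pos >= 4) {
        my $len = unpack("V", substr($data, $pos - 4, 4));
        printf "Possible record at offset 0x%x, length %d bytes\n", $pos - 4, $len;
    }
    $pos += 4;
}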

Another myth or misconception that is seen too often is that the quality of a tool is determined by how much space its output consumes.  This simply is not the case.  Again, consider the shell item ID lists in LNK files.  Some of the structures that make up these lists contain time stamps, and a number of tools display the time stamps.  What do these time stamps mean?  How are they generated/produced?  Perhaps equally important is the question, what format are the time stamps saved in?  As it turns out, the time stamps are in the DOSDate format, consuming 32 bits and having a 2-second granularity.  On NTFS systems, a folder entry (that leads to the target file) that appears in the shell item ID list will have its 64-bit FILETIME time stamp converted to a 32-bit DOSDate time stamp, with a corresponding loss in granularity.  As such, it's important to not only understand the data structure and its various elements, but also the context of those structure elements.  So, if one tool lists all of the elements of the component data structures and another does not, is the second tool any less valid or correct?
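For reference, decoding a DOSDate time stamp (a 16-bit date word followed by a 16-bit time word, as they appear in shell items) looks like the sketch below; the two-second granularity comes from the 5-bit seconds field being stored as seconds divided by two (the example bytes are made up):

use strict;
use warnings;

# Sketch: decode a DOSDate date/time pair (16-bit date word, 16-bit time
# word) into its components.
sub parse_dosdate {
    my ($date, $time) = @_;
    my $day   =  $date & 0x1f;
    my $month = ($date >> 5) & 0x0f;
    my $year  = (($date >> 9) & 0x7f) + 1980;
    my $sec   = ($time & 0x1f) * 2;          # stored as seconds/2
    my $min   = ($time >> 5) & 0x3f;
    my $hour  = ($time >> 11) & 0x1f;
    return sprintf "%04d-%02d-%02d %02d:%02d:%02d", $year, $month, $day, $hour, $min, $sec;
}

# Example: unpack the two 16-bit words from 4 bytes of shell item data
my ($d, $t) = unpack("vv", "\x66\x42\x1c\x5e");
print parse_dosdate($d, $t), "\n";   # prints 2013-03-06 11:48:56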

Returning to the subject of data structures, does this mean that every analyst must know and understand the details for every available data structure on, say, a Windows system?  No, not at all...that's simply not realistic.  The answer, IMHO, is that analysts need to engage.  If you're unclear about something, ask.  If you need a reference, ask someone.  There are some great structure references posted on the ForensicsWiki, including those posted by Joachim Metz, but I think that far too few analysts use that site as a resource.  By sharing what we know, and coupling that with what we need to know, we can approach a better method for validating the tools and methodologies that we use.