Monday, May 30, 2016

What's the value of data, and who decides?

A question that I've been wrestling with lately is, for a DFIR analyst, what is the value of data?  Perhaps more importantly, who decides the value of available data?

Is it the client, when they state their goals and what they're looking for from the examination?  Or is it the analyst who interprets both the goals and the data, applying the latter to the former?

Okay, let me take a step back...this isn't the only time I've wrestled with this question. In fact, if you look here, you'll see that this is a question that has popped up in this blog before.  There have been instances over the past almost two decades of doing infosec work that I, and others, have tussled with the question, in one form or another.  And I do think that this is an important question to turn to and discuss time and again, not specifically to seek an answer from one person, but for all of us to share our thoughts and hear what others have to say and offer to the discussion.

Now, back to the question...who determines the relative value of data during an examination?  Let's take a simple example; a client has an image of an employee's laptop (running Windows 7 SP1), and they have a question that they would like answered.  That question could be, "Is/was the system infected with malware?", or "...did the employee perform actions in violation of acceptable use policies?", or "...is there evidence that data (PII, PHI, PFI, etc.) had been exfiltrated from the system?"  The analyst receives the image, and runs through their normal in-processing procedures, and at that point, they have a potential wealth of information available to them; Prefetch files, Registry data, autoruns entries, the contents of various files (hosts, OBJECTS.DATA, qmgr0.dat, etc.), Windows Event Log records, the hibernation file, etc.

Just to be clear...I'm not suggesting an answer to the question.  Rather, I'm putting the question out there for discussion, because I firmly believe that it's important for us, as a profession, to return to this question on a regular basis. Whether we're analyzing individual images, or performing enterprise incident response, I tend to think that sometimes we can get caught up in the work itself, and every now and then it's a good idea to take a moment and do a level set.

Data Interpretation
An issue that I see analysts struggling with is the interpretation of the data that they have available.  A specific example is what is referred to as the Shim Cache data.  Here are a couple of resources that describe what this data is, as well as the value of this data:

Mandiant whitepaper, 2012
Mandiant Presentation, 2013
FireEye blog post, 2015

The issue I've seen analysts at all levels (new examiners, experienced analysts, professional DFIR instructors) struggling with is in the interpretation of this data; specifically, updates to clients (as well as reports of analysis provided to a court of law) will very often refer to the time stamp associated with the data as indicating the date of execution of the resource.  I've seen reports and even heard analysts state that the time stamp associated with a particular entry indicates when that file was executed, even though there is considerable documentation readily available, online and through training courses, that states that this is, in fact, NOT the case.
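To make the distinction concrete, here is a minimal Python sketch, assuming an entry has already been parsed into a path and a 64-bit FILETIME value (the binary layout varies by Windows version and is not handled here).  The function names are hypothetical.  As I read the resources listed above, on most Windows versions the time stamp reflects the file's last modification time, so that is how the sketch labels it.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_utc(filetime):
    """Convert a 64-bit FILETIME integer to a timezone-aware UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def describe_entry(path, filetime):
    """Label the time stamp for what it is, rather than as an execution time.

    Hypothetical helper: 'path' and 'filetime' are assumed to come from an
    already-parsed Shim Cache entry.
    """
    ts = filetime_to_utc(filetime)
    return f"{path}: file last modified {ts:%Y-%m-%d %H:%M:%S} UTC (not an execution time)"
```

Putting the label in the output itself is a small design choice, but it keeps the interpretation attached to the value when it gets pasted into a client update or a report.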

Data interpretation is not simply an issue with this one artifact.  Very often, we'll look at an artifact or indicator in isolation, outside and separate from its context with respect to other data "near" it in some manner.  Doing so can be extremely detrimental, leading an analyst down the wrong road, down a rabbit hole and away from the real issue at hand.

GETALLTHETHINGS
The question then becomes, if we, as a community and a profession, do not have a solid grasp of the value and correct interpretation of the data that we do have available to us now, is it necessarily a good idea to continue adding even more data for which we may not have even a passing understanding?

Lately, there has been considerable discussion of shell items on Windows systems.  Eric's discussed the topic on his BinForay blog, and David Cowen recently conducted a SANS webcast on the topic.  Now, shell items are not a new topic at all...they've been discussed previously within the community, including within this blog (here, and here).  Needless to say, it's been known within the DFIR community for some time that shell items are the building blocks (in part, or in whole) for a number of Windows artifacts, including (but not limited to) Windows shortcut/*.lnk files, Jump Lists, as well as a number of Registry values.

Now, I'm not suggesting that we stop discussing shell items; in fact, I'm suggesting the opposite, that perhaps we don't discuss this stuff nearly enough, as a community or profession.

Circling back to the original premise for this post, how valuable is ALL the data available from shell items?  Yes, we know that when looking at a user's shellbag artifacts, we can potentially see a considerable number of time stamps associated with a particular entry...an MRU time, several DOSDATE time stamps, and maybe even an NTFS MFT sequence number.  All, or most, of this can be available along with a string that provides the path to an accessed resource.  Further, in many cases, this same information can be derived from other data sources that are comprised of shell items, such as Windows shortcut files (and by association, Jump Lists), not to mention a wide range of Registry values.
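As an illustration of what one of those DOSDATE time stamps actually holds, here is a minimal Python sketch that decodes the standard 16-bit DOS date and 16-bit DOS time words.  The function name is hypothetical, and locating the two words inside a given shell item is assumed to have been done already.

```python
from datetime import datetime


def dosdate_to_datetime(date_word, time_word):
    """Decode a 16-bit DOS date and a 16-bit DOS time into a naive datetime.

    Date word: bits 15-9 = years since 1980, bits 8-5 = month, bits 4-0 = day.
    Time word: bits 15-11 = hour, bits 10-5 = minute, bits 4-0 = seconds / 2.
    """
    year = ((date_word >> 9) & 0x7F) + 1980
    month = (date_word >> 5) & 0x0F
    day = date_word & 0x1F
    hour = (time_word >> 11) & 0x1F
    minute = (time_word >> 5) & 0x3F
    second = (time_word & 0x1F) * 2  # stored in two-second units
    # datetime() raises ValueError for impossible values such as month 0.
    return datetime(year, month, day, hour, minute, second)
```

Note that the format only has two-second granularity and carries no time zone information, which is part of what an analyst has to weigh when deciding how much a given time stamp is worth.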

Many analysts have said that they want to see ALL of the available data, and make a decision as to its relative value.  But at what point is ALL the data TOO MUCH data for an analyst?  There has to be some point where the currently available data is not being interpreted correctly, and adding even more misunderstood/misinterpreted data is detrimental to the analyst, to the case, and most importantly, to the client.

Reporting
Let's look at another example; a client comes to you with a Windows server system, says that the system appears to have been infected with ransomware a week prior, and wants to know the source of the infection; how did the ransomware get on the system in the first place?  At this point, you have what the client's looking for, and you also have a time frame on which to focus your examination. During your analysis, you determine the initial infection vector (IIV) for the ransomware, which appeared to have been placed on the system by someone who'd subverted the system's remote access capability.  However, during your examination, you also notice that 9 months prior to the ransomware infection, another bit of malware seemed to have infected the system, possibly due to a user's errant web surfing.  And you also see that about 5 months prior to that, there were possible indications of yet another malware infection of some kind.  However, having occurred over a year ago, the IIV and any impact of the infection is indeterminate.

The question is now, do you provide all of this to the client?  If the client asked a specific question, do you potentially bury that answer in all of your findings?  Perhaps more importantly, when you do share all of your findings with them, do you then bill them for the time it took to get to that point?  What if the client comes back and says, "...we asked you to answer question A, which you did; however, you also answered several other questions that we didn't ask, and we don't feel that we should pay for the time it took to do that analysis, because we didn't ask for it."

If a client asks you a specific question, to determine the access vector of a ransomware infection, do you then proceed to locate and report all of the potential malware infections (generic Trojans, BHOs, etc.) you could find, as well as a list of vulnerable, out-of-date software packages?

Again, I'm not suggesting that any of what I've described is right or wrong; rather, I'm offering this up for discussion.

Wednesday, May 11, 2016

...back in the Old Corps...

I was digging through some boxes recently and ran across a bit of "ancient history"....

MS-DOS, WFW, Win95 diskettes
Ah, diskettes...anyone remember those?  When I was in college, this was how we did networking.  Sneakernet.  Copy the file to the diskette, carry it over to another computer.  Pretty reliable protocol.

I have (3) MS-DOS 6.0 diskettes, MS-DOS 6.22 setup diskettes, (8) diskettes for Windows-for-Workgroups 3.11, and (13) diskettes for Windows 95.

And yes, I still have a diskette drive, one that connects to a system via USB.  It would be interesting to see if I could set up a VM in VirtualBox running any of these systems.

I guess the days of tweaking your autoexec.bat file are long gone.  Sigh.

I did find some interesting sites when I went looking around that purport to provide VirtualBox images:

Kirsle.net (this site refers the reader to downloading the files at Kirsle.net)
Vintage VMs, 386Experience

I wish I still had my OS/2 Warp disks.  I was in grad school when OS/2 Warp 3.0 came out, and I went up to Fry's Electronics in Sunnyvale, CA, and purchased a copy of OS/2 2.1, the box of which had a $15 off coupon for when you purchased version 3.0.  I remember installing it, and running a script provided by one of the CS professors that would optimize the driver loading sequence so that the system booted and ran quicker.  I really liked being able to have multiple windows open doing different things, and the web browser that came with Warp was the first one where you could drag-n-drop images from the browser to the desktop.

Windows, Office on CD
Here's a little bit more history...Windows 95 Plus, and Windows NT 4.0 Workstation, along with a couple of copies of Office.  I also have the CDs for Windows NT 4.0 Server, Windows 2003 Server, and I have (2) copies of Windows XP.

OS/2 Warp 4.52, running in VirtualBox
Oh, and hey...this just happened!  I found a VM someone had uploaded of OS/2 Warp 4.52, and it worked right out of the box (no pun intended...well, maybe just a little...)

Saturday, May 07, 2016

Accessing Historical Information During DF Work

There are a number of times when performing digital forensic analysis work that you may want access to historical information on the system.  That is to say, you'd like to reach a bit further into the past history of the system beyond what's directly available within the image.

Modern Windows systems can contain hidden caches of historical information that can provide an analyst with additional visibility and insight into events that had previously occurred on a system.  Knowing where those caches are and how to access them can make all the difference in your analysis, and knowing how to access them efficiently means that doing so won't add significantly to the time your analysis takes.

System Restore Points
In the course of the analysis work I do, I still see Windows XP and 2003 system images; most often, they'll be images of Windows 2003 Server systems.  Specifically when analyzing XP systems, I've been able to extract Registry hives from Restore Points and get a partial view of how the system "looked" in the past.

One specific example that comes to mind is that during timeline analysis, I found that a Registry key had been modified (i.e., via the key LastWrite time).  Knowing that a number of events could lead to the key being modified, I found the most recent previous version of the hive in the Restore Points, and found that at that time, one of the values wasn't visible beneath the key.  The most logical conclusion was then that the modification of the key LastWrite time was the result of the value (in this case, used for malware persistence) being written to the key.
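The comparison itself is simple once the values beneath the key have been pulled from each copy of the hive.  Here is a minimal Python sketch, assuming each version of the key has already been read into a dictionary of value names to data; the function name and the sample values are hypothetical.

```python
def values_added(older, newer):
    """Return the names of values present in the newer snapshot of a key
    but absent from the older one."""
    return sorted(set(newer) - set(older))


# Hypothetical contents of the same key, from a Restore Point hive and
# from the hive in the image.
restore_point = {"VMware Tools": r"C:\Program Files\VMware\vmtoolsd.exe"}
current = {
    "VMware Tools": r"C:\Program Files\VMware\vmtoolsd.exe",
    "updater": r"C:\Users\Public\upd.exe",
}

print(values_added(restore_point, current))
```

A value that shows up only in the newer copy is a reasonable candidate for what caused the LastWrite time to change, but it's still only a candidate; other changes to the key between the two snapshots can't be ruled out by this check alone.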

The great thing is that Windows actually maintains an easily-parsed log of Restore Points that were created, which include the date, as well as the reason for the RP being created.  Along with the reasons that Microsoft provides for RPs being created, these logs can provide some much-needed context to your analysis.

RegBack Folder
Beginning with Vista, a number of system processes that ran as services were moved over to Scheduled Tasks that were part of the default installation of Windows.  The task of interest here is named "RegIdleBackup"; it is scheduled to run every 10 days, and creates backup copies of the hives in the system32\config folder, placing those copies in the system32\config\RegBack folder.
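A quick way to see how far back the RegBack copies reach is to compare their last-modified times against those of the live hives.  Here is a minimal Python sketch, assuming the image is mounted read-only and that the path to the config folder is passed in; the function names are hypothetical, and the hive file names may differ in case depending on how the image is mounted.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hive files normally found in system32\config (names assumed upper case).
HIVES = ("SAM", "SECURITY", "SOFTWARE", "SYSTEM", "DEFAULT")


def _mtime(path):
    """Return a file's last-modified time as a UTC datetime."""
    return datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)


def regback_pairs(config_dir):
    """Yield (hive name, live mtime, backup mtime) for each hive that has
    a copy in the RegBack subfolder of config_dir."""
    config = Path(config_dir)
    for name in HIVES:
        live = config / name
        backup = config / "RegBack" / name
        if live.is_file() and backup.is_file():
            yield name, _mtime(live), _mtime(backup)
```

Whether the file system times survive intact depends on how the image was mounted, so treat the output as a pointer to which copies are worth extracting rather than as a finding in itself.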

VSCs
The images I work with tend to be from corporations, and in a great many instances, Volume Shadow Copies are not enabled on the systems.  Some of the systems are virtual machines; others are images taken from servers or employee laptops.  However, every now and then I do find a system image with difference files available, and it is sometimes fruitful to investigate the extent to which historical information may be available.

Now, the Windows Forensic Analysis books have an entire chapter that details tools and methods that can be used to access VSCs, and I've used the information in those chapters time and time again.  Like I mentioned in a previous post, one of the reasons I write the books is so that I have a reference; there are a number of analysis tasks I'll perform, the first step of which is to pull one of the books off my shelf.  As an update to the information in the books, and many thanks to David Cowen for sharing this with me, I've used libvshadow to access VSC metadata and folders when other methods didn't work.

What can be found in a VSC is really pretty amazing...which is probably why a lot of threat actors and malware (ransomware) will disable and delete VSCs as part of their process.

Hibernation File
A while back, I was working on an issue where we knew a system had been infected with a remote access Trojan (RAT).  What initially got our client's attention was network alerts illustrating that the RAT was "phoning home" from this system.  Once we received an image of the system, we found very little to indicate the presence of the RAT on the system.

However, the system was a laptop, and the image contained a hibernation file.  Our analysis, along with the network alerts, provided us with an indication of when the RAT had been installed on the system, and the hibernation file had been created after that time, but before the system had been imaged. Using Volatility, we were able to not just see that the RAT had been running on the system; we were able to get the start time of the process, extract a copy of the RAT's executable image from memory, locate the persistence mechanism in the System hive extracted from memory, etc.

Remember, the hibernation file is a snapshot of the running system at a point in time, much like a photograph that your mom took of you on your first day of school.  It's frozen in time, and can contain an incredible wealth of information, such as running processes, executable images, Registry keys/values, etc.  If the hibernation file was last modified during the window of compromise, or anywhere within the time frame of the incident you're investigating, you may very well find some extremely valuable information to help add context to your examination.

Windows.Old Folder
Not long ago, I went ahead and updated my personal laptop from Windows 7 to Windows 10.  Once the update was complete, I ended up with a folder named "Windows.old".  As I ran through the subfolders, reviewing the files available within each, I found that I had Registry hives (in the system32\config folder, RegBack folder, and user folders), Windows Event Log files, a recentfilecache.bcf file, etc.  There was a veritable treasure trove of historical information about the system just sitting there, and the great thing was that it was all from Windows 7!  Whenever I come out with a new book, the first question people ask is, "...does it cover Windows ?"  Well, if that's a concern, when you find a Windows.Old folder, it's a previous version of Windows, so everything you knew about Windows still applies.

Deleted Keys/Values
Another area of analysis that I've found useful time and time again is to look within the unallocated space of Registry hive files themselves for deleted keys and values.  Much like a deleted file or record of some kind, keys and values deleted from the Registry will persist within the unallocated space within the hive file itself until that space is reclaimed and the information is overwritten.
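To show where that unallocated space lives, here is a minimal Python sketch that walks the cells in a single hive bin ("hbin") and reports the free ones.  It relies on the publicly documented hive layout: a 0x20-byte hbin header with the bin size at offset 8, followed by cells that each begin with a signed 32-bit size, negative for allocated cells and positive for free ones.  The function name is hypothetical, and carving actual key or value records out of the free cells is left out.

```python
import struct

HBIN_HEADER_SIZE = 0x20  # cells begin immediately after the hbin header


def free_cells(hbin):
    """Yield (offset, size) for each unallocated cell in one hbin.

    'hbin' is the raw bytes of a single hive bin, beginning with the
    b"hbin" signature.  Offsets are relative to the start of the bin.
    """
    if hbin[:4] != b"hbin":
        raise ValueError("data does not start with an hbin signature")
    hbin_size = struct.unpack_from("<I", hbin, 8)[0]
    end = min(hbin_size, len(hbin))
    offset = HBIN_HEADER_SIZE
    while offset + 4 <= end:
        size = struct.unpack_from("<i", hbin, offset)[0]
        if size == 0:
            break  # malformed or zero-filled data; stop rather than loop forever
        if size > 0:
            yield offset, size  # positive size marks a free (unallocated) cell
        offset += abs(size)
```

From there, each free cell can be searched for remnants of key and value records, which is the part that the dedicated recovery tools handle.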

Want to find out more about this subject?  Check out this book...seriously.  It covers what happens when keys and values are deleted, where they go, and tools you can use to recover them.

Wednesday, May 04, 2016

Updates

RegRipper Plugins
I don't often get requests on GitHub for modifications to RegRipper, but I got one recently that was very interesting. Duckexmachina said that they'd run log2timeline and found entries in one ControlSet within the System hive that weren't in the one marked as "current", and as a result, those entries were not listed by the appcompatcache.pl plugin.

As such, as a test, I wrote shimcache.pl, which accesses all available ControlSets within the System hive, and displays the entries listed.  In the limited testing I've done with the new plugin, I haven't yet found differences in the AppCompatCache entries in the available ControlSets; in the few System hives that I have available for testing, the LastWrite times for the keys in the available ControlSets have been identical.
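The idea of walking every ControlSet, rather than only the one marked as current, can be sketched in a few lines.  This is a Python illustration of the approach, not the plugin's actual Perl code; it assumes the subkey names at the root of the System hive and the "Current" value from the Select key have already been read, and the function names are hypothetical.

```python
import re

# ControlSet keys at the root of the System hive are named ControlSet001,
# ControlSet002, and so on.
CONTROLSET_RE = re.compile(r"^ControlSet\d{3}$", re.IGNORECASE)


def control_sets(root_subkeys):
    """Return every ControlSetNNN key name found among the root subkeys."""
    return sorted(name for name in root_subkeys if CONTROLSET_RE.match(name))


def current_control_set(select_current):
    """Map the Select key's 'Current' value (an integer) to its key name."""
    return f"ControlSet{select_current:03d}"


subkeys = ["ControlSet001", "ControlSet002", "MountedDevices", "Select", "Setup"]
for name in control_sets(subkeys):
    marker = " (current)" if name == current_control_set(1) else ""
    print(f"{name}\\Control\\Session Manager\\AppCompatCache{marker}")
```

Flagging which ControlSet is current keeps the extra output from being mistaken for duplicate entries.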

As you can see in the below timeline excerpt, the AppCompatCache keys in both ControlSets appear to be written at shutdown:

Tue Mar 22 04:02:49 2016 Z
  FILE                       - .A.. [107479040] C:\Windows\System32\config\SOFTWARE
  FILE                       - .A.. [262144] C:\Windows\ServiceProfiles\NetworkService\NTUSER.DAT
  FILE                       - .A.. [262144] C:\Windows\System32\config\SECURITY
  FILE                       - .A.. [262144] C:\Windows\ServiceProfiles\LocalService\NTUSER.DAT
  FILE                       - .A.. [18087936] C:\System Volume Information\Syscache.hve
  REG                        - M... HKLM/System/ControlSet002/Control/Session Manager/AppCompatCache 
  REG                        - M... HKLM/System/ControlSet001/Control/Session Manager/AppCompatCache 
  FILE                       - .A.. [262144] C:\Windows\System32\config\SAM
  FILE                       - .A.. [14942208] C:\Windows\System32\config\SYSTEM

Now there may be instances where this is not the case, but for the most part, what you see in the above timeline excerpt is what I tend to see in the recent timelines I've created.

I'll go ahead and leave the shimcache.pl plugin as part of the distribution, and see how folks use it.  I'm not sure that adding the capability of parsing all available ControlSets is something that is necessary or even useful for all plugins that parse the System hive.  If I need to see something from a historical perspective within the System hive, I'll either go to the RegBack folder and extract the copy of the hive stored there, or access any Volume Shadow Copies that may be available.

Tools
MS has updated their Sysmon tool to version 4.0.  There's also this great presentation from Mark Russinovich that discusses how the tool can be used in an infrastructure.  It's well worth the time to go through it.

Books
A quick update to my last blog post about writing books...every now and then (and it's not very often), when someone asks if a book is going to address "what's new" in an operating system, I'll find someone who will actually be able to add some detail to the request.  For example, the question may be about new functionality in the operating system, such as Cortana, Continuum, new browsers (Edge, Spartan), new search functionality, etc., and the artifacts left on the system and in the Registry through their use.

These are all great questions, but something that isn't readily apparent to most folks is that I'm not a testing facility or company.  I'm one guy.  I do not have access to devices such as a Windows phone,  a Surface device, etc.  I'm writing this blog post using a Dell Latitude E6510...I don't have a touch screen device available to test functionality such as...well...the touch screen, a digital assistant, etc.  I don't have access to a Windows phone.

RegRipper is open source and free.  As some are aware, I end up giving a lot of the new books away.  I don't have access to a device that runs Windows 10 and has a touch screen, or can run Cortana.  I don't have access to MSDN to download and test new versions of Windows, MSOffice, etc.

Would I like to include those sorts of artifacts as part of RegRipper, or in a book?  Yes, I would...I think it would be really cool.  But instead of asking, "...does it cover...", ask yourself instead, "what am I willing to contribute?"  It could be devices for testing, or the data extracted from said devices, along with a description of the testing performed, etc.  I do what I can with the resources I have available, folks.

Analysis
I was pointed to this site recently, which begins a discussion of a technique for finding unknown malware on Windows systems.  The page is described as "part 1 of 5", and after reading through it, while I think that it's a good idea to have things like this available to DFIR analysts, I don't agree with the process itself.

Here's why...I don't agree that long-running processes (hash computation/comparison, carving unallocated space, AV scans, etc.) are the first things that should be done when kicking off analysis.  There is plenty of analysis that can be conducted in parallel while those processes are running, and the necessary data for that analysis should be extracted first.

Analysis should start with identified, discrete goals.  After all, imaging and analyzing a system can be an expensive (in terms of time, money, staffing resources, etc.) process, so you want to have a reason for going down this road.  "Find all the bad stuff" is not a goal; what constitutes "bad" in the context of the environment in which the system exists?  Is the user a pen tester, or do they find vulnerabilities and write exploits?  If so, "bad" takes on an entirely new context. When tasked with finding unknown malware, the first question should be, what leads us to believe that this system has malware on it?  I mean, honestly, when a sysadmin or IT director walks into their office in the morning, do they have a listing of systems on the wall and just throw a dart at it, and whichever system the dart lands on suddenly has malware on it?  No, that's not the case at all...there's usually something (unusual activity, process performance degradation, etc.) that leads someone to believe that there's malware on a system.  And usually when these things are noticed, they're noticed at a particular time.  Getting that information can help narrow down the search, and as such should be documented before kicking off analysis.

Once the analysis goals are documented, we have to remember that malware must execute in order to do damage.  Well, that is...in most cases.  As such, what we'd initially want to focus on is artifacts of process execution, and from there look for artifacts of malware on the system.

Something I discussed with another analyst recently is that I love analyzing Windows systems because the OS itself will very often record artifacts as the malware interacts with its ecosystem.  Some malware creates files and Registry keys/values, and this functionality can be found within the code of the malware itself.  However, as some malware executes, there are events that may be recorded by the operating system that are not part of the malware code.  It's like dropping a rock in a pond...there's nothing about the rock, in and of itself, that requires that ripples be produced; rather, this is something that the pond does as a reaction to the rock interacting with it.  The same can very often be true with Windows systems and malware (or a dedicated adversary).

That being said, I'll look forward to reading the remaining four blog posts in the series.

Monday, May 02, 2016

Thoughts on Books and Book Writing

The new book has been out for a couple of weeks now, and already there are two customer reviews (many thanks to Daniel Garcia and Amazon Customer for their reviews).  Daniel also wrote a more extensive review of the book on his blog, found here.  Daniel, thanks for the extensive work in reading and then writing about the book, I greatly appreciate it.

Here's my take on what the book covers...not a review, just a description of the book itself for those who may have questions.

Does it cover ... ?
One question I get every time a book is released is, "Does it cover changes to ?"  I got that with all of the Windows Forensic Analysis books, and I got it when the first edition of this book was released ("Does it cover changes in Windows 7?").  In fact, I got that question from someone at a conference I was speaking at recently.  I thought that was pretty odd, as most often these questions are posted to public forums, and I don't see them.  As such, I thought I'd try to address the question here, so that maybe people could see my reasoning, and ask questions that way.

What I try to do with the books is address an analysis process, and perhaps show different ways that Registry data can be incorporated into the overall analysis plan.  Here's a really good example of how incorporating Registry data into an analysis process worked out FTW.  But that's just one, and a recent one...the book is full of other examples of how I've incorporated Registry data into an examination, and how doing so has been extremely valuable.

One of the things I wanted to do with this book was not just talk about how I have used Registry data in my analysis, but illustrate how others have done so, as well.  As such, I set up a contest, asking people to send me short write-ups regarding how they've used Registry analysis in their case work.  I thought it would be great to get different perspectives, and illustrate how others across the industry were doing this sort of work.  I got a single submission.

My point is simply this...there really is no suitable forum (online, book, etc.) or means by which to address every change that can occur in the Registry.  I'm not just talking about between versions of Windows...sometimes, it's simply the passage of time that leads to some change creeping into the operating system.  For example, take this blog post that's less than a year old...Yogesh found a value beneath a Registry key that contains the SSID of a wireless network.  With the operating system alone, there will be changes along the way, possibly a great many.  Add to that applications, and you'll get a whole new level of expansion...so how would that be maintained?  As a list?  Where would it be maintained?

As such, what I've tried to do in the book is share some thoughts on artifact categories and the analysis process, in hopes that the analysis process itself would cast a wide enough net to pick up things that may have changed between versions of Windows, or simply not been discussed (or not discussed at great length) previously.

Book Writing
Sometimes, I think about why I write books; what's my reason or motivation for writing the books that I write?  I ask this question of myself, usually when starting a new book, or following a break after finishing a book.

I guess the biggest reason is that when I first started looking around for resources that covered DFIR work and topics specific to Windows systems, there really weren't any...at least, not any that I wanted to use/own.  Some of those that were available were very general, and with few exceptions, you could replace "Windows" with "Linux" and have the same book.  As such, I set out to write a book that I wanted to use, something I would refer to...and specifically with respect to the Windows Registry Forensics books, I still do.  In fact, almost everything that remained the same between the two editions did so because I still use it, and find it to be extremely valuable reference material.

So, while I wish that those interested in something particular in a book, like covering "changes to the Registry in ", would describe the changes that they're referring to before the book goes to the publisher, that simply hasn't been the case.  I have reached out to the community because I honestly believe that folks have good ideas, and that a book that includes something one person finds interesting will surely be of interest to someone else.  However, the result has been...well, you know where I'm going with this.  Regardless, as long as I have ideas and feel like writing, I will.