Monday, March 19, 2018

DFIR Brain Droppings

Live Response
It's been a while since I posted anything on the topic of live response, but I recently ran across something that really needed to be shared as widely as possible.

Specifically, Hadar Yudovich recently authored an article on the Illusive Networks blog about finding time stamps associated with network connections. His blog post is pretty fascinating, as he says some things that are probably true for all of us; in particular, we'll see a native tool (such as netstat.exe), and assume that the data that the tool presents is all that there is.  We simply don't remember that MS did not create an operating system with DFIR in mind.  However, Hadar demonstrates that there is a way to get time stamps for network connections on Windows systems, and wrote a Powershell script to do exactly that.

I downloaded and ran the Powershell script from a command prompt (not "Run as administrator") using the following command line:

powershell -ExecutionPolicy Bypass .\Get-NetworkConnections.ps1

The output is in a table format, but for anyone familiar with Powershell, I'm sure that it wouldn't be hard to modify the output to CSV, or some other format.  Either way, this would be a great addition to any volatile data collection script.

This is pretty cool stuff, and I can see it being included in volatile data collection processes going forward.

Processes and Procedures
Speaking of collecting volatile data...

A couple of things I've seen during my time as an incident responder is (a) a desire within the community for the "latest and greatest" volatile data collection script/methodology, and (b) a marked reticence to document processes and procedures.  I mention these two specifically because from my perspective, they seem to be diametrically opposed; after all, what is a volatile data collection script but a documented process?

Perhaps the argument I've heard against documented processes over the years is that having them "stifles creativity".  My experience has been the documenting my analysis process for activities such as malware detection within an acquired image, I'm able to ultimately spend more time on the fun and interesting aspects of analysis.  Why is that?  Well, for me, a documented process is a living document, one that is continually used and updated as necessary.  Also, the documented process serves as a means for automation.  As such, as I learn something new, I add it to the process, so that I don't forget something..God knows that most days, I can't remember what I had for breakfast, so how am I going to remember something that I read (didn't find or do as part of my own analysis) six months ago?  The simple fact is that I don't know everything, and I haven't seen everything, but I can take those things I have seen, as well as what others have seen (culled together from blog posts, etc.) and incorporate them into my process.  Having the process automated means that I spend less time doing those things that can be automated, and more time actually investigating those things that I need to be...well...investigating.

An example of this is eventmap.txt; I did not actually work the engagement where the event source/ID pair for "Microsoft-Windows-TaskScheduler/709" event record was observed.  However, based on what it means, and the context surrounding the event being recorded, it was most definitely something I wanted to incorporate into my analysis process.  Even if I never see the event again in the next 999 cases I work, that 1000th case where I do see it will make it worth the effort to document it.

Documenting processes and procedures for many is a funny thing.  Not long ago, with a previous employer, I was asked to draft analysis processes and workflows, because other analysts said that they didn't "have the credibility".  Interestingly enough, some of those who said this are now active bloggers.  Further, after I drafted the workflows, they were neither reviewed, nor actually used.  However, I found (and still find) a great deal of value in having a documented process or workflow, and I continue to use and develop my own.

All this talk of processes and workflows logically leads to questions of, where do I get the data and information that I turn into intelligence and incorporate into my workflows?  Well, for the most part, I tend to get the most intel from cases I work.  What better way to go from GB or TB of raw data to a few KB of actual information, with the context to turn that information into intelligence, than to do so working actual DFIR cases?  Ultimately, that's where it all starts, right?

So, we can learn from our own cases, but we can also learn from what others have learned and shared.  Ah, that's the key, though, isn't it...sharing the information and intelligence.  If I learn something, and keep it to myself, what good is it?  Does it mean that I have some sort of "power"?  Not at all; in fact, it's quite the opposite.  However, if I share that information and/or intelligence with others, then you get questions and different perspectives, which allows us to develop and sharpen that intelligence.  Then someone else can use that intelligence to facilitate their analysis, and perhaps include additional data sources, extending the depth and value of the intelligence. As such, pursuing OSINT sources is a great way to not only further develop your own intel, but to develop indicators that you can then use to further your analysis, and by extension, furthering your intel. 

This recent FireEye blog post is a great example (it's one, there are many others) of OSINT material.  For example, look at the reference to the credential theft tool, HomeFry.  This is used in conjunction with other tools; that alone is pretty powerful.  In the past, we've seen a variety of sources say things like, " X uses Y and Z tools..." without any indication as to how the tools are used.  In one of my own cases, it was clear that the adversary used Trojan A to get on an initial jump system, and from there, installed Trojan B, and the used that as a conduit to push our Trojan B to other systems.   So, it's useful to know that certain tools are used, but yes, those tools are easily changed.  Knowing how the tools are used is even more valuable. There's also a reference to lure documents that exploit the CVE-2017-11882 vulnerability; here's the PaloAlto Networks analysis of the exploit in the wild (they state that they skipped some of the OLE metadata...), and here are some apparent PoC exploits.

This write-up from ForcePoint is also very informative.  I know a lot of folks in the industry would look at the write-up and say, "yeah, so the malware persists via the user's Run key...big deal."  Yeah, it IS a big deal.  Why?  Because it still works.  You may have seen the use of the Run key for persistence of years, and may think it's passe, but what does it say about the security industry as a whole that this still works, and that this persistence mechanism is found through response post-mortem?

Here's a fascinating write-up on the QWERTY ransomware from BleepingComputer.  Some of what I thought was fascinating about it is the use of batch files and native tools...see, the bad guys automate their stuff, so maybe we should, too...right?  Part of what the ransomware does is use two different commands to delete volume shadow copies, and uses wbadmin to delete backups.  So how is this information valuable?  Well, do you use any of these tools in your organization?  If not, I'd definitely enable filters/alerts for their use in an EDR framework.  If you do use these tools, do you use them in the same way as indicated in the write-up?  No?  Well, have alerts you can add to your EDR framework.

Here's another very interesting blog post from Carbon Black, in part describing the use of MSOffice doc macros.  I really like the fact that they used Didier's to list the streams and extract the VBS code.  Another interesting aspect of the post is something we tend to see a great deal of from the folks at CarbonBlack, and that's the use of process trees to illustrate the context behind one suspicious or apparently malicious process.  Where did it come from?  Is this something we see a great deal of?  For example, in image 7, we see that wmiprvse.exe spawns certutil.exe, which spawns conhost.exe; is this something that we'd want to create an alert for?  If we do and run it in testing mode, or search the EDR data that we already have available, do we see a great deal of this?  If not, maybe we'd want to put that alert into production mode.

Recently, US CERT published an alert, TA18-074A, providing intel regarding specific threat actors.  As an example of what can be done with this sort of information and intelligence, Florian Roth published some revised Yara rules associated with the alert.

Something I saw in the alert that would be useful for both threat hunters and DFIR (i.e., dead box) analysis is the code snippets seen in section 2 of the alert.  Specifically, modified PHP code appears as follows:

img src="file[:]//[.]dd/main_logo.png" style="height: 1px; width: 1px;" /

Even knowing that the IP address and file name could change, how hard would it be to create a Yara rule that looks for elements of that line, such as "img src" AND "file://"?  More importantly, how effective would the rule be?  Would it be a high fidelity rule, in your environment?  I think that the answer to the last question depends on your perspective.  For example, if you're threat hunting within your own environment, and you know that (a) you have PHP code, and (b) your organization does NOT use "img src" in any of your code, then this would be pretty effective.

Something else that threat hunters and DFIR analysts alike may want to consider adding to their analysis processes is from the section of the alert that talks about the adversary's use of modified LNK files to collect credentials from systems.  Parsing LNK files and looking for "icon filename" paths to remote systems might not seem like something you'd see a great deal of, but I can guarantee you that it'll be worth the effort the first time you actually do find something like that.

Side Note: I updated my lnk parsing tool (as well as the source files) to display the icon filename path; the tool would already collect it, it just wasn't displaying it.  It does now. 

If you're looking into threat feeds as a "check in the compliance box", then none of this really matters.  But if you're looking to really develop a hunting capability, either during DFIR/"dead box" analysis, or on an enterprise-wide scale, the real value of threat intel is only realized in the context of your infrastructure, and operational policies. 

Thursday, March 15, 2018

DFIR Questions, How-Tos...

Not long ago, I finished up the content of my latest book, Investigating Windows Systems, and got it all shipped off to the publisher.  The purpose of this book is to go beyond my previous books; rather than listing artifacts and mentioning ways they can be used, I wanted to walk through examinations, using CTF and forensic challenge images that are available online.

A short-coming of this approach is that it leaves a lot of topics not addressed, or perhaps not as fully addressed as they could be.  For example, of the images I used in writing my book, there were no business email compromises, and little in the way of lateral movement, etc.  There was some analysis of user activity, but for the most part, it was limited.

Back in July 2013, I had some time available, and I wrote up about a dozen "How To" blog posts covering various Windows DFIR topics.  What I've thought might be of value to the community is to go back to those "How To" posts, expand and extend them a bit, add coverage for Windows 10, and include them in a book.

My question to the community at large is this...what are some of the topics that should be addressed, beyond those I blogged about almost 5 years ago? 

Now, when considering these questions, or opportunities for "How To" chapters, please understand that I may not be able to address all of them.  For example, I've never conducted a business email compromise (BEC) I've pointed out before, even in just over two decades of DFIR consulting, I haven't seen everything, and I don't know everything.  I also do not have access to an AD environment. 

Even so, I'd still appreciate your input, because some of the answers and thoughts I can provide may serve as building blocks for larger solutions. 

So, again...what are some DFIR analysis topics, specific to Windows systems, that provide good opportunities for "just in time" training via "How To" articles or documents?


Monday, March 12, 2018

New and Updated Plugins, Other Items

BAM Key and Process Execution, Updated Plugins
Recently, blog posts describing the "BAM Key" and it's viability as a process execution artifact began to appear (port139 blog, padawan-4n6 blog).  The potential for this key was previously mentioned last summer by Alex Ionescu, and this began to come to light as of Feb, 2018.  As such, I wrote two plugins for parsing this data, and

There's also been a good bit of testing going on and being shared with respect to when data in Registry transaction logs is committed to the hive files themselves.  In addition to the previously linked blog, there's also this blog post.

I recently ran across a couple of System hive files from Windows 10 systems for which the AppCompatCache data was not parsing correctly.  As such, I updated the and plugins, as well as their corresponding * variants.

Finally, Eric documented changes to the AmCache.hve file, but until recently, I hadn't seen any updated hive files.  Thankfully, Ali Hadi was kind enough to share some hives for testing, and I updated the and plugins accordingly. However, these updates were solely to address the process execution artifacts, and I have not yet updated them to include the device data that Eric pointed out in his blog post.

NOTE: Yes, I uploaded these plugins (a total of 8) to the GitHub repository.  Again, I do not modify the RegRipper profiles when I do so, so if you want to incorporate the plugins in your processes, you'll need to open Notepad and update the profiles yourself.

Part of the reason I made these updates is to perform testing of the various artifacts, specifically to see where the BAM key data falls out in the spectrum of process execution artifacts. 

The data that Ali shared with me included the AmCache.hve, System hive (as well as other hives from the config folder). and the user's NTUSER.DAT hive.  Using these sources, I created a micro-timeline using the AmCache process execution data, AppCompatCache data, BAM key data, and the user's UserAssist data, and the results were really quite fascinating. 

Here's an example of some of the data that I observed.  I created my overall timeline, and then picked out one entry (for time2.exe) for a closer look.  I used the type command to isolate just the events I wanted from the events file, and created the below timeline:

Thu Feb 15 16:49:16 2018 Z
  AmCache      - Key LastWrite - f:\time2.exe (46f0f39db5c9cdc5fe123807bb356c87eb08c48e)

Thu Feb 15 16:49:14 2018 Z
  BAM             - \Device\HarddiskVolume4\Time2.exe (S-1-5-21-441239525-4047580167-3361022386-1001)

Thu Feb 15 16:49:10 2018 Z
  REG             forensics - [Program Execution] UserAssist - C:\Users\forensics\Desktop\Time2 - Shortcut.lnk (1)
  REG             forensics - [Program Execution] UserAssist - F:\Time2.exe (1)

Mon Nov  2 22:20:14 2009 Z
  REG             - M... AppCompatCache - F:\Time2.exe

This is pretty fascinating stuff.  We know the context of the time stamps for the AppCompatCache data; even though we understand the data to be populated as a result of process execution events, the time stamp associated with the data is the file system last modification time (specifically, from the $STANDARD_INFORMATION attribute).  The UserAssist data illustrates the user launching the application, and in the following 6 seconds, the BAM and AmCache entries are created.

Keep in mind that this is just one example, from one test.  In the data I've got, not all entries in the AppCompatCache data have corresponding entries in the BAM key.  For example, there's a file called "slacker.exe" that appears in the AppCompatCache and AmCache data, but there doesn't seem to be an entry in the BAM key. 

Over on the Troy 4n6 blog, Troy has a couple of great comments on testing.  Yes, they're specific to the P2P cases he mentions in the blog, but they're also true throughout the rest of the DFIR community.

Processes and Checklists
Something I've done over the years is develop and maintain processes and checklists for the various types of analysis work I've done.  For example, as an incident responder, I've received a lot of those "...we think this system may have been infected with malware..." cases over the years, and what I've done is maintained processes and checklists for conducting this sort of analysis. 

Now, there's a school of thought within the DFIR community that follows the belief that having defined and documented processes stifles the creativity and innovation of the individual analyst. I wholeheartedly my experience, having a documented process means I don't forget things, and it leads directly to automating the process, so that I spend less time sifting through data and more time conducting actual analysis. 

Further, maintaining a documented process does not mean that the process is set in stone once it's written; instead, it's a living document that continues to be updated and developed as new things are learned.  As MS has developed not just new operating systems, but updated the currently available OSs, new artifacts (see the BAM key mentioned above) have been discovered.  And this is solely with respect to the OS, and doesn't take new applications (or new versions of current applications) into account.  Maintaining an ever-expanding list of Windows artifacts is neither as useful nor as viable as maintaining documented processes that illustrate how those artifacts are used in an investigation, so having documented processes is a key component of providing comprehensive and accurate analysis in a timely manner.

Word Metadata Verification
Phill's got a blog post up on documenting Word doc metadata bugs.  My take-aways from his post are:

1. Someone asked him for assistance in verifying what was thought to be a bug in a tool.  This isn't to point out that the first thing that was blamed was the tool...not at all.  Rather, it's to point out that someone said, hey, I'm seeing something odd, I'll see if others are seeing the same thing.  It isn't the first step I'd take, but it was still a good call. 

2. Phill documented and shared his testing methodology.  As someone who's written open source tools, I've gotten a lot of "...the tool doesn't work..." over the years, without any information regarding what was done, or how "doesn't work" was reached.  Sharing what you did and saw in a concise manner isn't an's necessary, and it allows others to have a baseline for further testing.

3. Phill stated, "...I decided I was being lazy by not actually looking for the word count in the docx file itself...".  Sometimes, the easiest approach escapes us completely.  I've seen/done the same thing myself, and I've gotten to the point where, if I get odd errors from RegRipper or Volatility, I'll open the target file in a hex editor to make sure it's not all zeroes (yes, that has happened).  I guess that's the benefit of being familiar with file formats; like Phill said, he opened up the .docx file he was using for testing in an archive tool and pulled out what he needed.

Sunday, March 11, 2018

Creating Tools and Solving Problems

I received a request recently for a blog post on a specific topic via LinkedIn; the request looked like this:

Have you blogged or written about creating your own tools, such as you do, from a beginner's standpoint? I am interested in learning more about how to do this.

A bit of follow-up revealed a bit more information behind the request:

I teach a DFIR class at a local university and would like to incorporate this into the class.

I don't often get requests, and this one seemed kind of interesting to me anyway, so I thought I'd take a shot at it.

To begin with, I did write a section of a blog post entitled "Why do I write my own tools?"  The post is just a bit more than three years old, and while the comments were short, they still apply today.

The Why?
Why do I write my own tools?  As I mentioned in my previous post, it helps me to understand the data itself much better, and as such, I understand the context and usage of the data much better, as well.

Another reason to write my own tools is to manage the data in a manner that best suits my analysis needs.  RegRipper started that way...well, that and the need for automation.  Over time, I've continued to put a great deal of thought into my analysis process and why I do the things I do, and why I do them the way I do them.  This is, in part, where my five-field TLN format came from, and it still holds up as an extremely successful methodology.

The timeline creation and analysis methodology has proven to be extremely successful in testing, as well.  For example, there was this blog post (not the first, and it won't be the last) that discusses the BAM key.  Again, this isn't the first blog post on the topic, and speaking with the author of the post recently, face-to-face (albeit through a translator), it was clear that he'd found some discrepancies in previously-posted findings regarding when the key is updated.  So, someone is enthusiastically focusing their efforts in determining the nature of the key contents, and as such, I have opted to focus my analysis on the context of the data, with respect to other process execution data (AmCache.hve, UserAssist, AppCompatCache/ShimCache, etc.)  In order to do this, I'd like to see all of the data sources normalized to a common format (TLN) so that I can look at them side-by-side, and the only way I'm going to do that is to write my own tools.  In fact, I have...I have a number of RegRipper plugins that I can use to parse this information out into TLN format, add Windows Event Log data to the Registry data, and boo-yah!  There it is.

Another advantage of writing my own tools is that I get to deal directly with the data itself, and in most cases, I don't have to go through an API call.  This is how I ended up writing the Event Log/*.evt file parser, and from that, went on to write a carving tool to look for individual records.  Microsoft has some really clear and concise information about the various structures associated with EVT records, making it really easy to write tools.  Oh, and if you think that's not useful anymore, remember the NotPetya stuff last year (summer, 2017)?  I used the tool I wrote to carve unallocated space for EVT records when a Win2003 server got hit.  You never know when something's going to be useful like that.

The How
How do I write my own tools?  That's a good it more about the process itself, or the thought process behind writing a tool.  Well, as I learned early in my military career, "it depends".

First, there are some basics you need to understand, of course...such as endianness.  There is also how to recognize, parse, and translate binary data into something useful.  I usually start out with a hex editor, and I've gotten to the point where I not only recognize 64-bit FILETIME time stamps in binary data, but specifically with respect to shellbags, I've gotten to the point where I recognize patterns that end up being GUIDs.  It's like the line from The Matrix..."all is see are blondes, brunettes, and redheads."

I start by understanding the structures within data, either by following a programming specification (MS has a number of good ones), or some other format definition.  Many times, I'll start with a hex editor, or a bit of code to dump arbitrary-length binary data to hex format, print it out, and go nuts with highlighters.  For Registry stuff, I started by using Peter Nordahl's offline password editing tool header files to understand the structure of the various cells available within the Registry.  When the Parse::Win32Registry Perl module came along, I used that for accessing the various cells, and was able to shift my focus to identifying patterns in binary data types within values, as well as determining the context of data through the use of testing and timelining.  For OLE files, I started with the MS-CFB definition, and like I said, MS maintains some really good info on Event Log structures.

The upshot of this is that I have a better understanding than most, regarding some of the various data types, particularly those that include or present time stamps. There are a lot of researchers who put effort into understanding the specific actions that cause an artifact to be created or modified, but I think it's also important to understand the time format itself.  For example, FILETIME objects are 64-bit time stamps with a granularity of 100 nanoseconds, where a DOSDate time stamp (embedded within many shell item artifacts) has a granularity of 2 seconds.  The 128-bit SYSTEMTIME structure only has a granularity of one second, similar to the Unix epoch time.

In addition to the understanding of time stamp formats, I've also found a good number of time stamps where most folks don't know they exist.  For example, when analyzing Word .doc files, the 'directories' within the OLE structure have time stamps associated with them, and tying that information to other document metadata, compile time stamps for executables found in the same campaign, etc., can all be used to develop a better understanding of the adversary.

Something else that can be valuable if you understand it is the metadata available within LNK files sent by the adversary as an attachment.  Normally, LNK files may be created on a victim system as the result of an installation process, but when the adversary includes an LNK file as an attachment, then you've got information about the adversary's system available to you, and all it takes to unlock that information is an understanding of the structure of the LNK files, which are composed, in part, of shell items.

Things To Consider
Don't get hung up on the programming language.  I started teaching myself Perl a long time ago, in part to assist some network engineering guys.  However, I later learned that at the time, Perl was the only language that had the necessary capability to access the data I needed to access (i.e., live Windows systems).  Perl later remained unique in that manner when it came to "dead box" and file analysis.  Over time, that changed as Python caught up.  Now, you can use languages like Go to parse Windows Event Logs.  Oh, and you can still do a lot with batch files.  So, don't get hung up on which language is "best"; the simple answer is, "the one that works for you", and everything else is just a distraction.

This isn't just about writing tools to get to the data, so that I can perform analysis.  One of the things I'm particular about is developing intelligence from the work I do, learning new things and incorporating or "baking it into" my tools and processes.  This is why I have the eventmap.txt file as part of my process for parsing Windows Event Logs (*.evtx files); I see and learn something new (such as the TaskScheduler/706 event), add it to the file with comments, and then I always have the information available.  Further, sharing it with others means that they can benefit from knowledge of others without having to have had the same experiences.

Closing Thoughts
Now, is everyone going to write their own tools?  No.  And that's not the expectation at all.  If everyone were writing their own tools, no one would ever get any actual work done.  However, understanding data structures to the point of writing your own tools can really open up new vistas for the use of available data, and of intelligence that can be developed from the analysis that we do.

However, if this is something you're interested in, then once you are able to start recognizing patterns and matching those patterns up to structure definitions, there isn't much that you can't do, as the skills are transferable.  It doesn't matter where the file comes from...from which device or'll be able to parse the files.