Thursday, May 28, 2015

Detecting Lateral Movement

Almost two years ago, I posted this article that addressed how to track lateral movement within an infrastructure.  At the time, I'd been using this information successfully during engagements, and I still use it today.

This morning, I saw this video from Rapid7, and I thought that Mike did a great job with the presentation and made some very good points.  For example, "SMB" is native to a Windows infrastructure, and with the right credentials, an adversary can go just about anywhere they please.

There were some things missing in the presentation, some caveats that need to be mentioned; I do understand that they were likely left out for the sake of time.  However, they are important.  For example:

Security-Auditing/4698 events - Scheduled Task creation; under Advanced Security Audit Policy settings, for Object Access, you need to have Audit Other Object Access Events enabled for this event to appear in your Windows Event Logs.

Security-Auditing/4697 events - Service installation; similar to the previous event, systems are not configured by default to audit service installation via the Security Event Log.

So, the take-away here is that in order for these (and other) events to be useful, what admins need to do is properly configure auditing on systems, as well as employ a SIEM with some sort of filtering capability.  Increasing auditing alone will not be useful...I've seen that time and time again when an incident is identified; auditing is ramped up suddenly, and the Security Event Logs start filling up and rolling over in a matter of a few hours, causing valuable information to be lost.  The best thing to do is to enable auditing that makes sense within your infrastructure ahead of time, employing the appropriate settings (what to audit, increasing the default size of the Windows Event Log files, etc.) before an incident occurs.

Also, consider the use of MS's Sysmon, sending the collected data to a SIEM (Splunk??).  Monitoring process creation (including the command line) is extremely valuable, and not just in incident response.  For IR, having the process creation information available (along with a means to monitor it in a timely manner) reduces IR engagements from days or weeks to hours or even minutes.  If setting up Sysmon, Splunk, and filters is too daunting a task, consider employing something like Carbon Black.

Thanks to Rapid7 for sharing the video...it's some great information.

Resources
Description of Security Events in Windows 7/Windows Server 2008 R2



Tuesday, May 26, 2015

Links and Stuff

Registry Goodness
I recently wrote a RegRipper plugin, based on this KB article; on 26 May, I committed it to the plugin repository.  I had tweeted to ask the DFIR community if this information was relevant to their investigations, and there was not a great deal of response on the topic...although there was apparently some confusion.  I hope that folks take the time to try it, and I hope it's of some use to the DFIR community.  I don't often (scratch that...in 15+ years of doing DFIR work, I've never...) need to determine the history of GPOs assigned to a system.

Speaking of RegRipper plugins, Dan posted recently about how he completed the SANS CEIC 2015 Challenge.  While he completed the challenge using Eric Zimmerman's Registry Explorer tool, he did state toward the end of the post that he could've used RegRipper to complete the challenge, as well.

From the recent CEIC Conference, you can see David Dym's slides for his Improving Windows External Device Investigations presentation.  I know that no matter how many times this subject is addressed and discussed, there will always be confusion as to what resources are available on Windows systems if you are conducting one of these investigations.  I think it's great that we've got others talking about this topic, particularly because there seems to be so much confusion in this area.  Cory Altheide and I published the initial research into this topic in 2005 (there's a link here), and as new versions of Windows have come out, more information has become available regarding not only which devices were connected to systems, but also which user may have accessed the device.

Speaking of the Registry, Eric Zimmerman recently released a command line tool for interacting with (including searching) a Registry hive file for specific items.  Be sure to get version 0.6 of the tool.  Eric's been doing a lot of work in creating freeware tools for accessing the Registry, so be sure to check out his other offerings.

AutoStart
Not related to Registry analysis, but TrendMicro recently had a blog post about what they seem to be presenting as a variation in autostarting malware.  More than anything else, the post left me more than a little confused...it says that the intruders found an application that was set to run when the system started, and then modified the application's import table by adding a reference to a malicious DLL.  It was the next sentence that left me confused:

It is almost impossible to find differences between the original version and the modified ones, as even their file sizes are almost identical.

The post then goes on to say that 4 of the 5 infected applications were discovered as the modified versions weren't signed.  However, there still seems to be more going on here, because adding a DLL to the import table of a .exe file, and then referencing the malicious function should make something of an impact on the size of the application, as well as other aspects of the system itself (MFT, USN change journal, etc.).

Processes
Speaking of stuff starting, Corey posted recently regarding some testing he'd done with an MSWord document that would launch an executable.

Something I really like about Corey's post is that there's enough detail in the way he presented the material not only to replicate what he did (if you can or want to get a copy of the file he used...), but also to create things like searches for the pattern after running LogParser against the Sysmon Event Log file, as well as to write Carbon Black watchlist queries.

Tuesday, May 05, 2015

Stuff

Plugin Updates
Eric Zimmerman reached out to me a little while ago and let me know that he'd taken a look at the AppCompatCache data from a Windows 10 system, and found that...wait for it...the format of the data was different from previous versions of Windows.  O.  M.  G.

Thanks to the heads up from Eric, I've updated the RegRipper plugins for parsing this data.  However, my testing was extremely limited; I had only one System hive file (from a Windows 10 TP VM that I'd set up) on which to test the parsing code.  A dearth of testing data has been an issue since I started writing tools, and it seems that even TaoSecurity has recognized the need for test data.

Interestingly enough, I happened across a System hive file from a Windows 2012 system, and the updates to the parser seemed to work just fine.  Again, I said "a" hive...so testing has been extremely limited.

AppCompatCache/ShimCache
When working with the AppCompatCache or "ShimCache" data, analysts need to remember the context of the information, and in particular, the time stamps in the data. In most cases, the time stamp is the last modification time for the file in question, from the file system metadata, specifically, the $STANDARD_INFORMATION attribute. It is NOT the date and time that the application was executed.

Dumping Passwords
Speaking of the Registry, I ran across this little gem that describes how to dump passwords in plain text from Windows 8.1/2012 systems.  I'm a big proponent for finding out what things look like on systems, and it's pretty clear reading through the blog post what an analyst would look for in an acquired image, to determine if something similar to what was described in the post had occurred on the system.

However, I'll put this out there...rather than hoping to find these indicators on a system, why not make IR scoping easier on yourself through the use of process creation monitoring at the endpoints?

Malware Persistence
Here's a good blog post on malware persistence.  There's some focus on MS's AutoRuns tool, so it's likely to be familiar to a lot of folks.  What I thought was interesting is the number of times we see the Run key being the point of persistence for malware; while many will suggest that this location is 'well known', there are also those of us who see it used time and time again, even when the incident is 'detected' through external, third-party notification.  Some may think that the Run key is passe, but hey, it still works, and works well...so why not use it, right?

EVT vs EVTX
Every now and then, I still get an opportunity to analyze Windows XP and 2003 systems...most often, I tend to find myself working with Windows 7 and Windows 2008 R2 systems.  Working with older versions of Windows can sometimes necessitate the use of different tools in order to conduct analysis; specifically, when working with the Event Logs, they're in a different location, as well as in a different binary format.

When working with Windows XP and 2003 (and yes, Windows 2000) Event Logs (*.evt files), I'll use evtrpt and evtparse.  These tools were written specifically for the binary format of the *.evt files found on Windows 2000, XP, and 2003 systems, and are not intended for use against the *.evtx files found on Vista+ systems.

Evtrpt
This is a tool I wrote a while back to provide me with information about the contents of an EVT file; how many records exist, how many source/ID pairs exist, and what date range do the events cover.  What's cool about this tool is that it doesn't use the MS API...which means that it can be run on Linux, and it doesn't rely on the header information of the EVT file to tell it how many records exist.

Evtrpt is a command line tool, and takes just one argument...the path to the file you're interested in parsing.  If you just type the name of the tool at the command prompt, you'll get a message that says "You must enter a filename".  That seems to be pretty straightforward, and adding the full path to the .evt file of interest is all I need to do.

So, I run the tool against an Event Log file that I've extracted from an image, and one of the things I see in the output is the date range for the event records in that file:

Fri Sep 27 17:32:26 2013 to Tue Apr 28 17:37:58 2015

Cool, it covers the time that I'm interested in.  Above the date range in the output, I get a list of event sources and IDs, as well as a count of each.  For login attempts, the tool also breaks down the login/logoff type.  For example, I see the following entry:

Security                                      538,3    18938

What this tells me is that for the source "Security", there are 18938 events with ID 538 and type 3.

I ran the tool against the other Event Log files from the system, as well.  For the Application Event Log, the date range was:

Thu Jan 10 01:52:02 2013 to Tue Apr 28 15:36:04 2015

What I found most interesting from the output of the tool was this entry:

Symantec AntiVirus                               51        2

Remember the tool I use for Windows Event Logs, wevtx.bat?  The one that uses the eventmap.txt file?  Well, in that file, Symantec AntiVirus/51 event records indicate that the product detected malware.  Understanding the context of these event source/ID pairs can be extremely valuable, and the evtrpt tool can give you an idea of what pairs are available.  For McAfee, I'd look for McLogEvent/257 pairs.

Something of interest from the output of the System Event Log was:

Application Popup                                26       11
Application Popup                                44        1

So, hopefully by now you can see how useful this tool can be for a quick look at the Event Log files.
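If you want to do something with those summary lines beyond reading them (sorting by count, for example), they're easy enough to split apart.  What follows is a minimal Python sketch, not part of evtrpt itself; the function name is made up for illustration, and it assumes each summary line looks like the samples shown above (source name, then the ID with an optional ",type" suffix, then the count).

```python
def parse_evtrpt_line(line):
    """Split one summary line into (source, event_id, logon_type, count).

    Assumes the layout seen in the sample output: source, ID[,type], count.
    """
    # Source names can contain spaces ("Symantec AntiVirus"), so split
    # from the right and keep whatever is left over as the source.
    source, id_field, count = line.rsplit(None, 2)
    # Login/logoff entries carry a ",<type>" suffix on the ID; others do not.
    event_id, _, logon_type = id_field.partition(",")
    return (
        source.strip(),
        int(event_id),
        int(logon_type) if logon_type else None,
        int(count),
    )


if __name__ == "__main__":
    samples = (
        "Security                                      538,3    18938",
        "Symantec AntiVirus                               51        2",
    )
    for sample in samples:
        print(parse_evtrpt_line(sample))
```

Splitting from the right is the only real design choice here; it keeps multi-word source names intact without having to guess at column widths.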

Evtparse
This tool operates similar to the evtrpt tool, but the output allows you to see more about the Event Log records.  Typing just the name of the tool at the prompt will show you the syntax information, along with example command lines you can use to run the tool.  I use this tool to parse through the *.evt file (or files) and add the entries to a timeline of system activity.  The command line to launch this tool does not take many arguments, because most of the data that you may want to include in each timeline entry (system name, user name, etc.) is included within each event record.

Why Do I Need The Event Source and the event ID?
Something I see a great deal of within the DFIR community is that Windows Event Log records are referred to only by their ID number.  Ok, you're probably wondering...so what?  Who cares?  Well, it can be important...if someone says that they have multiple event ID 4100 records, I would have to ask, which source?  There are a number of event IDs that have multiple different event sources, each of which can provide completely different context to the situation.
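To make the point concrete, here's a small Python sketch that counts the same set of records two ways: by ID alone, and by source/ID pair.  The source names ("SourceA", "SourceB") are placeholders made up for the example, not real Windows event sources, and the function names are hypothetical.

```python
from collections import Counter


def count_by_id(records):
    """Count records by event ID only (loses the source context)."""
    return Counter(event_id for _source, event_id in records)


def count_by_pair(records):
    """Count records by (source, event ID) pair."""
    return Counter((source, event_id) for source, event_id in records)


if __name__ == "__main__":
    # Hypothetical records: the same ID logged by two different sources.
    records = [
        ("SourceA", 4100),
        ("SourceB", 4100),
        ("SourceA", 4100),
    ]
    print(count_by_id(records))
    print(count_by_pair(records))
```

Counting by ID alone lumps all three records together, while counting by pair keeps the two sources separate, which is exactly the context that gets lost when only the ID is reported.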

So What?
Why does any of this matter? Context is extremely important when conducting analysis, and in particular when communicating findings to other analysts, as well as to clients.  One example was mentioned earlier in this post...the time stamp in the AppCompatCache/ShimCache data is the last modification time of the file, from the file system metadata (the Mandiant white paper is only 5 pages long and is an easy read...).

Another example comes from the output of another RegRipper plugin...the LastWrite time for one of the subkeys from the USBStor key is just that...the key LastWrite time, which is analogous to a file's last modification time.  It is NOT the last time that USB device was written to.

So, when communicating findings from the Event Log (or, Windows Event Log on Vista+ systems), providing the event source AND ID can be extremely important for context.  If someone just says, "...event ID 4100...", then the very next question should be, "...which event source?"  When documenting findings in your case notes, include both the source and ID, as well as a reference, for (possible) inclusion in the eventmap.txt file.

Sunday, April 26, 2015

Timeline Analysis Process

Having discussed timeline analysis in a couple of blog posts so far (such as here...), I thought I'd take an opportunity to dig a bit deeper into the process I've been using for some time to actually conduct analysis of timelines that I create during engagements.

Before we get into the guts of the post, I'd like to start by saying that this is what I've found that works for me, and it does not mean that this is all there is.  This is what I've done, and been doing, tweaking the process a bit here and there over the years.  I've looked at things like visualization to try to assist in the analysis of time lines, but I have yet to find something that works for me.  I've seen how others "do" timeline analysis, and for whatever reason, I keep coming back to this process.  However, that does not mean that I'm not open to other thoughts, or discussion of other processes.  It's quite the opposite, in fact...I'm always open to discussing ways to improve any analysis process.

Usually when I create a timeline, it's because I have something specific that I'm looking for that can be shown through a timeline; in short, I won't create a timeline (even a micro-timeline) without a reason to do so.  It may be a bit of malware, a specific Registry entry or Windows Event Log record, a time frame, etc.  As is very often the case, I'll have an indicator, such as a web shell file on a system, and find that in some cases, the time frames don't line up between systems, even though the artifact is the same across those systems.  It's this information that I can then use, in part, to go beyond the "information" of a particular case, and develop intelligence regarding an adversary's hours of operations, action on objectives, etc.

When creating a timeline, I'll use different tools, depending upon what data I have access to.  As I mentioned before, there are times when all I'll have is a Registry hive file, or a couple of Windows Event Logs, usually provided by another analyst, but even with limited data sources, I can still often find data of interest, or of value, to a case.  When adding Registry data to a timeline, I'll start with regtime to add key Last Write times from a hive to the timeline.  This tool doesn't let me see Registry values, only key Last Write times, but its value is that it lets me see keys that have been created or modified during a specific time frame, telling me where I need to take a closer look.  For example, when I'm looking at a timeline and I see where malware was installed as a Windows service, I'll usually see the creation of the Registry key for the service beneath the Services key (most often beneath both, or all, ControlSets).

When I want to add time stamped information from Registry value data to a timeline, I'll turn to RegRipper and use the plugins that end in *_tln.pl.  These plugins (generally speaking) will parse the data from Registry values for the time stamped information, and place it into the necessary format to include it in a timeline.  The key aspect to doing this is that the analyst must be aware of the context of the data that they're adding.  For example, many analysts seem to believe that the time stamp found in the AppCompatCache (or ShimCache) data is when the program was executed, and in several cases (one announced publicly by the analyst), this misconception has been passed along to the customer.

I also use several tools that let me add just a few events...or even just one event...to the time line.  For example, the RegRipper secrets_tln.pl plugin lets me add the Last Write time of the Policy/Secrets key from the Security hive to the timeline.  I can check the time stamp first with the secrets.pl plugin to see if it's relevant to the time frame I'm investigating, and if it is, add it to the timeline.  After all, why add something to the timeline that's not relevant to what I'm investigating?

If I want to add an event or two to the timeline, and I don't have a specific parser for the data source, I can use tln.exe to let me add that event.  I've found this GUI tool to be very useful, in that I can use it to quickly add just a few events to a timeline, particularly when looking at the data source and only finding one or two entries that are truly relevant to my analysis.  I can fire up tln.exe, add the time stamp information to the date and time fields, add a description with a suitable tag, and add it to the timeline.  For example, I've used this to add an event to a timeline, indicating when the available web logs on a system indicated the first time that a web shell that had been added to the system was accessed.  I added the source IP address of the access in the description field in order to provide context to my timeline, and at the same time, adding the event itself provided an additional (and significant) level of relative confidence in the data I was looking at, because the event I added corresponded exactly to file system artifacts that indicated that the web shell had been accessed for the first time.  I chose this route, because adding all of the web log data would've added significant volume to my timeline without adding any additional context or even utility.

When creating a timeline, I start with the events file (usually, events.txt), and use parse.exe to create a full timeline, or a partial timeline based on a specific date range.  So, after running parse.exe, I have the events.txt file that contains all of the events that I've extracted from different data sources using different tools (whichever applies at the time), and I have either a full timeline (tln.txt) or a shortened version based on a specific date range...or both.  To begin analysis, I'll usually open the timeline file in Notepad++, which I prefer to use because it allows me to search for various things, going up or down in the file, or it can give me a total count of how many times a specific search term appears in the file.
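For anyone who wants to see the general idea of going from an events file to a date-limited timeline, here's a minimal Python sketch.  This is not how parse.exe is implemented; it simply assumes that each line of the events file has five pipe-delimited fields with a Unix epoch time in the first field (time|source|system|user|description), and the function names are made up for the example.

```python
from datetime import datetime, timezone


def to_epoch(date_text):
    """Convert 'YYYY-MM-DD' (treated as UTC) to a Unix epoch value."""
    parsed = datetime.strptime(date_text, "%Y-%m-%d")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def build_timeline(event_lines, start=None, end=None, newest_first=False):
    """Return (epoch, [source, system, user, description]) tuples, sorted by time.

    start and end are optional epoch bounds (inclusive) for a partial timeline.
    """
    events = []
    for line in event_lines:
        fields = line.rstrip("\n").split("|", 4)
        # Skip anything that does not match the assumed five-field layout.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        epoch = int(fields[0])
        if start is not None and epoch < start:
            continue
        if end is not None and epoch > end:
            continue
        events.append((epoch, fields[1:]))
    events.sort(key=lambda event: event[0], reverse=newest_first)
    return events


if __name__ == "__main__":
    # Hypothetical event lines, for illustration only.
    sample = [
        "1430242678|EVT|HOST1|user1|Security/538;3",
        "1380303146|REG|HOST1||Key LastWrite",
        "not an event line",
    ]
    for epoch, rest in build_timeline(sample, start=to_epoch("2015-01-01")):
        print(epoch, rest)
```

Keeping the date range as an optional filter means the same events file can produce either the full timeline or a shortened one, without extracting the data again.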


Once I begin my analysis, I'll open another tab in Notepad++, and call it "notes.txt".  This is where all of my analysis notes go while I'm doing timeline analysis.  As I start finding indicators within the timeline, I'll copy-and-paste them out of the timeline file (tln.txt) and into the notes file, keeping everything in the proper sequence.

Timeline analysis has been described as being an iterative process, often performed in "layers".  The first time going through the timeline, I'll usually find a bunch of stuff that requires my attention.  Some stuff will clearly be what I'm looking for, other stuff will be, "...hey, what is this..."...but most of what's in the timeline will have little to do with the goals of my exam.  I'm usually not interested in things like software and application updates, etc., so having the notes file available lets me see what I need to see.  Also, I can easily revisit something in the timeline by copying the date from the notes file, and doing a search in the timeline...this will take me right to that date in the timeline.

Recently while doing some timeline analysis, I pulled a series of indicators out of the timeline, and pasted them into the notes file.  Once I'd followed that thread, I determined that what I was seeing was adware being installed.  The user actively used the browser, and the events were far enough back in time that I wasn't able to correlate the adware installation with the site(s) that the user had visited, but I was able to complete that line of analysis, note what I'd found, remove the entries from the notes file, and move on.

As timeline analysis continues, I very often keep the data source(s) open and available, along with the timeline, as I may want to see something specific, such as the contents of a file, or the values beneath a Registry key.  Let's say that the timeline shows that during a specific time frame that I'm interested in, the Last Write time of the HKLM/../Run key was modified; I can take a look at the contents, and add any notes I may have ("...there is only a single value named 'blah'...") to the notes file.

Many times, I will have to do research online regarding some of the entries in the timeline.  Most often, this will have to do with Windows Event Log entries; I need to develop an understanding of what the source/ID pair refers to, so that I can fill in the strings extracted from the record and develop context around the event itself.  Sometimes I will find Microsoft-Windows-Security-Auditing/5156 events that contain specific process names or IP addresses of interest.  Many times, Windows Event Log record source/ID pairs that are of interest will get added to my eventmap.txt file with an appropriate tag, so that I have additional indicators that automatically get identified on future cases.

Not everything extracted from the timeline is going to be applicable to the goals of my analysis.  I've pulled data from a timeline and my research has determined that the events in question were adware being installed.  At that point, I can remove the data from my notes file.

By the time I've completed my timeline analysis, I have the events file (all of the original events), the timeline file, and the notes file.  The notes file is where I'll have the real guts of my analysis, and from where I'll pull things such as indicator clusters (several events that, when they appear together, provide context and a greater level of confidence in the data...) that I've validated from previous engagements, and will use in future engagements, as well as intel I can use in conjunction with other analysis (other systems, etc.) to develop a detailed picture of activity within the enterprise.

Again, this is just the process I use and have found effective...this is not to say that this is what will work for everyone.  And I have tried other processes.  I've had analysts send me automatically-created colorized spreadsheets and to be honest, I've never been very clear as to what they've found or thought to be the issue.  That is not to say that this method of analysis isn't effective for some, as I'm sure it is...I simply don't find it effective.  The process I described has been effective for me at a number of levels...from having a single data source from a system (i.e., a single Registry hive...), to having an entire image, to analyzing a number of systems from the same infrastructure.  And again, I'm not completely stuck to this process...I'm very open to discussion of other processes, and if I do find something that proves to be effective, I have no problem adding it to what I do, or even changing my process all together.

Sunday, April 19, 2015

Micro- & Mini-Timelines

I don't always create a timeline of system activity...but sometimes when I do, I don't have all of the data from within the system image available.  Many times, I will create a mini-timeline because all I have available is either limited data sources, or even just a single data source.  I've been sent Event Logs (.evt files) or a couple of Windows Event Logs (.evtx files), and asked to answer specific questions, given some piece of information, such as an indicator or a time frame.  I've had other analysts send me Registry hive files and ask me to determine activity within a specific time frame, or associated with a specific event.

Mini, micro, and even nano-timelines can assist an analyst in answering questions and addressing analysis goals in an extremely timely and accurate manner.

There are times where I will have a full image of a system, and only create a mini- or nano-timeline, just to see if there are specific indicators or artifacts available within the image.  This helps me triage systems and prioritize my analysis based upon the goals that I've been given or developed.  For example, if the question before me is to determine if someone accessed a system via RDP, I really only need a very limited number of data sources to answer that question, or even just to determine if it can be answered.  I was once asked to determine the answer to that question for a Windows XP system, and all I needed was the System Registry hive file...I was able to show that Terminal Services was not enabled on the system, and it hadn't been.  Again, the question I was asked  (my analysis goal) was, "...did someone use RDP to access this system remotely?", and I was able to provide the answer to that question (or perhaps more specifically, if that question could be answered).

Sometimes, I will create a micro-timeline from specific data sources simply to see if there are indicators that pertain to the time frame that I've been asked to investigate.  One example that comes to mind is the USN change journal...I'll extract the file from an image and parse it, first to see if it covers the time frame I'm interested in.  From there, I will either extract specific events from that output to add to my overall timeline, or I'll just add all of the data to the events file so that it's included in the timeline.  There are times when I won't want all of the data, as having too much of it can add a significant amount of noise to the timeline, drowning out the signal.

There are times when I simply don't need all of the available data.  For example, consider a repurposed laptop (provided to an employee, later provided to another employee), or a system with a number (15, 25, or more)  of user profiles; there are times that I don't need information from every user profile on the system, and including it in the timeline will simply make the file larger and more cumbersome to open and analyze.

I've also created what I refer to as nano-timelines.  That is, I'll parse a single Windows Event Log (.evtx) file, filter it for a specific event source/ID pair, and then create a timeline from just those events so that I can determine if there's something there I can use.

For example, let's say I'm interested in "Microsoft-Windows-Security-Auditing/5156" events; I'd start by running the Security.evtx file through wevtx.bat:

C:\tools>wevtx.bat F:\data\evtx\security.evtx F:\data\sec_events.txt

Now that I have the events from the Security Event Log in a text file, I can parse out just the events I'm interested in:

C:\tools>type F:\data\sec_events.txt | find "Microsoft-Windows-Security-Auditing/5156" > F:\data\sec_5156_events.txt

Okay, now I have an events file that contains just the event records I'm interested in; time to create the timeline:

C:\tools>parse -f F:\data\sec_5156_events.txt > F:\data\sec_5156_tln.txt

Now, I can open the timeline file, see the date range that those specific events cover, as well as determine which events occurred at a specific time. This particular event can help me find indications of malware (RAT, Trojan, etc.), and I can search the timeline for a specific time frame, correlating outbound connections from the system with firewall or IDS logs.  Because I still have the events file, I can write a quick script that will parse the contents of the events file, and provide me statistics based on specific fields, such as the destination IP addresses of the connections.
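As an illustration of the sort of "quick script" mentioned above, here's a minimal Python sketch that tallies the IPv4 addresses found in the events of interest.  It makes no assumption about where the destination address sits in the event description, since that layout isn't shown here; it simply counts every IPv4-looking string on each matching line, so narrowing it to destination addresses only would mean adjusting it to the actual field order.  The function name and sample lines are hypothetical.

```python
import re
from collections import Counter

# Loose IPv4 pattern: four dot-separated groups of 1-3 digits.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def count_addresses(event_lines, pair="Microsoft-Windows-Security-Auditing/5156"):
    """Count IPv4-looking strings on lines that mention the given source/ID pair."""
    counts = Counter()
    for line in event_lines:
        if pair not in line:
            continue
        counts.update(IPV4.findall(line))
    return counts


if __name__ == "__main__":
    # Hypothetical event lines, for illustration only.
    sample = [
        "1430242678|EVTX|HOST1||Microsoft-Windows-Security-Auditing/5156;10.0.0.5;203.0.113.7",
        "1430242700|EVTX|HOST1||Microsoft-Windows-Security-Auditing/5156;10.0.0.5;203.0.113.7",
        "1430242710|EVTX|HOST1||Microsoft-Windows-Security-Auditing/4624;10.0.0.9",
    ]
    for address, total in count_addresses(sample).most_common():
        print(address, total)
```

Sorting by frequency (most_common) is useful both ways: the busiest addresses stand out at the top, and the one-off connections at the bottom are often just as interesting.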

What's great about the above process is that an analyst working on an engagement can archive the Windows Event Log file in question, send it to me, and I can turn around an answer in a matter of minutes.  Working in parallel, I can assist an analyst who is neck-deep in an IR engagement by providing solid answers to concrete questions, and do so in a timely manner.  My point is that we don't always need a full system image to answer some very important questions during an engagement; sometimes, a well-stated or well-thought-out question can be used as an analysis goal, which leads to a very specific set of data sources within a system being examined, and the answer to whether that system is in scope or not being determined very quickly.

Analysis Process
Regardless of the size of the timeline (full-, mini-, micro-), the process I follow during timeline analysis is best described as iterative.  I'll use initial indicators...time stamps, file names/paths, specific Registry keys...to determine where to start.  From there, I'll search "nearby" within the timeline, and look for other indicators.

I mentioned the tool wevtx.bat earlier in this post; something I really like about that tool is that it helps provide me with indicators to search for during analysis.  It does this by mapping various event records to tags that are easy to understand, remember, and search for.  It does this through the use of the eventmap.txt file, which is nothing more than a simple text file that provides mappings of events to tags.  I won't go into detail in this blog post, as the file can be opened and viewed in Notepad.  What I like to do is provide references as to how the tag was "developed"; that is, how did I decide upon the particular tag.  For example, how did I decide that a Microsoft-Windows-TerminalServices-LocalSessionManager record with event ID 22 should get the tag "Shell Start"?  I found it here.
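The mapping idea itself is simple enough to sketch in a few lines of Python.  To be clear, the line layout used below ("source/ID:tag", with "#" for comments) is an assumption made for the example and may not match the real eventmap.txt; the function names are hypothetical as well.

```python
def load_event_map(lines):
    """Build a {"source/ID": tag} dictionary from mapping lines.

    Assumed (hypothetical) layout: one "source/ID:tag" entry per line,
    with blank lines and lines starting with "#" ignored.
    """
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        pair, separator, tag = line.partition(":")
        if separator:
            mapping[pair.strip()] = tag.strip()
    return mapping


def tag_for(mapping, source, event_id):
    """Return the tag for a source/ID pair, or an empty string if unmapped."""
    return mapping.get(f"{source}/{event_id}", "")


if __name__ == "__main__":
    sample = [
        "# hypothetical mapping entries",
        "Microsoft-Windows-TerminalServices-LocalSessionManager/22:Shell Start",
    ]
    event_map = load_event_map(sample)
    print(tag_for(event_map, "Microsoft-Windows-TerminalServices-LocalSessionManager", 22))
```

Keying the lookup on the full source/ID pair, rather than the ID alone, keeps identical IDs from different sources from getting the same tag.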

This is very useful, because my timeline now has tags for various events, and I know what to look for, rather than having to memorize a bunch of event sources and IDs...which, with the advent of Vista and moving into the Windows 7 system and beyond, has become even more arduous due to the sheer number of Event Logs now incorporated into the systems.

So, essentially, eventmap.txt serves as a list of indicators, which I can use to search a timeline, based upon the goals of my exam.  For example, if I'm interested in locating indications of a remote access Trojan (RAT), I might search for the "[MalDetect]" tag to see if an anti-virus application on the system picked up early attempts to install malware of some kind (I should note that this has been pretty helpful for me).

Once I find something related to the goals of my exam, I can then search "nearby" in the timeline, and possibly develop additional indicators.  I might be looking for indications of a RAT, and while tracking that down, find that the infection vector was via lateral movement (Scheduled Task).  From there, I'd look for at least one Security-Auditing/4624 type 3 event, indicating a network-based logon to access resources on the system, as this would help me determine the source system of the lateral movement.  The great thing about this is that this sort of activity can be determined from just three Windows Event Log files, two Registry hives, and you can even throw in the MFT for good measure, although it's not absolutely required.  Depending on the time frame of response...that is, was the malicious event detected and the system responded to in a relatively short time (an hour or two), or is this the result of a victim notification of something that happened months ago...I may include the USN change journal contents, as well.
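The "search nearby" step can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the timeline entries, host name, and tag strings below are made up, and a real timeline carries more fields (source, host, user) than the two-element tuples used here.

```python
from datetime import datetime, timedelta

# Made-up timeline entries: (timestamp, description).
TIMELINE = [
    (datetime(2015, 5, 1, 10, 0, 5), "Security-Auditing/4624 type 3 logon from WKSTN01"),
    (datetime(2015, 5, 1, 10, 0, 9), "[Sched Task Created] At1.job"),
    (datetime(2015, 5, 1, 10, 0, 40), "[MalDetect] AV detection"),
    (datetime(2015, 5, 1, 14, 30, 0), "Security-Auditing/4624 type 2 logon"),
]


def find(timeline, needle):
    """Return entries whose description contains the needle (case-insensitive)."""
    return [e for e in timeline if needle.lower() in e[1].lower()]


def nearby(timeline, pivot, window=timedelta(minutes=5)):
    """Return entries within +/- window of the pivot time."""
    return [e for e in timeline if abs(e[0] - pivot) <= window]


# Start from one indicator, then look around it for others.
pivot = find(TIMELINE, "[Sched Task Created]")[0][0]
cluster = nearby(TIMELINE, pivot)
network_logons = find(cluster, "4624 type 3")

for when, what in network_logons:
    print(when.isoformat(), what)
```

The five-minute window is an arbitrary choice for the sketch; in practice, I'd widen or narrow it based on what the surrounding events look like.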

My efforts are most often dedicated toward finding clusters of multiple indicators of specific activity, as this provides not only more context, but also a greater level of confidence in the information I'm presenting.  Understanding the content of these indicator clusters is extremely helpful, particularly when anti-forensics actions have been employed, however unknowingly.  Windows Event Logs may have "rolled over", or a malware installer package may include the functionality to "time stomp" the malware files.

Friday, April 10, 2015

Talk Notes

Thanks to Corey Harrell, I was watching the Intro to Large Scale Detection Hunting presentation from the NoLaSec meeting in Dec, 2014, and I started to have some thoughts about what was being said.  I looked at the comments field on YouTube, as well as on David's blog, but thought I would put my comments here instead, as it would give me an opportunity to structure them and build them out a bit before hitting the "send" button.

First off, let me say that I thought the talk was really good.  I get that it was an intro talk, and that it wasn't going to cover anything in any particular depth or detail.  I liked it enough that I not only listened to it beginning-to-end twice this morning, but I also went back to certain things to re-listen to what was said.

Second, the thoughts I'm going to be sharing here are based on my perspective as an incident responder and host/endpoint analyst.

Finally, please do not assume that I am speaking for my employer.  My thoughts are my own and not to be misconstrued as being the policies or position of my employer.

Okay, so, going into this, here are some comments and thoughts based on what I saw/heard in the presentation...

Attribution...does it matter?

As David said, if you're gov, yeah (maybe).  If you're a mom-and-pop, not so much.  I would suggest that during both hunting and IR, attribution can be a distraction.  Why is that?

Let's look at it this way...what does attribution give us?  What does it tell us?  Some say that it informs as to the intent of the adversary, and that it tells us what they're after.  Really?  Many times, an organization that has been compromised doesn't fully understand what they have that's "of value".  Is it data of some kind?  Marketing materials?  Database contents?  Manufacturing specs?  Or, is it the access that organization has to another organization?  If you're doing some hunting, and run across an artifact or indicator that puts you on high alert, how are you able to perform attribution?

Let's say that you find a system with a bunch of batch files on it, and it looks as if the intruder was performing recon, and even dumping credentials from systems...at this point, how do you perform attribution?  How do you determine intent?

Here's an example...about 5 yrs ago, I was asked to look at a hard drive from a small company that had been compromised.  Everyone assumed that the intruder was after data regarding the company's clients, but it turned out that the intruder was after this small organization's money, which was managed via online banking.  The intruder had been able to very accurately determine who managed the account, and compromised that specific system with a keystroke logger that loaded into memory, monitored keystrokes sent to the browser when specific web sites were open, and sent the captured keystrokes off of the system without writing them to a file on disk.  It's pretty clear that the bad guy thought ahead, and knew that if the employee was accessing the online banking web site, they could just send the captured data off of the system to a remote site.

"If you can identify TTPs, you can..."

...but how do you identify TTPs?  David talked about identifying TTPs and disrupting those, to frustrate the adversary; examples of TTPs were glossed over, but I get that, it's an intro talk.  This goes back to what does something "look like"...what does a TTP "look like"?

I'm not saying that David missed anything by glossing over this...not at all.  I know that there was a time limit to the talk, and that you can only cover so much in a limited time.

Can't automate everything...

No, you can't.  But there's much that you can automate.  Look at your process in a Deming-esque manner, and maybe even look at ways to improve your process, using automation.

"Can't always rely on signatures..."

That really kind of depends on the "signatures" used.  For example, if you're using "signatures" in the sense that AV uses signatures, then no, you can't rely on them.  However, if you're aware that signatures can be obviated and MD5 hashes can be changed very quickly, maybe you can look at things that may not change that often...such as the persistence mechanism.  Remember Conficker?  MS identified five different variants, but what was consistent across them was the persistence mechanism.

This is a great time to mention artifact categories, which will ultimately lead to the use of an analysis matrix (the link is to a 2 yr old blog post that was fav'd on Twitter this morning...)...all of which can help folks with their hunting.  If you understand what you're looking for...say, you're looking for indications of lateral movement...you can scope your hunt to what data you need to access, and what you need to look for within that data.

It's all about pivoting...

Yes, it is.

...cross-section of behaviors for higher fidelity indicators...

Pivoting and identifying a cross-section of behaviors can be used together in order to build higher fidelity indicators.  Something else that can be used to do the same thing is...wait for it...sharing.  I know that this is a drum that I beat that a lot of folks are very likely tired of hearing, but a great way of creating higher fidelity indicators is to share what we've seen, let others use it (and there's no point in sharing if others don't use it...), and then validate and extend those indicators.

David also mentioned the tools that we (hunters, responders) use, or can put to use, and that they don't have to be big, expensive frameworks.  While I was working in an FTE security engineer position a number of years ago, I wrote a Perl script that would get a list of systems active in the domain, and then reach out to each one and dump the contents of the Run key from the Registry, for both the system and the logged on user.  Over time, I built out a white list of known good entries, and ended up with a tool I could run when I went to lunch, or I could set up a scheduled task to have it run at night (the organization had shifts for fulfillment, and 24 hr ops).  Either way, I'd come back to a very short list (half a page, maybe) of entries that needed to be investigated.  This also let me know which systems were recurring...which ones I'd clean off and would end up "infected" all over again in a day or two, and we came up with ways to address these issues.
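The white list comparison at the heart of that script is simple enough to sketch.  The original was written in Perl and pulled the Run key contents from live systems; the Python below skips the collection step entirely and works on made-up data, so the host names, value names, paths, and function name are all assumptions for illustration.

```python
# Made-up data standing in for Run key values collected from each host;
# the collection step (remote Registry access) is deliberately omitted.
collected = {
    "HOST-A": {
        "VendorTray": r"C:\Program Files\Vendor\tray.exe",
        "updater": r"C:\Users\bob\AppData\Roaming\upd.exe",
    },
    "HOST-B": {
        "VendorTray": r"C:\Program Files\Vendor\tray.exe",
    },
}

# Known-good commands, built up over time; compared case-insensitively.
white_list = {
    r"c:\program files\vendor\tray.exe",
}


def unknown_entries(collected, white_list):
    """Return (host, value name, command) for entries not on the white list."""
    findings = []
    for host, values in collected.items():
        for name, command in values.items():
            if command.lower() not in white_list:
                findings.append((host, name, command))
    return sorted(findings)


for host, name, command in unknown_entries(collected, white_list):
    print(f"{host}: {name} -> {command}")
```

Keeping the output sorted by host makes it easy to spot the systems that keep showing up from one run to the next.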

So, my point is that, as David said, if you know your network and you know your data sources, it's not that hard to put together an effective hunting program.

At one point, David mentioned, "...any tool that facilitates lateral movement...", but what I noticed in the recorded presentation was that there were no questions about what those tools might be, or what lateral movement might look like in the available data.

Once we start looking at these questions and their responses, the next step is to ask, do I have the data I need?  If you're looking at logs, do you have the right logs in order to see lateral movement?  If you have the right sources, are they being populated appropriately?  Is the appropriate auditing configured on the system that's doing the logging?  Do you need to add additional sources in order to get the visibility you need?

"Context is King!"

Yes, yes it is.  Context is everything.

"Fidelity of intel has to be sound" 

Intel is worthless if it's built on assumption, and you obviate the need for assumption by correlating logs with network and host (memory, disk) indicators.

"Intel is context applied to your data"

Finally, a couple of other things David talked about toward the end of the presentation were with respect to post-mortem lessons learned (and feedback loops), and that tribal knowledge must be shared.  Both of these are critical.  Why?

I guess that another way to ask that question is, is it really "cost effective" for every one of us to have to all learn the same lessons on our own?  Think about how "expensive" that is...you may see something and even if I were hunting in the same environment, I may not see that specific combination of events for the next 6 months or a year, if ever.  Or, I may see it, but not recognize it as something that needs to be examined.

Sharing tribal knowledge can also mean sharing what you've seen, even though others may have already "seen" the same thing, for two reasons:  (1) it validates the original finding, and (2) it lets others know that what they saw 6 months ago is still in use.

Consider this...I've seen many (and I do mean, MANY) malware analysts simply ignore the persistence mechanism employed by malware.  When asked, some will say, "...yeah, I didn't say anything because it was the Run key...".  Whoa, wait a second...you didn't think that was important?  Here it is 2015, and the bad guys are still using the Run key for persistence?  That's HUGE!  That not only tells us where to look, but it also tells us that many of the organizations that have been compromised could have detected the intrusion much sooner with even the most rudimentary instrumentation and visibility.

Dismissing the malware persistence mechanism (or any other indicator) in this way blinds the rest of the hunting team (and ultimately, the community) to its value and efficacy in the overall effort.

Summary
Again, this was a very good presentation, and I think serves very well to open up further conversation.  There's only so much one can talk about in 26 minutes, and I think that the talk was well organized.  What needs to happen now is that people who see the presentation start implementing what was said (if they agree with it), or asking how they could implement it.

Resources
Danny's blog - Theoretical Concerns
David Bianco's blog - Pyramid of Pain
Andrew Case's Talk - The Need for Proactive Threat Hunting (slides)

Monday, April 06, 2015

Blogging

I caught an interesting thread on Twitter last week..."interesting" in the sense that it revisited one of the questions I see (or hear) quite a bit in DFIR circles; that is, how does one get started in the DFIR community?  The salient points of this thread covered blogging (writing, in general) and interacting within the community.  Blogging is a great way for anyone, regardless of how long you've been "doing" DFIR, to engage and interact with the community at large.

Writing
Writing isn't easy.  I get it.  I'm as much a nerd as anyone reading this blog, and I feel the same way most of you do about writing.  However, given my storied background, I have quite a bit of experience writing.  Even though I was an engineering major in college, I had to take writing classes.  One of my English professors asked if I was an English major, saying that I wrote like one...while handing back an assignment with a C (or D) on it.  I had to write in the military....fitreps, jagmans, etc.  I had jobs in the military that required other kinds of writing, for different audiences.

Suffice to say, I have some experience.  But that doesn't make me an expert, or even good at it.  What I've found is that the needs of the assignment, audience, etc., vary and change.

So how do you get better at writing?  Well, the first step is to read.  Seriously.  I read a lot, and a lot of different things.  I read the Bible, I read science fiction, and I read a lot of first person accounts from folks in special ops (great reading while traveling).  Some of the stuff I've read recently has included:

The Finishing School (Dick Couch) - I've read almost all of the books Mr. Couch has published

Computer Forensics: InfoSec Pro Guide (David Cowen)

Do Androids Dream of Electric Sheep (Philip K. Dick)

I've also read Lone Survivor, American Sniper, and almost every book written by William Gibson.

Another way to get better at writing is to write.  Yep, you read that right.  Write.  Practice writing.  A great way to do that is to open MSWord or Notepad, write something and hand it to someone.  If they say, "....looks good..." and hand it back, give it to someone else.  Get critiqued.  Have someone you trust read what you write.  If you're writing about something you did, have the person reading it follow what you wrote and see if they can arrive at the same end point.  A couple of years ago, I was working with some folks who were trying to write a visual timeline analysis tool, and to get started, the first thing the developer did was sit down with my book and walk through the chapter on timelines.  He downloaded the image and tools, and walked through the entire process.  He did this all of his own accord and initiative, and produced EXACTLY what I had developed.  That was pretty validating for my writing, that someone with no experience in the industry could sit down and just read, and the process was clear enough that he was able to produce exactly what was expected.

Try creating a blog.  Write something.  Share it.  Take comments...ignore the anonymous comments, and don't worry if someone is overly critical.  You can ignore them, too.

My point is, get critiqued.  You don't sharpen a knife by letting it sit, or rubbing it against cotton.  The way to get better as a writer, and as an analyst, is to expose yourself to review.  The cool thing about a blog is that you can organize your thoughts, and you can actually have thoughts that consist of more than 140 characters.  And you don't have to publish the first thing you write.  At any given time, I usually have half a dozen or more draft blog posts...before starting this post, I deleted two drafts, as they were no longer relevant or of interest.

Writing allows you to organize your thoughts.  When I was writing fitness reports for my Marines, I started them days (in some cases, weeks) prior to the due date.  I started by writing down everything I had regarding that Marine, and then I moved it around on paper.  What was important?  What was truly relevant?  What needed to be emphasized more, or less?  What did I need to take out completely?  I'd then let it sit for a couple of days, and then come back to it with a fresh set of eyes.  Fitreps are important, as they can determine if a Marine is promoted or able to re-enlist.  Or they can end a career.  Also, they're critiqued.  As a 22 yr old 2ndLt, I had Majors and Colonels reviewing what I wrote, and that was just within my unit.  Getting feedback, and learning to provide constructive feedback, can go a long way toward making you a better writer.

I included a great deal of my experiences writing reports in chapter 9 of Windows Forensic Analysis Toolkit 4/e, and included an example scenario (associated with an image), case notes and report in the book materials.  So, if you're interested, download the materials and take a look.

One of the tweets from the thread:

it's a large sea of DFIR blogs and could be very intimidating to newbies in the field. What can they offer that is not there

Let's break this down a bit.  Yes, there are a lot of DFIR blogs out there, but as Corey tweeted, The majority of the DFIR blogs in my feed are either not active or do a few posts a year.  The same is true in my feed (and I suspect others will see something similar)...there are a number of blogs I subscribe to that haven't been updated in months or even a year or more (Grayson hasn't updated his blog in over two years).  There are several blogs that I've removed, either because they're completely inactive, or about every 6 months or so, there's a "I know I haven't blogged in a while..." post, but nothing more.

There's no set formula for blog writing.  There are some blogs out there that have a couple of posts a month, and don't really say anything.  Then there are blogs like Mari's...she doesn't blog very much, but when she does, it's usually pure gold.  Corey's blog is a great example of how there's always something that you can write about.

...but I'm a n00b...
The second part of the above tweet is something I've seen many times over the years...folks new to the community say that they don't share thoughts or opinions (or anything else) because they're too new to offer anything of value.

That's an excuse.

A couple of years ago, one of the best experiences in my DFIR career was working with Don Weber.  I had finished up my time in the military as a Captain, and Don had been a Sgt.  On an engagement that we worked together, he was asking me why we were doing certain things, or why we were doing things a certain way.  Don wasn't completely new to the DFIR business, but he was new to the team, and he had fresh perspective to offer.  Also, his questions got me to thinking...am I doing this because there's a good reason to do so, or am I doing it because that's the way I've always done it?

One of the things that the "...I'm a n00b and have nothing to offer..." leads to is a lack of validation within the community.  What do I mean by that?  Well, there's not one of us in the field who's seen everything that there is to see.  Some folks are new to the field and don't have the experience to know where to look, or to recognize what they're seeing.  Others have been in the field so long that they no longer see what's going on "in the weeds"; instead, all they have access to is an overview of the incident, and maybe a few interesting tidbits.  Consider the Poweliks malware; I haven't had an investigation involving this malware, but I know folks who have.  My exposure to it has been primarily through AV write-ups, and if someone hadn't shared it with me, I never would've known that it uses other Registry keys for persistence, including CLSID keys, as well as Windows services.  My point is that someone new to the community can read about a particular malware variant, and then after an exam, say, "...I found these four IOCs that you described, and this fifth one that wasn't in any of the write-ups I read...", and that is a HUGE contribution to the community.

Even simply sharing that you've seen the same thing can be validating.  "Yes, I saw that, as well..." lets others know that the IOC they found is being seen by others, and is valid.  When I read the Art of Memory Forensics, and read about the indicator for the use of a credential theft tool, I could have left it at that.  Instead, I created a RegRipper plugin and looked for that indicator on cases I worked, and found a great deal of validation for the indicator...and I shared that with one of the book authors.  "Yes, I'm seeing that, as well..." is validating, and "...and I'm also seeing this other indicator..." serves to move the community forward.

If you're not seeing blog posts about stuff that you are interested in, reach out and ask someone.  Sitting behind your laptop and wondering, "...why doesn't anyone post about their analysis process?" doesn't inherently lend itself to people posting about their analysis process.  Corey's posted about his process, I've done it, Mari's done it...if this is something you'd like to see, reach out to someone and ask them, "hey, could you post your thoughts/process regarding X?"

As Grayson said, get out and network.  Engage with others in the industry.  Reading a blog is passive, and isn't interacting.  How difficult is it to read a blog post, think about it, and then contact the author with a question, or post a comment (if the author has comments enabled)?   Or link to that blog in a post of your own.

Not seeing content that you're interested in in the blogs you follow?  Start your own blog.  Reach out to the authors of the blogs you follow, and either comment on their blogs or email them directly, and share your thoughts.  Be willing to refine or elaborate on your thoughts, offering clarity.  If you are interested in how someone would perform a specific analysis task, be willing to offer up and share data.  It doesn't matter how new you are to the industry, or if you've been in the industry for 15 years...there's always something new that can be shared, whether it's data, or even just a perspective.

Blogging is a great way to organize your thoughts, provide context, and to practice writing.  Who knows, you may also end up learning something in the long run.  I know I have.