Tuesday, April 14, 2020

Registry Analysis, pt II

In my last blog post, I provided a brief description of how I perform "Registry analysis", and I thought it would be a good idea to share the actual mechanics of getting to the point of performing Registry analysis.

First off, let me state clearly that I rarely perform Registry analysis in isolation from other data sources and artifacts on the system.  Most often, I'll incorporate file system metadata, as well as Windows Event Log metadata, into my analysis in order to develop a clearer picture of the activity.  Doing this helps me to 'see' activity that might be associated with a threat actor, and it goes a long way towards removing guesses and speculation from my analysis.

For instance, I'll incorporate Windows Event Log record metadata using the following command:

C:\tools>wevtx.bat d:\case\*.evtx > d:\case\evtx_events.txt

The above command places the record metadata, decorated using intrusion intel from the eventmap.txt file, into an intermediate file, with all of the entries in the 5-field TLN format.  I can then make use of just this file, or I can incorporate it into my overall timeline events file using the 'type' command:

C:\tools>type evtx_events.txt >> events.txt
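For reference, each line in these intermediate files follows the 5-field TLN format...time, source, system, user, and description, pipe-delimited, with the time as a 32-bit Unix epoch.  A quick Python sketch (the field values here are made up for illustration) shows the structure:

```python
from datetime import datetime, timezone

def tln_line(epoch, source, system, user, description):
    """Build a single 5-field TLN event line: Time|Source|System|User|Description."""
    return "|".join([str(epoch), source, system, user, description])

def parse_tln(line):
    """Split a TLN line back into its five fields, converting the epoch to UTC."""
    t, source, system, user, desc = line.strip().split("|", 4)
    return datetime.fromtimestamp(int(t), tz=timezone.utc), source, system, user, desc

# Hypothetical event, for illustration only
line = tln_line(1586822400, "EVTX", "FILESRV01", "-", "Security/4624 - An account was successfully logged on")
when, source, system, user, desc = parse_tln(line)
```

Because every source...Windows Event Logs, Registry hives, file system metadata...reduces to this same five-field line, appending one parser's output to another's is all it takes to build the consolidated events file.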

That being said, there are times when I have been asked to "take a look at the Registry", and during those times, my hope is to have something around which to pivot...a service name, a specific date and time, some particular event, etc. I'll start this process by listing all of the Registry keys in the Software and System hives based on the key LastWrite times, using the following commands:

C:\tools>regtime -m HKLM/Software/ -r d:\case\software > d:\case\reg_events.txt
C:\tools>regtime -m HKLM/System/ -r d:\case\system >> d:\case\reg_events.txt

Note: RegRipper will tell you if the hive you're accessing is 'dirty', and if so, you'll want to strongly consider merging the transaction logs into the hive prior to parsing.  I like to do this as a separate process because I like to have the original hive file available so that I can look for deleted keys and values.

If there's a suspicion or evidence to suggest that a local user account was created, then adding metadata from the SAM hive is pretty simple and straightforward:

C:\rr3>rip -r d:\case\sam -p samparse_tln >> d:\case\reg_events.txt

When I say "evidence to suggest" that the threat actor added a local account to the system, one way to check is to hope that the right auditing was enabled, and that you'd find the appropriate records in the Security Event Log.  Another way to check is to parse the SAM Registry hive:

C:\rr3>rip -r d:\case\sam -p samparse

Then, correlate what you see to the ProfileList key from the Software hive:

C:\rr3>rip -r d:\case\software -p profilelist

Looking at these two data sources allows us to correlate user accounts and RIDs to user profiles on the system.  In many cases, we'll have to consider domain accounts (different SIDs), as well.
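The correlation itself is straightforward: for local accounts, the ProfileList subkey names are SIDs whose final hyphen-delimited component is the account's RID from the SAM.  A minimal Python sketch of that matching logic (the account names, RIDs, and SIDs below are made up for illustration):

```python
def rid_from_sid(sid):
    """The last hyphen-delimited component of an account SID is the RID."""
    return int(sid.rsplit("-", 1)[1])

# Hypothetical data: samparse yields username -> RID,
# profilelist yields SID -> profile path.
sam_accounts = {"helpdesk": 1001, "backup_svc": 1002}
profiles = {
    "S-1-5-21-1004336348-1177238915-682003330-1001": r"C:\Users\helpdesk",
    "S-1-5-21-1004336348-1177238915-682003330-1002": r"C:\Users\backup_svc",
}

# Invert the profile list so each RID maps to its profile path
rid_to_profile = {rid_from_sid(sid): path for sid, path in profiles.items()}
for user, rid in sam_accounts.items():
    print(user, "->", rid_to_profile.get(rid, "no local profile"))
```

A local account with no corresponding profile (or a profile with no corresponding SAM entry, suggesting a domain account or a deleted local one) is exactly the sort of discrepancy worth pivoting on.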

I'll also include other specific information from the Registry hives in the timeline:

C:\rr3>rip -r d:\case\system -p shimcache_tln >> d:\case\reg_events.txt
...

I'll also incorporate the AmCache metadata, as well:

C:\rr3>rip -r d:\case\amcache.hve -p amcache_tln >> d:\case\reg_events.txt

For a user, I generally want to create a separate mini-timeline, using similar commands as above:

C:\tools>regtime -m HKCU/ -r d:\case\user\ntuser.dat -u user > d:\case\user\reg_events.txt
C:\tools>regtime -m HKCU/ -r d:\case\user\usrclass.dat -u user >> d:\case\user\reg_events.txt
C:\rr3>rip -r d:\case\user\usrclass.dat -u user -p shellbags_tln >> d:\case\user\reg_events.txt
C:\rr3>rip -r d:\case\user\ntuser.dat -u user -p userassist_tln >> d:\case\user\reg_events.txt
...

Note: If you're generally looking at the same artifacts within a hive (NTUSER.DAT, etc.) over and over, it's a good idea to open Notepad and create a RegRipper profile.  That way, you have a documented, repeatable process, all in a single command line.
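As a sketch, a profile is just a plain-text file (saved in the plugins folder, with no extension) listing one plugin per line, which you then run via rip's '-f' switch.  The file name and plugin selections below are examples only...verify the plugin names against your own install:

```
# ntuser_quick -- example profile file; run with:
#   rip -r d:\case\user\ntuser.dat -f ntuser_quick
userassist_tln
recentdocs_tln
run
```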

Note: If you're looking at multiple systems, it's not only a good idea to differentiate users on the system via the "-u" switch, but also differentiate the system by using the "-s" switch in the RegRipper command lines.  You can get the system name via the compname.pl RegRipper plugin.

Once the events file has been created, I have a source for parsing out specific items, specific time frames, or the entire timeline, using parse.exe.  I can create the entire timeline and, based on items I find to pivot on, go back to the events file and pull out specific items using combinations of the 'type' and 'find' commands.  The complete timeline is going to contain all sorts of noise, much of it based on legitimate activity, such as operating system and application updates and normal user activity (logins, logoffs, day-to-day operations, etc.), so it's often really helpful to be able to look at just the items of interest, and then view them in correlation with other items of interest.
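The 'type' and 'find' combination can also be expressed as a simple script.  Here's a Python sketch (the event lines are made up for illustration) that pulls lines from an events file based on a keyword and/or an epoch time window:

```python
def filter_events(lines, keyword=None, start=None, end=None):
    """Yield TLN lines matching an optional keyword and/or epoch time window."""
    for line in lines:
        fields = line.strip().split("|", 4)
        if len(fields) != 5:
            continue  # skip anything that isn't a well-formed TLN line
        t = int(fields[0])
        if start is not None and t < start:
            continue
        if end is not None and t > end:
            continue
        if keyword is not None and keyword.lower() not in line.lower():
            continue
        yield line.strip()

# Hypothetical events, for illustration only
events = [
    "1586822400|REG|FILESRV01|-|M... HKLM/System/ControlSet001/Services/EvilSvc",
    "1586822460|EVTX|FILESRV01|-|Service Control Manager/7045;EvilSvc installed",
    "1586822520|REG|FILESRV01|-|M... HKLM/Software/Microsoft/Windows/CurrentVersion/Run",
]
hits = list(filter_events(events, keyword="evilsvc"))
```

The point isn't the code itself...it's that a pivot term (a service name, a file name, a time window) pulls matching events from every data source at once, rather than from each source in isolation.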

Note: If you have a standard extraction process, or if you mount images using a means that makes the files accessible, all of this can be automated with something as simple as a batch or shell script.
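As a sketch of that idea, the following Python snippet assembles the command lines from above for a single system.  It defaults to a 'dry run' that just returns the command strings, so the tool and hive paths (which are assumptions here) can be verified before actually shelling out:

```python
import subprocess  # only used in the non-dry-run case

CASE = r"d:\case"  # assumed extraction directory

def build_commands(case=CASE, dry_run=True):
    """Assemble the timeline-extraction command lines for one system."""
    cmds = [
        ["regtime", "-m", "HKLM/Software/", "-r", rf"{case}\software"],
        ["regtime", "-m", "HKLM/System/", "-r", rf"{case}\system"],
        ["rip", "-r", rf"{case}\system", "-p", "shimcache_tln"],
        ["rip", "-r", rf"{case}\amcache.hve", "-p", "amcache_tln"],
    ]
    if dry_run:
        return [" ".join(c) for c in cmds]
    # In a live run, append each command's stdout to reg_events.txt
    with open(rf"{case}\reg_events.txt", "ab") as out:
        for c in cmds:
            out.write(subprocess.run(c, capture_output=True).stdout)

print("\n".join(build_commands()))
```

Wrap that in a loop over extracted systems and you have a documented, repeatable process that runs the same way every time.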

Once I get to this point, the actual analysis begins...because parsing and display are not "analysis".  Getting one value, or the LastWrite time of one key, is not "analysis".  For me, analysis is an iterative process, and what I described above is just the first step.  From there, I'll keep a viewer handy (usually MiTeC's WRR) and a browser open, allowing me to dig in deeper and research items of interest.  This way, I can see values for keys for which there are not yet RegRipper plugins, such as when a new malware variant creates keys and values, or when a threat actor creates or modifies keys.  When I do find something new like that (because the process facilitates finding something new), it then becomes a RegRipper plugin (or a modification to an existing plugin), decoration via the eventmap.txt file, etc.  The point is that whatever 'new thing' is developed gets immediately baked back into the overall process.

For example, did a threat actor disable Windows Defender, and if so how? Via a batch file?  No problem, we can use RegRipper to check the Registry keys and values.  Via GPO?  Same thing...use RegRipper.  Via an sc.exe command?  No problem...we can use RegRipper for that, as well.

What about open source intrusion intel, such as this FireEye blog post?  The FireEye blog post is rich with intrusion intel that can be quickly and easily turned into plugins and event decoration, so that whenever those TTPs are visible in the data, the analyst is immediately notified, providing pivot points and making analysis vastly more consistent and efficient.

Sunday, April 12, 2020

Registry Analysis

When you see the words, "Registry analysis", what comes to mind? 

Okay, now...what actually happens when we 'do' this thing we call "Registry analysis"?  More often than not, what this refers to manifests itself as opening a Registry hive file in a viewer, "looking around", or maybe doing some searches or sorting based on dates.  But is that really Registry analysis, or is it simply parsing and viewing?

Often, when you get right down to it and peel back all of the layers (like an onion), "analysis" (in general) from an operational perspective manifests as:
  • Get a data source, often based on a list provided by an external resource
  • Open that data source in a viewer, or parse it and open the output in another application (Excel)
  • Locate specific items, again often based on an externally-provided list; this can include conducting a search based on some items (time) or keywords
  • Do the same with another data source
  • Lather, rinse, repeat
For example, an analyst might extract the MFT, parse it via a tool such as AnalyzeMFT or MFTECmd, search for specific files, or for files created or modified during a specific time frame, and then manually transpose that information into a spreadsheet.  If other data sources are then examined, the process is repeated, and as such, the overall approach to getting to the point of actually conducting analysis (i.e., looking at the output from more than one data source) is very manual, very time intensive, and as a result, very expensive.

To that point, 'cost' isn't just associated with time and expense.  It's also directly tied in with what's included in the analyst's final spreadsheet; more specifically, the approach lends itself to important artifacts and TTPs being missed.  OSINT regarding a threat actor group, based on analysis of the malware associated with the group, most often focuses on IOCs and does not account for TTPs and behaviors (i.e., how the malware and tools are used...).  This includes not just the threat actor's behaviors on the system, but also the artifacts that result from the threat actor's interactions with the ecosystem in which they're operating.  OSINT is not intrusion intelligence, and if the analyst uses that OSINT as the totality of what they look for, rather than just the beginning, then critical data is going to be missed.

One way of overcoming this is the use of automation to consume and correlate multiple data sources simultaneously, viewing them in relation to each other.  Some have looked at automation tools such as log2timeline or plaso, but have experienced challenges with respect to how the tools are used. Perhaps a better approach is a targeted, 'sniper forensics' approach, rather than the usual "spray and pray" approach.

For many analysts, what "Registry analysis" means is that they may have a list of "forensically relevant" items (i.e., keys and values), perhaps in a spreadsheet, that they use to manually peruse hive files.  As such, they'll open a hive in a viewer and use the viewer to navigate to specific keys and values (Eric's Registry Explorer makes great use of bookmarks).  This list of "forensically relevant" items within the Registry may be based on lists provided to the analyst, rather than developed by the analyst, and as such, may not be complete.  In many cases, these lists are stagnant, in that once they are received, they are neither extended, nor are new items (if determined) shared back with the source.

Rather than maintaining a list of keys and values that are "forensically relevant", analysts should instead consider what is "forensically relevant" based on the analysis goals of the case, and employ a process that allows them to not only find the items they're looking for, but to also 'see' new things.  For example, I like to employ a process that creates a timeline of activity, using Registry key LastWrite times, as well as parsing specific values based on their associated time stamps.  This process correlates hive files, as well...doing this using the Software hive, user's NTUSER.DAT and USRCLASS.DAT, as well as the AmCache.hve file, all in combination, can be extremely revealing.  I've used this several times to 'see' new things, such as what happens on a system when a user clicks on an ISO email attachment.  Viewing all of the 'events' from multiple sources, side-by-side, in a consolidated timeline provides a much more complete picture and a much more granular view than the traditional "manually add it to a spreadsheet" approach.

Adding additional sources...MFT, Windows Event Logs, etc...can be even more revealing of the overall TTPs, than simply viewing each of these data sources in isolation.
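Mechanically, the consolidation is simple: each source is already in the 5-field TLN format, so the events just need to be merged and ordered by the epoch time in the first field.  A Python sketch (the event lines are made up for illustration):

```python
import heapq

def merge_timelines(*sources):
    """Merge per-source TLN event lists into one consolidated
    timeline, ordered by the epoch time in the first field."""
    epoch = lambda line: int(line.split("|", 1)[0])
    keyed = (sorted(src, key=epoch) for src in sources)
    return list(heapq.merge(*keyed, key=epoch))

# Hypothetical events from three different hives
software = ["1586822400|REG|WIN10|-|M... TypedPaths key"]
ntuser   = ["1586822410|REG|WIN10|user|UserAssist - cmd.exe launched"]
amcache  = ["1586822395|AmCache|WIN10|-|evil.exe first seen"]
timeline = merge_timelines(software, ntuser, amcache)
```

Side by side, the three events above tell a small story that none of them tells alone...which is the whole argument for the consolidated timeline.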

Sunday, April 05, 2020

Going Beyond

As an industry and community, we need to go beyond...go beyond looking at single artifacts to indicate or justify "evidence", and we need to go beyond having those lists of single artifacts provided to us.  Lists, such as the SANS DFIR poster of artifacts, are a good place to start, but they are not intended to be the end-all.  And we need to go beyond our own analysis, in isolation, and discuss and share what we see with others.

Here's a good example...in this recent blog post, the author points to Prefetch artifacts as evidence of file execution.  Prefetch artifacts are a great source of information, but (a) they don't tell the full story, and (b) they aren't the only artifact that illustrates "file execution".  They're one of many.  While it's a good idea to start with one artifact, we need to build on that one artifact and create (and pursue) artifact constellations.

This post, and numerous others, tend to look at artifacts in isolation, and not as part of an overall artifact constellation.  Subsequently, attempts at analysis fall apart (or simply fall short) when that one artifact, the one we discussed in isolation, is not present.  Consider Prefetch files...yes, they are great sources of information, but they are not the only source of information, and they are not present by default on Windows servers. 

And, no, I do not think that one blog post speaks for the entire community...not at all.  Last year, I took the opportunity to explore the images provided as part of the DefCon 2018 CTF.  I examined two of the images, but it was the analysis of the file server image that I found most interesting.  Rather than attempting to answer all of the questions in the CTF (CTF questions generally are not a good representation of real world engagements), I focused on one or two questions in particular.  In the case of the file server, there was a question regarding the use of an anti-forensics tool.  If you read my write-up, you'll see that I also reviewed three other publicly available write-ups...two relied on a UserAssist entry to answer the question, and the third relied on a Registry value that provided information about the contents of a user's desktop.  However, none of them (and again, these are just the public write-ups that I could find quickly) actually determined if the anti-forensics tool had been used, if the functionality in question had been deployed.

Wait...what?  What I'm saying is that one write-up had answered the question based on what was on the user's desktop, and the other two had based their findings on UserAssist entries (i.e., that the user had double-clicked on an icon or program on their desktop).  However, none of them had determined if anything had actually been deleted.  I say this because there was also evidence that another anti-forensics tool (CCleaner) had been of interest to the user, as well.

My point is that when we look at artifacts in isolation from each other, we only see part of the picture, and often a very small part.  If we only look at indications of what was on the user's desktop, that doesn't tell us if the application was ever launched.  If we look at artifacts of program execution (UserAssist, Prefetch, etc.), those artifacts, in and of themselves, will not tell us what the user did once the application was launched; it won't tell us what functionality the user employed, if any.

Here's another way to look at it.  Let's say the user has CCleaner (a GUI tool) on their desktop.  Looking at just UserAssist or Prefetch...or, how about UserAssist and Prefetch...artifacts, what is the difference between the user launching CCleaner and deleting stuff, and launching CCleaner, waiting and then closing it?

None.  There is no difference. Which is why we need to go beyond just the initial, easy artifacts, and instead look at artifact clusters or constellations, as much as possible, to provide a clear(er) picture of behavior.  This is due to the nature of what we, as examiners, are looking at today.  None of the incidents we're looking at...targeted threats/APTs, ransomware/crimeware, violations of acceptable use policies, insider threats, etc...are based on single events or records. 

Consider ransomware...more often than not, these events were looked at as, "files were encrypted".  End of story.  But the reality is that in many cases, going back years, ransomware incidents involved much more than just encrypting files.  Threat actors were embedded within environments for weeks or months before ever encrypting a file, and during that time they were collecting information and modifying the infrastructure to meet their needs.  I say "were", but "still are" applies equally well.  And we've seen an evolution of this "business model" over the past few months, in that we know data was exfil'd during the time the actor was embedded within the infrastructure...not due to our analysis, but because the threat actor released it publicly, in order to "encourage" victims to pay.  A great deal of activity needs to occur for all of this to happen...settings need to be modified, tools need to be run, data needs to be pulled back to the threat actor's environment, etc.  And because these actions occur over time, we cannot simply look at one, or a few, artifacts in isolation and expect to see the full picture (or as full a picture as possible).

Dr. Ali Hadi recently authored a pair of interesting blog posts on the topic of USB devices (here, and here).  In these posts, Dr. Hadi essentially addresses the question of, how do we go about performing our usual analysis when some of the artifacts in our constellation are absent?

Something I found fascinating about Dr. Hadi's approach is that he's essentially provided a playbook for USB device analysis.  While he went back and forth between two different tools, both of his blog posts provide sufficient information to develop that playbook in either tool.  For example, while Dr. Hadi incorporated the use of Registry Explorer, all of the artifacts (as well as others) can also be derived via RegRipper plugins.  As such, you can create a RegRipper profile of those plugins, and then run them automatically against the data you've collected, automating the extraction of the necessary data.  Doing so means that while some things may be missing, others may not, and analysts will be able to develop a more complete picture of activity, and subsequently, more accurate findings.  And automation will reduce the time it takes to collect this information, making analysis more efficient, more accurate, and more consistent across time, analysts, etc.

Okay, so what?  Well, again...we have to stop thinking in isolation.  In this case, it's not about just looking at artifact constellations, but it's also about sharing what we see and learn with other analysts.  What one analyst learns, even the fact that a particular technique is still in use, is valuable to other analysts, as it can be used to significantly decrease their analysis time, while at the same time increasing accuracy, efficiency, and consistency. 

Let's think bigger picture...are we (DFIR analysts) the only ones involved?  In today's business environment, that's highly unlikely.  Most things of value to a DFIR analyst, when examined from a different angle, will also be valuable to a SOC analyst, or an EDR/EPP detection engineer.  Here's an example...earlier this year, I read that a variant of Ryuk had been analyzed and found to contain code for deploying Wake-on-LAN packets in order to increase the number of systems it could reach, and encrypt. As a result, I wrote a detection rule to alert when such packets were found originating from a system; the point is that something found by malware reverse engineers could be effectively employed by SOC analysts, which in turn would result in more effective response from DFIR analysts.
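To illustrate the idea (this is a sketch of the detection logic, not the actual rule I wrote), a Wake-on-LAN 'magic packet' is simply six 0xFF bytes followed by the target's MAC address repeated 16 times, typically sent via UDP broadcast.  Checking a payload for that structure is straightforward:

```python
def is_wol_magic_packet(payload: bytes) -> bool:
    """Return True if the payload contains a Wake-on-LAN 'magic packet':
    six 0xFF bytes followed by the same 6-byte MAC repeated 16 times."""
    idx = payload.find(b"\xff" * 6)
    if idx < 0 or len(payload) < idx + 6 + 16 * 6:
        return False
    body = payload[idx + 6 : idx + 6 + 16 * 6]
    mac = body[:6]
    # Every 6-byte chunk in the body must equal the target MAC
    return all(body[i : i + 6] == mac for i in range(0, 96, 6))

# Hypothetical target MAC, for illustration only
mac = bytes.fromhex("aabbccddeeff")
packet = b"\xff" * 6 + mac * 16
```

A SOC analyst could hang an alert off of that structure on workstation-originated traffic, which is not something workstations normally send.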

We need to go beyond.  It's not about looking at artifacts in isolation, and it's not about conducting our own analysis in isolation.  The bad guys don't do it...after all, we track them as groups.  So why not pursue all aspects of DFIR as a 'group'; why not look at groups of artifacts (constellations), and share our analysis and findings not just with other DFIR analysts, but other members of our group (malware RE, threat intel analyst, SOC analyst, etc.), as well?