Monday, April 03, 2023

On Validation

I've struggled with the concept of "validation" for some time; not the concept in general, but as it applies specifically to SOC and DFIR analysis. I've got a background that includes technical troubleshooting, so "validation" of findings, or the idea of "do you know what you know, or are you just guessing", has been part of my thought processes going back for about...wow...40 years.

Here's an example...when setting up communications during Team Spirit '91 (military exercises in South Korea), my unit had a TA-938 "hot line" with another unit. This is exactly what it sounds like...it was a direct line to that other unit, and if one end was picked up, the other end would automatically ring. Yes, a "Bat phone". Just like that. Late one evening, I was in the "SOC" (our tent with all of the communications equipment) and we got a call that the hot line wasn't working. We checked connections, checked and replaced the batteries in the phone (the TA-938 phones took 2 D cell batteries, both facing the same direction), etc. There were assumptions and accusations thrown about as to why the phone wasn't working as my team and I worked through the troubleshooting process. We didn't work on assumptions; instead, we checked, rechecked, and validated everything. In the end, we found nothing wrong with the equipment on our end; however, the following day, we did find out what the issue was - at the other end, there was only one Marine in the tent, and that person had stepped out for a smoke break during the time of the attempted calls.

We could have just said, "oh, it's the batteries...", and replaced them...and we'd have had the same issue all over again. Or, we could have just stated, "...the equipment on the other end was faulty/broken...", and we would not have made a friend of the maintenance chief from that unit. There were a lot of assumptions we could have made, conclusions we could have jumped to...and we'd have been wrong. We could have stated findings that were trusted, and that resulted in decisions being made, assets and resources being allocated, etc., all for the wrong reasons. The end result is that my team and I (especially me, as the officer) would have lost credibility, along with the trust and confidence of our fellow team members and our commanding officer. As it was, validating our findings led to the right decisions being made, which were again validated during the exercise after action meetings.

Okay, so jump forward 32 years to present day...how does this idea of "validation" apply to SOC and DFIR analysis? I mean, this seems like such an obvious thing, right? Of course we validate our findings...but do we, really?

Case Study #1
A while back, I attended a conference during which one of the speakers walked through a PCI investigation they'd worked on. As the speaker walked through their presentation, they talked about how they'd used a single artifact, a ShimCache entry for the malware, to demonstrate program execution. This single artifact was used as the basis of the finding that the malware had been on the system for four years.

For those readers not familiar with PCI forensic investigations, the PCI Council specifies a report format and "dashboard", where the important elements of the report are laid out in a table at the top of the report. One of those elements is the "window of compromise", or the time between the original infection and when the breach was identified and remediated. Many merchants track the number of credit card transactions they process on a regular basis, not only during periods of "regular" spending habits but also during off-peak and peak/holiday seasons, and as a result, the "window of compromise" can give the merchant, the bank, and the brand an approximate number of potentially compromised credit card numbers. As you'd imagine, given any average, the number of compromised credit card numbers would be much greater over a four-year span than it would be for, say, a three-week "window of compromise". 
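
To put purely hypothetical numbers to that: a merchant averaging 500 card transactions a day would be looking at roughly 10,000 potentially exposed card numbers over a three-week window, versus well over 700,000 over a four-year window.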

As you'd expect, analysts submitting reports rarely, if ever, find out the results of their work. I was a PCI forensic analyst for about three and a half years, and neither I nor any of my teammates (that I'm aware of) heard what happened to a merchant after we submitted our reports. Even so, I cannot imagine that a report with a "window of compromise" of four years was entirely favorable.

But that begs the question - was the "window of compromise" really four years? Did the analyst validate their finding using multiple data sources? Something I've seen multiple times is that malware is written to the file system and then "time stomped", often using time stamps retrieved from a native system file. This way, the $STANDARD_INFORMATION attribute time stamps from the file's $MFT record appear to indicate that the file is "long lived", and has existed on the system for quite some time. This time stomping occurs before the Application Compatibility functionality of the Windows operating system creates an entry for the file, so the last modification time recorded for the entry is the one that's been "time stomped". As a result, a breach that occurred in May 2013 and was discovered three weeks later ends up with the malware being reported as having been placed on the system in 2009. What impact this had, or might have had, on the merchant is something we'll never know.
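
One way to check for this during analysis is to compare the $STANDARD_INFORMATION time stamps against the $FILE_NAME time stamps from the same $MFT record, as the $FILE_NAME values are much harder to modify from user space. The sketch below assumes you've already parsed the MFT with a tool of your choice and have both sets of time stamps available (the record fields shown are hypothetical, not a real parser API); a mismatch is a lead to pull on, not proof by itself.

from datetime import datetime, timedelta

def possible_time_stomp(si_modified: datetime, fn_created: datetime,
                        tolerance: timedelta = timedelta(seconds=1)) -> bool:
    """Flag a file whose $STANDARD_INFORMATION last-modified time falls
    well before its $FILE_NAME creation time.

    $SI time stamps are easily modified from user space; $FN time stamps
    generally are not, so this mismatch is worth a closer look, but is
    not proof of time stomping on its own."""
    return si_modified + tolerance < fn_created

# Hypothetical usage against records produced by your MFT parser of choice;
# the .si_modified / .fn_created / .path fields are assumptions, not a real API.
# for rec in mft_records:
#     if possible_time_stomp(rec.si_modified, rec.fn_created):
#         print(f"Possible time stomping: {rec.path}")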

Misinterpreting ShimCache entries has apparently been a time-honored tradition within the DFIR community. For a brief walk-through (with reference links) of ShimCache artifacts, check out this blog post.

Case Study #2
In the spring of 2021, analysts were reporting, based solely on EDR telemetry, that threat actors within their infrastructure were using the PowerShell Set-MpPreference cmdlet to "disable Windows Defender". This organization, like many others, was tracking such things as control efficacy (the effectiveness of controls) in order to make decisions regarding actions to take, and where and how to allocate resources. However, these analysts were not validating their findings; they were not checking the endpoints themselves to determine if Windows Defender had, in fact, been disabled, and if the threat actor's attempts had actually impacted the endpoints. As it turns out, that organization had a policy at the time of disabling Windows Defender on installation, as they had chosen another option for their security stack. As such, stating in tickets that threat actors were disabling Windows Defender, without validating these findings, led to quite a few questions, and impacted the credibility of the analysts.
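
Validating a finding like this means looking at the endpoint itself, not just the telemetry. As one illustration (a sketch, not the only or authoritative way to do it), the snippet below reads two registry values that commonly reflect whether Defender has been disabled; exact locations and precedence vary by Windows version and applied policy, and querying Defender's own status (e.g., via Get-MpComputerStatus) would be another data point.

import winreg

# Registry values that commonly reflect whether Defender has been disabled;
# locations and precedence vary by Windows version and applied policy.
CHECKS = [
    (r"SOFTWARE\Policies\Microsoft\Windows Defender", "DisableAntiSpyware"),
    (r"SOFTWARE\Microsoft\Windows Defender", "DisableAntiSpyware"),
]

def defender_disabled_by_registry() -> bool:
    """Return True if any of the checked values is set to 1."""
    for subkey, value_name in CHECKS:
        try:
            with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey) as key:
                value, _ = winreg.QueryValueEx(key, value_name)
                if value == 1:
                    return True
        except (FileNotFoundError, PermissionError):
            # Key or value not present (or not readable); by itself
            # that's not conclusive either way.
            continue
    return False

if __name__ == "__main__":
    print("Defender disabled per registry:", defender_disabled_by_registry())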

Artifacts As Composite Objects
Joe Slowik spoke at RSA in 2022, describing indicators, or technical observables, as "composite objects". This is an important concept in DFIR and SOC analysis, as well, and not just in CTI. We cannot base our findings on a single artifact, treating it as a discrete, atomic indicator, such as an IP address just being a location, or tied to a system, or a ShimCache entry denoting time of execution. We cannot view a process command line within EDR telemetry, by itself, as evidence of program execution. Rather, we need to recognize that artifacts are, in fact, composite objects; in his talk, Joe references Mandiant's definition of indicators of compromise, which can help us understand and visualize this concept. 

Composite objects are made up of multiple elements. An IP address is not just a location; it is an observable with context. Where was the IP address observed, when was it used, and how was it used? Was it the source of an RDP login, or of a type 3 (network) logon? If the IP address was the source of a successful login, what username was used? Was the IP address the source of a connection seen in web server or VPN logs? Is it the C2 address? 
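
To make the idea concrete, here's one purely illustrative way to model an IP address as a composite object rather than an atomic value, carrying its context along with it; the field names below are assumptions for the sake of the example, not a standard schema.

from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class IPIndicator:
    """An IP address treated as a composite object: the value itself plus
    the context in which it was observed."""
    address: str
    observed_at: datetime            # when it was seen
    source: str                      # where: VPN logs, web server logs, EDR, etc.
    how_used: str                    # e.g., RDP source, type 3 logon source, C2
    username: Optional[str] = None   # account used, if it was the source of a logon
    related: List[str] = field(default_factory=list)  # other indicators in the constellation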

If we consider a ShimCache entry, we have to remember that (a) the entry itself does NOT explicitly demonstrate program execution, and (b) the time stamp is mutable. That is, what we see could have been modified before we saw it. For example, we often see analysts hold up a ShimCache entry as evidence of program execution, often as the sole indicator. We have to understand and remember that the time stamp associated with a ShimCache entry is the last modification time of the file, taken from the $STANDARD_INFORMATION attribute within the MFT. I've seen several instances where the file was placed on the system and then time stomped (the time stamp is easily mutable) before the entry was added to the Application Compatibility database. This is all in addition to understanding that an entry in the ShimCache does NOT mean that the file was executed. Note that the same is true for AmCache entries, as well.

We can validate indicators of compromise by including them in constellations, viewing them alongside other associated indicators; doing so increases fidelity and brings valuable context to our analysis. We see this illustrated when performing searches for PCI data within acquired images; if you just search for a string of 16 digits starting with "4", you're going to get a LOT of results. If you look for strings based on a bank identification number (BIN) and length, and check whether they pass the Luhn check, you're still going to get a lot of results, but not as many. If you also search for the characteristics associated with track 1 and track 2 data, your search results will be a smaller set, but with much higher fidelity, because we've added layers of context. 
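
For illustration, here's a minimal sketch of that layering in Python; the fixed 16-digit pattern and the "4" prefix are simplifications (real PANs vary in length, and a real search would use the merchant's actual BIN ranges and look for track 1/track 2 framing as well).

import re

def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:       # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def candidate_pans(data: str, bin_prefixes: tuple = ("4",)) -> list:
    """Find 16-digit runs that start with a given BIN prefix and pass the
    Luhn check; each added layer cuts down the false positives."""
    return [m.group() for m in re.finditer(r"\d{16}", data)
            if m.group().startswith(bin_prefixes) and luhn_ok(m.group())]

# Example: "4111111111111111" is a well-known test number that passes the check.
print(candidate_pans("foo 4111111111111111 bar 4111111111111112"))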

Cost
So the question becomes, what is the cost of validating something versus not validating it? What is the impact or result of either? This seems on the surface like it's a silly question, maybe even a trick question. I mean, it looks that way when I read back over the question after typing it in, but then I think back to all the times I've seen when something hasn't been validated, and I have to wonder, what prevented the analyst from validating their finding, rather than simply basing their finding on a single artifact, out of context?

Let's look at a simple example...we receive an alert that a program executed, based on SIEM data or EDR telemetry. This alert can be based on elements of the command line, process parentage, or a combination thereof. Let's say that based on a number of factors and reliable sources, we believe that the command line is associated with malicious activity.

What do you report?

Do you report that this malicious thing executed, or do you investigate further to see if the malicious thing really did execute, and executed successfully? How would we go about investigating this, and what data sources would we look to? 
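
As one way of framing the exercise, here's a rough sketch (not an exhaustive or authoritative list) of the kinds of corroborating artifacts an analyst might look to before reporting execution; the availability of each varies by Windows version and configuration, and each has its own caveats.

# A sketch of corroborating artifacts to check before reporting that
# something executed; the artifact names are examples, not a complete list.
EXECUTION_SOURCES = [
    "Prefetch file for the executable (workstation SKUs, where enabled)",
    "UserAssist entry (if launched interactively via Explorer)",
    "Process creation events (Security 4688 or Sysmon event ID 1, if configured)",
    "SRUM application resource usage",
    "Evidence of impact: files written, Registry changes, network connections",
]

def describe_confidence(corroborating: int) -> str:
    """Toy scoring: the more independent sources corroborate execution
    (and impact), the stronger the reported finding."""
    if corroborating == 0:
        return "command line observed only; execution not validated"
    if corroborating == 1:
        return "weak: a single corroborating artifact"
    return "stronger: multiple independent artifacts corroborate execution"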

As you're thinking about this, as you're walking through this exercise, something I'd like you to keep in mind is that question: what would prevent you from actually examining those data sources you identify? Is there some "cost" (effort, time, other resources) that prevents you from doing so?

4 comments:

Andreas said...

Thanks for the article.

Validation is needed during preparation (what do my tools really do?) and after using them, as you described, to understand the findings and validate them. Context matters, and awareness is needed to challenge assumptions and the output of the tools:
- what is the context of my finding, and does it make sense? Understand the constellation and do a "plausibility check". Based on other facts, is it possible that something was executed a few years ago?
- do I cover all aspects needed to answer my questions? (what else could I use to prove or disprove something?)
- can I disprove a finding because something that should be there to confirm it is not there?
- is a finding related to the case at hand, or just a coincidence during the investigation?

As you already wrote, and it's an important reminder: program execution artefacts tell only part of the story. The program could have been started but failed to execute successfully, or could have been started but nothing was done with it.

Timestamps are one of the things to look at, as are program executions (legitimate binaries, Windows updates) running at the same time as the malware, which can lead to confusing conclusions; the same goes for networks, where legitimate network indicators get mixed up with the network behaviour of the malware. Validate the relationship between indicators.

> As you're thinking about this, as you're walking through this exercise, something I'd like you to keep in mind is that question, what would prevent you from actually examining those data sources you identify? Is there some "cost" (effort, time, other resources) that prevent you from doing so?

In addition to the effort and time needed for verification, the lack of knowledge/training and awareness should be considered, too. One must understand the ShimCache limitations when writing about program execution, and one must understand the timestamp handling around various artefacts to correctly interpret the results. And analysts must be aware of all of these validation questions. It's a question of training and understanding, of applying the relevant questions to challenge findings.

In the end, missing validation will result in wrong or missing actions, which means there is ultimately a cost to not doing it.

H. Carvey said...

Andreas,

Thanks for the comment! That's a lot to take in!

Andreas said...

These validation questions remind me of one case, where an attacker placed a Run key entry using a legitimate graphics card product's persistence and binary naming. I overlooked that for a long time because I missed the most important question: is that graphics product installed on that system by default?

Knowing the setup, the context, and the tooling used by default is of great value in validating findings.

H. Carvey said...

I saw the same thing with DLL side-loading PlugX back in the day...Symantec, Sophos, or Kaspersky EXEs were used (dropped in a ProgramData folder), but no one ever looked to see if that application was in use within the environment...