Monday, April 24, 2023

On Validation, pt III

From the first two articles (here and here) on this topic arises the obvious question...so what? Not
validating findings has worked well for many, to the point that the lack of validation is not recognized. After all, who notices that findings were not verified? The peer review process? The manager? The customer? The sheer pervasiveness of training materials and processes that focus solely on single artifacts in isolation should give us a clear understanding that validating findings is not a common practice. That is, if the need for validation is not pervasive in our industry literature, and if no one is asking the question, "...but how do you know?", then what leads us to assume that validation is part of what we do?

Consider a statement often seen in ransomware investigation/response reports up until about November 2019; that statement was some version of "...no evidence of data exfiltration was observed...". However, did anyone ask, "...what did you look at?" Was this finding (i.e., "...no evidence of...") validated by examining data sources that would definitely indicate data exfiltration, such as web server logs, or the BITS Client Event Log? Or how about indirect sources, such as unusual processes making outbound network connections? Understanding how findings were validated is not about assigning blame; rather, it's about truly understanding the efficacy of controls, as well as risk. If findings such as "...data was not exfiltrated..." are not validated, what happens when we find out later that it was? More importantly, if you don't understand what was examined, how can you address issues to ensure that these findings can be validated in the future?
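To make that question concrete, one of the data sources mentioned above, the BITS Client Event Log, can be reviewed fairly quickly. Below is a minimal Python sketch, assuming the python-evtx package and a collected copy of the Microsoft-Windows-Bits-Client%4Operational.evtx file; the file name and the URL filter are illustrative only, not a complete exfiltration check:

```python
# Minimal sketch: list URLs recorded in the BITS Client Operational log,
# as one data point for validating a "no evidence of exfiltration" finding.
# Assumes the python-evtx package is installed and the .evtx file has been
# collected from the endpoint; the log file name below is illustrative.
import re
from Evtx.Evtx import Evtx

LOG = "Microsoft-Windows-Bits-Client%4Operational.evtx"
url_pattern = re.compile(r'https?://[^\s<"]+', re.IGNORECASE)

with Evtx(LOG) as log:
    for record in log.records():
        xml = record.xml()
        for url in url_pattern.findall(xml):
            # Each hit is a transfer URL to be reviewed against known-good
            # update/CDN hosts before the finding is reported.
            print(record.timestamp().isoformat(), url)
```

The point isn't this particular script; it's that "what did you look at?" should have a concrete, reviewable answer.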

When we ask the question, "...how do you know?", the next question might be, "...what is the cost of validation?" And at the same time, we have to consider, "...what is the cost of not validating findings?"

The Cost of Validation
In the previous blog posts, I presented "case studies", or examples of things that should be considered in order to validate findings, particularly in the second article. When considering the 'cost' of validation, what we're really asking is: why aren't these steps performed, and what's preventing the analyst from taking the steps necessary to validate the findings?

For example, why would an analyst see a Run key value and not take the steps to validate that it actually executed, including determining whether that Run key value was disabled? Or parse the Shell-Core Event Log and perhaps see how many times it may have executed? Or parse the Application Event Log to determine if an attempt to execute the program pointed to by the value resulted in an application crash? In short, why simply state that program execution occurred based on nothing more than observing the Run key value contents?
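To illustrate just the "was it disabled?" portion of that, here's a minimal live-triage sketch in Python using the standard winreg module, pairing each Run key value with its StartupApproved entry. The interpretation of the flag bytes (a leading 0x02 meaning enabled) is a commonly cited convention rather than documented behavior, so treat it as an assumption:

```python
# Minimal live-triage sketch (live Windows endpoint, built-in winreg module):
# pair each Run key value with its StartupApproved entry, so that "program
# execution" isn't inferred from the Run key value alone.
import winreg

RUN = r"Software\Microsoft\Windows\CurrentVersion\Run"
APPROVED = r"Software\Microsoft\Windows\CurrentVersion\Explorer\StartupApproved\Run"

def enum_values(root, path):
    """Return {value_name: data} for a key, or {} if the key is absent."""
    vals = {}
    try:
        with winreg.OpenKey(root, path) as key:
            i = 0
            while True:
                try:
                    name, data, _type = winreg.EnumValue(key, i)
                except OSError:
                    break
                vals[name] = data
                i += 1
    except OSError:
        pass
    return vals

run = enum_values(winreg.HKEY_CURRENT_USER, RUN)
approved = enum_values(winreg.HKEY_CURRENT_USER, APPROVED)

for name, cmd in run.items():
    flag = approved.get(name)
    # Convention (hedged): StartupApproved data beginning with 0x02 is
    # generally treated as enabled; other leading bytes as disabled.
    state = "unknown"
    if isinstance(flag, bytes) and flag:
        state = "enabled" if flag[0] == 0x02 else "disabled"
    print(f"{name}: {cmd} -> {state}")
```

Even this small check is only one step; the event log parsing described above would still be needed before stating that the program actually ran.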

Is it because taking those steps is "too expensive" in terms of time or effort, and would negatively impact SLAs, whether explicit or self-inflicted? Does it take so long to do so that the ticket or report would not be issued in what's considered a "timely" manner?

Could you issue the ticket or report in order to meet SLAs, make every attempt to validate your findings, and then issue an updated ticket when you have the information you need?

The Cost of Not Validating
In our industry, an analyst producing a ticket or report based on their analysis is very often well abstracted from the final effects of the decisions made and resources deployed due to their findings. What this means is that whether in an internal/FTE or consulting role, the SOC or DFIR analyst may never know the final disposition of an incident and how it was impacted by their findings. That analyst will likely never see the meeting where someone decides either to do nothing, or to deploy a significant staff presence over a holiday weekend.

Let's consider case study #1 again, the PCI case referenced in the first post. Given that it was a PCI case, it's likely that the bank notified the merchant that they were identified as part of a common point of purchase (CPP) investigation, and required a PCI forensic investigation. The analyst reported their findings, identifying the "window of compromise" as four years, rather than the three weeks it should have been. Many merchants have an idea of the number of transactions they send to the brands on a regular basis...for smaller merchants, that may be per month, and for larger merchants, per week. They also have a sense of the "rhythm" of credit card transactions; some merchants have more transactions during the week and fewer on the weekends. The point is that when the PCI Council decides on a fine, it takes the "window of compromise" into account.

During another incident in the financial sector, a false positive was not validated, and was reported as a true positive. This led to the domain controller being isolated, which ultimately triggered a regulatory investigation.

Consider this...what happens when you tell a customer, "OMGZ!! You have this APT Umpty-Fratz malware running as a Windows service on your domain controller!!", only to later find out that every time the endpoint was restarted, the service failed to start (based on "Service Control Manager/7000" events, Windows Error Reporting events, application crashes, etc.)? The first message to go out sounds really, REALLY bad, but the validated finding says, "yes, you were compromised, and yes, you do need a DFIR investigation to determine the root cause, but for the moment, it doesn't appear that the persistence mechanism worked."
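A quick, hedged sketch of how that validation might look, again assuming python-evtx and a collected copy of System.evtx; the service name below is a placeholder for whatever the "APT Umpty-Fratz" service was actually called:

```python
# Minimal sketch: find "Service Control Manager" event ID 7000 records
# (service failed to start) that mention the suspect service, as a check
# on whether the persistence mechanism ever actually worked.
# Assumes python-evtx is installed and System.evtx has been collected;
# SERVICE_NAME is a hypothetical placeholder.
from Evtx.Evtx import Evtx

LOG = "System.evtx"
SERVICE_NAME = "UmptyFratzSvc"

with Evtx(LOG) as log:
    for record in log.records():
        xml = record.xml()
        if ">7000</EventID>" in xml and SERVICE_NAME in xml:
            # Each hit is a boot where the service failed to start;
            # correlate with reboot times before reporting on persistence.
            print(record.timestamp().isoformat())
```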

Conclusion
So, what's the deal? Are you validating findings? What say you?

4 comments:

Anonymous said...

In the end, it really comes down to the case, the value of the particular artifact to the overall findings, and the client's goals/resources.

PCI is a very specific case type, but in a ransomware case or an intrusion where destruction of evidence were expected, the responsible thing to do is just to caveat the report and provide confidence levels for high-level statements.

With this series I haven't disagreed with any factual statements you've made, but I think what we should be exploring isn't whether or not we should validate: it's the circumstances where it's definitely needed/not needed and, most importantly, the grey areas where we all need better guidance. I.e., when an analyst is trying to force a story on the data rather than trying to strike at ground truth.

H. Carvey said...

Anonymous,

I appreciate your comment, but I have to disagree.

"... where destruction of evidence were expected..."

Expected? Yes, I agree, we see it...sometimes. But how often do threat actors really do a good job of actually cleaning up and hiding their activity? To say "where destruction of evidence were expected" is, IMHO, to concede.

"PCI is a very specific case type..."

While some specific activities are required of PCI cases, the cases themselves are really no different from any other cyber crime case.

"... we should be exploring isn’t whether or not we should validate:..."

See, I disagree. We should always attempt to validate, to the point where we can automate a great deal of what's needed to do so: data collection, data parsing and normalization, data decoration and enrichment, and then present the data to the analyst for "analysis".

Determining whether or not to validate opens the door to where we are now, which is an over-reliance on single artifacts on which to base a finding.

Anonymous said...

"opens the door to where we are now, which is an over-reliance on single artifacts on which to base a finding"

Having not seen the actual forensic reports you're critiquing via these case studies, in practice I'm not really convinced that this problem is as widespread as your examples want to indicate. The examples feel a bit forced, a carefully constructed strawman to be taken down by the author. In my comment I was hoping to encourage you to expand away from dogma, and provide the reader with a nuanced framework to navigate their cases by.

In any reports I've written or peer reviewed, "findings" are confidence-grounded statements based on clusters of available artifacts (which appears to be what you mean by validated). If an analyst gave me a draft report where a conclusion or compromise window was being extrapolated from a single artifact, it would be flagged during my review of that report. This could be a maturity thing, where some firms just hadn't automated enough evidence collection/pre-processing to internally hold each other to that standard. Bad firms do exist, so do bad analysts or analysts having a bad day -> but this a systemic failing of our industry does not make. (My cases were almost exclusively multi-system cyber intrusions where a client had already declined full forensic analysis during scoping)

"threat actors really do a good job of actually cleaning up and hiding their activity"

Even when they do a bad job -> a destructive attack itself does enough damage to the evidence to hinder confidence within an investigation. Encrypted prefetch artifacts aren't parsed, volume shadows are deleted to prevent recovery, modified/access times across the whole filesystem are mangled, etc., etc. Beyond that, event logs often do get removed. In these cases analysts do need to stitch a lot of the story together on very shaky evidence; but to your point, it should never be for lack of effort.

Most of the modern forensic triage collection/analysis frameworks attempt a kitchen sink collection/processing approach. If anyone is out there hand jamming forensic triage in 2023 they need to be called out on a lot more than single artifact validation.

Anonymous said...

Not that you need another stark example, but here's a county forensic lab failing to validate findings before someone's life was ruined: https://dailyvoice.com/new-jersey/mercer/news/authorities-drop-charges-after-cybersleuth-proves-ramsey-responder-was-wrongfully-accused/835192/

This is a potential miscarriage of justice caused by an examiner failing to follow up on a Dropbox account.