Sunday, August 16, 2020

Toolmarks: The "How"

 I recently published a post where I discussed DFIR toolmarks, and not long after sharing it on Twitter, someone asked me for a list of resources that describe the "how"; that is, how to go about finding or determining toolmarks for various activities.  The specific use case asked about was malware; how to determine malware toolmarks after running the sample in a sandbox.  The best response I could come up with was Windows Forensic Analysis, 4/e, and I included the caveat that they'd have to learn the material and then apply it.  As they'd already found out, there isn't a great deal of information available related to identifying DFIR toolmarks.

One way to think of toolmarks is as artifact constellations, as these are the "scratches" or "indentations" left behind either as a direct result of the activity, or as a result of the activity occurring within the "eco-system" (i.e., operating system, audit configuration, applications, etc.).

The way I have gone about identifying toolmarks is through the use of timelines, at multiple levels.  What I mean by that is that I will start with a system timeline, using data sources such as the MFT, USN change journal, selected Windows Event Logs, Registry hives, SRUM database, etc.  From there, if I find that a specific user account was used in a manner that likely populated a number of artifacts (i.e., type 10 login as opposed to a type 3 login), I might need to create mini- or micro-timelines, based on user activity (browser history, Registry hives, timeline activity database, etc.).  This would allow me to drill down on specific activity without loosing it amongst the 'noise' of the system timeline.  Throughout the timeline analysis process, there's also considerable enrichment or "decoration" of the data that goes on; for example, the timeline might show that a Registry key had been modified, and decorating the data with the specific value that was created or modified will add context, and likely provide a pivot point for further analysis.  Data can also be decorated using online resources (knowledgebase articles, blog posts, etc.), providing additional insight and context, and even leading to additional pivot points.

The theDFIRReport write-up on Snatch ransomware attacks states that the threat actors "turned off Windows Defender"; however, there's no mention as to how this is accomplished. Now, like many will tell you, there are a number of ways to do this, and that's absolutely true...but it doesn't mean that we walk away from it.

Disabling Windows Defender falls under MITRE ATT&CK subtechnique T1562.001, and even so, doesn't go quite deep enough and address how defenses were impaired.  The data from the analyzed cases [c|sh]ould be used to develop the necessary toolmarks.  For example, was sc.exe or some other tool (GMER?) used to disable the Windows service?  Or was reg.exe or Powershell used to add or modify Registry values, such as disabling some aspect of Defender, or adding exclusions?  Given that the threat actors were found to access systems via RDP, was a graphical means used to disable Defender, such as Defender Control?  Each of these approaches has very different toolmarks, and each of these approaches is trivial to check for and identify, as most collection processes (be they triage or full image acquisition) already collect the necessary data.

So, for me, identifying toolmarks starts with timeline creation and analysis.  However, I do understand that this approach doesn't scale, and isn't something that everyone even wants to do, even if there is considerable value in doing so. As such, the way to scale something like this is to start with timeline analysis, and then incorporate findings from this analysis back into the process, 'baking' it back in to the way you do analysis.  One approach to automating this is to incorporate data decoration into a timeline creation process, so that something learned from one engagement is then used to enrich future engagements. A very simple example of how to do this can be found in eventmap.txt, which "tags" specific Windows Event Log records (based on source/ID pairs) as the Windows Event Logs are processed from the native state and being incorporated into a timeline.

If your data collection process results in data in a standard structure (i.e., acquired image, triage data in a zipped archive, etc.) then you can easily automate the parsing process, even at the lab intake point rather than having the analyst spend time doing it. If your analysts are able to articulate their findings and incorporate those back into the parsing and decoration process via some means, they are then able to share their knowledge with other analysts such that those other analysts don't have to have the same experiences.

Another example of toolmarks can be found in this u0041 post that discusses artifacts associated with the use of smbexec.py, one of the Impacket tools used for lateral movement, similar to PSExec. We can then create a timeline events file (from an acquired image, or from triage data), and add Windows Event Log records to the events file using wevtx.bat, employing data decoration via eventmap.txt.  An additional means of identifying pivot points in analysis would include running Yara rules over the timeline events file, looking for specific lines with the event source and IDs, as well as the specific service name that, per the u0041 blog post, is hard-coded into smbexec.py.  All of this can be incorporated into an automated process that is run via the lab intake team, prior to the data being handed off to the analyst to complete the case.  This would mean that the parsing process was consistent across the entire team of analysts, regardless of who was performing analysis, or what case they received.  It would also mean that anything learned by one analyst, any set of toolmarks determined via an investigation, would then be immediately available to other analysts, particularly those who had not been involved in, nor experienced, that case.  The same can be said for research items such as the u0041 blog post...one analyst might read about it, create the necessary data decoration and enrichment, and boom, they're done.  

Conclusion

Identifying toolmarks is just the first step.  Again, the means I've used is timeline analysis....not creating one big timeline, but creating mini-timelines or "overlays" (think cellophane sheets teachers used to use with overhead projectors) that allow me to see the important items without having to wade through the "noise" of regular system activity (i.e., operating system and application updates, etc.).

Once toolmarks have been identified, the next step is to document and 'bake' them back into the analysis process, so that they are immediately available for future use.

No comments: