Wednesday, February 19, 2025

Lina's Write-up

Lina recently posted on LinkedIn that she'd published another blog post. Her blog posts are always well written, easy to follow, fascinating, and very informative, and this one did not disappoint.

In short, Lina says that she found a bunch of Chinese blog posts and content describing activity that Chinese cybersecurity entities have attributed to what they refer to as "APT-C-40", or the NSA. So, she read through them, translated them, and mapped out a profile of the NSA by overlaying the various write-ups.

Lina's write-up has a lot of great technical information, and like the other stuff she's written, is an enthralling read. Over the years, I've mused with others I've worked with as to whether or not our adversaries had dossiers on us, or other teams, be they blue or red. As it turns out, thanks to Lina, we now know that they do, what those dossiers might look like, and the advantage that the East has over the West.

For me, the best part of the article was Lina's take-aways. It's been about 30 yrs since I touched a Solaris system, so while I found a lot of what Lina mentioned in the article interesting (like how the Chinese companies knew that APT-C-40 were using American English keyboards...), I really found the most value in the lessons that she learned from her review and translation of open Chinese reporting. Going forward, I'll focus on the two big (for me) take-aways:

There is a clear and structured collaboration...

Yeah...about that.

A lot of this has to do with the business models used for DFIR and CTI teams. More than a few of the DFIR consulting teams I've been a part of, or ancillary to, have been based on a utilization model, even the ones that said they weren't. A customer call comes in, and the scoping call results in an engagement of a specific length; say, 24 or 48 hrs, or something like that. The analyst has to collect information, "do" analysis and write a report, eating any time that goes over the scoped time frame, or taking shortcuts in analysis and reporting to meet the timeline. As such, there's little in the way of cross-team collaboration, because, after all, who's going to pay for that time?

In 2016, I wrote a blog post about the Samas (or SamSam) ransomware activity we'd seen to that point. This was based on correlation of data across half a dozen engagements, each worked by a different analyst. The individual analysts did not engage with each other; rather, they simply proceeded through the analysis and reporting of their engagement, and were then assigned to other engagements.

Shortly after that blog post was published, Kevin Strickland published his analysis of another aspect of the attacks; specifically, the evolution of the ransomware itself.

Two years later, additional information was published about the threat group itself, some of which had been included in the original blog post.

The point is that many DFIR teams do not have a business model that facilitates communication across engagements, and as such, analysts aren't well practiced at that kind of large-scale communication. Some teams are better at this than others, but that has a lot to do with the business model and culture of the team itself. 

Overall, there really isn't a great deal of collaboration within teams and organizations, largely because everyone is siloed off by business models; the SOC has a remit that doesn't necessarily align with DFIR, and vice versa; the CTI team doesn't have much depth in DFIR skill sets, and what the CTI team publishes isn't entirely useful on a per-engagement basis to the DFIR team. I've worked with CTI analysts who are very, very good at what they do, like Allison Wikoff (re: Mia Ash), but there was very little overlap between the CTI and IR teams within those organizations.

Now, I'm sure that there are a lot of folks reading this right now who're thinking, "hey, hold on...I/we collaborate...", and that may very well be the case. What I'm sharing is my own experience over the past 25 yrs, working in DFIR as a consultant, in FTE roles, running and working with SOCs, working in companies with CTI teams, etc.

This is an advantage that the East has over the West: collaboration. As Lina mentioned, a lot of the collaboration in the West happens in closed, invite-only groups, so a lot of what is found isn't necessarily shared widely. As a result, those who are not part of those groups don't have access to information or intel that might validate their own findings, or fill in some gaps. Further, those who aren't in these groups have information that would fill in gaps for those who are, but that information can't be shared, nor further developed.

...Western methodologies typically focus on constructing a super timeline...

My name is Harlan, and I'm a timeliner. Not "super timelines"...while I'm a huge fan of Kristinn (heck, I bought the guy a lollipop with a scorpion inside once), I'm a bit reticent to hand over control of my timeline development to log2timeline/plaso. This is due, in part, to knowing where the gaps are, what artifacts the tool parses, and which ones it doesn't. Plaso and its predecessor are great tools, but they don't get everything, particularly not everything I need for my investigations, based on my analysis goals. 
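To make that concrete, here's a minimal sketch of the kind of targeted timelining I'm talking about, once the events I actually care about have already been parsed and selected. The pipe-delimited field layout and the sample events are purely illustrative, not output from any particular tool.

```python
# Minimal sketch: building a targeted micro-timeline from events that
# have already been parsed and hand-selected. The field layout
# (time|source|system|user|description) and the events are illustrative.
from datetime import datetime, timezone

events = [
    # (UTC timestamp, source, system, user, description) -- hypothetical data
    (datetime(2025, 2, 10, 13, 42, 7, tzinfo=timezone.utc), "EVTX", "HOST01", "svc_acct",
     "Security/4624 Type 3 logon"),
    (datetime(2025, 2, 10, 13, 43, 1, tzinfo=timezone.utc), "REG", "HOST01", "svc_acct",
     "Run key added: Updater -> C:\\ProgramData\\u.exe"),
]

def to_timeline(evts):
    """Render events as pipe-delimited timeline lines, oldest first."""
    for ts, src, system, user, desc in sorted(evts, key=lambda e: e[0]):
        yield f"{int(ts.timestamp())}|{src}|{system}|{user}|{desc}"

for line in to_timeline(events):
    print(line)
```

The point of doing it this way is that every entry in the timeline is there because I put it there, based on my analysis goals, so I know exactly where the gaps are.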

Okay, getting back on point...I see what Lina's saying, or perhaps it's more accurate to say, yes, I'm familiar with what she describes. In several instances, I've done a good bit of adversary profiling myself, without the benefit of "large scale data analysis using AI" because, well, AI wasn't available, and I started out my investigation looking for those things. In one instance, I could see pretty clearly not just the hours of operation of the adversary, but we'd clearly identified two different actors within the group going through shift changes on a regular basis. On the days where there was activity on one of the nexus endpoints, we'd see an actor log in, open a command prompt/cmd.exe, and then interact with the Event Logs (not clearing them). Then, about 8 hrs later (give or take), that actor would log out, and another actor would log in and go directly to PowerShell. 
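As a rough illustration of that kind of profiling, something as simple as binning logon times by account and hour of day will surface operational tempo and shift boundaries. The data and account names below are hypothetical; this is a sketch of the idea, not any particular tool.

```python
# Rough sketch: binning logon events by account and hour of day to
# surface an adversary's operational tempo / shift boundaries.
# The logon data and account names are hypothetical.
from collections import Counter
from datetime import datetime

# (UTC timestamp, account) pairs pulled from earlier analysis -- illustrative
logons = [
    (datetime(2025, 1, 14, 1, 5), "actor_a"),
    (datetime(2025, 1, 14, 9, 12), "actor_b"),
    (datetime(2025, 1, 15, 1, 22), "actor_a"),
    (datetime(2025, 1, 15, 9, 3), "actor_b"),
]

tempo = Counter()
for ts, account in logons:
    tempo[(account, ts.hour)] += 1

# Crude histogram: one row per (account, hour) pairing
for (account, hour), count in sorted(tempo.items()):
    print(f"{account}  {hour:02d}:00 UTC  {'#' * count}")
```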

Adversary profiling, going beyond IOCs and TTPs to look at hours of operation/operational tempo, situational awareness, etc., is not something that most DFIR teams are tasked with, and deriving that sort of insight from intrusion data is not something either DFIR or CTI teams are necessarily equipped or staffed for. This doesn't mean that it doesn't happen, just that it's not something that we, in the West, see in reporting on a regular basis. We simply don't have a culture of collaboration, neither within nor across organizations. Rather, if detailed information is available, many times it's held close to the vest, as part of a competitive advantage. In my experience, though, it's less about competitive advantage, and more often the case that, while the data is available, it's not developed into intel or insights.

Conclusion
I really have to applaud Lina not only for taking the time to, as she put it, dive head-first into this rabbit hole, but also for putting forth the effort and having the courage to publish her findings. In his book Call Sign Chaos, Gen. Mattis referred to the absolute need to be well-read, and that applies not just to warfighters, but across disciplines, as well. However, in order for that to be something that we can truly take advantage of, we need writing like Lina's to educate and inspire us. 

Sunday, February 16, 2025

The Role of AI in DFIR

The role of AI in DFIR is something I've been noodling over for some time, even before my wife first asked me the question of how AI would impact what I do. I guess I started thinking about it when I first saw signs of folks musing over how "good" AI would be for cybersecurity, without any real clarity or specifics as to how that would work.

I recently received a more pointed question regarding the use of AI in DFIR, asking if it could be used to develop investigative plans, or to identify both direct and circumstantial evidence of a compromise. 

As I started thinking about the first part of the question, I was thinking to myself, "...how would you create such a thing?", but then I switched to "why?" and sort of stopped there. Why would you need an AI to develop investigative plans? Is it because analysts aren't creating them? If that's the case, then is this really a problem set for which "AI" is a solution?

About a dozen years ago, I was working at a company where the guy in charge of the IR consulting team mandated that analysts would create investigative plans. I remember this specifically because the announcement came out on my wife's birthday. Several months later, the staff deemed the mandate a resounding success, but no one was able to point to a single investigative plan. Even a full six months after the announcement, the mandate was still considered a success, but no one was able to point to a single investigative plan. 

My point is, if your goal is to create investigative plans and you're looking to AI to "fill the gap" because analysts aren't doing it, then it's possible that this isn't a problem for which AI is a solution. 

As to identifying evidence or artifacts of compromise, I don't believe that's necessarily a problem set that needs AI as the solution, either. Why is that? Well, how would the model be trained? Someone would have to go out and identify the artifacts, and then train the model. So why not simply identify and document the artifacts?

There was a recent post on social media regarding investigating WMI event consumers. While the linked resource includes a great deal of very valuable information, it's missing one thing...specific event records within the WMI-Activity/Operational Event Log that apply to event bindings. This information can be found (it's event ID 5861) and developed, and my point is that sometimes, automation is a much better solution than, say, something like AI, because what we see as the 'training set' is largely insufficient. 
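As an example of the kind of simple automation I mean, here's a quick sketch that pulls event ID 5861 records out of an exported copy of the WMI-Activity/Operational log. This assumes the third-party python-evtx library, and the file name is just an example.

```python
# Sketch: pulling event ID 5861 records (permanent event consumer
# bindings) out of an exported Microsoft-Windows-WMI-Activity/Operational
# log. Assumes the third-party python-evtx library is installed; the
# file name below is illustrative.
import re
from Evtx.Evtx import Evtx

LOG = "Microsoft-Windows-WMI-Activity%4Operational.evtx"  # exported copy

with Evtx(LOG) as log:
    for record in log.records():
        xml = record.xml()
        # crude but serviceable: keep only event ID 5861 records
        if re.search(r"<EventID[^>]*>5861<", xml):
            print(xml)
            print("-" * 60)
```

No model, no training, no guessing; just identify the artifact, document it, and automate pulling it.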

What do I mean by that? One of the biggest, most recurring issues I continue to see in DFIR circles is the misrepresentation (sometimes subtle, sometimes gross) of artifacts such as AmCache and ShimCache. If sources such as these, which are very often incomplete and ambiguous, leaving pretty significant gaps in understanding of the artifacts, are what constitute the 'training set' for an AI/LLM, then where is that going to leave us when the output of these models is incorrect? And at this point, I'm not even talking about hallucinations, just models being trained with incorrect information.

Expand that beyond individual artifacts to a SOAR-like capability; the issues and problems simply become compounded as complexity increases. Then, take it another step/leap further, going from a SOAR capability within a single organization, to something so much more complex, such as an MDR or MSSP. Training a model in a single environment is complex enough, but training a model across multiple, often wildly disparate environments increases that complexity by orders of magnitude. Remember, one of the challenges all MDRs face is that what is a truly malicious event in one environment is often a critical business process in others.
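As a toy illustration of that last point, consider something like the following, where the exact same process execution gets suppressed in one environment and alerted on in another; the environment names and the "rule" are entirely made up.

```python
# Toy sketch of the per-environment tuning problem: the same telemetry
# is a critical business process for one customer and an alert for
# another. Environment names and the allowlist are hypothetical.
ALLOWLIST = {
    "customer_a": {"psexec.exe"},   # legitimate admin tooling here
    "customer_b": set(),            # the same tool should page an analyst here
}

def triage(environment: str, process_name: str) -> str:
    if process_name.lower() in ALLOWLIST.get(environment, set()):
        return "suppress (known business process)"
    return "alert"

print(triage("customer_a", "psexec.exe"))  # suppress (known business process)
print(triage("customer_b", "psexec.exe"))  # alert
```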

Okay, let's take a step back for a moment. What about using AI for other tasks, such as packet analysis? Well, I'm so glad you asked! Richard McKee had that same question, and took a look at passing a packet capture to DeepSeek:

[Embedded post from Richard McKee]

The YouTube video associated with the post can be found here.

Something else I see mentioned quite a bit is how AI is going to impact DFIR by allowing "bad guys" to uncover zero-day exploits. That's always been an issue; the new wrinkle with AI is that bad guys will likely cycle faster on developing and deploying these exploits. However, this is only really an issue for those who aren't prepared; if you don't have an asset inventory (of both systems and applications), haven't done anything to reduce your attack surface, haven't established things like patching and IR procedures...oh, wait. Sorry. Never mind. Yeah, that's going to be an issue for a lot of folks.