Wednesday, February 19, 2025

Lina's Write-up

Lina recently posted on LinkedIn that she'd published another blog post. Her blog posts are always well written, easy to follow, fascinating, and very informative, and this one did not disappoint.

In short, Lina says that she found a bunch of Chinese blog posts and content describing activity that Chinese cybersecurity entities have attributed to what they refer to as "APT-C-40", or the NSA. So, she read through them, translated them, and mapped out a profile of the NSA by overlaying the various write-ups.

Lina's write-up has a lot of great technical information, and like the other material she's written, is an enthralling read. Over the years, I've mused with others I've worked with as to whether or not our adversaries had dossiers on us, or other teams, be they blue or red. As it turns out, thanks to Lina, we now know that they do, what those dossiers might look like, and the advantage that eastern countries have over the west.

For me, the best part of the article was Lina's take-aways. It's been about 30 yrs since I touched a Solaris system, so while I found a lot of what Lina mentioned in the article interesting (like how the Chinese companies knew that APT-C-40 were using American English keyboards...), I really found the most value in the lessons that she learned from her review and translation of open Chinese reporting. Going forward, I'll focus on the two big (for me) take-aways:

There is a clear and structured collaboration...

Yeah...about that.

A lot of this has to do with the business models used for DFIR and CTI teams. More than a few of the DFIR consulting teams I've been a part of, or ancillary to, have been based on a utilization model, even the ones that said they weren't. A customer call comes in, and the scoping call results in an engagement of a specific length; say, 24 or 48 hrs, or something like that. The analyst has to collect information, "do" analysis and write a report, eating any time that goes over the scoped time frame, or taking shortcuts in analysis and reporting to meet the timeline. As such, there's little in the way of cross-team collaboration, because, after all, who's going to pay for that time?

In 2016, I wrote a blog post about the Samas (or SamSam) ransomware activity we'd seen to that point. This was based on correlation of data across half a dozen engagements, each worked by a different analyst. The individual analysts did not engage with each other; rather, they simply proceeded through the analysis and reporting of their engagement, and were then assigned to other engagements.

Shortly after that blog post was published, Kevin Strickland published his analysis of another aspect of the attacks; specifically, the evolution of the ransomware itself.

Two years later, additional information was published about the threat group itself, some of which had been included in the original blog post.

The point is that many DFIR teams do not have a business model that facilitates communications across engagements, and as such, analysts aren't well practiced at large scale communications. Some teams are better at this than others, but that has a lot to do with the business model and culture of the team itself. 

Overall, there really isn't a great deal of collaboration within teams and organizations, largely because everyone is silo'd off by business models; the SOC has a remit that doesn't necessarily align with DFIR, and vice versa; the CTI team doesn't have much depth in DFIR skill sets, and what the CTI team publishes isn't entirely useful on a per-engagement basis to the DFIR team. I've worked with CTI analysts who are very, very good at what they do, like Allison Wikoff (re: Mia Ash), but there was very little overlap between the CTI and IR teams within those organizations.

Now, I'm sure that there are a lot of folks reading this right now who're thinking, "hey, hold on...I/we collaborate...", and that may very well be the case. What I'm sharing is my own experience over the past 25 yrs, working in DFIR as a consultant, in FTE roles, running and working with SOCs, working in companies with CTI teams, etc.

This is an advantage that the east has over the west: collaboration. As Lina mentioned, a lot of the collaboration in the west happens through closed, invite-only groups, so a lot of what is found isn't necessarily shared widely. As a result, those who are not part of those groups don't have access to information or intel that might validate their own findings, or fill in some gaps. Further, those who aren't in these groups have information that would fill in gaps for those who are, but that information never gets shared, nor developed.

...Western methodologies typically focus on constructing a super timeline...

My name is Harlan, and I'm a timeliner. Not "super timelines"...while I'm a huge fan of Kristinn (heck, I bought the guy a lollipop with a scorpion inside once), I'm a bit reluctant to hand over control of my timeline development to log2timeline/plaso. This is due, in part, to knowing where the gaps are...what artifacts the tool parses, and which ones it doesn't. Plaso and its predecessor are great tools, but they don't get everything, particularly not everything I need for my investigations, based on my analysis goals.

Okay, getting back on point...I see what Lina's saying, or perhaps it's more accurate to say, yes, I'm familiar with what she describes. In several instances, I've done a good bit of adversary profiling myself, without the benefit of "large scale data analysis using AI" because, well, AI wasn't available, and I started out my investigation looking for those things. In one instance, I could see pretty clearly not just the adversary's hours of operation; we'd also clearly identified two different actors within the group going through shift changes on a regular basis. On the days where there was activity on one of the nexus endpoints, we'd see an actor log in, open a command prompt/cmd.exe, and then interact with the Event Logs (not clearing them). Then, about 8 hrs later (give or take), that actor would log out, and another actor would log in and go directly to PowerShell.

Adversary profiling...going beyond IOCs and TTPs to look at hours of operation/operational tempo, situational awareness, etc....is not something that most DFIR teams are tasked with or equipped for, and deriving that sort of insight from intrusion data is not something either DFIR or CTI teams are necessarily staffed for. This doesn't mean that it doesn't happen, just that it's not something that we, in the West, see in reporting on a regular basis. We simply don't have a culture of collaboration, neither within nor across organizations. Rather, when detailed information is available, it's often held close to the vest as part of a perceived competitive advantage. In my experience, though, it's less about competitive advantage, and more often the case that, while the data is available, it's never developed into intel, nor insights.

Conclusion
I really have to applaud Lina, not only for taking the time to, as she put it, dive head-first into this rabbit hole, but also for putting forth the effort and having the courage to publish her findings. In his book Call Sign Chaos, Gen. Mattis referred to the absolute need to be well-read, and that applies not just to warfighters, but across disciplines, as well. However, in order for that to be something that we can truly take advantage of, we need writing like Lina's to educate and inspire us.

Sunday, February 16, 2025

The Role of AI in DFIR

The role of AI in DFIR is something I've been noodling over for some time, even before my wife first asked me the question of how AI would impact what I do. I guess I started thinking about it when I first saw signs of folks musing over how "good" AI would be for cybersecurity, without any real clarity or specifics as to how that would work.

I recently received a more pointed question regarding the use of AI in DFIR, asking if it could be used to develop investigative plans, or to identify both direct and circumstantial evidence of a compromise. 

As I started thinking about the first part of the question, I was thinking to myself, "...how would you create such a thing?", but then I switched to "why?" and sort of stopped there. Why would you need an AI to develop investigative plans? Is it because analysts aren't creating them? If that's the case, then is this really a problem set for which "AI" is a solution?

About a dozen years ago, I was working at a company where the guy in charge of the IR consulting team mandated that analysts would create investigative plans. I remember this specifically because the announcement came out on my wife's birthday. Several months later, the staff deemed the mandate a resounding success, but no one was able to point to a single investigative plan. Even a full six months after the announcement, the mandate was still considered a success, but no one was able to point to a single investigative plan. 

My point is, if your goal is to create investigative plans and you're looking to AI to "fill the gap" because analysts aren't doing it, then it's possible that this isn't a problem for which AI is a solution. 

As to identifying evidence or artifacts of compromise, I don't believe that's necessarily a problem set that needs AI as the solution, either. Why is that? Well, how would the model be trained? Someone would have to go out and identify the artifacts, and then train the model. So why not simply identify and document the artifacts?

There was a recent post on social media regarding investigating WMI event consumers. While the linked resource includes a great deal of very valuable information, it's missing one thing...the specific event records within the WMI-Activity/Operational Event Log that apply to event bindings. This information can be found (it's event ID 5861) and developed, and my point is that sometimes, automation is a much better solution than, say, something like AI, because what we see as the 'training set' is largely insufficient.
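As an illustration of what that automation might look like, here's a minimal sketch, assuming Python and Willi Ballenthin's python-evtx module (the log file name is the standard one; adjust the path to your collection):

import Evtx.Evtx as evtx

# Flag WMI event consumer binding records (event ID 5861) in the
# Microsoft-Windows-WMI-Activity/Operational Event Log
with evtx.Evtx("Microsoft-Windows-WMI-Activity%4Operational.evtx") as log:
    for record in log.records():
        xml = record.xml()
        if ">5861<" in xml:  # quick-and-dirty EventID check on the record XML
            print(xml)

Once an artifact like this is identified and documented, checking for it every time is a few lines of code; no model training required.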

What do I mean by that? One of the biggest, most recurring issues I continue to see in DFIR circles is the misrepresentation (sometimes subtle, sometimes gross) of artifacts such as AmCache and ShimCache. If sources such as these...which are very often incomplete and ambiguous, leaving pretty significant gaps in understanding of the artifacts...are what constitute the 'training set' for an AI/LLM, then where is that going to leave us when the output of these models is incorrect? And at this point, I'm not even talking about hallucinations, just models being trained with incorrect information.

Expand that beyond individual artifacts to a SOAR-like capability; the issues and problems simply become compounded as complexity increases. Then, take it another step/leap further, going from a SOAR capability within a single organization, to something so much more complex, such as an MDR or MSSP. Training a model in a single environment is complex enough, but training a model across multiple, often wildly disparate environments increases that complexity by orders of magnitude. Remember, one of the challenges all MDRs face is that what is a truly malicious event in one environment is often a critical business process in others.

Okay, let's take a step back for a moment. What about using AI for other tasks, such as packet analysis? Well, I'm so glad you asked! Richard McKee had that same question, and took a look at passing a packet capture to DeepSeek; the YouTube video associated with the post can be found here.

Something else I see mentioned quite a bit is how AI is going to impact DFIR by allowing "bad guys" to uncover zero-day exploits. That's always been an issue, and I'm sure the new wrinkle with AI is that bad guys will cycle faster on developing and deploying these exploits. However, this is only really an issue for those who aren't prepared; if you don't have an asset inventory (of both systems and applications), haven't done anything to reduce your attack surface, haven't established things like patching and IR procedures...oh, wait. Sorry. Never mind. Yeah, that's going to be an issue for a lot of folks.

Monday, January 20, 2025

Artifacts: Jump Lists

In order to fully understand digital analysis, we need to have an understanding of the foundational methodology, as well as the various constituent artifacts on which a case may be built. The foundational methodology starts with your goals...what are you attempting to prove or disprove...and once you understand the goals of your analysis, you can assemble the necessary artifacts to leverage in pursuit of those goals.

Like many of the artifacts we might examine on a Windows system, Jump Lists can provide useful information, but they are most useful when viewed in conjunction with other artifacts. Viewing artifacts in isolation deprives the analyst of valuable context.

Dr. Brian Carrier recently published an article on Jump List Forensics over on the CyberTriage blog. In that article, he goes into a good bit of depth regarding both the Automatic and Custom Jump Lists, and for the sake of this article, I'm going to cover just the Automatic Jump Lists. 

As Brian stated in his article, Jump Lists have been around since Windows 7; I'd published several articles on Jump Lists going back almost 14 years at this point. Jump Lists are valuable to analysts because they're (a) created as a result of user interaction via the Windows Explorer shell, (b) evidence of program execution, and (c) evidence of data or file access. 

Automatic Jump Lists follow the old Windows OLE "structured storage" format. Microsoft refers to this as the "compound file binary" format and has thoroughly documented the format structures. Some folks who've been around the industry for a while will remember that the OLE format is what Office documents used to use, and that there was a good bit of metadata associated with these documents. In fact, a good way to find the old school "OG" analysts still hanging around the industry is to mention the Blair document. And the format didn't disappear when Office was updated to the newer style format; rather, the format is used in other areas, such as Jump Lists, and at one point was used for Sticky Notes.

Here's my code for parsing the "structured storage" format; this was specifically developed for Windows 7 Automatic Jump Lists, but the basic code can be repurposed for OLE files in general, or updated for specific fields (i.e., the DestList stream) in newer versions of Windows.
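If Perl isn't your thing, here's a minimal sketch in Python, assuming the olefile module (the file name below is hypothetical), that lists the streams within an Automatic Jump List and extracts the raw DestList stream:

import olefile

# Automatic Jump Lists are OLE/CFB "structured storage" files
path = "1b4dd67f29cb1962.automaticDestinations-ms"  # hypothetical AppID-named file

ole = olefile.OleFileIO(path)
for entry in ole.listdir():
    name = "/".join(entry)
    print(name, ole.get_size(name))  # the numbered streams follow the LNK format

if ole.exists("DestList"):
    destlist = ole.openstream("DestList").read()
    print("DestList stream: %d bytes" % len(destlist))  # parse per the documented structure
ole.close()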

As you saw in Brian's article, Automatic Jump Lists are specific to each user, and are found within the user's profile path. Each Automatic Jump List is named using an "application identifier", or "AppID". This is a value that identifies the application used to open the target files (Notepad, Notepad++, MSWord, etc.), and is consistent across systems; an AppID that refers to a particular application on one Windows system will be the same on other Windows systems.

Microsoft has referred to the "structured storage" format as a "file system within a file"; if you do a study of the format, you'll see why. This structure results in various 'streams' being within the file, and for Automatic Jump Lists, there are two types of streams. Most of the streams in an Automatic Jump List file follow the Windows shortcut/LNK file format.

The other type of stream is referred to as the "DestList" stream; the structure of this stream on Windows 7 systems was first documented about 14 years ago. The figure below shows an Automatic Jump List opened in the Structured Storage Viewer, with the DestList stream highlighted.

Figure: Automatic Jump List opened in the Structured Storage Viewer, with the DestList stream highlighted

The structure of the DestList stream changed slightly between Windows 7 and 10 (and maybe again with Windows 11; I haven't looked yet), but the overall structure of the Automatic Jump List files remains essentially the same.

Summary
Automatic Jump Lists help analysts validate that a user was active on the system via the Windows shell, that they launched applications (program execution), that they used those applications to open files (file/data access), and when all of this occurred. As such, parsing Jump Lists and including the data in a timeline can add a good deal of granularity and context to the timeline, particularly as it pertains to user activity.

As always, Automatic Jump Lists should be used in conjunction with other artifacts, such as Prefetch, UserAssist, RecentDocs, etc., and should not be viewed in isolation, pursuant to the analyst's investigative goals.

Something else to remember is this...Automatic Jump Lists are generated by the operating system as the user interacts with the environment. As such, if an application is added, the user uses that application (generating Automatic Jump Lists), and the user then removes the application, the Automatic Jump Lists remain. The same thing happens with other artifacts, such as Recents shortcuts/LNK files, Registry values, etc. So, as with other artifacts, Automatic Jump Lists can provide indications of applications previously installed, or of files that previously existed on (or were accessed from) the endpoint.

Monday, January 06, 2025

Carving

Recovering deleted data, or "carving", is an interesting digital forensics topic; I say "interesting" because there are a number of different approaches and techniques that may be valuable, depending upon your goals. 

For example, I've used X-Ways to recover deleted archives from the unallocated space of a web server. A threat actor had moved encrypted archives to the web server, and we'd captured the password they used via EDR telemetry. The carving revealed about a dozen archives, which we opened using the captured password, which allowed our customer to understand what data had been exfil'd, and their risk and exposure. 

But carving can be about more than just recovering files from unallocated space. We can carve files and records from unstructured data, or we can treat 'structured' data as unstructured and attempt to recover records. We did this quite a bit during PCI forensic investigations, and found a much higher level of accuracy/fidelity when we carved for track 1 and 2 data, rather than just credit card numbers. 
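To illustrate the difference, here's a minimal sketch in Python; the regex is a simplification of the ISO 7813 track 2 layout, and the Luhn check weeds out most of the false positives that carving for bare card numbers produces:

import re

# Simplified track 2: start sentinel, PAN, '=', YYMM expiry, service code,
# discretionary data, end sentinel
TRACK2 = re.compile(rb";(\d{13,19})=(\d{4})(\d{3})\d*\?")

def luhn_ok(pan):
    digits = [int(c) for c in pan.decode()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def carve_track2(blob):
    for m in TRACK2.finditer(blob):
        pan = m.group(1)
        if luhn_ok(pan):
            yield m.start(), pan[:6] + b"..." + pan[-4:]  # offset and masked PAN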

We can also carve within files themselves. Several common file formats are essentially databases, and some are described as a "file system within a file". As such, deleted records and data can be recovered from such file formats, if necessary.

I recently ran across a fascinating post from TheDFIRJournal regarding file carving encrypted virtual disks. The premise of the post is that some file encryption/ransomware software does not encrypt entire files, but rather just part of them, for the sake of speed. In the case of virtual disks, a partially encrypted file may mean that, while the disk itself may no longer be usable, there may still be valuable evidence available within the virtual disk file itself.

I should note that I did recently see a ransomware deployment that used a "--mode fast" switch at the command line, possibly indicating that the entire file would not be encrypted, but rather only a specific number of bytes of the file. As such, with larger files, such as virtual disks, WEVTX files, etc., there might be an opportunity to recover valuable data, so file and record carving techniques would be valuable, depending upon your specific investigative goals.

The premise raised in the article is not unique; in fact, I've run into it before. In 2017, when NotPetya hit, we received a number of system images from customers where the MBR was overwritten. We had someone on our team who could reconstruct the MBR, and we also ran carving for WEVTX records, recovering Security-Auditing/4688 records indicating process creation. The customers had not enabled recording of full command lines, but we were able to reconstruct enough data to illustrate the sequence of processes specific to the infection and impact. So, having a disk image where the MBR and/or the MFT is overwritten is not a new situation, simply one we haven't encountered recently.

TheDFIRJournal article covers a number of tools, including PhotoRec, scalpel (not currently being maintained), and Willi Ballenthin's EVTXtract. The article also covers Simson Garfinkel's bulk_extractor, but looking at the bulk_extractor Github, there do not appear to be releases for Windows starting with version 2.0. While some folks have stated that bulk_extractor-rec's capabilities have been added to bulk_extractor, that's kind of a moot point without a Windows release, and the latest release of bulk_extractor-rec will have to suffice.

Addendum, 7 Jan 2025: Thanks to Brian Maloney for sharing that the bulk_extractor 2.0 for Windows CLI tool can be found here.

Also from the article, the author mentioned the use of a custom EVTXParser script, which can be found here. I like this approach, as I'd done something similar with the WinXP/2003 EVT files, where I'd written lfle.pl to parse EVT records from unstructured data, which could include a .EVT file. I wrote this script (a 'compiled' Windows EXE is also available) after finding two complete records embedded in an .EVT file that were not "visible" via the Event Viewer, nor via any other tool that started by reading the file header to determine where the records were located. The script then evolved into something you could run against any data source. While not the fastest tool, at the time it was the only tool available that would take this approach.
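For those curious, the approach boils down to something like the following minimal sketch (Python; the sanity bounds are illustrative): scan for the record signature, then validate using the length value that both precedes the signature and closes the record:

import struct

MAGIC = b"LfLe"  # the 'reserved' signature at offset 4 of each EVT record

def carve_evt_records(blob):
    idx = blob.find(MAGIC)
    while idx != -1:
        start = idx - 4  # the record length DWORD precedes the signature
        if start >= 0:
            (length,) = struct.unpack_from("<I", blob, start)
            end = start + length
            # a valid record repeats its length in its final four bytes
            if 0x38 <= length <= 0x10000 and end <= len(blob) and \
                    struct.unpack_from("<I", blob, end - 4)[0] == length:
                rec_num, time_gen = struct.unpack_from("<II", blob, start + 8)
                yield start, rec_num, time_gen  # offset, record number, UNIX time
        idx = blob.find(MAGIC, idx + 1)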

In the past, I've done carving on unallocated space within a disk image, using something like blkls to get the unallocated space into one contiguous file of unstructured data. From there, running tools like bulk_extractor allows for record carving.

I've also had pretty good success running bulk_extractor across memory dumps; this is something I talked about/walked through in my book, Investigating Windows Systems.

Carving can also be done on individual files. For example, in 2013, Mari DeGrazia published a great blog post on recovering deleted data from SQLite databases, and carving Registry hive files for deleted keys and values, as well as examining unallocated space within hive files is something I've been a fan of for quite some time. My thanks go to Jolanta Thomassen for 'cracking the code' on deleted cells within Registry hive files!

Here's a presentation I put together a while back that includes information regarding unallocated space within Registry hive files.

Addendum, 13 Jan: Damien Attoe released his first blog post regarding a tool he's working on called "sqbite"; the alpha functionality is what's currently available, and Damien plans to release additional functionality in March. Reading through his blog post, it appears that Damien is working toward something similar to what Mari talked about and released. It's going to be interesting to see what he develops!

Tuesday, December 24, 2024

UEPOTB, LNK edition

A while back, Jesse Kornblum published a paper titled, "Using Every Part of the Buffalo in Windows Memory Analysis". This was, and still is, an excellent paper, based on its content and how it pertains to the subject (Windows memory analysis). However, what Jesse shared in that paper had additional value, in that he expressed the idea of using everything available to the analyst.

Since then, we've seen more than a few papers, blog posts, and articles where the author(s) go only so far with what they share regarding the data they're looking at, and do not truly use all the parts of the buffalo. Examples include blog posts or articles where LNK files are delivered as part of a phishing campaign, and the author only goes so far as to show the basic properties of the LNK file, perhaps up through the command line, but then stops there.

Now and again, we do see articles published by teams that truly do strive to use all the parts of the buffalo, leveraging everything they have available, but those are still few and far between. Perhaps one of the most notable examples is a Mandiant article from 19 Nov 2018 that referred to a phishing campaign (using LNK files) by APT29/"Cozy Bear". In the article, the authors compare activity from a similar campaign from 2016, using LNK files from the previous campaign (see figures 5 & 6).

One such example of where the content falls short is a recent blog post from Cyble. The article contains references to three LNK files used in phishing campaigns, each illustrated in the article by opening the file via the Properties tab, as seen in figure 1. This shows specific elements of the LNK file, but only those visible via the Properties tab.

Figure 1: LNK Properties

What we see in figure 1 also shows that the author(s) had access to the LNK files themselves, and could have done so much more with them. 

Hashes for the LNK files are also listed in the IoCs table at the end of the article, and I was able to find one of the files available for download via another site online, and was able to extract the metadata illustrated in figure 2.

Figure 2: TrackerDataBlock

"christmas-destr" seems to be pretty unique workstation name, and might prove to be interesting in tracking and retro-hunts. And yes, if you do an OUI lookup on the node IDs shown in figure 2, they'll align with "VMWare, Inc.", which isn't surprising. 

Interestingly enough, the LNK file also contains a fairly well-populated PropertyStoreDataBlock, which isn't something we see in every LNK file, and can provide insight into how the shortcut file was "constructed". Different methodologies and tools for creating LNK files leave different toolmarks.

Taking things a step further, in addition to the information in the rest of the LNK file discussed above, we can see from figure 3 that the information within the file header and the shell item ID list can be fairly illuminating.

Figure 3: LNK header+

The original article was published on 19 Dec 2024, which provides some idea as to the timeframe of when the LNK file would have been deployed in a campaign, collected, and analyzed. Using the information illustrated in figure 3, we get some additional insight as to the timeframe specifically associated with the LNK file, particularly those time stamps within the shell items.

In addition to the volume serial number (i.e., "280C-1822"), the time stamps and MFT reference numbers extracted from the shell items provide additional indicators that can be used to align with LNK files from other campaigns.

Another such example is a blog from Blackberry that discusses a Pakistani threat group dubbed "Transparent Tribe". I was able to find a couple of the LNK files listed in the appendix available for download, based on the provided hashes. What I found when comparing the metadata extracted from two LNK files was that aside from some minor differences, mostly in the command lines, they were nearly identical. Same volume serial number, same machine ID, same time stamps in the shell items, etc. In this case, the machine ID value was "desktop-e7n7e7f", which (a) should be compared across the other LNK files as well as other data sources, and (b) can be used along with other elements of metadata as part of a VirusTotal retro-hunt to expand the intelligence aperture even further, potentially associating the system with other campaigns. 

These LNK files also contained a PropertyStoreDataBlock, but rather than being as verbose as the previous example from the Cyble article, these LNK files simply contained a SID:

S-1-5-21-3861309104-3271506253-2070734288-1001

Okay, but so what? Why does any of this matter? 

Well, these indicators, when combined and added to other indicators, tell us a good bit about the operational processes of the threat actor, as well as the development environment and processes employed by the threat actor/group.

Beyond the basic indicators, the structure of the LNK file itself can tell us a good bit about how the LNK file was constructed, as well as the situational awareness of the actor or group. For example, there have been instances where the time stamps in the shell items have been zero'd out, essentially removing those indicators. However, I'd be careful about any assumptions made regarding a threat group's situational awareness or operational security based on metadata within LNK files; the simple fact is that this information is largely left unused by many firms, so why bother with the extra steps or work to remove the indicators?

Tuesday, November 26, 2024

Program Execution: The ShimCache/AmCache Myth

I recently saw another LinkedIn post from someone endorsing, and sending readers to, a site that was reportedly built using the SANS DFIR poster as a reference. As illustrated in figure 1, this site references the ShimCache artifact as providing evidence of program execution, and does the same for the AmCache artifact, as well.

Figure 1: ShimCache Entry



Now, while yes, it is true that these artifacts can provide evidence of program execution, that is not always the case, and this needs to be understood throughout the community. 

What I'm going to do with this blog post is provide the resources that show why these artifacts do not solely provide evidence of "program execution", so that others in the community can reference them.

ShimCache
Mandiant's article regarding ShimCache includes the statement illustrated in figure 2.

Figure 2: Article excerpt

Notice the highlighted section in figure 2, which states, "...that were not actually executed.", and then goes on to say that entries can be added if a user browses to a folder. The ShimCache value is written to the System hive, and does not provide a reference to the user who may have browsed to a folder. As such, this is something that would need to be resolved through user profile analysis, via artifacts such as JumpLists, recently accessed files, shellbags, etc.

However, the important thing to understand here is that this reductionist approach of saying that ShimCache is evidence solely of program execution is incorrect. 

We also need to remember that the ShimCache is written to the Registry value at shutdown; this is trivial to demonstrate via a timeline. 
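It's also trivial to demonstrate with a few lines of code; here's a minimal sketch, assuming Python, the python-registry module, and (for simplicity) that ControlSet001 is the current control set (check the Select key's Current value to be sure):

import struct
from datetime import datetime, timedelta, timezone

from Registry import Registry  # python-registry

def filetime_to_dt(ft):
    # FILETIME: 100-nanosecond intervals since 1601-01-01 UTC
    return datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=ft // 10)

reg = Registry.Registry("SYSTEM")

# LastWrite time of the key holding the AppCompatCache value...
cache_key = reg.open("ControlSet001\\Control\\Session Manager\\AppCompatCache")
print("AppCompatCache key LastWrite:", cache_key.timestamp())

# ...compared to the last recorded shutdown time
raw = reg.open("ControlSet001\\Control\\Windows").value("ShutdownTime").value()
print("ShutdownTime:", filetime_to_dt(struct.unpack("<Q", raw)[0]))

The two should line up; in a timeline, the key's LastWrite time falls in with the rest of the shutdown activity.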

AmCache
Blanche Lagney's Analysis of AmCache v2 paper is the definitive reference for all things AmCache. The research is thorough, and presented in a manner that, while it does require some reading to address specific questions, should remove all doubt as to the value of the artifact on specific Windows builds.

If you're looking at just the AmCache.hve file as an artifact to determine program execution, you're going to need to find the closest match to the Windows build and libraries, based on the keys you're looking at, to better understand the nature of the artifacts you're seeing.

Analysis Process
The key here is not to try to memorize the "value" of individual artifacts in isolation, but to have an analysis process where multiple data sources and artifacts are viewed together, so that through this process you can 'see' the context of the events in question. For example, when it comes to program execution, we might look to JumpLists, Security (if configured) and/or Sysmon (if installed) Event Logs, UserAssist entries, and on workstation platforms, Prefetch files. On Windows 11, we might also look to data within the PCA folder. To validate that a program executed, we might look to impacts in the Registry and/or file system, or to the Application Event Log to determine if the program generated an error or crashed.

For teaching/instructional purposes, it would be extremely valuable to start by describing one data source, such as the file system, and then show how that data source can be viewed via a timeline. Then, add another data source, such as the Windows Event Log or the Registry, and add that data source to the timeline. When discussing the Registry (as well as the ShimCache and AmCache artifacts), it will be important for analysts to understand the value of time-based metadata (key LastWrite times), as well as time-based data embedded within individual values, all of which can help better address analysis categories such as "program execution". 

Conclusion
While it is valuable to have an understanding of various artifacts, the most important takeaway from this article is that analysts should not consider artifacts in isolation during an investigation, but should instead look to multiple data sources and artifacts, viewed together, to determine the nature and context of events in question.

Thursday, October 31, 2024

FTSCon

I had the distinct honor and pleasure of speaking at the "From The Source" Conference (FTSCon) on 21 Oct, in Arlington, VA. This was a 1-day event put on prior to the Volexity memory analysis training, and ran two different tracks...Maker and Hunter...with some really great presentations in both tracks!

Before I start in with providing my insights and observations, I wanted to point out that the proceeds of the event went to support Connect Our Kids, a really cool project to get folks, and especially kids, connected with birth parents, etc.

Keynote
Sean Koessel and Steven Adair provided the keynote, which was a look into a fascinating case they worked. In this case, the threat actor gained access to their customer by compromising nearby infrastructures and traversing/moving laterally via the wireless networks; hence the title, "The Nearest Neighbor Attack". Sean and Steven put a lot of effort into crafting and delivering their fascinating story, all about how they worked through this incident, with all of the bumps, detours, and delays along the way. They also promised that they'd be putting together a more comprehensive review of the overall incident on the Volexity blog, so keep an eye out.

Something Sean said at the very beginning of the presentation caught my attention, and got me thinking. He referred to the incident as something "...no one's ever seen before." As Sean and Steven described this particular incident, it was more than just a bit complicated. As such, the question becomes, were they able to get as far as they did due to the knowledge, experience, and teamwork they brought to bear? Would someone else, say a single individual with different or lesser experience, have been able to do the same, or would this incident have been more akin to the blind men trying to describe an elephant?

Or, had someone seen this before, and just not thought to share it? Not long ago in my career, I worked with different teams of DFIR consultants, and time after time, I spoke to analysts who insisted that they didn't share what they were seeing, because they assumed, often incorrectly, that "...everyone's already seen this...".

Yarden Sharif gave a really interesting presentation on enclaves, something I hadn't heard of prior to the event (Matthew Geiger graciously explained what they were for me). 

At one point during her presentation, Yarden mentioned that enclaves can be enumerated, and I'm sure I'm not the only one who thought, "whoa, wait...what happens if the bad guy creates an enclave???"

Lex Crumpton's presentation was titled, "ATT&CKing the MITRE NERVE Incident: Operationalizing Threat Intelligence for a Safer Tomorrow." What got my attention most was that Lex said she's interested in "behavior analysis", which, when considered from a DFIR perspective, is something that's fascinated me for quite some time. 

John Hammond is always an entertaining and educational (as well as knowledgeable) presenter, and his malware presentation was pretty fascinating to watch. John does a great job of illustrating his walk-through and sharing his thought processes when finding and unraveling new challenges. 

Andrew Case shared some insight into how Volatility could be used to detect EDR-evading malware, which I thought was pretty interesting. I've used Volatility before, and Andrew shared that there are some plugins that already detected the techniques used by some malware, and that other plugins were created to address gaps.

Something to keep in mind is that all of the techniques Andrew talked about are used by malware to directly address/attack EDR. There are other techniques at play, such as EDR Silencer, which creates WFP filter rules to prevent the EDR agent from talking to its cloud infrastructure; this way, it doesn't directly interact with the EDR agent at all. As pointed out in the WindowsIR blog post, another technique that would work, would leave fewer artifacts, and would likely be missed by younger, less experienced analysts is to modify the hosts file (shoutz to Dray for that one!).

Andrew's presentation of the same title, from DefCon, is available here.

Addendum, 22 Nov: Steven and Sean's blog post was published today; find it here. As one would expect, it was picked up by folks like Brian Krebs.


Saturday, October 26, 2024

Artifact Tracking: Workstation Names

Very often in cybersecurity, we share some level of indicators of compromise (IOCs), such as IP addresses, domain names, or file names or hashes. There are other indicators associated with many compromises or breaches that can add a great deal of granularity or insight to the overall incident, particularly as the intrusion data and intel applies to other observed incidents.

One such indicator is the workstation name, so named based on the field found within Microsoft-Windows-Security-Auditing/4624 event records (indicating a successful login), as well as within Microsoft-Windows-Security-Auditing/4625 and Microsoft-Windows-Security-Auditing/4776 events.

The value of the workstation name can depend upon the type of incident you're responding to, examining, or attempting to detect earlier in the attack cycle (i.e., moving "left of bang"). For example, many organizations become aware that files have been encrypted and that they've been ransomed only after those two things have happened. However, for someone to access an infrastructure or network, often they first need to access or log into an endpoint. Depending upon how this is achieved, there may be indicators left in the Windows Event Logs.

Huntress analysts have observed an IAB or Akira ransomware affiliate during multiple incidents with initial activity (logins via RDP) originating from a workstation named "WIN-JGRMF8L11HO".

While investigating a ReadText34 ransomware incident, Huntress analysts found that RDP logins originated from a workstation named "HOME-PC".

Huntress analysts have also observed the recurrence of workstation names such as "kali" and "0DAY-PROJECT" across multiple incidents. In most (albeit not all) instances, the workstation names associated with the identified malicious activity have not aligned with the naming scheme used by the organization. In fact, in some cases, Huntress analysts have been able to filter through the authentication logs and associate user account names with workstation names and IP addresses to clearly identify the malicious activity; that is, when there is a radical change in the workstation name normally associated with a user account.
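Pulling those tuples is straightforward; here's a minimal sketch, again assuming Python and the python-evtx module (the string matching is a shortcut; a production version would parse the XML properly):

import re
import Evtx.Evtx as evtx

FIELDS = ("TargetUserName", "WorkstationName", "IpAddress")

def logon_tuples(path):
    # yield (user, workstation, source IP) from successful logins (4624)
    with evtx.Evtx(path) as log:
        for record in log.records():
            xml = record.xml()
            if ">4624<" not in xml:
                continue
            vals = []
            for field in FIELDS:
                m = re.search(r'Name="%s">([^<]*)<' % field, xml)
                vals.append(m.group(1) if m else None)
            yield tuple(vals)

Feed the output into a frequency count, and the one-off workstation names...the ones that don't fit the organization's naming scheme...tend to jump right out.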

We can also extract workstation names from Splashtop events, and we're not limited simply to Windows Event Log records. For example, Huntress analysts saw logins via legacy TeamViewer installations ahead of attempts to deploy LockBit 3.0 ransomware; in multiple observed incidents, the logins originated from a workstation named WIN-8GPEJ3VGB8U.

Correlation
If you've followed my blog for any amount of time, you'll likely have noticed that I'm very interested in file metadata, particularly LNK file metadata. In many cases, LNK/Windows shortcut files will contain a "machine ID", or the NetBIOS name of the endpoint on which they were created. This information can be correlated with workstation names, looking for links between usage, campaigns on which the endpoints appear, etc.

Addendum, 14 Nov
This blog post was published today, regarding SafePay ransomware attacks observed within customer infrastructures. The table of IOCs contains two workstation names.

Addendum, 26 Nov
Others have publicized workstation names as part of their blogs, as well. Last year, Intrinsec published a blog post containing Akira indicators; the section on lateral movement contains six observed workstation names.

Tuesday, October 15, 2024

Analysis Process

Now and again, someone will ask me, "...how do you do analysis?" or perhaps more specifically, "...how do you use RegRipper?" 

This is a tough question to answer, but not because I don't have an answer. I've already published a book on that very topic, and it seems that my process for doing analysis is very different from the way most people do analysis.

Now, I can't speak to how everyone else goes about analyzing an endpoint, but when I share my process, it seems that that's the end of the conversation. 

My analysis process, laid out in books like "Investigating Windows Systems", is, essentially:


1. Document investigative goals. These become the basis for everything you do in the investigation, including the report.


Always start with the goals, and always start documentation by having those goals right there at the top of your case notes file. When I was active in DFIR consulting, I'd copy the investigative goals into the Executive Summary of the report, and provide 1-for-1 answers. So, three goals, three answers. After all, the Executive Summary is a summary for executives, meant to stand on its own.


2. Collect data sources.


This one is pretty self-explanatory, and very often based on your response process (i.e., full images vs "triage" data collections). Very often, collection processes are designed to extract the least amount of data from a system for the biggest impact, based upon the predominant business needs, leaving other specific sources for later/follow-on collection, if needed.


3. Parse, normalize, decorate, enrich those data sources.


Basically, create a timeline from as many data sources as I can, or as makes sense, based on my investigative goals. Easy-peasy.


Timelines are not something left to the end of the investigation, to be assembled manually into a spreadsheet. Rather, creating a timeline as a means of initiating an investigation provides for much needed context.


4. Identify relevant pivot points.


RegRipper and Events Ripper are great tools for this step. Why is that? Well, within the Registry, items of interest are often encoded in some manner, such as binary, hex, or ROT-13, or some folder or other resource is represented by a GUID; many of the RegRipper plugins extract and display that info in a human-readable/-searchable format. So, I run the RegRipper TLN plugins to incorporate the data into a timeline, and then run the "regular output" plugins to develop pivot points; an example of the workflow appears below. Events Ripper is great for extracting items of interest from events files with (hundreds of) thousands of lines.
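For example, the workflow looks something like the following (treat the plugin names as illustrative; they vary between RegRipper versions):

rip.pl -r SYSTEM -p appcompatcache_tln >> events.txt
rip.pl -r SYSTEM -p appcompatcache > shimcache.txt

The first command appends TLN-format entries to the events file that feeds the timeline; the second produces the human-readable output to scan for pivot points.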


5. Identify gaps, if any, and loop back to #2.


Based on the investigative goals, what's missing? What else do you need to look for, or at? You may already have the data source you need, such as when you need to look for deleted content in Registry hives you've already collected.


6. Complete when goals are met, which includes validating findings.


An issue we face within the industry, and not just in DFIR, is validation. If a SOC analyst sees a "net user /add" command in EDR telemetry, do they report that a "user account was created" without (a) checking the audit configuration of the Security Event Log, and (b) looking for Security-Auditing event records that demonstrate that a user account was created? If it was a local account, is the SAM checked?


Or, if msiexec.exe is seen (via EDR telemetry) running against an HTTP/HTTPS resource, is the Application Event Log checked for MsiInstaller events?


My point is, are we just saying that something happened, or are we validating via the available data sources that it actually happened?
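Checking isn't a heavy lift. As a quick illustration, here's a minimal sketch (Python, python-evtx module assumed) of validating the "user account was created" report against the Security Event Log:

import Evtx.Evtx as evtx

# Records that corroborate "a user account was created":
# 4720 = account created, 4722 = account enabled, 4724 = password (re)set
IDS = (">4720<", ">4722<", ">4724<")

with evtx.Evtx("Security.evtx") as log:
    for record in log.records():
        xml = record.xml()
        if any(marker in xml for marker in IDS):
            print(xml)

If auditing is enabled and there's no 4720 record, then "a user account was created" isn't a finding...at least, not yet.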


7. Anything "new" gets baked back in


The great thing about timelines and other tools is that very often, you'll find something new...something you hadn't seen before that was relevant (or could be) to your investigation. This is where most of the Events Ripper plugins have originated; I'll see something "new", often based on an update to Windows or some installed application, and I'll "bake it back into" the process by creating a plugin.


Yes, documenting it is a good first step, but adding it back into your automation is taking action. Also, this way, I don't have to remember to look for it...it's already there.


For example, several years ago, another analyst mentioned seeing something "new" during a response; looking into it, this new thing was a Microsoft-Windows-TaskScheduler/706 event record, so once I got a little more info about it, and dug into the investigation myself just a bit, I added it to eventmap.txt. After that, I never had to remember to look for it, and I had the necessary references to support the finding already documented.
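The concept, sketched in Python (this illustrates the approach, not the actual eventmap.txt format, and the tags are made up):

# Map (provider, event ID) pairs to analyst-defined tags, so that lessons
# learned get applied to every future timeline automatically
EVENT_MAP = {
    ("Microsoft-Windows-TaskScheduler", 706): "[TaskScheduler/706 - see notes]",
    ("Microsoft-Windows-Security-Auditing", 4720): "[User Account Created]",
}

def decorate(provider, event_id, description):
    tag = EVENT_MAP.get((provider, event_id))
    return "%s %s" % (tag, description) if tag else description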

Rundown

I ran across a fascinating post from Cyber Sundae DFIR recently that talked about the Capability Access Manager, and how, with Windows 11, it includes a database of applications that have accessed devices such as the mic or camera, going beyond just the Registry keys and values we know about.

It should surprise no one that this is an artifact found on Windows 11; after all, there've been more than a few changes to Windows 10, even just between individual builds. As such, depending upon the nature of your case and your investigative goals, this may be a valuable resource to explore.

As a reminder, RegRipper has two plugins that query various values beneath the CapabilityAccessManager\ContentStore subkey, contentstore.pl and location.pl. The contentstore.pl plugin also comes in a TLN variant, so that the information can be included in an investigative timeline.

I also ran across an interesting article regarding artifacts of data exfiltration on various platforms, including Windows. While the list of artifacts specific to Windows is a good one, IMHO, it misses some very useful artifacts. Some of the artifacts listed in the article, such as Prefetch files, are not definitive, and need to be used in conjunction with other artifacts to provide even a hint of data exfiltration. After all, you can call something whatever you want on Windows systems and not impact the functionality; you can rename net.exe to winrar.exe, and the Prefetch file will be for winrar.exe, and unfortunately, command line arguments are not stored in Prefetch files.

Also, the article states that the ShimCache "...stores information about executables that have been run on the system, even if the file has been deleted. Investigators can use this to trace the usage of data exfiltration tools." The ShimCache does not solely store information about executables that have been run on the system, something that has been documented again and again; executables can be included in the ShimCache if the user has simply browsed to the folder where the EXE resides. So, yes, the ShimCache does include executables that have been run on the system, but those with little experience often interpret statements like the one above to mean that this is all the ShimCache includes, and that it is therefore "evidence of execution".

There are other, perhaps more definitive data sources that point to data exfiltration. For example, querying the BITS Client Event Log for upload jobs would reveal a good deal of information regarding data exfiltration. One data source I've used in the past is the IIS web server logs; a threat actor moved archive files to the web server, and then issued GET requests for the files. Looking back through the logs we had available, there had been no prior instances of .zip files being requested.
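As an illustration, here's a minimal sketch in Python (field names follow the IIS W3C #Fields directive) that pulls GET requests for archive files out of a directory of IIS logs:

import glob
import os

def find_archive_requests(log_dir):
    for path in glob.glob(os.path.join(log_dir, "*.log")):
        fields = []
        with open(path, encoding="utf-8", errors="replace") as fh:
            for line in fh:
                if line.startswith("#Fields:"):
                    fields = line.split()[1:]  # column names from the directive
                    continue
                if line.startswith("#") or not fields:
                    continue
                row = dict(zip(fields, line.split()))
                uri = row.get("cs-uri-stem", "").lower()
                if row.get("cs-method") == "GET" and uri.endswith((".zip", ".rar", ".7z")):
                    yield path, row.get("date"), row.get("time"), row.get("c-ip"), uri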

Yes, the SRUM db is a great place to look for evidence of data exfiltration, very much so. However, as with other data sources, we have to keep the context of the data source in mind when conducting an investigation.

Even with this list, there are a number of ways to exfil data off of a Windows system, including the use of finger.exe (one of my favorites!).

Wednesday, October 09, 2024

Exploiting LNK Metadata

Anyone who's followed me for a bit knows that I'm a huge proponent of metadata, and in particular, exploiting metadata in LNK files that threat actors create, use as lures, and send to their targets.

I read an article not long ago from Splunk titled, LNK or Swim: Analysis & Simulation of Recent LNK Phishing. The article covered a good bit of information regarding LNK files sent by some threat actors, and even included a list of metadata items that could be used for "threat intel purposes", as illustrated in figure 1.

Fig. 1: Splunk article excerpt

However, what's illustrated in figure 1 was as far as they went. In fact, reading through the article and looking at the images of LNK parser tool output, each of those images is cut off before embedded metadata and "extra data blocks" can be seen. Even then, using that information would require analysts to manually transpose it from the images, a very inefficient and error-prone process, particularly given how small some of the images in the article are.

I will say that the article does go on to talk about the use of LNK files in phishing campaigns, and provides a link to an LNK generator tool. There are some definite opportunities here for a research project, where LNK metadata is compared across different creation methods (right-click on the Desktop, PowerShell, the generator tool, etc.).

In December 2016, JPCERT published an article describing how threat actors reveal clues about their development environment when they send LNK files to their targets. The LNK files contain metadata associated with the system on which they were created, which from a CTI perspective is "free money".

Figure 1 from the JPCERT article, extracted and illustrated in figure 2, demonstrates one way that the LNK file metadata can be used. In this figure, various elements of metadata are used in a graph to illustrate relationships amongst data that would not be obvious via a spreadsheet.

Fig. 2: Figure 1, excerpted from JPCERT article

At this point, you're probably asking, "how would this metadata be used in the real world?" Almost two years after the JPCERT article was published, the folks at Mandiant published an article comparing data across two Cozy Bear campaigns, one from 2016 and the other from 2018. Within that article, at figures 5 and 6, the Mandiant analysts compared LNK files from the two campaigns, illustrating not just the differences, but also the similarities, such as the volume serial number (fig. 5) and the machine IDs (fig. 6). While there were differences in time stamps and other metadata, there were also consistencies between the two campaigns, two years apart.

If you're saying, "...but I don't do CTI..." at this point, that's okay. There may be steps we can take to use what we know about LNK files to protect ourselves.

If you have Sysmon installed on endpoints, Sysmon event ID 11 events identify file creations (and overwrites); you can monitor the Sysmon Event Log for such events, and extract the full file name and path. If the file extension is ".lnk", you can verify that the file really is an LNK file based on the "magic number" within the file header and the GUID that follows it. From there, you can either flag the file based on the path, or take the extra step of comparing the machine ID to the current endpoint name; if they're not the same, definitely flag or even quarantine the file.
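A minimal sketch of that check in Python (scanning for the TrackerDataBlock signature is a shortcut; a production version would walk the full LNK structures):

import socket
import struct

LNK_MAGIC = b"\x4c\x00\x00\x00"  # HeaderSize field, always 0x0000004C
LNK_CLSID = bytes.fromhex("0114020000000000c000000000000046")  # LinkCLSID
TRACKER_SIG = struct.pack("<I", 0xA0000003)  # TrackerDataBlock signature

def check_lnk(path):
    with open(path, "rb") as fh:
        data = fh.read()
    if data[:4] != LNK_MAGIC or data[4:20] != LNK_CLSID:
        return None  # not actually a shortcut file; flag it
    machine_id = None
    idx = data.find(TRACKER_SIG)
    if idx != -1:
        # MachineID: 16-byte NetBIOS name, 12 bytes past the signature
        raw = data[idx + 12:idx + 28]
        machine_id = raw.split(b"\x00")[0].decode("ascii", "replace")
    mismatch = machine_id is not None and machine_id.lower() != socket.gethostname().lower()
    return machine_id, mismatch

If the machine ID doesn't match the local endpoint name, the shortcut was created somewhere else, which, for a file that just landed in a user's downloads or attachments folder, is worth flagging.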

Is implementing this yourself kind of scary? No problem. If you're using an EDR vendor (directly, or through an MDR) and the EDR generates similar telemetry (keep in mind, not all do), contact the vendor about adding the capability. Detecting behaviors based on LNK files is notoriously difficult, so why not detect them when they're written to disk, and take action before a user double-clicks it?


Tuesday, October 08, 2024

Shell Items

I ran across a Cyber5W article recently titled, Windows Shell Item Analysis. I'm always very interested in not only understanding parsing of various data sources from Windows systems, but also learning a little something about how others view the topic. 

Unfortunately, there was very little actual "analysis" in the article, an excerpt of which is shown in figure 1.

Figure 1: Text from article

I'm not sure I can agree with that statement; tools, be they open source or commercial, tend to be very good at extracting, parsing, and presenting/displaying data, but analyzing that data really depends on the investigative goals, something to which tools are generally not privy.

But we do see that quite often in the industry, don't we? We'll see something written up, and it will say, "...<tool name> does analysis of...", and this is entirely incorrect. Tools are generally very good at what they do; that is, parsing and displaying information, that an analyst then analyzes, in the context of their investigative goals, as well as other data sources and artifacts.

The rest of the article doesn't really dig into either the metadata embedded within shell items, or the analysis of the various artifacts themselves. In fact, there's no apparent mention of the fact that there are different types of shell items, all of which contain different information/metadata.

I've written quite a bit regarding Windows shell items embedded within various data sources. In fact, looking at the results of a search across this blog, there are more than a few posts. Yes, several of them are from 2013, but that's just the thing...the information still applies, when it comes to shell item metadata. Just because it was written a decade or more ago doesn't mean that it's "out of date" or that it's no longer applicable. 

While it is important to understand the nature and value of various data sources and artifacts, we must also keep in mind that tools do not do analysis, it's analysts and examiners who collect, correlate and analyze data based on their investigative goals.

RegRipper Educational Materials

A recent LinkedIn thread led to a question regarding RegRipper educational materials, as seen in figure 1; specifically, whether there are any.

Figure 1: LinkedIn request

There are two books that address the use of RegRipper: Windows Registry Forensics, and Investigating Windows Systems (see figure 2). Together, these books provide information about the Windows Registry, RegRipper, and the use of RegRipper as part of an investigation.


Figure 2: IWS

Demonstrating the use of RegRipper in an investigation is challenging, as RegRipper is only one of the tools I typically use during an investigation; investigations do not rest on a single data source, nor on a single artifact. The challenge, then, is demonstrating the use of RegRipper within an analysis process, such as any of the case studies in Investigating Windows Systems, that most folks are simply unfamiliar with; the value of the demo isn't just diminished, it's completely lost if the overall process isn't understood.

The analysis process demonstrated multiple times in IWS is the same process I've used for years, well prior to the publication of the book. It's also the same process I use today, sometimes multiple times a day, as part of my role at Huntress. Any demonstration of RegRipper, or even Events Ripper, as part of the process would fall short, as most analysts do not already follow that same process. 

If you are interested in educational materials associated with RegRipper, I would be very much willing to learn a bit more about what you're looking for, and to have a conversation pursuant to those needs. Feel free to reach out to me on LinkedIn, or via email.