Thursday, March 20, 2025

Know Your Tools

In 1998, I was in a role where I was leading teams on-site to conduct vulnerability assessments for organizations. For the technical part of the assessments, we were using ISS's Internet Scanner product, which was a commercial scanner. Several years prior, while I was in graduate school, the SATAN scanner had been released, but it was open source, and you could look at the code and see what it was doing. This wasn't the case with Internet Scanner.

What we started to see, when we began looking closely, was that the commercial product was returning results that weren't...well...correct. One really huge example was the AutoAdminLogon setting; you could set this value to "1", and the Administrator account name you chose would be included in another value, and the password would be included in a third value, in plain text. When the system was restarted, those credentials would be used to automatically log in to the system.

Yep. Plain text.

Anyway, we ran a scan across an office within a larger organization, and the product returned 21 instances where the AutoAdminLogon capability was enabled. However, the organization knew that only one had that functionality actually set; the other 20 had had it set at one point, but the capability had been disabled. On those 20 systems, the AutoAdminLogon value was set to "0". We determined that the commercial product was checking for the existence of the AutoAdminLogon value only, and not going beyond that...not checking to see if the value was set to "1", and not checking to see if the value that contained the plain text password actually existed. 
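
Going beyond the mere existence of the value isn't complicated. Here's a minimal, illustrative sketch using Python's winreg module; the Winlogon value names are the documented ones, but the logic is mine, not Internet Scanner's or NTCAT's, and a scanner would make the same queries remotely rather than locally:

    # Sketch: check whether AutoAdminLogon is actually enabled, rather than
    # just checking that the value exists.
    import winreg

    WINLOGON = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon"

    def check_autoadminlogon():
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, WINLOGON) as key:
            try:
                enabled, _ = winreg.QueryValueEx(key, "AutoAdminLogon")
            except FileNotFoundError:
                return "AutoAdminLogon value not present"
            if str(enabled) != "1":
                return "AutoAdminLogon value present, but disabled"
            # Only a real finding if the plain text password value also exists
            try:
                winreg.QueryValueEx(key, "DefaultPassword")
            except FileNotFoundError:
                return "AutoAdminLogon enabled, but no DefaultPassword value"
            return "FINDING: AutoAdminLogon enabled, with a plain text DefaultPassword"

    print(check_autoadminlogon())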

We found a wide range of other checks that were similarly incorrect, and others that were highly suspect. So, we started writing a tool to replace the commercial product, called NTCAT. This tool had various queries, all of which were based on research and included references to the Microsoft Knowledge Base, so anyone running the tool could look up each check, understand what was being queried and what the responses meant, and explain it to the customer.

Later, when supporting CBP in a consulting role and waiting for my agency clearance, the Nessus scanner was the popular scanning tool of the day. One day, I heard a senior member of the CIRT telling someone else who was awaiting their clearance that the Nessus scanner determined the version of the Windows operating system by firing off a series of specially crafted TCP packets at the target endpoint (perhaps via nmap), and then mapping the responses to a matrix. I listened, thinking that was terribly complicated. I found a system with Nessus installed, started reading through the various modules, and found that Nessus determined the version of Windows by attempting to make an SMB connection to the target endpoint and reading the Registry. If the scanner was run without the necessary privileges, a lot of the modules would not connect, and would simply fail.

Jump forward to 2007 and 2008, and the IBM ISS X-Force ERS team was well into performing PCI forensic exams. At the time, we were one of seven companies on the list of certified organizations; merchants informed by their banks that they needed a forensic exam would go to the PCI web site, find the names of the companies listed, and invariably call through all seven to see who could offer the best (i.e., lowest) price, not realizing that, at the time, the price was set by Visa (this was prior to the PCI Council being stood up).

As our team was growing, and because we were required to meet very stringent timelines regarding providing information and reporting, Chris Pogue and I settled on a particular commercial tool that most of our analysts were familiar with, and provided the documented procedures for them to move efficiently through the required processes, including file name, hash, and path searches, and scans for credit card numbers.

During one particular engagement, someone from the merchant site informed us that JCB and Discover cards were processed by the merchant. This was important, because our PCI liaison needed to get points of contact at those brands, so we could share compromised credit card numbers (CCNs). We started doing our work, and found that we weren't getting any hits for those two card brands.

The first step was to confirm that, in fact, the merchant was processing those brands...we got the thumbs-up. Next, we went to the brands, got testing CCNs, and ran our process across those numbers...only to get zero hits. Nada. Nothing. Zippo. 

It turned out that the commercial suite we were using included an internal function called IsValidCreditCard(); through extensive testing, and more than a few queries on the user forums, we found that the function did not recognize those two brands as valid. So, with some outside assistance, we wrote a function to override the internal one, and had everyone on our team add it to their systems. The new function ran a bit slower, but Chris and I were adamant with the team that long-running processes like credit card and AV scans should be run in the evening, not kicked off at the start of the workday. This way, you didn't tie up an image with a long-running process when you could be doing actual work.
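
Conceptually, the override amounted to pairing a Luhn check with the issuer prefixes for the two missing brands. What follows is a rough, illustrative sketch in Python, not the function we actually deployed; the prefix ranges come from public IIN references, not from the commercial suite:

    # Illustrative only -- not the override we actually deployed.
    def luhn_ok(number: str) -> bool:
        total = 0
        for i, ch in enumerate(reversed(number)):
            d = int(ch)
            if i % 2 == 1:
                d *= 2
                if d > 9:
                    d -= 9
            total += d
        return total % 10 == 0

    def brand(number: str):
        # JCB: 3528-3589; Discover: 6011, 65, 644-649 (per public IIN references)
        if len(number) == 16 and 3528 <= int(number[:4]) <= 3589:
            return "JCB"
        if len(number) in (16, 19) and (
            number.startswith(("6011", "65"))
            or 644 <= int(number[:3]) <= 649
        ):
            return "Discover"
        return None

    def is_valid_card(number: str) -> bool:
        return number.isdigit() and luhn_ok(number) and brand(number) is not None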

In 2020, I was working at an IR consulting provider, and found that some of the team used CyLR; at the time, the middleware was plaso, and the backend was Kibana. In March of that year, Microsoft released a fascinating blog post regarding human-operated ransomware, in which they described the DoppelPaymer ransomware as using WMI persistence. Knowing that the team had encountered multiple ransomware engagements involving that particular variant, I asked if they'd seen any WMI persistence. They responded, "no". I asked, how did you determine that? They responded that they'd checked the Kibana output for those engagements, and got no results.

The collection process for that toolset obtained a copy of the WMI repository, so I was curious as to why no results were observed. I then found out that, at least at the time, plaso did not have a parser for the WMI repository; as such, the data was collected, but not parsed...and the result of "no findings" in the backend was accepted without question. 

All of this is just to say that it's important to know and understand how your tools work. When I ran an internal SOC, the L3 DF analysts were able to dump memory from endpoints using the toolset we employed. Very often, they would do so in order to check for a particular IP address; however, most of them felt that running strings and searching for the IP address in question was sufficient. I had to work with them to get them to understand that (a) IP addresses, for the most part, are not stored in memory in ASCII/Unicode, and (b) different tools (Volatility, bulk_extractor) look for different structures in order to identify IP addresses. So, if they were going to dump memory, running strings was neither a sufficient nor appropriate approach to looking for IP addresses. 
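
To illustrate the point: an IPv4 address usually sits in memory as four packed bytes inside a network-related structure, not as a printable string. Here's a minimal sketch of searching a raw memory dump for the packed form; this is illustrative only, as tools like Volatility and bulk_extractor go much further and validate the surrounding structures rather than just matching four bytes:

    # Sketch: search a raw memory dump for the packed (binary) form of an
    # IPv4 address, which is what most network structures actually hold.
    import mmap
    import socket
    import sys

    def find_packed_ip(dump_path, ip):
        needle = socket.inet_aton(ip)   # "192.168.1.10" -> b'\xc0\xa8\x01\x0a'
        hits = []
        with open(dump_path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                idx = mm.find(needle)
                while idx != -1:
                    hits.append(idx)
                    idx = mm.find(needle, idx + 1)
        return hits

    if __name__ == "__main__":
        print(find_packed_ip(sys.argv[1], sys.argv[2]))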

Know how your tools work, how they do what they do. Understand their capabilities and limitations. Sometimes you may encounter a situation or circumstance that you hadn't thought of previously, and you'll have to engage, ask questions, and intentionally dig in, in order to make a determination as to the tool's ability to address the issue.

Monday, March 17, 2025

WMI

The folks over at CyberTriage recently shared a guide to WMI; it's billed as a "complete guide to WMI malware", and it covers a great deal more than just malware. They cover examples of discovery and enumeration, as well as execution, but what caught my attention was persistence. This is due in large part to an investigation we'd done in 2016 that led to a blog post about a novel persistence mechanism. The persistence mechanism illustrated in the blog post bore remnants similar to what was seen in this Mandiant BlackHat 2014 presentation (see slide 44).

What's interesting is that we continue to see this WMI persistence mechanism used again and again, where event consumers are added to the WMI repository. In addition to the 2016 blog post mentioned previously, MS's own Human-operated ransomware blog post from 2020 includes the statement, "...evidence suggests that attackers set up WMI persistence mechanisms...".

In addition to some of the commands offered up by the CyberTriage guide and other resources, MS's own AutoRuns tool includes a check for WMI persistence mechanisms on live systems.
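
For live systems, you can also query the root\subscription namespace directly. Here's a minimal sketch that assumes the third-party Python "wmi" module; PowerShell's Get-CimInstance against the same namespace returns the same data:

    # Sketch: enumerate the three classes that make up a WMI event consumer
    # persistence mechanism on a live system. Assumes "pip install wmi" and
    # sufficient privileges.
    import wmi

    conn = wmi.WMI(namespace="root\\subscription")
    for cls in ("__EventFilter", "__EventConsumer", "__FilterToConsumerBinding"):
        print(f"--- {cls} ---")
        for obj in conn.query(f"SELECT * FROM {cls}"):
            print(obj)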

There are also a number of tools for parsing the WMI repository/OBJECTS.DATA file for event consumers added for persistence during disk or "dead box" forensics, such as wmi-parser and flare-wmi.

Chad Tilbury shared some really great info in his blog post, Finding Evil WMI Event Consumers with Disk Forensics.

Disk forensics isn't just about parsing the WMI repository; there's also the Windows Event Log. From this NXLog blog post regarding WMI auditing, look for event ID 5861 records in the WMI-Activity/Operational Event Log.
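
If you're working from a dead-box collection, pulling those records out doesn't take much. Here's a sketch that assumes Willi Ballenthin's python-evtx library; the log file on disk is typically named Microsoft-Windows-WMI-Activity%4Operational.evtx:

    # Sketch: dump event ID 5861 records (permanent event consumer bindings)
    # from a collected WMI-Activity/Operational Event Log file.
    import sys
    from Evtx.Evtx import Evtx   # pip install python-evtx

    def dump_5861(evtx_path):
        with Evtx(evtx_path) as log:
            for record in log.records():
                xml = record.xml()
                if ">5861</EventID>" in xml:
                    print(xml)
                    print("-" * 60)

    if __name__ == "__main__":
        dump_5861(sys.argv[1])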

I know that some folks like to use plaso, and while it is a great tool, I'm not sure that it parses the WMI repository. I found this issue regarding adding the capability, but I haven't seen where the necessary parser has been added to the code. If this capability has been added, I'd greatly appreciate it if someone could link me to a resource that describes/documents this fact. Thanks!

Monday, March 10, 2025

The Problem with the Modern Security Stack

I read something interesting recently that stuck with me. Well, not "interesting", really...it was a LinkedIn post on security sales. I usually don't read or follow such things, but for some reason, I started reading through this one, and really engaging with the content. This piece said that "SOC analysts want fewer alerts", and went on with further discussions of selling solutions for the "security stack". I was drawn in by the reference to "security stack", and it got me to thinking...what constitutes a "security stack"? What is that, exactly? 

Now, most folks are going to refer to various levels of tooling, but for every definition you hear, I want you to remember one thing...most of these "security stacks" stand on an extremely weak foundation. What this means is that if you lay your "security stack" over default installations of OSs and applications, with no configuration modifications or "hardening", if you have no asset inventory, and you haven't performed even high-level attack surface reduction, it's all for naught. It's difficult to filter out noise and false positives in detections when nothing has been done to configure the endpoints themselves to reduce noise.

One approach to a security stack is to install EDR and other security tooling on the endpoints (all of them, one would hope), and manage it yourself, via your own SOC. I know of one organization several years ago that had installed EDR on a subset of their systems, and enabled it in learning mode. Unfortunately, it was a segment on which a threat actor was very active, and rather than being used to take action against the threat actor, the EDR learned that the threat actor's activity was "normal".

I know of another organization that was hit by a threat actor, and during the after action review, they found that the threat actor had used "net user" (native tool/LOLBin) to create new user accounts within their environment. They installed EDR, and were not satisfied with the default detection rules, so they created one to detect the use of net.exe to create user accounts. They were able to do this because they knew that within their organization, they did not use this LOLBin to manage user accounts, and they also knew which app they used, which admins did this work, and from which workstations. As such, they were able to write a detection rule with 100% fidelity, knowing that any detection was going to be malicious in nature.

What happens if you outsource your "security stack", even just part of it, and don't manage that stack yourself (usually referred to as MDR or XDR)? Outsourcing your security stack can become even more of an issue, because while you have access to expertise (you hope), you're faced with another issue altogether. Those experts in response and detection engineering are now faced with receiving data from literally hundreds of other infrastructures, all in similar (but different) states of disarray as yours. The challenge then becomes, how do you write detections so that they work, but do not flood the SOC (and ultimately, customers) with false positives?

In some cases, the answer is, you don't. There are times when activity that is 100% malicious on one infrastructure is part of a critical business process for others. While I worked for one MDR company in particular, we saw that...a lot. We had a customer that had their entire business built on sharing Office documents with embedded macros over the Internet...and yes, that's exactly how a lot of malware made/makes it on to networks. We also had other customers for whom MSWord or Excel spawning cmd.exe or PowerShell (i.e., running a macro) could be a false positive. Under such circumstances, do you keep the detections and run the risk of regularly flooding the SOC with alerts that are all false positives, or do you take a bit of a different approach and focus on detecting post-exploitation activity only?

Figure 1: LinkedIn Post
A recent LinkedIn post from Chris regarding the SOC survey results is shown in figure 1. One of the most significant issues for a SOC is the "lack of logs". This could be due to a number of reasons, but in my experience over the years, the lack of logs is very often the result of configuration management, or lack thereof. For example, by default MSSQL servers log failed login attempts, and modifications to stored procedures (enabling, disabling); successful logins are not recorded by default. I've also seen customers either disable auditing of both successful logins and failed login attempts, or turn up the auditing so high that the information needed when an attack occurs is overwritten quickly, sometimes within minutes, or even quicker. All of this goes back to how the foundation of the security stack, the operating system and installed applications, is built, configured, and managed.

Figure 2: SecureList blog Infection Flow
Figure 2 illustrates the Infection Flow from a recent SecureList article documenting findings regarding the SideWinder APT. The article includes the statement, "The attacker sends spear-phishing emails with a DOCX file attached. The document uses the remote template injection technique to download an RTF file stored on a remote server controlled by the attacker. The file exploits a known vulnerability (CVE-2017-11882) to run a malicious shellcode..."; yes, it really does say "CVE-2017-11882", and yes, the CVE was published over 7 years ago. I'm sharing the image and link to the article not to shame anyone, but rather to illustrate that the underlying technology employed by many organizations may be out of date, unpatched, and/or consisting of default, easily compromised configurations.

The point I'm making here is that security stacks built on a weak foundation are bound to have problems, perhaps even catastrophic ones. A strong foundation begins with an asset inventory (of both systems and applications), and attack surface reduction (through configuration, patching, etc.). Very often, it doesn't take a great deal to harden systems; for example, here's a Huntress blog post where Dray shared free PowerShell code that provides a modicum of "hardening" to endpoints.

Common issues include:
- Publicly exposed RDP on servers and workstations, with no MFA; no one is watching the logs, so they don't see the brute force attacks from public IP addresses
- Publicly exposed RDP with MFA, but other services not covered by MFA (SMB, MSSQL) are also exposed, so the MFA can be disabled; this applies to other security services, as well, such as anti-virus, and even EDR
- Exposed, unpatched, out of date services
- Disparate endpoints that are not covered by security services (webcams, anyone?) 

Wednesday, February 19, 2025

Lina's Write-up

Lina recently posted on LinkedIn that she'd published another blog post. Her blog posts are always well written, easy to follow, fascinating, and very informative, and this one did not disappoint.

In short, Lina says that she found a bunch of Chinese blog posts and content describing activity that Chinese cybersecurity entities have attributed to what they refer to as "APT-C-40", or the NSA. So, she read through them, translated them, and mapped out a profile of the NSA by overlaying the various write-ups.

Lina's write-up has a lot of great technical information, and like the other stuff she's written, is an enthralling read. Over the years, I've mused with others I've worked with as to whether or not our adversaries had dossiers on us, or other teams, be they blue or red. As it turns out, thanks to Lina, we now know that they do, what those dossiers might look like, and the advantage that the eastern countries have over the west.

For me, the best part of the article was Lina's take-aways. It's been about 30 yrs since I touched a Solaris system, so while I found a lot of what Lina mentioned in the article interesting (like how the Chinese companies knew that APT-C-40 were using American English keyboards...), I really found the most value in the lessons that she learned from her review and translation of open Chinese reporting. Going forward, I'll focus on the two big (for me) take-aways:

There is a clear and structured collaboration...

Yeah...about that.

A lot of this has to do with the business models used for DFIR and CTI teams. More than a few of the DFIR consulting teams I've been a part of, or ancillary to, have been based on a utilization model, even the ones that said they weren't. A customer call comes in, and the scoping call results in an engagement of a specific length; say, 24 or 48 hrs, or something like that. The analyst has to collect information, "do" analysis and write a report, eating any time that goes over the scoped time frame, or taking shortcuts in analysis and reporting to meet the timeline. As such, there's little in the way of cross-team collaboration, because, after all, who's going to pay for that time?

In 2016, I wrote a blog post about the Samas (or SamSam) ransomware activity we'd seen to that point. This was based on correlation of data across half a dozen engagements, each worked by a different analyst. The individual analysts did not engage with each other; rather, they simply proceeded through the analysis and reporting of their engagement, and were then assigned to other engagements.

Shortly after that blog post was published, Kevin Strickland published his analysis of another aspect of the attacks; specifically, the evolution of the ransomware itself.

Two years later, additional information was published about the threat group itself, some of which had been included in the original blog post.

The point is that many DFIR teams do not have a business model that facilitates communications across engagements, and as such, analysts aren't well practiced at large scale communications. Some teams are better at this than others, but that has a lot to do with the business model and culture of the team itself. 

Overall, there really isn't a great deal of collaboration within teams and organizations, largely because everyone is silo'd off by business models; the SOC has a remit that doesn't necessarily align with DFIR, and vice versa; the CTI team doesn't have much depth in DFIR skill sets, and what the CTI team publishes isn't entirely useful on a per-engagement basis to the DFIR team. I've worked with CTI analysts who are very, very good at what they do, like Allison Wikoff (re: Mia Ash), but there was very little overlap between the CTI and IR teams within those organizations.

Now, I'm sure that there's a lot of folks reading this right now who're thinking, "hey, hold on...I/we collaborate...", and that may very well be the case. What I'm sharing is my own experience over the past 25 yrs, working in DFIR as a consultant, in FTE roles, running and working with SOCs, working in companies with CTI teams, etc.

This is an advantage that the east has over the west: collaboration. As Lina mentioned, a lot of the collaboration in the west happens in closed, invite-only groups, so a lot of what is found isn't necessarily shared widely. As a result, those who are not part of those groups don't have access to information or intel that might validate their own findings, or fill in some gaps. Further, those who aren't in these groups have information that would fill in gaps for those who are, but that information can't be shared, nor developed.

...Western methodologies typically focus on constructing a super timeline...

My name is Harlan, and I'm a timeliner. Not "super timelines"...while I'm a huge fan of Kristinn (heck, I bought the guy a lollipop with a scorpion inside once), I'm a bit reticent to hand over control of my timeline development to log2timeline/plaso. This is due, in part, to knowing where the gaps are, what artifacts the tool parses, and which ones it doesn't. Plaso and its predecessor are great tools, but they don't get everything, particularly not everything I need for my investigations, based on my analysis goals.

Okay, getting back on point...I see what Lina's saying, or perhaps it's more accurate to say, yes, I'm familiar with what she describes. In several instances, I've done a good bit of adversary profiling myself, without the benefit of "large scale data analysis using AI" because, well, AI wasn't available, and I started out my investigation looking for those things. In one instance, I could see pretty clearly not just the hours of operation of the adversary, but we'd clearly identified two different actors within the group going through shift changes on a regular basis. On the days where there was activity on one of the nexus endpoints, we'd see an actor log in, open a command prompt/cmd.exe, and then interact with the Event Logs (not clearing them). Then, about 8 hrs later (give or take), that actor would log out, and another actor would log in and go directly to PowerShell. 

Adversary profiling, going beyond IOCs and TTPs to look at hours of operation/operational tempo, situational awareness, etc., is not something that most DFIR teams are tasked with, and deriving that sort of insight from intrusion data is not something either DFIR or CTI teams are necessarily equipped or staffed for. This doesn't mean that it doesn't happen, just that it's not something that we, in the West, see in reporting on a regular basis. We simply don't have a culture of collaboration, neither within nor across organizations. Rather, if detailed information is available, many times it's held close to the vest, as part of a competitive advantage. In my experience, it's less about competitive advantage, and more often the case that, while the data is available, it's not developed into intel, nor insights.
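
For what it's worth, the basic hours-of-operation piece doesn't require AI or a dedicated team; it can fall out of timeline data you already have. Here's a rough, illustrative sketch; the column names are hypothetical, and the input is assumed to be a normalized CSV timeline with a UTC timestamp and an account field:

    # Sketch: bucket timeline events by actor and hour of day to get a rough
    # view of operational tempo / shift patterns.
    import csv
    from collections import Counter, defaultdict
    from datetime import datetime

    def hours_of_operation(csv_path, time_col="timestamp", actor_col="user"):
        profile = defaultdict(Counter)
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                ts = datetime.fromisoformat(row[time_col])
                profile[row[actor_col]][ts.hour] += 1
        for actor, hours in sorted(profile.items()):
            summary = ", ".join(f"{h:02d}:00 x{n}" for h, n in sorted(hours.items()))
            print(f"{actor}: {summary}")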

Conclusion
I really have to applaud Lina not only for taking the time to, as she put it, dive head-first into this rabbit hole, but also for putting forth the effort and having the courage to publish her findings. In his book Call Sign Chaos, Gen. Mattis referred to the absolute need to be well-read, and that applies not just to warfighters, but across disciplines, as well. However, in order for that to be something that we can truly take advantage of, we need writing like Lina's to educate and inspire us.

Sunday, February 16, 2025

The Role of AI in DFIR

The role of AI in DFIR is something I've been noodling over for some time, even before my wife first asked me the question of how AI would impact what I do. I guess I started thinking about it when I first saw signs of folks musing over how "good" AI would be for cybersecurity, without any real clarity or specification as to how that would work.

I recently received a more pointed question regarding the use of AI in DFIR, asking if it could be used to develop investigative plans, or to identify both direct and circumstantial evidence of a compromise. 

As I started thinking about the first part of the question, I was thinking to myself, "...how would you create such a thing?", but then I switched to "why?" and sort of stopped there. Why would you need an AI to develop investigative plans? Is it because analysts aren't creating them? If that's the case, then is this really a problem set for which "AI" is a solution?

About a dozen years ago, I was working at a company where the guy in charge of the IR consulting team mandated that analysts would create investigative plans. I remember this specifically because the announcement came out on my wife's birthday. Several months later, the staff deemed the mandate a resounding success, but no one was able to point to a single investigative plan. Even a full six months after the announcement, the mandate was still considered a success, but no one was able to point to a single investigative plan. 

My point is, if your goal is to create investigative plans and you're looking to AI to "fill the gap" because analysts aren't doing it, then it's possible that this isn't a problem for which AI is a solution. 

As to identifying evidence or artifacts of compromise, I don't believe that's necessarily a problem set that needs AI as the solution, either. Why is that? Well, how would the model be trained? Someone would have to go out and identify the artifacts, and then train the model. So why not simply identify and document the artifacts?

There was a recent post on social media regarding investigating WMI event consumers. While the linked resource includes a great deal of very valuable information, it's missing one thing...specific event records within the WMI-Activity/Operational Event Log that apply to event bindings. This information can be found (it's event ID 5861) and developed, and my point is that sometimes, automation is a much better solution than, say, something like AI, because what we see as the 'training set' is largely insufficient.

What do I mean by that? One of the biggest, most recurring issues I continue to see in DFIR circles is the misrepresentation (sometimes subtle, sometimes gross) of artifacts such as AmCache and ShimCache. If sources such as these, which are very often incomplete and ambiguous, leaving pretty significant gaps in understanding of the artifacts, are what constitute the 'training set' for an AI/LLM, then where is that going to leave us when the output of these models is incorrect? And at this point, I'm not even talking about hallucinations, just models being trained with incorrect information.

Expand that beyond individual artifacts to a SOAR-like capability; the issues and problems simply become compounded as complexity increases. Then, take it another step/leap further, going from a SOAR capability within a single organization, to something so much more complex, such as an MDR or MSSP. Training a model in a single environment is complex enough, but training a model across multiple, often wildly disparate environments increases that complexity by orders of magnitude. Remember, one of the challenges all MDRs face is that what is a truly malicious event in one environment is often a critical business process in others.

Okay, let's take a step back for a moment. What about using AI for other tasks, such as packet analysis? Well, I'm so glad you asked! Richard McKee had that same question, and took a look at passing a packet capture to DeepSeek:

The YouTube video associated with the post can be found here.

Something else I see mentioned quite a bit is how AI is going to impact DFIR, by allowing "bad guys" to uncover zero day exploits. That's always been an issue, and I'm sure that the new issue with AI is that bad guys will cycle faster on developing and deploying these exploits. However, this is only really an issue for those who aren't prepared; if you don't have an asset inventory (of both systems and applications), haven't done anything to reduce your attack surface, haven't established things like patching and IR procedures...oh, wait. Sorry. Never mind. Yeah, that's going to be an issue for a lot of folks.

Monday, January 20, 2025

Artifacts: Jump Lists

In order to fully understand digital analysis, we need to understand the foundational methodology, as well as the various constituent artifacts on which a case may be built. The foundational methodology starts with your goals...what are you attempting to prove or disprove...and once you understand the goals of your analysis, you can assemble the necessary artifacts to leverage in pursuit of those goals.

Like many of the artifacts we might examine on a Windows system, Jump Lists can provide useful information, but they are most useful when viewed in conjunction with other artifacts. Viewing artifacts in isolation deprives the analyst of valuable context.

Dr. Brian Carrier recently published an article on Jump List Forensics over on the CyberTriage blog. In that article, he goes into a good bit of depth regarding both the Automatic and Custom Jump Lists, and for the sake of this article, I'm going to cover just the Automatic Jump Lists. 

As Brian stated in his article, Jump Lists have been around since Windows 7; I'd published several articles on Jump Lists going back almost 14 years at this point. Jump Lists are valuable to analysts because they're (a) created as a result of user interaction via the Windows Explorer shell, (b) evidence of program execution, and (c) evidence of data or file access. 

Automatic Jump Lists follow the old Windows OLE "structured storage" format. Microsoft refers to this as the "compound file binary" format and has thoroughly documented the format structures. Some folks who've been around the industry for a while will remember that the OLE format is what Office documents used to use, and that there was a good bit of metadata associated with these documents. In fact, a good way to find the old school "OG" analysts still hanging around the industry is to mention the Blair document. And the format didn't disappear when Office was updated to the newer style format; rather, the format is used in other areas, such as Jump Lists, and at one point was used for Sticky Notes.

Here's my code for parsing the "structured storage" format; this was specifically developed for Windows 7 Automatic Jump Lists, but the basic code can be repurposed for OLE files in general, or updated for specific fields (i.e., the DestList stream) in newer versions of Windows.

As you saw in Brian's article, Automatic Jump Lists are specific to each user, and are found within the user's profile path. Each Automatic Jump List is named using an "application identifier" or "AppID". This is a value that identifies the application used to open the target files (Notepad, Notepad++, MSWord, etc.), and is consistent across systems. This means that an AppID that refers to a particular application on one Windows system will remain the same on other Windows systems.

Microsoft has referred to the "structured storage" format as a "file system within a file"; if you do a study of the format, you'll see why. This structure results in various 'streams' being within the file, and for Automatic Jump Lists, there are two types of streams. Most of the streams in an Automatic Jump List file follow the Windows shortcut/LNK file format.

The other type of stream is referred to as the "DestList" stream, and the structure of this stream on Windows 7 systems was first documented about 14 yrs ago. The following figure illustrates an Automatic Jump List opened in the Structured Storage Viewer, with the DestList stream highlighted.

The structure of the DestList stream changed slightly between Windows 7 and 10 (and maybe again with Windows 11, I haven't looked yet...), but the overall structure of the Automatic Jump List files remains essentially the same.
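
If you want to poke at the structure yourself, the olefile Python library will list and extract the streams for you. This is a minimal sketch; olefile understands the compound file format generally, and knows nothing about Jump Lists specifically:

    # Sketch: list the streams in an *.automaticDestinations-ms file using
    # olefile (pip install olefile). Numbered streams hold LNK-format data;
    # the DestList stream holds the MRU/MFU metadata.
    import sys
    import olefile

    def list_streams(jumplist_path):
        ole = olefile.OleFileIO(jumplist_path)
        try:
            for entry in ole.listdir():
                name = "/".join(entry)
                print(f"{name:20s} {ole.get_size(name):8d} bytes")
            if ole.exists("DestList"):
                data = ole.openstream("DestList").read()
                print(f"\nDestList stream is {len(data)} bytes")
        finally:
            ole.close()

    if __name__ == "__main__":
        list_streams(sys.argv[1])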

Summary
Automatic Jump Lists help analysts validate that a user was active on the system via the Windows shell (as well as when), that they launched applications (program execution), and that they used those applications to open files (file/data access), and when they did so. As such, parsing Jump Lists and including the data in a timeline can add a good deal of granularity and context to the timeline, particularly as it pertains to user activity.

As always, Automatic Jump Lists should be used in conjunction with other artifacts, such as Prefetch, UserAssist, RecentDocs, etc., and should not be viewed in isolation, pursuant to the analyst's investigative goals.

Something else to remember is this...Automatic Jump Lists are generated by the operating system as the user interacts with the environment. As such, if an application is added, the user uses that application and Automatic Jump Lists are generated, and then the user removes the application, the Automatic Jump Lists remain. The same thing happens with other artifacts, such as Recents shortcuts/LNK files, Registry values, etc. So, as with other artifacts, Automatic Jump Lists can provide indications of applications previously installed or files that previously existed on (or were accessed from) the endpoint.

Monday, January 06, 2025

Carving

Recovering deleted data, or "carving", is an interesting digital forensics topic; I say "interesting" because there are a number of different approaches and techniques that may be valuable, depending upon your goals. 

For example, I've used X-Ways to recover deleted archives from the unallocated space of a web server. A threat actor had moved encrypted archives to the web server, and we'd captured the password they used via EDR telemetry. The carving revealed about a dozen archives, which we opened using the captured password; this allowed our customer to understand what data had been exfil'd, and their risk and exposure.

But carving can be about more than just recovering files from unallocated space. We can carve files and records from unstructured data, or we can treat 'structured' data as unstructured and attempt to recover records. We did this quite a bit during PCI forensic investigations, and found a much higher level of accuracy/fidelity when we carved for track 1 and 2 data, rather than just credit card numbers. 
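
The reason is that track data has a lot more structure to anchor on than a bare 16-digit number. Here's a rough, illustrative sketch; the patterns below are a simplification of the ISO/IEC 7813 track layouts, and a production check would also validate the PAN with a Luhn check and issuer prefixes:

    # Sketch: carve for track 1/track 2 data rather than bare card numbers.
    import re
    import sys

    TRACK1 = re.compile(rb"%B(\d{13,19})\^[^^]{2,26}\^\d{4}\d{3}[^?]*\?")
    TRACK2 = re.compile(rb";(\d{13,19})=\d{4}\d{3}[^?]*\?")

    def carve_tracks(path):
        with open(path, "rb") as f:
            data = f.read()
        for label, pattern in (("track1", TRACK1), ("track2", TRACK2)):
            for m in pattern.finditer(data):
                print(f"{label} @ offset {m.start()}: PAN {m.group(1).decode()}")

    if __name__ == "__main__":
        carve_tracks(sys.argv[1])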

We can also carve within files themselves. Several common file formats are essentially databases, and some are described as a "file system within a file". As such, deleted records and data can be recovered from such file formats, if necessary.

I recently ran across a fascinating post from TheDFIRJournal regarding file carving encrypted virtual disks. The premise of the post is that some file encryption/ransomware software does not encrypt entire files, but rather just part of each file, for the sake of speed. In the case of virtual disks, a partially encrypted file may mean that, even if the disk itself is no longer usable, there may be valuable evidence available within the virtual disk file itself.

I should note that I did recently see a ransomware deployment that used a "--mode fast" switch at the command line, possibly indicating that the entire file would not be encrypted, but rather only a specific number of bytes of the file. As such, with larger files, such as virtual disks, WEVT files, etc., there might be an opportunity to recover valuable data, so file and record carving techniques would be valuable, depending upon your specific investigative goals.
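
One quick way to see whether a large file was only partially encrypted is to walk it and compute entropy per chunk; encrypted regions sit near 8 bits per byte, while untouched content usually falls well below that. A minimal sketch follows, where the chunk size and threshold are arbitrary choices for illustration:

    # Sketch: per-chunk Shannon entropy scan of a large file, e.g. a
    # partially encrypted virtual disk.
    import math
    import sys
    from collections import Counter

    def chunk_entropy(data):
        counts = Counter(data)
        total = len(data)
        return -sum((n / total) * math.log2(n / total) for n in counts.values())

    def scan(path, chunk_size=1024 * 1024):
        offset = 0
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                ent = chunk_entropy(chunk)
                flag = "  <-- likely encrypted" if ent > 7.9 else ""
                print(f"{offset:>16d}  {ent:5.2f}{flag}")
                offset += len(chunk)

    if __name__ == "__main__":
        scan(sys.argv[1])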

The premise raised in the article is not unique; in fact, I've run into it before. In 2017, when NotPetya hit, we received a number of system images from customers where the MBR was overwritten. We had someone on our team who could reconstruct the MBR, and we also ran carving for WEVTX records, recovering Security-Auditing/4688 records indicating process creation. The customers had not enabled recording of full command lines, but we were able to reconstruct enough data to illustrate the sequence of processes specific to the infection and impact. So, having a disk image where the MBR and/or the MFT is overwritten is not a new situation, simply one we haven't encountered recently.

TheDFIRJournal article covers a number of tools, including PhotoRec, scalpel (not currently being maintained), and Willi Ballenthin's EVTXtract. The article also covers Simson Garfinkel's bulk_extractor, but looking at the bulk_extractor Github, there do not appear to be Windows releases starting with version 2.0. While some folks have stated that bulk_extractor-rec's capabilities have been added to bulk_extractor, that's kind of a moot point without a Windows build, and the latest release of bulk_extractor-rec will have to suffice.

Addendum, 7 Jan 2025: Thanks to Brian Maloney for sharing that the bulk_extractor 2.0 for Windows CLI tool can be found here.

Also from the article, the author mentioned the use of a custom EVTXParser script, which can be found here. I like this approach, as I'd done something similar with the WinXP/2003 EVT files, where I'd written lfle.pl to parse EVT records from unstructured data, which could include a .EVT file. I wrote this script (a 'compiled' Windows EXE is also available) after finding two complete records embedded in an .EVT file that were not "visible" via the Event Viewer, nor via any other tools that started off by reading the file header to determine where the records were located. The script then evolved into something you could run against any data source. While not the fastest tool, at the time it was the only tool available that would take this approach.
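
The signature-based idea behind that approach is straightforward: scan arbitrary data for the "LfLe" magic used by Windows 2000/XP/2003 event records, then sanity-check each candidate using the record length stored at both the start and the end of the record. Here's a simplified, illustrative sketch (not the original Perl, and without parsing the record fields themselves):

    # Sketch: carve candidate .EVT records from unstructured data by scanning
    # for the "LfLe" signature and checking the leading/trailing length fields.
    import struct
    import sys

    def carve_evt_records(path, max_len=0x10000):
        with open(path, "rb") as f:
            data = f.read()
        idx = data.find(b"LfLe")
        while idx != -1:
            start = idx - 4                       # record length sits just before the magic
            if start >= 0:
                (length,) = struct.unpack_from("<I", data, start)
                end = start + length
                if 0x38 <= length <= max_len and end <= len(data):
                    (trailer,) = struct.unpack_from("<I", data, end - 4)
                    if trailer == length:
                        print(f"candidate record at offset {start}, {length} bytes")
            idx = data.find(b"LfLe", idx + 1)

    if __name__ == "__main__":
        carve_evt_records(sys.argv[1])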

In the past, I've done carving on unallocated space within a disk image, using something like blkls to get the unallocated space into one contiguous file of unstructured data. From there, running tools like bulk_extractor allows for record carving.

I've also had pretty good success running bulk_extractor across memory dumps; this is something I talked about/walked through in my book, Investigating Windows Systems.

Carving can also be done on individual files. For example, in 2013, Mari DeGrazia published a great blog post on recovering deleted data from SQLite databases; carving Registry hive files for deleted keys and values, as well as examining unallocated space within hive files, is something I've been a fan of for quite some time. My thanks go to Jolanta Thomassen for 'cracking the code' on deleted cells within Registry hive files!

Here's a presentation I put together a while back that includes information regarding unallocated space within Registry hive files.

Addendum, 13 Jan: Damien Attoe released his first blog post regarding a tool he's working on called "sqbite"; the alpha functionality is what's currently available, and Damien plans to release additional functionality in March. Reading through his blog post, it appears that Damien is working toward something similar to what Mari talked about and released. It's going to be interesting to see what he develops!