Monday, January 06, 2025

Carving

Recovering deleted data, or "carving", is an interesting digital forensics topic; I say "interesting" because there are a number of different approaches and techniques that may be valuable, depending upon your goals. 

For example, I've used X-Ways to recover deleted archives from the unallocated space of a web server. A threat actor had moved encrypted archives to the web server, and we'd captured the password they used via EDR telemetry. The carving revealed about a dozen archives, which we opened using the captured password, which allowed our customer to understand what data had been exfil'd, and their risk and exposure. 

But carving can be about more than just recovering files from unallocated space. We can carve files and records from unstructured data, or we can treat 'structured' data as unstructured and attempt to recover records. We did this quite a bit during PCI forensic investigations, and found a much higher level of accuracy/fidelity when we carved for track 1 and 2 data, rather than just credit card numbers. 

We can also carve within files themselves. Several common file formats are essentially databases, and some are described as a "file system within a file". As such, deleted records and data can be recovered from such file formats, if necessary.

I recently ran across a fascinating post from TheDFIRJournal recently, regarding file carving encrypted virtual disks. The premise of the post is that some file encryption/ransomware software does not encrypt entire files, just rather just part of it, for the sake of speed. In the case of virtual disks, a partially encrypted file may mean that, while the disk itself is useable, there may be valuable evidence available within the virtual disk file itself. 

I should note that I did recently see a ransomware deployment that used a "--mode fast" switch at the command line, possibly indicating that the entire file would not be encrypted, but rather only a specific number of bytes of the file. As such, with larger files, such as virtual disks, WEVT files, etc., there might be an opportunity to recover valuable data, so file and record carving techniques would be valuable, depending upon your specific investigative goals.

The premise raised in the article is not unique; in fact, I've run into it before. In 2017, when NotPetya hit, we received a number of system images from customers where the MBR was overwritten. We had someone on our team who could reconstruct the MBR, and we also ran carving for WEVTX records, recovering Security-Auditing/4688 records indicating process creation. The customers had not enabled full command lines being recorded, but we were able to reconstruct enough data to illustrate the sequence of processes specific to the infection and impact. So, having a disk image where the MBR and/or the MFT is overwritten is not a new situation, simply one we haven't encountered recently.

TheDFIRJournal article covers a number of tools, including PhotoRec, scalpel (not currently being maintained), and Willi Ballenthin's EVTXtract. The article also covers Simson Garfinkel's bulk_extractor, but looking at the bulk_extractor Github, there do not appear to be releases for Windows starting with version 2.0. While some folks have stated that bulk_extractor-rec's capabilities have been added to bulk_extractor, that's kind of a moot point, and the latest release of bulk_extractor-rec will have to suffice. 

Addendum, 7 Jan 2025: Thanks to Brian Maloney for sharing that the bulk_extractor 2.0 for Windows CLI tool can be found here.

Also from the article, the author mentioned the use of a customer EVTXParser script, which can be found here. I like this approach, as I'd done something similar with the WinXP/2003 EVT files, where I'd written lfle.pl to parse EVT records from unstructured data, which could include a .EVT file. I wrote this script (a 'compiled' Windows EXE is also available) after finding two complete records embedded in an .EVT file that were not "visible" via the Event Viewer, nor any other tools that started off by reading the file header to determine where the records were located. The script then evolved into something you could run against any data source. While not the fastest tool, at the time it was the only tool available that would take this approach. 

In the past, I've done carving on unallocated space within a disk image, using something like blkls to get the uallocated space into on contiguous file of unstructured data. From there, running tools like bulk_extractor allow for record carving.

I've also has pretty good success running bulk_extractor across memory dumps; this is something I talked about/walked through in my book, Investigating Windows Systems.

Carving can also be done on individual files. For example, in 2013, Mari DeGrazia published a great blog post on recovering deleted data from SQLite databases, and carving Registry hive files for deleted keys and values, as well as examining unallocated space within hive files is something I've been a fan of for quite some time. My thanks go to Jolanta Thomassen for 'cracking the code' on deleted cells within Registry hive files!

Here's a presentation I put together a while back that includes information regarding unallocated space within Registry hive files.

Addendum, 13 Jan: Damien Attoe released his first blog post regarding a tool he's working on called "sqbite"; the alpha functionality is what's currently available, and Damien plans to release additional functionality in March. Reading through his blog post, it appears that Damien is working toward something similar to what Mari talked about and released. It's going to be interesting to see what he develops!

Tuesday, December 24, 2024

UEPOTB, LNK edition

A while back, Jesse Kornblum published a paper titled, "Using Every Part of the Buffalo in Windows Memory Analysis". This was, and still is, an excellent paper, based on it's content and how it pertained to the subject (Windows memory analysis). However, what Jesse shared in that paper had additional value, in that it he expressed the idea of using everything available to the analyst.

Since then, we've seen more than a few papers, blog posts and articles where the author(s) go only so far with the extent of what they share regarding the data they're looking at, and do not truly use all the parts of the buffalo. Examples of these include blog posts or articles where LNK files are delivered as part of a phishing campaign, and the author only goes so far as to show the basic properties of the LNK file, perhaps up through the command line, but then stopping there.

Now and again, we do see articles published by teams that truly do strive to use all the parts of the buffalo, leveraging everything they have available, but those are still few and far between. Perhaps one of the most notable examples is a Mandiant article from 19 Nov 2018 that referred to a phishing campaign (using LNK files) by APT29/"Cozy Bear". In the article, the authors compare activity from a similar campaign from 2016, using LNK files from the previous campaign (see figures 5 & 6).

One such example of where the content falls short is a recent blog post from Cyble. The article contains references to three LNK files using in phishing campaigns, each illustrated in the article by opening the file via the Properties tab, as seen in figure 1. This shows specific elements of the LNK file, but only those visible via the Properties tab. 

Figure 1: LNK Properties

What we see in figure 1 also shows that the author(s) had access to the LNK files themselves, and could have done so much more with them. 

Hashes for the LNK files are also listed in the IoCs table at the end of the article, and I was able to find one of the file available for download via another site online, and was able to extract the metadata illustrated in figure 2.

Figure 2: TrackerDataBlock

"christmas-destr" seems to be pretty unique workstation name, and might prove to be interesting in tracking and retro-hunts. And yes, if you do an OUI lookup on the node IDs shown in figure 2, they'll align with "VMWare, Inc.", which isn't surprising. 

Interestingly enough, the LNK file also contains a fairly well-populated PropertyStoreDataBlock, which isn't something we see in every LNK file, and can provide insight into how the shortcut file was "constructed". Different methodologies and tools for creating LNK files leave different toolmarks.

Taking things a step further, in addition to the information in the rest of the LNK file discussed above, we can see from figure 3 that the information within the file header and the shell item ID list can be fairly illuminating.

Figure 3: LNK header+

The original article was published on 19 Dec 2024, which provides some idea as to the timeframe of when the LNK file would have been deployed in a campaign, collected, and analyzed. Using the information illustrated in figure 3, we get some additional insight as to the timeframe specifically associated with the LNK file, particularly those time stamps within the shell items.

In addition to the volume serial number (i.e., "280C-1822"), the time stamps and MFT reference numbers extracted from the shell items provides additional indicators that can be used to align with LNK files from other campaigns. 

Another such example is a blog from Blackberry that discusses a Pakistani threat group dubbed "Transparent Tribe". I was able to find a couple of the LNK files listed in the appendix available for download, based on the provided hashes. What I found when comparing the metadata extracted from two LNK files was that aside from some minor differences, mostly in the command lines, they were nearly identical. Same volume serial number, same machine ID, same time stamps in the shell items, etc. In this case, the machine ID value was "desktop-e7n7e7f", which (a) should be compared across the other LNK files as well as other data sources, and (b) can be used along with other elements of metadata as part of a VirusTotal retro-hunt to expand the intelligence aperture even further, potentially associating the system with other campaigns. 

These LNK files also contained a PropertyStoreDataBlock, but rather than being as verbose as the previous example from the Cyble article, these LNK files simply contained a SID:

S-1-5-21-3861309104-3271506253-2070734288-1001

Okay, but so what? Why does any of this matter? 

Well, these indicators, when combined and added to other indicators, tell us a good bit about the operational processes of the threat actor, as well as the development environment and processes employed by the threat actor/group.

Beyond the basic indicators, the structure of the LNK file itself can tell us a good bit about how the LNK file was constructed, as well as the situational awareness of the actor or group. For example, there have been instances where the time stamps in the shell items have been zero'd out, essentially removing those indicators. However, I'd be careful about any assumptions made regarding a threat group's situational awareness or operational security based on metadata within LNK files; the simple fact is that this information is largely left unused by many firms, so why bother with the extra steps or work to remove the indicators?

Tuesday, November 26, 2024

Program Execution: The ShimCache/AmCache Myth

I recently saw another LinkedIn post from someone supporting and sending readers to a site that was reportedly started using the SANS DFIR poster as a reference. As illustrated in figure 1, this site references the ShimCache artifact as providing evidence of program execution, and does the same for the AmCache artifact, as well.

Figure 1: ShimCache Entry



Now, while yes, it is true that these artifacts can provide evidence of program execution, that is not always the case, and this needs to be understood throughout the community. 

What I'm going to do with this blog post is provide the resources to show why these artifacts do not solely provide evidence of "program execution", so that others in the community can reference these resources.

ShimCache
Mandiant's article regarding ShimCache includes the statement illustrated in figure 2.

Figure 2: Article excerpt








Notice the highlighted section in figure 2, which states, "...that were not actually executed.", and then goes on to say that entries can be added if a user browses to a folder. The ShimCache value is written to the System hive, and does not provide a reference to the user who may have browsed to a folder. As such, this is something that would need to be resolved through user profile analysis, via artifacts such as JumpLists, recently accessed files, shellbags, etc. 

However, the important thing to understand here is that this reductionist approach of saying that ShimCache is evidence solely of program execution is incorrect. 

We also need to remember that the ShimCache is written to the Registry value at shutdown; this is trivial to demonstrate via a timeline. 

AmCache
Blanche Lagney's Analysis of AmCache v2 paper is the definitive reference for all things AmCache. The research is thorough, and presented in a manner that, while it does require some reading to address specific questions, should remove all doubt as to the value of the artifact on specific Windows builds.

If you're looking at just the AmCache.hve file as an artifact to determine program execution, you're going to need to find the closest match to the Windows build and libraries, based on the keys you're looking at, to better understand the nature of the artifacts you're seeing.

Analysis Process
The key here is not to try to memorize the "value" of individual artifacts in isolation, but to have an analysis process where multiple data sources and artifacts are viewed together, so that through this process you can 'see' the context of the events in question. For example, when it comes to program execution, we might look to JumpLists, Security (if configured) and/or Sysmon (if installed) Event Logs, UserAssist entries, and on workstation platforms, Prefetch files. On Windows 11, we might also look to data within the PCA folder. To validate that a program executed, we might look to impacts in the Registry and/or file system, or to the Application Event Log to determine if the program generated an error or crashed.

For teaching/instructional purposes, it would be extremely valuable to start by describing one data source, such as the file system, and then show how that data source can be viewed via a timeline. Then, add another data source, such as the Windows Event Log or the Registry, and add that data source to the timeline. When discussing the Registry (as well as the ShimCache and AmCache artifacts), it will be important for analysts to understand the value of time-based metadata (key LastWrite times), as well as time-based data embedded within individual values, all of which can help better address analysis categories such as "program execution". 

Conclusion
While it is valuable to have an understanding of various artifacts, the most important takeaway from this article is that analysts should not consider artifacts in isolation during an investigation, but should instead look to multiple data sources and artifacts, viewed together, to determine the nature and context of events in question.

Thursday, October 31, 2024

FTSCon

I had the distinct honor and pleasure of speaking at the "From The Source" Conference (FTSCon) on 21 Oct, in Arlington, VA. This was a 1-day event put on prior to the Volexity memory analysis training, and ran two different tracks...Maker and Hunter...with some really great presentations in both tracks!

Before I start in with providing my insights and observations, I wanted to point out that the proceeds of the event went to support Connect Our Kids, a really cool project to get folks, and especially kids, connected with birth parents, etc.

Keynote
Sean Koessel and Steven Adair provided the keynote, which was a look into a fascinating case they worked. In this case, the threat actor gained access to their customer by compromising near-by infrastructures and traversing/moving laterally via the wireless networks; hence the title, "The Nearest Neighbor Attack". Sean and Steven put a lot of effort into crafting and delivering their fascinating story, all about how they worked through this incident, with all of the bumps, detours, and delays along the way. They also promised that they'd be putting together a more comprehensive review of the overall incident on the Volexity blog, so keep an eye out.

Something Sean said at the very beginning of the presentation caught my attention, and got me thinking. He referred to the incident as something, "...no one's ever seen before." As Sean and Steven described this particular incident, it was more than just a bit of a complicated. As such, the question becomes, were they able to get as far as they did due to the knowledge, experience, and teamwork they brought to bear? Would someone else, say a single individual with different or lesser experience, have been able to do the same, or would this incident have been more akin the blind men trying to describe an elephant?

Or, had someone seen this before, and just not thought to share it? Not long ago in my career, I worked with different teams of DFIR consultants, and time after time, I spoke to analysts who insisted that they didn't share what they were seeing, because they assumed, often incorrectly, that "...everyone's already seen this...".

Yarden Sharif gave a really interesting presentation on enclaves, something I hadn't heard of prior to the event (Matthew Geiger graciously explained what they were for me). 

At one point during her presentation, Yarden mentioned that enclaves can be enumerated, and I'm sure I'm not the only one who thought, "whoa, wait...what happens if the bad guy creates an enclave???"

Lex Crumpton's presentation was titled, "ATT&CKing the MITRE NERVE Incident: Operationalizing Threat Intelligence for a Safer Tomorrow." What got my attention most was that Lex said she's interested in "behavior analysis", which, when considered from a DFIR perspective, is something that's fascinated me for quite some time. 

John Hammond is always an entertaining and educational (as well as knowledgeable) presenter, and his malware presentation was pretty fascinating to watch. John does a great job of illustrating his walk-through and sharing his thought processes when finding and unraveling new challenges. 

Andrew Case shared some insight into how Volatility could be used to detect EDR-evading malware, which I thought was pretty interesting. I've used Volatility before, and Andrew shared that there are some plugins that already detected the techniques used by some malware, and that other plugins were created to address gaps.

Something to keep in mind is that all of the techniques Andrew talked about are used by malware to directly address/attack EDR. There are other techniques at play, such as EDR Silencer, which creates WFP filter rules to prevent the EDR from talking to it's cloud infrastructure. This way, it doesn't directly interact with the EDR agent. As pointed out in the WindowsIR blog post, another technique that would work, would leave fewer artifacts, and would likely be missed by younger, less experienced analysts is to modify the hosts file (shoutz to Dray for that one!)

Andrew's presentation of the same title, from DefCon, is available here.

Addendum, 22 Nov: Steve and Sean's blog post was published today, find it here. As one would expect, it was picked up by folks like Brian Krebs.


Saturday, October 26, 2024

Artifact Tracking: Workstation Names

Very often in cybersecurity, we share some level of indicators of compromise (IOCs), such as IP addresses, domain names, or file names or hashes. There are other indicators associated with many compromises or breaches that can add a great deal of granularity or insight to the overall incident, particularly as the intrusion data and intel applies to other observed incidents.

One such indicator is the workstation name, so named based on the indicator as found within Microsoft-Windows-Security-Auditing/4624 event records, indicating a successful login, as well as within Microsoft-Windows-Security-Auditing/4625 and Microsoft-Windows-Security-Auditing/4776 events.

The value of the workstation name can depend upon the type of incident you're responding to, examining, or attempting to detect earlier in the attack cycle (i.e., moving "left of bang"). For example, many organizations become aware that files have been encrypted and they've been ransomed after those two things have happened. However, for someone to access an infrastructure or network, often they first need to access or log into an endpoint. Depending upon how this is achieved, there may be indicators left in popular Windows Event Logs. 

Huntress analysts have observed an IAB or Akira ransomware affiliate during multiple incidents with initial activity (logins via RDP) originating from a workstation named "WIN-JGRMF8L11HO".

While investigating a ReadText34 ransomware incident, Huntress analysts found that RDP logins originated from a workstation named "HOME-PC".

Huntress analysts have also observed the recurrence of workstation names such as "kali" and "0DAY-PROJECT" across multiple incidents. In most (albeit not all) instances, the workstation names associated with the identified malicious activity have not aligned with the naming scheme used by the organization. In fact, in some cases, Huntress analysts have been able to filter through the authentication logs and associate user account names with workstation names and IP addresses to clearly identify the malicious activity; that is, when there is a radical change in the workstation name normally associated with a user account.

While we can also extract workstation names from Splashtop Event Logs, we're not limited simply to Windows Event Log records. For example, Huntress analysts saw logins via legacy TeamViewer installations ahead of attempts to deploy LockBit3.0 ransomware, and in multiple observed incidents, logins originated from a workstation named WIN-8GPEJ3VGB8U. 

Correlation
If you've followed my blog for any amount of time, you'll likely have noticed that I'm very interested in file metadata, particularly LNK file metadata. In many cases, LNK/Windows shortcut files will contain a "machine ID" or NetBIOS name of the endpoint on which it was created. This information can be correlated with workstation names, looking for links between usage, campaigns on which the endpoints appear, etc.

Addendum, 14 Nov
This blog post was published today, regarding SafePay ransomware attacks observed within customer infrastructures. The table of IOCs contains two workstation names.

Addendum, 26 Nov
Others have publicized workstation names as part of their blogs, as well. Last year, Intrinsec published a blog post containing Akira indicators; the section on lateral movement contains six observed workstation names.

Tuesday, October 15, 2024

Analysis Process

Now and again, someone will ask me, "...how do you do analysis?" or perhaps more specifically, "...how do you use RegRipper?" 

This is a tough question to answer, but not because I don't have an answer. I've already published a book on that very topic, and it seems that my process for doing analysis is apparently very different from the way most people do analysis. 

Now, I can't speak to how everyone else goes about analyzing an endpoint, but when I share my process, it seems that that's the end of the conversation. 

My analysis process, laid out in books like "Investigating Windows Systems", is, essentially:


1. Document investigative goals. These become the basis for everything you do in the investigation, including the report.


Always start with the goals, and always start documentation by having those goals right there at the top of your case notes file. When I was active in DFIR consulting, I'd copy the investigative goals into the Executive Summary of the report, and provide 1-for-1 answers. So, three goals, three answers. After all, the Executive Summary is a summary for executives, meant to stand on it's own.


2. Collect data sources.


This one is pretty self-explanatory, and very often based on your response process (i.e., full images vs "triage" data collections). Very often, collection processes will include the least amount of data extracted from a system for the biggest impact, based upon the predominance of business needs, leaving other specific sources for later/follow-on collection, if needed.


3. Parse, normalize, decorate, enrich those data sources.


Basically, create a timeline, from as many data sources as I can or makes sense, based on my investigative goals. Easy-peasy.


Timelines are not something left to the end of the investigation, to be assembled manually into a spreadsheet. Rather, creating a timeline as a means of initiating an investigation provides for much needed context.


4. Identify relevant pivot points.


RegRipper and Events Ripper are great tools for this step. Why is that? Well, within the Registry, often items of interest are encoded in some manner, such as binary, hex, ROT-13, or some folder or other resource represented by a GUID; many of the RegRipper plugins extract and display that info in human-readable/-searchable format. So, running RegRipper TLN plugins to incorporate the data into a timeline, and then run "regular output" plugins to develop pivot points. Events Ripper is great for extracting items of interest from events files with (hundreds of) thousands of lines.


5. Identify gaps, if any, and loop back to #2.


Based on the investigative goals, what's missing? What else do you need to look for, or at? You may already have the data source, such as if you need to look for deleted content in Registry hives,


6. Complete when goals are met, which includes being validated.


An issue we face within the industry, and not just in DFIR, is validation. If a SOC analyst sees a "net user /add" command in EDR telemetry, do they report that a "user account was created" without (a) checking the audit configuration of Security Event Log, and (b) looking for Security-Auditing event records that demonstrate that a user account was created? If it was a local account, is the SAM checked?


Or, if msiexec.exe is seen (via EDR telemetry) running against an HTTP/HTTPS resource, is the Application Event Log checked for MsiInstaller events?


My point is, are we just saying that something happened, or are we validating via the available data sources that it actually happened?


7. Anything "new" gets baked back in


The great thing about timelines and other tools is that very often, you'll find something new, something you hadn't seen before, and was relevant (or could be) to your investigation. This is where most of the Events Ripper plugins have originated; I'll see something "new", often based on an update to Windows, or some installed application, and I'll "bake it back into" the process by creating a plugin.


Yes, documenting it is a good first step, but adding it back into your automation is taking action. Also, this way, I don't have to remember to look for it...it's already there.


For example, several years ago, another analyst mentioned seeing something "new" during a response; looking into it, this new thing was a Microsoft-Windows-TaskScheduler/706 event record, so once I got a little more info about it, and dug into the investigation myself just a bit, I added it to eventmap.txt. After that, I never had to remember to look for it, and I had the necessary references to support the finding already documented.

Rundown

I ran across a fascinating post from Cyber Sundae DFIR recently that talked about the Capability Access Manager, and how with Windows 11 it includes database of applications that have accessed devices such as the mic or camera, going beyond just the Registry keys and values we know about. 

It should surprise no one that this is an artifact found on Windows 11; after all, there've been more than a few changes to Windows 10, even just between various individual builds. As such, depending upon the nature of your case, and your investigative goals, this may be a value resource to explore. 

As a reminder, RegRipper has two plugins that query various values beneath the CapabilityAccessManager\ContentStore subkey, contentstore.pl and location.pl. The contentstore.pl plugin also comes in a TLN variant, as well, so that the information can be included in an investigative timeline.

I also ran across an interesting article regarding artifacts of data exfiltration on various platforms, including Windows. While the list of these artifacts, the one specific to Windows, is a good one, IMHO, it misses some very useful artifacts. Some of the artifacts listed in the article, such as Prefetch files, are not definitive, and need to be used in conjunction with other artifacts to even provide a hint of data exfiltration. After all, you can call something whatever you want on Windows systems and not impact the functionality; you can rename net.exe to winrar.exe, and the Prefetch file will be for winrar.exe, and unfortunately, command line arguments are not stored in the Prefecth files.

Also, the article states that the Shimcache, "...stores information about executables that have been run on the system, even if the file has been deleted. Investigators can use this to trace the usage of data exfiltration tools." The Shimcache does not only/solely store information about executables that have been run on the system, something that has been documented again and again. Executables can be included in the ShimCache if the user has browsed to the folder where the EXE resides. So, yes, the ShimCache does include executables that have been run on the system, but those with little experience often interpret this statement to mean that this is all that the ShimCache includes, and is therefore "evidence of execution". 

There are other, perhaps more definitive data sources that point to data exfiltration. For example, querying the BITS Client Event Log for upload jobs would reveal a good deal of information regarding data exfiltration. One data source I've used in the past is the IIS web server logs; a threat actor moved archive files to the web server, and then issued GET requests for the files. Looking back through the logs we had available, there had been no prior instances of .zip files being requested.

Yes, the SRUM db is a great place to look for evidence of data exfiltration, very much so. However, as with other data sources, we have to keep the context of the data source in mind when conducting an investigation.

Even with this list, there are number of ways to exfil data off of a Windows system, including the use of finger.exe (one of my favorites!).

Wednesday, October 09, 2024

Exploiting LNK Metadata

Anyone who's followed me for a bit knows that I'm a huge proponent of metadata, and in particular, exploiting metadata in LNK files that threat actors create, use as lures, and send to their targets.

I read an article not long ago from Splunk titled, LNK or Swim: Analysis & Simulation of Recent LNK Phishing. The article covered a good bit of information regarding LNK files sent by some threat actors, and even included a list of metadata items that could be used for "threat intel purposes", as illustrated in figure 1.

Fig. 1: Splunk article excerpt






However, what's illustrated in figure 1 was as far as they went. In fact, reading through the article and looking at the images of LNK parser tool output, each of those images is cut off before embedded metadata and "extra data blocks" can be seen. Even then, including this information in the images would require analysts to manually transpose this information, which is a very inefficient and error-prone process, particularly given how small some of these images are within the article.

I will say that the article does go on to talk about the use of LNK files in phishing campaigns, and provides a link to an LNK generator tool. There are some definite opportunities here for a research project, where LNK metadata is compared across different creation methods (righ-click on the Desktop, PowerShell, the generator tool, etc.).

In December, 2016, JPCERT published an article describing how threat actors reveal clues about their development environment when they sent LNK files to their targets. The LNK files would contain metadata associated with the system on which they were created, which from a CTI perspective is "free money".

Figure 1 from the JPCERT article, extracted and illustrated in figure 2, demonstrates one way that the LNK file metadata can be used. In this figure, various elements of metadata are used in a graph to illustrate relationships amongst data that would not be obvious via a spreadsheet.

Fig. 2: Figure 1, excerpted from JPCERT article















At this point, you're probably asking, "how would this metadata be used in the real world?" Almost 2 years after the JPCERT article was published, the folks at Mandiant published an article regarding the comparison of data across two Cozy Bear campaigns, one in 2016 and the other in 2018. Within that article, at figures 5 and 6, the Mandiant analysts compared LNK file from the two campaigns, illustrating not just the differences, but also the similarities, such as the volume serial number (fig. 5) and the machine IDs (fig. 6). While there were differences in time stamps and other metadata, there were also consistencies between the two campaigns, 2 yrs apart.

If you're saying, "...but I don't do CTI..." at this point, that's okay. There may be steps we can take to use what we know about LNK files to protect ourselves.

If you have Sysmon installed on endpoints, Sysmon event ID 11 events identify file creation or modifications; you can monitor the Sysmon Event Log for such events, and extract the full file name and path. If the file extension is ".lnk", you can verify that that file is an LNK file based the "magic number" within the file header and the GUID that follows it. From there, you can then either flag the file based on the path, or take an extra step to compare the machine ID to the current endpoint name; if they're not the same, definitely flag or even quarantine the file. 

Is implementing this yourself kind of scary? No problem. If you're using an EDR vendor (directly, or through an MDR) and the EDR generates similar telemetry (keep in mind, not all do), contact the vendor about adding the capability. Detecting behaviors based on LNK files is notoriously difficult, so why not detect them when they're written to disk, and take action before a user double-clicks it?


Tuesday, October 08, 2024

Shell Items

I ran across a Cyber5W article recently titled, Windows Shell Item Analysis. I'm always very interested in not only understanding parsing of various data sources from Windows systems, but also learning a little something about how others view the topic. 

Unfortunately, there was very little actual "analysis" in the article, an excerpt of which is shown in figure 1.

Figure 1: Text from article






I'm not sure I can agree with that statement; tools, be they open source or commercial, tend to be very good at extracting, parsing, and presenting/displaying data, but analyzing that data really depends on the investigative goals, something to which tools are generally not privy. 

But we do see that quite often in the industry, don't we? We'll see something written up, and it will say, "...<tool name> does analysis of...", and this is entirely incorrect. Tools are generally very good at what they do; that is, parsing and displaying information, that an analyst then analyzes, in the context of their investigative goals, as well as other data sources and artifacts.

The rest of the article doesn't really dig into either the metadata embedded within shell items, nor the analysis of the various artifacts themselves. In fact, there's no apparent mention of the fact that there are different types of shell items, all of which contain different information/metadata. 

I've written quite a bit regarding Windows shell items embedded within various data sources. In fact, looking at the results of a search across this blog, there are more than a few posts. Yes, several of them are from 2013, but that's just the thing...the information still applies, when it comes to shell item metadata. Just because it was written a decade or more ago doesn't mean that it's "out of date" or that it's no longer applicable. 

While it is important to understand the nature and value of various data sources and artifacts, we must also keep in mind that tools do not do analysis, it's analysts and examiners who collect, correlate and analyze data based on their investigative goals.

RegRipper Educational Materials

A recent LinkedIn thread led to a question regarding RegRipper educational materials, as seen in figure 1; specifically, are there any.

Figure 1: LinkedIn request








There are two books that address the use of RegRipper; Windows Registry Forensics, and Investigating Windows Systems (see figure 2). Together, these books provide information about the Windows Registry, RegRipper, and the use of RegRipper as part of an investigation. 


Figure 2: IWS



























Demonstrating the use of RegRipper in an investigation is challenging, as RegRipper is only one tool I typically use during an investigation. Investigations do not rest on a single data source, nor on a single artifact. The challenge, then, is in demonstrating the use of RegRipper in an analysis process, such as any of the case studies in Investigating Windows Systems, that most folks are simply unfamiliar with; the value of the demo isn't diminished, it's completely lost if the overall process isn't understood.

The analysis process demonstrated multiple times in IWS is the same process I've used for years, well prior to the publication of the book. It's also the same process I use today, sometimes multiple times a day, as part of my role at Huntress. Any demonstration of RegRipper, or even Events Ripper, as part of the process would fall short, as most analysts do not already follow that same process. 

If you are interested in educational materials associated with RegRipper, I would be very much willing to learn a bit more about what you're looking for, and have a conversation pursuant to those needs. Feel free to reach to me on LinkedIn, or via email.

Monday, July 08, 2024

What is "Events Ripper"?

I posted to LinkedIn recently (see figure 1), sharing the value I'd continued to derive from Events Ripper, a tool I'd written largely for my own use some time ago.

Fig. 1: LinkedIn post











From the comments to this and other LinkedIn posts regarding Events Ripper, I can see that there's still some confusion about the tool...what it is, what it does, what it's for.

Now, I've posted about Events Ripper a number of times on this blog since I released it about 2 years ago, and those posts are trivial to find, including the post illustrated in figure 2.

Fig. 2: 30 Sept 2022 Blog Post















The point is that these blog posts are trivial to find. For example, while I've posted a number of "Events Ripper Update" blog posts over the past 2 years, here's a really good example of a post from October, 2022 that includes a great deal of content regarding the use of Events Ripper. So, in addition to the repo readme file, there's a good bit of info available, and I'm more than happy to answer any questions folks may have. 

To allay any lingering confusion, let's talk a little bit about what Events Ripper is not, then a bit about what it is, and how it works.

What It Is Not
Let's start with what it's not...Events Ripper is not an analysis tool, nor does it do analysis. We often see this turn of phrase used to describe various tools, stating that they do analysis, and this is simply not the case. This is especially the case for Events Ripper. 

Let me say this again...Events Ripper does not do analysis. Nor does any other tool. Analysis is something an analyst, a human, does, by applying their knowledge and experience to the data before them. Tools can parse, normalize, even decorate, and present data, but it's up to an analyst to make sense of the data and present it in an understandable manner.

If you're not familiar with creating timelines as an investigative resource, and incorporating Windows Event Log records alongside other data sources (file system, Registry, etc.) into an overall timeline, you're not going to see much use for Events Ripper. When I say, "creating timelines", I don't mean what many analysts do with Excel, begrudgingly, at the end of an "investigation". What I mean by "creating timelines" is producing an investigative timeline from multiple data sources as a way to begin, and to facilitate analysis.

If, as an analyst, you believe that there are only three Windows Event Logs of interest...Security, System, and Application...then Events Ripper is not for you. It's not something you want to be using, as it will provide no value to you. 

If you believe that event IDs are unique, and that event ID 4624 only refers to successful login events, then Events Ripper is not for you, it's not something that you'll derive value from using.

If you're not familiar or comfortable with working at the command line, Events Ripper is not a tool you'll find a great deal of value in using.

What Is Events Ripper
Events Ripper is a tool intended to facilitate analysis, to identify investigative timeline pivot points and to allow analysts to get to conducting analysis much sooner. The idea is to exploit what analysts have already seen, learned and documented through plugins, to get all analysts on the team (and beyond) to the point where they're actually conducting analysis much faster. 

I look into endpoints on a daily basis, as part of deeper investigations into malicious activity. Collecting a dozen or fewer Windows Event Log files (the number is usually 9 or 10, under most circumstances) to create a timeline often results in 300K or more lines in the events file, which is the intermediate step to creating an investigative timeline. Again, this number of events is just from Windows Event Log files, so finding anything specific would be akin to finding a needle a huge stack of needles. This is why having a computer look through hundreds of thousands of events, extracting items of interest, is so much more efficient and doing to manually, either by opening individual .evtx files in Event Viewer, or by searching and tabbing through the results.

Even when I'm looking for just one thing, such as a list of all VHD, IMG, and/or ISO files mounted on the endpoint, it's still a lot of work to parse through one Windows Event Log file and extract that list. While many analysts will download the .evtx file to their analysis workstation, open it locally in Event Viewer, and tab through the events, I'd much rather use available tools to parse and normalize the events in the file, and then use Events Ripper to give me either a simple list, or something a bit more interesting, such a sorted (based on time of 'surfacing') list of mounted files. 

How Does It Work
Straight from the readme file in the Github repo for the tool, you start by parsing your data sources into the 5-field, "TLN" format events file, which is an intermediate file format prior to creating an investigative timeline. You then create your timeline (I do), and then run Events Ripper against the events file. This is just a text-based file that contains the events that will be parsed into the investigative timeline, with one event per line. 

If you're looking for something specific, such as mounted ISO files, you can choose to simply run a single plugin (in this case, mount.pl). If you're working on something a bit more expansive, run all off the plugins, redirecting the results to a single text file for reference. Just follow the examples in the readme file. 

Event Ripper is incredibly versatile. For example, if I'm working on an incident that involves processes run as child processes of sqlservr.exe, I'll get a copy of the Application Event Log, parse it, and run the mssql.pl plugin against it. If the output of that plugin tells me that there are multiple instances where the xp_cmdshell stored procedure was enabled, I can then go back to the events file, and create an "overlay" or micro-timeline of just those events, using the following command line:

C:\data>type events.txt | find "xp_cmdshell" > x_events.txt

I can then create the timeline using the following command:

C:\tools>parse -f c:\data\x_events.txt > c:\data\x_tln.txt

I know have an "overlay" timeline of just those events that contain references to the stored procedure, and very often these events will correspond to malicious use of the stored procedure to run commands on the endpoint with the privileges of the MSSQL instance (usually SYSTEM). 

Using simple DOS-based tools and commands, such as "find", "findstr", and redirection operators (all stuff I learned from using MS DOS 3.3 and beyond), I can create investigative timelines and case notes to thoroughly document my findings, facilitating analysis and getting me to results in a quick, efficient manner. This leads to incident scoping, as well as threat actor profiling, and developing detections, protection mechanisms, and documenting control efficacy. 

Tools like Events Ripper also provide a phenomenal means for documenting, retaining, and building on "corporate knowledge"; this applies equally well to both internal and consulting teams. If one analyst sees something, there's no reason why they can't develop and document it in a plugin, and share it with others so that now, other analysts can benefit from the knowledge without having to have had the experience. 

Conclusion
Again, Events Ripper does not do analysis...no tool does, regardless of what folks say about it. What Events Ripper does is facilitates analysis by allowing analysts to document their findings in a reproduceable manner, in a way that other analysts can exploit the knowledge without having to have the same analysis experience. 

Monday, June 03, 2024

The Myth of "Fileless" Malware

Is "fileless" malware really fileless?

Now, don't get me wrong...I get what those who use this term are trying to say; that is, the actual malware itself, the malicious code, does not exist as a file on the local hard drive. However, for the uninitiated, the use of the term "fileless" is misleading, because in order for the things to happen and for the malware to persist, there has to be something in a file somewhere on the drive. Otherwise, what's the point?

Yes, threat actors can release code that had a devastating, even catastrophic effect without persisting on an endpoint. This is not in question. 

However, the term "fileless" can imply to the uninitiated reader that files are not used at all, and this simply is not the case. This is important to understand, as this allows us to develop appropriate protections, detections, and responses for this kind of malware. Understanding this also means allows us to leverage DFIR skill sets to learn more about the threat actor leveraging various techniques to "remain fileless".

Sometimes, we just get the descriptions of the malware wrong. In Prevailion's DarkWatchman write-up, at the bottom of pg, 3, the authors state, "Various parts of DarkWatchman, including configuration strings and the keylogger itself, are stored in the registry to avoid writing to disk" (emphasis added), as illustrated in figure 1.

Fig. 1: Excerpt from DarkWatchman Write-up







The Registry is on disk, and is, at it's most basic, a file. Even though the Registry has a binary (rather than ASCII) format, and may be considered a "file system within a file", that does not mean that something that stores its configuration and/or persistence in the Registry is not using a file.

On pg 10, the same write-up states that persistence is achieved via a Scheduled Task, which is contained both in the Registry (a file), as well as within an XML file. Even using various techniques to hide the Scheduled Task XML file from view still means that the configuration and reference to the malware is stored within a file.

Reliaquest's recent article on LotL and "fileless" malware describes "fileless" malware as "running from memory or scripts...instead of executables". Even this definition can be confusing, as a script is a file on disk. 

Yes, I understand that intent is to say that the script isn't the actual malware itself, but instead reaches out off-system to download the actual malware, and then run it in memory so that ideally, the malicious code never touches the disk. Yes, I get it...but it's still misleading and without the appropriate review, the impression is that this malware is incredibly difficult, if not impossible to detect and respond to without a very specific set of tools. 

The article starts off by saying that fileless malware is "harder to detect than traditional malware because it relies solely in memory." I'm not sure that's the case; at work, we detect a lot of SocGholish (referenced in the statement as an example) and other fileless malware, without requiring something that scans memory. 

The article also states that fileless malware "manipulates the command lines of trusted applications...allowing malicious activities to blend in with normal, authorized operations..."; I'm unclear as to what this means, and confused as to how it makes fileless malware harder to detect. In fact, in my experience, it's quite the opposite...if you're monitoring an organization with the appropriate visibility, modified command lines for known-good applications should stand out pretty clearly.

Something else to consider is that not all organizations actually use LOLBins (or LotL) on a regular basis. I've seen organizations that don't use curl.exe or certutil.exe at all. I've worked with customers who, by policy, do not use "net user" to create and manage user accounts. As such, use of these LotL techniques will not remain hidden, particularly if you're looking. How they will remain hidden is if you haven't employed some modicum of visibility, by enabling "Process Tracking" along with full command lines, or Sysmon, or EDR, or some form of AV such as Windows Defender.

The Reliaquest article also discusses threat actor use of LotL techniques, also referred to as LOLBins or LOLBAS to remain stealthy and minimize their impact within a compromised infrastructure. However, this can be a bit misleading. The article indicates that threat actors prefer LotL techniques given the "lack of IoCs" associated with the use of these techniques, and this simply isn't the case, when, in fact, the use of native utilities such as msiexec.exe actually produces quite a few IoCs that can be very valuable to the responders and investigators. And this is simply how the operating system works, without any additional EDR or EDR-like capabilities.

From the Technical Breakdown of the first case study in the Reliaquest article, the SocGholish malware performs the actions illustrated in figure 2.

Fig. 2: First Case Study







How often do users or even admins run "cmdkey /list" within your organization? How about the "net" commands, or "gwmi"? If you have the necessary visibility, and know this, then you can also assess control efficacy, and possibly even use 

So, What?
Who cares, right? Someone calls something "fileless", and whether it really is or not, doesn't matter, does it?

The insistence on the use of the term "fileless" tends to imply that only so much can be done about the malware. After all, just look at the Reliaquest article. The implication is that the malware exists only in memory, so the options for detecting, responding to and analyzing the malware are extremely limited. 

However, this simply is not the case. For the malware to truly be effective, it has to persist in some manner, and whatever that is can be used to hunt for, detect and respond to, and possibly even prevent the malware from impacting the endpoint. Understanding the details of how the malware arrives on the endpoint, and how it persists, allows organizations to assess their own control efficacy and determine how best to address the issue. So while the actual malicious code itself may not be directly detected without some sort of memory-based detection, the precursors, effects/impacts, and follow-on activities tied directly to and associated with the malware can all be detected, because they will "appear" in a detectable manner. This will depend upon your aperture and visibility, of course, but the fact remains that they will be detectable. They could be processes, code retrieved from Registry values, Scheduled Tasks, or even other endpoint impacts recorded in the Windows Event Log, but they will be detectable.

Given the amount of "fileless" malware organizations see with the initial download capability embedded within the Windows Registry, I added the capability of running Yara rules to RegRipper 4.0, in hopes that folks would use that capability to help detect such things.

Addendum, 5 June:
Not long after I published this blog post, John Hammond shared a post on LinkedIn, as shown in figure 3.

Fig. 3 - John's LinkedIn post















Again, the underlined phrase refers to the fact that rather than writing a malware EXE to a value (or values) in the Registry, a "fileless payload" is written instead. This can be confusing because, depending upon the path, "HKEY_CURRENT_USER" can refer to either the NTUSER.DAT or the USRCLASS.DAT file in the user's profile.