From the USBStor key:
The Windows Incident Response Blog is dedicated to the myriad information surrounding and inherent to the topics of IR and digital analysis of Windows systems. This blog provides information in support of my books; "Windows Forensic Analysis" (1st thru 4th editions), "Windows Registry Forensics", as well as the book I co-authored with Cory Altheide, "Digital Forensics with Open Source Tools".
Pages
Thursday, May 26, 2022
USB Device Redux, with Timelines
From the USBStor key:
Tuesday, May 17, 2022
USB Devices Redux
Fig 1: Event ID 1005 |
Fig 2: Event ID 112 |
Fig 3: Event ID 207 |
Fig 4: Event ID 1006, XML view |
Fig 5: Event ID 145 |
HEFC Blog - Sunday Funday, Daily Blog #197
Friday, May 13, 2022
Understanding Data Sources and File Formats
Following on the heels of my previous post regarding file formats and sharing the link to the post on LinkedIn, I had some additional thoughts that would benefit greatly from not blasting those thoughts out as comments to the original post, but instead editing and refining them via this medium.
My first thought was, is it necessary for every analyst to have deep, intimate knowledge of file formats? The answer to that is a resounding "no", because it's simply not possible, and not scalable. There are too many possible file formats for analysts to be familiar with; however, if a few knowledgeable analysts, ones who understand the value of the file format information to DFIR, CTI, etc., document the information and are available to act as resources, then that should suffice. With the format and it's value documented and reviewed/updated regularly, this should act as a powerful resource.
What is necessary is that analysts be familiar enough with data sources and file formats to understand when something is amiss, or when something is not presented by the tools, and know enough recognize that fact. From there, simple troubleshooting steps can allow the analyst to develop a thoughtful, reasoned question, and to seek guidance.
So, why is understanding file formats important for DFIR analysts?
1. Parsing
First, you need to understand how to effectively parse the data. Again, not every analyst needs to be an expert in all file formats - that's simply impossible. But, if you're working on Windows systems, understanding file formats such as the MFT and USN change journal, and how they can be tied together, is important. In fact, it can be critical to correctly answering analysis questions. Many analysts parse these two file separately, but David and Matt's TriForce tool allowed these files (and the $LogFile) to be automatically correlated.
So, do you need to be able to write an OLE parser when so many others already exist? No, not at all. However, we should have enough of an understanding to know that certain tools allow us to parse certain types of files, albeit only to a certain level. We should also have enough understanding of the file format to recognize that not all files that follow the format are necessarily going to have the same content. I know, that sounds somewhat rudimentary, but there are lot of new folks in the DFIR industry who don't have previous experience with older, perhaps less observed file formats.
Having an understanding of the format also allows us to ask better questions, particularly when it comes to troubleshooting an issue with the parser. Was something missed because the parser did not address or handle something, or is it because our "understanding" of the file format is actually a "misunderstanding"?
2. The "Other" Data
Second, understanding file formats provides insight into what other data the file format may contain, such as deleted data, "slack", and metadata. Several file formats emulate file systems; MS describes OLE files as "a file system within a file", and Registry hives are described as a "hierarchal database". Each of these file formats has their own method for addressing deleted content, as well as managing "slack". Further, many file formats maintain metadata that can be used for a variety of purposes. In 2002, an MSWord document contained metadata that came back to haunt Tony Blair's administration. More recently, Registry hive files were found to contain metadata that identified hives as "dirty", prompting further actions from DFIR analysts. Understanding what metadata may be available is also valuable when that metadata is not present, as observed recently in "weaponized" LNK files delivered by Emotet threat actors, in a change of TTPs.
3. Carving
Third, "file carving" has been an important topic since the early days of forensic analysis, and analysts have been asked to recover deleted files from a number of file systems. Recovering, or "carving" deleted files can be arduous and error prone, and if you understand file formats, it may be much more fruitful to carve for file records, rather than the entire file. For example, understanding the file and record structure of Windows 2000 and XP Event Logs (.evt files) allowed records to be recovered from memory and unallocated space, where carving for entire files would yield limited results, if any. In fact, understanding the record structure allowed for complete records to be extracted from "unallocated space" within the .evt files themselves. I used the successfully where an Event Log file header stated that there were 20 records in the file, but I was able to extract 22 complete records from the file.
Even today, the same holds true for other types of "records", including "records" such as Registry key and value nodes, etc. Rather than looking for file headers and then grabbing the subsequent X number of bytes, we can instead look for the smaller records. I've used this approach to extract deleted keys and values from the unallocated space within a Registry hive file, and the same technique can be used for other data sources, as well.
4. What's "In" The File
Finally, understanding the file format will help understand what should and should not be resident in the file. One example I like to look back on occurred during a PCI forensic investigation; an analyst on our team ran our process for searching for credit card numbers (CCNs) and stated in their draft report that CCNs were found "in" a Registry hive file. As this is not something we'd seen previously, this peaked our curiosity, and some of use wanted to take a closer look. It turned out that what had happened was this...the threat actor had compromised the system, and run their process for locating CCNs. At the time, the malware used would (a) dump process memory from the back office server process that managed CCN authorization and processing to a file, (b) parse the process memory dump with a 'compiled' Perl script that included 7 regex's to locate CCNs, and then (c) write the potential CCNs to a text file. The threat actor then compressed, encrypted, and exfiltrated the output file, deleting the original text file. This deleted text file then became part of unallocated space within the file system, and the sectors that comprised the file were available for reallocation.
Later, as the Registry hive file "grew" and new data was added, sectors from the file system were added to the logical structure of the file, and some of those sectors were from the deleted text file. So, while the CCNs were found "in" the logical file structure, they were not actually part of the Registry. The CCN search process we used at the time returned the "hit" as well as the offset within the file; a visual inspection of the file via a hex editor illustrated that the CCNs were not part of the Registry structure, as they were not found to be associated with any key or value nodes.
As such, what at first looked like a new threat actor TTP was really just how the file system worked, which had a significant impact on the message that was delivered to Visa, who "ran" the PCI Council at the time.
Tuesday, May 03, 2022
Putting It All Together
It's great when a plan, or a puzzle, comes together, isn't it?
I'm not just channeling my inner Hannibal Smith...I'm talking about bringing various pieces or elements together to build a cohesive, clear picture, connecting the dots into a cohesive analysis.
To kick this off, Florian had this to say about threat actors moving to using ISO/IMG files as result of Microsoft disabling VBA macros in docs downloaded from the Internet, a change which results in entirely new artifact constellations. After all, a change in TTPs is going to result in changes as to how the system is impacted, and a change in the resultant constellations. So, this sets the stage for our example.
In this case, the first piece of the puzzle is this tweet from Max_Mal_, which points to the BumbleBee campaign (more info from TAG here), described in the Orion Threat Alert. Per the tweet, the infection looks like this:
Zip -> ISO -> LNK -> rundll32.exe (LOLBin) -> Cobalt Strike
This all starts with a zip archive being delivered to or downloaded by the user; however, what's not mentioned or described here are the system impacts. Downloading the archive often (depending upon the process) results in MOTW being "attached" to the zip archive. This tweet thread by Florian Roth includes a couple of resources that discuss MOTW, one of which is an excellent article by Mike Wolfe that provides a really nice explanation and details regarding MOTW. I've been fascinated by NTFS alternate data streams (ADSs) since I first encountered them, in particular how they're used by the OS, as well as by the adversary. As a result, I've been similarly interested in really leveraging MOTW in every way possible.
The other useful component of Florian's thread is this tweet by Nobutaka Mantani regarding MOTW propagation support in archiver software for Windows. This is huge. What it means is that when the ISO file is extracted from the zip archive by one of the software products that supports MOTW propagation, the extracted file "inherits" MOTW, albeit without the same contents as the original MOTW. Rather, the MOTW attached to the ISO file points back to the zip archive. This then gives us a great deal of insight into the origin of the extracted file, even if the zip archive is deleted.
The benefit is that this may provide us with a detection opportunity, something that depends upon the framework and/or approach you're using. For example, it may be a good idea to alert on files being written to suspicious locations (ProgramData folder, user's StartUp folder, etc.) by one of the archiver software packages, where the target file has a MOTW. Or, we can search for such files, via either proactive or DFIR threat hunting, as a means of locating "badness" on systems or within images. Imagine having an automated process that parses the MFT, either from a triage file collection or from an acquired image, and automatically identifies for the analyst all files with MOTW in suspicious locations. MOTW propagation also appears to occur if the downloaded zip archive includes an LNK file (rather than an ISO file), or a batch file, as well. There have been instances where a batch file is extracted from an archive, to the user's StartUp folder, and has an associated MOTW. As such, automating the detection process, via alerts based on EDR telemetry, or via proactive or DFIR hunting, provides for efficiency and consistency in the analysis process.
So, where we once had to deal with weaponized documents, we're now extracting files from an archive, mounting an ISO or IMG file, and accessing the embedded LNK file within the "new volume". All of this results in a completely new artifact constellation, one that we have to understand in order to fully address coming attacks.
Sunday, May 01, 2022
Changes In The Use Of LNK Files
shitemidlist My Computer/C:\/Windows/system32/cmd.exe
**Shell Items Details (times in UTC)**
C:0 M:0 A:0 Windows (9)
C:0 M:0 A:0 system32 (9)
C:0 M:0 A:0 cmd.exe (9)
commandline /v:on /c findstr "rSIPPswjwCtKoZy.*" Password2.doc.lnk > "%tmp%\VEuIqlISMa.vbs" & "%tmp%\VEuIqlISMa.vbs"
iconfilename shell32.dll
hotkey 0x0
showcmd 0x1
***LinkFlags***
HasLinkTargetIDList|IsUnicode|HasArguments|HasIconLocation|HasRelativePath
***PropertyStoreDataBlock***
GUID/ID pairs:
{46588ae2-4cbc-4338-bbfc-139326986dce}/4 SID: S-1-5-21-1499925678-132529631-3571256938-1001
***KnownFolderDataBlock***
GUID : {1ac14e77-02e7-4e5d-b744-2eb1ae5198b7}
Folder: CSIDL_SYSTEM
For example, here's a sample from Astaroth, and here are samples of output from LNK files created using various native means. However, all of these LNK files, as well as the LNK files from figs 5 and 6 of the Mandiant APT29 report, contain these items of metadata. The one Tony found does not.