Monday, January 06, 2025

Carving

Recovering deleted data, or "carving", is an interesting digital forensics topic; I say "interesting" because there are a number of different approaches and techniques that may be valuable, depending upon your goals. 

For example, I've used X-Ways to recover deleted archives from the unallocated space of a web server. A threat actor had moved encrypted archives to the web server, and we'd captured the password they used via EDR telemetry. The carving revealed about a dozen archives, which we opened using the captured password; this allowed our customer to understand what data had been exfil'd, as well as their risk and exposure.

But carving can be about more than just recovering files from unallocated space. We can carve files and records from unstructured data, or we can treat 'structured' data as unstructured and attempt to recover records. We did this quite a bit during PCI forensic investigations, and found a much higher level of accuracy/fidelity when we carved for track 1 and 2 data, rather than just credit card numbers. 
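To illustrate why carving for track data tends to be more accurate than searching for bare card numbers, here is a minimal Python sketch. It is not the tooling we used on those engagements; it simply assumes the standard Track 2 layout (start sentinel `;`, PAN, `=`, YYMM expiry, three-digit service code, discretionary data, end sentinel `?`) and adds a Luhn check. The function names are made up for the example.

```python
import re

# Assumed Track 2 layout: ;PAN=YYMM + service code + discretionary data + ?
TRACK2_RE = re.compile(rb";(\d{13,19})=(\d{2})(\d{2})(\d{3})(\d*)\?")


def luhn_ok(pan: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(pan)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def carve_track2(data: bytes):
    """Yield (offset, pan, expiry 'YYMM', service code) for plausible hits."""
    for m in TRACK2_RE.finditer(data):
        pan = m.group(1).decode("ascii")
        yy, mm, svc = (g.decode("ascii") for g in m.group(2, 3, 4))
        if not 1 <= int(mm) <= 12:  # expiry month must be a real month
            continue
        if not luhn_ok(pan):  # drop digit runs that only look like a PAN
            continue
        yield m.start(), pan, yy + mm, svc
```

The sentinels, separator, and expiry field give the pattern far more structure than a 16-digit run has on its own, which is where the drop in false positives comes from.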

We can also carve within files themselves. Several common file formats are essentially databases, and some are described as a "file system within a file". As such, deleted records and data can be recovered from such file formats, if necessary.

I recently ran across a fascinating post from TheDFIRJournal regarding file carving encrypted virtual disks. The premise of the post is that some file encryption/ransomware software does not encrypt entire files, but rather just part of each file, for the sake of speed. In the case of virtual disks, a partially encrypted file may mean that, while the disk itself may not be usable, there may be valuable evidence available within the virtual disk file itself.

I should note that I did recently see a ransomware deployment that used a "--mode fast" switch at the command line, possibly indicating that the entire file would not be encrypted, but rather only a specific number of bytes of the file. As such, with larger files, such as virtual disks, WEVT files, etc., there might be an opportunity to recover valuable data, so file and record carving techniques would be valuable, depending upon your specific investigative goals.
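One rough way to see whether a large file was only partially encrypted is to measure byte entropy chunk by chunk and look for regions that do not look random. The Python sketch below is an illustration under stated assumptions, not a method taken from the article: the 1 MiB chunk size and the 7.5 bits-per-byte threshold are arbitrary choices, and compressed data will also score high, so treat the output as a hint about where to point carving tools.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    n = len(chunk)
    return -sum((c / n) * math.log2(c / n) for c in Counter(chunk).values())


def low_entropy_regions(path, chunk_size=1024 * 1024, threshold=7.5):
    """Yield (start, end) byte ranges whose chunks score below the threshold."""
    start = None
    offset = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            if shannon_entropy(chunk) < threshold:
                if start is None:
                    start = offset  # a low-entropy run begins here
            elif start is not None:
                yield start, offset  # the run ended at this chunk
                start = None
            offset += len(chunk)
    if start is not None:
        yield start, offset  # run extends to end of file
```

Ranges reported by `low_entropy_regions` are candidates for data the encryptor skipped, and can be extracted and fed to record carvers.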

The premise raised in the article is not unique; in fact, I've run into it before. In 2017, when NotPetya hit, we received a number of system images from customers where the MBR was overwritten. We had someone on our team who could reconstruct the MBR, and we also ran carving for WEVTX records, recovering Security-Auditing/4688 records indicating process creation. The customers had not enabled full command lines being recorded, but we were able to reconstruct enough data to illustrate the sequence of processes specific to the infection and impact. So, having a disk image where the MBR and/or the MFT is overwritten is not a new situation, simply one we haven't encountered recently.

TheDFIRJournal article covers a number of tools, including PhotoRec, scalpel (not currently being maintained), and Willi Ballenthin's EVTXtract. The article also covers Simson Garfinkel's bulk_extractor, but looking at the bulk_extractor GitHub, there do not appear to be releases for Windows starting with version 2.0. While some folks have stated that bulk_extractor-rec's capabilities have been added to bulk_extractor, that's kind of a moot point, and the latest release of bulk_extractor-rec will have to suffice. 

Addendum, 7 Jan 2025: Thanks to Brian Maloney for sharing that the bulk_extractor 2.0 for Windows CLI tool can be found here.

Also from the article, the author mentioned the use of a custom EVTXParser script, which can be found here. I like this approach, as I'd done something similar with the WinXP/2003 EVT files, where I'd written lfle.pl to parse EVT records from unstructured data, which could include a .EVT file. I wrote this script (a 'compiled' Windows EXE is also available) after finding two complete records embedded in an .EVT file that were not "visible" via the Event Viewer, nor via any other tools that started off by reading the file header to determine where the records were located. The script then evolved into something you could run against any data source. While not the fastest tool, at the time it was the only tool available that would take this approach. 
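The general idea behind header-independent EVT record carving can be sketched in a few lines of Python. This is not lfle.pl, just an illustration based on the documented EVENTLOGRECORD layout: each record starts with a 4-byte length, followed by the "LfLe" magic number, and ends with a copy of the same length. The function name and the 64 KB sanity cap are assumptions made for the example.

```python
import struct
from datetime import datetime, timezone

MAGIC = b"LfLe"
FIXED_SIZE = 56     # fixed portion of an EVENTLOGRECORD
MAX_SIZE = 0x10000  # assumed sanity cap on record length


def carve_evt_records(data: bytes):
    """Yield (offset, record_number, time_generated, event_id) per record."""
    pos = data.find(MAGIC, 4)  # the length DWORD sits 4 bytes before the magic
    while pos != -1:
        start = pos - 4
        (length,) = struct.unpack_from("<I", data, start)
        end = start + length
        if FIXED_SIZE <= length <= MAX_SIZE and end <= len(data):
            # A valid record repeats its length in its last 4 bytes.
            (trailer,) = struct.unpack_from("<I", data, end - 4)
            if trailer == length:
                rec_num, t_gen, _t_written, event_id = struct.unpack_from(
                    "<IIII", data, start + 8
                )
                yield (
                    start,
                    rec_num,
                    datetime.fromtimestamp(t_gen, tz=timezone.utc),
                    event_id & 0xFFFF,  # low word is the familiar event ID
                )
        pos = data.find(MAGIC, pos + 1)
```

Because nothing here depends on the file header, the same loop works against a .EVT file, a chunk of unallocated space, or a memory dump. The 48-byte file header also carries the magic number, but it falls below the minimum record size and is skipped.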

In the past, I've done carving on unallocated space within a disk image, using something like blkls to get the unallocated space into one contiguous file of unstructured data. From there, running tools like bulk_extractor allows for record carving.

I've also had pretty good success running bulk_extractor across memory dumps; this is something I talked about/walked through in my book, Investigating Windows Systems.

Carving can also be done on individual files. For example, in 2013, Mari DeGrazia published a great blog post on recovering deleted data from SQLite databases. Carving Registry hive files for deleted keys and values, as well as examining unallocated space within hive files, is something I've been a fan of for quite some time. My thanks go to Jolanta Thomassen for 'cracking the code' on deleted cells within Registry hive files!
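As a rough illustration of what "unallocated space within a hive file" means, the Python sketch below walks the hive bins and reports free cells. It relies on the commonly documented layout (a 4096-byte base block starting with "regf", "hbin" headers of 32 bytes, and cells that begin with a signed 4-byte size that is negative when allocated and positive when free). It is a simplified sketch, not the approach used by any particular tool, and the function name is made up for the example.

```python
import struct

BASE_BLOCK_SIZE = 4096
HBIN_HEADER_SIZE = 32


def unallocated_cells(hive: bytes):
    """Yield (file_offset, size, data) for each free cell in a hive."""
    if hive[:4] != b"regf":
        raise ValueError("not a Registry hive (missing 'regf' signature)")
    pos = BASE_BLOCK_SIZE
    while pos + HBIN_HEADER_SIZE <= len(hive) and hive[pos:pos + 4] == b"hbin":
        (bin_size,) = struct.unpack_from("<I", hive, pos + 8)
        if bin_size == 0 or bin_size % 4096:
            break  # implausible bin size; stop rather than guess
        bin_end = min(pos + bin_size, len(hive))
        cell = pos + HBIN_HEADER_SIZE
        while cell + 4 <= bin_end:
            (raw,) = struct.unpack_from("<i", hive, cell)
            size = abs(raw)
            if size < 8 or cell + size > bin_end:
                break  # corrupt or truncated cell
            if raw > 0:  # positive size marks a free (unallocated) cell
                yield cell, size, hive[cell + 4:cell + size]
            cell += size
        pos += bin_size
```

Each free cell's data can then be searched for leftover key ("nk") and value ("vk") structures. Note that this only reports whole free cells; a large free cell can contain several deleted records, and slack inside allocated cells is not covered.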

Here's a presentation I put together a while back that includes information regarding unallocated space within Registry hive files.
