Monday, July 31, 2006

Extracting Executable Images from RAM Dumps

In his Cyberspeak podcast interview, Jesse Kornblum mentioned the tool I'd recently released called lspi, for LiSt Process Image.

As of this morning, it's been downloaded 32 times. If you're reading this, and you're one of the folks who downloaded it and has tried it, I'd greatly appreciate any comments you may have.

"Genius" Kornblum on fuzzy hashing

Jesse Kornblum was recently interviewed by the CyberSpeak podcast guys with regards to his "fuzzy hashing" paper to be presented at DFRWS.

Jesse, the author of hashing tools such as md5-, tiger-, and whirlpooldeep, has come up with something called "fuzzy hashing" which can be used to combat the approach that's being taken to obfuscate files by making small changes, as with Word documents (intellectual property crime) and images ('nuff said).

Jesse clearly explains the concept behind "fuzzy hashing" in his interview. In a nutshell, if you have two similar bitstreams (e.g., JPEGs, GIFs, etc.) that differ by small changes, his tool (dubbed "SSDeep") will be able to tell you that they are similar...whereas tools like MD5Deep will only tell you whether the files are exactly the same, by comparing cryptographic hashes.

So what are the applications of something like this? Well, first off, it's not meant to find evidence; instead, it's an awesome data reduction tool. One of the examples Jesse used in his interview is Word documents that are printed out. When you print out a Word document, the time that the document was last printed is modified within the document, so an MD5 hash generated for the original document will not match the one generated for the printed document. The bitstreams are essentially the same, with some small modifications, and Jesse says that his tool will let you know that the two files are similar.
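To make the contrast concrete, here is a minimal Python sketch. It does not use SSDeep or its algorithm; the standard library's difflib.SequenceMatcher stands in as a generic similarity measure, and the two byte strings are made-up stand-ins for an original document and its "printed" copy.

```python
import hashlib
from difflib import SequenceMatcher

# Hypothetical stand-ins for two nearly identical files: only the
# "last printed" field differs between them.
BODY = b"The quick brown fox. " * 50
original = b"HEADER|last_printed=never|" + BODY
printed = b"HEADER|last_printed=2006-07-31|" + BODY


def md5_hex(data: bytes) -> str:
    """Exact-match fingerprint: any changed byte gives a different digest."""
    return hashlib.md5(data).hexdigest()


def similarity(a: bytes, b: bytes) -> float:
    """Generic similarity ratio in [0, 1]; a stand-in, NOT SSDeep's method."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


same_hash = md5_hex(original) == md5_hex(printed)
score = similarity(original, printed)

print("MD5 digests match:", same_hash)
print("Similarity ratio : %.3f" % score)
```

The MD5 comparison can only answer "identical or not", while the similarity ratio stays close to 1.0 for the lightly modified copy. That is the kind of signal that lets an examiner group near-duplicates for data reduction.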

There are many other applications for this tool, to include image identification, intellectual property theft, etc. So, go on over to Cyberspeak and give the podcast a listen. If you see Jesse at DFRWS or GMU2006 or HTCIA, say hi, and buy him a beer!

Interestingly, Jesse is presenting on Windows memory analysis at HTCIA (end of Oct)...I'll be presenting on the subject at GMU2006. Jesse's also presenting at GMU2006.

BTW...Jesse, if you're reading this...thanks for the shout-out in your Cyberspeak interview!

Thursday, July 20, 2006

LiSt Process Image upload

I've uploaded lspi 0.4 to my SourceForge site, and if you use lsproc or want to be able to automatically extract process image files from RAM dumps (of Windows 2000 systems, at this point), then you definitely want to check this out! This is the "automatic binary extractor" that Andreas mentioned a while back.

This code (the archive contains the Perl script and a Windows EXE "compiled" from it; be sure to keep the included DLL with the EXE at all times) makes use of some of the RAM dump file parsing code from tools like lsproc and lspd, as well as the code from my File::ReadPE module. Like the other tools, this one relies on the output of lsproc for its input...specifically, the offset to the process in question. It uses pretty much the same method as Andreas mentioned in his first blog entry on reassembling binaries, for no other reason than it just makes sense. After all, why not use the "map" provided to you in the PE headers to reassemble the executable image file?
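The "map" idea can be sketched in a few lines of Python. This is not the lspi code; it is an illustration under stated assumptions: read_page is a hypothetical callback that returns the 4 KB page at a given virtual address (or None if that page isn't available), the header is assumed to occupy the first page of the file, and the section values are the ones lspi reports for dd.exe in the output shown in this post.

```python
PAGE = 0x1000
IMAGE_BASE = 0x00400000

# (name, virtual address (RVA), raw data offset, raw data size), copied from
# the dd.exe section headers in the lspi output shown in this post.
SECTIONS = [
    (".text", 0x1000, 0x1000, 0x6000),
    (".rdata", 0x7000, 0x7000, 0x5000),
    (".data", 0xC000, 0xC000, 0x1000),
    (".rsrc", 0xD000, 0xD000, 0x1000),
]


def reassemble(read_page, image_base, sections, header_size=PAGE):
    """Rebuild an image file from memory pages, using the section table.

    read_page(vaddr) is a hypothetical helper: it returns the page of bytes
    at vaddr, or None when the page is unavailable (e.g. paged out).
    """
    total = max(raw_off + raw_size for _, _, raw_off, raw_size in sections)
    out = bytearray(total)          # unavailable pages stay zero-filled
    missing = []
    regions = [("header", 0, 0, header_size)] + list(sections)
    for _name, rva, raw_off, raw_size in regions:
        for delta in range(0, raw_size, PAGE):
            vaddr = image_base + rva + delta
            page = read_page(vaddr)
            if page is None:
                missing.append(vaddr)
                continue
            chunk = page[:min(PAGE, raw_size - delta)]
            out[raw_off + delta:raw_off + delta + len(chunk)] = chunk
    return bytes(out), missing


# Fake "process memory": each page is filled with its own page number, and
# one page is deliberately absent to show how gaps are reported.
fake_memory = {
    IMAGE_BASE + rva: bytes([rva >> 12]) * PAGE
    for rva in range(0, 0xE000, PAGE)
}
del fake_memory[IMAGE_BASE + 0x3000]

image, missing = reassemble(fake_memory.get, IMAGE_BASE, SECTIONS)
print("Bytes written =", len(image))
print("Missing pages =", [hex(a) for a in missing])
```

With these section values the rebuilt file comes out at 0xE000 bytes, which agrees with the 57344 bytes that lspi reports writing for dd.exe.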

Let's take a look at an example, focusing on the same process that Andreas looked at...dd.exe with a PID of 284, from the first memory dump from the DFRWS 2005 Memory Challenge. The output of lsproc gives us:

Proc 1112 284 dd.exe 0x0414dd60 Sun Jun 5 14:53:42 2005

Very cool. So, using the offset to the process within the dump file, we can then launch lspi with the following command line:

C:\Perl> lspi d:\hacking\dfrws-mem1.dmp 0x0414dd60

What we get as output is:

lspi - list Windows 2000 process image (v.0.4 - 20060721)
Ex: lspi

Process Name : dd.exe
PID : 284
DTB : 0x01d9e000
PEB : 0x7ffdf000 (0x02c2d000)
ImgBaseAddr : 0x00400000 (0x00fee000)

e_lfanew = 0xe8
NT Header = 0x4550

Reading the Image File Header
Sections = 4
Opt Header Size = 0x000000e0 (224 bytes)


Reading the Image Optional Header

Opt Header Magic = 0x10b
Entry Pt Addr : 0x00006bda
Image Base : 0x00400000
File Align : 0x00001000

Reading the Image Data Directory information

Data Directory     RVA         Size
--------------     ---         ----
ResourceTable      0x0000d000  0x00000430
DebugTable         0x00000000  0x00000000
BaseRelocTable     0x00000000  0x00000000
DelayImportDesc    0x0000af7c  0x000000a0
TLSTable           0x00000000  0x00000000
GlobalPtrReg       0x00000000  0x00000000
ArchSpecific       0x00000000  0x00000000
CLIHeader          0x00000000  0x00000000
LoadConfigTable    0x00000000  0x00000000
ExceptionTable     0x00000000  0x00000000
ImportTable        0x0000b25c  0x000000a0
unused             0x00000000  0x00000000
BoundImportTable   0x00000000  0x00000000
ExportTable        0x00000000  0x00000000
CertificateTable   0x00000000  0x00000000
IAT                0x00007000  0x00000210

Reading Image Section Header information

Name     Virt Sz     Virt Addr   rData Ofs   rData Sz    Char
----     -------     ---------   ---------   --------    ----
.text    0x00005ee0  0x00001000  0x00001000  0x00006000  0x60000020
.data    0x000002fc  0x0000c000  0x0000c000  0x00001000  0xc0000040
.rsrc    0x00000430  0x0000d000  0x0000d000  0x00001000  0x40000040
.rdata   0x00004cfa  0x00007000  0x00007000  0x00005000  0x40000040

Reassembling image file into dd.exe.img
Bytes written = 57344
New file size = 57344

Most of this is the PE header being parsed and displayed.
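For readers who want to see where those header values live, here is a rough Python sketch of the parsing step. It is not the File::ReadPE code. The field offsets follow the documented PE32 layout, but the sample buffer is synthetic: build_sample() is a made-up helper that plants the values from the dd.exe output above, and the machine type (0x14C), characteristics (0x010F) and section alignment (0x1000) it uses are my assumptions, not values taken from the dump.

```python
import struct


def parse_pe_header(buf: bytes) -> dict:
    """Pull a few PE32 header fields out of a buffer that starts at 'MZ'."""
    if buf[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
    (signature,) = struct.unpack_from("<I", buf, e_lfanew)
    if signature != 0x4550:                      # "PE\0\0" read little-endian
        raise ValueError("missing PE signature")
    # IMAGE_FILE_HEADER is 20 bytes and follows the 4-byte signature.
    (_machine, n_sections, _stamp, _symptr, _nsyms,
     opt_size, _chars) = struct.unpack_from("<HHIIIHH", buf, e_lfanew + 4)
    opt = e_lfanew + 24                          # start of the optional header
    (magic,) = struct.unpack_from("<H", buf, opt)
    (entry_point,) = struct.unpack_from("<I", buf, opt + 16)
    image_base, _sect_align, file_align = struct.unpack_from("<III", buf, opt + 28)
    return {
        "e_lfanew": e_lfanew,
        "sections": n_sections,
        "opt_header_size": opt_size,
        "magic": magic,
        "entry_point": entry_point,
        "image_base": image_base,
        "file_align": file_align,
    }


def build_sample() -> bytes:
    """Synthetic header carrying the values from the dd.exe output above."""
    buf = bytearray(0x200)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0xE8)
    struct.pack_into("<I", buf, 0xE8, 0x4550)
    # Machine 0x14C and Characteristics 0x010F are assumed, not from the dump.
    struct.pack_into("<HHIIIHH", buf, 0xE8 + 4, 0x14C, 4, 0, 0, 0, 0xE0, 0x010F)
    opt = 0xE8 + 24
    struct.pack_into("<H", buf, opt, 0x10B)
    struct.pack_into("<I", buf, opt + 16, 0x6BDA)
    struct.pack_into("<III", buf, opt + 28, 0x00400000, 0x1000, 0x1000)
    return bytes(buf)


info = parse_pe_header(build_sample())
for key, value in info.items():
    print("%-16s 0x%x" % (key, value))
```

In a real RAM dump, the buffer handed to parse_pe_header would be the page found at the physical offset of ImgBaseAddr, rather than a synthetic one.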

Now, a couple of caveats...first, this code ONLY works for RAM dumps from Windows 2000 systems (yes, I have been getting a lot of questions about that). Second, this does not take PAE into account. Nor does it take special cases into account, such as UPX compressed executables.

However, the code does check for things like pages that are paged out to the pagefile. Sounds kind of circular, I know, but basically, the pages in RAM that aren't being used can be paged out to pagefile.sys...since we're just dealing with a memory dump, we're assuming at this point that the pages in the pagefile aren't available.
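The check comes down to the "valid" bit in the page table entries. Below is a minimal Python sketch of non-PAE x86 address translation, assuming a raw (dd-style) dump in which file offset equals physical address. It is an illustration, not the lspi code: the tiny dump is synthetic, large (4 MB) pages are not handled, and every entry with the valid bit clear is treated as unavailable. Windows has other not-valid PTE states (transition pages, for instance) that a fuller tool might want to look at more closely.

```python
import struct

VALID = 0x01        # bit 0 of a PDE/PTE: page is present in physical memory
LARGE_PAGE = 0x80   # bit 7 of a PDE: 4 MB page (not handled in this sketch)
FRAME_MASK = 0xFFFFF000


def virt_to_phys(dump, dtb, vaddr):
    """Translate vaddr via the page directory at dtb (non-PAE x86).

    Returns the physical offset, or None if the page is not resident.
    """
    pde_addr = dtb + (vaddr >> 22) * 4
    (pde,) = struct.unpack_from("<I", dump, pde_addr)
    if not pde & VALID:
        return None
    if pde & LARGE_PAGE:
        raise NotImplementedError("4 MB pages are not handled here")
    pte_addr = (pde & FRAME_MASK) + ((vaddr >> 12) & 0x3FF) * 4
    (pte,) = struct.unpack_from("<I", dump, pte_addr)
    if not pte & VALID:
        return None                 # e.g. paged out to pagefile.sys
    return (pte & FRAME_MASK) | (vaddr & 0xFFF)


# Synthetic 16 KB "dump": page directory at 0x1000, page table at 0x2000,
# one resident data page at 0x3000.
DTB = 0x1000
dump = bytearray(0x4000)
struct.pack_into("<I", dump, DTB + 1 * 4, 0x2000 | VALID)      # PDE for 0x00400000
struct.pack_into("<I", dump, 0x2000 + 0 * 4, 0x3000 | VALID)   # resident page
struct.pack_into("<I", dump, 0x2000 + 1 * 4, 0x00012340)       # valid bit clear

resident = virt_to_phys(dump, DTB, 0x00400010)
paged_out = virt_to_phys(dump, DTB, 0x00401000)
unmapped = virt_to_phys(dump, DTB, 0x00800000)
print(hex(resident), paged_out, unmapped)
```

This is the same kind of lookup that turns a virtual address such as ImgBaseAddr 0x00400000 into the physical offset shown next to it in parentheses in the lspi output.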

Also, you will notice that the resulting executable image, once renamed to an .exe, does not run the way the original does. This is due to the fact that once the various sections are loaded into memory, some of the values in the different sections will change as the code is being run.

So, how is something like this useful? Well, if you're into malware analysis, or if you're performing incident response against an as-yet-unknown bit of code, this would be helpful.

Of course, there are things that need to be added to this code, as it performs only parsing. For example, some actual analysis would include examining the code at the entry point address, analyzing the sections, extracting the import address (or name) tables, etc.

As always, comments are welcome...

Monday, July 17, 2006

Automatic binary reassembly from a RAM dump

A bit ago, Andreas Schuster posted to his blog about reconstructing executable images from a RAM dump (1, 2, 3). The first method listed not only made the most sense to me at the time, but was also easy to implement, as it dovetailed right off of some code I'd written to parse PE file headers. As I blogged about earlier, getting the PE header information from the ImageBaseAddress offset is just the beginning.

I'm posting now to say that I recently got the code for automatic reassembly of executable images working. My results are identical to those shown in Andreas's first blog post. I suspect that Andreas may be correct in his third post that the reassembled binary won't run correctly due to changes in code as the program runs. However, there are other conditions to consider...for instance, what if pages from the image have been paged out to the pagefile?

After extracting the binary image for dd.exe from the first RAM dump used in the DFRWS 2005 Memory Challenge, I ran a script to extract the PE header info, and compared that to the PE header info for the original copy of the executable, retrieved from George Garner's site. Visually, the headers are identical. I've since used Jesse Kornblum's md5deep tool to verify that the .rsrc sections from both files (original and reconstructed) are identical.
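That section-by-section comparison is easy to script. Here is a minimal Python sketch, assuming the raw offsets and sizes that lspi reports for dd.exe elsewhere on this page; the two "files" are synthetic byte buffers rather than the real original and reconstructed images, and hashing over the raw data size (rather than the virtual size) is my choice for the illustration.

```python
import hashlib

# Raw data offset and size per section, from the dd.exe section headers.
SECTIONS = {
    ".data": (0xC000, 0x1000),
    ".rsrc": (0xD000, 0x1000),
}


def section_md5(image, raw_offset, raw_size):
    """MD5 of one section's raw bytes within an image file."""
    return hashlib.md5(image[raw_offset:raw_offset + raw_size]).hexdigest()


# Synthetic stand-ins: identical .rsrc data, one changed byte in .data
# (data sections are where run-time changes would be expected).
original = bytearray(0xE000)
original[0xD000:0xE000] = bytes(range(256)) * 16
original[0xC000:0xC004] = b"\x01\x02\x03\x04"
reconstructed = bytearray(original)
reconstructed[0xC000] = 0xFF

results = {
    name: section_md5(original, off, size) == section_md5(reconstructed, off, size)
    for name, (off, size) in SECTIONS.items()
}
for name, same in results.items():
    print("%-6s %s" % (name, "identical" if same else "differs"))
```

Hashing per section rather than per file is what makes the comparison useful here: a read-only section can still verify even when a writable section has changed in memory.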

More testing needs to be done, and the code needs to be cleaned up and brought in line with the other code already available. Once I've completed this, I'm going to redesign the format of the tools to be better suited to identifying the OS of the RAM dump, and then progressing on from there. At this point, like the other tools currently available, this code only works for RAM dumps from Windows 2000 systems.

You may be saying to yourself..."Yeah...and?" Some folks ask me why this kind of thing is important. Well, for one, if you're performing live response and suspect that there may be malware or a rootkit, you may want to actually get the executable for analysis. This may also be useful during dynamic analysis of malware, in which obfuscated malware is decompressed/decrypted when in memory.

Wednesday, July 12, 2006

Forensics Magazine

A bit ago, I blogged about cool technical e-zines that are available. Since then, CheckMate really seems to have come along, and I've also found some well-written content in the CodeBreakers Journal (be sure to check out the Magazine, as well).

While most of these e-zines are technical in nature, there don't seem to be many specific to forensic analysis. My first question is: is there any interest in such a thing? My thoughts are that such an e-zine would cover more than just forensic analysis of Windows systems, and would include topics in live response, legal issues, as well as (potentially) case studies, how-tos, etc.

Now, I know that there are several journals out there now, such as the DIJ and the IJDE, but I'm thinking of something a little more practical, down-and-dirty (though the article on iPod forensics from the IJDE is a lot like what I'm thinking of). If you're not familiar with it, check out SysAdmin Magazine. I like the format of this magazine because a lot of times, the articles don't simply refer to something being done...they actually provide the tools (be it a link to an executable, a shell script, etc.) to accomplish the task, and enough explanation for the reader to customize the script/process. So, is there interest? If so, what would you like to see in such an e-zine? Or do you think that there are already enough magazines, journals and e-zines out there, and the last thing you want to see is another one?

Tuesday, July 11, 2006

Have you seen...?

I'm looking for a couple of things before I get started down the road in collecting my own data regarding forensic artifacts, so I thought I'd turn to the community at large to see what's already out there...

Specifically, I'm looking for credible sources regarding P2P application artifacts, including evidence of installation, searches, downloads, etc., within the Windows file system and Registry.

I'm also interested in forensic artifacts on Windows systems for popular steganography tools.

Has anyone seen any credible sites out there regarding these two areas?

Book FAQ

In putting together my next book, one of the things I've put a lot of thought into is providing examples, as well as exercises. For example, I haven't found it very effective to say "run tool X against file Y"...instead, I'll provide a detailed walk-through (sort of like a mini case study) on how to do something, and then provide sample files that the reader can then run the tools/process against.

One thing that's been pretty consistent since my first book came out is folks wanting to know how to do specific tasks; how do I find movie files?, or how do I do X? For some of these specific, fairly frequent questions, I've been considering providing a FAQ in the book to answer/address them.

If you have questions like this that you'd like to see addressed, drop me a line and let me know. Please keep the questions specific to the forensic analysis of Windows or post-mortem.