Wednesday, October 26, 2005

Peeking inside Word documents

I was chatting with someone yesterday who asked me where I found the files I used to test the Word Metadata Dumper, and I simply said, "Google."

Yep, that's right...I just Googled for Word documents from the .mil domain, as well as the .gov domain. Wanna know how to search Google for all sorts of other goodies? Check out Johnny Long's web site, or grab a copy of his book on "Google hacking".

Want to get a little up-close-and-personal with someone else, maybe even someone else you don't know? Remember the "Extreme File Sharing" post from Security Fix? I'd tried it and found some of the very same things...files left behind by malware with keylogger capability, etc.

Tuesday, October 25, 2005

Sleuthkit on Windows

Hey, guess what I did last night!! I installed Sleuthkit and Autopsy on Windows XP!

For you Linux and *BSD gurus who felt a nauseating disturbance in the Force...it wasn't that burrito you had...it was me!

Okay, okay...it wasn't just me...I had help.

When I'm working with things on my home systems and looking at forensic analysis, I like to use ProDiscover to grab a dd image of a VMWare session, my thumb drive, etc. As it turns out, I had a 5GB image of an XP Pro system, so I copied that over to the evidence locker and fired up Autopsy. I didn't run completely through many of the things that I could have done, because it would take some time to do so...but as far as the things I did try, they worked great.

Don't have an image of your own to play with? Well, the instructions for installing Sleuthkit and Autopsy on Windows also include instructions for imaging a floppy drive...so, you can entertain and amaze your friends by recovering deleted files! Or, you can go to the Digital Forensics Tool Testing site and grab an image or two to work with.

My hat's off to Brian Carrier, for having created these tools.

Monday, October 24, 2005

Perl for Forensics

Perl is freely available, and Perl scripts are essentially open source. Perl is used by forensics products such as The Sleuthkit and ProDiscover. Perl is used in the Metasploit framework. Perl is great for automating repetitive tasks, parsing files...and it's free. Perl runs on Windows, most Unices, the Mac, and a plethora of other platforms.

O'Reilly has a ton of books on Perl...from how to program to how to use Perl for a variety of tasks.

So my question is, how useful would a book on using Perl for forensics be to you? Say, a reference tome that discusses:
  • Collecting live/volatile data using Perl
  • Correlating data from multiple sources using Perl
  • Analyzing data, or presenting data for analysis
  • Analyzing file formats (retrieving metadata, etc.)

Obviously, a book like this should include copies of all code used or mentioned in the book. As ProDiscover uses Perl as its scripting language, a book such as this should also include a variety of "ProScripts". The book should also include not only the files analyzed in the book, but additional example files that the reader can explore and practice on...perhaps even an image of a drive to examine.

Is this a book you'd be interested in? If so, what would you like to see? What topics do you think should be covered? How would you envision such a book, particularly as something that you'd pick up off of a shelf at a bookstore and decide to purchase? What do you see as the market for such a book?

Saturday, October 22, 2005

VMWare Playa

As a user of VMWare, I received an email the other day that mentioned a new, free product called the VMWare Player. This is a free product that allows you to play a single VMWare virtual machine on Windows or Linux (rpm and tar versions available for Linux). Very cool. Want to share tools, and other stuff that you may not have been able to share before? Want to try out Windows or Linux, but didn't want to shell out the almost $200 for VMWare Workstation?

VMWare also provides some pre-built virtual machines for you to download. What good is a player if you don't have something to play? One is a browser appliance, which you can use for safer web surfing.

This was also picked up by TaoSecurity, along with some comments from readers of that blog, and a link to a chart showing differences in functionality between the Player and other VMWare products.

Word metadata code posted

All,

I've posted the code that I mentioned in my previous post on Word metadata. This code produces the output seen in the blog entry.

The code is commented, including how to obtain the necessary modules if you're using ActiveState Perl. The PPM commands look like this:

ppm install OLE-Storage
ppm install Startup
ppm install Unicode-Map

Give the code a shot, and let me know what you think. As I said in my earlier post, I'm working on producing a standalone EXE via Perl2Exe, for Windows users.

Sunday, October 16, 2005

Yet, even more on Word metadata

While awaiting information on the binary format of shortcut (LNK) files, I decided to try to learn more about structured storage and metadata in Word documents. The best example I've seen that describes some of the metadata in Word documents is available at the Computerbytesman site, and addresses an issue that Tony Blair's government had a while back. While I was researching my book, Richard Smith was kind enough to share with me his code for retrieving the last 10 authors from within a Word document. Since that time, I've thought about taking another look at the sort of metadata that one can retrieve from within a Word document.

I included a Perl script for retrieving Word metadata with my book. The code is on the CD that accompanies the book, in the code directory for chapter 3. The script is called "meta.pl" and uses the Win32::OLE module to create an instance of Word, and use the API to retrieve metadata. Well, as I've seen with the work that I did on reading Event Log files, the API doesn't always get everything. Also, I've been looking for something a little more platform-independent.

Thanks to Richard Smith, I dug into the OLE::Storage module a bit, and found exactly what I was looking for. First, a quick caveat...the POD for this module, as well as some of the supporting modules, is a bit out of date. However, by using some of the accompanying examples (such as ldat, written by Martin Schwartz, copyright '96-'97) and simply trying some things out, I was able to figure things out. So the script uses that module, and a couple of others...but only after it opens the file in binary mode to retrieve other information from the file.

Okay, on to the output. I started with the Blair document from the Computerbytesman site, and got the same information (I didn't include the VBA Macro information, though). I downloaded a couple of arbitrary Word documents from the Web, via Google, and found some interesting info:

--------------------
Statistics
--------------------
File = d:\cd\wd\04_007.doc
Size = 322560 bytes
Magic = 0xa5ec (Word 8.0)
Version = 193
LangID = English (US)

Document has picture(s).

Document was created on Windows.

Magic Created : MS Word 97
Magic Revised : MS Word 97

--------------------
Last Author(s) Info
--------------------
1 : Susan and Shawn Sutherland :
2 : Susan and Shawn Sutherland :
3 : Susan and Shawn Sutherland :
4 : Susan and Shawn Sutherland :
5 : picketb :
6 : padilld :
7 : ONR :
8 : John T. McCain :
9 : horvats :
10 : arbaizd :

--------------------
Summary Information
--------------------
Title : I
Subject :
Authress : PICKETB
LastAuth : arbaizd
RevNum : 2
AppName : Microsoft Word 10.0
Created : 08.12.2003, 16:11:00
Last Saved : 08.12.2003, 16:11:00
Last Printed : 08.12.2003, 16:11:00

--------------------
Document Summary Information
--------------------
Organization : Office of Naval Research

Pretty cool, eh? Again, I found this document on the web. From my previous post, I asked some folks to send me documents written on the Mac platform, and I received a couple. Here's what the output looks like:

--------------------
Statistics
--------------------
File = d:\cd\wd\ex1.doc
Size = 21504 bytes
Magic = 0xa5ec (Word 8.0)
Version = 193
LangID = English (US)

Document was created on a Mac.
File was last saved on a Mac.

Magic Created : Word 98 Mac
Magic Revised : Word 98 Mac

--------------------
Last Author(s) Info
--------------------
1 : : Macintosh HD:Users:name:Desktop:Ex1.doc

--------------------
Summary Information
--------------------
Title : The quick brown fox jumps over the lazy dog
Subject :
Authress : name
LastAuth : name
RevNum : 1
AppName : Microsoft Word 10.1
Created : 12.10.2005, 02:51:00
Last Saved : 12.10.2005, 02:58:00
Last Printed :

--------------------
Document Summary Information
--------------------
Organization :

Okay, I made a couple of obvious changes, but the point is that there is information within the binary contents of the file information block (FIB) that tells you the platform that a document was created on...for example, if it was created on a Mac, or on a Windows platform. Pretty cool, eh?
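To make the point about the FIB concrete, here is a minimal sketch in Python (not the Perl script discussed in this post) that pulls a few of the fields shown in the output above from the first bytes of the "WordDocument" stream. The offsets are assumptions taken from one reading of the Word 97 binary file format description: wIdent at 0, nFib at 2, lid at 6, a flags word at 10 (0x0008 = has pictures), envr at 18 (1 = created on a Mac), a flags byte at 19 (0x01 = last saved on a Mac), and wMagicCreated/wMagicRevised at 34 and 36. The function name and dictionary keys are hypothetical, and extracting the stream from the OLE file is assumed to have been done already.

```python
import struct

def parse_fib_header(stream: bytes) -> dict:
    """Decode a few FIB fields from the start of a WordDocument stream.

    All offsets are assumptions based on the Word 97 format description.
    """
    if len(stream) < 38:
        raise ValueError("need at least 38 bytes of the WordDocument stream")
    # Six little-endian 16-bit words at offsets 0, 2, 4, 6, 8 and 10.
    w_ident, n_fib, _product, lid, _pn_next, flags = struct.unpack_from("<6H", stream, 0)
    # Two single bytes at offsets 18 (envr) and 19 (flags, bit 0 = fMac).
    envr, flags2 = struct.unpack_from("<2B", stream, 18)
    # Creator/reviser magic numbers at offsets 34 and 36.
    magic_created, magic_revised = struct.unpack_from("<2H", stream, 34)
    return {
        "magic": w_ident,
        "version": n_fib,
        "lang_id": lid,
        "has_pictures": bool(flags & 0x0008),
        "created_on_mac": envr == 1,
        "last_saved_on_mac": bool(flags2 & 0x01),
        "magic_created": magic_created,
        "magic_revised": magic_revised,
    }
```

The magic values are returned as raw numbers rather than mapped to product names, since that mapping is not spelled out here.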

So...what do you think? I'll be posting the script soon, along with a couple of other scripts...for example, I'm going to include one that I used for troubleshooting, which simply writes all of the structured storage streams to files on the system. After all, MS describes structured storage as "a file system within a file", so wouldn't you like to see the contents of each of those files? I'm not entirely sure of the usefulness of this with regards to forensic analysis, but someone might find it useful.

An offshoot of all this involves the MergeStreams application (here's something I found at UTulsa) that I've used in some of my presentations. This application allows you to merge an Excel spreadsheet into a Word document, resulting in a much larger, but otherwise unchanged Word doc. However, if you change the resulting file's extension to ".xls", and double-click on it, you'll see the entire, unmodified contents of the spreadsheet. This is due to the streams being merged, and handled by the appropriate application (no, this is not steganography!!). Whenever I've presented on this, I've been asked how this sort of thing can be detected, and up until now, the only solutions I've been able to come up with have included the use of 'strings' and 'find'. With this module, however, you can dump the names of the streams from an OLE document, and if you see a stream named "Workbook" inside a Word document, you can be pretty sure that you've got an embedded document. This is a more accurate method than using 'strings'.
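The stream-name check described above can be sketched roughly as follows. This is Python rather than Perl, and it assumes the stream names have already been dumped from the OLE document by some other tool (OLE::Storage, for instance). The function name and the small lookup table are hypothetical; the table only lists the main stream names used by Word ("WordDocument"), Excel ("Workbook", or "Book" for older versions) and PowerPoint ("PowerPoint Document").

```python
# Main stream names and the application that normally owns each one.
MAIN_STREAMS = {
    "WordDocument": "Word",
    "Workbook": "Excel",
    "Book": "Excel",
    "PowerPoint Document": "PowerPoint",
}

def foreign_streams(stream_names, expected_app="Word"):
    """Return (name, app) pairs for main streams owned by another application."""
    found = []
    for name in stream_names:
        app = MAIN_STREAMS.get(name)
        # Only flag streams we recognize that belong to a different application.
        if app is not None and app != expected_app:
            found.append((name, app))
    return found
```

For a merged document, a call such as foreign_streams(["WordDocument", "1Table", "Workbook"]) would flag the "Workbook" stream, while an ordinary Word document would return an empty list.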

I'll be releasing the scripts soon...there are a couple of things I need to clean up, and I'm having a small issue with the compiled EXE version of the main script (above) that I'm trying to clear up.

Thursday, October 13, 2005

Recent rootkit news

I hope you weren't expecting things to stand still...

This past Monday, F-Secure had an entry in their blog about a custom version of Hacker Defender. In this case, "custom" means "private commercial", meaning that someone paid for a specific version of the rootkit. And don't think for an instant that this is the only one...there are other rootkit authors who do the very same thing.

According to the F-Secure blog entry, the version of the rootkit has anti-detection mechanisms. Specifically, it detects modern rootkit detectors via their binary signature, and if it does find one of the detectors, it can modify itself or the detector. F-Secure says that the most recent version of their BlackLight product can detect this rootkit.

This brings up something I saw over on the Incidents.org blog. Handler Patrick Nolan posted an entry about rootkits that run even in safe mode. Yes, that's right...when you try to boot your computer in safe mode (here's the description of Safe Mode for Windows 2000) so that certain Registry keys aren't parsed, such as autostart locations, the rootkit will still launch. Check out this description from Symantec (btw...take a look at everything that last bit of malware does...).

The Registry key you're interested in is:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SafeBoot

On a side note, Autoruns has been updated to v8.22, and includes new functionality. I've run it on my system, and it doesn't seem to check the SafeBoot key mentioned above. However, when running your cases and parsing the Registry files from an image, be sure to add this one to your list of keys to check. Remember, though...on an image, the correct path would be "ControlSet00x", rather than "CurrentControlSet".
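As a small sketch of that last point: in the System hive file from an image, the "Current" value under the "Select" key indicates which numbered control set was in use, so the SafeBoot paths to check can be built from it. The function name is hypothetical, and reading the value out of the hive file is assumed to be handled by whatever Registry parser you use.

```python
def safeboot_paths(select_current):
    """Build SafeBoot key paths for an offline System hive.

    select_current is the DWORD read from the hive's Select\\Current value.
    """
    if select_current < 1:
        raise ValueError("Select\\Current should be 1 or greater")
    # e.g. 1 -> ControlSet001, 2 -> ControlSet002
    control_set = "ControlSet%03d" % select_current
    base = control_set + "\\Control\\SafeBoot"
    return [base, base + "\\Minimal", base + "\\Network"]
```

With a Select\Current value of 1, this yields ControlSet001\Control\SafeBoot plus its Minimal and Network subkeys.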

Addendum 14 Oct: I caught an interesting item on the Spire Security ViewPoint blog this morning...there's a link to a VNUNet article (ZDNet's version) that mentions three guys in the Netherlands who got busted with a 'botnet army of 100K nodes/zombies. The bots were evidently W32.ToxBot, which Symantec states has "0-49" infections. In all fairness, though, Symantec's definition of "number of infections" is "Measures the number of computers known to be infected". This leads me over to a post on the TaoSecurity blog about digital security, and how the indicators of engineering failures in the real, "analog" world differ from those in the digital world. I can't imagine that all 100K of the zombies infected with W32.ToxBot were simply home user systems. It's entirely possible that many of them were academic and corporate systems...and in the case of the corporate systems, someone should have realized that something was going on.

I've dealt with incidents in the past in which admin machines were infected. When I was the security admin at a financial services company, I had a script that would pull down the most recent IIS 4.0 web server logs from a system that we had (and that I'd locked down, in part by removing all but the necessary script mappings) and parse out the other-than-ordinary entries. Over the course of a couple of days, I noticed Nimda scans from the same IP address. So, I did a lookup of the IP space to see who owned it, and in the end, I got lucky. The infected system was owned by the administrator, who was also the technical contact for the domain. I talked to him via the phone...he stated that he didn't realize that he'd had a web server on his system, and didn't know that his system was infected with Nimda (had been for several days), but once he started receiving calls (mine wasn't the first), he really had no idea what to do about it.
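A log-parsing script of that sort might look something like the sketch below. It is written in Python and is not the original script; it assumes the logs are in W3C Extended format with a "#Fields:" directive that includes "c-ip" and "cs-uri-stem", and it treats any request for cmd.exe or root.exe (both of which Nimda scans are known to probe for) as out of the ordinary. The function name and the marker list are hypothetical.

```python
from collections import Counter

# Substrings that should not show up in requests to a locked-down server.
SUSPICIOUS = ("cmd.exe", "root.exe")

def suspicious_requests_by_ip(lines):
    """Count suspicious requests per client IP in W3C Extended log lines."""
    fields = []
    hits = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # Remember the column names so the rows can be read by name.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        uri = row.get("cs-uri-stem", "").lower()
        if any(marker in uri for marker in SUSPICIOUS):
            hits[row.get("c-ip", "unknown")] += 1
    return hits
```

Calling hits.most_common() on the result lists the noisiest source addresses first, which is the kind of view that makes repeated scans from a single IP stand out.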

Okay...back to our little 'bot. Take a look at the Symantec write-up for the 'bot, in particular these items:
  • Installs as a service, and oddly enough, it actually writes a description for the service, as well
  • Besides the Registry keys for the service, it adds entries under "Control\SafeBoot\Minimal" and "Control\SafeBoot\Network" so that it is initiated even if the system is booted to Safe Mode
  • It looks for the "HKLM\Software\VMware" key, and doesn't run if it finds it (Note: this same technique was used in SotM 32)

Nothing in the write-up indicates the use of rootkit capabilities, but from the capabilities this bot does have...wow. How much harder would it have been for normal admins to detect it if it did have rootkit capabilities (ie, the use of rdriv.sys, for example)?

CA's write up on the ToxBot family

Addendum 21 Oct: VNUNet posted an article 2 days ago, announcing that rootkit creators have gone professional, selling custom versions of their software. While "creators" is plural, there is only one such rootkit announced in the article. This was /.'d, as well. Contrary to what the author of the article would have you believe, this is NOT new.

Tuesday, October 11, 2005

More on Word Metadata

This past summer, I gave a couple of presentations, one that covered file metadata. I got to thinking...I've parsed Event Log files and Registry files in binary format...why not do the same with Word documents and see what else is in there besides what the MS API is telling me? After all, a particular value that references "hidden" data may be set to 0 (or 0x0000), but the actual data itself may still be there.

Remember the issue with Blair's gov't? When I found this, I tried using the MS API (via OLE) to retrieve the metadata concerning the last 10 authors of the file, and I simply could not get it to work. However, Richard Smith had no trouble doing so.

I started looking around and found the MS Word 97 binary file format (BFF) (here in HTML). I haven't had any trouble parsing the file information block, but what I am having a bit of trouble doing is locating the beginning of the table stream. Many of the values I'm interested in are listed as "offset in the table stream", indicating (to me, anyway) that the offset is from the beginning of the table stream.
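One possible reading of the format description, offered as an assumption rather than a confirmed answer: the table stream is a separate stream in the OLE file named either "0Table" or "1Table", and a bit in the FIB flags word at offset 10 (0x0200, listed as fWhichTblStm) says which one is in use. If that is right, "offset in the table stream" values would be measured from the start of that stream. A minimal Python sketch of the check, with a hypothetical function name:

```python
import struct

def table_stream_name(word_document_stream: bytes) -> str:
    """Pick the table stream name from the FIB flags word (assumed at offset 10)."""
    (flags,) = struct.unpack_from("<H", word_document_stream, 10)
    # Bit 0x0200 set means "1Table"; clear means "0Table".
    return "1Table" if flags & 0x0200 else "0Table"
```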

If anyone has any information on this, I'd greatly appreciate some help with this.

Also, for testing/verification purposes, I was wondering if anyone out there with a Mac would do me a favor and create a couple of simple Word documents on that platform, zip them up, and send them to me. Some of the metadata within the Word document tells you whether the file was created or revised on a Mac. When you send the files, if you could specify the platform and version numbers (of the os and the application), I'd appreciate it. Thanks!

Monday, October 10, 2005

Perl Programming on Windows

Let's see a show of hands for everyone out there who programs in Perl on Windows systems. Okay, thank you...please put your hands down. Now, how many of you use Perl to manage Windows systems? Okay, thank you very much.

Now...how many of you want to use Perl to manage your systems?

Whether you're an experienced Perl programmer and not familiar with Windows, or you're a Windows admin and don't know much about Perl, let me be the first to tell you...Perl is a very powerful tool that you can learn to use, and use to harness your infrastructure.

Books like Learning Perl and Learning Perl on Win32 Systems will get you started. Even Advanced Perl Programming and Perl for System Administration can help. Dave Roth's books and web site can help. But to really get into the guts of what you can do, you need to (in the words of Nike), just do it.

At its simplest, Perl can be used to automate tasks. Using Perl, you can create a Scheduled Task that reports certain information and has it waiting and available when the sysadmin comes in in the morning. Throw in a little error checking, and you will have reports on why some things may not have completed successfully...like systems being turned off, services not being available, etc. What would you like to do? Run nmap? Not a problem. Run it against your systems first thing in the morning, or over lunch, and have the output written to a file on your system. Once that's done, use Nmap::Parser to sort through the data and create reports. Great for sysadmins, pen testers, and security analysts running vulnerability assessments.
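The nmap workflow described above (scan, write the output to a file, then parse it into a report) can be sketched as follows. This version is in Python and uses the standard library's XML parser in place of Nmap::Parser; the function names are hypothetical. It relies on nmap's -oX option, which writes XML output to a file, and on the host/address/ports/port/state elements in that output.

```python
import subprocess
import xml.etree.ElementTree as ET

def run_nmap(target, xml_path):
    """Run nmap against a target and write the XML output to xml_path."""
    subprocess.run(["nmap", "-oX", xml_path, target], check=True)

def open_ports(xml_text):
    """Map each host address to a list of (protocol, port) pairs that are open."""
    root = ET.fromstring(xml_text)
    report = {}
    for host in root.findall("host"):
        address = host.find("address")
        if address is None:
            continue
        ports = []
        for port in host.findall("ports/port"):
            state = port.find("state")
            if state is not None and state.get("state") == "open":
                ports.append((port.get("protocol"), int(port.get("portid"))))
        report[address.get("addr")] = ports
    return report
```

A scheduled job could call run_nmap first thing in the morning, read the resulting file, and pass its contents to open_ports to build the report.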

Perl can be used to implement WMI, and collect information from managed systems. Many of the tools I have available on my web site implement WMI. Using WMI, you can scan remote systems for processes, services, and even perform software inventory scanning from a continent away. Or how about reaching out across the country to locate rogue WAPs via managed Windows XP systems?

Perl is a very powerful tool that can be harnessed to automate a wide variety of tasks performed by sysadmins, as well as security analysts. Data collection and parsing, as well as some modicum of analysis, can all be easily automated. Some of the things I use Perl for include:
  • Retrieve data from deep within the local system, or from remote systems
  • Parse binary files, based on structure documentation, knowing what each DWORD means, etc. (ie, PE header analysis, Event Log and Registry parsing, etc.)
  • Retrieve metadata from files (ie, Word/Excel docs, JPGs, PDF files, Prefetch files, etc.)
  • Query service information
  • Correlate data across multiple sources (ie, Registry, files, etc.)
  • Automate information discovery in ProDiscover IR
A side effect of all this is that you end up learning how Windows systems function by themselves, as well as within a domain. If you're automating a task, you end up learning a great deal about the task and the issue that the task addresses, as well.
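As one illustration of the "parse binary files, based on structure documentation" item in the list above, here is a minimal PE header sketch in Python (the same approach carries over to Perl's unpack). It follows the documented layout: an "MZ" signature at offset 0, the offset of the PE header stored as a DWORD at 0x3C, a "PE\0\0" signature at that offset, and then the COFF header fields Machine, NumberOfSections and TimeDateStamp. The function name and the returned keys are hypothetical.

```python
import datetime
import struct

def parse_pe_header(data: bytes) -> dict:
    """Read the machine type, section count and timestamp from a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: DWORD at 0x3C holding the offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF header starts right after the 4-byte signature.
    machine, sections, timestamp = struct.unpack_from("<HHI", data, e_lfanew + 4)
    return {
        "machine": machine,
        "sections": sections,
        "timestamp": datetime.datetime.fromtimestamp(timestamp, datetime.timezone.utc),
    }
```

Typical use would be to read an executable in binary mode and pass the bytes in, for example parse_pe_header(open(path, "rb").read()).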

If this is something you're interested in, drop me a line, post a comment, etc.

Tuesday, October 04, 2005

Book Report

I haven't blogged in a while, and I came across something worth blogging about. While I don't have the actual numbers in front of me, I've received word from my publisher that my book has only shipped 3500 copies domestically since it was published in July, 2004. From the numbers I received in April of this year, 3055 of those copies were in the first couple of months.

So what does this mean? I have no idea at this point, other than it doesn't seem to be enough to justify another book. That's right...given all the material I've produced in the 15 or so months since the first book was published, I've already started putting another book together - an advanced version of the first one, with more technical, detailed information.

The last bit I got from the publisher is that it's up to me to find out what you, the readers, want in another book in order to get the final, published product to move off of the shelves. From what you've told me so far, it just about amounts to incident response war stories, case studies, and maybe even challenges you can work through. All that I can do...but again, it really doesn't sound promising.

I guess I need to start looking around for another avenue for publication. Fortunately, I got one good pointer at lunch today that I need to follow up on...

Addendum 5 Oct: I thought maybe I should give a brief description of what I was looking to provide in the next book. I wasn't planning for my next effort to be a second edition of the first...rather, my thought was to use the first as a stepping stone and launch off into a more advanced effort. I'd like to go more deeply into actual forensic analysis, with the focus being on analysis. Too many times, I've read papers and books that talk about analysis, and for the most part will only go so far as to say "run this tool, and if you see this in the output, something may be amiss..." I'd like to address data correlation and analysis, across the board...use multiple sources of information (i.e., file system, Registry, Event Log, etc.) to build out as complete a view of the issue as possible. I think that the best way to do that is to present the information, and then present examples via live case studies. This book would be interspersed with "war stories", case studies, and examples. I'd also like to include challenges, and exercises for the reader to work. This one would cover both live and post-mortem analysis.

If you've followed this blog, you're familiar with some of what's going to appear in the book...the tools I've released, things I've mentioned here (with more detailed research and analysis) will all be part of the book.

What do you think of something like this? Is this a pipe dream, or is it something you'd like to have on your reference shelf?