Thursday, June 30, 2005

Reading Windows files in binmode()

For the past couple of days, I've been writing Perl scripts to parse binary data on Windows systems...I've been staring at 1s and 0s, hex, parsing strings, etc.

My first exercise was to parse PE headers...and the script works very well for legit PE files. By reviewing the information available on MSDN, and correlating that information with other sources (sorry, guys, but there are some holes in your docs!), I have been able to parse all the way down past the data directories and into the section headers. Very cool! There's still a lot of work to do to make this script really useful, but developing it has been very beneficial in understanding PE headers and malware.
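For anyone who wants to follow along, here is a minimal sketch of the first steps of that walk. It is not the script described above. It assumes the published PE/COFF layout: an "MZ" stub whose 4-byte field at offset 0x3C points to the "PE\0\0" signature, followed by a 20-byte file header, the optional header, and then 40-byte section headers. The sub names read_file_binary and parse_pe_headers are made up for illustration, and the sketch skips over the optional header and data directories rather than decoding them.

```perl
use strict;
use warnings;

# Slurp a file as raw bytes; binmode() stops Windows newline translation.
sub read_file_binary {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);
    local $/;                       # slurp mode
    my $data = <$fh>;
    close($fh);
    return $data;
}

# Parse the DOS stub, PE signature, file header, and section headers
# from a byte string. Returns a hash ref, or nothing if the data does
# not look like a PE file.
sub parse_pe_headers {
    my ($data) = @_;
    return if length($data) < 0x40;
    return unless substr($data, 0, 2) eq 'MZ';

    # e_lfanew: little-endian DWORD at 0x3C, offset of the PE signature
    my $pe_off = unpack('V', substr($data, 0x3C, 4));
    return if length($data) < $pe_off + 24;
    return unless substr($data, $pe_off, 4) eq "PE\0\0";

    # IMAGE_FILE_HEADER: 20 bytes right after the 4-byte signature
    my ($machine, $nsect, $timestamp, $symptr, $nsyms, $opt_size, $chars)
        = unpack('v v V V V v v', substr($data, $pe_off + 4, 20));

    # Section headers follow the optional header; each one is 40 bytes
    my $sect_off = $pe_off + 24 + $opt_size;
    my @sections;
    for my $i (0 .. $nsect - 1) {
        my $off = $sect_off + 40 * $i;
        last if $off + 40 > length($data);      # truncated file
        my ($name, $vsize, $vaddr, $rsize, $rptr)
            = unpack('Z8 V V V V', substr($data, $off, 40));
        push @sections, {
            name         => $name,
            virtual_size => $vsize,
            virtual_addr => $vaddr,
            raw_size     => $rsize,
            raw_offset   => $rptr,
        };
    }

    return {
        pe_offset            => $pe_off,
        machine              => $machine,
        num_sections         => $nsect,
        timestamp            => $timestamp,
        optional_header_size => $opt_size,
        characteristics      => $chars,
        sections             => \@sections,
    };
}

# Usage: perl pehdr.pl <file>
if (@ARGV) {
    my $pe = parse_pe_headers(read_file_binary($ARGV[0]))
        or die "$ARGV[0] does not look like a PE file\n";
    printf "Machine: 0x%04x  Sections: %d\n", $pe->{machine}, $pe->{num_sections};
    printf "  %-8s raw offset 0x%08x, raw size 0x%08x\n",
        $_->{name}, $_->{raw_offset}, $_->{raw_size} for @{ $pe->{sections} };
}
```

Returning nothing for anything that fails the "MZ" or "PE\0\0" checks is deliberate: a file that claims to be an executable but fails them is exactly the kind of thing worth a closer look.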

Now, I'm working on a Perl script to parse .evt files manually, by opening the file in binmode() and parsing the byte stream. The problem I'm having is that even though the EVENTLOGRECORD structure is well documented at the MSDN site, I have not been able to find any information about the data located between offset 0 of the file, and the offset of the first record (which itself seems variable, depending on log type, operating system, etc.). Byte alignment is important, so I know that the API has some inherent method for locating the various records. However, I'm trying to read in the file, basically, a byte at a time...does anyone have any information about the .evt file header info? I'd like to figure out how to parse and make sense of this data.
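One way to sidestep the unknown header for now, sketched below, is to not start at offset 0 at all. The documented EVENTLOGRECORD has a Reserved field that always holds ELF_LOG_SIGNATURE (0x654c664c), which sits on disk as the bytes "LfLe", so a script can scan the byte stream for that string and treat the DWORD just before it as the record length. This is a rough sketch under those assumptions, not a finished parser: the sub name find_event_records is made up, only the first few fixed fields are decoded, and a record that wraps around the end of the file would be missed by the sanity checks.

```perl
use strict;
use warnings;

# The fixed portion of EVENTLOGRECORD, as documented, is 56 bytes.
my $FIXED_LEN = 56;

# Scan a byte string for "LfLe" and decode the record that surrounds
# each hit. Returns a list of hash refs, one per plausible record.
sub find_event_records {
    my ($data) = @_;
    my @records;
    my $pos = 0;
    while (($pos = index($data, 'LfLe', $pos)) >= 0) {
        my $start = $pos - 4;       # the Length DWORD precedes the signature
        $pos += 4;                  # continue the search after this hit
        next if $start < 0 || $start + $FIXED_LEN > length($data);

        # Length, Reserved, RecordNumber, TimeGenerated, TimeWritten,
        # EventID (DWORDs), then EventType, NumStrings, EventCategory (WORDs)
        my ($length, $sig, $recnum, $tgen, $twrit, $event_id,
            $type, $nstrings, $category)
            = unpack('V V V V V V v v v', substr($data, $start, 30));

        # Sanity checks: the record must fit in the data, and the copy of
        # Length stored at the end of the record must match the first one.
        next if $length < $FIXED_LEN || $start + $length > length($data);
        my $tail = unpack('V', substr($data, $start + $length - 4, 4));
        next unless $tail == $length;

        push @records, {
            offset         => $start,
            length         => $length,
            record_number  => $recnum,
            time_generated => $tgen,    # seconds since 1970-01-01 UTC
            time_written   => $twrit,
            event_id       => $event_id,
            event_type     => $type,
            num_strings    => $nstrings,
            category       => $category,
        };
    }
    return @records;
}

# Usage: perl evtscan.pl <file.evt>
if (@ARGV) {
    open(my $fh, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
    binmode($fh);
    my $data = do { local $/; <$fh> };
    close($fh);
    for my $r (find_event_records($data)) {
        printf "offset 0x%08x  record %d  event ID %d  %s UTC\n",
            $r->{offset}, $r->{record_number}, $r->{event_id},
            scalar gmtime($r->{time_generated});
    }
}
```

The offset of the first hit that survives the checks also shows where the first record begins, which at least marks the boundary of the undocumented bytes in front of it.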

While this whole thing seems like a pointless exercise, there is a method to my madness. For example, I can use scripts like this with ProDiscover 4.0 to reduce the time it takes to analyze a system. If you know what it is you're looking for, you can automate the activity, increasing efficiency, reducing mistakes, etc. So let's say I have a ProScript (ProDiscover uses a Perl module called ProScript to implement Perl as its scripting language) that I use to identify and copy files from an image. I can then automate scripts using...you guessed it...Perl in order to help me find only the suspicious things. This is referred to as "data reduction". By using the ProScript API, I can automate adding this information to my reports, as well.
