Friday, January 09, 2015

A Smattering of Links and Other Stuff

Well, 2014 ended with just one submission for the WRA 2/e Contest.  That's unfortunate, but it doesn't alter my effort in updating the book in any way.  For me, this book will be something to look forward to in 2015, and something I'm pretty excited about, in part because I'll be presenting more information about the analysis processes I've been using, processes that have led to some pretty interesting findings regarding various incidents and malware.

In other Registry news, Eric Zimmerman has created an offline Registry parser (in C#) and posted it to GitHub.  Along with Eric's Shellbag Explorer tool, it really seems as if his interest in the Windows Registry is burgeoning, which is great.  Hopefully, this will generate some of the same interest in others.

Willi has a new tool available called process-forest, which parses event ID 4688 records within the Security.evtx file to develop a process tree.  If you have the necessary auditing enabled, and have increased the default size of your Security Event Log, this tool will provide you with some great data!

Tool Updates
Speaking of the Registry, I've made some updates to several of my tools.  One was to extend the plugin by creating a RegRipper plugin ( that could be used to scan any hive file for indications of "fileless" malware, such as Phasebot or Poweliks.  My testing of this plugin was limited, but it works for other search criteria.

I've also made some minor updates to other tools.

Widening the Aperture through Sharing
I was doing some reading recently regarding attribution as a result of DFIR analysis, which included Richard's recent post. From there, I followed the link to Rid and Buchanan paper (PDF), which was fascinating reading.  I took a lot of what I read and applied it to my thinking on threat intel in general, and not just to attribution.  One of the things I've been noodling over is how to widen the aperture on the indicators that we initially collect/have access to, and how that relates to the incident response process.

One of the big things that immediately jumped into my head was "sharing".

Secureworks Research Blog post
Speaking of sharing, this article (titled "Sleeper Agents") was published recently on the Dell Secureworks Research Blog.  It's a great example of something seen in the wild, and

Sharing Indicators
Continuing with the sharing theme, @binaryz0ne posted an article to his blog regarding artifacts of user account creation, looking at differences between the use of the command line and GUI for creating local accounts on a system (i.e., not AD accounts).

A couple of things to point out...

First, this testing is for account creation.  The accounts are being created, but profiles are not created until someone logs in using the credentials, as mentioned by Azeem.

Second, if you were interested in just account creation information, then collecting ALL available timeline information might be time consuming.  Let's say you found the event log record for account creation and wanted to focus on just that aspect of analysis.  Using the "Sniper Forensics" principles, you could create a timeline from just the Windows Event Logs (Security, LocalSessionManager, RemoteConnectionManager, maybe even TaskScheduler, just in case...), the SAM Registry hive ( RegRipper plugin), prefetch files (if available), and the user's (account collected from Windows Event Log record) NTUSER.DAT and USRCLASS.DAT.

I've seen instances of both (use of CLI and GUI to create accounts...) in the wild, and it's great to see someone putting in the effort to not only test something like this, but to also share their findings.  Thanks, Ali!

I've been doing some testing of Powershell, and have been using a .ps1 file to create a user and add it to the local Administrators group, and finding some truly fascinating artifacts.

One of the things I've been working on (well, working off and on...) is learning to program in Python.  I do plan to maintain my Perl programming, but learning Python is one of those things I've laid out as a goal for myself.  I've already written one small program ( that I do use quite often for parsing the DestList streams out of automaticDestination Jump Lists, including adding that information to a timeline.  I don't expect to become an expert programmer, but I want to become more proficient, and the best way to do that is to start developing projects.

Not long ago, I was doing some reading and ran across this ISC post that mentions the impacket library (from CoreLabs), and I thought that was pretty interesting.  Not long after, I was reading Jon Glass's blog and ran across another mention of the impacket library, and solutions Jon has been looking at for parsing the IE10+ WebCacheV01.dat history database (part 3 of which can be found here).  I've been pretty successful using the NirSoft ESE Database Viewer so far, but like Jon, I may reach the point where I'd like to have something that I can extend, or better yet, provide output in a format that I can easily incorporate into my analysis processes.

Monday, January 05, 2015

What It Looks Like: Disassembling A Malicious Document

I recently analyzed a malicious document, by opening it on a virtual machine; this was intended to simulate a user opening the document, and the purpose was to determine and document artifacts associated with the system being infected.  This dynamic analysis was based on the original analysis posted by Ronnie from, using a copy of the document that Ronnie graciously provided.

After I had completed the previous analysis, I wanted to take a closer look at the document itself, so I disassembled the document into it's component parts.  After doing so, I looked around on the Internet to see if there was anything available that would let me take this analysis further.  While I found tools that would help me with other document formats, I didn't find a great deal that would help me this particular format.  As such, I decided to share what I'd done and learned.

The first step was to open the file, but not via MS Word...we already know what happens if we do that.  Even though the document ends with the ".doc" extension, a quick look at the document with a hex editor shows us that it's format is that of the newer MS Office document format; i.e., compressed XML.  As such, the first step is to open the file using a compression utility, such as 7Zip, as illustrated in figure 1.

Figure 1: Document open in 7Zip

As you can see in figure 1, we now have something of a file system-style listing that will allow us to traverse through the core contents of the document, without actually having to launch the file.  The easiest way to do this is to simply extract the contents visible in 7Zip to the file system.

Many of the files contained in the exported/extracted document contents are XML files, which can be easily viewed using viewers such as Notepad++.  Figure 2 illustrates partial contents for the file "docProps/app.XML".

Figure 2: XML contents

Within the "word" folder, we see a number of files including vbaData.xml and vbaProject.bin.  If you remember from blog post about the document,  there was mention of the string 'vbaProject.bin', and the Yara rule at the end of the post included a reference to the string “word/_rels/vbaProject.bin”.  Within the "word/_rels" folder, there are two files...vbaProject.bin.rels and document.xml.rels...both of which are XML-format files.  These documents describe object relationships within the overall document file, and of the two, documents.xml.rels is perhaps the most interesting, as it contains references to image files (specifically, "media/image1.jpg" and "media/image2.jpg").  Locating those images, we can see that they're the actual blurred images that appear in the document, and that there are no other image files within the extracted file system.  This supports our finding that clicking the "Enable Content" button in MS Word did nothing to make the blurred documents readable.

Opening the word/vbaProject.bin file in a hex editor, we can see from the 'magic number' that the file is a structured storage, or OLE, file format.  The 'magic number' is illustrated in figure 3.

Figure 3: vbaProject.bin file header

Knowing the format of the file, we can use the MiTeC Structured Storage Viewer tool to open this file and view the contents (directories, streams), as illustrated in figure 4.

Figure 4: vbaProject

Figure 5 illustrates another view of the file contents, providing time stamp information from the "VBA" folder.

Figure 5: Time stamp information

Remember that the original write-up regarding the file stated that the document had originally been seen on 11 Dec 2014.  This information can be combined with other time stamp information in order to develop an "intel picture" around the infection itself.  For example, according to VirusTotal, the malicious .exe file that was downloaded by this document was first seen by VT on 12 Dec 2014.  The embedded PE compile time for the file is 19 June 1992.  While time stamps embedded within the document itself, as well as the PE compile time for the 'msgss.exe' file may be trivial to modify and obfuscate, looking at the overall wealth of information provides analysts with a much better view of the file and its distribution, than does viewing any single time stamp in isolation.

If we continue navigating through the structure of the document, and go to the VBA\ThisDocument stream (seen in figure 4), we will see references to the files (batch file, Visual Basic script, and Powershell script) that were created within the file system on the infected system.

My goal in this analysis was to see what else I could learn about this infection by disassembling the malicious document itself.  My hope is that the process discussed in this post will serve as an initial roadmap for other analysts, and be extended in the future.

Tools Used
Hex Editor (UltraEdit)
MiTeC Structured Storage Viewer

Lenny Zeltser's blog - Analyzing Malicious Documents Cheat Sheet
Virus Bulletin presentation (from 2009)
Kahu Security blog post - Dissecting a Malicious Word document - upload documents for analysis
Python OLETools from Decalage
Trace Evidence Blog: Analyzing Weaponized RTF Documents

Addendum 6 Jan 2015 - Extracting the macro
I received a tip on Twitter from @JPoForenso to take a look at Didier Stevens' tools and, as a means for extracting the macro from the malicious document.  I first tried by itself, and that didn't work, so I started looking around for some hints on how to use the tools together.  I eventually found a tweet from Didier that had illustrated how to use these two tools together.  From there, I was able to extract the macro from within the malicious file.  Below are the steps I followed in sequence to achieve the goal of extracting the macro.

1. "C:\Python27> d:\tips\file.doc" gave me a listing of elements within the document itself.  From here, I knew that I wanted to look at "word/vbaProject.bin".

2.  "C:\Python27> -d d:\tips\file.doc word/vbaProject.bin" gave me a bunch of compressed stuff sent to the console.  Okay, so good so far.

3.  "C:\Python27> -d d:\tips\file.doc word/vbaProject.bin |" gave me some output that I could use, specifically:
  1:       445 'PROJECT'
  2:        41 'PROJECTwm'
  3: M   20159 'VBA/ThisDocument'
  4:      3432 'VBA/_VBA_PROJECT'
  5:       515 'VBA/dir'

Now, I've got something I can use, based on what I'd read about here.  At this point, I know that the third item contains a "sophisticated" macro.

4.  "C:\Python27> -d d:\tips\file.doc word/vbaProject.bin | -s 3 -v" dumps a bunch of stuff to the console, but it's readable.  Redirecting this output to a file (i.e., " > vba.txt") lets me view the entire macro.

Addendum 14 Jan 2015 - More Extracting the Macro
Didier posted this following image to Twitter recently, illustrating the use of

Tuesday, December 30, 2014

What It Looks Like: Malware Infection via a Weaponized Document

Okay...I lied.  This is my last blog post of 2014.

A couple of weeks ago, Ronnie posted regarding some analysis of a weaponized document to the blog.  There is some interesting information in the post, but I commented on Twitter that there was very little post-mortem analysis. In response, Ronnie sent me a copy of the document.  So, I dusted off a Windows 7 VM and took a shot at infecting it by opening the document.

Analysis Platform
32-bit Windows 7 Ultimate SP1, MS Office 2010, with Sysmon installed - VM running in Virtual Box.  As with previous dynamic analysis I've performed, Sysmon provides not only place holders to look for, but also insight into what can be trapped via a process creation monitoring tool.

Run Windows Updates, reboot to a clean clone of the VM, and double-click the document (sitting on the user profile desktop).  The user profile used to access the document had Admin-level privileges, but UAC had not been disabled.  After waiting a few moments after the launch of the document, the application (MS Word) was closed, and the VM shut down cleanly.

I purposely did not run a packet capture tool, as that was something that had been done already.

Initial attempts to view the file in a hex editor caused MSE to alert on TrojanDownloader:O97M/Tarbir.  After opening the file, waiting, and shutting down the VM cleanly, I created a timeline using file system, WEVTX, Prefetch, and Registry metadata.  I also created a separate micro-timeline from the USN change journal - I didn't want to overpopulate my main timeline and make it more difficult to analyze.

Also, when I extracted the file from the archive that I received, I named it "file.docx", based on the contents (the structure was not the older-style OLE format).  When I double-clicked the file, MS Word opened but complained that there was something wrong with the file.  I renamed the file to "file.doc", and everything ran in accordance with Ronnie's blog post.

As expected, all of the files that Ronnie mentioned were created within the VM, in the user's AppData\Local\Temp folder.  Also as expect, the timeline I created was populated by artifacts of the user's access to the file.  Since the "Enable Editing" button had to be clicked in order to enable macros (and run the embedded code), the TrustRecords key was populated with a reference to the file.  Keep in mind that many of the artifacts that were created (JumpList entries, Registry values, etc.) will persist well beyond the removal/deletion of the file and other artifacts.

While I did not capture any of the off-system communication (i.e., download of the malware), Sysmon provided some pretty interesting information.  I looked up the domain in Ronnie's post, and that gave me the IP address "50.63.213[.]1".  I then searched for that IP address in my timeline, and found one entry, from Sysmon...Powershell had reached off of the system (Sysmon/3 event) to that IP address (which itself translates to "[.]net"), on port 80.  Artifacts of Powershell's off-system communications were the HKLM/Software/Microsoft/Tracing/powershell_RASMANCS and HKLM/Software/Microsoft/Tracing/powershell_RASAPI32 keys being created.

Per Ronnie's blog post, the file "444.exe" is downloaded.  The file is deleted after being copied to "msgss.exe".  The strings within this file (msgss.exe) indicate that it is a Borland Delphi file, and contains the strings "GYRATOR" and "TANNERYWHISTLE" (refer to the icon used for the file).  The PE compile time for the file is 19 Jun 1992 22:22:17 UTC.  The VirusTotal analysis of this file (originally uploaded to VT on 12 Dec) can be found here.

Persistence Mechanism:  User's Run key; the value "OutLook Express" was added to the key, pointing to the msgss.exe file.

An interesting artifact of the infection occurred at the same time that the msgss.exe file was created on the system and the Run key value created so that the malware would persist; the key "HKCU/full" was created.  The key doesn't have any's just the key.

To extend Corey's discussion of Prefetch file contents just a bit, the Prefetch file for WinWord.exe included references to RASMAN.DLL, RASAPI32.DLL, as well as other networking DLLs (W2_32.DLL, WINHTTP.DLL).

Given the off-system communications, I located and extracted the WebCachev01.dat file that contains the IE history for the user, and opened it using ESE DatabaseView.  I found no indication of the host being contacted, via either IP address or name.  Additional testing is required but it would appear that the System.Net.WebClient object used by Powershell does not leave traces in the IE history (i.e., the use of the WinInet API for off-system communications would leave traces in the history).  If that's the case, then from an infrastructure perspective, we need to find another means of detecting this sort of activity, such as through process creation monitoring, the use of web proxies, etc.

1. Threat intel cannot be based on analysis in isolation.

Okay, I understand that this is just a single document and a single infection, and does not specifically represent an APT-style threat, but the point here is that you can't develop "threat intelligence" by analyzing malware in isolation.  In order to truly develop "threat intelligence", you have to look how the adversary operates within the entire infrastructure eco-system; this includes the network, memory, as well as on the host.

I'm also aware that "APT != malware", and that's absolutely correct.  The findings I've presented here are more indicators than intel, but it should be easy to see not just the value of the analysis, but also how it can be extended.  For example, this analysis might provide the basis for determining how an adversary initially gained access to an infrastructure, i.e., the initial infection vector (IIV).  Unfortunately, due to a number of variables, the IIV is often overlooked, or assumed.  When the IIV is assumed, it's often incorrect.  Determining the IIV can be used to see where modifications can be made within the infrastructure in order to improve prevent, detection, and response.

Looking specifically at the analysis of this weaponized document, Ronnie provided some insight, which I was then able to expand upon, something anyone could have done.  The focus of my analysis was  to look at how the host system was impacted by this malware; I can go back and redo the analysis (re-clone the VM), and run the test again, this time pausing the VM and capturing the memory for analysis via Volatility, and extend the understanding of the impact of this document and malware even further.  Even with just the timeline, the available indicators have been expanded beyond the domain and hash (SHA-256) that was available as of 15 Dec.  By incorporating this analysis, we've effectively moved up the Pyramid of Pain, which is something we should be striving to do.  Also, be sure to check out Aaron's Value of Indicators blog post.

2.  Host analysis significantly extends response capability.

The one big caveat from this analysis is the time delta between "infection" and "response"; due to the nature of the testing, that delta is minimized, and for most environments, is probably unrealistic.  A heavily-used system will likely not have the same wealth of data available, and most systems will very likely not have process creation monitoring (Sysmon).

However, what this analysis does demonstrate is, what is available to the responder should the incident be discovered weeks or months after the initial infection.  One of the biggest misconceptions in incident response is that host-based analysis is expensive and not worth the effort, that it's better to just burn the affected systems down and then rebuild them.  What this analysis demonstrates is that through host analysis, we can find artifacts that persist beyond the deletion/removal of various aspects of the infection.  For example, the file 444.exe was deleted, but the AppCompatCache and Sysmon data provided indications that the file had been executed on the system (the USN change journal data illustrated the creation and subsequent deletion of the file).  And that analysis doesn't have to be expensive, time consuming, or fact, it's pretty straightforward and simple, and it provides a wealth of indicators that can be used to scope an incident, even weeks after the initial infection occurred.

3.  Process creation monitoring radically facilitates incident response.

I used Sysmon in this test, which is a pretty good analog for a more comprehensive approach, such as Sysmon + Splunk, or Carbon Black.  Monitoring process creation lets us see command line arguments, parent processes, etc.  By analyzing this sort of activity, we can develop prevention and detection mechanisms.

This also shows us how incident response can be facilitated by the availability of this information.  Ever since my early days of performing IR, I've been asked what, in a perfect world, I'd want to have available to me, and it's always come back to a record of the processes that had been run, as well as the command line options used.  Having this information available in a centralized location would obviate the need to go host-to-host in order to scope the incident, and could be initially facilitated by running searches of the database.

Lenny Zeltser's Analyzing Malicious Documents Cheat Sheet

Monday, December 29, 2014

Final Post of 2014

As 2014 draws to a close, I thought I'd finish off the year with one last blog post.  In part, I'd like to thank some folks for their contributions over the past year, and to look forward to the coming year for what they (and others) may have in the coming year.

I wanted to thank two people in particular for their contributions to the DFIR field during 2014.  Both have exemplified the best in information sharing, not just in providing technical content but also in providing content that pushes the field toward better analysis processes.

Corey's most recent blog post continues his research into process hollowing, incorporating what he's found with respect to the Poweliks malware.  If you haven't taken a good look at his blog post and incorporated this into your analysis process yet, you should strongly consider doing so very soon.

Maria's post on time stomping was, as always, very insightful.  Maria doesn't blog often but when she does, there's always some great content.  I was glad to see her extend the rudimentary testing I'd done and blogged about, particularly because very recently, I'd seen an example of what she'd blogged about during an engagement I was working on.

Maria's also been getting a lot of mileage out of her Google cookies presentation, which I saw at the OSDFCon this year.  If you haven't looked at the content of her presentation, you really should.  In the words of Hamlet, "There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy", and I'm sure Maria was saying, "There are more things in a Windows image than are dreamt of in your timeline."

Tying both Corey and Maria's contributions together, I was doing some analysis recently regarding a particular malware variant that wrote it's files to one location, copied them to another, time stomped those files, and injected itself into the svchost.exe process.  This variant utilized the keystroke logging capability of the malware, and the keystroke log file was re-stomped after each successive update.  It was kind of like an early nerd Christmas gift to see what two well respected members of the community had talked about right there, in the wild.  In the words of one of my favorite characters, "Fascinating."

The year would not be complete without a huge THANK YOU to the Volatility folks for all they do, from the framework, to the book, to the training class.  2014 saw me not only attending the course, but also receiving a copy of the book.

On the whole, it might be fair to refer to 2014 (maybe just the latter half) as the "Year of the Shellbag Research".  Eric Zimmerman (Shellbag Explorer), Willi Ballenthin, Dan Pullega, and Joachim Metz should be recognized for the work they've been putting into analyzing and documenting shellbags.  To learn more about what Eric and others have done to further the parsing and analysis of shellbags, be sure to check out David Cowen's Forensic Lunch podcasts (28 Nov, 12 Dec).

Speaking of David Cowen, I still think that TriForce is a great example of the outcome of research in the field of forensic analysis.  Seriously.  I don't always use things like the USN change journal in my analysis...sometimes, quite simply, it's not applicable...but when I have incorporated into a timeline (by choice...), the data has proved to be extremely valuable and illuminating.

There are many others who have made significant contributions to the DFIR field over the past year, and I'm sure I'm not going to get to every one of them, but here are a few...

Ken Johnson has updated his file history research.
Basis Technology - Autopsy 3.1
Didier Stevens - FileScanner
Foxton Software - Free Tools
James Habben - Firefox cache and index parsers

Lateral Movement
Part of what I do puts me in the position of tracking a bad guy's lateral movement between systems, so I'm always interested in seeing what other analysts may be seeing.  I ran across a couple of posts on the RSA blog that discussed confirming Remote Desktop Connections (part 1part 2).  I'm glad to see someone use RegRipper, but I was more than a little surprised that other artifacts associated with the use of RDP (either to or from a system) weren't mentioned, such as RemoteConnectionManager Windows Event Log records, and JumpLists (as described in this July, 2013 blog post).

One of the things that I have found...interesting...over time is the number of new sources of artifacts that get added to the Windows operating system with each new iteration.  It's pretty fascinating, really, and something that DFIR analysts should really take advantage of, particularly when we no longer have to rely on a single artifact (a login record in the Security Event Log) as an indicator, but can instead look to clusters of artifacts that serve to provide an indication of activity.  This is particularly valuable when some of the artifacts within the cluster are not available...the remaining artifacts still serve as reliable indicators.

WRA Contest
Finally, as the year draws to a close, here's an update on the WRA 2/e Contest. To date (in over 2 months) there has been only a single submission.  I had hoped that the contest would be much better received (no coding required), but alas, it was not to be the case.

Monday, December 08, 2014

10 Years of Blogging

That's first blog post was ten years ago today.  Wow.

Over the passed ten years, some things have changed, and others haven't.

As the year comes to a close, don't forget about the WRF 2/e Contest.

Thursday, October 23, 2014

WRF 2/e Contest

I recently posted that Syngress has agreed to publish a second edition of Windows Registry Forensics, and in that post, I mentioned that I wanted to provide those in the community with an opportunity to have input into the content of the book prior to it being published.  I know that it's only been a couple of days since the post was published, but historically, requests like these haven't really panned out.  As such, I wanted to take something of a different the recommendation of a friend, and stealing a page from the Volatility folks, I'm starting a contest for submissions of "case studies" to appear in the second edition.

So what I'm looking for is submissions of detailed case studies (or "write-ups", "war stories", etc...I don't want to get tangled up on the terminology) of your triumphs via and innovations in Registry analysis.

Please read through this entire blog post before sending in a submission.
What I don't want is case information, user and system names, etc.  Please provide enough detail in your write-up to give context, but not so much that case information is exposed and privacy is violated.

For the moment, I plan to accept submissions until midnight, 31 Dec 2014.  I may extend that in the really depends on how the schedule for the book writing works out, how far I get, how many submissions come in, etc.  The really good submissions will be included in the book, and the author of the submission will received a signed copy of the book.  And yes, when I say "signed", I mean by me.  That also means that your submission needs to include a name and email address, so that I can reach back to you, if your submission is accepted, and get your mailing address.

I'm looking for the top 10 or so submissions; however, if there are more really good ones than just ten, I'll consider adding them, as well.

Consideration will be given to...
Those submissions that require the least effort to incorporate into the book, with respect to spelling and grammar.  I'm all about cut-and-paste, but I don't want to have the copy editor come back with more modifications and edits than there is original text.  I can take care of incorporating the submission into the book in the correct format, but I don't want to have to spend a great deal of time correcting spelling and grammar.

Those submissions that are more complete and thorough, illustrating the overall process.  For example, "...I looked at this value..." or "...I ran RegRipper..." isn't nearly as useful as correlating multiple Registry keys and values, even with other data sources (i.e., Windows Event Logs, etc.).

Those submissions that include more than just, "...I used RegRipper..." or "...I used auto_rip...".  Submissions should talk about how tools (any tools, not just the ones mentioned...) were used.

Those submissions that include process, data, results, RR plugins used, created, or modified, etc.

Note that if you include the newly created or modified plugin along with your submission, the plugin will be added to the RR distribution.

Send submissions so to me as text.  Use "WRF 2/e contest submission" as the subject line.   If you have images (screen captures, etc.) that you'd like to share, reference the image in the text ("insert figure 1 here"), and provide the image in TIFF format.

If you have multiple files (the write-up, a plugin, images, etc.), just zip them up.

Please include your name along with the information.  If you do not want your name included in the content when it's added to the book, please specify as such...however, anonymous submissions will not be considered, as I may want to reach back to you and ask a clarifying question (or two).  So, please also be willing to answer questions!  ;-)

Please let me know if it would be okay to post the submission to this blog, and if so, should your name be included (or not).

If you have any questions about this contest, please feel free to ask.

Wednesday, October 22, 2014

RegRipper v2.8 is now on GitHub

RegRipper v2.8 is now available on GitHub.

From this point forward, this repository should be considered THE repository for RegRipper version 2.8.  If you want a copy of RegRipper, just click the "Download ZIP" button on the right of the browser window, and save the file...doing so, you'll have the latest-and-greatest set of plugins available.

If you have any questions, please feel free to contact me.