Tuesday, September 24, 2013

Links - Malware Edition

Bromium Labs posted some pretty interesting, and very useful, information regarding another Zeus variant.  For example, the phishing attack(s) reportedly targeted folks in the publishing industry.

What I found most interesting is that the variant, with all of its new capabilities, still uses the Run key for persistence.

Looking at other artifacts that can be used to detect malware on systems, see this Zeus write up from SecureWorks...one of the listed capabilities is that it can modify the hosts file on the system.  MS KB 172218 illustrates why this is important...but it's also something that can be checked with a very quick query.
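
That quick query is easy to script; here's a minimal sketch that flags any active (non-comment) hosts file entry that doesn't map to localhost.  Point it at the hosts file from a mounted image (on a live system, %SystemRoot%\system32\drivers\etc\hosts).

```python
# Quick triage of a hosts file: flag any active (non-comment) entry
# that does not map back to localhost.
def suspicious_hosts_entries(text):
    hits = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if ip not in ("127.0.0.1", "::1"):
            hits.append((ip, names))
    return hits
```

Anything this returns isn't necessarily malicious, but on most systems the hosts file contains nothing beyond the localhost defaults, so any hit is worth a look.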

Speaking of write ups, I really enjoyed reading this one from System Forensics, for a couple of reasons:

First, the author documents the setup, so that not only can you see what they're doing, but the analysis can also be replicated.

Second, the author uses the tools available to document the malware being analyzed.  For example, they use PEView to determine information about the sample, including the fact that it's compiled for 32-bit systems.  This is pretty significant information, particularly when it comes to where one (DF analyst, first responder) will look for artifacts.  Fortunately, the system on which the malware was run is also 32-bit, so analysis will be pretty straightforward.  It does seem very interesting to me that most malware analysts/RE folks appear to use 32-bit Windows XP when they conduct dynamic analysis.
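
Determining whether a sample was compiled for 32- or 64-bit systems doesn't require a full PE parser; here's a minimal sketch of reading the Machine field from the COFF header, similar to what PEView displays (offsets per the published PE/COFF format):

```python
import struct

# Read the Machine field from a PE file's COFF header: follow e_lfanew
# from the DOS header to the "PE\0\0" signature, then read the 16-bit
# Machine value immediately after it.
MACHINE = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}

def pe_machine(data):
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    (machine,) = struct.unpack_from("<H", data, e_lfanew + 4)
    return MACHINE.get(machine, hex(machine))
```
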

Again, we see that this variant uses the Run key (in this case, in the user context) for persistence.
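
Once you've extracted the Run key values (via RegRipper, reg query, etc.), a quick triage pass is easy to script.  This sketch is purely my own illustration; the "suspect" directory list is an assumption for demonstration, not a definitive indicator set.

```python
# Illustrative triage of Run key values (e.g., as extracted by
# RegRipper); the SUSPECT_DIRS list below is an assumption for
# demonstration purposes only.
SUSPECT_DIRS = ("\\temp\\", "\\application data\\", "\\appdata\\", "\\recycler\\")

def flag_run_values(values):
    """values: dict mapping Run key value names to their command lines."""
    flagged = {}
    for name, cmd in values.items():
        if any(d in cmd.lower() for d in SUSPECT_DIRS):
            flagged[name] = cmd
    return flagged
```
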

Finally, they performed Prefetch file analysis to gather some more information regarding what the malware actually does on a system.

A couple of thoughts about the analysis:
Had the author run the malware and then copied off the MFT, they might have recovered the batch files as they're very likely resident files.  Even if the files had been marked as not in use (i.e., deleted), had they responded quickly enough, they might have been able to retrieve the MFT before those records were overwritten.

The author states in the article, "The stolen data is sent via the C&C using the HTTP protocol."; I tend to believe that it can be valuable to know how this is done.  For example, if the WinInet API is used, there may be a set of artifacts available that can assist analysts in gaining some information regarding the malware.

Corey released his Tr3Secure data collection script.  Open the batch file up in Notepad to see what it does...Corey put a lot of work into the script, and apparently got some great input from others in the community - see what happens when you share what you're doing, and someone else actually takes the time to look at it, and comment on it?  Stuff just gets better.

If you're grabbing memory during IR, you might want to take a look at SketchyMoose's Total Recall script.  Corey's script gives you the capability to dump memory, and SketchyMoose's script can help make analysis a bit easier, as well.

Adam has posted another in a series (four total, thus far) of blog posts regarding persistence mechanisms.  There is some pretty interesting stuff so far, and I was recently looking at a couple of his posts, in particular #3 in the series.  I tried out some of the things he described under the App Paths section of the post, and I couldn't get them to work.  For example, I tried typing "pbrush" and "pbrush.exe" at the command prompt, and just got the familiar, "'pbrush' is not recognized as an internal or external command, operable program or batch file."  I also added calc.exe as a key name and, for the default value, added the full path to Notepad.exe (in the system32 folder), then tried launching calc.exe...each time, the Calculator would launch.  I have, however, seen the AppCertDlls key reportedly used for persistence in malware write ups, which is why I wrote the RegRipper plugin to retrieve that information.

Update: Per an exchange that I had with Adam via Twitter (good gawd, Twitter is a horrible way to try to either explain things, or get folks to elaborate on something...people, don't use it for that...), apparently, the App Paths "thing" that Adam pointed out only works if you try to run the command via the Run box ("through the shell"), and doesn't work if you try to run it from the command prompt.

Monday, September 23, 2013

Shell Item Artifacts

I was watching the 9/20 Forensic Lunch with David Cowen and crew recently, and when Jonathan Tomczak of TZWorks was initially speaking, there was a discussion of MFT file reference numbers found in shellbags artifacts.  Jonathan pointed out that these artifacts are also found in Windows shortcut/LNK files and Jump Lists.  From there, Dave posed a question (which I think was based off of the mention of Jump Lists), asking if this was an artifact specifically related to Windows 7.  As it turns out, this isn't so much a function of Windows 7 as of how shell items are crafted on different versions of Windows; if you remember this post, shell items are becoming more and more prominent on Windows platforms.  They've existed in shellbags and LNK files since XP, and as of Windows 7, they can be found in Jump Lists (streams in Jump Lists are, with the exception of the DestList stream, LNK format).  Windows 8 has Jump Lists, as well, and thanks to Jason's research, we know that LNK-formatted data can also be found in the Registry.

Shell Items in the Registry
There are a number of RegRipper plugins that parse shell items: menuorder.pl, comdlg32.pl (for Vista+ systems), itempos.pl, shellbags.pl, and photos.pl (for Windows 8 systems). This simply illustrates how pervasive shell items are on the different versions of Windows.

Willi mentions the existence of these artifacts here, in his description of the ItemPos* section of the post; look for the section of code that reads:

if (ext_version >= 0x0007) {
        FILEREFERENCE file_ref;

What this says, essentially, is that for certain types of shell items, when a specific ext_version value is found (in this case, 7 or greater, which indicates Vista or later...), there may be a file reference available within the shell item.  I say "may be" to reiterate Jonathan's comments; I have only looked at a very limited set of artifacts, and Jonathan made no specific reference to the types of shell items that did or did not contain file reference numbers.

This is also mentioned in Joachim Metz's Windows Shell Item format specification, specifically on pg 25, within the discussion of the extension block.  Joachim has put a lot of effort into documenting a great deal of information regarding the actual structure of a good number of shell items; in his documentation, if the ext_version is 7 or greater, certain types of shell items appear to contain the MFT file reference.

So, again...this is not something that you should expect to see in all types of shell items...many types of shell items simply will not contain this information.  However, those shell items that point to files and folders...type 0x31, 0x32, 0xB1, etc...and those on Vista systems and beyond...may contain MFT file reference numbers.

I had a quick chat with David, and he pointed out that making use of the MFT file reference number from within the shellbags artifacts can show you what existed on the system at some point in the past, as the file reference number is essentially the MFT record number concatenated with the sequence number for the record.  This ties in very well with David's TriForce analysis methodology, and can be extremely valuable to an examiner.
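
Decoding the file reference number is straightforward; the low 48 bits hold the MFT record number, and the high 16 bits hold the sequence number:

```python
# Split an NTFS file reference number into its two components.
# Comparing the sequence number against the current MFT record shows
# whether that record has since been reused for a different file.
def split_file_reference(ref):
    record_number = ref & 0xFFFFFFFFFFFF    # low 48 bits
    sequence_number = (ref >> 48) & 0xFFFF  # high 16 bits
    return record_number, sequence_number
```
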

The only shortcoming I can see here is that the time stamps embedded within these shell items are not of the same granularity as the time stamps found within the MFT; see this MS API for translating FILETIME time stamps to DOSDate format, which is how the time stamps are stored in the shell items.  As such, the time values will be different from what's found in the MFT.
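
The granularity difference is easy to demonstrate: FILETIME values count 100-nanosecond intervals since 1601, while the DOSDate format packs the date and time into two 16-bit words with only 2-second resolution.  A sketch of the round trip:

```python
from datetime import datetime, timedelta

# FILETIME: 100-ns intervals since 1601-01-01. DOSDate: two 16-bit
# words (date: day/month/year-1980; time: seconds/2, minutes, hours),
# so seconds are truncated to even values.
EPOCH_1601 = datetime(1601, 1, 1)

def filetime_to_datetime(ft):
    return EPOCH_1601 + timedelta(microseconds=ft // 10)

def datetime_to_dosdate(dt):
    d = ((dt.year - 1980) << 9) | (dt.month << 5) | dt.day
    t = (dt.hour << 11) | (dt.minute << 5) | (dt.second // 2)
    return d, t

def dosdate_to_datetime(d, t):
    return datetime((d >> 9) + 1980, (d >> 5) & 0x0F, d & 0x1F,
                    t >> 11, (t >> 5) & 0x3F, (t & 0x1F) * 2)
```

Run a timestamp through the conversion and back, and an odd second comes out truncated to the even value below it...which is exactly why the shell item time stamps won't line up exactly with the MFT.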

Thursday, September 12, 2013

Forensic Perspective

We all have different perspectives on events, usually based on our experiences.  When I was a member of the ISS ERS team, I tried to engage the X-Force Vulnerability folks in discussions regarding the exploits they developed.  I figured that they needed to test them, and that they used virtual systems to do so...what I wanted to do was get access to the images of those virtual systems after an exploit had successfully been developed, so that I could examine the image for artifacts directly associated with the exploit.  The perspective of the folks who wrote the exploit seemed to be that if the exploit worked, it worked.  As a DFIR analyst, my perspective was, how can I be better prepared to serve and support my customers?

We know that when a Windows system is in use (by a user or an attacker), there is stuff that goes on while other stuff goes on, and this will often result in indirect artifacts...stuff that folks who are not DFIR analysts might not consider.  For example, I ran across this post a bit ago regarding NetTraveler (the post was written by a malware analyst); being a DFIR analyst, I submitted the link to social media, along with the question of whether the download of "new.jar" caused a Java deployment cache index (*.idx) file to be created.  From my perspective, and based on my experience, I may one day respond to a customer infected with something like this, perhaps a newer version; in the face of AV not detecting the malware, I would be interested in finding other artifacts that might indicate an infection...something like an *.idx file.
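
Checking for that artifact is a quick file system query.  As a sketch (the cache locations mentioned in the comments are the typical defaults; point this at the appropriate profile path within a mounted image):

```python
from pathlib import Path

# Enumerate Java deployment cache index (*.idx) files beneath a given
# root. The cache typically lives under "Application Data\Sun\Java\
# Deployment\cache" on XP, and "AppData\LocalLow\Sun\Java\Deployment\
# cache" on Vista and later.
def find_idx_files(cache_root):
    return sorted(str(p) for p in Path(cache_root).rglob("*.idx"))
```
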

Forensic Perspective
I ran across this post on the Carnal0wnage blog, which describes a method for modifying a compromised system (during a pen test) so that passwords will be collected as they are changed.  A couple of things jumped out at me from the post...

First, the Registry modification would be picked up by the RegRipper lsa_packages.pl plugin.  So, if you're analyzing an acquired image, or a Registry file extracted from a live system, or if you've deployed F-Response as part of your response, you're likely to see something amiss in this value, even if AV doesn't detect the malware itself.

Second, the code provided for the associated DLL not only writes the captured passwords to a file, but also uses the WinInet API to send the information off of the system.  This results in an entry being made into the IE history index.dat file for the appropriate account.  By "appropriate", I mean whichever privilege level the code runs under; on XP systems, I've seen infected systems where the malware ran with System-level privileges and the index.dat file in the "Default User" profile was populated.  I analyzed a Windows 2008 R2 system not long ago that was infected with ZeroAccess, and the click-fraud URLs were found in the index.dat file in the NetworkService profile.

If you haven't seen it yet, watch Mudge's comments at DefCon21...he makes a very good point regarding understanding perspectives when attempting to communicate with others.


Jason Hale has a new post over on the Digital Forensics Stream blog, this one going into detail regarding the Search History artifacts associated with Windows 8.1.  In this post, Jason points out a number of artifacts, so it's a good idea to read it closely.  Apparently, with Windows 8.1, LNK files are used to maintain records of searches. Jason also brought us this blog post describing the artifacts of a user viewing images via the Photos tile in Windows 8 (which, by the way, also makes use of LNK streams...).

Claus is back with another interesting post, this one regarding Microsoft's Security Essentials download.  One of the things I've always found useful about Claus's blog posts is that I can usually go to his blog and see links to some of the latest options with respect to anti-virus applications, including portable options.

Speaking of artifacts, David Cowen's Daily Blog #81 serves as the initiation of the Encyclopedia Forensica project.  David's ultimate goal with this project is to document what we know, from a forensic analysis perspective, about major operating systems so that we can then determine what we don't know.  I think that this is a very interesting project, and one well worth getting involved in, but my fear is that it will die off too soon, from nothing more than lack of involvement.  There are a LOT of folks in the DFIR community, many of whom would never contribute to a project of this nature.

One of perhaps the biggest issues regarding knowledge and information sharing within the community, that I've heard going back as far as WACCI 2010 and beyond, is that too many practitioners simply feel that they have no means of contributing to the community.  Some want to, but can't be publicly linked to what they share.  Whatever the reason, there are always ways to contribute.  For example, if you don't want to request login credentials on the ForensicsWiki and actually write something, how about suggesting content (or clarity or elaboration on content) or modifications via social media (Twitter, G+, whatever...even directly emailing someone who has edited pages)?

Like working forensic challenges, or just trying to expand your skills?  I caught this new DFIR challenge this morning via Twitter, complete with an ISO download.  This one involves a web server, and comes with 25 questions to answer.  I also have some links to other resources on the FOSS Tools page for this blog.

Speaking of challenges, David Cowen's been continuing his blog-a-day challenge, keeping with the Sunday Funday challenges that he posts.  These are always interesting, and come with prizes for the best, most complete answers.  These generally don't include images, and are mostly based on scenarios, but they can also be very informative.  It can be very beneficial to read winning answers come Monday morning.

I ran across this extremely interesting paper authored by Dr. Joshua James and Pavel Gladyshev, titled Challenges with Automation in Digital Forensics Investigations.  It's a bit long, with the Conclusions paragraph on pg. 14, but it is an interesting read.  The paper starts off by discussing "push-button forensics" (PBF), then delves into the topics of training, education, licensing, knowledge retention, etc., all issues that are an integral part of the PBF topic.

I fully agree that there is a need for intelligent automation in what we do.  Automation should NOT be used to make "non-experts useful"...any use of automation should be accompanied with an understanding of why the button is being pushed, as well as what the expected results should be so that anomalies can be recognized.

It's also clear that some of what's in the paper relates back to Corey's post about his journey into academia, where he points out the difference between training and education.

I ran across a link to Mudge's comments at DefCon21.  I don't know Mudge, and have never had the honor of meeting him...about all I can say is that a company I used to work for used the original L0phtCrack...a lot.  Watching the video and listening to the stories he shared was very interesting, in part because one of the points he made was getting out and engaging with others, so that you can see their perspectives.

Monday, September 02, 2013

Data Structures, Revisited

A while back, I wrote this article regarding understanding data structures.  The importance of this topic has not diminished with time; if anything, it deserves much more visibility.  Understanding data structures provides analysts with insight into the nature and context of artifacts, which in turn provides a better picture of their overall case.

First off, what am I talking about?  When I say, "data structures", I'm referring to the stuff that makes up files.  Most of us probably tend to visualize files on a system as being either lines of ASCII text (*.txt files, some log files, etc.), or an amorphous blob of binary data.  We may sometimes even visualize these blobs of binary data as text files, because of how our tools present the information found in those blobs.  However, as we've seen over time, there are parts of these blobs that can be extremely meaningful to us, particularly during an examination.  For example, in some of these blobs, there may be an 8-byte sequence that is the FILETIME format time stamp that represents when a file was accessed, or when a device was installed on a system.

A while back, as an exercise to learn more about the format of the IE (version 5 - 9) index.dat file, I wrote a script that would parse the file based on the contents of the header, which includes a directory table that points to all of the valid records within the file, according to information available on the ForensicsWiki (thanks to Joachim Metz for documenting the format, the PDF of which can be found here).  Again, this was purely an exercise for me, and not something monumentally astounding...I'm sure that we're all familiar with pasco.  Using what I'd learned, I wrote another script that I could use to parse just the headers of the index.dat as part of malware detection, the idea being that if a user account such as "Default User", LocalService, or NetworkService has a populated index.dat file, this would be an indication that malware on the system is running with System-level privileges and communicating off-system via the WinInet API.  I've not only discussed this technique on this blog and in my books, but I've also used this technique quite successfully a number of times, most recently to quickly identify a system infected with ZeroAccess.
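
The header check itself amounts to just a few lines.  Per Joachim Metz's format documentation, the signature string is followed by the file size, a 32-bit value at offset 0x1C; this sketch assumes that layout, so verify it against the spec before relying on it:

```python
import struct

# Header-only triage of an index.dat file: confirm the signature, then
# read the 32-bit file size at offset 0x1C. An index.dat in a service
# account profile that has grown beyond the default allocation is an
# indicator worth a closer look.
SIGNATURE = b"Client UrlCache MMF Ver 5.2\x00"

def index_dat_size(header):
    if not header.startswith(SIGNATURE):
        raise ValueError("not a recognized index.dat header")
    (file_size,) = struct.unpack_from("<I", header, 0x1C)
    return file_size
```
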

More recently, I was analyzing a user's index.dat, as I'd confirmed that the user was using IE during the time frame in question.  I parsed the index.dat with pasco, and did not find any indication of a specific domain in which I was interested.  I tried my script again...same results.  Exactly.  I then mounted the image as a read-only volume and ran strings across the user's "Temporary Internet Files" subfolders (with the '-o' switch), looking specifically for the domain name...that command looked like this:

C:\tools>strings -o -n 4 -s "<path to Temporary Internet Files>" | find /i "domain"

Interestingly enough, I got 14 hits for the domain name in the index.dat file.  Hhhhmmmm....that got me to thinking.  Since I had used the '-o' switch in the strings command, the output included the offsets within the file to the hits, so I opened the index.dat in a hex editor and manually scrolled on down to one of the offsets; in the first case, I found full records (based on the format specification that Joachim had published).  In another case, there was only a partial record, but the string I was looking for was right there.  So, I wrote another script that would parse through the file, from beginning to end, and locate records without using the directory table.  When the script finds a complete record, it will parse it and display the record contents.  If the record is not complete, the script will dump the bytes in a hex dump so that I could see the contents.  In this way, I was able to retrieve 10 complete records that were not listed in the directory table (and were essentially deleted), and 4 partial records, all of which contained the domain that I was looking for.
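
The approach of that second script can be sketched pretty simply: rather than walking the directory table, scan the raw data for record signatures ("URL ", "LEAK", "REDR") and read each record's size, which is stored as a count of 128-byte blocks immediately following the signature.  This isn't my original script, just a minimal illustration of the technique:

```python
import struct

# Carve index.dat records without the directory table: scan the raw
# data for record signatures; each record header is a 4-byte signature
# followed by the record size as a count of 0x80-byte blocks.
RECORD_SIGS = (b"URL ", b"LEAK", b"REDR")

def carve_records(data):
    records, pos = [], 0
    while True:
        hits = [h for h in (data.find(s, pos) for s in RECORD_SIGS) if h != -1]
        if not hits:
            break
        off = min(hits)
        if off + 8 > len(data):
            break
        (nblocks,) = struct.unpack_from("<I", data, off + 4)
        records.append((off, data[off:off + 4].decode("ascii"), nblocks * 0x80))
        pos = off + 4
    return records
```

A full implementation would validate each hit against the record structure and hex-dump partial records, as described above, but the scan-then-parse loop is the core of it.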

Microsoft refers to the compound file binary file format as a "file system within a file", and if you dig into the format document just a bit, you'll start to see why...the specification details sectors of two sizes, not all of which are necessarily allocated.  This means that you can have strings and other data buried within the file that are not part of the file when viewed through the appropriate application.
CFB Format
The Compound File Binary Format document available from MS specifies the use of a sector allocation table, as well as a small sector allocation table. For Jump Lists in particular, these structures specify which sectors are in use; mapping the ones that are in use, and targeting just those sectors within the file that are not in use can allow you to recover potentially deleted information.
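
A sketch of the first step...pulling the sector sizes from the header (field offsets per the MS-CFB document)...looks like this; with those values in hand, you can walk the FAT and miniFAT chains and diff the referenced sectors against the file as a whole:

```python
import struct

# Read the sector sizes from a compound file binary header. Any sector
# not referenced by an active FAT/miniFAT chain is a candidate for
# holding residual ("deleted") data.
CFB_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"

def cfb_sector_sizes(header):
    if header[:8] != CFB_SIGNATURE:
        raise ValueError("not a compound file binary file")
    # sector shift at offset 30, mini sector shift at offset 32
    sector_shift, mini_shift = struct.unpack_from("<HH", header, 30)
    return 1 << sector_shift, 1 << mini_shift  # typically (512, 64)
```
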

MS Office documents no longer use this file format specification, but it is used in *.automaticDestinations-ms Jump Lists on Windows 7 and 8. The Registry is similar, in that the various "cells" that comprise a hive file can allow for a good bit of unallocated or "deleted" data...either deleted keys and values, or residual information in sectors that were allocated to the hive file as it continued to grow in size.  MS does a very good job of making the Windows XP/2003 Event Log record format structure available; as such, Event Logs from these systems can be parsed on a binary basis, not only to locate valid records within the .evt file that are "hidden" by the information in the header, but also to recover records from unallocated space and other unstructured data.  MFT records have been shown to contain useful data, particularly as a file moves from being resident to non-resident (specific to the $DATA attribute), and that can be particularly true for systems on which MFT records are 4K in size (rather than the 1K that most of us are familiar with).
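
Carving those Event Log records can be sketched in a few lines: each record begins with a 4-byte length immediately followed by the "LfLe" magic number, and a valid record repeats the length as its final DWORD, which makes for a handy sanity check:

```python
import struct

# Carve Windows XP/2003 Event Log records from unstructured data: scan
# for the "LfLe" magic, read the 4-byte record length just before it,
# and confirm the length is repeated as the record's final DWORD.
EVT_MAGIC = b"LfLe"

def carve_evt_records(data):
    records, pos = [], 0
    while True:
        off = data.find(EVT_MAGIC, pos)
        if off == -1:
            break
        pos = off + 4
        if off < 4:
            continue
        start = off - 4
        (length,) = struct.unpack_from("<I", data, start)
        if length < 8 or start + length > len(data):
            continue
        (end_len,) = struct.unpack_from("<I", data, start + length - 4)
        if end_len == length:
            records.append((start, data[start:start + length]))
    return records
```

Each carved record can then be handed to a normal .evt record parser; the point is simply that the records are self-describing, so the header's "valid record" bookkeeping can be ignored entirely.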

Understanding data structures can help us develop greater detail and additional context with respect to the available data during an examination.  We can recover data from within files that is not "visible" in a file by going beyond the API.  Several years ago, I was conducting a PCI forensic audit, and found several potential credit card numbers "in" a Registry hive...understanding the structures within the file, and a bit of a closer look revealed that what I was seeing wasn't part of the Registry structure, but instead part of the sectors allocated to the hive file as it grew...they simply hadn't been overwritten with key and value cells yet.  This information had a significant impact on the examination.  In another instance, I was trying to determine which files a user had accessed, and found that the user did not have a RecentDocs key within their NTUSER.DAT; I found this to be odd, as even a newly-created profile will have a RecentDocs key.  Using regslack.exe, I was able to retrieve the deleted RecentDocs key, as well as several subkeys and values.
Understanding the nature of the data that we're looking at is critical, as it directs our interpretation of that data. That interpretation will not only direct subsequent analysis, but also shape our conclusions; without an understanding of the underlying data structures, both are suspect. Is that credit card number, which we found via a search, actually stored in the Registry as value data? Just because our search utility located it within the physical sectors associated with a particular file name, do we understand enough about the file's underlying data structures to understand the true nature and context of the data?