Tuesday, July 23, 2013

HowTo: Investigate an Online Banking Fraud Incident

A recent comment over on Google Plus caught my attention, and I thought it was important enough to incorporate into a HowTo post.  The comment was made with respect to the HowTo: Detecting Persistence Mechanisms post, and had to do with another means of persistence associated specifically with (according to the person who left the comment) online banking fraud.

Online banking fraud is a significant issue.  It's long been known that criminals go where the money is, and there are some interesting aspects with regard to this criminal activity.  For example, when criminals target the computers used by small businesses to perform online banking, very often no investigation is done.  After all, a small business has just lost a significant amount of money, and possibly gone out of business...who's going to pay to have a thorough examination of the system performed?  Further, in the US, small businesses are not protected in the same manner as individuals, so once the money's gone, it's gone.

Small businesses can take a number of steps in order to protect themselves from online banking fraud; however, there is a lack of information and intel available that law enforcement can use to pursue the criminals, simply due to the fact that the systems used do not seem to be examined.  A thorough examination can be, and should be, conducted immediately so that law enforcement has the information that they need to pursue an investigation.

Brian Krebs has discussed the Zeus malware quite extensively in his blog, particularly with respect to online banking fraud. More often than not, once Brian has been contacted, the systems have likely already been wiped and repurposed, without any sort of examination having been conducted.

However, there're more ways of achieving this sort of criminal activity than simply using Zeus; W32/Crimea is one example.  During the infection process, the PE header of imm32.dll is modified (for Windows XP, Windows File Protection is disabled temporarily) to point to a malicious DLL, which is loaded when imm32.dll is loaded.  Imm32.dll is associated with keyboard interaction, so it's loaded by a number of processes, including the web browser.  The malicious DLL focuses specifically on retaining information entered into browser form fields associated with specific sites (i.e., online banking sites).  The collected information is not stored locally; instead, it is immediately shuttled off of the system.  As such, the artifacts are minimal, at best.  The most significant artifacts were found through examination of the malicious DLL itself, which then led to findings in the pagefile.

Another example is one pointed out by Sandro Suffert on Google Plus...he mentioned that a "downloader" had modified two Registry settings (modified or created two values):

- Within the HKLM\Software\Microsoft\Windows\CurrentVersion\Internet Settings key, the AutoConfigURL value pointed to either a local or remote .pac file

- Within the HKCU\Software\Policies\Microsoft\Internet Explorer key, the Autoconfig value was set to 0x1.

I cannot attest to these key paths or values being correct, as I have not seen the data.  However, this is an interesting technique to use, as Sandro pointed out that particularly with a remote .pac file, there's no actual malware on the system, and therefore no file for AV to alert on.  Yet, this technique allows the bad guys to capture information using a man-in-the-middle attack.
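As a rough illustration of how a check for these two values could be automated, here is a minimal Python sketch.  It assumes the values have already been extracted from the hive files by whatever Registry parser you use; the function name and the input format (a dictionary keyed by key path and value name) are hypothetical, and the key paths are simply those reported above, which have not been verified against data.

```python
def flag_proxy_autoconfig(values):
    """Return alert strings for proxy auto-config settings.

    `values` maps (key_path, value_name) -> data. This input format is an
    assumption for illustration, not the output of any particular tool.
    """
    alerts = []
    for (key_path, name), data in values.items():
        key = key_path.lower()
        value = name.lower()
        # AutoConfigURL pointing to a local or remote .pac file
        if key.endswith("internet settings") and value == "autoconfigurl":
            alerts.append(f"AutoConfigURL set to {data} in {key_path}")
        # Autoconfig policy value set to 0x1
        if key.endswith("internet explorer") and value == "autoconfig" and data == 1:
            alerts.append(f"Autoconfig enabled (0x1) in {key_path}")
    return alerts
```

Any hit is only a starting point; a legitimately managed proxy configuration will produce the same alert, so the .pac file itself still needs to be reviewed.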

Similar techniques are used by Trojan-Banker.Win32.Banbra, Troj-MereDrop, as well as this baddie from ThreatExpert.

As Hamlet said to Horatio, "...there are more things in heaven and earth than are dreamt of in your philosophy..."; Win32/Theola is a Chrome plugin that is used to commit online banking fraud.

So...in order to investigate a potential online banking fraud issue, as soon as this issue is suspected (or a small business is notified of such an issue), immediately sit down with the employee responsible for conducting online banking and determine all of the systems that they used for this activity.  You may find that they have one system from which they conduct this activity, or you may find out that they had an issue at some point and used another system.  Immediately isolate each of those systems, and depending upon the timeframe of the fraudulent activity, acquire a dump of physical memory from the system.  Then, acquire an image of the system and conduct a thorough examination, or contact someone who can.

If you create a timeline of system activity, it should go without saying that you should focus your attention on activity that occurred prior to the date of the fraudulent transaction (or the first one, if there are several).

MS KB: How to reset your IE proxy settings

HowTo: Determine/Detect the use of Anti-Forensics Techniques

The use of anti-forensics techniques to hide malicious activity (malware installation, intrusion, data theft, etc.) can be something of a concern during an examination; in fact, in some cases, it's simply assumed when particular data or artifacts can't be found.  It's easy to assume that these techniques were used when we look at a very limited range of artifacts; however, as we begin to incorporate additional and more comprehensive data sources into our analysis processes, we begin to be able to separate out the anti-forensics signal from the noise.

The term "anti-forensics" can refer to a lot of different things.  When someone asks me about this topic, I generally try to get them to describe to me what they're referring to, and to be more specific.  As with anything else, nomenclature can be important, and messages get scrambled when the use of terms becomes too loose.  Rather than address this as a broad topic, I thought we'd take a look at some of the common techniques used to hide evidence on or remove it from a system...

One of perhaps the most publicly discussed anti-forensic techniques is referred to as time stomping, in part due to the name of the tool used to demonstrate this capability.  While this initially threw a monkey wrench into our analysis processes, it was quickly realized that the use of this sort of technique (and tool) could be detected.  Then, as things tend to go in any eco-system, there was an adaptation to the technique...rather than modifying a 64-bit time stamp with a 32-bit value, the technique was adapted to copy the file times from kernel32.dll onto the target file, preserving 64-bit granularity.  Once again, analysis techniques were updated.  For example, in 2009, Lance Mueller talked about detecting the use of time changing utilities in his blog.  There's been discussion regarding techniques for changing the $FILE_NAME attribute time stamps, as well as those within the $STANDARD_INFORMATION attribute, just as there have been techniques for detecting the use of this technique.  Direct and thorough analysis of the MFT (the analysis of which is predicated on having a thorough understanding of the MFT records themselves) can be revealing, whereas techniques such as detecting program execution and David Cowen's NTFS TriForce can provide valuable insight, as well.

Tools: MFT parser, knowledge of MFT records
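To make the idea concrete, the following Python sketch applies two commonly cited heuristics to the creation times from a single MFT record.  It is an illustration only: the function name is hypothetical, the inputs are assumed to be raw 64-bit FILETIME values already extracted by your MFT parser, and neither check catches the adapted technique of copying time stamps from another file.  Both checks can also fire on legitimate files, so treat a hit as a reason to look closer rather than as a finding.

```python
TICKS_PER_SECOND = 10_000_000  # a FILETIME counts 100-nanosecond intervals


def timestomp_indicators(si_created, fn_created):
    """Return reasons the $STANDARD_INFORMATION creation time looks altered.

    Both arguments are raw 64-bit FILETIME integers taken from the
    $STANDARD_INFORMATION and $FILE_NAME attributes of one MFT record.
    """
    reasons = []
    # A time with one-second resolution written into a 64-bit time stamp
    # leaves the sub-second portion as all zeros.
    if si_created % TICKS_PER_SECOND == 0:
        reasons.append("$SI creation time has no sub-second component")
    # The $FN time stamps are not usually touched by user-mode tools, so a
    # $SI creation time earlier than the $FN creation time is worth a look.
    if si_created < fn_created:
        reasons.append("$SI creation time predates $FN creation time")
    return reasons
```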

Changing the System Time
Okay, let's say that rather than changing the times of specific files, an intruder changes the system time itself.  This would mean that, after that change, the times recorded by the system would be different...so how could we detect this?  One way to do this is to list available Event Log records by sequence number and generated time...if the system time were rolled back, this activity would become evident as the sequence numbers increased but at one point, the time generated was earlier than the time for the previous record.  Lance Mueller's first forensic practical exercise provided a great example of how to detect system time changes using this technique.

Tools: evtparse.pl ('-s' switch)
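The comparison described above is simple enough to sketch in a few lines of Python.  This is not how evtparse.pl is implemented; the function name and the input format (pairs of record number and time generated, with the time as any comparable value such as a Unix epoch integer) are assumptions made for the example.

```python
def find_time_rollbacks(records):
    """Return pairs of adjacent records where the generated time goes backwards.

    `records` is an iterable of (record_number, time_generated) tuples.
    """
    ordered = sorted(records, key=lambda rec: rec[0])
    rollbacks = []
    for prev, cur in zip(ordered, ordered[1:]):
        # The record number increased, but the time moved backwards
        if cur[1] < prev[1]:
            rollbacks.append((prev, cur))
    return rollbacks
```

For example, with records numbered 1 through 4 and times of 1000, 1060, 400 and 460, the function returns the single pair made up of records 2 and 3, which is where the clock appears to have been set back.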

Zapping Event Records
I've heard analysts state that there were gaps in the available Event Logs, so an intruder must have been able to remove specific event records from the log.  Again, I've heard this claimed, but I've never seen the data to support this sort of thing.  Writing a tool to do this is hazardous to the intruder...it may not work, and may instead crash the system.  Why not just do something much simpler, such as (given the appropriate privileges) clear the Event Log and disable auditing altogether?

I've had to analyze a number of Windows systems where the Event Logs have been cleared, and with Windows XP and 2003 systems in particular, it's been pretty trivial to recover a good deal of those deleted event records.

Checking the LastWrite time of a Registry key within the Security hive file (see the auditpol.pl RegRipper plugin) will help you determine when the audit policy of the system was last modified.

Multiple Techniques
What we've discussed thus far was not intended to be a comprehensive listing of all anti-forensics techniques; rather, I wanted to look at a couple and point out analysis processes that you could employ to detect the use of such techniques.  The thing about using anti-forensics techniques is that less is better; the fewer and simpler the techniques used, the harder they are to address.  For example, simply deleting a file...downloader, executable file, etc...after use is perhaps the simplest technique to use, as it prevents an analyst from obtaining a copy of the file for analysis.  Say a script downloads and executes a file, then deletes it...the analyst may still find artifacts to indicate that the file was executed (i.e., Prefetch file, AppCompatCache artifacts, etc.) but not be able to determine explicitly what the file was designed to do.

However, to use multiple techniques requires additional planning and effort.  If this is done automatically, then either a larger application, or multiple applications will need to be downloaded to the system.  The problem that the intruder then runs up against is that the applications have to be tested specifically against the version of Windows that has been compromised...different versions of Windows may have different functionality behind the API, and the applications may not work correctly, or may even crash the system.  The more "sophisticated" the technique used, the more planning and effort is required.  If multiple applications are used, it's more likely that indications of program execution will be found.  If a more manual approach is used, then the intruder must spend more time and engage with the system more, again leaving more tracks and artifacts as they interact with the environment.

The key things to remember with respect to determining or detecting the use of anti-forensics techniques are:

1.  If you suspect it, prove it.  Find the evidence.  If you suspect that a particular technique has been used, gather the data that supports, or ultimately disproves, your theory.  Don't just wave your hand and suggest that "anti-forensics techniques were used."  If you suspect that one or more techniques were used, identify them explicitly.  Then, you can pursue demonstrating or disproving your theory.

2.  Remember that you're not only on the same battlefield as the bad guy, but you actually have an advantage.  You're examining an acquired image, which is the "scene of the crime", frozen in time and unchanging.  You can go back and start your analysis over again, without the fear of losing any of your previous artifacts.

3.  Document your analysis; if you didn't document it, it didn't happen.  Once you've documented your analysis, including both what worked and what didn't, you can then incorporate your findings into future analysis, as well as share your findings with other analysts.

Monday, July 22, 2013

HowTo: Add Intelligence to Analysis Processes

How many times do we launch a tool to parse some data, and then sit there looking at the output, wondering how someone would see something "suspicious" or "malicious" in the output?  How many times do we look at lines of data, wondering how someone else could easily look at the same data and say, "there it is...there's the malware"?  I've done IR engagements where I could look at the output of a couple of tools and identify the "bad" stuff, after someone else had spent several days trying to find out what was going wrong with their systems.  How do we go about doing this?

The best and most effective way I've found to get to this point is to take what I learned on one engagement and roll it into the next.  If I find something unusual...a file path of interest, something particular within the binary contents of a file, etc...I'll attempt to incorporate that information into my overall analysis process and use it during future engagements.  Anything that's interesting, as a result of either direct or ancillary analysis will be incorporated into my analysis process.  Over time, I've found that some things keep coming back, while other artifacts are only seen every now and then.  Those artifacts that are less frequent are no less important, not simply because of the specific artifacts themselves, but also for the trends that they illustrate over time.

Before too long, the analysis process includes, "access this data, run this tool, and look for these things..."; we can then make this process easier on ourselves by taking the "look for these things" section of the process and automating it.  After all, we're human, we get tired from looking at a lot of data, and we can make mistakes, particularly when there is a LOT of data.  By automating what we look for (or, what we've found before), we can speed up those searches and reduce the potential for mistakes.

Okay, I know what you're going to say..."I already do keyword searches, so I'm good".  Great, that's fine...but what I'm talking about goes beyond keyword searches.  Sure, I'll open up a lot of lines of output (RegRipper output, web server logs) in UltraEdit or Notepad++, and search for specific items, based on information I have about the particular analysis that I'm working on (what are my goals, etc.).  However, more often than not, I tend to take that keyword search one step further...the keyword itself will indicate items of interest, but will be loose enough that I'm going to have a number of false positives.  Once I locate a hit, I'll look for other items in the same line that are of interest.

For example, let's take a look at Corey Harrell's recent post regarding locating an injected iframe.  This is an excellent, very detailed post where Corey walks through his analysis process, and at one point, locates two 'suspicious' process names in the output of a volatile data collection script.  The names of the processes themselves are likely random, and therefore difficult to include in a keyword list when conducting a search.  However, what we can take away from just that section of the blog post is that executable files located in the root of the ProgramData folder would be suspicious, and potentially malicious.  Therefore, a script that parses the file path and looks for that condition would be extremely useful, and written in Perl, might look something like this:

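# Split the Windows path into its components
my @path = split(/\\/,$filepath);
my $len = scalar(@path);
# Flag an .exe file located directly in the root of the ProgramData folder
if ($len >= 2 && lc($path[$len - 2]) eq "programdata" && lc($path[$len - 1]) =~ m/\.exe$/) {
  print "Suspicious path found: ".$filepath."\n";
}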

Similar paths of interest might include "AppData\Local\Temp"; we see this one and the previous one in one of the images that Corey posted of his timeline later in the blog post, specifically associated with the AppCompatCache data output.
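To show how both paths could be covered by one check, here is a minimal sketch of the same idea in Python (chosen here purely for illustration).  The function name is hypothetical, and limiting the second rule to executables sitting directly in the Temp folder is an assumption on my part rather than something taken from Corey's post.

```python
from pathlib import PureWindowsPath


def suspicious_exe_path(filepath):
    """Return a short reason if the path matches a rule, otherwise None."""
    path = PureWindowsPath(filepath)
    if path.suffix.lower() != ".exe":
        return None
    # Lower-cased components of the folder that holds the file
    parent = [part.lower() for part in path.parent.parts]
    if parent[-1:] == ["programdata"]:
        return "executable in root of ProgramData"
    if parent[-3:] == ["appdata", "local", "temp"]:
        return "executable in AppData\\Local\\Temp"
    return None
```

Run against each path in a file listing or timeline, this returns a reason only for matching entries, so an executable in a vendor subfolder of ProgramData is not flagged.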

Java *.idx files
A while back, I posted about parsing Java deployment cache index (*.idx) files, and incorporating the information into a timeline.  One of the items I'd seen during analysis that might indicate something suspicious is the last modified time embedded in the server response being relatively close (in time) to when the file was actually sent to the client (indicated by the "date:" field).  As such, I added a rule to my own code, and had the script generate an alert if the "last modified" field was within 5 days of the "date" field; this value was purely arbitrary, but it would've thrown an alert when parsing the files that Corey ran across and discussed in his blog.
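The rule itself is easy to express in code.  The sketch below is in Python rather than the Perl of my own parser, and it is not that parser's code; the function name is hypothetical, and it assumes both fields have already been pulled from the *.idx file as standard HTTP date strings that carry a time zone such as GMT.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def idx_date_alert(last_modified, date, threshold_days=5):
    """Return True if "last modified" falls within the threshold of "date".

    Both arguments are HTTP-style date strings, for example
    "Tue, 16 Jul 2013 10:00:00 GMT". The 5-day default is arbitrary.
    """
    modified = parsedate_to_datetime(last_modified)
    sent = parsedate_to_datetime(date)
    return abs(sent - modified) <= timedelta(days=threshold_days)
```

Keeping the threshold as a parameter makes it easy to tighten or loosen the rule as you see more data.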

Adding intel is generally difficult to do with third-party, closed source tools that we download from someone else's web site, particularly GUI tools.  In such cases, we have to access the data in question, export that data out to a different format, and then run our analysis process against that data.  This is why I recommend that DFIR analysts develop some modicum of programming skill...you can either modify someone else's open source code, or write your own parsing tool to meet your own specific needs.  I tend to do this...many of the tools I've written and use, including those for creating timelines, will incorporate some modicum of alerting functionality.  For example, RegRipper version 2.8 incorporates alerting functionality directly into the plugins. This alerting functionality can greatly enhance our analysis processes when it comes to detecting persistence mechanisms, as well as illustrating suspicious artifacts as a result of program execution.

Writing Tools
I tend to write my own tools for two basic reasons:

First, doing so allows me to develop a better understanding of the data being parsed or analyzed.  Prior to writing the first version of RegRipper, I had written a Registry hive file parser; as such, I had a very deep understanding of the data being parsed.  That way, I'm better able to troubleshoot an issue with any similar tool, rather than simply saying, "it doesn't work", and not being able to describe what that means.  Around the time that Mandiant released their shim cache parsing script, I found that the Perl module used by RegRipper was not able to parse value "big data"; rather than contacting the author and saying simply, "it doesn't work", I was able to determine what about the code wasn't working, and provide a fix.  A side effect of having this level of insight into data structures is that you're able to recognize which tools work correctly, and select the proper tool for the job.

Second, I'm able to update and make changes to the scripts I write in pretty short order, and don't have to rely on someone else's schedule to allow me to get the data that I'm interested in or need.  I've been able to create or update RegRipper plugins in around 10 - 15 minutes, and when needed, create new tools in an hour or so.

We don't always have to get our intelligence just from our own analysis. For example, this morning on Twitter, I saw a tweet from +Chris Obscuresec indicating that he'd found another DLL search order issue, this one on Windows 8 (application looked for cryptbase.dll in the ehome folder before looking in system32); as soon as I saw that, I thought, "note to self: add checking for this specific issue to my Win8 analysis process, and incorporate it into my overall DLL search order analysis process".
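One possible way to fold that kind of note into a process is sketched below in Python.  It is a deliberately simple heuristic and not a complete DLL search order analysis: the function name and the input (a flat list of full file paths from the image) are hypothetical, and it will also flag legitimate copies of DLLs in other folders, so the output needs to be reviewed rather than trusted.

```python
from pathlib import PureWindowsPath


def dlls_shadowing_system32(file_paths):
    """Return DLL paths outside system32 whose names also exist in system32."""
    paths = [PureWindowsPath(p) for p in file_paths]
    dlls = [p for p in paths if p.suffix.lower() == ".dll"]
    # Names of the DLLs found directly in a system32 folder
    system_names = {
        p.name.lower() for p in dlls if p.parent.name.lower() == "system32"
    }
    return [
        str(p)
        for p in dlls
        if p.parent.name.lower() != "system32" and p.name.lower() in system_names
    ]
```

Given a listing that contains cryptbase.dll in both the system32 and ehome folders, only the ehome copy is returned.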

The key here is that no one of us knows everything, but together, we're smarter than any one of us.

I know that what we've discussed so far in this post sounds a lot like the purpose behind the OpenIOC framework.  I agree that there needs to be a common framework or "language" for representing and sharing this sort of information, but it would appear that some of the available frameworks may be too stringent, not offer enough flexibility, or are simply internal to some organizations.  Or, the issue may be as Chris Pogue mentioned during the 2012 SANS DFIR Summit..."no one is going to share their secret sauce."  I still believe that this is the case, but I also believe that there are some fantastic opportunities being missed because so much is being incorporated under the umbrella of "secret sauce"; sometimes, simply sharing that you're seeing something similar to what others are seeing can be a very powerful data point.

Regardless of the reason, we need to overcome our own (possibly self-imposed) roadblocks for sharing those things that we learn, as sharing information between analysts has considerable value.  Consider this post...who had heard of the issue with imm32.dll prior to reading that post?  We all become smarter through sharing information and intelligence.  This way, we're able to incorporate not just our own intelligence into our analysis processes, but we're also able to extend our capabilities by adding intelligence derived and shared by others.

Thursday, July 18, 2013


I've offered up a number of HowTo blog posts thus far, and hopefully DFIR folks out there have found use in them.  In the comments of one of the posts, a reader offered up a list of proposed HowTo topics which he would like to see addressed.  As many of you may have noticed, most of my current posts have been technical in nature...specific artifacts to look for, specific tools to use, etc.  My hope has been to enable folks to expand their own analysis processes, either through the content I have provided, or by any exchange or discussion that occurs as a result of the material posted.  Most of the requested topics are either very general, or refer to soft-skills topics, so I wanted to take the opportunity to address them and see what others might have to add...

How to do a root cause analysis

This is an interesting question, in part because there's been discussion in the past regarding the need for conducting a root cause analysis, or "RCA".

An excellent resource for this topic is Corey Harrell's jIIr blog, as he's written blog posts regarding malware RCA, compromise RCA, and there are other posts that discuss topics associated with root cause analysis.

How to work with clients who are stressed, want answers now, point fingers, or heads to roll.

I think that like any other type of incident, it depends, and it's simply something that you need to be prepared for.

I remember one engagement that I was attempting to address.  The customer who called was not part of the IT staff, and during the initial discussions, it was clear that there was a good deal of stress involved in this incident.  At one point, we came down to the customer simply wanting us to get someone on site immediately, and we were trying to determine which site we needed to send someone to...the corporate offices were located in a different city than the data center, and as such, anyone sent might need to fly into a different airport.  If the responder flew into the wrong one, they'd have to drive several hours to the correct location, further delaying response.  The more the question was asked of the customer, the more frustrated they became, and they just didn't answer the question.

In my experience, the key to surviving trying times such as these is process and documentation.  Process provides analysts with a starting point, particularly during stressful times when everything seems like a whirlwind and you're being pulled in different directions.  Documenting what you did, and why, can save your butt after the fact, as well.

When I was in the military, like many military units, we'd go on training exercises.  During one exercise, we came back to base and during our "hot washup" after-action meeting, one of the operators made the statement that throughout the exercise, "comm sucked", indicating that communications was inadequate.  During the next training exercise, we instituted a problem reporting and resolution process, and maintained detailed records in a log book.  Operators would call into a central point and the problem would be logged, reported to the appropriate section (we had tactical data systems, radar, and communications sections), and the troubleshooting and resolution of the issue would be logged, as well.  After the exercise, we were in our "hot washup" when one of the operators got up and said "comm sucked", at which point we pushed the log book across the table and said, "show us where and why...".  The operators changed their tune after that.  Without the process and documentation, however, we would have been left with commanders asking us to explain an issue that didn't have any data to back it up.  The same thing can occur during an incident response engagement in the private sector.

How to hit the ground running when you arrive at a client with little information.

During my time as an emergency incident responder, this happened often...a customer would call, and want someone on-site immediately.  We'd start to ask questions regarding the nature of the incident (helped us determine staffing levels and required skill sets), and all we would hear back is, "Send someone...NOW!"

The key to this is having a process that responders use in order to get started.  For instance, I like to have a list of questions available when a customer calls (referred to as a triage worksheet); these are questions that are asked of all customers, and during the triage process the analyst will rely on their experience to ask more probing questions and obtain additional information, as necessary.  The responder to go on-site is given the completed questionnaire, and one of the first things they do is meet with the customer point of contact (PoC) and go through the questions again, to see if any new information has been developed.

One of the first things I tend to do during this process is ask the PoC to describe the incident, and I'll ask questions regarding the data that was used to arrive at various conclusions.  For example, if the customer says that they're suffering from a malware infection, I would ask what they saw that indicated a malware infection...AV alerts or logs, network traffic logged/blocked at the firewall, etc.

Generally speaking, my next step would be to either ask for a network diagram, or work with the PoC to document a diagram of the affected network (or portion thereof) on a white board.  This not only provides situational awareness, but allows me to start asking about network devices and available logs.

So, I guess the short answer is, in order to "hit the ground running" under those circumstances, have a process in place for collecting information, and document your steps.

How to communicate during an incident with respect to security and synergy with other IRT members.

As with many aspects of incident response, it depends.  It depends on the type and extent of incident, who's involved, etc. Most of all, it depends upon the preparedness of the organization experiencing the incident.  I've seen organizations with Nextel phones, and the walkie-talkie functionality was used for communications.

Some organizations will use the Remedy trouble-ticketing system, or something similar.  Most organizations will stay off of email altogether, assuming that this has been 'hacked', and may even move to having key personnel meet in a war room.  In this way, communications are handled face-to-face, and where applicable, I've found this to be very effective.  For example, if someone adds "it's a virus" to an email thread, it may be hard to track that person down and get specific details, particularly when that information is critical to the response.  I have been in a war room when someone has made that statement, and then been asked very pointed questions about the data used to arrive at that statement.  Those who have that data are willing to share it for the benefit of the entire response team, and those who don't learn an important lesson.

How to detect and deal with timestomping, data wiping, or some other antiforensic [sic] technique.

I'm not really sure how to address this one, in part because I'm not really sure what value I could add to what's already out there.  The topic of time stomping, using either timestomp.exe or some other means, such as copying the time stamps from kernel32.dll via the GetFileTime/SetFileTime API calls, and how to detect their use has been addressed at a number of sites, including on the ForensicsWiki, as well as on Chris Pogue's blog.

How to "deal with" data wiping is an interesting question...I suppose that if the issue is one of spoliation, then being able to determine the difference between an automated process, and one launched specifically by a user (and when) may be a critical component of the case.

As far as "some other antiforensic[sic] technique", I would say again, it depends.  However, I will say that the use of anti-forensic techniques should never be assumed, simply because one artifact is found, or as the case may be, not found.  More than once, I've been in a meeting when someone said, "...it was ...", but when asked for specific artifacts to support that finding, none were available.

How to get a DFIR job, and keep it.

I think that to some degree, any response to this question would be very dependent upon where you're located, and if you're willing to relocate.

My experience has been that applying for jobs found online rarely works, particularly for those sites that link to an automated application/submission process.  I've found that it's a matter of who you know, or who knows you.  The best way to achieve this level of recognition, particularly in the wider community, is to engage with other analysts and responders, through online meetings, blogging, etc.  Be willing to open yourself up to peer review, and ignore the haters, simply because haters gonna hate.

How to make sure management understands and applies your recomendations [sic] after an incident when they're most likely to listen.

Honestly, I have no idea.  Our job as analysts and responders is to present facts, and if asked, possibly make recommendations, but there's nothing that I'm aware of that can make sure that management applies those recommendations.  After all, look at a lot of the compliance and legislative regulatory requirements that have been published (PCI, HIPAA, NCUA, etc.) and then look at the news.  You'll see a number of these bodies setting forth requirements that are not followed.

How to find hidden data; in registry, outside of the partition, ADS, or if you've seen data hidden in the MFT, slackspace, steganography, etc.

Good question...if something is hidden, how do you find it...and by extension, if you follow a thorough, documented process to attempt to detect data hidden by any of these means and don't find anything, does that necessarily mean that the data wasn't there?

Notice that I used the words "process" and "documented" together.  This is the most critical part of any analysis...if you don't document what you did, did it really happen?

Let's take a look at each of the items requested, in order:

Registry - my first impression of this is that 'hiding' data in the Registry amounts to creating keys and/or values that an analyst is not aware of.  I'm familiar with some techniques used to hide data from RegEdit on a live system, but those tend to not work when you acquire an image of the system and use a tool other than RegEdit, so the data really isn't "hidden", per se.  I have seen instances where searches have revealed hits "in" the Registry, and then searching the Registry itself via a viewer has not turned up those same items, but as addressed in Windows Registry Forensics, this data really isn't "hidden", and it's pretty easy to identify if the hits are in unallocated space within the hive file, or in slackspace.

Outside the partition - it depends on where outside the partition you're referring to.  I've written tools to start at the beginning of a physical image and look for indications of the use of MBR infectors; while not definitive, it did help me narrow the scope of what I was looking at and for.  For this one, I'd suggest looking outside the partition as a solution.  ;-)

ADS - NTFS alternate data streams really aren't hidden, per se, once you have an image of the system.  Some commercial frameworks even highlight ADSs by printing the stream names in red.

MFT - There've been a number of articles written on residual data found in MFT records, specifically associated with files transitioning from resident to non-resident data.  I'm not specifically aware of an intruder hiding data in an MFT record...to me, it sounds like something that would not be too persistent unless the intruder had complete control of the system, to a very low level.  If someone has seen this used, I would greatly appreciate seeing the data.

Slackspace - there are tools that let you access the contents of slackspace, but one of the things to consider is, if an intruder or user 'hides' something in slackspace, what is the likelihood that the data will remain available and accessible to them, at a later date?  After all, the word "hiding" has connotations of accessing the data at a later date...by definition, slackspace may not be available.  Choosing a file at random and hiding data in the slackspace associated with that file may not be a good choice; how would you guarantee that the file would not grow, or that the file would not be deleted?  This is not to say that someone hasn't purposely hidden data in file slackspace; rather, I'm simply trying to reason through the motivations.  If you've seen this technique used, I'd greatly appreciate seeing the data.
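To put some numbers to the reasoning above, this small Python sketch computes how much file slack exists after the end of a file's last cluster.  The 4096-byte cluster size is an assumption (it is a common NTFS default, but it should be read from the volume being examined), and the function name is made up for this example.  Small files whose data is resident in the MFT record do not have cluster slack in this sense.

```python
def file_slack_bytes(file_size: int, cluster_size: int = 4096) -> int:
    """Return the number of unused bytes between end-of-file and end of the last cluster."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    remainder = file_size % cluster_size
    return 0 if remainder == 0 else cluster_size - remainder

# A 10,000-byte file on 4096-byte clusters occupies three clusters (12,288 bytes).
slack = file_slack_bytes(10_000)
print(f"{slack} bytes of slack; growth into that range would begin overwriting it")
```

The smaller that number, the less room there is to hide anything, and the sooner normal growth of the file would destroy it.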

Steganography - I generally wouldn't consider looking for this sort of hidden data unless there was a compelling reason to do so, such as searches in the user's web history, tool downloads, and indications of the user actually using tools for this.  

How to contain an incident.

Once again, the best answer I can give is, it depends.  It depends on the type of incident, the infrastructure affected, as well as the culture of the affected organization.  I've seen incidents in which the issue has been easy to contain, but I've also been involved in response engagements where we couldn't contain the issue because of cultural issues.  I'm aware of times where a customer has asked the response team to monitor the issue, rather than contain it.

Again, many of the topics that the reader listed were more on the "soft" side of skills, and it's important that responders and analysts alike have those skills.  In many cases, the way to address this is to have a process in place for responders to use, particularly during stressful times, and to require analysts to maintain documentation of what they do.  Yes, I know...no one likes to write, particularly if someone else is going to read it, but you'll wish you had kept it when those times come.

HowTo: Data Exfiltration

One of the questions I see time and again, in forums as well as from customers, is "what data was taken from the system?"  Sometimes, an organization will find out what data was taken when they get a call from an outside third party (just review any of the annual reports from Verizon, Mandiant, or TrustWave); if this is the case, they may have a pretty good idea of what data was taken.

This post is not intended to be totally comprehensive; rather, the idea here is to present artifacts that you can look for/at that you may not have seen before, and provide other opportunities for finding indications of data exfiltration.  Many of these artifacts are very simple to check for and analyze, and provide for a more thorough and complete examination, even if nothing is found.  After all, simply illustrating to the customer that you checked all of these possibilities provides value, regardless of whether any useful evidence was turned up.

It's also very important to point out that these artifacts may provide an indication of data exfiltration of some kind; the only way to determine if data was exfiltrated at a particular time, and what that data might have been, in a definitive manner is to have a full packet capture from the time of exfiltration.  That way, you can see exactly what was exfiltrated.

Attachments are an easy means for getting files off of a system.  Attachments can be made to email, web mail, as well as to chat/IM programs.  Files can be uploaded via the web to sites like Twitter, Yahoo Groups, Google Docs, etc.  The use of social media should be examined closely.

Program Execution
One of the things you'll want to look for is artifacts of program execution...many times, exfiltrating data requires that a program of some type be executed; whether it's launching a standalone application or uploading something via a browser, a program must be running for data exfil to occur.

Programs you might want to look for include the Windows native ftp.exe command line utility, third-party FTP utilities, etc.  Also, you might consider looking for the use of archiving utilities, such as rar.exe, particularly in cases where files may have been archived prior to transmittal.  As stated in the previous blog post, you'll want to look for artifacts such as application Prefetch files, etc.  You might also want to look to user-specific Registry values, such as:

User's UserAssist (GUI) or MUICache (CLI) - RegRipper userassist.pl or muicache.pl plugins, respectively

Tracing key - RegRipper tracing.pl plugin; this key contains subkeys that I have found refer to applications with networking capabilities.  I say this in part due to observation, but also because during one investigation where an intruder had installed and run a vulnerability exploitation tool (specifically, Havij), I found references to this tool beneath this key.

The first time I ran across the use of the Windows native fsquirt.exe utility, I found an entry for the utility in the user's MUICache data. The path pointed to the file in the system32 folder, which, after an initial investigation, appeared to be correct. I then found a Prefetch file for the utility, as well.  The utility is actually a wizard with a GUI, and as such, uses common dialogs to allow the user to select files to send to the device; analysis of values in the ComDlg32\OpenSavePidlMRU key provided indications of files that might have been copied to the device.
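As a simple illustration of checking for program execution artifacts, the Python sketch below takes a list of file names from the Prefetch folder of a mounted image and flags those that belong to programs of interest.  It assumes the usual Prefetch naming pattern of the executable name, a dash, eight hex characters, and a .pf extension; the watchlist contents and the function name are only examples, and the hash values used in any test data would be invented.

```python
import re

# EXENAME-XXXXXXXX.pf, where XXXXXXXX is an 8-character hex hash
PF_NAME = re.compile(r"^(?P<exe>.+)-(?P<hash>[0-9A-F]{8})\.pf$", re.IGNORECASE)

# Example watchlist only; extend it with whatever third-party tools matter to the case.
WATCHLIST = {"FTP.EXE", "RAR.EXE", "FSQUIRT.EXE"}

def flag_prefetch(names, watchlist=WATCHLIST):
    """Return the Prefetch file names whose executable is on the watchlist."""
    hits = []
    for name in names:
        match = PF_NAME.match(name)
        if match and match.group("exe").upper() in watchlist:
            hits.append(name)
    return hits

# Possible usage against a mounted image (the drive letter is a placeholder):
#   from pathlib import Path
#   flag_prefetch(p.name for p in Path(r"F:\Windows\Prefetch").iterdir())
```

A hit only shows that the program was run; what it was used for still has to come from other artifacts, such as the MUICache and ComDlg32 data discussed above.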

You're probably thinking, "Really?" Well, you can find indications of possible data exfiltration (or infiltration) within the shellbags artifacts.

Shellbag artifacts can provide indications of access to network resources (such as shares), not only within the network infrastructure, but also beyond the borders of the infrastructure.

Shellbags can show indications of access to resources for data exfiltration through different types of shell items.  When I first started working with my publisher, I was provided with instructions for accessing their FTP site via Windows Explorer, which would allow me to drag-and-drop files between folders.  This method for accessing an FTP server does not leave what one would expect to be the "normal" artifacts...there is no Prefetch file (or other artifact) created for the use of the native ftp.exe utility, nor are there any UserAssist artifacts created.  However, as this method of exchanging files requires interaction with Windows Explorer, shellbag artifacts are created, and as a result, artifacts are also created beneath the Software\Microsoft\FTP\Accounts key within the user's NTUSER.DAT hive.

As mentioned previously, shellbags artifacts can also provide indications of access not only to traditional USB storage devices (i.e., thumb drives), but also to other devices (smartphones, MP3 players, and digital cameras) that can be connected to the system via a USB cable.  This is important to understand, as a number of the available tools for parsing shellbag artifacts do not parse the shell items for these devices; as such, access to these devices will not be apparent when some of the popular tools are used to parse and display these artifacts.

Cloud Services
I purposely haven't addressed cloud services in this post, as this topic should likely be addressed on its own.  I have provided some resources (below) that may be of value.  As there are a number of different types of cloud services available, I would like to get input from the DFIR community regarding this topic, specifically. Many of these cloud services can be accessed via the web, or by specific applications installed on the system, and as such, artifacts may vary depending upon how the user accessed those services.

I've provided links to some interesting resources on this topic below.

Jad Saliba's presentation
Mary Hughes' capstone project blog; this site hasn't been updated since Feb, 2013, but it does have a number of very useful links
Derek Newton's Forensic Artifacts: Dropbox blog post
Some Carbonite artifacts - lists some Registry keys and files, not much explanation or detail

Other Means
Not all means of exfiltrating data out of an infrastructure are really very sophisticated.  I once worked for a company that was going out of business, and was involved with providing physical security when offices were being shut down.  At one point, we were contacted by HR, as they suspected that the office closure schedule had been obtained after someone "hacked" one of their computers.  A short investigation determined that someone had printed the schedule, and left it on the printer, where someone else had found the schedule and faxed it to all of the involved offices.  I've also seen where information has been pasted into an AIM chat window.

A number of years ago, I had a couple of data exfil/exposure incidents while I was filling an FTE position at a now-defunct telecom company.  In 2001, the official story was that the company was going to go through bankruptcy, and morale was very low.  The security team was contacted by a member of senior management, as apparently company memos regarding the proceedings of meetings and the direction of the company were being posted to a site called "doomedcompany.com" (this site had a sister site, whose name was a bit more vulgar).  In many cases, these memos were being posted before offices outside of the corporate headquarters received the memos via email.  We knew that a number of employees in the company were visiting the site, and that most were doing so in order to read the memos and commentary.  We had to create an account on the site, and then upload a file in order to be able to differentiate between an employee reading the memo, and someone uploading a file.  By identifying those artifacts, we were able to incorporate that information into searches.

In another case, the security team was contacted by members of HR, with the complaint that their computers were 'hacked'.  The issue centered around the impending shutdown of a call center office in Arizona, and the fact that the list of the employees being laid off was made available to that entire office.  We sat down with the HR associate who had put the list together, and asked questions, such as "...did you email this list to anyone?" and "...did you place this list on a file server?"  Ultimately, we found out that she'd sent the list to the printer, and then stepped out for a meeting.  Apparently, another employee had collected their printed document from the printer, found the list, and then faxed the list to the Arizona office.

The point is that sometimes, data exfiltration is as simple as picking up something off of the printer.

Data can be exfiltrated from a system in a number of ways, but those methods generally come down to using the network (moving data to a file share, attaching files to web-based emails, etc.), or copying files to an "attached" device (attached physically via USB, or via Bluetooth).  If data exfiltration is suspected, then the steps that you might want to take include:

  1. Attempt to determine with a greater level of certainty or clarity why data exfil is suspected; what details are available?  Is this based on third-party notification, or the result of monitoring?
  2. Determine where the data that was thought to have been exfil'd originally or usually could be found; was the data on a file or database server, or did it reside on a user's system?
  3. Did the user have access to that data?  Are there indications that the user interacted with or accessed that data?  
  4. Determine indications of programs that could be used for data exfil being executed.
One final thought...Windows systems do NOT maintain artifacts of copy operations.  You will not find any logs or Registry values that indicate files that were copied to a removable device, for example, particularly if all you have available for analysis is an image of the system.  If the user were to copy a file to an external resource, such as a thumb drive or remote file share, and then open the file from there, then a Windows shortcut/LNK file would be created on the user's system.  However, by itself, all that LNK file shows is that the user opened a file...it does not provide an indication that the user explicitly copied the file to the external resource.  In the absence of some sort of monitoring agent, additional analysis, particularly of file system time stamps on both resources, would be required in order to more closely determine if the user copied the file.

Monday, July 15, 2013

HowTo: Detecting Persistence Mechanisms

This post is about actually detecting persistence mechanisms...not querying them, but detecting them.  There's a difference between querying known persistence mechanisms, and detecting previously unknown persistence mechanisms used by malware; the former we can do with tools such as AutoRuns and RegRipper, but the latter requires a bit more work.

Detecting the persistence mechanism used by malware can be a critical component of an investigation; for one, it helps us determine the window of compromise, or how long it's been since the system was infected (or compromised).  For PCI exams in particular, this is important because many organizations know approximately how many credit card transactions they process on a daily or weekly basis, and combining this information with the window of compromise can help them estimate their exposure.  If malware infects a system in a user context but does not escalate its privileges, then it will most likely start back up after a reboot only after that user logs back into the system.  If the system is rebooted and another user logs in (or in the case of a server, no user logs in...), then the malware will remain dormant.
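The exposure estimate mentioned above is simple arithmetic, sketched here in Python.  The dates and the transaction rate are invented for the example, and the function name is hypothetical.  Note that this gives an upper bound; if the malware ran in a user context and was dormant for part of the window, the real figure would be lower.

```python
from datetime import datetime

def estimate_exposure(first_compromise: datetime, taken_offline: datetime,
                      transactions_per_day: float):
    """Return (window in days, estimated number of exposed transactions)."""
    if taken_offline < first_compromise:
        raise ValueError("the system cannot be taken offline before it was compromised")
    days = (taken_offline - first_compromise).total_seconds() / 86400
    return days, round(days * transactions_per_day)

# Invented example values: a 30-day window at roughly 1,200 transactions per day.
window_days, exposed = estimate_exposure(datetime(2013, 1, 1), datetime(2013, 1, 31), 1200)
print(f"window of compromise: {window_days:.1f} days, up to ~{exposed} transactions exposed")
```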

Detecting Persistence Mechanisms
Most often, we can determine a malware persistence mechanism by querying the system with tools such as those mentioned previously in this post.  However, neither of these tools is comprehensive enough to cover other possible persistence mechanisms, and as such, we need to seek other processes or methods of analysis and detection.

One process that I've found to be very useful is timeline analysis.  Timelines provide us with context and an increased relative confidence in our data, and depending upon which data we include in our timeline, an unparalleled level of granularity.

Several years ago, I determined the existence of a variant of W32/Crimea on a system (used in online banking fraud) by creating a timeline of system activity.  I had started by reviewing the AV logs from the installed application, and then moved on to scanning the image (mounted as a volume) with several licensed commercial AV scanners, none of which located any malware.  I finally used an AV scanner called "a-squared" (now apparently owned by Emsisoft), and it found a malicious DLL.  Using that DLL name as a pivot point within my timeline, I saw that relatively close to the creation date of the malicious DLL, the file C:\Windows\system32\imm32.dll was modified; parsing the file with a PE analysis tool, I could see that the PE Import Table had been modified to point to the malicious DLL.  The persistence mechanism employed by the malware was to 'attach' to a DLL that is loaded by user processes that interact with the keyboard, in particular web browsers.  It appeared to be a keystroke logger that was only interested in information entered into form fields in web pages for online banking sites.

Interestingly enough, this particular malware was very well designed, in that it did not write the information it collected to a file on the system.  Instead, it immediately sent the information off of the system to a waiting server, and the only artifacts that we could find of that communication were web server responses embedded in the pagefile.

Something else to consider is the DLL Search Order "issue", often referred to as hijacking. This has been discussed at length, and likely still remains an issue because it's not so much a specific vulnerability that can be patched or fixed, but more a matter of functionality provided by the architecture of the operating system.

In the case of ntshrui.dll (discussed here by Nick Harbour, while he was still with Mandiant), this is how it worked...ntshrui.dll is listed in the Windows Registry as an approved shell extension for Windows Explorer.  In the Registry, many of the approved shell extensions have explicit paths listed...that is, the value is C:\Windows\system32\some_dll.dll, and Windows knows to go load that file.  Other shell extensions, however, are listed with implicit paths; that is, only the name of the DLL is provided, and when the executable (explorer.exe) loads, it has to go search for that DLL.  In the case of ntshrui.dll, the legitimate copy of the DLL is located in the system32 folder, but another file of the same name had been created in the C:\Windows folder, right next to the explorer.exe file.  As explorer.exe starts searching for the DLL in its own directory, it happily loaded the malicious DLL without any sort of checking, and therefore, no errors were thrown.

Around the time that Nick was writing up his blog post, I'd run across a Windows 2003 system that had been compromised, and fortunately for me, the sysadmins had a policy for a bit more extensive logging enabled on systems.  As I was examining the timeline, starting from the most recent events to occur, I marveled at how the additional logging really added a great deal of granularity to things such as a user logging in; I could see where the system assigned a token to the user, and then transferred the security context of the login to that user.  I then saw a number of DLLs being accessed (that is, their last accessed times were modified) from the system32 folder...and then I saw one (ntshrui.dll) from the C:\Windows folder.  This stood out to me as strange, particularly when I ran a search across the timeline for that file name, and found another file of the same name in the system32 folder.  I began researching the issue, and was able to determine that the persistence mechanism of the malware was indeed the use of the DLL search order "vulnerability".
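The "same file name in two places" observation lends itself to a quick check.  The Python sketch below takes a list of full file paths (from a file system listing or a timeline, for example) and reports DLL names that exist in system32 and also in some other folder.  This is only an illustration of the idea, not the method used in the examination described above, and the function name is hypothetical.  A hit is a lead to investigate, not proof of hijacking; legitimate duplicates do exist.

```python
from collections import defaultdict
from pathlib import PureWindowsPath

def duplicate_dll_names(paths, trusted_dir=r"C:\Windows\system32"):
    """Map each DLL name found in trusted_dir AND elsewhere to its other locations."""
    trusted = PureWindowsPath(trusted_dir)
    dirs_by_name = defaultdict(set)
    for p in paths:
        wp = PureWindowsPath(p)  # Windows path semantics, compared case-insensitively
        if wp.suffix.lower() == ".dll":
            dirs_by_name[wp.name.lower()].add(wp.parent)
    findings = {}
    for name, dirs in dirs_by_name.items():
        if trusted in dirs and len(dirs) > 1:
            findings[name] = sorted(str(d) for d in dirs if d != trusted)
    return findings
```

Fed a listing that includes both C:\Windows\ntshrui.dll and C:\Windows\system32\ntshrui.dll, this would report ntshrui.dll with C:\Windows as the additional location.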

Creating Timelines
Several years ago, I was asked to write a Perl script that would list all Registry keys within a hive file, along with their LastWrite times, in bodyfile format.  Seeing the utility of this information, I also wrote a version that would output to TLN format, for inclusion in the timelines I create and use for analysis.  This allows for significant information that I might not otherwise see to be included in the timeline; once suspicious activity has been found, or a pivot point located, finding unusual Registry keys (such as those beneath the CLSID subkey) can lead to identification of a persistence mechanism.
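For readers who haven't worked with these formats, here is a minimal Python sketch (not the Perl script described above) of turning a Registry key's LastWrite time into a pipe-delimited, five-field TLN-style event line (time|source|host|user|description), with the time expressed as a Unix epoch value.  LastWrite times are stored as 64-bit FILETIME values (100-nanosecond intervals since 1 Jan 1601 UTC).  The function names and the wording of the description field are assumptions made for this example.

```python
# Number of 100-nanosecond intervals between 1601-01-01 and 1970-01-01 (UTC)
EPOCH_DIFF = 116444736000000000

def filetime_to_epoch(filetime: int) -> int:
    """Convert a 64-bit Windows FILETIME value to whole seconds since the Unix epoch."""
    return (filetime - EPOCH_DIFF) // 10_000_000

def tln_line(filetime: int, key_path: str, host: str = "", user: str = "",
             source: str = "REG") -> str:
    """Build a five-field, pipe-delimited event line for a key's LastWrite time."""
    description = f"Key LastWrite: {key_path}"  # description wording is illustrative
    return "|".join([str(filetime_to_epoch(filetime)), source, host, user, description])
```

Because every event ends up in the same five-field layout, lines from the Registry, the file system, and event logs can be combined and sorted on the first field.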

Additional levels of granularity can be achieved in timelines through the incorporation of intelligence into the tools used to create timelines, something that I started adding to RegRipper with the release of version 2.8. One of the drawbacks to timelines is that they will show the creation, last accessed, and last modification times of files, but not incorporate any sort of information regarding the contents of that file into the timeline.  For example, a timeline will show a file with a ".tmp" extension in the user's Temp folder, but little beyond that; incorporating additional functionality for accessing such files would allow us to include intelligence from previous analyses into our parsing routines, and hence, into our timelines.  As such, we may want to generate an alert for that ".tmp" file, specifically if the binary contents indicate that it is an executable file, or a PDF, or some other form of document.
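As a sketch of what such an alert might look like, the following Python snippet checks the first few bytes of a file against a handful of well-known signatures and produces an alert string when a ".tmp" file turns out to be something else.  The signature list is deliberately short, and the function names and alert wording are made up for this illustration; this is not how any particular timeline tool does it.

```python
# A few well-known file signatures ("magic numbers"); far from exhaustive.
SIGNATURES = {
    b"MZ": "Windows executable (MZ header)",
    b"%PDF": "PDF document",
    b"PK\x03\x04": "ZIP container (including newer Office documents)",
    b"\xd0\xcf\x11\xe0": "OLE compound file (older Office documents)",
}

def sniff(header: bytes):
    """Return a description of the file type based on its leading bytes, or None."""
    for magic, label in SIGNATURES.items():
        if header.startswith(magic):
            return label
    return None

def alert_for(path: str, header: bytes):
    """Return an alert string if a .tmp file's contents match a known signature."""
    label = sniff(header)
    if label and path.lower().endswith(".tmp"):
        return f"ALERT: {path} has a .tmp extension but looks like a {label}"
    return None
```

The alert text could then be appended to the description of the file's timeline entry, so it is visible right where the file's time stamps appear.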

Another example of how this functionality can be incorporated into timelines and assist us in detecting persistence mechanisms might be to add grep() statements to RegRipper plugins that parse file paths from values.  For example, your timeline would include the LastWrite time for a user's Run key as an event, but because the values for this key are not maintained in any MRU order, there's really nothing else to add.  However, if your experience were to show that file paths that include "AppData", "Application Data", or "Temp" might be suspicious, why not add checks for these to the RegRipper plugin, and generate an alert if one is found?  Would you normally expect to see a program being automatically launched from the user's "Temporary Internet Files" folder, or is that something that you'd like to be alerted on?  The same sort of thing applies to values listed in the InProcServer keys beneath the CLSID key in the Software hive.
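RegRipper plugins are written in Perl, but the logic of such a check is easy to show in a few lines of Python.  The sketch below takes the values from a Run key as a name-to-data mapping and generates an alert for any path containing one of the folder names mentioned above.  The function name, the pattern, and the alert wording are assumptions for this example, and the list of folders should reflect your own experience.

```python
import re

# Folder names that might be unusual in an autostart path; matched as whole path components.
SUSPICIOUS = re.compile(
    r"\\(AppData|Application Data|Temp|Temporary Internet Files)\\",
    re.IGNORECASE,
)

def check_run_values(values: dict):
    """Return alert strings for Run key values whose data points into a suspicious folder."""
    alerts = []
    for name, data in values.items():
        if SUSPICIOUS.search(data):
            alerts.append(f"ALERT: Run value '{name}' launches from a suspicious path: {data}")
    return alerts
```

The same function could be pointed at the paths found in InProcServer values; only the source of the name-to-data mapping changes.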

Adding this alerting functionality to tools that parse data into timeline formats can significantly increase the level of granularity in our timelines, and help us to detect previously unknown persistence mechanisms.

Mandiant: Malware Persistence without the Windows Registry
Mandiant: What the fxsst?
jIIR: Finding Malware like Iron Man
jIIR: Tracking down persistence mechanisms

HowTo: Malware Detection, pt I

Many times we'll come across a case where we need to determine the presence of malware on a system.  As many of us are aware, AV products don't always work the way we hope they would...they don't provide us with 100% coverage and detect everything that could possibly affect our systems.

This post is NOT about malware analysis.  This post addresses malware detection during dead box analysis.  Malware detection is pretty expansive, so to really address the topic, I'm going to spread this one out across several blog posts.

Malware Detection
Malware detection during dead box analysis can be really easy, or it can be very hard.  I say this because we can mount an image as a read-only volume and run several (or more) AV scanners against the volume, and keep track of all the malware found.  Or, we can run several AV scanners against the volume, and they will all find nothing - but does that mean that there isn't any malware on the system?

This post is the first of several that I will write, in an attempt to fully address this issue.

Before we start digging into the guts of detecting malware during dead box analysis, it is important to understand the four characteristics of malware, specifically the initial infection vector, the propagation mechanism, the persistence mechanism, and artifacts of the malware.  I originally developed these characteristics as a way of helping new analysts develop a confident, professional demeanor when engaging with customers; rather than reacting like a deer in the headlights, my hope was to help these new analysts understand malware itself to the point where they could respond to a customer request in a professional and confident manner.  Understanding and applying these characteristics enables analysts to understand, detect, locate, and mitigate malware within an infrastructure.

Initial Infection Vector
The initial infection vector (or IIV) refers to how the malware originally made its way on to the system.  Worms like SQL Slammer took advantage of poor configuration of systems; other malware gets on to systems as a result of exploiting vulnerabilities in browsers.

Understanding malware IIV mechanisms not only provides us with a starting point for beginning our investigation, but also allows analysts to go from "I think it got on the system..." or "...if I had written the malware...", to actually being able to demonstrate, through empirical data, how the malware ended up on the system.  Too many times, the IIV is assumed, and that assumption is passed on to the customer, who uses that information to make critical business decisions, possibly even validating (or, invalidating) compliance.

Also, keep in mind that many of the more recent malware samples appear on systems as a result of a multi-stage delivery mechanism.  For example, a user may open an email attachment, and the document will contain embedded malicious code that will exploit a vulnerability in the target application, which may reach out to a server for additional instructions, and then the second stage will reach out to another server and download the actual malware.  As such, the malware does not simply appear on the system, and the IIV is actually much more complex than one might expect.

Determining the IIV can be an important factor in a number of exams.  For example, PCI exams require that the analyst determine the window of compromise, which is essentially the time from when the system was first compromised or infected, to when the incident was discovered and the system taken offline.  While this is done for fairly obvious reasons, other non-PCI cases ultimately have similar requirements.  Being able to accurately determine when the system was first infected or compromised can be a critical part of an exam, and as such should not be left to speculation and guesswork, particularly when this can be determined through the use of well-thought-out processes.

Propagation Mechanism
This characteristic refers to how the malware moves between systems, if it does.  Some malware doesn't move between systems on its own...instead, it infects one system and doesn't move on to other systems.  For example, RAM scrapers found during PCI cases don't infect one system and then propagate to another...rather, the malware is usually placed on specific systems by an intruder who has unrestricted access to the entire infrastructure.

Some malware will specifically infect removable devices as a means of propagation.  Worms are known primarily for their ability to propagate via the network.  Other malware is known to infect network shares, in the hopes that by infecting files on network shares, the malware will spread through the infrastructure as users access the infected files.

It's important to note that the malware propagation mechanism may be the same as the IIV, but analysts should not assume that this is the case.  Some malware may get onto a system within an infrastructure as a result of a spear-phishing campaign, and once on the internal infrastructure, propagate via network or removable drives.

According to Jesse Kornblum's Rootkit Paradox, rootkits want to remain hidden, and they want to run.  The paradox exists in the fact that by running, there will be ways to detect rootkits, even though they want to remain hidden.  The same is true with malware, in general, although malware authors are not as interested in remaining hidden as rootkit authors.  However, the fact is that as malware interacts with its environment, it will leave artifacts.

Self-Inflicted Artifacts
As malware interacts with its environment, artifacts will be created. Some of these artifacts may be extremely transient, while others, being created by the environment itself, may be much more persistent.

One of the issues I've seen over the years when AV vendors have produced technical reports regarding malware is that there are a number of self-inflicted artifacts; that is, artifacts are created as a result of how the malware is launched in the testing environment. One of the best examples of this occurs when the IIV of a malware sample is a multi-stage delivery mechanism, and the analyst only has a copy of the executable delivered in the final stage. When this occurs, the malware report will contain artifacts of the analyst launching the malware, which will not show up on a system that was infected in the wild.

Looking for malware artifacts is a lot like using those "expert eyes" that Chris Pogue talks about in his Sniper Forensics presentations.  When malware executes, it interacts with its environment, and the great thing about the Windows environment is that it records a lot of stuff.  In that way, it's a lot like using 'expert eyes' to look for deer on the Manassas Battlefield park...as deer move through the park, they leave signs or 'artifacts' of their presence.  They're grass eaters, and they leave scat that is different from fox, dogs, and bears (as well as from other grass eaters, like our horses).  They leave tracks, whether it's in the mud, sand, soft dirt or snow.  When they move through tall grass, they leave trails that are easily visible from horseback.  When they lie down for the night, they leave matted grass.  Sometimes the bucks will leave velvet from their horns, or if a couple of bucks are sparring, you may actually find a horn or two (we have a couple hanging in the barn).  My point is that I don't have to actually see a deer standing in a field or along a stream to know that there are deer in the area, as I can see their "artifacts".  The same thing is true for a lot of the malware out there...we just need to know what artifacts to look for, and how to look for them.  This is how we're able to detect malware during dead box analysis, particularly when that malware is not detected by AV scanners.

Some of the more popular locations where I've found artifacts include AV logs, and on Windows XP systems in particular, within the Windows firewall configuration.  I've seen a number of instances where malware has been detected by AV, but according to the logs, the AV was configured to take no action.  As such, as strange as it may seem, the malware infection is clearly visible in the AV logs.  In another instance, I was reviewing the on-demand scan AV logs and found an instance of malware being detected.  A closer examination of the system indicated that malware had originally been created on the system less than two days prior to the log entries, and I was able to verify that the malware had indeed been quarantined and then deleted.  About six weeks later, a new variant of the malware, with the same name, was deposited on the system, and throughout the course of the timeline, none of the subsequent scans detected this new variant.  Previously-detected malware can provide valuable clues of an intruder attempting to get their malware on to a system.

In other instances, the malware infection process creates a rule in the Windows firewall configuration to allow itself out onto the network.  Finding a reference to a suspicious executable in the firewall configuration can point directly to the malware.

In another instance, the "malware" we found consisted of three components...a Windows service, the "compiled" Perl script, and a utility that would shuttle the collected track data off of the system to a waiting server.  The first time that the malware components were installed on the system, within two days, an on-demand AV scan detected one of the components and removed it from the system (as indicated by the AV logs and subsequent lack of Perl2Exe temp folders).  Several weeks later, the intruder installed a new set of components, which ran happily...the installed AV no longer detected the first component.  This can happen a lot, and even when the system has been acquired, AV scans do not detect the malware within the mounted image.  As such, we very often have to rely on other artifacts...unusual entries in autostart locations, indirect artifacts, etc.

The use of artifacts to detect the presence of malware is similar to the use of rods and cones in the human eye to detect objects at night or in low light.  Essentially, the way our vision works, we do not see an object clearly at night by looking directly at it...rather, we have to look to the side of the object.  AV scanners tend to "look directly at" malware, and may not detect new variants (or in some cases, older variants).

Persistence Mechanism
The persistence mechanism refers to the means that malware utilizes to survive reboots.  Some persistence mechanisms may allow the malware to launch as soon as the system is booted, while others will not start the malware until a user logs in, or until a user launches a specific application.  Persistence mechanisms can be found in Registry locations...during the 2012 SANS Forensic Summit, Beth (a Google engineer) described writing a web scraper to run across an AV vendor's web site, and around 50% of the Registry keys she found pointed to the ubiquitous Run key.  Persistence can also be achieved through a variety of other means...locations within the file system, DLL Search Order parsing, etc.  Determining the persistence mechanism can be a very valuable part of gaining intelligence from your examination, and detecting new persistence mechanisms (to be covered in a future post) can be equally important.

Mis-Identified Persistence
Sometimes when reading a malware write-up from a vendor, particularly Microsoft, I will see that the persistence mechanism is listed as being in the HKCU\..\Run key, and the write-up goes on to say, "...so that the malware will start at system boot."  This is incorrect, and can be critical to your examination; entries in a user's Run key are only processed when that user logs in.  For example, if a user context on a system is infected, but when the system is rebooted another user logs in, the malware is not active.

As you may already see, the persistence mechanism of malware is also an artifact of that malware; persistence is a subset of the artifact set.  In many ways, this can help us to a great extent when it comes to detecting malware, particularly malware missed by AV scanners.

Thursday, July 11, 2013

Programming and DFIR

I was browsing through an online list recently and I came across an older post that I'd written, that had to do with tools.  In it, I'd made the statement, "Tweaked my browser history parser to add other available data to the events, giving me additional context."  This brought to mind just how valuable even the smallest modicum of programming skill can be to an analyst.

This statement takes understanding data structures a step further because we're not simply recognizing that, say, a particular data structure contains a time stamp.  In this case, we're modifying code to meet the needs of a specific task.  However, simply understanding basic programming principles can be a very valuable skill for DFIR work, in general, as the foundational concepts behind programming teach us a lot about scoping, and programming in practice allows us to move into task automation and eventually code customization.

David Cowen has been doing very well on his own blog-a-day-for-a-year challenge, and recently posted a blog regarding some DFIR analyst milestones that he outlined. In this post, David mentions that milestone 11 includes "basic programming".  This could include batch file programming, which is still alive and well, and extremely valuable...just ask Corey Harrell.  Corey's done some great things, such as automating exploiting VSCs, through batch files.

My programming background goes back to the early '80s, programming BASIC on the Timex-Sinclair 1000 and Apple IIe.  In high school, I learned some basic Pascal on the TRS-80, and then in college, moved on to BASIC on the same platform.  Then in graduate school, I picked up some C (one course), some M68K assembly, and a LOT of Java and MatLab, to the point that I used both in my thesis.  This may seem like a lot, but none of it was really very extensive.  For example, when I was programming BASIC in college, my programs included one that displayed the Punisher skull on the screen and played the "Peter Gunn theme" in the background, and another one interfaced with a temperature sensor to display fluctuations on the screen.  In graduate school, the C programming course required as part of the MSEE curriculum really didn't have us do much more than open, write to or read from, and then close a file.  Some of the MatLab stuff was a bit more extensive, as we used it in linear algebra, digital signal processing and neural network courses.  But we weren't doing DFIR work, nor anything close to it.

The result of this is not that I became an expert programmer...rather, take a look at something David had said in a recent blog post, specifically that an understanding of programming helps you put your goals into perspective and reduce the scope of the problem you are trying to solve.  This is the single most valuable aspect of programming experience...being able to look at the goals of a case, and break them down into compartmentalized, achievable tasks.  Far too many times, I have seen analysts simply overwhelmed by goals such as, "Find all bad stuff", and even when going back to the customer to get clarification as to what the goals of the case should be, they still are unable to compartmentalize the tasks necessary to complete the examination.

Task Automation
There's a lot that we do that is repetitive...not just in a single case, but if you really sit down and think about the things you do during a typical exam, I'm sure that you'll come across tasks that you perform over and over again.  One of the questions I've heard at conferences, as well as while conducting training courses, is, "How do I fully exploit VSCs?"  My response to that is usually, "what do you want to do?"  If your goal is to run all the tools that you ran against the base portion of the image against the available VSCs, then you should consider taking a look at what Corey did early in 2012...as far as I can see, and from my experience, batch scripting such as this is still one of the most effective means of automating tasks such as this, and there is a LOT of information and sample code freely available on the Interwebs for automating an almost infinite number of tasks.
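To give a rough idea of what that sort of automation looks like outside of a batch file, here's a minimal Python sketch that builds (and can optionally run) the same command against each mounted VSC.  This is not Corey's approach, just an illustration; the tool name (mytool.exe), its -d switch, and the mount paths are all hypothetical placeholders, and it assumes the VSCs have already been mounted or linked to folders.

```python
import subprocess

def build_commands(tool, mount_points):
    """Return one command (as an argument list) per mount point.

    'tool' is a list such as ["mytool.exe", "-d"]; each mount point is
    appended as the final argument.
    """
    return [tool + [mp] for mp in mount_points]

def run_all(commands):
    """Run each command in turn and collect (command, return code) pairs."""
    results = []
    for cmd in commands:
        completed = subprocess.run(cmd, capture_output=True, text=True)
        results.append((cmd, completed.returncode))
    return results

if __name__ == "__main__":
    # Hypothetical mount points for the base image and two VSCs.
    mounts = [r"C:\mnt\base", r"C:\mnt\vsc1", r"C:\mnt\vsc2"]
    for cmd in build_commands(["mytool.exe", "-d"], mounts):
        # Print rather than run, since mytool.exe is only a placeholder.
        print(" ".join(cmd))
```

Keeping the "build the command list" step separate from the "run it" step means you can review exactly what will be executed before letting it loose on your evidence.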

If batch scripting doesn't provide the necessary flexibility, there are scripting languages (Python, Perl) that might be more suitable, and there are a number of folks in the DFIR community with varying levels of experience using these languages...so don't be afraid to reach out for assistance.

Code Customization
There's a good deal of open source code out there that allows us to do the things we do.  In other cases, a tool that we use may not be open source, but we do have open source code that allows us to manipulate the output of the tool into a format that is more useful, and more easily incorporated into our analysis process.  Going back to the intro paragraph to this post, sometimes we may need to tweak some code, even if it's to simply change one small portion of the output from a decimal to hex when displaying a number.  Understanding some basic coding lets us not only be able to see what a tool is doing, but it also allows us to adjust that code when necessary.

Being able to customize code as needed also means that we can complete our analysis tasks in a much more thorough and timely manner.  After all, for "heavy lifting", or highly repetitive tasks, why not let the computer do most of the work?  Computers are really good at doing the same thing, over and over again, really fast...so why not take advantage of that?

While there is no requirement within the DFIR community (at large) to be able to write code, programming principles can go a long way toward developing our individual skills, as well as developing each of us into better analysts.  My advice to you is:

Don't be overwhelmed when you see code...try opening the code in a text viewer and just reading it.  Sure, you may not understand Perl or C or Python, but most times, you don't need to understand the actual code to figure out what it's doing.

Don't be afraid to reach out for help and ask a question.  Have a question about some code?  Reach out to the author.  Many times, folks crowdsource their questions, reaching to the "community" as a whole, and that may work for some.  However, I've had much better success by reaching directly to the coder...I can usually find their contact info in the headers of the code they wrote.  Who better to answer a question about some code than the person who wrote it?

Don't be afraid to ask for assistance in writing or modifying code.  From the very beginning (circa 2008), I've had a standing offer to modify RegRipper plugins or create custom plugins...all you gotta do is ask (provide a concise description of what's needed, and perhaps some sample data...).  That's it.  I've found that in most cases, getting an update/modification is as simple as asking.

Make the effort to learn some basic coding, even if it's batch scripting.  Program flow control structures are pretty consistent...a for loop is a for loop.  Just understanding programming can be so much more valuable than simply allowing you to write a program.

Wednesday, July 10, 2013

HowTo: Track Lateral Movement

A reader recently commented and asked that the topic of scoping an incident and tracking lateral movement be addressed.  I've performed incident response for some time and been involved in a wide variety of cases, so I thought I'd present something about the types of lateral movement I've encountered and how these types of cases were scoped. Some types of lateral movement are easier to scope than others, so YMMV.

When there's been lateral movement there are usually two systems involved; system A (the source) and system B (the destination).  Throughout this post, I'll try to point out which system the artifacts would appear on, as this can be a very important distinction.

SQL Injection
SQL injection (SQLi) is an interesting beast, as this exploit, if successful, allows an unprivileged user to access a system on your network, often with System-level privileges.  Not all SQLi cases necessarily involve lateral movement, specifically if the web and database servers are on the same system.  However, I have been involved in cases in which the web server was in the DMZ, but the database server was situated on the internal infrastructure.

After gaining access to a system via this type of exploit, the next thing we tended to see was that a reverse shell tool was downloaded and launched on the system, providing shell-based access to the attacker.  Very often, this can be achieved through the use of a modified version of VNC, of which there are several variants (OpenVNC, TightVNC, RealVNC, etc.). It was usually at this point that the intruder was able to orient themselves, perform recon, and then 'hop' to other systems, as necessary.

Tools: Editor, coding skillz

I remember in one case, an intruder had installed a reverse shell on a system (we found this later during our investigation) and had gone undetected until they found that they were on the system at the same time as one of the admins...at which point, they opened up Notepad and typed a message to the admin. It was only after this event that we were called. ;-)

Terminal Services
I was once engaged in a case where an employee's home system had been compromised, and a keystroke logger installed.  The intruder found that the user used their home system to access their employer's infrastructure via Terminal Services, and took advantage of this to use the stolen credentials to access the infrastructure themselves.  The access was via RDP, and after initial access to the infrastructure, the intruder continued to use RDP to access other systems.  Further, all of the systems that the intruder logged into had never been accessed with those credentials.  As such, it was a simple matter to examine a few artifacts on each of the "compromised" systems in turn, and then to verify other systems on which the user profile was found.

The systems that we dealt with were a mix of Windows XP, and Windows 2000 and 2003 for servers.  As such, the artifacts we were interested in were found in the user profile's NTUSER.DAT hive file.  If the workstation systems had been Windows 7 systems, we would have included Jump Lists (specifically, those with the AppID specific to Terminal Services) in our examination.

The system A artifacts would include Registry values, and for Windows 7 systems, Jump Lists.  Depending upon the actions the user took (i.e., double-clicking a file), there may also be shortcuts/LNK files that point to the remote system.

System B artifacts would include logins found in the Security Event Log...on Win2003 systems, look for event ID 528 type 10 logins.  On Win2008 R2, look for event ID 4624 events, as well as TerminalServices-LocalSessionManager events with ID 21.
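If you've already exported the Security Event Log records (via Logparser to CSV, for example), filtering for these logins is only a few lines of code.  The following is a minimal sketch, not a finished tool; the record layout (a dict with 'event_id' and 'logon_type' keys) is a hypothetical one I've made up for illustration, and I'm assuming that the type 10 (RemoteInteractive) filter applies to the 4624 events as well as the 528 events.

```python
# Event IDs taken from the text above; logon type 10 is RemoteInteractive (RDP).
RDP_LOGON_IDS = {528, 4624}
RDP_LOGON_TYPE = 10

def rdp_logons(records):
    """Yield records that look like Terminal Services/RDP logons.

    Each record is assumed to be a dict with 'event_id' and 'logon_type'
    keys (a hypothetical layout, e.g. rows loaded from a CSV export).
    """
    for rec in records:
        if (int(rec["event_id"]) in RDP_LOGON_IDS
                and int(rec.get("logon_type", -1)) == RDP_LOGON_TYPE):
            yield rec

# Made-up sample records, just to exercise the filter.
sample = [
    {"event_id": 528, "logon_type": 10, "user": "jdoe"},
    {"event_id": 528, "logon_type": 2, "user": "admin"},
    {"event_id": 4624, "logon_type": "10", "user": "jdoe"},
    {"event_id": 7036, "user": ""},
]
hits = list(rdp_logons(sample))
print(len(hits), "possible RDP logons")
```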

Tools: RegRipper tsclient.pl plugin, Jump List parser (system A), Logparser (system B)

Mapping Shares
While the interaction with shares is somewhat limited, it can still be seen as a form of "lateral movement".  Shares mapped via the Map Network Drive wizard appear within the user's Registry hive, in the Map Network Drive MRU key, on system A.

On system B, the connection would appear as a login, accessing resources in the Event Log.

As Event Logs tend to rotate, the artifacts on system A may persist for much longer than those on system B.

It is important to make the distinction between GUI and CLI artifacts. Many of the artifacts that we see in the user's hive that are associated with accessing other systems are the result of the user interacting via the Windows Explorer shell, which is why the path where they can be found is Windows\CurrentVersion\Explorer.  Access via CLI tools such as mapping/accessing a remote share via net.exe does not produce a similar set of artifacts.

Tools: RegRipper mndmru.pl plugin (system A); Logparser (system B)

Shellbags
It's funny that when I sit down to outline some of these HowTo blog posts, I start by typing in subheaders (like this one, which I then italicize) for topics to include in the overall subject of the post.  It's interesting that the shellbags artifacts tend to appear in so many of these posts!  Similar to mapping shares, these artifacts can provide indications of access to other systems.  There are specific top-level shell items that indicate network-based resources, and will illustrate (on system A) a user accessing those remote resources.

Tools: RegRipper shellbags.pl plugin (system A)

Scheduled Tasks
Scheduled Tasks can easily be created through the use of one of two CLI tools on systems:  schtasks.exe and at.exe.  Both tools utilize switches for creating scheduled tasks on remote systems.  On system A, you may find indications of the use of these CLI applications in the Prefetch files or other artifacts of program execution.  On system B, you may find indications of a service being started in the System Event Log (event ID 7035/7036), and on WinXP and 2003 systems, you may find indications of the task being run in the SchedLgU.txt file (although this file, like the Event Logs, tends to roll-over...).  On Windows 2008 R2, you should look to the Microsoft-Windows-TaskScheduler/Operational Event Log...event ID 102 indicates a completed task, and event ID 129 indicates that a task process was created.

PSExec and other similar services (I've seen rcmd and xcmd used) can be used to execute processes on remote systems.  The artifacts you would look for would be similar to those for Scheduled Tasks, with the exception of the specific event records on system B.

Artifacts on system A might include Prefetch files and other artifacts of program execution.

Artifacts on system B might include System Event Log entries, specifically those with event source Service Control Manager and event IDs of 7035 (service sent a start command) and 7036 (service entered a running state).

Tools: LogParser, TZWorks evtwalk or evtx_view (system B)

Testing, and Artifacts
In order to see for yourself what these artifacts "look like", try running these tools on your own.  You can do so fairly easily by setting up a virtual system and using any of these methods to access the "remote" system.

Monday, July 08, 2013

HowTo: Determine User Access To Files

Sometimes during an examination, it is important for the analyst to determine files that the user may have accessed, or at least had knowledge of.  There are a number of artifacts that can be used to determine which files a user accessed.  As of Vista, the Windows operating systems, by default, do not update the last accessed times on files when normal system activity occurs, so some other method of determining the user (or process) that accessed files, and when, needs to be developed.  Fortunately, Windows systems maintain a good deal of information regarding files users have accessed, and some artifacts may be of value, particularly in the face of the use of anti- or counter-forensics techniques.

Files on the Desktop
By default, Windows does not normally place files on a user's desktop.  Most often, installation programs provide an option for putting a shortcut to the application on the desktop, but files themselves usually end up on the desktop as a result of direct and explicit actions taken by the user.  The presence of the files themselves on the desktop can be correlated with other artifacts in order to determine when the user may have accessed those files.

Recycle Bin
Clearly, if a user deleted files, they had knowledge of them, and accessed them to the point that they deleted those files.  However, keep in mind that this is not definitive...for example, if the user uses the Shift+Del key combination, or the NukeOnDelete value has been set, the files will bypass the Recycle Bin.  As such, an empty Recycle Bin should not lead to the assumption that the user took the explicit action to empty it.  It's trivial to check artifacts that can significantly impact your findings, so do so.

Tools: Various, including custom Perl script (recbin.pl)

LNK Files/Jump Lists
Most analysts are familiar with Windows shortcut/LNK files, particularly as a means of demonstrating user knowledge of and access to files.  LNK files within the user's Windows\Recent and Office\Recent folders are created automatically when the user double-clicks the files (and the application to view the file is launched), and will contain information about the location of the file, etc.

AutomaticDestinations Jump Lists are similarly created automatically by the operating system, as a result of user actions.  These files consist of a compound binary format, containing two types of streams...LNK streams, and a DestList stream, both of which are documented and easily parsed.  The DestList stream serves the purpose of an MRU list.

Browser History
Browser history, in general, illustrates access to pages found on the Internet, but can also provide information about other files that a user has accessed.  Browser history, particularly for IE, can contain file:/// entries, indicating access to local, rather than web-based, resources.

If you find that the LocalService, NetworkService, or Default User accounts have an IE browser history (i.e., the index.dat is populated...), this may be an indication of a process running with System level privileges that is accessing the Internet via the WinInet API.

When it comes to browser history, you will also want to look at file downloads.  Beginning with Windows XP SP2, files downloaded via IE or as Outlook attachments had a Zone Identifier NTFS alternate data stream attached to them by the operating system.  I have seen this, as well, on Windows 7 systems, but irrespective of the browser used.

To illustrate this, open a command prompt (should open to your user profile directory) and type the following command:

dir /s /r * | find "$data" /i | more

This can be very revealing.  Of course, it doesn't illustrate where the files are located, as you're applying a filter to the output of 'dir' and only showing specific lines, but it can provide a good indication of files that you've downloaded.  Also, I'd be very wary of any Zone Identifier ADSs that are more than 26 or 28 bytes in size.
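As a rough illustration of that size check, here's a minimal Python sketch that takes the raw contents of a Zone Identifier ADS and returns the zone ID, along with a flag for streams larger than expected.  It assumes the usual "[ZoneTransfer]" / "ZoneId=n" text layout; the function name and the 28-byte default are my own choices for the example, and reading the stream off of the file system is left out.

```python
def parse_zone_identifier(raw, max_expected=28):
    """Parse the raw bytes of a Zone Identifier ADS.

    Returns (zone_id, oversized): zone_id is an int, or None if no
    ZoneId line is present; oversized is True when the stream is
    larger than max_expected bytes.
    """
    text = raw.decode("ascii", errors="replace")
    zone_id = None
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.strip().lower() == "zoneid" and value.strip().isdigit():
            zone_id = int(value.strip())
            break
    return zone_id, len(raw) > max_expected

# A typical 26-byte stream, and one with extra data appended (made up).
typical = b"[ZoneTransfer]\r\nZoneId=3\r\n"
padded = typical + b"extra data appended to the stream"
print(parse_zone_identifier(typical))
print(parse_zone_identifier(padded))
```

An oversized stream isn't proof of anything by itself, but it's a quick way to decide which ADSs deserve a closer look.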

Prefetch Files
It's common knowledge amongst analysts that application prefetch files provide information regarding when an application was run, and how many times that application was run.  These files also contain module paths, which are essentially embedded strings that point to various files, usually DLLs.  Many times, however, the module strings may point to other files.  For example, an application prefetch file for IE will include strings pointing to the index.dat files within the user profile (for the user who launched it).  The application prefetch file for sms.exe in Lance Mueller's first practical contains a path to "del10.bat", as well as a path to the sms.exe file itself, both of which are very telling.  I've seen application prefetch files for keystroke loggers that have contained the path to the file where the keystrokes are recorded, which just goes to show how useful these files can sometimes be, and that there's a great deal of information that can be pulled from these files.  Application prefetch files can be tied to applications launched by a particular user (i.e., correlate the last run time embedded within the file to program execution information for the user), which can then provide information regarding how the files were accessed.
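To show how those embedded strings can be pulled out, here's a minimal Python sketch that extracts runs of UTF-16LE text from a buffer and filters them by extension.  This is a generic strings-style approach, not how pref.pl works, and it assumes an uncompressed prefetch file (as on XP through Windows 7) where the paths are stored as UTF-16LE; the sample buffer and the folder names in it are made up for illustration.

```python
import re

def utf16_strings(data, min_len=6):
    """Return runs of at least min_len printable ASCII chars stored as UTF-16LE."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]

def paths_with_extension(data, ext):
    """Return the extracted strings that end with the given extension."""
    ext = ext.lower()
    return [s for s in utf16_strings(data) if s.lower().endswith(ext)]

# Made-up buffer standing in for part of a prefetch file.
sample = (b"\x11\x00\x00\x00"
          + r"\DEVICE\HARDDISKVOLUME1\WINDOWS\SYSTEM32\NTDLL.DLL".encode("utf-16-le")
          + b"\x00\x00"
          + r"\DEVICE\HARDDISKVOLUME1\TEMP\DEL10.BAT".encode("utf-16-le")
          + b"\x00\x00")

for path in paths_with_extension(sample, ".bat"):
    print(path)
```

Filtering on anything that isn't a .dll or .exe is often a quick way to surface the more interesting entries.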

Tools: Custom Perl script (pref.pl)

The Registry is well-known for being a veritable treasure trove of information, particularly when combined with other analysis techniques (timeline analysis, parsing VSCs, etc.).  While not everything can be found in the Registry (for example, Windows does not maintain records of files copied...), there is a great deal that can be found within the Registry.

Most of us are familiar with the RecentDocs key within the user hive.  This is one of the classic MRU keys, as the key itself and all of its subkeys contain values, and on Windows 7 systems, one of the values is named MRUListEx, and contains the MRU order of the other values. The other values beneath each key are numbered, and the data is a binary format that contains the name of the file accessed, as well as the name of an associated shortcut/LNK file.

Each of the subkeys beneath this key is named for a file extension, and as such, the subkeys not only provide information about which files the user may have accessed, but also which applications the user may have had installed.

A means for determining the possible use of counter-forensics techniques is to compare the list of value names against the contents of the MRUListEx value; numbers in this value that do not have corresponding value names may indicate attempts to delete individual values.
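That comparison is easy to script.  The following is a minimal Python sketch (not the recentdocs.pl code) that parses the MRUListEx data, a sequence of little-endian 32-bit integers terminated by 0xFFFFFFFF, and reports any entries that have no corresponding numbered value.  The function names and the sample data are made up for illustration.

```python
import struct

def parse_mrulistex(data):
    """Return the item numbers, in MRU order, from raw MRUListEx data."""
    order = []
    usable = len(data) - len(data) % 4   # ignore any trailing partial DWORD
    for (num,) in struct.iter_unpack("<I", data[:usable]):
        if num == 0xFFFFFFFF:            # terminator
            break
        order.append(num)
    return order

def missing_values(mrulistex_data, value_names):
    """List MRUListEx entries that have no matching numbered value name."""
    present = {int(n) for n in value_names if n.isdigit()}
    return [n for n in parse_mrulistex(mrulistex_data) if n not in present]

# Made-up example: MRUListEx references items 3, 1, 0, 2, but value "2" is gone.
raw = struct.pack("<5I", 3, 1, 0, 2, 0xFFFFFFFF)
names = ["MRUListEx", "0", "1", "3"]
print(missing_values(raw, names))
```

Any number that comes back from missing_values() is worth a second look, as it may point to a value that was deleted individually.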

Tools: RegRipper recentdocs.pl plugin

The user's ComDlg32 key contains information related to common dialogs used within Windows, and can provide specific indications regarding how the user interacted with the files in question.

Some of the subkeys beneath the ComDlg32 key are...

As described in a previous post, this key provides indications of program execution.

LastVisitedPidlMRU and LastVisitedPidlMRULegacy
These keys contain MRUListEx values, as well as numbered values.  The numbered values contain the name of the executable from which the common dialog was launched, followed by a shell item ID list of the path that was accessed.

OpenSavePidlMRU
This key is found on Vista+ systems, and corresponds to the OpenSaveMRU key found on XP systems.  The subkeys beneath this key correspond to the extensions of opened or saved files.  Each of the numbered values beneath the subkeys consists of a shell item ID list, and there's an MRUListEx value that provides the MRU functionality for each key.

On Windows XP, value data beneath the ComDlg32 subkeys consists of ASCII strings, whereas on Vista and Windows 7, the value data consists of shell items.  This means that in some cases, those data structures will contain time stamps.  However, to be of use during an examination, the analyst needs to understand how those time stamps are created, and maintained.

Tools: RegRipper comdlg32.pl plugin

MS Office File/Place MRU Values
Each of the applications within MS Office 2010 maintains an MRU list of not only files accessed, but places from which files have been accessed (in separate keys).  In addition to the paths to the files or folders, respectively, the value string data contain entries that look like, "[T01CD76253F25ECD0]", which is a string representation of a 64-bit FILETIME time stamp.  As such, these keys aren't MRU keys in the more traditional sense of having an MRUList or MRUListEx value.
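Converting that string representation into something readable is straightforward.  Here's a minimal Python sketch (not the office2010.pl code) that pulls the 16 hex digits out of the "[T...]" token and treats them as a FILETIME, i.e., the number of 100-nanosecond intervals since 1 Jan 1601 UTC.  The token is the one shown above; the file path in the sample string is made up for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(filetime):
    """Convert a 64-bit FILETIME (100 ns ticks since 1601-01-01 UTC)."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)

def office_mru_times(value_data):
    """Find every [T<16 hex digits>] token in an MRU string and convert it."""
    return [filetime_to_datetime(int(tok, 16))
            for tok in re.findall(r"\[T([0-9A-Fa-f]{16})\]", value_data)]

# The [T...] token is from the text above; the path is hypothetical.
sample = r"[T01CD76253F25ECD0]C:\docs\report.docx"
for ts in office_mru_times(sample):
    print(ts.isoformat())
```

For the token above, this works out to 9 Aug 2012, 11:51:00 UTC, which is exactly the sort of value you'd want to drop into a timeline.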

Tools: RegRipper office2010.pl plugin

A while back, Andrew Case pointed out this interesting artifact.  When an MS Office document is downloaded or accessed from the network, there's a yellow bar that appears at the top of the application window, just below the menu bar.  Most often, this bar contains a button, which the user must click in order to enable editing of the document.  Within the TrustRecords key, the value names are the paths and names of the files accessed, and the first 8 bytes of the binary data appear to correlate to the time when the user clicked the "Enable Editing" button.

Tools: RegRipper trustrecords.pl plugin

Application-specific MRUs
A number of file viewers (Adobe Reader, MS Paint, etc.) maintain their own MRU lists.  Most often when interacting with the application, if you click on File in the menu bar of the app, the drop-down menu will contain (usually toward the bottom) a list of recently accessed files.  Many times, that information can be found in the Registry.

Tools:  RegRipper applets.pl and adoberdr.pl plugins

On Windows 8, the Photos key in the user's USRCLASS.DAT hive is used to track photos opened via the Photos app on the Windows 8 desktop (many thanks to Jason Hale for sharing his research on this topic).

Tools: RegRipper photos.pl plugin

Ah, there they are again!  Shellbags...can shellbags artifacts be used to help determine files that a user may have accessed?  Yes, they can.  On Windows XP, correlating the shellbags artifacts includes mapping the NodeSlot value in the BagMRU key to the Bags keys, where in some cases you may find what amounts to a directory listing of files; this does not appear to be the case on Windows 7.  However, the shellbags artifacts can illustrate directories that the user accessed, both locally and on network resources, and can therefore provide some information regarding files that the user may have accessed.

In addition to folders on local and removable drives, shellbags artifacts can also illustrate access to mapped network drives, Internet resources (i.e., accessing FTP sites via Windows Explorer), as well as to zipped archives.  Most of us don't usually think twice when we click on a zipped archive and it opens up just like a regular folder, in an Explorer window...however, it is important to note that this does appear as a shellbag artifact.

Tools: RegRipper shellbags.pl plugin

Similar to the shellbags artifacts, the TypedPaths key in the user's NTUSER.DAT hive maintains a list of folders that the user accessed; however, for this artifact, the paths were typed into the Windows Explorer Address Bar.

Users can also disable this feature, so if you find no values in the TypedPaths key, check for the AutoSuggest value.

Tools: RegRipper typedpaths.pl plugin

With Windows XP, user search terms were maintained in the ACMru key; on Windows 7, they're found in the WordWheelQuery key.  These values can provide information regarding what the user was looking for, and provide some indications of files they may have accessed (often determined through timeline analysis).

Tools: RegRipper acmru.pl and wordwheelquery.pl plugins

A good deal of the information that provides indications of a user's access to files has time stamps associated with it, and as such, can be included in a timeline.  This timeline can provide context and granularity to your analysis, particularly when correlated with other artifacts.