Friday, November 18, 2016

The Joy of Open Source

Not long ago, I was involved in an IR engagement where an intruder had exploited a web-based application on a Windows 2003 system, created a local user account, accessed the system via Terminal Services using that account, run tools, and then deleted the account that they'd created before continuing on using accounts and stolen credentials.

The first data I got to look at was the Event Logs from the system; using evtparse, I created a mini-timeline and got a pretty decent look at what had occurred on the system.  The client had enabled process tracking so I could see the Security/592 and ../593 events, but unfortunately, the additional Registry value had not been created, so we weren't getting full command lines in the event records.  From the mini-timeline, I could "see" the intruder creating the account, using it, and then deleting it, all based on the event record source/ID pairs.

For account creation:
Security/624 - user account created
Security/628 - user account password set
Security/632 - member added to global security group
Security/642 - user account changed

For account deletion:
Security/630 - user account deleted
Security/633 - member removed from global security group
Security/637 - member removed from local security group
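
The source/ID pairs above lend themselves to a simple lookup table.  What follows is a minimal Python sketch, not the tooling used on the engagement; it assumes the events have already been reduced to (timestamp, "Source/ID") tuples, and the function name and sample data are hypothetical.

```python
# Source/ID pairs from the lists above (Windows 2003 Security Event Log).
ACCOUNT_CREATION = {
    "Security/624": "user account created",
    "Security/628": "user account password set",
    "Security/632": "member added to global security group",
    "Security/642": "user account changed",
}
ACCOUNT_DELETION = {
    "Security/630": "user account deleted",
    "Security/633": "member removed from global security group",
    "Security/637": "member removed from local security group",
}


def label_account_events(events):
    """Return (timestamp, pair, phase, meaning) for each recognized event.

    `events` is an iterable of (timestamp, "Source/ID") tuples; pairs that
    are not in either table are skipped.
    """
    labeled = []
    for timestamp, pair in events:
        if pair in ACCOUNT_CREATION:
            labeled.append((timestamp, pair, "creation", ACCOUNT_CREATION[pair]))
        elif pair in ACCOUNT_DELETION:
            labeled.append((timestamp, pair, "deletion", ACCOUNT_DELETION[pair]))
    return labeled


# Hypothetical sample data, not taken from the case.
sample = [
    (1000, "Security/624"),
    (1001, "Security/628"),
    (1500, "Security/592"),
    (2000, "Security/630"),
]
for row in label_account_events(sample):
    print(row)
```

Keying on the full source/ID pair, rather than the ID alone, keeps the table from matching unrelated records that happen to share an event ID.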

Once I was able to access an image of the system, a visual review of the file system (via FTK Imager) confirmed that the user profile was not visible within the active file system.  Knowing that the account had been a local account, I extracted the SAM Registry hive, and ran regslack.exe against it...and could clearly see two keys (username and RID, respectively), and two values (the "F" and "V" values) that had been deleted and were currently "residing" within unallocated space in the hive file.  What was interesting was that the values still included their complete binary data.

I was also able to see one of the deleted keys via RegistryExplorer.

[Figure: SAM hive open in RegistryExplorer]

Not that I needed to confirm it, but I also ran the RegRipper del.pl plugin against the hive and ended up finding indications of two other deleted keys, in addition to the previously-observed information.

[Figure: Output of RR del.pl plugin (excerpt)]

Not only that, but the plugin retrieves the full value data for the deleted values; as such, I was able to copy (via Notepad++) code for parsing the "F" and "V" value data out of the samparse.pl plugin and paste it into the del.pl plugin for temporary use, so that the binary data is parsed into something intelligible.
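
To give a sense of what "parsed into something intelligible" can look like, here is a rough Python sketch of decoding a few fields from "F" value data.  It is not the samparse.pl code; the offsets are an assumption based on commonly published descriptions of the structure and should be checked against the plugin, and the function names are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a 64-bit FILETIME (100 ns ticks since 1601-01-01 UTC)."""
    if filetime == 0:
        return None  # zero is treated here as "never set"
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def parse_f_value(data):
    """Decode selected fields from SAM "F" value data.

    ASSUMPTION: the offsets below follow commonly published descriptions
    of the F value layout; verify them against samparse.pl before use.
    """
    if len(data) < 68:
        raise ValueError("F value data is too short")
    (last_login,) = struct.unpack_from("<Q", data, 8)
    (pwd_reset,) = struct.unpack_from("<Q", data, 24)
    (last_failed,) = struct.unpack_from("<Q", data, 40)
    (rid,) = struct.unpack_from("<I", data, 48)
    (login_count,) = struct.unpack_from("<H", data, 66)
    return {
        "last_login": filetime_to_datetime(last_login),
        "pwd_reset": filetime_to_datetime(pwd_reset),
        "last_failed_login": filetime_to_datetime(last_failed),
        "rid": rid,
        "login_count": login_count,
    }
```

Passing in the raw bytes recovered for a deleted "F" value would return a dictionary of timestamps and counters that can go straight into a timeline.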

The del_tln.pl plugin (output below) made it relatively simple to add the deleted key information to a timeline, so that additional context would be visible.


[Figure: Output of RR del_tln.pl plugin]

If nothing else, this really illustrates one of the valuable aspects of open source software.  With relatively little effort and time, I was able to incorporate findings directly into my analysis, adding context and clarity to that analysis.  I've modified Perl and Python scripts to meet my own needs, and this is just another example of being able to make quick and easy changes to the available tools in order to meet immediate analysis needs.

Speaking of which, I've gone back and picked up something of a side project that I'd started a bit ago, based on a recent suggestion from a good friend. As I've started to dig into it a bit more, I've run into some challenges, particularly when it comes to "seeing" the data, and translating it into something readable.  Where I started with a hex editor and highlighting a DWORD value at a time, I've ended up writing and cobbling together bits of (open source) code to help me with this task. At first glance, it's like having a bunch of spare parts lying out on a workbench, but closer inspection reveals that it's all basically the same stuff, just being used in different ways.  What started a number of years ago with the header files from Peter Nordahl's ntchpwd utility became the first Registry parsing code that I wrote, which I'm still using to this day.

Take-Aways
Some take-aways from this experience...

When a new version of Windows comes out, everyone wants to know what the new 'thing' is...what's the latest and greatest artifact?  But what about the stuff that always works?  What about the old stuff that gets used again and again, because it works?

Understanding the artifact cluster associated with certain actions on the various versions of Windows can help in recognizing those actions when you don't have all of the artifacts available.  Using just the event record source/ID pairs, we could see the creation and deletion of the user account, even if we didn't have process information to confirm it for us.  In addition, the account deletion occurred through a GUI tool (mmc.exe running compmgmt.msc) and all the process creation information would show us is that the tool was run, not which buttons were pushed.  Even without the Event Log record metadata, we still had the information we extracted from unallocated space within the SAM hive file.

Having access to open source tools means that things can be tweaked and modified to suit your needs.  Don't program?  No problem.  Know someone who does?  Are you willing to ask for help?  No one person can know everything, and sometimes it's helpful to go to someone and get a fresh point of view.

Saturday, October 29, 2016

Ransomware

Ransomware
I think that we can all agree, whether you've experienced it within your enterprise or not, ransomware is a problem.  It's one of those things that you hope never happens to you, that you hope you never have to deal with, and you give a sigh of relief when you hear that someone else got hit.

The problem with that is that hoping isn't preparing.

Wait...what?  Prepare for a ransomware attack?  How would someone go about doing that?  Well, consider the quote from the movie "Blade":

Once you understand the nature of a thing, you know what it's capable of.

This is true for ransomware, as well as Deacon Frost.  If you understand what ransomware does (encrypts files), and how it gets into an infrastructure, you can take some simple steps (relative to your infrastructure and culture, of course) to prepare for such an incident.  Interestingly enough, many of these steps are the same ones that you'd use to prepare for any type of incident.

First, some interesting reading and quotes...such as from this article:

The organization paid, and then executives quickly realized a plan needed to be put in place in case this happened again. Most organizations are not prepared for events like this that will only get worse, and what we see is usually a reactive response instead of proactive thinking.

....and...

I witnessed a hospital in California be shut down because of ransomware. They paid $2 million in bitcoins to have their network back.

The take-aways are "not prepared" and "$2 million"...because it would very likely have cost much less than $2 million to prepare for such attacks.

The major take-aways from the more general ransomware discussion should be that:

1.  Ransomware encrypts files.  That's it.

2.  Like other malware, those writing and deploying ransomware work to keep their product from being detected.

3.  The business model of ransomware will continue to evolve as methods are changed and new methods are developed, while methods that continue to work will keep being used.

Wait...ransomware has a business model?  You bet it does!  Some ransomware (Locky, etc.) is spread either through malicious email attachments, or links that direct a user's browser to a web site.  Anyone who does process creation monitoring on an infrastructure likely sees this.  In a webcast I gave last spring (as well as in subsequent presentations), I included a slide that illustrated the process tree of a user opening an email attachment, and then choosing to "Enable Content", at which point the ransomware took off.

Other ransomware (Samas, Le Chiffre, CryptoLuck) is deployed through a more directed means, bypassing email altogether.  An intruder infiltrates an infrastructure through a vulnerable perimeter system, RDP, TeamViewer, etc., and deploys the ransomware in a dedicated fashion.  In the case of Samas ransomware, the adversary appears to have spent time elevating privileges and mapping the infrastructure in order to locate systems to which they'd deploy the ransomware.  We've seen this in the timeline, where the adversary would, on one day, simply blast out the ransomware to a large number of systems (most appeared to be servers).

The Ransomware Economy
There are a couple of other really good posts on the Secureworks blog regarding the Samas ransomware (here, and here).  The second blog post, by Kevin Strickland, talks about the evolution of the Samas ransomware; not long ago, I ran across this tweet that let us know that the evolution that Kevin talked about hasn't stopped.  This clearly illustrates that developers are continuing to "provide a better (i.e., less detectable) product", as part of the economy of ransomware.  The business models that are implemented in the ransomware economy will continue to evolve, simply because there is money to be had.

There is also a ransomware economy on the "blue" (defender) side, albeit one that is markedly different from the "red" (attacker) side.

The blue-side economy does not evolve nearly as fast as the red-side.  How many victims of ransomware have not reported their incident to anyone, or simply wiped the box and moved on?  How many of those with encrypted files have chosen to pay the ransom rather than pay to have the incident investigated?  By the way, that's part of the red-side economy...make it more cost effective to pay the ransom than the cost of an investigation.

As long as the desire to obtain money is stronger than the desire to prevent that from happening, the red-side ransomware economy will continue to outstrip that of the blue-side.

Preparation
Preparation for a ransomware attack is, in many ways, no different from preparing for any other computer security incident.

The first step is user awareness.  If you see something, say something.  If you get an odd email with an attachment that asks you to "enable content", don't do it!  Instead, raise an alarm, say something.

The second step is to use technical means to protect yourself.  We all know that prevention works for only so long, because adversaries are much more dedicated to bypassing those prevention mechanisms than we are to paying to keep those protection mechanisms up to date.  As such, augmenting those prevention mechanisms with detection can be extremely effective, particularly when it comes to definitively nailing down the initial infection vector (IIV).  Why is this important?  Well, in the last couple of months, we've not only seen the delivery mechanism of familiar ransomware changing, but we've also seen entirely new ransomware variants infecting systems.  If you assume that the ransomware is getting in as an email attachment when it isn't, then you're going to direct resources to something that isn't going to be at all effective.

Case in point...I recently examined a system infected with Odin Locky, and was told that the ransomware could not have gotten in via email, as a protection application had been purchased specifically for that purpose.  What I found was that the ransomware did, indeed, get on the system via email; however, the user had accessed their AOL email (bypassing the protection mechanism), and downloaded and executed the malicious attachment.

Tools such as Sysmon (or anything else that monitors process creation) can be extremely valuable when it comes to determining the IIV for ransomware.  Many variants will delete themselves after files are encrypted, (attempt to) delete VSCs, etc., and being able to track the process train back to its origin can be extremely valuable in preventing such things in the future.  Again, it's about dedicating resources where they will be the most effective.  Why invest in email protections when the ransomware is getting on your systems as a result of a watering hole attack, or strategic web compromise?  Or what if it's neither of those?  What if the system had been compromised, a reverse shell (or some other access method, such as TeamViewer) installed and the system infected through that vector?
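
As a rough illustration of tracking a process back to its origin, here is a small Python sketch.  It assumes the process creation records have already been exported into dictionaries with hypothetical "pid", "ppid" and "image" keys; the sample process names are made up for the example and are not from any case.

```python
def process_chain(records, pid):
    """Walk from `pid` up through its parents; return image names, child first.

    The walk stops at the first parent with no record, or if a PID repeats
    (PIDs get reused, so real data should also be bounded by time).
    """
    by_pid = {rec["pid"]: rec for rec in records}
    chain = []
    seen = set()
    while pid in by_pid and pid not in seen:
        seen.add(pid)
        rec = by_pid[pid]
        chain.append(rec["image"])
        pid = rec["ppid"]
    return chain


# Hypothetical records for illustration only.
records = [
    {"pid": 100, "ppid": 4, "image": "explorer.exe"},
    {"pid": 200, "ppid": 100, "image": "outlook.exe"},
    {"pid": 300, "ppid": 200, "image": "winword.exe"},
    {"pid": 400, "ppid": 300, "image": "payload.exe"},
]
print(" <- ".join(process_chain(records, 400)))
```

Reading the chain from the child back to the first recorded ancestor is what points at the IIV; in this made-up example, the trail leads back through a document opened from the mail client.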

Ransomware will continue to be an issue, and new means for deploying are being developed all the time.  The difference between ransomware and, say, a targeted breach is that you know almost immediately when you've had files encrypted.  Further, during targeted breaches, the adversary will most often copy your critical files; with ransomware, the files are made unavailable to anyone.  In fact, if you can't decrypt/recover your files, there's really no difference between ransomware and secure deletion of your files.

We know that on the blue-side, prevention eventually fails.  As such, we need to incorporate detection into our security posture, so that if we can't prevent the infection or recover our files, we can determine the IIV for the ransomware and address that issue.

Addendum, 30 Oct: As a result of an exchange with (and thanks to) David Cowen, I think that I can encapsulate the ransomware business model to the following statement:

The red-side business model for ransomware converts a high number of low-value, blue-side assets into high-value attacker targets, with a corresponding high ROI (for the attacker).

What does this mean?  I've asked a number of folks who are not particularly knowledgeable in infosec if there are any files on their individual systems without which they could simply not do their jobs, or without access to which their daily work would significantly suffer.  So far, 100% have said, "yes".  Considering this, it's abundantly clear that attackers have their own reciprocal Pyramid of Pain that they apply to defenders; that is, if you want to succeed (i.e., get paid), you need to impact your target in such a manner that it is more cost-effective (and less painful) to pay the ransom than it is to perform any alternative.  In most cases, the alternative amounts to changing corporate culture.




AmCache.hve

I was working on an incident recently, and while extracting files from the image, I noticed that there was an AmCache.hve file.  Not knowing what I would find in the file, I extracted it to include in my analysis.  As I began my analysis, I found that the system I was examining was a Windows Server 2012 R2 Standard system.  This was just one system involved in the case, and I already had a couple of indicators.

As part of my analysis, I parsed the AppCompatCache value and found one of my indicators:

SYSVOL\downloads\malware.exe  Wed Oct 19 15:35:23 2016 Z

I was able to find a copy of the malware file in the file system, so I computed the MD5 hash, and pulled the PE compile time and interesting strings out of the file.  The compile time was  9 Jul 2016, 11:19:37 UTC.
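
For reference, the PE compile time is the TimeDateStamp field in the COFF file header.  A minimal Python sketch of pulling it out of a file's bytes follows; the function name is hypothetical, and this is not the tool used in the case.

```python
import struct
from datetime import datetime, timezone


def pe_compile_time(data):
    """Return the COFF TimeDateStamp of a PE file as a UTC datetime."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF header follows the signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    (timestamp,) = struct.unpack_from("<I", data, e_lfanew + 8)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)
```

To use it, read the file in binary mode and pass the bytes to pe_compile_time().  Keep in mind that the field is set at link time and can be altered, so it is best treated as one more data point rather than ground truth.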

I then parsed the AmCache.hve file and searched for the indicator, and found:

File Reference : 28000017b6a
LastWrite      : Wed Oct 19 06:07:02 2016 Z
Path           : C:\downloads\malware.exe
SHA-1          : 0000
Last Mod Time2 : Wed Aug  3 13:36:53 2016 Z

File Reference : 3300001e39f
LastWrite      : Wed Oct 19 15:36:07 2016 Z
Path           : C:\downloads\malware.exe
SHA-1          : 0000
Last Mod Time2 : Wed Oct 19 15:35:23 2016 Z

File Reference : 2d000017b6a
LastWrite      : Wed Oct 19 06:14:30 2016 Z
Path           : C:\Users\\Desktop\malware.exe
SHA-1          : 0000
Last Mod Time  : Wed Aug  3 13:36:54 2016 Z
Last Mod Time2 : Wed Aug  3 13:36:53 2016 Z
Create Time    : Wed Oct 19 06:14:20 2016 Z
Compile Time   : Sat Jul  9 11:19:37 2016 Z

All of the SHA-1 hashes were identical across the three entries.  Do not ask for the hashes...I'm not going to provide them, as this is not the purpose of this post.

What this illustrates is the value of what can be derived from the AmCache.hve file.  Had I not been able to retrieve a copy of the malware file from the file system, I would still have a great deal of information about the file, including (but not limited to) the fact that the same file was on the file system in three different locations.  In addition, I would also have the compile time of the executable file.
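
Spotting the same file in several places comes down to grouping the parsed entries by hash.  Here is a short Python sketch, assuming the AmCache output has already been parsed into dictionaries with hypothetical "sha1" and "path" keys; the hashes and paths in the sample are placeholders, not values from the case.

```python
from collections import defaultdict


def paths_by_hash(entries):
    """Map each SHA-1 to the sorted list of distinct paths recorded for it."""
    grouped = defaultdict(set)
    for entry in entries:
        grouped[entry["sha1"]].add(entry["path"])
    return {sha1: sorted(paths) for sha1, paths in grouped.items()}


# Placeholder data for illustration only.
entries = [
    {"sha1": "aaaa", "path": r"C:\downloads\sample.exe"},
    {"sha1": "aaaa", "path": r"C:\downloads\sample.exe"},
    {"sha1": "aaaa", "path": r"C:\temp\copy.exe"},
    {"sha1": "bbbb", "path": r"C:\tools\other.exe"},
]
for sha1, paths in paths_by_hash(entries).items():
    if len(paths) > 1:
        print(sha1, paths)
```

Any hash that maps to more than one path is worth a closer look, even when the file itself is no longer on disk.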

Sunday, October 16, 2016

Links and Updates

RegRipper Plugin
Not long ago, I read this blog post by Adapt Forward Cyber Security regarding an interesting persistence mechanism, and within 10 minutes, had a RegRipper plugin written and tested against some existing data that I had available.

So why would I say this?  What's the point?  The point is that with something as simple as copy-paste, I extended the capabilities of the tool, and now have new functionality that will let me flag something that may be of interest, without having to memorize a checklist.  And as I pushed the new plugin out to the repository, everyone who downloads and uses the plugin now has that same capability, without having to have spent the time that the folks at Adapt Forward spent on this; through documentation and sharing, the DFIR community is able to extend the functionality of existing toolsets, as well as the reach of knowledge and experience.

Speaking of which, I was recently assisting with a case, and found some interesting artifacts in the Registry regarding LogMeIn logons; they didn't include the login source (there was more detail recovered from a Windows Event Log record), but they did include the user name and date/time.  This was the result of creating a timeline that included Registry key LastWrite times, and led to investigating an unusual key/entry in the timeline.  I created a RegRipper plugin to extract the information (logmein.pl), and then created one to include the artifact in a timeline (logmein_tln.pl).  Shortly after creating them both, I pushed them up to GitHub.

Extending Tools, Extending Capabilities
Not long ago, I posted about parsing .pub files that were used to deliver malicious macros.  There didn't seem to be a great deal of interest from the community, but hey, what're you gonna do, right?  One comment that I did receive was, "yeah, so what...it's a limited infection vector."  You know what?  You're right, it is.  But the point of the post wasn't, "hey, look here's a new thing..."; it was "hey, look, here's an old thing that's back, and here's how, if you understand the details of the file structure, you can use that information to extend your threat intel, and possibly even your understanding of the actors using it."

And, oh, by the way, if you think that OLE is an old format, you're right...but if you think that it's not used any longer, you're way not right.  The OLE file format is used with Sticky Notes, as well as automatic Jump Lists.

Live Imaging
Mari had an excellent post recently in which she addressed live imaging of Mac systems.  As she pointed out in her post, there are times when live imaging is not only a good option, but the only option.

The same can also be true for Windows systems, and not just when encryption is involved.  There are times when the only way to get an image of a server is to do so using a live imaging process.

Something that needs to be taken into consideration during the live imaging of Windows systems is the state of various files and artifacts while the system is live and running.  For example, Windows Event Logs may be "open", and it's well known that the AppCompatCache data is written at system shutdown.

AmCache.hve
Not long ago, I commented regarding my experiences using the AmCache.hve file during investigations; in short, I had not had the same sort of experiences as those described by Eric Z.

That's changed.

Not long ago, I was examining some data from a point-of-sale breach investigation, and had noticed in the data that there were references to a number of tools that the adversary had used that were no longer available on the system.  I'd also found that the installed AV product wasn't writing detection events to the Application Event Log (as many such applications tend to do...), so I ran 'strings' across the quarantine index files, and was able to get the original path to the quarantined files, as well as what the AV product had alerted on.  In one instance, I found that a file had been identified by the AV product as "W32.Bundle.Toolbar"...okay, not terribly descriptive.

I parsed the AmCache.hve file (the system I was examining was a Windows 7 SP1 system), and searched the output for several of the file names I had from other sources (ShimCache, UserAssist, etc.), and lo and behold, I found a reference to the file mentioned above.  Okay, the AmCache entry had the same path, so I pushed the SHA-1 hash for the file up to VT, and the response identified the file as CCleaner.  This fit into the context of the examination, as we'd observed the adversary "cleaning up", using either native tools (living off the land), or using tools they'd brought with them.

Windows Event Log Analysis
Something I see over and over again (on Twitter, mostly, but also in other venues) is analysts referring to Windows Event Log records solely by their event ID, and not including the source.

Event IDs are not unique.  There are a number of event IDs out there that have different sources, and as such, have a completely different context with respect to your investigation.  Searching Google, it's easy to see (for example) that events with ID 4000 have multiple sources; DNS, SMTPSvc, Diagnostics-Networking, etc.  And that doesn't include non-MS applications...that's just what I found in a couple of seconds of searching.  So, searching across all event logs (or even just one event log file) for all events with a specific ID could result in data that has no relevance to the investigation, or even obscure the context of the investigation.

Okay...so what?  Who cares?  Well, something that I've found that really helps me out with an examination is to use eventmap.txt to "tag" events of interest ("interest", as in, "found to be interesting from previous exams") while creating a timeline.  One of the first things I'll do after opening the TLN file is to search for "[maldetect]" and "[alert]", and get a sense of what I'm working with (i.e., develop a bit of situational awareness).  This works out really well because I use the event source and ID in combination to identify records of interest.
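
The idea of tagging on the source and ID in combination can be sketched in a few lines of Python.  This is only an illustration of the lookup, not the eventmap.txt format itself; the dictionary stands in for the map file, and the entries in it (including the "ExampleAV" source) are hypothetical.

```python
# Stand-in for a tag map; keys are (source, event ID) pairs.
# Both entries are hypothetical examples.
EVENT_MAP = {
    ("ExampleAV", 1116): "[maldetect]",
    ("DNS", 4000): "[alert]",
}


def tag_event(source, event_id, description):
    """Prefix the description with a tag if the source/ID pair is mapped."""
    tag = EVENT_MAP.get((source, event_id))
    return f"{tag} {description}" if tag else description


print(tag_event("DNS", 4000, "sample description"))
print(tag_event("SMTPSvc", 4000, "sample description"))
```

Both sample records have event ID 4000, but only the one whose source matches gets tagged; a lookup keyed on the ID alone would have tagged both.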

As many of us still run across Windows XP and 2003 systems, this link provides a good explanation (and a graphic) of how wrapping of event records works in the Event Logs on those systems.

Thursday, September 22, 2016

Size Matters

Yes, it does, and sometimes smaller is better.

Here's why...the other day I was "doing" some analysis, trying to develop some situational awareness from an image of a Windows 2008 SP2 system.  To do so, I extracted data from the image...directory listing of the partition via FTK Imager, Windows Event Logs, and Registry hive files.  I then used this data to create a micro-timeline (one based on limited data) so that I could just get a general "lay of the land", if you will.

One of the things I did was open the timeline in Notepad++, run the slider bar to the bottom of the file, and search (going "up" in the file) for "Security-Auditing/".  I did this to see where the oldest event from the Security Event Log would be located.  Again, I was doing this for situational awareness.

Just to keep track, from the point where I had the extracted data sources, I was now just under 15 min into my analysis.

The next thing I did was go all the way back to the top of the file, and I started searching for the tags included in eventmap.txt.  I started with "[maldetect]", and immediately found clusters of malware detections via the installed AV product.

Still under 18 min at this point.

Then I noticed something interesting...there was a section of the timeline that had just a bunch of failed login attempts (Microsoft-Windows-Security-Auditing/4625 events), all of them type 10 logins.  I knew that one of the things about this case was unauthorized logins via Terminal Services, and seeing the failed login attempts helped me narrow down some aspects of that; specifically, the failed login attempts originated from a limited number of IP addresses, but there were multiple attempts, many using user names that didn't exist on the system...someone was scanning and attempting to brute force a login.

I already knew from the pre-engagement conference calls that there were two user accounts that were of primary interest...one was a legit account the adversary had taken over, the other was one the adversary had reportedly created.  I searched for one of those and started to see "Microsoft-Windows-Security-Auditing/4778" (session reconnect) and /4779 (session disconnect) events.  I had my events file, so I typed the commands:

type events.txt | find "Microsoft-Windows-Security-Auditing/4778" > sec_events.txt
type events.txt | find "Microsoft-Windows-Security-Auditing/4779" >> sec_events.txt

From there, I wrote a quick script that ran through the sec_events.txt file and gave me a count of how many times various system names and IP addresses appeared together.  From the output of the script, I could see that the system names that were unique (i.e., "Hustler", etc., but NOT "Dell-PC") were all connecting from the same range of IP addresses.
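
The original script isn't shown, so here is a hedged Python sketch of the same idea.  It assumes each line is a five-field, pipe-delimited events entry whose description looks like "Source/ID;string1,string2,..." with the client name and client address as the last two strings; that layout is an assumption, as are the function name and the sample lines (which use documentation-range IP addresses).

```python
from collections import Counter


def count_client_pairs(lines, name_index=-2, addr_index=-1):
    """Count how often each (client name, client address) pair appears."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("|", 4)
        if len(fields) < 5 or ";" not in fields[4]:
            continue
        strings = fields[4].split(";", 1)[1].split(",")
        if len(strings) < 2:
            continue
        counts[(strings[name_index], strings[addr_index])] += 1
    return counts


# Made-up sample lines in the assumed layout.
sample_lines = [
    "1474000000|EVTX|SRV1||Microsoft-Windows-Security-Auditing/4778;admin,SRV1,0x1,RDP-Tcp#1,Hustler,192.0.2.10",
    "1474000100|EVTX|SRV1||Microsoft-Windows-Security-Auditing/4779;admin,SRV1,0x1,RDP-Tcp#1,Hustler,192.0.2.10",
    "1474000200|EVTX|SRV1||Microsoft-Windows-Security-Auditing/4778;user1,SRV1,0x2,RDP-Tcp#2,Dell-PC,198.51.100.7",
]
# Sort alphabetically so related names and addresses line up.
for (name, addr), count in sorted(count_client_pairs(sample_lines).items()):
    print(f"{name:<12} {addr:<16} {count}")
```

Letting the code do the sorting means the unique system names and the address ranges they share stand out without any manual review.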

From the time that I had the data available, to the point where I was looking at the output of the script, was just under 45 min.  Some of that time included noodling over how best to present what I was looking for, so that I didn't have to go through things manually...make the code do alphabetical sorting rather than having to do it myself, that sort of thing.

The point of all this is that sometimes, you don't need a full system timeline, using all of the available data, in order to make headway in your analysis.  Sometimes a micro-timeline is much better, as it doesn't include all the "noise" associated with a bunch of unrelated activity.  And there are times when a nano-timeline is a vastly superior resource.

As a side note, after all of this was done, I extracted the NTUSER.DAT files for the two user profiles of interest from the image, added the UserAssist information from each of them to the main events file, and recreated the original timeline with new data...total time to do that was less than 10 min, and I was being lazy.  That one small action really crystallized the picture of activity on the system.

Addendum, 27 Sept:
Here's another useful command line that I used to get logon data:

type events.txt | find "Security-Auditing/4624" | find "admin123" | find ",10"
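
For anyone working somewhere other than a Windows command prompt, the same chained filter can be approximated in Python.  This is just a sketch; the function name and the sample lines are hypothetical.

```python
def find_all(lines, *needles):
    """Yield lines that contain every needle (case-sensitive substring match)."""
    for line in lines:
        if all(needle in line for needle in needles):
            yield line


# Made-up lines for illustration.
sample_events = [
    "1|EVTX|SRV1||Microsoft-Windows-Security-Auditing/4624;admin123,SRV1,10",
    "2|EVTX|SRV1||Microsoft-Windows-Security-Auditing/4624;admin123,SRV1,3",
    "3|EVTX|SRV1||Microsoft-Windows-Security-Auditing/4624;other,SRV1,10",
]
for hit in find_all(sample_events, "Security-Auditing/4624", "admin123", ",10"):
    print(hit)
```

As with the command line above, these are plain substring matches, so a string like ",10" can also match inside longer values.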


Monday, September 19, 2016

Links/Updates

Malicious Office Documents
Okay, my last post really doesn't seem to have sparked too much interest; it went over like a sack of hammers.  Too bad.  Personally, I thought it was pretty fascinating, and can see the potential for additional work further on down the road.  I engaged in the work to help develop a clearer threat intel picture, and there has simply been no interest.  Oh, well.

Not long ago, I found this pretty comprehensive post regarding malicious Office documents, and it covers both pre- and post-2007 formats.

What's in your WPAD?
At one point in my career, I was a security admin in an FTE position within a company.  One of the things I was doing was mapping the infrastructure and determining ingress/egress points, and I ran across a system that actually had persistent routes enabled via the Registry.  As such, I've always tried to be cognizant of anything that would redirect a system to a route or location other than what was intended.  For example, when responding to an apparent drive-by downloader attack, I'd be sure to examine not only the web history but also the user Favorites or Bookmarks; there have been several times where doing this sort of analysis has added a slightly different shade to the investigation.

Other examples of this include things like modifications to the hosts file.  Windows uses the hosts file for name resolution, and I've used this in conjunction with a Scheduled Task, as a sort of "parental control", leaving the WiFi up after 10pm for a middle schooler, but redirecting some sites to localhost.  Using that knowledge over the years, I've also examined the hosts file for indicators of untoward activity; I even had a plugin for the Forensic Scanner that would automatically extract any entries in the hosts file that were other than the default.  Pretty slick.
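
The Forensic Scanner plugin isn't reproduced here, but the check itself is easy to sketch in Python.  The set of "default" entries below is an assumption and should be adjusted for the Windows version in question; the function name and sample text are hypothetical.

```python
# ASSUMPTION: entries treated as default; adjust for the system examined.
DEFAULT_ENTRIES = {("127.0.0.1", "localhost"), ("::1", "localhost")}


def non_default_hosts(text):
    """Return (address, name) pairs from hosts file text that are not defaults."""
    found = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        parts = line.split()
        if len(parts) < 2:
            continue
        address, names = parts[0], parts[1:]
        for name in names:
            if (address, name.lower()) not in DEFAULT_ENTRIES:
                found.append((address, name))
    return found


sample_hosts = """# Sample hosts content for illustration
127.0.0.1   localhost
127.0.0.1   www.example.com   # redirected to localhost
"""
print(non_default_hosts(sample_hosts))
```

Anything the function returns is an entry someone added, whether for "parental control" or for something less benign.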

Not new...this one is over four years old...but I ran across this post on the NetSec blog, and thought that it was worth mentioning.  Sometimes, you just have to know what you're looking for when performing incident response, and sometimes what you're looking for isn't in memory, or in a packet capture.

Speaking of checking things related to the browser, I saw something that @DanielleEveIR tweeted recently, specifically:








I thought this was pretty interesting, not something I'd seen or thought of before.  Unfortunately, many of the malware RE folks I know are focused more on the network than the host, so things such as modifications of Registry values tend to fall through the cracks.  However, if you're running Carbon Black, this might make a pretty good watchlist item, eh?

I did a search and found a malware sample described here that exhibits this behavior, and I found another description here.  Hopefully, that might provide some sort of idea as to how pervasive this artifact is.

@DanielleEveIR had another interesting tweet, stating that if your app makes a copy of itself and then launches the copy, it might be malware.  Okay, she's starting to sound like the Jeff Foxworthy of IR..."you might be malware if..."...but she has a very good point.  Our team recently saw the LaZagne credential theft tool being run in an infrastructure, and if you've ever seen or tested this, that's exactly what it does.  This would make a good watchlist item, as well...regardless of what the application name is, if the process name is the same as the parent process name, flag that puppy!  You can also include this in any script that you use that parses Security Event Logs (for event ID 4688) or Sysmon Event Logs.
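
A script-side version of that check might look like the Python sketch below.  It assumes process creation records have been parsed into dictionaries with hypothetical "image" and "parent_image" keys holding full paths; the sample records are made up.

```python
import ntpath  # handles Windows-style paths on any platform


def same_name_as_parent(records):
    """Return records where the process file name matches the parent's."""
    flagged = []
    for rec in records:
        child = ntpath.basename(rec["image"]).lower()
        parent = ntpath.basename(rec["parent_image"]).lower()
        if child and child == parent:
            flagged.append(rec)
    return flagged


# Made-up records for illustration.
proc_records = [
    {"image": r"C:\Users\Public\tool.exe", "parent_image": r"C:\Temp\Tool.exe"},
    {"image": r"C:\Windows\System32\cmd.exe", "parent_image": r"C:\Windows\explorer.exe"},
]
for rec in same_name_as_parent(proc_records):
    print(rec["parent_image"], "->", rec["image"])
```

Expect some benign hits (plenty of legitimate applications launch child processes of the same name), so this works better as a flag for review than as a verdict.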

Defender Bias
There've been a number of blog posts that have discussed analyst bias when it comes to DFIR, threat intel, and attribution.

Something that I haven't seen discussed much is blue team or defender bias.  Wait...what?  What is "defender bias"?  Let's look at some examples...you're sitting in a meeting, discussing an incident that your team is investigating, and you're fully aware that you don't have all the data at this point.  You're looking at a few indicators, maybe some files and Windows Event Log records, and then someone says, "...if I were the bad guy, I'd...".  Ever have that happen?  Being an incident responder for about 17 years, I've heard that phrase spoken.  A lot.  Sometimes by members of my team, sometimes by members of the client's team.

Another place that defender bias can be seen is when discussing "crown jewels".  One of the recommended exercises while developing a CSIRP is to determine where the critical data for the organization is located within the infrastructure, and then develop response plans around that data. The idea of this exercise is to accept that breaches are inevitable, and collapse the perimeter around the critical data that the organization relies on to function.

But what happens when you don't have the instrumentation and visibility to determine what the bad guy is actually doing?  You'll likely focus on protecting that critical data while the bad guy is siphoning off what they came for.

The point is that what may be critical to you, to your business, may not be the "crown jewels" from the perspective of the adversary.  Going back as far as we can remember, reports from various consulting organizations have referred to the adversary as having a "shopping list", and while your organization may be on that list, the real question isn't just, "..where are your critical assets?", it's also "...what is the adversary actually doing?"

What if your "crown jewels" aren't what the adversary is after, and your infrastructure is a conduit to someone else's infrastructure?  What if your "crown jewels" are the latest and greatest tech that your company has on the drawing boards, and the adversary is instead after the older gen stuff, the tech shown to work and with a documented history and track record of reliability?   Or, what if your "crown jewels" are legal positions for clients, and the adversary is after your escrow account?

My point is that there is going to be a certain amount of defender bias in play, but it's critical for organizations to have situational awareness, and to also realize when there are gaps in that situational awareness.  What you decide are the "crown jewels", in complete isolation from any other input, may not be what the adversary is after.  You'll find yourself hunkered down in your Maginot Line bunkers, awaiting that final assault, only to be mystified when it never seems to come.

Sunday, September 11, 2016

OLE...OLE, OLE, OLE!

Okay, if you've never seen The Replacements then the title of this post won't be nearly as funny to you as it is to me...but that's okay.

I recently posted an update blog that included a brief discussion of a tool I was working on, and why.  In short, and due in part to a recently publicized change in tactics, I wanted to dust off some old code I'd written and see what information or intel I could collect.

The tactic I'm referring to involves the use of malware delivered via '.pub' files.  I wasn't entirely too interested in this tactic until I found out that .pub (MS Publisher) files are OLE format files.

The code I'm referring to is wmd.pl, something I wrote a while back (according to the header information, the code is just about 10 yrs old!) to parse documents created using older versions of MS Word, specifically those that used OLE.

OLE
The Object Linking and Embedding (OLE) file format is pretty well documented at the MS site, so I won't spend a lot of time discussing the details here.  However, I will say that MS has referred to the file format as a "file system within a file", and that's exactly what it is.  If you look at the format, there's actually a 'sector allocation table', and it's laid out very much like the FAT file system.  Also, at some levels of the 'file system' structure, there are time stamps, as well.  Now, the exact details of when and how these time stamps are created and/or modified (or if they are, at all) aren't exactly clear, but they can serve as an indicator, and something that we can incorporate with other artifacts such that when combining them with context, we can get a better idea of their validity and value.
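
To make the "file system within a file" idea a bit more concrete, here's a minimal Python sketch that decodes a single 128-byte directory entry and pulls out the name, type, and time stamps.  The offsets follow the layout in the published compound file specification as I understand it; the function names are my own, and this is not how wmd.pl or any other tool mentioned here does it.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
OBJECT_TYPES = {0: "unknown", 1: "storage", 2: "stream", 5: "root"}


def filetime_to_datetime(value):
    """Convert a 64-bit FILETIME (100 ns ticks since 1601-01-01 UTC).

    A value of zero is treated as "not set" and returned as None.
    """
    if value == 0:
        return None
    return FILETIME_EPOCH + timedelta(microseconds=value // 10)


def parse_directory_entry(entry):
    """Decode one 128-byte compound file directory entry into a dict."""
    if len(entry) != 128:
        raise ValueError("a directory entry is exactly 128 bytes")
    # Name length is in bytes and includes the 2-byte terminating null.
    name_len = struct.unpack_from("<H", entry, 64)[0]
    name = entry[:max(name_len - 2, 0)].decode("utf-16-le")
    created, modified = struct.unpack_from("<QQ", entry, 100)
    start_sector, size = struct.unpack_from("<IQ", entry, 116)
    return {
        "name": name,
        "type": OBJECT_TYPES.get(entry[66], "invalid"),
        "created": filetime_to_datetime(created),
        "modified": filetime_to_datetime(modified),
        "start_sector": start_sector,
        "size": size,
    }
```

Feeding this every 128-byte slice of the directory sectors gives a listing of "files" (streams) and "directories" (storages) along with whatever time stamps are populated.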

For most of us who have been in the IR business for a while, when we hear "OLE", we think of the Blair document, and in particular, the file format used for pre-2007 versions of MS Office documents.  Further, many of us thought that with the release of Office 2007, the file format was going to disappear, and at most, we'd maybe have to dust off some tools or analysis techniques at some point in the future.  Wow, talk about a surprise!  Not only did the file format not disappear, as of Windows 7, we started to see it being used in more and more of the artifacts we were seeing on the system.  Take a look at the OLE Compound File page on the ForensicWiki for a list of the files on Windows systems that utilize the OLE file format (i.e., StickyNotes, auto JumpLists, etc.).  So, rather than "going away", the file format has become more pervasive over time.  This is pretty fascinating, particularly if you have a detailed understanding of the file structure format.  In most cases when you're looking at these files on a Windows system, the contents of the files will be what you're most interested in; for example, with automatic Jump Lists, we may be most interested in the DestList stream.  However, when an OLE compound file is created off of the system, perhaps through the use of an application, we (as analysts) would be very interested in learning all we can about the file itself.


Tools
So, the idea behind the tool I was working on was to pull apart one component of the overall attack to see if there were any correlations to the same component with respect to other attacks.  I'm not going to suggest it's the same thing (because it's not) but the idea I was working from is similar to pulling a device apart and breaking down its components in order to identify the builder, or at the very least to learn a little bit more that could be applied to an overall threat intel picture.

Here's what we're looking at...in this case, the .pub files are arriving as email attachments, so you have a sender email address, contents of the email header and body, attachment name, etc.  All of this helps us build a picture of the threat.  Is the content of the email body pretty generic, or is it specifically written to elicit the desired response (opening the attachment) from the user to whom it was sent?  Is it targeted?  Is it spam or spear-phishing/whaling?

Then we have what occurs after the user opens the attachment; in some cases, we see that files are downloaded and native commands (i.e., bitsadmin.exe) are executed on the system.  Some folks have already been researching those areas or aspects of the overall attacks, and started pulling together things such as sites and files accessed by bitsadmin.exe, etc.

Knowing a bit about the file format of the attachment, I thought I'd take an approach similar to what Kevin talked about in his Continuing Evolution of Samas Ransomware blog post.  In particular, why not see if I could develop some information that could be mapped to other aspects of the attacks?  Folks were already using Didier's oledump.py to extract information about the .pub files, as well as extract the embedded macros, but I wanted to take a bit of a closer look at the file structure itself.  As such, I collected a number of .pub files that were known to be malicious in nature and contain embedded macros (using open sources), and began to run the tool I'd written (oledmp.pl) across the various files, looking not only for commonalities, but differences, as well.  Here are some of the things I found:

All of the files had different time stamps; within each file, all of the "directory" streams had the same time stamp.  For example, from one file:

Root Entry  Date: 30.06.2016, 22:03:16

All of the "directory" streams below the Root Entry had the same time stamp, as illustrated in the following image (different file from the one with 30 June time stamps):
.pub file structure listing

Some of the files had a populated "Authress:" entry in the SummaryInformation section.  However, with the exception of those files, the SummaryInformation and DocumentSummaryInformation streams were blank.

All of the files had Trash sections (again, see the document structure specification) that were blank.
Trash Sections Listed
For example, in the image to the left, we see the tool listing the Trash sections and their sizes; for each file examined, the File Space section was all zeros, and the System Space section was all "0xFFFF".  Without knowing more about how these sections are managed, it's difficult to determine specifically if this is a result of the file being created by whichever application was used (sort of a 'default' configuration), or if this is the result of an intentional action.

Many (albeit not all) files contained a second stream with an embedded macro.  In all cases within the sample set, the stream was named "Module1", and contained an empty function.  However, in each case, that empty function had a different name.

Some of the streams were identical across all of the files in the sample set.  For example, the \Quill\QuillSub\ \x01CompObj stream for all of the files appears as you see in the image below.
\Quill\QuillSub\ \x01CompObj stream

All in all, for me, this was some pretty fascinating work.  I'm sure that there may be even more information to collect with a larger sample set.  In addition, there's more research to be done...for example, how do these files compare to legitimate, non-malicious Publisher files?  What tools can be used to create these files?

Wednesday, September 07, 2016

More Updates

Timelines
Mari had a great post recently that touched on the topic of timelines, which also happens to be the topic of her presentation at the recent HTCIA conference (which, by all Twitter accounts, was very well received).

A little treasure that Mari added to the blog post was how she went about modifying a Volatility plugin in order to create a new one.  Mari says in the post, "...nothing earth shattering...", but you know what, sometimes the best and most valuable things aren't earth shattering at all.  In just a few minutes, Mari created a new plugin, and it also happens to be her first Volatility plugin.  She shared her process, and you can see the code right there in the blog post.

Scripting
Speaking of sharing...well, this has to do with DFIR in general, but not Windows specifically...I ran across this fascinating blog post recently.  In short, the author developed a means (using Python) for turning listings of cell tower locations (pulled from phones by Cellebrite) into a Google Map.

A while back, I'd written and shared a Perl script that did something similar, except with WiFi access points.

The point is that someone had a need and developed a tool and/or process for (semi-)automatically parsing and processing the original raw data into a final, useful output format.

.pub files
I ran across this ISC Handler Diary recently...pretty interesting stuff.  Has anyone seen or looked at this from a process creation perspective?  The .pub files are OLE compound "structured storage" files, so has anyone captured information from endpoints (IRL, or via a VM) that illustrates what happens when these files are launched?

For detection of these files within an acquired image, there are some really good Python tools available that are good for generally parsing OLE files.  For example, there's Didier's oledump.py, as well as decalage/oletools.  I really like oledump.py, and have tried using it for various testing purposes in the past, usually using files either from actual cases (after the fact), or test documents downloaded from public sources.

A while back I wrote some code (i.e., wmd.pl) specifically to parse OLE structured storage files, so I modified that code to essentially recurse through the OLE file structure, and when getting to a stream, simply dump the stream to STDOUT in a hex-dump format.  However, as I'm digging through the API, there's some interesting information available embedded within the file structure itself.  So, while I'm using Didier's oledump.py as a comparison for testing, I'm not entirely interested in replicating the great work that he's done already, as much as I'm looking for new things to pull out, and new ways to use the information, such as pulling out date information (for possible inclusion in a timeline, or inclusion in threat intelligence), etc.
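
For anyone who wants to try the "dump the stream in a hex-dump format" step, here's a minimal sketch in Python (my tool is written in Perl, so this is an illustration of the idea and not the actual code).  The function name and the 16-bytes-per-line layout are my own choices.

```python
def hex_dump(data, width=16):
    """Return a classic offset / hex bytes / printable-ASCII dump of data."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots in the text column.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text}")
    return "\n".join(lines)


# Example: the bytes of a stream name, standing in for real stream content.
print(hex_dump(b"\x01CompObj"))
```

Returning a string, as opposed to printing directly, makes it easy to send the dump to STDOUT or to a file, depending on which streams you decide to look at.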

So I downloaded a sample found on VirusTotal, and renamed the local copy to be more in line with the name of the file that was submitted to VT.

Here's the output of oledump.py, when run across the downloaded file:

Oledump.py output

Now, here's the output from ole2.pl, the current iteration of the OLE parsing tool that I'm working on, when run against the same file:

Ole2.pl output

As you can see, there are more than a few differences in the outputs, but that doesn't mean that there's anything wrong with either tool.  In fact, it's quite the opposite.  Oledump.py uses a different technique for tracking the various streams in the file; ole2.pl uses the designators from within the file itself.

The output of ole2.pl has 5 columns:
- the stream designator (from within the file itself)
- a tuple that tells me:
   - is the stream a "file" (F) or a "directory" (D)
   - if the stream contains a macro
   - the "type" (from here); basically, is the property a PropertySet?
- the date (OLE VT_DATE format is explained here)
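
As a quick illustration of the VT_DATE format mentioned in the list above: the value is a floating point number of days since 30 December 1899, with the fractional part giving the time of day.  A minimal Python sketch of the conversion (the function name is my own, and it only handles non-negative values, since negative values treat the fraction differently):

```python
from datetime import datetime, timedelta

OLE_DATE_EPOCH = datetime(1899, 12, 30)


def vt_date_to_datetime(value):
    """Convert a non-negative OLE VT_DATE (days since 1899-12-30)."""
    if value < 0:
        raise ValueError("negative VT_DATE values are not handled in this sketch")
    return OLE_DATE_EPOCH + timedelta(days=value)


# 2.5 days after the epoch is noon on 1 January 1900.
print(vt_date_to_datetime(2.5))
```

Once the value is a regular date, it can be normalized and dropped into a timeline alongside other time stamped data.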

Part of the reason I wrote this script was to see which sections within the OLE file structure had dates associated with them, as perhaps that information can be used as part of building the threat intel picture of an incident.  The script has embedded code to display the contents of each of the streams in a hex-dump format; I've disabled the code as I'm considering adding options for selecting specific streams to dump.

Both tools use the same technique for determining if macros exist in a stream (something I found at the VBA_Tools site).

One thing I haven't done is added code to look at the "trash" (described here) within the file format.  I'm not entirely sure how useful something like this would be, but hey, it may be something worth looking at.  There are still more capabilities I'm planning to add to this tool, because what I'm looking at is digging into the structure of the file format itself in order to see if I can develop indicators, which can then be clustered with other indicators.  For example, the use of .pub file attachments has been seen by others (ex: MyOnlineSecurity) being delivered via specific emails.  At this point, we have things such as the sender address, email content, name of the attachment, etc.  Still others (ex: MoradLabs) have shared the results of dynamic analysis; in this case, the embedded macro launching bitsadmin.exe with specific parameters.  Including attachment "tooling" may help provide additional insight into the use of this tactic by the adversary.

Something else I haven't implemented (yet) is extracting and displaying the macros.  According to this site, the macros are compressed, and I have yet to find anything that will let me easily extract and decompress a macro from the stream in which it's embedded, using Perl.  Didier has done it in Python, so perhaps that's something I'll leave to his tool.

Threat Intel
I read this rather fascinating Cisco Continuum article recently, and I have to say, I'm still trying to digest it.  Part of the reason for this is that the author says some things I agree with, but I need to go back and make sure I understand what they're saying, as I might be agreeing while at the same time misunderstanding what's being said.

A big take-away from the article was:

Where a team like Cisco’s Talos and products like AMP or SourceFire really has the advantage, Reid said, is in automating a lot of these processes for customers and applying them to security products that customers are already using. That’s where the future of threat intelligence, cybersecurity products and strategies are headed.

Regardless of the team or products, where we seem to be now is that processes are being automated and applied to devices and systems that clients are already using, in many cases because the company sold the client the devices, as part of a service.