Saturday, September 12, 2015

Updates, Links, etc.

RegRipper Plugin Updates
I updated a plugin recently and provided a new one, and thought I'd share some information about those updates here...

The updated plugin is environment.pl, originally written in 2011; the update that I added to the code was to specifically look for and alert on the value described in this blog post.  So, four years later, I added a small bit of code to the plugin to look for something specific in the data.

I added the malware.pl plugin, which can be run against any hive; it has specific sections in its code that describe what's being looked for, in which hive, along with references to the sources from which the keys, values or data in question were derived (that is, why I was looking for them in the first place).  Essentially, these are all artifacts I find myself looking for time and again, and I figured I'd just keep them together in one plugin.  If you look at the plugin contents, you'll see that I copied the code from secrets.pl and included it.

There are a couple of other plugins I thought I'd mention, in case folks hadn't considered using them....

The sizes.pl plugin was written to address malware maintaining configuration information in a Registry value, as described in this Symantec post.  You can run this plugin against any hive.
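To illustrate the idea behind a size check, here is a minimal Python sketch. It is not the code in sizes.pl; it assumes the values have already been extracted from a hive into (key path, value name, data) tuples, and the 5000-byte threshold and the sample data are made up for the example.

```python
from typing import Iterable, List, Tuple

# Hypothetical threshold in bytes; tune it to what is normal in your hives.
SIZE_THRESHOLD = 5000

RegValue = Tuple[str, str, bytes]  # (key path, value name, raw data)


def find_large_values(values: Iterable[RegValue],
                      threshold: int = SIZE_THRESHOLD) -> List[Tuple[str, str, int]]:
    """Return (key path, value name, size) for values at or above the threshold."""
    hits = []
    for key_path, name, data in values:
        if len(data) >= threshold:
            hits.append((key_path, name, len(data)))
    # Largest first, so the most unusual values are reviewed first.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    # Made-up sample data: one oversized value, one ordinary value.
    sample = [
        ("Software\\Example", "Config", b"\x00" * 20000),
        ("Software\\Example", "Version", b"1.0"),
    ]
    for key_path, name, size in find_large_values(sample):
        print(f"{key_path}\\{name}: {size} bytes")
```

A large value isn't malicious on its own; the point of a check like this is simply to give the analyst a short list of values worth a closer look.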


The rlo.pl plugin is an interesting plugin, and the use of the plugin was illustrated in this Secureworks blog post.  As you can see in the figure to the left, there are two Registry keys that appear to have the same name.

In testing for this particular issue, I had specifically crafted two Registry key names, using the method outlined in the Secureworks blog post.  This allowed me to create some useful data that mimicked what we'd seen, and provided an opportunity for more comprehensive testing.

As you can see from the output of the plugin listed below, I had also crafted a Registry value name using the same method, to see if the plugin would detect that, as well.

C:\Perl\rr>rip -r d:\cases\local\ntuser.dat -p rlo
Launching rlo v.20130904
rlo v.20130904
(All) Parse hive, check key/value names for RLO character

RLO control char detected in key name: \Software\gpu.etadp  [gpupdate]
RLO control char detected in key name: \Software\.etadpupg  [gpupdate]
RLO control char detected in value name: \Software\.etadpupg :.etadpupg [gpupdate]

Now, when running the rlo.pl plugin, analysts need to keep in mind that it's looking for something very specific; in this case, indications of the RLO Unicode control character.  What's great about plugins like this is that you can include them in your process, run them every time you're conducting analysis, and they'll alert you when there's an issue.
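The check itself is conceptually simple.  Here is a minimal Python sketch of the idea (not the rlo.pl code), assuming the key and value names have already been pulled from a hive as Unicode strings; the function names and the sample names are hypothetical.

```python
RLO = "\u202e"  # Unicode RIGHT-TO-LEFT OVERRIDE control character


def find_rlo_names(names):
    """Return the names that contain the RLO control character."""
    return [name for name in names if RLO in name]


def printable(name):
    """Replace the RLO character with a visible marker for reporting."""
    return name.replace(RLO, "[RLO]")


if __name__ == "__main__":
    # Made-up sample: one ordinary key name and one crafted with RLO.
    names = ["Software\\gpupdate", "Software\\" + RLO + "etadpupg"]
    for hit in find_rlo_names(names):
        print("RLO control char detected in name:", printable(hit))
```

Replacing the control character with a visible marker in the report matters, because printing the raw name would display it reversed and hide exactly what you're trying to show.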

Just as a PSA, I have provided these plugins but I haven't updated any of the profiles...I leave that up to the users.  So, if you're downloading the plugins folder and refreshing it in place, do not expect to see the


Anti-Forensic Malware
I ran across this InfoSecurity Magazine article recently, and while the title caught my attention, I was more than a bit surprised at the lack of substance.

There are a couple of statements in the article that I wanted to address, and share my thoughts on...

Increasingly, bad actors are using techniques that leave little trace on physical disks. And unfortunately, the white hats aren’t keeping up: There’s a shortage of digital forensics practitioners able to investigate these types of offensives.

As to the first sentence, sometimes, yes.  Other times, not so much.

The second statement regarding "white hats" is somewhat ambiguous, don't you think?  Who are the "white hats"?  From my perspective, if "white hats" are the folks investigating these breaches, it's not so much that we aren't keeping up, as it is that the breaches themselves aren't being detected in a timely manner, due to a lack of instrumentation.  By the time the "white hats" get the call to investigate the breach, a great deal of the potential evidence has been obviated.

Finally, I don't know that I agree with the final statement, regarding the shortage of practitioners.  Sometimes, there's nothing to investigate.  As I described in a recent blog post, when putting together some recent presentations, I looked at the statistics in annual security trends reports.  One of the statistics I found interesting was dwell time, or median time to detection.  The point I tried to make in the presentations was that when consultants go on-site to investigate a breach, they're able to see indicators that allow them to identify these numbers.  For example, in the M-Trends 2015 report, there was an infrastructure that had been compromised 8 years before the compromise was detected.

I would suggest that it's not so much a shortage of practitioners able to investigate these breaches as it is a lack of management oversight that prevents the infrastructure from being instrumented in a manner that provides for timely detection of breaches.  By the time some breaches are detected (many through external, third-party notification), the systems in question have likely been rebooted multiple times, potentially obviating memory analysis altogether.

If a crime is committed and the perpetrator had to walk across a muddy field to commit that crime (leaving footprints), and that field is dug up and paved over with a parking lot before the crime is reported, you cannot then say that there aren't enough trained responders able to investigate the crime.

...seen a rise in file-less malware, which exists only in volatile memory and avoids installation on a target’s file system.

"File-less malware"?  Like Poweliks?  Here's a TrendMicro blog post regarding PhaseBot, which references a TrendMicro article on Poweliks.  Sure, there may not be a file on disk, but there's something pulled from the Registry, isn't there?

Malware comes from somewhere...it doesn't magically appear out of nowhere.  If you take a system off of the main network and reboot it, and find indications of malware persisting, then it's somewhere on the system.  Just because the malware is in memory, and there are no obvious indications of it within the file system, doesn't mean that it can't be found.

Hunting
At the recent HTCIA 2015 Conference, I attended Ryan's presentation on "Hunting in the Dark", and I found it fascinating that at a sufficient level of abstraction, those of us who are doing "hunting" are doing very similar things; we may just use different terms to describe it (what Ryan refers to as "harvesting and stacking", the folks I work with call "using strategic rules").

Ryan's presentation was mostly directed to folks who work within one environment, and was intended to address the question of, "...how do I get started?"  Ryan had some very good advice for folks in that position...start small, take a small bite, and use it to get familiar with your infrastructure to learn what is "normal", and what might not be normal.

Along those lines, a friend of mine recently asked a question regarding detecting web shells in an environment using only web server logs.  Apparently in response to that question, ThreatStream posted an article explaining just how to do this.  So this is an example of how someone can start hunting within their own environment, with limited resources.  If you're hunting for web shells, there are a number of other things I'd recommend looking at, but the original question was how to do so using only the web server logs.
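As one example of what "starting small" with web server logs could look like, here is a minimal Python sketch that stacks requested URIs by the number of distinct client IPs.  This is a generic stacking approach and not necessarily the method described in the ThreatStream article; it assumes the log has already been parsed into (client IP, URI) pairs, and the function name and sample data are hypothetical.

```python
from collections import defaultdict


def uris_by_client_count(requests):
    """Return (uri, distinct client count) pairs, rarest URIs first.

    `requests` is an iterable of (client_ip, uri) pairs already parsed
    from the web server log.
    """
    clients = defaultdict(set)
    for client_ip, uri in requests:
        clients[uri].add(client_ip)
    stacked = [(uri, len(ips)) for uri, ips in clients.items()]
    # Fewest distinct clients first; ties are ordered by URI.
    return sorted(stacked, key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    # Made-up sample: a page many clients visit, and one only a single IP hits.
    sample = [
        ("10.0.0.1", "/index.php"),
        ("10.0.0.2", "/index.php"),
        ("10.0.0.9", "/images/x.php"),
        ("10.0.0.9", "/images/x.php"),
    ]
    for uri, count in uris_by_client_count(sample):
        print(count, uri)
```

A page that only one or two clients ever request isn't necessarily a web shell, but it gives you a short list to run down, which is the whole point of starting small.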

The folks at ThreatStream also posted this article regarding "evasive maneuvers" used by a threat actor group.  If you read the article, you will quickly see that it is more about obfuscation techniques used in the malware and its communications means, which can significantly affect network monitoring.  Reading the article, many folks will likely take a look at their own massive lists of C2 domain names and IP addresses, and append those listed in the article to that list.  So, like most of what's put forth as 'threat intelligence', articles such as this are really more a means for analysts to say, "hey, look how smart I am, because I figured this out...".  I'm sure that the discussion of assembly language code is interesting, and useful to other malware reverse engineers, but how does a CISO or IT staff utilize the contents of the third figure to protect and defend their infrastructure?

However, anyone who's familiar with the Pyramid of Pain will understand the limited efficacy of a bigger list of items that might...and do...change quickly.  Instead, if you're interested in hunting, I'd recommend looking for items such as the persistence mechanism listed in the article, as well as monitoring for the creation of new values (if you can).

Like I said, I agree with Ryan's approach to hunting, if you're new to it...start small, and learn what that set of artifacts looks like in your environment.  I did the same thing years ago, before the terms "APT" and "hunting" were in vogue...back then, I filed it under "doing my job".  Essentially, I wrote some small code that would give me a list of all systems visible to the domain controllers, and then reach out to each one and pull the values listed beneath the Run keys, for the system and the logged in user.  The first time I ran this, I had a pretty big list, and as I started seeing what was normal and verifying entries, they got whitelisted.  In a relatively short time, I could run this search during a meeting or while I was at lunch, and come back to about half a page of entries that had to be run down.
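The collection part of that depends on your environment, but the "stack it and whitelist it" part can be sketched in a few lines of Python.  This is not the code I wrote back then; it's a minimal illustration that assumes the Run key values have already been collected into a dictionary keyed by host name, and the function name, host names and entries are all made up.

```python
from collections import Counter


def stack_run_entries(collected, whitelist):
    """Count how many hosts have each non-whitelisted Run key entry.

    `collected` maps a host name to a list of (value name, command) tuples;
    `whitelist` is a set of (value name, command) tuples already verified.
    Returns (entry, host count) pairs, least common entries first.
    """
    counts = Counter()
    for entries in collected.values():
        # set() so that an entry is counted once per host.
        for entry in set(entries):
            if entry not in whitelist:
                counts[entry] += 1
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    # Made-up sample data for three hosts.
    collected = {
        "HOST1": [("AV", "C:\\AV\\av.exe"), ("Updater", "C:\\App\\upd.exe")],
        "HOST2": [("AV", "C:\\AV\\av.exe"), ("Updater", "C:\\App\\upd.exe")],
        "HOST3": [("AV", "C:\\AV\\av.exe"), ("svch0st", "C:\\Temp\\svch0st.exe")],
    }
    whitelist = {("AV", "C:\\AV\\av.exe")}
    for (name, command), hosts in stack_run_entries(collected, whitelist):
        print(hosts, name, command)
```

Each time an entry is verified as normal, it goes into the whitelist, and the list that has to be run down gets shorter on the next pass.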

Tools
I ran across this post over at PostModernSecurity recently, and I think that it really illustrates some things about the #DFIR community beyond just the fact that these tools are available for use.

The author starts his post with:

...I covet and hoard security tools. But I’m also frugal and impatient,..

Having written some open source tools, I generally don't appreciate it when someone "covets and hoards" what I've written, largely because in releasing the tools, I'd like to get some feedback as to if and how the tool fills a need.  I know that the tool meets my needs...after all, that's why I wrote it.  But in releasing it and sharing it with others, I've very often been disappointed when someone says that they've downloaded the tool, and the conversation ends right there, at that point...suddenly and in a very awkward manner.

Then there's the "frugal and impatient" part...I think that's probably true for a lot of us, isn't it?  At least, sometimes, that is.  However, there are a few caveats one needs to keep in mind when using tools like those the author has listed.  For instance, what is the veracity of the tools? How accurate are they?

More importantly, I saw the links to the free "malware analysis" sites...some referenced performing "behavioral analysis".  Okay, great...but more important than the information provided by these tools is how that information is interpreted by the analyst.  If the analyst is focused on free and easy, the question then becomes, how much effort have they put into understanding the issue, and are they able to correctly interpret the data returned by the tools?

For example, look at how often the ShimCache or AppCompatCache data from the Windows Registry is misinterpreted by analysts.  That misinterpretation then becomes the basis for findings that then become statements in reports to clients.

There are other examples, but the point is that if the analyst hasn't engaged in the academic rigor to understand something and they're just using a bunch of free tools, the question then becomes, is the analyst correctly interpreting the data that they're being provided by those tools?

Don't get me wrong...I think that the list of tools is a good one, and I can see myself using some of them at some point in the future.  But when I do so, I'll very likely be looking for certain things, and verifying the data that I get back from the tools.

Saturday, September 05, 2015

Registry Analysis

I gave a presentation on Registry analysis at the recent HTCIA2015 Conference, and I thought that there were some things from the presentation that might be worth sharing.

What is Registry analysis?  
For the purposes of DFIR work, Registry analysis is the collection and interpretation of data and metadata from Registry keys and values.

The collection part is easy...it's the interpretation part of that definition that is extremely important.  In my experience, I see a lot of issues with interpretation of data collected from the Registry.  The two biggest ones are what the timestamps associated with ShimCache entries mean, and what persistence via a particular key path really means.

Many times, you'll see the timestamps embedded in the ShimCache data referred to as either the execution time, or "creation/modification" time.  Referring to this timestamp as the "execution time" can be very bad, particularly if you're using it to demonstrate the window of compromise during an incident, or the time between first infection and discovery.  If the file is placed on a system and timestomped prior to being added to the ShimCache, or the method for getting it on the system preserves the original last modification time, that could significantly skew your understanding of the event.  Analysts need to remember that for systems beyond 32-bit XP, the timestamp in the ShimCache data is the last modification time from the file system metadata; for NTFS, this means the $STANDARD_INFORMATION attribute within the MFT record.
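One way to keep yourself honest about this is in how you label the data when you convert it.  File system timestamps on Windows are 64-bit FILETIME values (100-nanosecond intervals since 1601-01-01 UTC); the following minimal Python sketch converts one and labels it as a last modification time rather than an execution time.  The function names and the sample path are hypothetical, and this is not code from any particular ShimCache parser.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a 64-bit FILETIME integer to a timezone-aware UTC datetime."""
    # Integer division by 10 converts 100-ns ticks to microseconds.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def describe_entry(path, filetime):
    """Label the timestamp for what it is: file last modification time."""
    stamp = filetime_to_datetime(filetime)
    return f"{path}  last modified (NOT execution time): {stamp.isoformat()}"


if __name__ == "__main__":
    # Made-up path and timestamp value, for illustration only.
    print(describe_entry("C:\\Temp\\example.exe", 130856256000000000))
```

Putting the interpretation right in the output label makes it much harder for "last modified" to quietly turn into "executed" by the time the finding reaches a report.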

Ryan's slides include some great information about the ShimCache data, as does the original white paper on the subject.

With respect to persistence, I see a lot of write-ups that state that malware creates persistence by creating a value beneath the Run key in the HKCU hive, and the write-up then states that that means that the malware will be started again the next time the system reboots.  That's not the case at all...because if the persistence exists in a user's hive, then the malware won't be reactivated following a reboot until that user logs in.  I completely understand how this is misinterpreted, particularly (although not exclusively) by malware analysts...MS says this a lot in their own malware write-ups.  While simple testing will demonstrate otherwise, the vast majority of the time, you'll see malware analysts repeating this statement.

The point is that not all of the persistence locations within the Registry allow applications and programs to start on system start.  Some require that a user log in first, and others require some other trigger or mechanism, such as an application being launched.  It's very easy...too easy...to simply make the statement that any Registry value used for persistence allows the application to start on system reboot, because there's very little in the way of accountability.  I've seen instances during incident response where malware was installed only when a particular user logged into the system; if the malware used a Registry value in that user's NTUSER.DAT hive for persistence, the system was rebooted, and the user account was not used to log in, then the malware would not be active.  Making an incorrect statement about the malware could significantly impact the client's decision-making process (regarding AV licenses), or the decisions made by regulatory or compliance bodies (e.g., fines, sanctions, etc.).
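One way to avoid making that blanket statement is to record the trigger alongside each persistence location you report on.  The Python sketch below shows the idea with a deliberately tiny lookup table; the three entries are illustrative only (the list is nowhere near complete), the function name is hypothetical, and anything not in the table should be tested rather than assumed.

```python
# Illustrative, incomplete table: (hive, key path) -> what triggers the launch.
TRIGGERS = {
    ("HKCU", r"software\microsoft\windows\currentversion\run"):
        "when that specific user logs in",
    ("HKLM", r"software\microsoft\windows\currentversion\run"):
        "when any user logs in",
    ("HKLM", r"system\currentcontrolset\services"):
        "at system start, for services configured to start automatically",
}

UNKNOWN = "unknown; test before making a statement"


def persistence_trigger(hive, key_path):
    """Return the trigger for a known persistence location, else UNKNOWN."""
    key = key_path.strip("\\").lower()
    for (known_hive, known_key), trigger in TRIGGERS.items():
        if hive.upper() != known_hive:
            continue
        # Match the key itself or any subkey beneath it.
        if key == known_key or key.startswith(known_key + "\\"):
            return trigger
    return UNKNOWN


if __name__ == "__main__":
    print(persistence_trigger("HKCU", r"Software\Microsoft\Windows\CurrentVersion\Run"))
    print(persistence_trigger("HKLM", r"System\CurrentControlSet\Services\Example"))
```

Note that none of the Run key entries in the table say "at system reboot"; that's the distinction that gets lost in a lot of write-ups.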

Both of these items, when misinterpreted, can significantly impact the overall analysis of the incident.

Why do we do it?
There is an incredible amount of value in Registry analysis, and even more so when we incorporate it with other types of analysis.  Registry analysis is rarely performed in isolation; rather, most often, it's used to augment other analysis processes, particularly timeline analysis, allowing analysts to develop a clearer, more focused picture of the incident.  Registry analysis can be a significant benefit, particularly when we don't have the instrumentation in place that we would like to have (e.g., process creation monitoring, logging, etc.), but analysts also need to realize that Registry analysis is NOT the be-all-end-all of analysis.

In the presentation, I mention several of the annual security trend reports that we see; for example, from TrustWave, or Mandiant.  My point in bringing these up is that the reports generally have statistics such as dwell time or median number of days to detection, statistics which are based on some sort of empirical evidence that provides analysts with artifacts/indicators of an adversary's earliest entry into the compromised infrastructure.  If you've ever done this sort of analysis work, you'll know that you may not always be able to determine the initial infection vector (IIV), tracking back to, say, the original phishing email or original web link/SWC site.  Regardless, this is always based on some sort of hard indicator that an analyst can point to as the earliest artifact, and sometimes, this may be a Registry key or value.

Think about it...for an analyst to determine that the earliest date of compromise was...for example, in the M-Trends 2015 Threat Report, 8 yrs prior to the team being called in...there has to be something on the system, some artifact that acts as a digital muddy boot print on a white carpet.  The fact of the matter is that it's something that the analyst can point to and show to another analyst in order to get corroboration.  This isn't something where the analysts sit around rolling D&D dice...they have hard evidence, and that evidence may often be Registry keys, or value data.

Wednesday, September 02, 2015

HTCIA2015 Conference Follow up

I spoke at the HTCIA 2015 conference, held in Orlando, FL, on Mon, 31 Aug.  In fact, I gave two presentations...Registry analysis, and lateral movement.  You can see the video for the lateral movement presentation I gave at BSides Cincy here...many thanks to the BSides Cincy guys and Adrian.

I haven't spoken at, or attended an HTCIA conference in quite a while.  I had no idea if I was going to make it to this one, between airline delays and tropical storms.  This one was held at the Rosen Shingle Creek Resort, a huge ("palatial" doesn't cover it) conference center..."huge", in the manner of Caesar's Palace.  In fact, there was an Avon conference going on at the same time as the HTCIA conference, and there very well could have been other conferences there, as well.  Given the humidity and volume of rain, having everything you'd need in one location was a very good thing.  In fact, the rain was so heavy on Monday afternoon, after the final presentation, that there were leaks in the room.

After presenting on Monday, I attended Mari's presentation, which I've seen before...however, this is one of those presentations that it pays to see again.  I think that many times when we're deeply engaged in forensic analysis, we don't often think about other artifacts that may be of use...either we aren't aware of them, due to lack of exposure, or we simply forgot.  However, if you're doing ANYTHING at all related to determining what the user may have done on the system, you've got to at least consider what Mari was talking about.  Why?  Well, we all know that browsers have an automatic cache clean-up mechanism; if the user is right at about 19 days since the last cache clean-up in IE, and they do something bad, it's likely that the artifacts of activity are going to be deleted...which doesn't make them impossible to find, just harder.  The cookies that Mari has researched AND provided a tool to collect can illustrate user activity long after the fact, either in specific activity, or simply illustrating the fact that the user was active on the system at a particular time.

Also, Mari is one of the very few denizens of the DFIR community who finds something, digs into it, researches it and runs it down, then writes it up and provides a tool to do the things she talked about in her write-up.  This is very rare and unique within the community, and extremely valuable.  Her presentation on Google Analytics cookies could very well provide coverage of a gap that many don't even know exists in their analysis.

I was also able to see Ryan's presentation on Tuesday morning.  This one wasn't as heavily attended as the presentations on Monday, which is (I guess) to be expected.  But I'll tell you...a lot of folks missed some very good information.  I attended for a couple of reasons...one was that Ryan is a competitor, as much as a compatriot, within the community.  We both do very similar work, so I wanted to see what he was sharing about what he does.  I'm generally not particularly interested in presentations that talk about "hunting", because my experience at big conferences has often been that the titles of presentations don't match up with the content, but Ryan's definitely did so.  Some of what I liked about his presentation was how he broke things down...rather than going whole hog with an enterprise roll-out of some commercial package, Ryan broke things down with, "...here are the big things I look for during an initial sweep...", and proceeded from there.  He also recommended ways for folks to start hunting in their own organization, and that they start small.  Trying to do it all can be completely overwhelming, so a lot of folks don't even start.  But taking just one small piece, and then using it to get familiar with what things look like in your environment, what constitutes "noise" vs "signal"...that's the way to get started.

What's interesting is that what Ryan talked about is exactly what I do in my day job.  I either go in blind, with very little information, on an IR engagement, or I do a hunt, where a client will call and say, "hey, I don't have any specific information that tells me that I've been compromised, but I want a sanity check...", and so I do a "blind" hunt, pretty much exactly as Ryan described in his presentation.  So it was interesting for me to see that, at a certain level of abstraction, we are pretty much doing the same things.  Now, of course there are some differences...tools, exact steps in the process, and even the artifacts that we're looking for or at, may be a little different.  But the fact of the matter is that just like I mentioned in my presentation, when a bad guy "moves through" an environment such as the Windows OS, there are going to be artifacts.  Looking for a footprint here, an over-turned stone there, and maybe a broken branch or two will give you the picture of where the bad guy went and what they did.  For me, seeing what Ryan recommended looking at was validating...because what he was talking about is what I do while both hunting and performing DFIR work.  It was also good to see him recommending ways that folks could start doing these sorts of things in their own environments.  It doesn't take a big commercial suite, or any special skills...it simply takes the desire, and the rest of what's needed (e.g., how to collect the information, what to look for, etc.) is all available.

All in all, I had a good time, and learned a lot from the folks I was able to engage with.

Addendum: While not related to the conference, here are some other good slides that provide information about a similar topic as Ryan's...