Friday, April 10, 2015

Talk Notes

Thanks to Corey Harrell, I was watching the Intro to Large Scale Detection Hunting presentation from the NoLaSec meeting in December 2014, and I started to have some thoughts about what was being said.  I looked at the comments field on YouTube, as well as on David's blog, but thought I would put my comments here instead, as it would give me an opportunity to structure them and build them out a bit before hitting the "send" button.

First off, let me say that I thought the talk was really good.  I get that it was an intro talk, and that it wasn't going to cover anything in any particular depth or detail.  I liked it enough that I not only listened to it beginning-to-end twice this morning, but I also went back to certain things to re-listen to what was said.

Second, the thoughts I'm going to be sharing here are based on my perspective as an incident responder and host/endpoint analyst.

Finally, please do not assume that I am speaking for my employer.  My thoughts are my own and not to be misconstrued as being the policies or position of my employer.

Okay, so, going into this, here are some comments and thoughts based on what I saw/heard in the presentation...

Attribution...does it matter?

As David said, if you're gov, yeah (maybe).  If you're a mom-and-pop, not so much.  I would suggest that during both hunting and IR, attribution can be a distraction.  Why is that?

Let's look at it this way...what does attribution give us?  What does it tell us?  Some say that it informs as to the intent of the adversary, and that it tells us what they're after.  Really?  Many times, an organization that has been compromised doesn't fully understand what they have that's "of value".  Is it data of some kind?  Marketing materials?  Database contents?  Manufacturing specs?  Or, is it the access that organization has to another organization?  If you're doing some hunting, and run across an artifact or indicator that puts you on high alert, how are you able to perform attribution?

Let's say that you find a system with a bunch of batch files on it, and it looks as if the intruder was performing recon, and even dumping credentials from systems...at this point, how do you perform attribution?  How do you determine intent?

Here's an example...about 5 yrs ago, I was asked to look at a hard drive from a small company that had been compromised.  Everyone assumed that the intruder was after data regarding the company's clients, but it turned out that the intruder was after this small organization's money, which was managed via online banking.  The intruder had been able to very accurately determine which employee managed the account, and compromised that specific system with a keystroke logger that loaded into memory, monitored keystrokes sent to the browser when specific web sites were open, and sent the captured keystrokes off of the system without writing them to a file on disk.  It's pretty clear that the bad guy thought ahead, and knew that if the employee was accessing the online banking web site, they could just send the captured data off of the system to a remote site.

"If you can identify TTPs, you can..."

...but how do you identify TTPs?  David talked about identifying TTPs and disrupting those, to frustrate the adversary; examples of TTPs were glossed over, but I get that, it's an intro talk.  This goes back to what does something "look like"...what does a TTP "look like"?

I'm not saying that David missed anything by glossing over this...not at all.  I know that there was a time limit to the talk, and that you can only cover so much in a limited time.

Can't automate everything...

No, you can't.  But there's much that you can automate.  Look at your process in a Deming-esque manner, and maybe even look at ways to improve your process, using automation.

"Can't always rely on signatures..."

That really kind of depends on the "signatures" used.  For example, if you're using "signatures" in the sense that AV uses signatures, then no, you can't rely on them.  However, if you're aware that signatures can be evaded and MD5 hashes can be changed very quickly, maybe you can look at things that may not change that often...such as the persistence mechanism.  Remember Conficker?  MS identified five different variants, but what was consistent across them was the persistence mechanism.
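
To make that a bit more concrete, here's a rough sketch (mine, in Python, not anything from the talk) of what hunting on a persistence mechanism rather than a hash might look like...it enumerates svchost-hosted service DLLs and flags anything loading from outside System32.  The System32 check is just one illustrative heuristic, and it assumes a live Windows host and admin rights:

import winreg

SERVICES = r"SYSTEM\CurrentControlSet\Services"

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, SERVICES) as svc_root:
    for i in range(winreg.QueryInfoKey(svc_root)[0]):      # number of subkeys
        name = winreg.EnumKey(svc_root, i)
        try:
            with winreg.OpenKey(svc_root, name + r"\Parameters") as params:
                dll, _ = winreg.QueryValueEx(params, "ServiceDll")
        except OSError:
            continue    # no Parameters key or no ServiceDll value
        # the "signature" here is the persistence location, not the file hash
        if "system32" not in str(dll).lower():
            print(f"[!] {name}: ServiceDll outside System32 -> {dll}")

The file's hash can change from variant to variant, but the fact that something is persisting as a service DLL loaded from an odd location tends not to.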

This is a great time to mention artifact categories, which will ultimately lead to the use of an analysis matrix (the link is to a 2 yr old blog post that was fav'd on Twitter this morning...)...all of which can help folks with their hunting.  If you understand what you're looking for...say, you're looking for indications of lateral movement...you can scope your hunt to what data you need to access, and what you need to look for within that data.

It's all about pivoting...

Yes, it is.

...cross-section of behaviors for higher fidelity indicators...

Pivoting and identifying a cross-section of behaviors can be used together in order to build higher fidelity indicators.  Something else that can be used to do the same thing is...wait for it...sharing.  I know that this is a drum that I beat that a lot of folks are very likely tired of hearing, but a great way of creating higher fidelity indicators is to share what we've seen, let others use it (and there's no point in sharing if others don't use it...), and then validate and extend those indicators.

David also mentioned the tools that we (hunters, responders) use, or can put to use, and that they don't have to be big, expensive frameworks.  While I was working in an FTE security engineer position a number of years ago, I wrote a Perl script that would get a list of systems active in the domain, and then reach out to each one and dump the contents of the Run key from the Registry, for both the system and the logged on user.  Over time, I built out a white list of known good entries, and ended up with a tool I could run when I went to lunch, or I could set up a scheduled task to have it run at night (the organization had shifts for fulfillment, and 24 hr ops).  Either way, I'd come back to a very short list (half a page, maybe) of entries that needed to be investigated.  This also let me know which systems were recurring...which ones I'd clean off and would end up "infected" all over again in a day or two, and we came up with ways to address these issues.
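
The original was a Perl script, but the idea translates pretty easily.  Here's a rough Python sketch of the same approach; the host names and the "known good" list are made up for illustration, and it assumes the Remote Registry service is reachable on the targets and that you have admin rights:

import winreg

RUN_KEY = r"SOFTWARE\Microsoft\Windows\CurrentVersion\Run"

# placeholder whitelist...built out over time from entries you've vetted
KNOWN_GOOD = {"SecurityHealth", "VMware Tools"}

def run_entries(host):
    """Return the {name: command} entries from HKLM\\...\\Run on a remote host."""
    hive = winreg.ConnectRegistry(r"\\" + host, winreg.HKEY_LOCAL_MACHINE)
    entries = {}
    with winreg.OpenKey(hive, RUN_KEY) as key:
        for i in range(winreg.QueryInfoKey(key)[1]):    # number of values
            name, data, _ = winreg.EnumValue(key, i)
            entries[name] = data
    winreg.CloseKey(hive)
    return entries

# hypothetical host list...the original script pulled active systems from the domain
for host in ["ws001", "ws002"]:
    try:
        for name, cmd in run_entries(host).items():
            if name not in KNOWN_GOOD:
                print(f"{host}: {name} -> {cmd}")   # the short list to investigate
    except OSError as err:
        print(f"{host}: unreachable ({err})")

The logged-on user's Run key can be swept the same way, via HKEY_USERS.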

So, my point is that, as David said, if you know your network and you know your data sources, it's not that hard to put together an effective hunting program.

At one point, David mentioned, "...any tool that facilitates lateral movement...", but what I noticed in the recorded presentation was that there were no questions about what those tools might be, or what lateral movement might look like in the available data.

Once we start looking at these questions and their responses, the next step is to ask, do I have the data I need?  If you're looking at logs, do you have the right logs in order to see lateral movement?  If you have the right sources, are they being populated appropriately?  Is the appropriate auditing configured on the system that's doing the logging?  Do you need to add additional sources in order to get the visibility you need?
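
One way to start answering those questions is to simply check whether the events you'd need are actually showing up.  This sketch (again, mine, not something from the talk) uses wevtutil to look for a few lateral movement-related event IDs; the IDs listed are common examples, not an exhaustive list:

import subprocess

CHECKS = [
    ("Security", 4624, "logons (type 3 = network logon)"),
    ("Security", 5140, "network share access (requires file share auditing)"),
    ("System",   7045, "service installs (psexec-style tooling)"),
]

for log, event_id, why in CHECKS:
    query = f"*[System[(EventID={event_id})]]"
    out = subprocess.run(
        ["wevtutil", "qe", log, f"/q:{query}", "/c:1", "/rd:true", "/f:text"],
        capture_output=True, text=True,
    )
    status = "present" if out.stdout.strip() else "NOT FOUND...check audit policy"
    print(f"{log} {event_id} ({why}): {status}")

If a query comes back empty, that's your cue to go check the audit policy (or add a data source) before you need the data, not in the middle of an incident.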

"Context is King!"

Yes, yes it is.  Context is everything.

"Fidelity of intel has to be sound" 

Intel is worthless if it's built on assumption, and you obviate the need for assumption by correlating logs with network and host (memory, disk) indicators.
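
As a trivial illustration of what that correlation can look like, here's a sketch that folds host- and network-side observations into a single timeline.  The events are made up, but the point is that a Run key modification followed seconds later by an outbound POST is a much higher fidelity finding than either observation on its own:

from datetime import datetime

host_events = [
    ("2015-04-10 09:02:11", "HOST", "Run key value added: updchk -> %AppData%\\u.exe"),
]
network_events = [
    ("2015-04-10 09:02:43", "PROXY", "POST to 203.0.113.7 from WS001, 2 KB"),
]

timeline = sorted(
    host_events + network_events,
    key=lambda e: datetime.strptime(e[0], "%Y-%m-%d %H:%M:%S"),
)
for ts, source, desc in timeline:
    print(f"{ts}  {source:<6} {desc}")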

"Intel is context applied to your data"

Finally, a couple of other things David talked about toward the end of the presentation were post-mortem lessons learned (and feedback loops), and the fact that tribal knowledge must be shared.  Both of these are critical.  Why?

I guess that another way to ask that question is, is it really "cost effective" for every one of us to have to learn the same lessons on our own?  Think about how "expensive" that is...you may see something and even if I were hunting in the same environment, I may not see that specific combination of events for the next 6 months or a year, if ever.  Or, I may see it, but not recognize it as something that needs to be examined.

Sharing tribal knowledge can also mean sharing what you've seen, even though others may have already "seen" the same thing, for two reasons:  (1) it validates the original finding, and (2) it lets others know that what they saw 6 months ago is still in use.

Consider this...I've seen many (and I do mean, MANY) malware analysts simply ignore the persistence mechanism employed by malware.  When asked, some will say, "...yeah, I didn't say anything because it was the Run key...".  Whoa, wait a second...you didn't think that was important?  Here it is 2015, and the bad guys are still using the Run key for persistence?  That's HUGE!  That not only tells us where to look, but it also tells us that many of the organizations that have been compromised could have detected the intrusion much sooner with even the most rudimentary instrumentation and visibility.

Dismissing the malware persistence mechanism (or any other indicator) in this way blinds the rest of the hunting team (and ultimately, the community) to its value and efficacy in the overall effort.

Summary
Again, this was a very good presentation, and I think it serves very well to open up further conversation.  There's only so much one can talk about in 26 minutes, and I think that the talk was well organized.  What needs to happen now is that people who see the presentation start implementing what was said (if they agree with it), or asking how they could implement it.

Resources
Danny's blog - Theoretical Concerns
David Bianco's blog - Pyramid of Pain
Andrew Case's Talk - The Need for Proactive Threat Hunting (slides)
