Friday, March 13, 2009

Incident Management 101

When responding to an incident, the single biggest, most important factor is information.

Some of you are going to read this and your first thought will be, "well...duh!" Well, think about it for a moment...if you're a consultant (like me), how many times have you received a call for assistance where the answer to all of your questions was either "I don't know" or "blue"? How many times have you responded to an incident where the customer even had as much as a usable network diagram? Remember, like any (potentially) bleeding victim, they want answers NOW, but it's like trying to diagnose someone via email...whether you're on the phone or on-site, the first thing you need to do is orient yourself to the situation, which, of course, takes time.

In any incident, there are going to be unknowns; i.e., a lack of information. At first, you may not know what you're dealing with, so some data collection and analysis (you know..."investigation") will be required, and this should be based on a solid process.

One of the unknowns during an incident should not be your environment; if it is, you're in trouble. By environment, I mean everything, including very basic stuff like "TCP connections start with a three-way handshake." Laugh if you will, but I'm serious. Having JUST that basic piece of information and understanding how to apply it makes a tremendous difference during incident response, for anyone. Also, you need other information, such as: where are systems located, and what are the paths that data is supposed to (or can) take? Where are applications located? Is there any logging, and if so, what is logged?
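As a rough illustration of applying that one basic fact, here is a minimal Python sketch that checks whether a TCP connection was actually established (SYN, then SYN-ACK, then ACK) or whether someone merely knocked on the door. The packet summaries, the addresses, and the function name handshake_completed are all hypothetical, made up for this example; in practice you'd pull this data from a packet capture.

```python
SYN, ACK, RST = "SYN", "ACK", "RST"

def handshake_completed(packets, client, server):
    """Return True if SYN, SYN-ACK, ACK were seen, in order, between client and server.

    packets is a list of (source, destination, flags) tuples; this layout is an
    assumption for the sketch, not the output format of any particular tool.
    """
    expected = [
        (client, server, {SYN}),        # client asks to connect
        (server, client, {SYN, ACK}),   # server agrees
        (client, server, {ACK}),        # client confirms
    ]
    step = 0
    for src, dst, flags in packets:
        if step < len(expected) and (src, dst, set(flags)) == expected[step]:
            step += 1
    return step == len(expected)

# Hypothetical traffic: one real connection, one attempt that was refused.
established = [
    ("10.0.0.5", "10.0.0.9", {SYN}),
    ("10.0.0.9", "10.0.0.5", {SYN, ACK}),
    ("10.0.0.5", "10.0.0.9", {ACK}),
]
refused = [
    ("10.0.0.5", "10.0.0.9", {SYN}),
    ("10.0.0.9", "10.0.0.5", {RST, ACK}),
]

print(handshake_completed(established, "10.0.0.5", "10.0.0.9"))
print(handshake_completed(refused, "10.0.0.5", "10.0.0.9"))
```

The point isn't the code; it's that a lone SYN in the logs tells you someone tried to connect, not that data moved, and knowing the difference changes what you report.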

When it comes to responding to an incident, there are four main locations for data collection:
1. The network - classic packet sniffing; analyze with Wireshark, NetWitness, etc.
2. Network devices - routers, firewalls, any appliances (IPS, Damballa, etc.)
3. Host systems/memory
4. Host systems/physical disk - includes not just the host OS, but any applications, application logs, etc.

Data can be collected from any of these sources to assist you in your incident, IF you know where they are, how to (or who can) access them, and where the necessary data is located.

During an incident, something that is more dangerous than a lack of information (because you can always fill the gap, often at the expense of time...) is misinformation. Sometimes the best thing a responder or incident manager can do is recognize that they don't know something, and then gather data and facts, and perform analysis, in order to fill the gap. The absolute worst thing that can happen is when those gaps are filled (or created) by speculation and blatantly incorrect information.

As an example, I've been involved in a number of malware-related incidents in which the corporate AV solution did not detect a bit of malware, or they were hit with a variant of a known malware family. In one instance, the IT staff captured a copy of the malware and ran "strings" on it, and then searched Google for one of the strings they found. On the second page of the Google search, they found a reference to a keystroke logger on a public forum. Assuming that this was 100% credible, they proclaimed that the malware had keystroke logging functionality...which immediately caused the Legal and Compliance department to fire off a brown star cluster (Marines will find that one humorous) and declare an emergency! After all, what started out as a quickly-eradicated malware issue now became one of potential sensitive data theft!

In this case, the information collected by the IT staff (i.e., the string) had no context. Where was the string located within the EXE file? In the resource section, or in the import table? Depending on which one, that string can affect incident response in vastly different ways.
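To show what "context" could look like, here is a minimal Python sketch that extracts printable strings along with their file offsets, then maps each offset to a section. Everything in it is an assumption for illustration: the sample bytes are synthetic, the section names and ranges are invented, and the helper names are mine. For a real EXE, the section boundaries would have to come from the file's actual PE section table rather than a hand-written list.

```python
import re

def strings_with_offsets(data, min_len=4):
    """Yield (offset, text) for each run of printable ASCII at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

def section_for_offset(offset, sections):
    """Return the name of the section containing offset.

    sections is a list of (name, raw_start, raw_size) tuples; hypothetical here,
    normally read from the PE section table.
    """
    for name, start, size in sections:
        if start <= offset < start + size:
            return name
    return "(outside any section)"

# Synthetic stand-in for a binary: two strings padded with null bytes.
sample = (
    b"\x00" * 16 + b"GetAsyncKeyState" +
    b"\x00" * 16 + b"Keylogger" + b"\x00" * 7
)

# Invented layout: imports in the first 40 bytes, resources in the next 40.
layout = [(".idata", 0, 40), (".rsrc", 40, 40)]

for offset, text in strings_with_offsets(sample):
    print(f"{offset:#06x}  {section_for_offset(offset, layout):8}  {text}")
```

A string that sits among the imports suggests an API the program may actually call, while the same word in a resource could be nothing more than leftover text. Neither one is a finding by itself, but at least you now know which question to ask next.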

In another example, I was assisting an incident manager and providing advisory services when I was watching a couple of the IT staff assembled in the "war room". Two of the IT staff were talking about something related to the incident, and I heard one of them mention "keystroke logger". Given the incident, having a keystroke logger on systems would be very bad...one might even say, super bad. Another IT staffer who was working away on the other side of the room looked up when he heard this, and said, "the Trojan is a keystroke logger?" Right about that time, an IT manager walked into the room and heard this and made the statement, "The Trojan is a keystroke logger." An hour later at a status meeting, the IT manager reported to the incident manager that a keystroke logger had been installed on systems on the network. Hint: during the ensuing hour, no one had done any examination of either the Trojan or any of the systems.

During incident response, the key to effectively managing an incident is knowing what you don't know, and doing what's necessary to fill that gap. Hint: Speculating ain't it!


Anonymous said...

Thanks for this interesting article. I have experienced firsthand that getting the right information and being able to communicate are key challenges in incident response.

Keep up the good work with your blog!


H. Carvey said...


Thanks for the comment...

Yeah, it appears that in any incident, there is going to be misinformation unless things are carefully controlled. By that, I mean that the incident manager needs to keep to the facts, and weigh any information he or she does get with a skeptical eye.

PG said...

This article is so true -- anyone who has ever been in incident response will tell you that basic information is hard to come by.

We once had a guy who swore that his computer was being remote controlled by malware -- his mouse was moving involuntarily all over the screen. It turned out he had the same model wireless mouse as a co-worker sitting next to him and the radio signals were getting crossed. Pretty funny.

H. Carvey said...

That's too funny! A friend of mine told me about an incident once that involved an RFID keyboard with a reach of two offices down the hall...