Windows Incident Response: Characteristics of Effective Incident Response

Monday, January 05, 2009

Characteristics of Effective Incident Response

The Need
There is a need for effective incident response, now more than ever. However, the key to incident response is incident preparedness. Responding without being prepared to respond correctly is what turns an incident into a major data breach and a major embarrassment.

Addendum: If you don't think incidents are going to happen, or that they aren't going to happen to you, take a look at this SecurityFix post from Brian Krebs. To be clear, the very first sentence says "reported"...which perhaps indicates a subset of what was actually detected.

There are three primary characteristics of effective incident response, and knowing and understanding these lead to being better prepared to respond to incidents:

1. Completeness of Data
When I am called to respond to an incident, there are four sources of data I will generally look for, to some degree, regardless of the type of incident:

- Network traffic captures
- Logs from network devices (firewalls, routers, etc.)
- Host-based volatile data
- Host-based non-volatile data

Now, you may not need data from every source for every incident, and the value of the data from these sources may vary depending upon the incident. In all the time I've been doing IR, in various capacities, I can't recall a single time when I've had all four sources of data available...in fact, in most cases, I'm lucky to get one, and it's a miracle to have two.
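One way to keep track of this during a response is a simple checklist of which sources you actually have in hand. The snippet below is only a minimal sketch; the `coverage` helper and the source labels are my own illustrative names, not part of any tool.

```python
# The four data sources named in the list above.
SOURCES = (
    "network traffic captures",
    "network device logs",
    "host-based volatile data",
    "host-based non-volatile data",
)


def coverage(available):
    """Split SOURCES into those collected so far and those still missing."""
    have = [s for s in SOURCES if s in available]
    missing = [s for s in SOURCES if s not in available]
    return have, missing


# Hypothetical engagement where only firewall and router logs were retained.
have, missing = coverage({"network device logs"})
print(f"{len(have)} of {len(SOURCES)} sources available")
print("missing: " + ", ".join(missing))
```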

2. Accuracy of Data
How accurate the data needs to be often depends upon what it is you're actually trying to do. For example, capturing portions of volatile memory using a batch file and third-party tools (e.g., tlist.exe, tcpvcon.exe) to get the list of active processes, network connections, etc., may be accurate enough for your needs. However, in other instances, collecting the full contents of physical memory may be the only means of achieving accuracy (and completeness) of data.
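The batch-file approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `collect` function and `EXAMPLE_COMMANDS` are hypothetical names of my own, the sketch assumes tlist.exe and tcpvcon.exe are on the PATH of the system being examined, and the timestamp and SHA-256 hash are my additions so that each output file can be tied to a collection time and verified later.

```python
import hashlib
import subprocess
from datetime import datetime, timezone
from pathlib import Path

# Assumed example commands; adjust to whatever tools are in your toolkit.
EXAMPLE_COMMANDS = {
    "processes": ["tlist.exe"],
    "connections": ["tcpvcon.exe", "-a"],
}


def collect(commands, out_dir):
    """Run each command, save its output, and log a UTC timestamp and SHA-256."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    records = []
    for name, argv in commands.items():
        started = datetime.now(timezone.utc).isoformat()
        try:
            data = subprocess.run(argv, capture_output=True, timeout=120).stdout
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Record the failure instead of aborting the rest of the collection.
            data = f"collection failed: {exc}".encode()
        (out / f"{name}.txt").write_bytes(data)
        records.append((name, started, hashlib.sha256(data).hexdigest()))
    # Write a simple tab-separated log: name, start time, hash of the output.
    with open(out / "collection_log.txt", "w", encoding="utf-8") as log:
        for name, started, digest in records:
            log.write(f"{name}\t{started}\t{digest}\n")
    return records
```

Calling `collect(EXAMPLE_COMMANDS, "ir_output")` would write one text file per command plus a log. Keep in mind that this only captures what the tools report at that moment; it is not a substitute for a full physical memory dump.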

3. Temporal Proximity
This is a term I borrowed from Aaron Walters (of Volatility fame), not because it's cool and Star Trek-y sounding, but because it makes perfect sense. The sooner you begin responding to an incident, the more complete and accurate data you're going to get, and your overall response is going to be more effective and lead to better remediation, etc.

Since this field is fraught with analogies, let's try this one...as a homeowner, do you have smoke detectors in your home? How about fire extinguishers? I do. How about insurance? When you called about your insurance, did the insurance company ask how far you are from the nearest hydrant? How about from the nearest fire department? Did you get a home inspection prior to purchasing your home? Is it up to code with respect to exits, etc.? Do you know what to do in case of a grease fire in your kitchen?

So, your home is full of valuables, in particular, your family. If your home were to catch fire, what would you do? First off, how would you know? Then, once you knew, what would you do? Would you just wait until the house burned down to call the fire department?

Now, map your home to your network infrastructure, and a fire to a data breach. Where are your valuables? Do you have a detection mechanism? Are you able to respond immediately and correctly to a grease fire?

This example shows, in part, that temporal proximity plays an important role in incident response, but it also highlights the need for approaching it the right way, which may include training. So what does this mean for the coming year? Well, it should mean that training will be more important than ever. Think about it. If a small fire starts in your house, would you (a) wait for an hour, then call the fire department, (b) wait for the house to burn down completely, or (c) begin putting the fire out yourself immediately? With respect to computer security incidents, who is in a better position to respond immediately...the sysadmin who is currently logged into the console, or a responder such as myself who is 24-48 hrs away from being on-site? Not only is the sysadmin there, but she more than likely knows the systems and the architecture very well; an external responder such as myself is going to have to get up to speed on your architecture, and even then won't know all of the little nuances.

This is all fine and good, but as Lon Solomon is fond of saying, "so what?" The fact is that many organizations simply don't put any effort or resources into their response plan because they feel that there's no requirement to do so.

External Forces
I'm not going to make a prediction for the future, but the primary drivers for incident response in 2009 (and beyond) will continue to be external forces; specifically, legislative and regulatory compliance requirements. Why is that? Because they have to be. After all, most of us think that if someone is making a business out of storing and/or processing our sensitive (PII, PHI, PCI) data, then of course they'll do everything they can to protect and secure that data. I mean, why wouldn't they, right? After all, if a buddy wants you to hold $50 for him, or the ring he's going to present to his bride at their wedding, then you're going to do everything you can to protect it, right? For many of us, this is just what we think of as common sense, but that's sadly not the case with your sensitive data. Look here, or here. And things only start to change when some external forces come into play, and those forces are strong enough to act as the stimulus to cause that change...those external forces being legal (think CA SB-1386, etc.) or regulatory compliance (think HIPAA, Visa PCI, etc.) requirements.
