Monday, March 10, 2025

The Problem with the Modern Security Stack

I read something interesting recently that stuck with me. Well, not "interesting", really...it was a LinkedIn post on security sales. I usually don't read or follow such things, but for some reason, I started reading through this one and really engaging with the content. The piece said that "SOC analysts want fewer alerts", and went on to discuss selling solutions for the "security stack". I was drawn in by the reference to "security stack", and it got me thinking...what constitutes a "security stack"? What is that, exactly?

Now, most folks are going to refer to various levels of tooling, but for every definition you hear, I want you to remember one thing...most of these "security stacks" stand on an extremely weak foundation. If you lay your "security stack" over default installations of operating systems and applications, with no configuration changes or "hardening", no asset inventory, and not even the most basic attack surface reduction, it's all for naught. It's difficult to filter out noise and false positives in detections when nothing has been done to configure the endpoints themselves to reduce that noise.

One approach to a security stack is to install EDR and other security tooling on the endpoints (all of them, one would hope) and manage it yourself, via your own SOC. I know of one organization that, several years ago, had installed EDR on a subset of their systems and enabled it in learning mode. Unfortunately, that subset included a segment on which a threat actor was very active, and rather than being used to take action against the threat actor, the EDR "learned" that the threat actor's activity was "normal".

I know of another organization that was hit by a threat actor, and during the after-action review, they found that the threat actor had used "net user" (a native tool/LOLBin) to create new user accounts within their environment. They installed EDR and were not satisfied with the default detection rules, so they created one to detect the use of net.exe to create user accounts. They were able to do this because they knew that, within their organization, this LOLBin was not used to manage user accounts, and they also knew which application was used for that purpose, which admins did the work, and from which workstations. As such, they were able to write a detection rule with 100% fidelity, knowing that any detection was going to be malicious in nature.
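
To make that concrete, here's a minimal sketch of what such a high-fidelity rule might look like, written in Python against a generic process event. This is not the organization's actual rule; the event field names, the allowlisted workstations, and the account names are all hypothetical placeholders.

# Hedged sketch of a high-fidelity detection: net.exe/net1.exe used to create
# a user account from anywhere other than the known admin workstations.
# Field names and allowlist values below are hypothetical examples.

import re

ALLOWED_WORKSTATIONS = {"ADMIN-WS01", "ADMIN-WS02"}      # hypothetical
ALLOWED_USERS = {"CORP\\helpdesk.admin"}                  # hypothetical

# Matches "net user <name> ... /add" or "net1 user <name> ... /add"
NET_USER_ADD = re.compile(r"\bnet1?(\.exe)?\s+user\s+\S+.*\s/add\b", re.IGNORECASE)

def is_suspicious_user_creation(event: dict) -> bool:
    """Return True if a process event looks like unauthorized account creation."""
    image = event.get("image", "").lower()
    cmdline = event.get("command_line", "")
    if not image.endswith(("\\net.exe", "\\net1.exe")):
        return False
    if not NET_USER_ADD.search(cmdline):
        return False
    # Known-good admin activity is excluded; everything else is reportable
    if event.get("hostname") in ALLOWED_WORKSTATIONS and event.get("user") in ALLOWED_USERS:
        return False
    return True

# Example event, as a threat actor might generate it
print(is_suspicious_user_creation({
    "image": r"C:\Windows\System32\net.exe",
    "command_line": "net user backdoor P@ssw0rd! /add",
    "hostname": "FILESRV01",
    "user": "CORP\\svc_web",
}))  # True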

What happens if you outsource your "security stack", even just part of it, and don't manage that stack yourself (usually referred to as MDR or XDR)? Outsourcing can become even more of an issue, because while you have access to expertise (you hope), you're faced with another problem altogether. Those experts in response and detection engineering are now receiving data from literally hundreds of other infrastructures, all in states of disarray similar to (but different from) yours. The challenge then becomes, how do you write detections so that they work, but do not flood the SOC (and ultimately, customers) with false positives?

In some cases, the answer is, you don't. There are times when activity that is 100% malicious on one infrastructure is part of a critical business process on another. While I worked for one MDR company in particular, we saw that...a lot. We had a customer whose entire business was built on sharing Office documents with embedded macros over the Internet...and yes, that's exactly how a lot of malware made (and makes) it onto networks. We also had other customers for whom MSWord or Excel spawning cmd.exe or PowerShell (i.e., running a macro) could be a false positive. Under such circumstances, do you keep the detections and run the risk of regularly flooding the SOC with alerts that are all false positives, or do you take a bit of a different approach and focus on detecting post-exploitation activity only?
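
For illustration, here's a skeleton of the "Office application spawning a shell" detection being described; whether you ship it as-is, add per-customer exclusions, or drop it in favor of post-exploitation detections is exactly the trade-off above. Again, the event field names are hypothetical.

# Sketch of the classic "Office app spawns a shell" detection. For some
# customers this is near-100% malicious; for others (macro-heavy business
# processes) it fires on routine activity, which is the false-positive
# trade-off in question. Field names are hypothetical.

OFFICE_PARENTS = {"winword.exe", "excel.exe", "powerpnt.exe"}
SHELL_CHILDREN = {"cmd.exe", "powershell.exe", "pwsh.exe", "wscript.exe", "cscript.exe"}

def office_spawned_shell(event: dict) -> bool:
    """Flag a process event where an Office application spawned a shell or script host."""
    parent = event.get("parent_image", "").lower().rsplit("\\", 1)[-1]
    child = event.get("image", "").lower().rsplit("\\", 1)[-1]
    return parent in OFFICE_PARENTS and child in SHELL_CHILDREN

print(office_spawned_shell({
    "parent_image": r"C:\Program Files\Microsoft Office\root\Office16\EXCEL.EXE",
    "image": r"C:\Windows\System32\cmd.exe",
}))  # True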

Figure 1: LinkedIn Post
A recent LinkedIn post from Chris regarding the SOC survey results is shown in Figure 1. One of the most significant issues for a SOC is the "lack of logs". This could be due to a number of reasons, but in my experience over the years, the lack of logs is very often the result of configuration management, or the lack thereof. For example, by default, MSSQL servers log failed login attempts and modifications to stored procedures (enabling, disabling); successful logins are not recorded by default. I've also seen customers either disable auditing of both successful logins and failed login attempts, or turn the auditing up so high that the information needed when an attack occurs is overwritten quickly, sometimes within minutes or less. All of this goes back to how the foundation of the security stack, the operating system and installed applications, is built, configured, and managed.
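
If you're wondering where a given MSSQL instance stands, the login auditing setting can be read from the registry with a few lines of code. The sketch below is a rough example only; the instance ID in the path varies by version and instance name, so treat the path and the level interpretation as assumptions to verify against your own build.

# Sketch: read the login auditing level for a local MSSQL instance from the
# registry. The instance ID ("MSSQL15.MSSQLSERVER" below) is an example and
# varies by version/instance name; verify the value meanings for your build.

import winreg

INSTANCE_KEY = r"SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQLServer"
AUDIT_LEVELS = {
    0: "None",
    1: "Successful logins only",
    2: "Failed logins only (default)",
    3: "Both failed and successful logins",
}

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, INSTANCE_KEY) as key:
    level, _ = winreg.QueryValueEx(key, "AuditLevel")
    print(f"Login auditing: {AUDIT_LEVELS.get(level, f'unknown ({level})')}")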

Figure 2: SecureList blog Infection Flow
Figure 2 illustrates the Infection Flow from a recent SecureList article documenting findings regarding the SideWinder APT. The article includes the statement, "The attacker sends spear-phishing emails with a DOCX file attached. The document uses the remote template injection technique to download an RTF file stored on a remote server controlled by the attacker. The file exploits a known vulnerability (CVE-2017-11882) to run a malicious shellcode..."; yes, it really does say "CVE-2017-11882", and yes, that CVE was published over 7 years ago. I'm sharing the image and link to the article not to shame anyone, but rather to illustrate that the underlying technology employed by many organizations may be out of date, unpatched, and/or left in default, easily compromised configurations.

The point I'm making here is that security stacks built on a weak foundation are bound to have problems, perhaps even catastrophic ones. A strong foundation begins with an asset inventory (of both systems and applications) and attack surface reduction (through configuration, patching, etc.). Very often, it doesn't take a great deal to harden systems; for example, here's a Huntress blog post where Dray shared free PowerShell code that provides a modicum of "hardening" for endpoints.
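
An asset inventory doesn't have to start with buying another product, either. As a rough starting point, here's a sketch that pulls the installed-application list from the Windows per-machine uninstall registry keys; it won't catch everything (portable apps, for example), and aggregating the output from each host centrally is left to you.

# Minimal sketch: enumerate installed applications from the Windows uninstall
# registry keys (64-bit and 32-bit views) as a starting point for an
# application inventory. Run per host and collect the output centrally.

import winreg

UNINSTALL_KEYS = [
    r"SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall",
    r"SOFTWARE\WOW6432Node\Microsoft\Windows\CurrentVersion\Uninstall",
]

def installed_applications():
    apps = []
    for path in UNINSTALL_KEYS:
        try:
            root = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path)
        except OSError:
            continue
        with root:
            for i in range(winreg.QueryInfoKey(root)[0]):
                with winreg.OpenKey(root, winreg.EnumKey(root, i)) as sub:
                    try:
                        name, _ = winreg.QueryValueEx(sub, "DisplayName")
                        version, _ = winreg.QueryValueEx(sub, "DisplayVersion")
                    except OSError:
                        continue
                    apps.append((name, version))
    return sorted(set(apps))

for name, version in installed_applications():
    print(f"{name}\t{version}")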

Common issues include:
- Publicly exposed RDP on servers and workstations, with no MFA; no one is watching the logs, so the brute force attacks from public IP addresses go unnoticed (see the exposure-check sketch after this list)
- Publicly exposed RDP with MFA, but other services not covered by MFA (SMB, MSSQL) are also exposed, so the MFA can be disabled; the same applies to other security services, such as anti-virus and even EDR
- Exposed, unpatched, out-of-date services
- Disparate endpoints that are not covered by security services (webcams, anyone?) 
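
As a quick self-check against the first two items, here's a rough sketch that, run from an external vantage point, checks whether a few commonly abused services answer on a public IP. A single TCP connect is not an assessment, and the address below is a documentation placeholder, but an Internet-facing 3389, 445, or 1433 is exactly the weak foundation described above.

# Quick self-check sketch: from an external vantage point, see which of these
# commonly abused services answer on a public IP. Replace the example
# (documentation-range) address with your own.

import socket

TARGET = "203.0.113.10"          # placeholder address; use your own public IP
PORTS = {3389: "RDP", 445: "SMB", 1433: "MSSQL"}

for port, service in PORTS.items():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(3)
        state = "OPEN" if s.connect_ex((TARGET, port)) == 0 else "closed/filtered"
        print(f"{service:5} ({port}): {state}")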
