Sunday, October 10, 2021

Data Exfiltration, Revisited

I've posted on the topic of data exfiltration before (here, etc.) but often it's a good idea to revisit the topic. After all, it was almost two years ago that we saw the first instance of ransomware threat actors stating publicly that they'd exfiltrated data from systems, using this as a secondary means of extortion. Since then, we've continued to see this tactic used, along with other tertiary means of extortion based on data exfiltration. We've also seen several instances where the threat actor ransom notes have stated that data was exfiltrated but the public "shaming" sites were noticeably empty.

As long as I've been involved in what was first referred to as "information security" (later referred to as "cyber security"), data exfiltration has been a concern to one degree or another, even in the absence of clearly-stated and documented analysis goals. With the advent of PCI forensic investigations (circa 2007-ish), "data exfiltration" became a formalized and documented analysis goal for every investigation, whether the merchant asked for it or not. After all, what value was the collected data if the credit card numbers were extracted from memory and left sitting on the server? Data exfiltration was/is a key component necessary for the crime, and as such, it was often assumed without being clearly identified.

One of the challenges of determining data exfiltration is visibility; systems and networks may simply not be instrumented in a manner that allows us to determine if data exfiltration occurred. By default, Windows systems do not have a great deal of data sources and artifacts that demonstrate data exfiltration in either a definitive or secondary manner. While some do exist, they very often are not clearly understood and investigated by those who then state, "...there was no evidence of data exfiltration observed..." in their findings.

Many years ago, I responded to an incident where an employee's home system had been compromised and a keystroke logger installed. The threat actor observed through the logs that the employee had remote access to their work infrastructure, and proceeded to use the same credentials to log into the corporate infrastructure. These were all Windows XP and 2003 systems, so artifacts (logs and other data sources) were limited in comparison to more modern versions of Windows, but we had enough indicators to determine that the threat actor had no idea where they were. The actor conducted searches that (when spelled correctly) were unlikely to prove fruitful...the corporate infrastructure was for a health care provider, and the actor was searching for terms such as "banking" and "password". All access was conducted through RDP, and as such, there were a good number of artifacts populated when the actor accessed files.

At that point, data exfiltration could have occurred through a number of means. The actor could have opened a file, and taken a picture or screen capture of their own desktop...they could have "exfiltrated" the data without actually "moving" it.

Jump forward a few years, and I was working on an APT investigation when EDR telemetry demonstrated that the threat actor had archived files...the telemetry included the password used in the command line. Further investigation led us to a system with a publicly-accessible IIS web server, albeit without any actual formal web sites being served. Web server logs illustrated that the threat actor downloaded zipped archives from that system successfully, and file system metadata indicated that the archive files were deleted once they'd been downloaded. We carved unallocated space and recovered a dozen accessible archives, which we were able to open using the password observed in EDR telemetry. 

In another instance, we observed that the threat actor had acquired credentials and was able to access OWA, both internally and externally. What we saw the threat actor do was access OWA from inside the infrastructure, create a draft email, attach the data to be exfiltrated to the email, and then access the email from outside of the infrastructure. At that point, they'd open the draft email, download the attachment, and delete the draft email. 

When I first began writing books, my publisher had an interesting method for transferring manuscript files. They sent me instructions for accessing their FTP site via Windows Explorer (as opposed to the command line), which left remnants on the system well beyond the lifetime of the book itself.

My point is that there are a number of ways to exfiltrate data from systems, and detecting data exfiltration can be extremely difficult without the necessary visibility. However, there are data sources on Windows systems that can provide definitive indications of data exfiltration (e.g., BITS upload jobs, web server logs, email, network connections/pcaps in memory dumps and hibernation files), as well as potential indications of data exfiltration (e.g., shellbags, SRUM). These data sources are relatively easy (almost trivial) to check, and in doing so, you'll have a more comprehensive approach to addressing the issue.
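To illustrate how trivial some of these checks can be, here's a minimal sketch (in Python) of sweeping web server logs for successful downloads of archive files, the same sort of indicator described in the IIS example above. The log lines and field layout are fabricated assumptions for illustration; real IIS W3C logs declare their fields in a #Fields: directive, which you'd want to honor.

```python
# Fabricated sample log lines; assumed field order:
# date time c-ip cs-method cs-uri-stem sc-status sc-bytes
SAMPLE_LOG = """\
2021-10-01 03:12:44 203.0.113.7 GET /aspnet_client/data1.zip 200 52428800
2021-10-01 03:13:02 203.0.113.7 GET /iisstart.htm 200 696
2021-10-01 03:15:19 203.0.113.7 GET /aspnet_client/data2.zip 200 61342133
"""

ARCHIVE_EXTS = (".zip", ".rar", ".7z")

def find_archive_downloads(log_text, min_bytes=1024 * 1024):
    """Flag successful GET requests for archive files over a size threshold."""
    hits = []
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue
        date, time, c_ip, method, uri, status, sc_bytes = fields[:7]
        if (method == "GET" and status == "200"
                and uri.lower().endswith(ARCHIVE_EXTS)
                and int(sc_bytes) >= min_bytes):
            hits.append((date + " " + time, c_ip, uri, int(sc_bytes)))
    return hits
```

Against the sample above, this flags the two large zip downloads and ignores the small HTML request; the point is that the check itself is a few lines, once the data has been collected.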

Friday, October 08, 2021

Tips for DFIR Analysts, pt III

Learn to think critically. Don't take what someone says as gospel, just because they say it. Support findings with data, and clearly communicate the value or significance of something.

Be sure to validate your findings, and never rest your findings on a single artifact. Find an entry for a file in the AmCache? Great. But does that mean it was executed on the system? No, it does not...you need to validate execution with other artifacts in the constellation (EDR telemetry, host-based effects such as an application prefetch file, Registry modifications, etc.).

Have a thorough process, one that you can add to and extend. Why? Because things are always changing, and there's always something new. If you can automate your process, then so much the better...you're not losing time to crushing inefficiencies. So what do you need to look for? Well, the Windows Subsystem for Linux has been around for some time, and has even been updated (to WSL2). There are a number of versions of Linux you can install via WSL2, including Parrot OS. As one would expect, there's now malware targeting WSL2 (Lumen Black Lotus Labs, TomsHardware, The Register).

Learn to communicate clearly and concisely. This includes both the written and spoken form. Consider using the written form to make the spoken form easier to communicate, by first writing out what you want to communicate.

Things are not always what they seem. Just because someone says something is a certain way doesn't make it the case. It's not that they're lying; more often than not, it's that they have a different perspective. Look at it this way...a user will have an issue, and you'll ask them to walk through what they did, to see if you can replicate the issue. You'll see data that indicates that they took a specific action, but they'll say, "I didn't do anything." What they mean is that they didn't do anything unusual or different from what they do on a daily basis.

There can often be different ways to achieve the same goal, different routes to the same ending. For example, Picus Security shared a number of different ways to delete shadow copies, which included resizing the VSC storage to be less than what was needed. From very cursory research, if a VSC does not fit into the available space, it gets deleted. This means that existing VSCs will likely be deleted, breaking brittle detections that look only for vssadmin.exe being used to directly delete the VSCs. Interestingly enough, I found this as a result of this tweet asking about a specific size (i.e., 401MB).
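As a sketch of what a less brittle detection might look like, here's a minimal Python example that accounts for more than one route to the same goal. The command line patterns are assumptions drawn from the techniques mentioned above, not an exhaustive list:

```python
import re

# Several routes to the same goal; the "resize" pattern is the one that
# breaks detections keyed solely on "delete shadows".
TAMPER_PATTERNS = [
    ("vssadmin delete", re.compile(r"vssadmin(\.exe)?\s+delete\s+shadows", re.I)),
    ("vssadmin resize", re.compile(r"vssadmin(\.exe)?\s+resize\s+shadowstorage", re.I)),
    ("wmic delete",     re.compile(r"wmic(\.exe)?\s+shadowcopy\s+delete", re.I)),
]

def classify_vsc_tampering(cmdline):
    """Return the names of any shadow copy tampering patterns a command line matches."""
    return [name for name, pat in TAMPER_PATTERNS if pat.search(cmdline)]
```

So a command line such as "vssadmin resize shadowstorage /for=C: /on=C: /maxsize=401MB" gets flagged just as readily as the direct "delete shadows" variant.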

Another example of different approaches to the same goal is using sc.exe vs reg.exe to control Windows services, etc. Different approaches may be predicated upon the threat actor's skill set, or it may be based on what the threat actor knows about the environment (i.e., situational awareness). Perhaps the route taken was due to the threat actor knowing the blind spots of the security monitoring tools, or of the analysts responding to any identified incidents.

There are also different ways to compromise a system...via email, the browser, any accessible services, even WSL2 (see above).

An issue within or challenge of DFIR has long been the signal-to-noise ratio (SNR). In the early days of DFIR, circa Win2000/WinXP systems, the issue had to do with limited data...limited logging, limited tracking of user activity, etc. As a result, there were limited artifacts to tie to any particular activity, making validation (and to an extent, attribution) difficult, at best.

As DFIR has moved to the enterprise and analysts began engaging with EDR telemetry, we've also seen surges in host-based artifacts, not only between versions of Windows (XP, to Win7, to Win10) but also across builds within Windows 10...and now Windows 11 is coming out. With more people coming into DFIR, there's been a corresponding surge in the need to understand the nature of these artifacts, particularly within the context of other artifacts. This has all led to a perfect storm; increases in available data (more applications, more Windows Event Logs, more Registry entries, more data sources) and at the same time, a compounding need to correctly and accurately understand and interpret those artifacts. 

This situation can easily be addressed, but it requires a cultural change. What I mean is that a great deal of the parsing, enrichment and decoration of available data sources can be automated, but without DFIR analysts baking what they've discovered...new constellations or elements of constellations...back into the process, this entire effort becomes pointless beyond the initial creation. What allows automation such as this to continue adding value over time is that it is developed, from the beginning, to be expanded; not only can new data sources be added, but new findings as well.

Hunting isn't just for threat hunters. We most often think of "threat hunting" as using EDR telemetry to look for "badness", but DFIR analysts can do the same thing using an automated approach. Whether full images are acquired or triage data is collected across an enterprise, the data sources can be brought to a central location, parsed, enriched, decorated, and then presented to the analyst with known "badness" tagged for viewing and pivoting. From there, the analyst can delve into the analysis much sooner, with greater context, and develop new findings that are then baked back into the automated process.
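A minimal sketch of what that "tagging known badness" step might look like, once triage data has been parsed into a common form. The indicator names, field names, and paths here are purely illustrative assumptions; the real value comes from analysts baking their own findings into these lists over time:

```python
# Illustrative known-bad indicators; in practice, this is exactly what
# analysts bake back into the process after each engagement.
KNOWN_BAD_NAMES = {"mimikatz.exe", "procdump.exe"}
SUSPECT_PATHS = ("\\perflogs\\", "\\users\\public\\")

def tag_events(events):
    """Decorate parsed events with tags so the analyst can pivot on known badness."""
    for ev in events:
        path = ev.get("image_path", "").lower()
        tags = []
        if path.rsplit("\\", 1)[-1] in KNOWN_BAD_NAMES:
            tags.append("known_bad_name")
        if any(p in path for p in SUSPECT_PATHS):
            tags.append("suspect_path")
        ev["tags"] = tags
    return events
```

The analyst then starts from the tagged events rather than from raw data, which is the "delve into the analysis much sooner, with greater context" part.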

Addendum, 9 Oct: Early in the above blog post, I stated, "Find an entry for a file in the AmCache? Great. But does that mean it was executed on the system? No, it does not...you need to validate execution with other artifacts in the constellation...". After I posted a link to this post on LinkedIn, a reader responded with, "...only it does."

However, this tweet states, "Amcache entries are created for executables that were never executed. Executables that were launched and then deleted aren't recorded. Also, Amcache entries aren't created for executables in non-standard locations (e.g., "C:\1\2\") _unless_ they were actually executed." 

Also, this paper states, on the second half of pg 24 of 66, "The appearance of a binary in the File key in AmCache.hve is not sufficient to prove binary execution but does prove the presence of the file on the system." Shortly after that, it does go on to say, "However, when a binary is referenced under the Orphan key, it means that it was actually executed." As such, when an analyst states that they found an entry "in" the AmCache.hve file, it is important to state clearly where it was found...specificity of language is critical. 

Finally, in recent instances I've engaged with analysts who've stated that an entry "in" the AmCache.hve file indicated program execution, and yet, no other artifacts in the constellation (Prefetch file, EDR telemetry, etc.) were found. 
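This sort of validation logic can even be encoded directly into an automated process, so the finding language stays honest. Here's a minimal sketch; the artifact names are illustrative assumptions, not an exhaustive constellation:

```python
# Corroborating artifacts that, alongside an AmCache File entry, support a
# statement of execution; names are illustrative, not exhaustive.
CORROBORATING = {"prefetch", "userassist", "edr_process_event"}

def assess_execution(artifacts):
    """Map a set of observed artifacts to the strongest supportable finding."""
    artifacts = set(artifacts)
    if "amcache_file_entry" in artifacts and artifacts & CORROBORATING:
        return "execution supported"
    if "amcache_file_entry" in artifacts:
        return "presence only"
    return "insufficient data"
```

The point is the asymmetry: an AmCache File entry alone only gets you "presence only", and the stronger statement requires at least one other member of the constellation.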

EDR Bypasses

During my time in the industry, I've been blessed to have opportunities to engage with a number of different EDR tools/frameworks at different levels. Mike Tanji offered me a look at Carbon Black before carbonblack.com existed, while it still used an on-prem database. I spent a very good deal of time working directly with Secureworks Red Cloak, and I've seen CrowdStrike Falcon and Digital Guardian's framework up close. I've seen the birth and growth of Sysmon, as well as MS's "internal" Process Tracking (which requires an additional Registry modification to record full command lines). I've also seen Nuix Adaptive Security up close (including seeing it used specifically for threat hunting), which rounds out my exposure. So, I haven't seen all tools by any stretch of the imagination, but more than one or two.

Vadim Khrykov shared a fascinating tweet thread regarding "EDR bypasses". In the thread, Vadim lists three types of bypasses:

1. Technical capabilities bypass - focusing on telemetry EDR doesn't collect
2. EDR configuration bypass - EDR config being "aggressive" and impacting system performance 
3. EDR detection logic bypass - EDR collects the telemetry but there is no specific detection to alert on the technique used

Vadim's thread got me to thinking about bypasses I've seen or experienced over the years....

1. Go GUI

Most EDR tools are really good about collecting information about new processes that are created, which makes them very valuable when the threat actor has only command line access to the system, or opts to use the command line. However, a significant blind spot for EDR tools is when GUI tools are used, because in order to access the needed functionality, the threat actor makes selections and pushes buttons, which are not registered by the EDR tools. This is a blind spot, in particular, for EDR tools that cannot 'see' API calls.

Note that this does not apply just to GUI tools; EXE and DLL files can either run external commands (which are picked up by EDR tools), or access the same functionality via API calls (which are not).

This has the overall effect of targeting analysts who may not be looking at artifact constellations. That is to say that analysts should be validating tool impacts; if an action occurred, what are the impacts of that action on the eco-system (i.e., Registry modifications, Windows Event Log records, some combination thereof, etc.)? This way, we can see the effects of an action even in the absence of telemetry specific to that action. For example, did a button push lead to a network connection, or modify firewall settings, or establish persistence via WMI? We may not know that the button was pushed, but we would still see the artifact constellations (even partial ones) of the impact of that button push.

Take Defender Control v1.6, for example. This is a simple GUI tool with a couple of buttons that allows the user to disable or enable Windows Defender. EDR telemetry will show you when the process is created, but without looking further for Registry modifications and Windows Event Log records, you won't know if the user opened it and then closed it, or actually used it to disable Windows Defender.
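A sketch of what that "looking further" might look like as an automated check, pairing the process creation event with host-based effects. The specific Registry value name and Defender event ID used here are assumptions for illustration, and the Registry/event data is passed in pre-parsed:

```python
# Host-based effects of actually disabling Defender (vs. just launching the
# tool); the value name and event ID below are assumptions for illustration.
DEFENDER_REG_VALUE = (
    r"HKLM\SOFTWARE\Policies\Microsoft\Windows Defender\DisableAntiSpyware"
)

def defender_disable_validated(reg_values, defender_event_ids):
    """Return True if a process creation event is corroborated by host-based effects."""
    reg_hit = reg_values.get(DEFENDER_REG_VALUE) == 1
    evt_hit = 5001 in defender_event_ids  # assumed: real-time protection disabled
    return reg_hit or evt_hit
```

If neither effect is present, the honest finding is that the tool was launched, not that Defender was disabled.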

2. Situational awareness 

While I was with CrowdStrike, I did a lot of public presentations discussing the value of threat hunting. During these presentations I included a great deal of telemetry taken directly from the Falcon platform, in part demonstrating the situational awareness (or lack thereof) of the threat actor. We'd see some who didn't seem to be aware or care that we were watching, and we'd see some who were concerned that someone was watching (checking for the existence of Cb on the endpoint). We'd also see other threat actors who not only sought out which specific EDR platform was in use, but also began reaching out remotely to determine other systems on which that platform was not installed, and then moving laterally to those systems.

I know what you're thinking...what's the point of EDR if you don't have 100% coverage? And you're right to think that, but over the years, for a variety of reasons, more than a few organizations impacted by cyber attacks have had limited coverage via EDR monitoring. This may have to do with Vadim's reason #2, or it may have to do with basic reticence to install EDR capability (concerns about the possibility of Vadim's reason #2...).

3. Vadim's #2++

To extend Vadim's #2 a bit, something else I've seen over the years is customers deploying EDR frameworks, albeit only on a very limited subset of systems. 

I've also seen where deploying EDR within an impacted organization's environment has been inhibited by...well...the environment, or the staff. I've seen the AD admin refuse to allow EDR to be installed on _his_ systems because we (my team) might remove it at the end of the engagement and leave a backdoor. I've seen users in countries with very strict privacy laws refuse to allow EDR to be installed on _their_ systems. 

I've seen EDR installed and run in "learning" mode during an engagement, so that the EDR "learned" that the threat actor's actions were "good".

One of the biggest variants of this "bypass" is an EDR that is sending alerts to the console, but no one is watching. As odd as it may sound, this happens considerably more often than you'd think.

EDR is like any other tool...its value depends heavily upon how it's used or employed. When you look closely at such tools, you begin to see their "blind spots", not just in the sense of things that are not monitored, but also how DFIR work significantly enhances the visibility into a compromise or incident.

Thursday, September 23, 2021

Imposter Syndrome

This is something many of us have experienced to one degree or another, at various times. Many have experienced it, some have overcome it, and others may not be able to, and wonder why.

HealthLine tells us, "Imposter feelings represent a conflict between your own self-perception and the way others perceive you." I would modify that slightly to, "...the way we believe others perceive us." Imposter syndrome is something internalized, and has very little to do with the outside world.

I wanted to take the opportunity to share with you, the reader, what I've learned over the years about what's really happening in the world when we're having those feelings of imposter syndrome.

Perception: I don't want to present at a conference, or ask a question at a conference, because everyone knows more than I do, and will ask questions I don't know the answer to.

Reality: Have you ever stood in the back of the room at a conference and just...watched? Most people you see aren't even listening to the presentation. Some are on social media (Twitter, Discord, Slack, etc.), and some are on their phones.

When someone asks a question about a presentation, calm yourself so instead of the ringing of fear in your ears, you actually hear the question. Does the question have anything to do with the presentation? I ask, because in my experience, I've gotten a few questions following a presentation, and some of those questions have been way off topic. Looking back, what seems to happen is that someone hears a term that they're familiar with, or something that strikes a chord, and they take it way out in left field. 

My point is simply that, at conferences, the simple reality is that a lot of people aren't really paying attention. This can be very important, even critical, when the fear behind imposter syndrome is that you're under a microscope.

I've found that I've had to engage with people directly to get a reaction. When I worked for CrowdStrike, I was involved with a number of public speaking engagements, and for the last 5 or so months, the events were about 3 1/2 hrs long. Across the board, almost no one actually asked questions. For those who did, it took time for them to do so; some wouldn't ask questions until the 2 1/2 or 3 hr mark. Some would ask questions, but only after I'd goaded someone else into asking a question.

Perception: I could never write a book or blog post, no matter how prepared I am, because the moment it's published it'll be torn apart, and I'll be eviscerated and exposed as the fraud that I am.

Reality: Most people don't actually read books and blog posts. My experience in writing books is that some will purchase the books, but few are willing to actually read the books, and even fewer will write reviews. It's the same with blog posts...you put work into the topic and content, and if you don't share a link to the post on social media, very few actually see the post. If you do share it on social media, you may get "likes" and retweets, but no one actually comments, neither on the tweet nor the post.

My overall point is that imposter syndrome is an internal dialog that prevents us from being our best and reaching our potential. We can overcome it by replacing it with a dialog that is more based on reality, on what actually happens, in the real world. 

My perspective on this is based on experience. In 2005, Cory Altheide and I published a paper discussing the artifacts associated with the use of USB devices on Windows systems (at the time, primarily Windows XP). After the paper was published, I spoke on the topic at a conference in my area, one primarily attended by law enforcement, and got no questions. Several years later, I was presenting at another conference, and had mistakenly opened the wrong version of the presentation; about halfway through, there was a slide with three bullets that all said, "blah blah blah". At the end of the presentation, three attendees congratulated me on a "great presentation". 



Sunday, September 19, 2021

Distros and RegRipper

Over the years, every now and then I've taken a look around to try to see where RegRipper is used. I noticed early on that it's included in several security-oriented Linux distros. So, I took the opportunity to compile some of the links I'd found, and I then extended those a bit with some Googling. I will admit, I was a little surprised to see how far RegRipper has gone over time, from a "here, look at this" perspective.

Not all of the below links are current, some are several years old. As such, they are not the latest and greatest; however, they may still apply and they may still be useful/valuable.

RegRipper on Linux (Distros) 
Kali 
Kali GitLab 
SANS SIFT 
CAINE  
Installing RegRipper on Linux 
Install RRv2.8 on Ubuntu 
CentOS RegRipper package 
Arch Linux  
RegRipper Docker Image 
Install RegRipper via Chocolatey 

Forensic Suites
Something I've always been curious about is why the value of RegRipper being incorporated into, and maintained through, a forensic analysis suite isn't more of "a thing"; even so, that hasn't prevented RegRipper and tools like it from being extremely valuable in a wide range of analyses.

RegRipper is accessible via Autopsy 
OSForensics Tutorial 
Launching RegRipper via OpenText/EnCase

When I worked for Nuix, I worked with Dan Berry's developers to build extensions for Yara and RegRipper (Nuix RegRipper Github) giving users of the Workstation product access to these open source tools in order to extend their capabilities. While both extensions really do a great deal to leverage the open source tool for use by the investigator, I was especially happy to see how the RegRipper extension turned out. The extension would automatically locate hive files, regardless of the Windows version (including the AmCache.hve file), automatically run the appropriate plugins against the hive, and then automatically incorporate the RegRipper output into the case file. In this way, the results were automatically incorporated into any searches the investigator would run across the case. During testing, we added images of Windows XP, Windows 2008 and Windows 7 systems to a case file, and the extension ran flawlessly.

It seems that RegRipper (as well as other tools) has been incorporated into KAPE, particularly into the Registry and timelining modules. This means that whether you're using KAPE freely, or you're using the enterprise license, you're likely using RegRipper and other tools I've written, to some extent.

I look back on this section, and I really have to wonder why, given how I've extended RegRipper since last year, there is no desire to incorporate RegRipper into (and maintain it through) a commercial forensic analysis suite. Seriously.

Presentations/Courses
I've covered RegRipper as a topic in this blog, as well as in my books. I've also given presentations discussing the use of RegRipper, as have others. Here are just a few links:

OSDFCon 2020 - Effectively Using RegRipper (video)
PluralSight Course 

RegRipper in Academia
Okay, I don't have a lot of links here, but that's because there were just so many. I typed "site:edu RegRipper" into a Google search and got a LOT of hits back; rather than listing the links, I'm just going to give you the search I ran and let you do with it what you will. Interestingly, the first link in the returned search results was from my alma mater, the Naval Postgraduate School; specifically, Jason Shaver's thesis from 2015.

On Writing DFIR Books, pt II

Part I of this series kicked things off for us, and honestly I have no idea how long this series will be...I'm just writing the posts without a specific plan or outline for the series. In this case, I opted to take an organic approach, and wanted to see where it would go.

Content
Okay, so you have an idea for a book, but about...what? You may have a title or general idea, but what's the actual content you intend to write about? Is it more than a couple of paragraphs; can you actually create several solid chapters without having to use a lot of filler and fluff? Back when I was actively writing books, this was something on the forefront of my mind, not only because I was writing books, but later I got a question or two from others along these lines.

In short, I write about stuff I know, or stuff I've done. Not everything I've written about came from work; a good bit of what I've written about came from research I'd done, either following up on something I'd seen or based on an idea I had. For example, during one period not long after we'd transitioned to Windows 7, I wanted to follow up on creating, using, and detecting NTFS alternate data streams (ADSs), and I'd found some content that provided alternate means for launching executables written to ADSs. I wanted to see if ADSs were impacted by scripting languages, and I added the results of what I found to the book content.

A number of years ago, I had access to an MSDN account, and that access was used to install operating systems and applications, and then to "see" the toolmarks or artifact constellations left behind by various activities, particularly identifying the impact of different applications and configurations. Unfortunately, the MS employee who'd provided me with the account moved on, and the license eventually expired, which was unfortunate, as I was able to see the impact of different activities not only with respect to the operating system, but also with respect to different application suites.

Sources of content can include (and in my case, have included) just about anything; blog posts, blog post drafts, presentation materials, the results of ad hoc testing, etc. 

Outline
Something I've learned over the years is that the easiest way to get started writing book content is to start with an outline. The outline allows (or forces) you to organize your thoughts into a structured format, and allows you to see the flow of the book from a high level. This also allows you to start adding flesh to the bones, if you will, seeing where the structure is and adding that content we talked about. It also allows you to see where there is value in consistency, in doing the same or similar writing (or testing) in different locations in the book (i.e., "chapters") in order to be as complete as possible. A well-structured outline will allow you to see gaps.

Further, if you have a well-structured outline that you're working from, the book almost writes itself. 

Monday, September 06, 2021

Tips for DFIR Analysts, pt II

On the heels of my first post with this subject, I thought I'd continue adding tips as they came to mind...

I've been engaged with EDR frameworks for some time now. I first became aware of Carbon Black before it was "version 1.0", and before "carbonblack.com" existed. Since then, I've worked for several organizations that developed EDR frameworks (Secureworks, Nuix, CrowdStrike, Digital Guardian), and others that made use of frameworks created by others. I've also been very happy to see the development and growth of Sysmon, and used it in my own testing.

One thing I've been acutely aware of is the visibility afforded by EDR frameworks, as well as the extent of that visibility. This is not a knock against these tools...not at all. EDR frameworks and tools are incredibly powerful, but they are not a panacea. For example, most (I say "most" because I haven't seen all EDR tools) track process creation telemetry, but not process exit codes. As such, it can be detrimental to assume that, because the EDR telemetry shows a process being created, the process successfully executed and completed. Some EDR tools can block processes based on specific criteria...I saw a lot of this at CrowdStrike, and shared more than a few examples in public speaking events. 

In other instances, the process may have failed to execute altogether. For example, it may have been detected by AV shortly after it started executing. I've actually been caught by Windows Defender; prior to initiating testing, I'll disable real-time monitoring, but leave Defender untouched other than that. I'll then go about my testing, and then at some point in the future (sometimes around 4 hrs), I'll be continuing my testing, only to have Windows Defender recover (real-time monitoring is automatically re-enabled), and find that what I was working on was quarantined.

Did the executable throw an error shortly after launch, with a WER record, or an Application PopUp message, being generated in the Windows Event Log? 

Were you able to validate the impact of the executable or command? For example, if the threat actor was seen running a command that would impact the Windows Registry and result in Windows Event Log records being generated, were those artifacts validated and observed on the system?

The overall point is that while EDR frameworks provide a tremendous amount of visibility, they do not provide ALL visibility.

What's Old Is New Again
Something a bit more on the deeper forensicy-technical side...

I ran across a couple of things recently that reminded me that what I found fascinating and dug into deeply twenty years ago may not be that well known today. For example, last year, Windows Defender/MpCmdRun.exe was identified as an LOLBin, and that was accompanied by the fact that files could be written to alternate data streams (ADSs). 

I also ran across something "new" regarding the "zone.identifier" ADSs associated with downloads (Edge, Chrome); specifically, the ZoneID ADSs are no longer 26 or 29 bytes in size. They're now much larger because they include more information, including HostURL and RefererURL entries, as illustrated in fig 1.

Fig 1: ZoneID ADS contents




This clearly provides some useful information regarding the source of the file. The ADS illustrated in fig 1 is from a PDF document I'd downloaded to my desktop via Chrome; as such, it wasn't found in my Downloads folder (a dead giveaway for downloads...), but the existence and contents of the ADS clearly demonstrate that the file was indeed downloaded to the system.
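For reference, the Zone.Identifier stream contents are simple INI-style text, so pulling these fields out programmatically is trivial. The sample contents below are fabricated for illustration (ZoneId 3 being the Internet zone), not copied from the file shown in fig 1:

```python
import configparser

# Fabricated sample of a modern Zone.Identifier ADS
SAMPLE_ADS = """\
[ZoneTransfer]
ZoneId=3
ReferrerUrl=https://example.com/downloads/
HostUrl=https://example.com/files/report.pdf
"""

def parse_zone_identifier(text):
    """Parse INI-style Zone.Identifier stream contents into a dict."""
    cp = configparser.ConfigParser()
    cp.optionxform = str  # preserve key case (ZoneId, HostUrl, ...)
    cp.read_string(text)
    return dict(cp["ZoneTransfer"])
```

Run across a collection of downloaded files, something like this lets you recover source URLs even when the files have been moved out of the Downloads folder.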

Now, we'll just need to see if other means for downloading files...BITS jobs, LOLBins, etc...produce similar results.

On Writing DFIR Books, pt I

During my time in the industry, I've authored 9 books under three imprints, and co-authored a tenth.

There, I said it. The first step in addressing a problem is admitting you have one. ;-)

Seriously, though, this is simply to say that I have some experience, nothing more. During the latter part of my book writing experience, I saw others who wanted to do the same thing, but ran into a variety of roadblocks, roadblocks I'd long since navigated. As a result, I tried to work with the publisher to create a non-paid liaison role that would help new authors overcome many of those issues, so that a greater portfolio of quality books became available to the industry. By the time I convinced one editor of the viability and benefit of such a program, they had decided to leave their profession, and I had to start all over again, quite literally from the very beginning.

Authoring a book has an interesting effect, in that it tends to create a myth around the author, one that they're not aware of at first. It starts with someone saying, "...you wrote a book, so you must X..". Let "X" be just about anything. 

"Of course you're good at spelling, you wrote a book." Myth.

"You must be rolling in money, you wrote a book." Myth.

All of these things are assumptions, myths built up only to serve as obstacles. The simple fact is that if you feel like you want to write a book, you can. There's nothing stopping you, except...well...you. To that end, I thought I'd write a series of posts that dispel the myths and provide background and a foundation for those considering the possibility of writing a book.

There are a number of different routes to writing books. Richard Bejtlich has authored or co-authored a number of books, the most recent of which have been reprints of his Tao Security blog posts. Emma Bostian tweeted about her success with "side projects", the majority of which consisted of authoring and marketing her ebooks.

The Why
So, why write books at all? In an email that Gen Jim Mattis (ret) authored that later went viral, he stated:

By reading, you learn through others’ experiences, generally a better way to do business, especially in our line of work where the consequences of incompetence are so final for young men.

Yes, Gen Mattis was referring to warfighting, but the principle applies equally well to DFIR work. In his book, "Call Sign Chaos", Mattis further stated:

...your personal experiences alone aren't broad enough to sustain you.

This is equally true in DFIR work; after all, what is "analysis" but the analyst applying the sum total of their knowledge and experience to the amassed data? As such, the reason to write books is that no one of us knows everything, and we all have vastly different experiences. Even working with another analyst on the same incident response engagement, I've found that we've had different experiences due in large part to our different perspectives.

The simple fact is that these different perspectives and experiences can be profoundly valuable, but only if they're shared. A while back, I engaged in an analysis exercise where I downloaded an image and memory sample provided online, and conducted analysis based on a set of goals I'd defined. During this exercise, I extracted significantly different information from the memory sample using two different tools; I used Volatility to extract information about netstat-style network connections, and I also used bulk_extractor to retrieve a PCAP file, built from the remnants of actual packets extracted from memory. I shared what I'd done and found with one other analyst, and to be honest, I don't know if they ever had the chance to try it, or remembered to do so the next time the opportunity arose. Since then, I have encountered more than a few analysts to whom this approach had never occurred; while they haven't always seen significant value from the effort, it has remained a part of their toolkit. I also included the approach in "Investigating Windows Systems", where it is available, and I assume more than one analyst has read it and taken note.

Speaking for myself, I began writing books because I couldn't find what I wanted on the shelves of the bookstore. It's as simple as that. I'd see a title with the words "Windows" and "forensics" in the title, and I'd open it, only to find that the dive did not go deep enough for me. At the time, many of the books related to Windows forensics were written by those who'd "grown up" using Linux, and this was clearly borne out in the approach taken, as well as the coverage, in the books.

The First Step
The first step to successfully writing a book is to read. That's right...read. By reading, we get to experience a greater range of authorship, see and decide what we enjoy reading (and what we pass on), and then perhaps use that in our own writing.

My first book was "Windows Forensics and Incident Recovery", published in 2004. The format and structure of chapter 6 of that book is based on a book I read while I was on active duty in the military titled "The Defense of Duffer's Drift". I liked the way that the author presented the material so much that I thought it would be a useful model for sharing my own story. As it turned out, that was the one chapter that my soon-to-be wife actually read completely, as it is the only chapter that isn't completely "technical".

With that, thoughts, comments and questions are, as always, welcome. Keep an eye open for more to come!


Friday, August 27, 2021

Building a Career in CyberSecurity

There's been a lot of discussion on social media around how to "break into" the cybersecurity field, not only for folks just starting out but also for those looking for a career change. This is not unusual, given what we've seen in the public news media around cyber attacks and ransomware; the idea is that cybersecurity is an exploding career field that is completely "green fields", with an incredible amount of opportunity.

Jax Scott recently shared a YouTube video (be sure to comment and subscribe!) where she provides five steps to level up any career, based on her "must read for anyone seeking a career in cybersecurity" blog post. Jax makes a lot of great points, and rather than running through each one and giving my perspective, I thought I'd elaborate a bit on one in particular.

Jax's first tip is to network. This is profound...really profound...for a number of reasons.

First, what I see a LOT of is folks on social media asking for advice on getting into the cybersecurity field, without realizing that the "cybersecurity field" is huge and expansive...there are a lot of different things you can do in the field. Networking lets you see what you may not otherwise see, and it affords you the opportunity to see different aspects of the field. For example, there are more technical (pen testing, digital forensics) aspects of "cybersecurity", as well as less technical (incident management, compliance, policies, etc.) aspects. Not everyone is suited to everything in this field...I once worked with/mentored an incident response consultant who got so anxious when it was their turn to go on-site that they once had to check themselves into the hospital, and another analyst had to take the engagement.

Second, when you do network, make sure that it's purposeful and intentional. Clicking "like" or "follow", or just sending someone a blind connection request on LinkedIn, isn't really "networking", because it's too passive. If you're networking to develop an understanding of the field, and to find a (new) job, just following or connecting to someone isn't going to get you there.

Networking with intent affords us something else, as well. In his book, "Call Sign Chaos", retired Marine general Jim Mattis stated that "...your personal experiences alone aren't broad enough to sustain you." This is just as true in the cybersecurity field as it is to the warfighter, and intentional networking allows us to broaden our experiences through purposeful engagement with others.

I see recommendations on LinkedIn all the time with tips for how to develop your "brand", and most include things such as leaving a comment rather than liking a post, referring to/referencing other posts, as well as other activities that are active, rather than passive. All of these amount to the same thing...purposeful, intentional networking.

Be sure to check out and subscribe to Jax's YouTube videos for a lot of great insight and information, as well as follow the "Hackerz and Haecksen" podcast for some insightful interviews and content!

Thursday, August 26, 2021

Tips for DFIR Analysts

Over the years as a DFIR analyst...first doing digital forensics analysis, and then incorporating that analysis as a component of IR activity...there have been some stunningly simple truths that I've learned, truths that I thought I'd share. Many of these "tips" are truisms that I've seen time and time again, and recognized that they made much more sense and had more value when they were "named".

Tips, Thought, and Stuff to Think About

Computer systems are a finite, deterministic space. The adversary can only go so far, within memory or on the hard drive. When monitoring computer systems and writing detections, the goal is not to write the perfect detection, but rather to force the adversary into a corner, so that no matter what they do, they will trigger something. So, it's a good thing to have a catalog of detections, particularly if it is based on things like, "...we don't do this here...".

For example, I worked with a customer who'd been breached by an "APT" the previous year. During the analysis of that breach, they saw that the threat actor had used net.exe to create user accounts within their environment, and this is something that they knew that they did NOT do. There were specific employees who managed user accounts, and they used a very specific third-party tool to do so. When they rolled out an EDR framework, they wrote a number of detection rules related to user account management via net.exe. I was asked to come on-site to assist them when the threat actor returned; this time, they almost immediately detected the presence of the threat actor. Another good example is, how many of us log into our computer systems and type, "whoami" at a command prompt? I haven't seen many users do this, but I've seen threat actors do this. A lot.
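As a sketch of the kind of rule that customer wrote, here's the core logic expressed over hypothetical process-creation records; the dict fields and sample command lines are illustrative only, since real EDR telemetry schemas differ by product.

```python
# Sketch: flag user-account creation via net.exe/net1.exe, an action
# this hypothetical environment "does not do" from the command line.
def detect_net_user_add(event):
    """event: a process-creation record with 'image' and 'command_line'."""
    image = event.get("image", "").lower()
    cmd = event.get("command_line", "").lower()
    if not image.endswith(("\\net.exe", "\\net1.exe")):
        return False
    # "net user <name> <pass> /add" (or "net1 user ... /add")
    return "user" in cmd and "/add" in cmd

events = [
    {"image": r"C:\Windows\System32\net.exe",
     "command_line": "net user backup_adm P@ssw0rd /add"},
    {"image": r"C:\Windows\System32\net.exe",
     "command_line": "net view"},
]
hits = [e for e in events if detect_net_user_add(e)]
print(len(hits))  # 1
```

The point isn't the string match itself; it's that the rule encodes "we don't do this here" knowledge, so any account creation via net.exe is worth an alert regardless of how well the rest of the intrusion is hidden.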

From McChrystal's "Team of Teams", there's a difference between "complexity" and "complicated". We often refer to computer systems and networks as "complex", when they are really just complicated, and inherently knowable. We, as humans, tend to make things that are complicated out to be complex.

A follow-on to the previous tip is that the term "sophisticated" is over-used to describe a significant number of attacks. When you look at the data, very often you'll see that attacks are only as sophisticated as they need to be, and in most cases, they really aren't all that sophisticated. Consider an RDP server with an account password of "password" (I've seen this recently...yes, during the summer of 2021), or a long-patched vulnerability with a freely available published exploit (i.e., JexBoss, which was used by the Samas ransomware actors during the first half of 2016).

When performing DF analysis, the goal is to be as comprehensive and thorough as possible. A great way to achieve this is through automation. For example, I developed RegRipper because I found that I was doing the same things over and over again, and I wanted a way to make my job easier. The RegRipper framework allowed me to add checks and queries without having to write (or rewrite) entirely new tools every time, as well as provided a framework for easy sharing between analysts.

TCP networking starts with a three-way handshake; UDP is "fire and forget". This one tip helped me a great deal during my early days of DFIR consulting, particularly when having discussions with admins regarding things like firewalls and switches.
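The difference is easy to demonstrate with stdlib sockets; this is a sketch that assumes the chosen local port is unused. A UDP send to a closed port "succeeds" at the sender (no handshake, no delivery guarantee), while a TCP connect to the same port fails because nothing answers the handshake.

```python
# Sketch: UDP "fire and forget" vs. the TCP three-way handshake.
import socket

closed_port = 50999  # assumed to be unused on this host

# UDP: the datagram is handed off regardless of whether anyone listens
u = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sent = u.sendto(b"hello", ("127.0.0.1", closed_port))
u.close()
print(sent)  # 5 bytes sent, listener or not

# TCP: connect() performs SYN/SYN-ACK/ACK, so a closed port refuses it
t = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    t.settimeout(2)
    t.connect(("127.0.0.1", closed_port))
    result = "connected"
except OSError:
    result = "refused"
finally:
    t.close()
print(result)  # refused
```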

Guessing is lazy. Recognize when you're doing it before someone else does. If there is a gap in data or logs, say so. At some point, someone is going to see your notes or report, and see beyond the veil of emphatic statements, and realize that there are gaping holes in analysis that were spackled over with a thin layer of assumption and guesswork. As such, if you don't have a data source...if firewall logs were not available, or Windows Event Logs were disabled, say so.

The corollary to the tip on "guessing" is that nothing works better than a demonstration. Years ago, I was doing an assessment of a law enforcement headquarters office, and I was getting ready to collect password hashes from the domain server using l0phtcrack. The admin said that the systems were locked down and there was no way I was going to get the password hashes. I pressed the Enter key down, and had the hashes almost before the Enter key returned to its original position. The point is, rather than saying that a threat actor could have done something, a demonstration can drive the point home much quicker.

Never guess at the intentions of a threat actor. Someone raised in the American public school system, with or without military or law enforcement experience, is never going to be able to determine the mindset of someone who grew up in the cities of Russia, China, etc. That is, not without considerable training and experience, which many of us simply do not have. It's easy to recognize when someone's guessing at the threat actor's intentions, because they'll start off a statement with, "...if I were the threat actor...".

If no one is watching, there is no need for stealth, and a lack of stealth does not, by itself, indicate a lack of sophistication. I was in a room with other analysts discussing a breach with the customer when one analyst described what we'd determined through forensic analysis as "...not terribly sophisticated...", in part because the activity wasn't very well hidden, nor had the attacker covered their tracks. I had to later remind the analyst that we had been called in a full 8 months after the threat actor's most recent activity.

The adversary has their own version of David Bianco's "Pyramid of Pain", and they're much better at using it. David's pyramid provides a framework for understanding what we (the good guys) can do to impact and "bring pain" to the threat actor. It's clear from engaging in hundreds of breaches, either directly or indirectly, that the bad guys have a similar pyramid of their own, and that they're much better at using theirs.

We're not always right, or correct. It's just a simple fact. This is also true of "big names", ones we imagine are backed by significant resources (spell checkers, copy editors, etc.), and as such, we assume are correct and accurate. As such, we shouldn't blindly accept what others say in open reporting, not without checking and critical thinking.

There are a lot of assumptions in this industry. I'm sure it's the same in other industries, but I can't speak to those. I've seen more than a few assumptions regarding royalties for published books; new authors starting out with big publishers may begin at 8%, or less. And that's just for paper copies (not electronic), and only for English language editions. I had a discussion once with a big name in the DFIR community who assumed that because I worked for a big name company, of course I had access to commercial forensic suites; they'd assumed that my commenting on not having access to such suites was a load of crap. When I asked what made them think that I would have access to these expensive tool sets, they ultimately admitted that yes, they'd simply assumed that I would.

If you're new to DFIR, or even if you've been around for a while, you've probably found interviewing for a job to be a nerve-racking, anxiety-producing affair. One thing to keep in mind is that most of the folks you're interviewing with aren't terribly good at it, and are probably just as nervous as you. Think about it...how many times have you seen courses offered in how to conduct a job interview, from the perspective of the interviewer?

Friday, June 25, 2021

What We Know About The Ransomware Economy

Okay, I think that we can all admit that ransomware has consumed the news cycle of late, thanks to high visibility attacks such as Colonial Pipeline and JBS. Interestingly enough, there wasn't this sort of reaction the second time the City of Baltimore was attacked, which (IMHO) says more about the news cycle than anything else.

However, while the focus is on ransomware, for the moment, it's a good time to point out that there's more to this than just the attacks that get blasted across news feeds. That is, ransomware itself is an economy, an eco-system, which is a moniker that goes a long way toward describing why victims of these attacks are impacted to the extent that they are. What I mean by this is that everything...EVERYTHING...about what goes into a ransomware attack is directed at the final goal of the threat actor...someone...getting paid. What goes further to making this an eco-system is that when a ransomware attack does result in the threat actor getting paid, there are also others in the supply chain (see what I did there??) who are also getting paid.

I was reading Unit42's write-up on the Prometheus ransomware recently, and I have to say, a couple of things really stood out for me, one being the possible identification of a "false flag". The Prometheus group apparently made a claim that is unsupported by the data Unit42 has observed. Even keeping collection bias in mind, this is still very interesting. What would be the purpose of such a "false flag"? Does it tell us that the Prometheus group has insight into the workings of most cyber threat intelligence (CTI) functions; have they "seen" CTI write-ups and developed their own version of the MITRE ATT&CK matrix?

Regardless of the reason or rationale behind the statement, Unit42 is...wait for it...relying on their data.  Imagine that!

Another thing that stood out is the situational awareness of the ransomware developer.

When Prometheus ransomware is executed, it tries to kill several backups and security software-related processes, such as Raccine...

Well, per the article, this is part of the ransomware itself, and not something the threat actors appear to be doing themselves. Having been part of more than a few ransomware investigations over the years, relying on both EDR telemetry and #DFIR data, I've seen different levels of situational awareness on the part of threat actors. In some cases where the EDR tool blocks a threat actor's commands, I've seen them either give up, or disable or remove AV tools. In other cases, the threat actor has removed AV tools prior to performing a query, so the question becomes: was that tool even installed on the system?

This does, however, speak to how the barrier for entry has been lowered; that is, a far less sophisticated actor is able to be just as effective, or more so. Rather than having to know and manage all the parts of the "business", rather than having to invest in the resources required to gain access, navigate the compromised infrastructure, and then develop and deploy ransomware...you can just buy those things that you need. Just like the supply chain of a 'normal' business. Say that you want to start a business that's going to provide products to people...are you going to build your own shipping fleet, or are you going to use a commercial shipper (DHL, FedEx, UPS, etc.)?

Further, from the article:

At the time of writing, we don’t have information on how Prometheus ransomware is being delivered, but threat actors are known for buying access to certain networks, brute-forcing credentials or spear phishing for initial access.

This is not unusual. This write-up appears to be based primarily on OSINT, and does not seem to be derived from actual intrusion data or intelligence. The commands listed in the article for disabling Raccine are reportedly embedded in the ransomware executable itself, and not something derived from EDR telemetry, nor DFIR analysis. So what this is saying is that threat actors generally gain access by brute-forcing credentials (or purchasing them), or spear phishing, or by purchasing access from someone who's done either of the first two.

Again, this speaks to how the barrier for entry has been lowered. Why put the effort into gaining access yourself when you can just purchase access that someone else has already established?

We’ve compiled this report to shed light into the threat posed by the emergence of new ransomware gangs like Prometheus, which are able to quickly scale up new operations by embracing the ransomware-as-a-service (RaaS) model, in which they procure ransomware code, infrastructure and access to compromised networks from outside providers. The RaaS model has lowered the barrier to entry for ransomware gangs.

Purchasing access to compromised computer systems...or compromising computer systems for the purpose of monetizing that access...is nothing new. Let's look back 15+ years to when Brian Krebs interviewed a botherder known as "0x80". This was an in-person interview with a purveyor of access to compromised systems, which is just part of the eco-system. Since then, the whole thing has clearly been "improved upon".

This just affirms that, like many businesses, the ransomware economy, the eco-system, has a supply chain. This not only means that there are specializations within that supply chain and that the barrier to entry is lowered; it also means that attribution of these cybercrimes is going to become much more difficult, and possibly tenuous, at best.

Thoughts on Assessing Threat Actor Intent & Sophistication

I was reading this Splunk blog post recently, and I have to say up front, I was disappointed by the fact that the promise of the title (i.e., "Detecting Cl0p Ransomware") was not delivered on by the remaining content of the post. Very early on in the blog post is the statement:

Ransomware is by nature a post-exploitation tool, so before deploying it they must infiltrate the victim's infrastructure. 

Okay, so at this point, I'm looking for something juicy, some information regarding the TTPs used to "infiltrate the victim's infrastructure" and to locate files of interest for staging and exfil, but instead, the author(s) dove right into analyzing the malware itself, through reverse engineering. Early in that malware RE exercise is the statement:

This ransomware has a defense evasion feature where it tries to delete all the logs in the infected machine to avoid detection.

The embedded command is essentially a "one-liner" used to list and clear all Windows Event Logs, leveraging wevtutil.exe. However, while categorized as "defense evasion", it occurred to me that this command is not, in fact, intended to "avoid detection". After all, with ransomware, the threat actors want to get paid, so they want to be detected. In fact, to ensure they're detected, the actors put ransom notes on the system, with clear statements, declarations, warnings, and time limits. In this case, the ransom note says that if action is not taken in two weeks, the files will be deleted. So, yes, it's safe to say that clearing all of the Windows Event Logs is not about avoiding detection. If anything, it's really nothing more than a gesture of dominance, the threat actor saying, "Look at what I can do to your system."

So, what is the purpose of clearing the Windows Event Logs? As a long-time #DFIR analyst, the value of the Windows Event Logs in such cases is to assist in a root-cause analysis (RCA) investigation, and clearing some Windows Event Logs (albeit not ALL of them) will hobble (but not completely remove) a responder's ability to determine aspects of the attack cycle such as lateral movement. By tracing lateral movement, the investigator can determine the original system used by the threat actor to gain access to the infrastructure, the "nexus" or "foothold" system, and from there, determine how the threat actor gained access. I say "hobble" because clearing the Windows Event Logs does not obviate the ability of the investigator to recover the event records, it simply requires a bit more effort. However, the vast majority of organizations impacted by ransomware are not conducting full investigations or RCAs, and #DFIR consulting firms are not publicly sharing ransomware trends and TTPs, anonymized through aggregation. In short, clearing the Windows Event Logs, or not, would likely have little impact either way on the response.

But why clear ALL Windows Event Logs? IMHO, it was used to ensure that the ransomware attack was, in fact, detected. Perhaps the assumption is that most organizations have some small modicum of a detection capability, and the most rudimentary SIEM or EDR framework should throw an alert of some kind in the face of the "wevtutil cl" command, or when the SIEM starts receiving events indicating that Windows Event Logs were cleared (especially if the Security Event Log was cleared!).
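That detection can be sketched simply: Windows itself records the clearing of the Security log as event ID 1102 ("The audit log was cleared") and the clearing of other logs, such as System, as event ID 104. The record layout below is illustrative, not any particular SIEM's schema; a real implementation would parse .evtx files or consume forwarded events.

```python
# Sketch: flag log-clearing indicators in parsed Windows Event Log
# records, represented here as simple dicts for illustration.
CLEAR_INDICATORS = {("Security", 1102), ("System", 104)}

def log_cleared_events(records):
    """Return the records indicating a Windows Event Log was cleared."""
    return [r for r in records
            if (r["channel"], r["event_id"]) in CLEAR_INDICATORS]

records = [
    {"channel": "Security", "event_id": 4624},   # a logon; benign noise
    {"channel": "Security", "event_id": 1102},   # audit log cleared
    {"channel": "System",   "event_id": 104},    # System log cleared
]
print(len(log_cleared_events(records)))  # 2
```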

Sunday, June 06, 2021

Toolmarks: LNK Files in the news again

As most regular readers of this blog can tell you, I'm a bit of a fan of LNK files...a LNK-o-phile, if you will. I'm not only fascinated by the richness of the structure, but as I began writing a parser for LNK files, I began to see some interesting aspects of intelligence that can be gleaned from LNK files, in particular those created within a threat actor's development environment and deployed to targets/infrastructures. First, there are different ways to create LNK files using the Windows API, and what's really cool is that each method has its own unique #toolmarks associated with it!

Second, most often there is a pretty good amount of metadata embedded in the LNK file structure. There are file system time stamps, and often we'll see a NetBIOS system name, a volume S/N, a SID, or other pieces of information that we can use in a VirusTotal retro-hunt in order to build out a significant history of other similar LNK files.

In the course of my research, I was able to create the smallest possible functioning LNK file, albeit with NO (yes, you read that right...) metadata. Well, that's not 100% true...there is metadata within the LNK file. Specifically, the Windows version identifier is still there, and this is something I purposely left. Instead of zero'ing it out, I altered it to an as-yet-unseen value (in this case, 0x0a). You can also alter each version identifier to a distinct value, rather than keeping them all the same.

Microsoft recently shared some information about NOBELIUM sending LNK files embedded within ISO files, as did the Volexity team. Both discuss aspects of the NOBELIUM campaign; in fact, they do so in a similar manner, but each with different details. For example, the Volexity team specifically states the following (with respect to the LNK file):

It should be noted that nearly all of the metadata from the LNK file has been removed. Typically, LNK files contain timestamps for creation, modification, and access, as well as information about the device on which they were created.

Now, that's pretty cool! As someone who's put considerable effort into understanding the structure of LNK files, and done research into creating the smallest, minimal, functioning LNK file, this was a pretty interesting statement to read, and I wanted to learn more.

Taking a look at the metadata for the reports.lnk file (from fig 4 in the Microsoft blog post, and fig 3 of the Volexity blog post), we see:

guid               {00021401-0000-0000-c000-000000000046}
shitemidlist    My Computer/C:\/Windows/system32/rundll32.exe
**Shell Items Details (times in UTC)**
  C:0                   M:0                   A:0                  Windows  (9)
  C:0                   M:0                   A:0                  system32  (9)
  C:0                   M:0                   A:0                  rundll32.exe  (9)

commandline  Documents.dll,Open
iconfilename   %windir%/system32/shell32.dll
hotkey             0x0
showcmd        0x1

***LinkFlags***
HasLinkTargetIDList|IsUnicode|HasArguments|HasExpString|HasIconLocation

***PropertyStoreDataBlock***
GUID/ID pairs:
{46588ae2-4cbc-4338-bbfc-139326986dce}/4      SID: S-1-5-21-8417294525-741976727-420522995-1001

***EnvironmentVariableDataBlock***
EnvironmentVariableDataBlock: %windir%/system32/explorer.exe

***KnownFolderDataBlock***
GUID  : {1ac14e77-02e7-4e5d-b744-2eb1ae5198b7}
Folder: CSIDL_SYSTEM

While the file system time stamps embedded within the LNK file structure appear to have been zero'd out, a good deal of metadata still exists within the structure itself. For example, the Windows version information (i.e., "9") is still available, as are the contents of several ExtraData blocks. The SID listed in the PropertyStoreDataBlock can be used to search across repositories, looking for other LNK files that contain the same SID. Further, the fact that these blocks still exist in the structure gives us clues as to the process used to create the original LNK file, before the internal structure elements were manipulated.
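A minimal sketch of that kind of repository search follows; keep in mind that strings embedded in LNK structures are typically UTF-16LE, so searching for the ASCII form of the SID will miss it. The SID and the sample bytes below are made up for illustration.

```python
# Sketch: hunting for a known SID across raw LNK file contents.
def lnk_contains_sid(data: bytes, sid: str) -> bool:
    """Check raw LNK bytes for a SID string encoded as UTF-16LE."""
    return sid.encode("utf-16-le") in data

# Simulated LNK contents: header size field + embedded UTF-16LE SID
sid = "S-1-5-21-1111111111-222222222-333333333-1001"
blob = b"\x4c\x00\x00\x00" + sid.encode("utf-16-le") + b"\x00\x00"

print(lnk_contains_sid(blob, sid))  # True
```

In practice you'd run a check like this (or a retro-hunt rule built on the same bytes) across a corpus of LNK samples to cluster files that came from the same development environment.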

I'm not sure that this is the first time this sort of thing has happened; after all, the MS blog post makes no mention of metadata being removed from the LNK file, so it's entirely possible that it's happened before but no one's thought that it was important enough to mention. However, items such as ExtraDataBlocks and which elements exist within the structure not only give us clues (toolmarks) as to how the file was created, but the fact that metadata elements were intentionally removed serves as an additional toolmark, and provides insight into the intentions of the actors.

But why use an ISO file? Well, it's interesting that you should ask. Matt Graeber said:

Adversaries choose ISO/IMG as a delivery vector b/c SmartScreen doesn't apply to non-NTFS volumes

In the ensuing thread, @arekfurt said:

Adversaries can also use the iso trick to evade MOTW-based macro blocking with Office docs.

Ah, interesting points! The ISO file is downloaded from the Internet, and as such, would likely have a zone identifier ADS associated with it (I say "likely" because I haven't seen it mentioned as a toolmark), whereas once the ISO file is mounted, the embedded files would not have zone ID ADSs associated with them. So, the decision to use an ISO file was not just cool...it appears to have been an intentional choice for defense evasion.

Testing, and taking DFIR a step further

One of Shakespeare's lines from Hamlet I remember from high school is, "There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy." And that's one of the great things about the #DFIR industry...there's always something new. I do not for a moment think that I've seen everything, and I, for one, find it fascinating when we find something that is either new, or that has been talked about but is being seen "in the wild" for the first time.

Someone mentioned recently that Microsoft's Antimalware Scan Interface (i.e., AMSI) could be used for persistence, and that got me very interested. This isn't something specifically or explicitly covered by the MITRE ATT&CK framework, and I wanted to dig into this a bit more to understand it. As it can be used for persistence, it offers not only an opportunity for a detection, but also for a #DFIR detection and artifact constellation that can provide insight into threat actor sophistication and intent, as well as attribution.

AMSI was introduced in 2015, and discussions of issues with it and bypassing it date back to 2016. However, the earliest discussion of the use of AMSI for persistence that I could find is from just last year. An interesting aspect of this means of persistence isn't so much as a detection itself, but rather how it's investigated. I've worked with analysis and response teams over the years, and one of the recurring questions I've had when something "bad" is detected is where that event occurred in relation to others. For example, whether you're using EDR telemetry or a timeline of system activity, all events tend to have one thing in common...a time stamp indicating the time at which they occurred. That is, either the event itself has an associated time stamp (file system time stamp, Registry key LastWrite time, PE file compile time, etc.), or some monitoring functionality is able to associate a time stamp with the observed event. As such, determining when a "bad" event occurred in relation to other events, such as a system reboot or a user login, can provide insight into determining if the event is the result of some persistence mechanism. This is necessary, as while EDR telemetry in particular can provide a great deal of insight, it is largely blind to a great deal of on-system artifacts (for example, Windows Event Log records). However, adding EDR telemetry to on-system artifact constellations significantly magnifies the value of both.
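At its core, the "when did X occur relative to Y" question above is a merge of time-stamped events from multiple sources into one ordered timeline. The field layout and sample events below are illustrative only, not any specific tool's schema.

```python
# Sketch: normalizing events from two sources into a single timeline.
from datetime import datetime

edr = [("2021-05-01T12:03:10", "EDR", "rundll32.exe launched")]
evtx = [("2021-05-01T12:00:00", "EVTX", "System boot"),
        ("2021-05-01T12:02:45", "EVTX", "User logon")]

# Sort all events by their timestamp, regardless of source
timeline = sorted(edr + evtx, key=lambda e: datetime.fromisoformat(e[0]))
for ts, src, desc in timeline:
    print(ts, src, desc)
```

In this made-up example, the "bad" event lands after the user logon rather than immediately after boot, which is exactly the kind of positional clue that helps distinguish a boot-time persistence mechanism from user-context execution.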

As I started researching this issue, the first resource to really jump out at me was this blog post from PenTestLab. It's thorough, and provides a good deal of insight as well as resources. For example, this post links to not only another blog post from b4rtik, but to a compiled DLL from netbiosX that you can download and use in testing. As a side note, be sure to install the Visual C++ redistributable on your system if you don't already have it, in order to get the DLL registration to work (Thanks to @netbiosX for the assist with that!)

I found that there are also other resources on the topic from Microsoft, as well.

Following the PenTestLab blog post, I registered the DLL, waited for a bit, and then collected a copy of the Software hive via the following command:

C:\Users\Public>reg save HKLM\Software software.sav

This testing provides the basis for developing #DFIR resources, such as a RegRipper plugin, that allow detections to persist, particularly in larger environments, so that corporate knowledge is maintained.
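AMSI providers are registered as COM objects: a provider GUID appears under Microsoft\AMSI\Providers in the Software hive, and the provider DLL path sits under Classes\CLSID\{GUID}\InprocServer32. A minimal sketch of the check such a plugin might perform is below; note that a real plugin would walk the hive itself (e.g., with a library like python-registry), while this sketch operates on a simplified, flattened key/value representation, and the GUIDs and DLL path shown are purely illustrative:

```python
# Sketch: flag AMSI provider registrations from a flattened view of a
# Software hive. The flat list/dict inputs are an assumption made so
# the logic is easy to test in isolation; a production plugin would
# walk the hive file directly.

PROVIDERS_PREFIX = r"Microsoft\AMSI\Providers"

def find_amsi_providers(keys):
    """Given an iterable of key paths from a Software hive, return the
    provider GUIDs registered under Microsoft\\AMSI\\Providers."""
    guids = []
    for path in keys:
        parts = path.split("\\")
        # Match Microsoft\AMSI\Providers\{GUID} exactly (depth of 4)
        if (len(parts) == 4 and
                "\\".join(parts[:3]).lower() == PROVIDERS_PREFIX.lower()):
            guids.append(parts[3])
    return guids

def resolve_provider_dlls(guids, values):
    """Map each provider GUID to its InprocServer32 DLL path, using a
    {key_path: default_value} dict of Classes\\CLSID entries."""
    out = {}
    for guid in guids:
        clsid_key = rf"Classes\CLSID\{guid}\InprocServer32"
        out[guid] = values.get(clsid_key, "<not found>")
    return out

if __name__ == "__main__":
    # Illustrative data only; the second GUID and the DLL path are
    # made up to stand in for a suspicious provider registration.
    keys = [
        r"Microsoft\AMSI\Providers\{2781761E-28E0-4109-99FE-B9D127C57AFE}",
        r"Microsoft\AMSI\Providers\{DEADBEEF-0000-0000-0000-000000000000}",
        r"Microsoft\Windows\CurrentVersion\Run",
    ]
    values = {
        r"Classes\CLSID\{DEADBEEF-0000-0000-0000-000000000000}\InprocServer32":
            r"C:\Users\Public\AmsiProvider.dll",
    }
    print(resolve_provider_dlls(find_amsi_providers(keys), values))
```

A DLL path outside of the usual system locations (here, the hypothetical C:\Users\Public\AmsiProvider.dll) would be the obvious thing for a plugin to flag for the analyst.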

It also sets the stage for further testing. For example, you can install Sysmon, open a PowerShell command prompt, submit the test phrase, and then create a timeline of system activity once calc.exe opens, to determine (or begin developing) the artifact constellation associated with this use of AMSI for persistence.

Speaking of PenTestLab, Sysmon, and taking things a step further, there's this blog post from PenTestLab on Threat Hunting AMSI Bypasses, including not just a Yara rule to run across memory, but also a Registry modification that indicates yet another AMSI bypass.

Monday, April 12, 2021

On #DFIR Analysis, pt III - Benefits of a Structured Model

In my previous post, I presented some of the basic design elements for a structured approach to describing artifact constellations, and leveraging them to further DFIR analysis. As much of this is new, I'm sure that this all sounds like a lot of work, and if you've read the other posts on this topic, you're probably wondering about the benefits of all this work. In this post, I'll take a shot at netting out some of the more obvious benefits.

Maintaining Corporate Knowledge

Regardless of whether you're talking about an internal corporate position or a consulting role, analysts are going to see and learn new things based on their analysis. You're going to see new applications or techniques used, and perhaps even see the same threat actor making small changes to their TTPs due to some "stimulus". You may find new artifacts based on the configuration of the system, or what applications are installed. A number of years ago, a co-worker was investigating a system that happened to have LANDesk installed, along with the software monitoring module. They'd found that the LANDesk module maintains a listing of executables run, including the first run time, the last run time, and the user account responsible for the last execution, all of which mapped very well into a timeline of system activity, making the resulting timeline much richer in context.

When something like this is found, how is that corporate knowledge currently maintained? In this case, the analyst wrote a RegRipper plugin that still exists and is in use today. But how are organizations (both internal and consulting teams) maintaining a record of artifacts and constellations that analysts discover?

Maybe a better question is, are organizations doing this at all?  

For many organizations with a SOC capability, detections are written, often based on open reporting, and then tested and put into production. From there, those detections may be tuned; for internal teams, the tuning would be based on one infrastructure, but for MSS or consulting orgs, the tuning would be based on multiple (and likely an increasing number of) infrastructures. Those detections and their tuning are based on the data source (i.e., SIEM, EDR, or a combination), and serve to preserve corporate knowledge. The same approach can/should be taken with DFIR work, as well.

Consistency

One of the challenges inherent to large corporate teams, and perhaps more so to consulting teams, is that analysts all have their own way of doing things. I've mentioned previously that analysis is nothing more than an individual applying the breadth of their knowledge and experience to a data source. Very often, analysts will receive data for a case, and approach that data initially based on their own knowledge and experience. Given that each analyst has their own individual approach, the initial parsing of collected data can be a highly inefficient endeavor when viewed across the entire team. And because the approach is often based solely on the analyst's own individual experience, items of importance can be missed.

What if each analyst were instead able to approach the data sources based not just on their own knowledge and experience, but the collective experience of the team, regardless of the "state" (i.e., on vacation/PTO, left the organization, working their own case, etc.) of the other analysts? What if we were to use a parsing process that was not based on the knowledge, experience and skill of one analyst but instead on that of all analysts, as well as perhaps some developers? That process would normalize all available data sources, regardless of the knowledge and experience of an individual analyst, and the enrichment and decoration phase would also be independent of the knowledge and skill of a single analyst.

Now, something like this does not obviate the need for analysts to be able to conduct their own analysis, in their own way, but it does significantly increase efficiency, as analysts are no longer manually parsing individual data sources, particularly those selected based upon their own experience. Instead, the data sources are being parsed, and then enriched and decorated through an automated means, one that is continually improved upon. This would also reduce costs associated with commercial licenses, as teams would not have to purchase licenses for several products for each analyst (i.e., "...I prefer this product for this work, and this other product for these other things...").

By approaching the initial phases of analysis in such a manner, efficiency is significantly increased, cost goes way down, and consistency goes through the roof. This can be especially true for organizations that encounter similar issues often. For example, internal organizations protecting corporate assets may regularly see certain issues across the infrastructure, such as malicious downloads or phishing with weaponized attachments. Similarly, consulting organizations may regularly see certain types of issues (i.e., ransomware, BEC, etc.) based on their business and customer base. Having an automated means for collecting, parsing, and enriching known data sources, and presenting them to the analyst saves time and money, gets the analyst to conducting analysis sooner, and provides for much more consistent and timely investigations.

Artifacts in Isolation

Deep within DFIR, behind the curtains and under the dust ruffle, when we see what really goes on, we often see analysts relying far too much on single sources of data or single artifacts for their analysis, in isolation from each other. This is very often the result of allowing analysts to "do their own thing", which while sounding like an authoritative way to conduct business, is highly inefficient and fraught with mistakes.

Not long ago, I heard a speaker at a presentation state that they'd based their determination of the "window of compromise" on a single data point, and one that had been misinterpreted. They'd stated that the ShimCache/AppCompatCache timestamp for a binary was the "time of execution", and extended the "window of compromise" in their report from a few weeks to four years, without taking time stomping into account. After the presentation was over, I had a chat with the speaker and walked through my reasoning. Unfortunately, the case they'd presented on had been completed (and adjudicated) two years prior to the presentation. For this case, the victim had been levied a fine based on the "window of compromise".

Very often, we'll see analysts referring to a single artifact (a ShimCache entry, perhaps an entry in the AmCache.hve file, etc.) as definitive proof of execution. This is perhaps an understanding based on an over-simplification of the nature of the artifact, and without corresponding artifacts from the constellation, will lead to inaccurate findings. It is often not until we peel back the layers of the analysis "onion" that it becomes evident that the finding, as well as the accumulated findings of the incident, were incorrectly based on individual artifacts, from data sources viewed in isolation from other pertinent data sources. Further, the nature of those artifacts is misinterpreted; rather than demonstrating program execution, they may simply illustrate the existence of a file on the system.

Summary

Over the years that I've been doing DFIR work, little in the way of "analysis" has changed. We started out imaging and analyzing a few hard drives, moved to going on-site to collect images, and then to doing triage collections to scope the systems that needed to be analyzed. We then extended our reach further through the use of EDR telemetry, but analysis still came down to individual analysts applying just their own knowledge and experience to collected data sources. It's time we change this model, and leverage the capabilities we have on hand in order to provide more consistent, efficient, accurate, and timely analysis.

Saturday, April 10, 2021

On #DFIR Analysis, pt II - Describing Artifact Constellations

 I've been putting some serious thought into the topic of a new #DFIR model, and in an effort to extend and expand upon my previous post a bit, I wanted to take the opportunity to document and share some of my latest thoughts.

I've discussed toolmarks and artifact constellations previously in this blog, and how they apply to attribution. In discussing a new #DFIR model, the question that arises is, how do we describe an artifact or toolmark constellation in a structured manner, so that it can be communicated and shared?  

Of course, the next step after that, once we have a structured format for describing these constellations, is automating the sharing and "machine ingestion" of these constellation descriptions. But before we get ahead of ourselves, let's discuss a possible structure a bit more. 

The New #DFIR Model

First off, to orient ourselves, figure 1 illustrates the proposed "new" #DFIR model from my previous blog post. We still have the collect, parse, and enrich/decorate phases prior to the output and data going to the analyst, but in this case, I've highlighted the "enrich/decorate" phase with a red outline, as that is where the artifact constellations would be identified.

Fig 1: New DFIR Model 
We can assume that we would start off by applying some known constellation descriptions to the parsed data during the "enrich/decorate" phase, so the process of identifying a toolmark constellation should also include some means of pulling information from the constellation, as well as "marking" or "tagging" the constellation in some manner, or facilitating some other means of notifying the analyst. From there, the expectation would be that new constellations would be defined and described through analysis, as well as through open sources, and applied to the process.

We're going to start "small" in this case, so that we can build on the structure later. What I mean by that is that we're going to start with just DFIR data; that is, data collected as either a full disk acquisition, or as part of triage response to an identified incident. We're going to start here because the data is fairly consistent across Windows systems at this point, and we can add EDR telemetry and input from a SIEM framework at a later date. So, just for the sake of  this discussion, we're going to start with DFIR data.

Describing Artifact Constellations

Let's start by looking at a common artifact constellation, one for disabling Windows Defender. We know that there are a number of different ways to go about disabling Windows Defender, and that regardless of the size and composition of the artifact constellation they all result in the same MITRE ATT&CK sub-technique. One way to go about disabling Windows Defender is through the use of Defender Control, a GUI-based tool. As this is a GUI-based tool, the threat actor would need to have shell-based access to the system, such as through a local or remote (Terminal Services/RDP) login. Beyond that point, the artifact constellation would look like:
  • UserAssist entry in the NTUSER.DAT indicating Defender Control was launched
  • Prefetch file created for Defender Control (file system/MFT; not for Windows server systems)
  • Registry values added/modified in the Software hive
  • "Microsoft-Windows-Windows Defender%4Operational.evtx" event records generated
Again, this constellation is based solely on DFIR or triage data collected from an endpoint. Notice that I point out that one artifact in the constellation (i.e., the Prefetch file) would not be available on Windows server systems. This tells us that when working with artifact constellations, we need to keep in mind that not all of the artifacts may be available, for a variety of reasons (i.e., version of Windows, system configuration, installed applications, passage of time, etc.). Other artifacts that may be available but are also heavily dependent upon the configuration of the endpoint itself include (but are not limited to) a Security-Auditing/4688 event in the Security Event Log pertaining to Defender Control, indicating the launch of the application, or possibly a Sysmon/1 event pertaining to Defender Control, again indicating the launch of the application. Again, the availability of these artifacts depends upon the specific nature and configuration of the endpoint system.

Another means to achieve the same end, albeit without requiring shell-based access, is with a batch file that modifies the specific Registry values (Defender Control modifies two Registry values) via the native LOLBIN, reg.exe. In this case, the artifact constellation would not need to be (although it may be) preceded by a Security-Auditing/4624 (login) event of either type 2 (console) or type 10 (remote). Further, there would be no expectation of a UserAssist entry (no GUI tool needs to be launched), and the Prefetch file creation/modification would be for reg.exe, rather than Defender Control. However, the remaining two artifacts in the constellation would likely remain the same.

Fig 2: WinDefend Exclusions
Of course, yet another means for "disabling Windows Defender" could be as simple as adding an exclusion to the tool, in any one or more of the five subkeys illustrated in figure 2. For example, we've seen threat actors create exceptions for any file ending in ".exe", found in specific paths, or any process such as PowerShell.

The point is that while there are different ways to achieve the same end, each method has its own unique toolmark constellation, and the constellations could then be used to apply attribution.  For example, the first method for disabling Windows Defender described above was observed being used by the Snatch ransomware threat actors during several attacks in May/June 2020. Something like this would not be exclusive, of course, as a toolmark constellation could be applied to more than one threat actor or group. After all, most of what we refer to as "threat actor groups" are simply how we cluster IOCs and TTPs, and a toolmark constellation is a cluster of artifacts associated with the conduct of particular activity. However, these constellations can be applied to attribution.

A Notional Description Structure

At this point, a couple of thoughts or ideas jump out at me.  First, the individual artifacts within the constellation can be listed in a fashion similar to what's seen in Yara rules, with similar "strings" based upon the source. Remember, by the time we're to the "enrich/decorate" phase, we've already normalized the disparate data sources into a common structure, perhaps something similar to the five-field TLN format used in (my) timelines. The time field of the structure would allow us to identify artifacts within a specified temporal proximity, and each description field would need to be treated or handled (that is, itself parsed) differently based upon the source field. The source field from the normalized structure could be used in a similar manner as the various 'string' identifiers in Yara (i.e., 'ascii', 'nocase', 'wide', etc.) in that they would identify the specific means by which the description field should be addressed. 
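The five-field TLN format referenced above is pipe-delimited: time|source|system|user|description, with the time field a Unix epoch. As a quick sketch of what the normalized structure might look like in code (the maxsplit handling is an assumption, to tolerate pipes inside the description field):

```python
from collections import namedtuple

# Five-field TLN event: Time|Source|System|User|Description.
TLNEvent = namedtuple("TLNEvent",
                      ["time", "source", "system", "user", "description"])

def parse_tln(line):
    """Parse one pipe-delimited TLN line into a TLNEvent. Split at
    most four times so that pipes occurring inside the description
    field are preserved."""
    time, source, system, user, description = line.rstrip("\n").split("|", 4)
    return TLNEvent(int(time), source, system, user, description)
```

Once everything is normalized into this structure, the enrich/decorate phase can key off the source field to decide how each description field should be handled.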

Some elements of the artifact constellation may not be required, and this could easily be addressed through something similar to Yara 'conditions', in that the various artifacts could be grouped with parens, as well as 'and' and 'or', identifying those artifacts that may not be required for the constellation to be effective, although not complete. From the above examples, the Registry values being modified would be "required", as without them, Windows Defender would not be disabled. However, a Prefetch file would not be "required", particularly when the platform being analyzed is a Windows server. This could be addressed through the "condition" statement used in Yara rules, and a desirable side effect of having a "scoring value" would be that an identified constellation would then have something akin to a "confidence rating", similar to what is seen on sites such as VirusTotal (i.e., "this sample was identified as malicious by 32/69 AV engines"). For example, from the above bulleted artifacts that make up the illustrated constellation, the following values might be applied:

  • Required - +1
  • Not required - +1, if present
  • +1 for each of the values, depending upon the value data
  • +1 for each event record
If all elements of the constellation are found within a defined temporal proximity, then the "confidence rating" would be 6/6. All of this could be handled automatically by the scanning engine itself.
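The scoring scheme above can be netted out in a few lines of code. In this sketch, the artifact names and the required/optional flags are a notional encoding of the Defender Control constellation described earlier; both are assumptions made for illustration:

```python
# Sketch: score an identified constellation. Each artifact carries a
# "required" flag; the confidence rating is hits/total, and a missing
# required artifact voids the match entirely.

def score_constellation(artifacts, observed):
    """artifacts: list of (name, required) tuples describing the
    constellation; observed: set of artifact names actually found
    within temporal proximity. Returns (hits, total) or None if a
    required artifact is missing."""
    hits = 0
    for name, required in artifacts:
        if name in observed:
            hits += 1
        elif required:
            return None  # required artifact missing: no match
    return (hits, len(artifacts))

# Notional encoding of the Defender Control constellation; the two
# Registry value modifications are required, everything else is
# corroborating.
DEFENDER_CONTROL = [
    ("userassist_defender_control", False),
    ("prefetch_defender_control",   False),  # absent on server systems
    ("reg_disable_antispyware",     True),
    ("evt_defender_5010",           False),
    ("reg_disable_realtime",        True),
    ("evt_defender_5001",           False),
]
```

On a Windows 10 system with all six artifacts present this yields a 6/6 rating; on a server with no Prefetch file, 5/6; with either required Registry value absent, no match at all.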

A notional example constellation description based on something similar to Yara might then look something like the following:

strings:

    $str1 = UserAssist entry for Defender Control
    $str2 = Prefetch file for Defender Control
    $str3 = Windows Defender DisableAntiSpyware value = 1
    $str4 = Windows Defender event ID 5010 generated
    $str5 = Windows Defender DisableRealtimeMonitoring value = 1
    $str6 = Windows Defender event ID 5001 generated

condition:

    ($str1 or $str2) and ($str3 and $str4 and $str5 and $str6);

Again, temporal proximity/dispersion would need to be addressed (most likely within the scanning engine itself), either with an automatic 'value' set, or by providing a user-defined value within the rule metadata. Additionally, the order of the individual artifacts would be important, as well. You wouldn't want to run this rule and in the output find that $str1 was found 8 days after the conditions for $str3 and $str5 being met. Given that the five-field TLN format includes a time stamp as its first field, it would be pretty trivial to compute a temporal "Hamming distance", of sorts, as well as ensure proper sequencing of the artifacts or toolmarks themselves. That is to say that $str1 should appear prior to $str3, rather than after it, but not so far prior as to be unreasonable and create a false positive detection.
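A sketch of the sequencing and temporal-proximity check described above, operating over (time stamp, artifact name) pairs such as might be pulled from a TLN timeline; the window value and the greedy earliest-match approach are assumptions for illustration:

```python
# Sketch: verify that matched artifacts occur in the expected order
# and that the whole cluster fits inside a defined window. The window
# stands in for the user-defined value in the rule metadata.

MAX_WINDOW = 300  # seconds; an assumed default, tuned per rule

def in_sequence_and_window(events, expected_order, window=MAX_WINDOW):
    """events: list of (epoch_time, artifact_name) pairs;
    expected_order: artifact names in the order they should appear.
    Greedily takes the earliest occurrence of each artifact at or
    after the previously matched one, then checks total dispersion."""
    times = []
    for name in expected_order:
        candidates = [t for t, n in events
                      if n == name and (not times or t >= times[-1])]
        if not candidates:
            return False  # artifact missing, or only out of sequence
        times.append(min(candidates))
    return max(times) - min(times) <= window
```

With this, a rule hit where $str1 lands 8 days after the Registry values were set would simply fail the window check rather than produce a false positive.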

Finally, similar to Yara rules, the rule name would be identified in the output, along with a "confidence rating" of 6/6 for a Windows 10 system (assuming all artifacts in the cluster were available), or 5/6 for Windows Server 2019.

Counter-Forensics

Something else that we need to account for when addressing artifact constellations is counter-forensics, even that which is unintentional, such as the passage of time. Specifically, how do we deal with identifying artifact constellations when artifacts have been removed, such as application prefetching being disabled on Windows 10 (which itself may be part of a different artifact constellation), or files being deleted, or something like CCleaner being run?

Maybe a better question is, do we even need to address this circumstance? After all, the intention here is not to address every possible eventuality or possible circumstance, and we can create artifact constellations for various Windows functionality being disabled (or enabled).

Thursday, April 01, 2021

LNK Files, Again

I ran across SharpWebServer via Twitter recently...the first line of the readme.md file states, "A Red Team oriented simple HTTP & WebDAV server written in C# with functionality to capture Net-NTLM hashes." I thought this was fascinating because it ties directly to a technique MITRE refers to as "Forced Authentication". What this means is that a threat actor can (and has...we'll get to that shortly) modify Windows shortcut/LNK files such that the iconfilename field points to an external resource. What happens is that when the LNK file is launched, Explorer will reach out to the external resource and attempt to authenticate, sending NTLM hashes across the wire. As such, SharpWebServer is built to capture those hashes.
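LNK string fields such as the icon location are typically stored as UTF-16LE. As a quick triage heuristic, you can scan a shortcut's raw bytes for embedded UNC paths; to be clear, this is not a proper LNK parser (it ignores the shell link header, flags, and field offsets entirely, an assumed trade-off of accuracy for simplicity), and the hostname in the example is made up:

```python
import re

# Naive triage sketch: decode raw .lnk bytes as UTF-16LE and scan for
# UNC paths (\\host\share\...). A proper check would parse the shell
# link structure and read the icon location string field; expect both
# false positives and false negatives from this approach.

UNC_RE = re.compile(r"\\\\[\w.\-]+\\[^\x00\"<>|]+")

def find_unc_strings(raw: bytes):
    """Return any UNC-path-looking strings found in the raw bytes."""
    text = raw.decode("utf-16-le", errors="ignore")
    return UNC_RE.findall(text)

if __name__ == "__main__":
    # Synthetic bytes standing in for a shortcut's contents; the
    # hostname is purely illustrative.
    blob = "\\\\evil.example.com\\share\\icon.ico".encode("utf-16-le")
    print(find_unc_strings(blob))
```

Any shortcut in a user's profile whose embedded strings point at an external host would be worth a much closer look.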

What this means is that a threat actor can gain access to an infrastructure, and as has been observed, use various means to maintain persistence...drop backdoors or RATs, create accounts on Internet-facing systems, etc.  However, many (albeit not all) of these means of persistence can be overcome via the judicious use of AV, EDR monitoring, and a universal password change.

Modifying the iconfilename field of an LNK file is a means of persisting beyond password changes, because even after passwords are changed, the updated hashes will be sent across the wire.

Now, I did say earlier that this has been used before, and it has.  CISA Alert TA18-074A includes a section named "Persistence through LNK file manipulation". 

Note that from the alert, when looking at the "Contents of enu.cmd", "Persistence through LNK file manipulation", and "Registry Modification" sections, we can see a pretty comprehensive set of toolmarks associated with this threat actor.  This is excellent intrusion intelligence, and should be incorporated into any and all #DFIR parsing, enrichment and decoration, as well as threat hunting.

However, things are even better! This tweet from bohops illustrates how to apply this technique to MSWord docs.