Monday, February 06, 2023

Why Lists?

So much of what we see in cybersecurity, in SOC, DFIR, red teaming/ethical hacking/pen testing, seems to be predicated on lists. Lists of tools, lists of books, lists of sites with courses, lists of free courses, etc. CD-based distros are the same way, regardless of whether they're meant for red- or blue-team efforts; the driving factor behind them is often the list of tools embedded within the distribution. For example, the Kali Linux site says that it has "All the tools you need". If you go to the SANS SIFT Workstation site, you'll see the description that includes, "...a collection of free and open-source incident response and forensic tools." Here's a Github site that lists "blue team tools"...but that's it, just a list.

Okay, so what's up with lists, you ask? What's the "so, what?" 

Lists are great...they often show us new tools that we'd hadn't seen or heard about, possibly tools that might be more effective or efficient for us and our workflows. Maybe a data source has been updated and there's a tool that addresses that new format, or maybe you're run across a case that includes the use of a different web browser, and there's a tool that parses the history for you. So, having lists is good, and familiar...because that's the way we've always done it, right? A lot of folks developing these lists came into the industry themselves at one point, looked around, and saw others posting lists. As such, the general consensus seems to be, "share lists"...either share a list you found, or share a list you've added to.

Lists, particularly checklists, can be useful. They can ensure that we don't forget something that's part of a necessary process, and if we intentionally and purposely manage and maintain that checklist, it can be our documentation; rather than writing out each step in our checklist as part of our case notes/documentation, we can just say, "...followed/completed the checklist version xx.xx, as of Y date...", noting any discrepancies or places we diverged. The value of a checklist depends upon how it's used...if it's downloaded and used because it's "cool", and it's not managed and never updated, then it's pretty useless.

Are lists enough?

I recently ran across a specific kind of list...the "cheat sheet". This specific cheat sheet was a list of Windows Event Log record event IDs. It was different from some other similar cheat sheets I'd seen because it was broken down by Windows Event Log file, with the "event IDs of interest" listed beneath each heading. However, it was still just a list, with the event IDs listed along with a brief description of what they meant.

However, even though this cheat sheet was "different", it was still just a list and it still wasn't sufficient for analysis today. 

Why is that?

Because a simple list doesn't give you the how, nor does it give you the why. Great, so I found a record with that event ID, and someone's list said it was "important", but that list doesn't tell me how this event ID is important to nor used in my investigation, nor how I can leverage that event ID to answer my investigative questions. The cheat sheet didn't tell me anything about how that specific event ID 

We have our lists, we have our cheat sheets, and now it's time to move beyond these and start developing the how and why; how to use the entry in an investigation, and why it's important. We need to focus less on simple lists and more on developing investigative goals and artifact constellations, so that we can understand what that entry means within the overall context of our investigation, and what it means when the entry is absent. 

We need to share more about how the various items on our lists are leveraged to reach or further our investigative goals. Instead of a list of tools to use, talk about how you've used one of those tools as part of your investigative process, to achieve your investigative goals.

Having lists or cheat sheets like those we've been seeing perpetuates the belief that it's sufficient to examine data sources in isolation from each other, and that's one of the biggest failings of these lists. As a community, and as an industry, we need to move beyond these ideas of isolation and sufficiency; while they seem to bring about an immediate answer or immediate findings, the simple fact is that neither serves us well when it comes to finding accurate and complete answers.

Sunday, February 05, 2023

Validating Tools

Many times, in the course of our work as analysts (SOC, DFIR, etc.), we run tools...and that's it. But do we often stop to think about why we're running that tool, as opposed to some other tool? Is it because that's the tool everyone we know uses, and we just never thought to ask about another? Not so much the how, but do we really think about the why?

The big question, however, we validate our tools? Do we verify that the tools are doing what they are supposed to, what they should be doing, or do we simply accept the output of the tool without question or critical thought? Do we validate our tools against our investigative goals?

Back when Chris Pogue and I were working PCI cases as part of the IBM ISS X-Force ERS team, we ran across an instance where we really had to dig in and verify our toolset. Because we were a larger team, with varying skill levels, we developed a process for all of the required searches, scans and checks (search for credit card numbers, scans for file names, paths, hashes, etc.) based on Guidance Software's EnCase product, which was in common usage across the team. As part of the searches for credit card numbers (CCNs), we were using the built-in function isValidCreditCard(). Not long after establishing this process, we had a case where JCB and Discover credit cards had been used, but these weren't popping up in our searches.

Chris and I decided to take a look at this issue, and we went to the brands and got test card numbers...card numbers that would pass the necessary checks (BIN, length, Luhn check), but were not actual cards used by consumers. We ran test after test, and none using the isValidCreditCard() returned the card numbers. We tried reaching out via the user portal, and didn't get much in the way of a response that was useful. Eventually, we determined that those two card brands were simply not considered "valid" by the built-in function, so we overrode that function with one of our one, one that included 7 regexes in order to find all valid credit card numbers, which we verified with some help from a friend

We learned a hard lesson from this exercise, one that really cemented the adage, "verify your tools". If you're seeing (or not, as the case may be) something that you don't expect to see in the output of your tools, verify the tool. Do not assume that the tool is correct, that the tool author knew everything about the data they were dealing with and had accounted for edge cases. This is not to say that tool authors aren't smart and don't know what they're doing...not at all. In fact, it's quite the opposite, because what can often happen is that the data changes over time (we see this a LOT with Windows...), or there are edge cases that the tool simply doesn't handle well.

So we're not just asking about the general "verify your tools" adage; what we're really asking about is, "do you verify your tools against your investigative goals?". The flip side of this is that if you can't articulate your investigative goals, why are you running any tools in the first place?

Not long ago, I was working with someone who was using a toolset built out of open source and free tools. This toolset included a data collection component, middleware (parsed the data), and a backend component for engaging with and displaying the parsed data. The data collection component included retrieving a copy of the WMI repository, and I asked the analyst if they saw any use of WMI persistence, to which they said, "no". In this particular case, open reporting indicated that these threat actors had been observed using WMI for persistence. While the data collection component retrieved the WMI repository, the middleware component did not include the necessary code to parse that repository, and as such, one could not expect to see artifacts related to WMI persistence in the backend, even if they did exist in the repository. 

The issue was that we often expect the tools or toolset to be complete in serving our needs, without really understanding those "needs", nor the full scope of the toolset itself. Investigative needs or goals may not be determined or articulated, and the toolset was not validated against investigative goals, so assumptions were made, including ones that would lead to incomplete or incorrect reporting to customers.

Going Beyond Tool Validation to Process Validation
Not long ago, I included a question in one of my tweet responses: "how would you use RegRipper to check to see if Run key values were disabled?" The point of me asking that question was to determine who was just running RegRipper because it was cool, and who was doing so because they were trying to answer investigative questions. After several days of not getting any responses to the question (I'd asked the same question on LinkedIn), I posed the question directly to Dr. Ali Hadi, who responded by posting a YouTube video demonstrating how to use RegRipper. Dr. Hadi then posted a second YouTube video, asking, "did the program truly run or not?", addressing the issue of the StartupApproved\Run key.

The point is, if you're running RegRipper (or any other tool for that matter), why are you running it? Not how...that comes later. If you're running RegRipper thinking that it's going to address all of your investigative needs, then how do you know? What are your "investigative needs"? Are you trying to determine program execution? If so, the plugin Dr. Hadi illustrated in both videos is a great place to start, but it's nowhere near complete. 

You see, the plugin will extract values from the keys listed in the plugin (which Dr. Hadi illustrated in one of the videos). That version includes the StartupApproved\Run key in the plugin, as well, as it was added before I had a really good chance to conduct some more comprehensive testing with respect to that key and it's values. I've since removed the key (and the other associated keys) from the plugin and moved them to a separate plugin, with associated MITRE ATT&CK mapping and analysis tips.

As you can see from Dr. Hadi's YouTube video, it would be pretty elementary for a threat actor to drop a malware executable in a folder, and create a Run key value that points to it. Then, create a StartupApproved\Run key value that disables the Run key entry so that it doesn't run. What would be the point of doing this? Well, for one, to create a distraction so that the responder's attention is focused elsewhere, similar to what happened with this engagement.

If you are looking to determine program execution and you're examining the contents of the Run keys, then you'd also want to include the Microsoft-Windows-Shell-Core%4Operational Event Log, as well, as the event records indicate when the key contents are processed, as well as when execution of individual programs (pointed to by the values) began and completed. This is a great way to determine program execution (not just "maybe it ran"), as well as to see what may have been run via the RunOnce key, as well.

The investigative goal is to verify program execution via the Run/RunOnce keys, from both the Software and NTUSER.DAT hives. A tool was can use is RegRipper, but even so, this will not allow us to actually validate program execution; for that, we need a process that includes incorporating the Microsoft-Windows-Shell-Core%4Operational Event Log, as well as the Application Event Log, looking for Windows Error Reporting or Application Popup events. For any specific programs we are interested in, we'd need to look at artifacts that included "toolmarks" of that program, looking for any file system, Registry, or other impacts on the system.

If you're going to use a tool in SOC or DFIR work, understand the why; what investigative questions or goals will the tool help you answer/achieve? Then, validate that the tool will actually meet those needs. Would those investigative goals be better served by a process, one that addresses multiple aspects of the goal? For example, if you're interested in IP addresses in a memory dump, searching for the IP address (or IP addresses, in general) via keyword or regex searches will not be comprehensive, and will lead to inaccurate reporting. In such cases, you'd want to use Volatility, as well as bulk_extractor, to look for indications of network connections and communications.

Monday, January 30, 2023

Soft Skills: Writing


Like math in middle school, this is one of those subjects that we pushed back on, telling ourselves, "I'll never have to use this...", and then quite shockingly finding that it's amazing how much writing we actually do. However, are we doing it well, given the particular circumstances of the writing? We "write" on social media, not being too overly concerned about things like grammar, spelling, or even word choice, falling back on the old, " know what I meant...", or blaming auto-correct for the miscommunication.

I'll be the first to admit, I'm not an "expert" at writing, nor am I "the best". But I will say that I am intentional in my writing, and this is something that's led me to...not been the result of...maintaining a blog, and publishing several books, with others in the hopper.

Writing is a necessary skill that many who need to, do not intentionally engage in, and of those who do, very few accept criticism or feedback well. For myself, I have a long history in my career of having to write, which stated in college in the mid-to-late '80s. I was an engineering major, and my English professor said that I wrote "like an English major". To this day, I still don't know what that meant, because I was constantly switching verb tenses in my writing, which was reflected in my grades. 

While on active duty, I had to write, and because it was the military, there was was part of what we did, so there was no getting away from it, shrinking back and retreating when someone had recommendations. It started in training, with things like operations orders, and continued out "in the fleet", progressing into JAG manual investigations, fitness reports, pro-con assessments, etc. There was all kinds of writing, and there was a LOT of feedback, whether you wanted or liked it, or not.

Another thing that was clear about the writing in that environment was that not everyone received the same feedback, and not everyone took the feedback they received the same way. Very early on in my career, I learned some "truths" about writing fitness reports, particularly from knowledgeable individuals. I reported to my first unit in May 1990, and not long after, a CWO-2 named "Rick" returned from a promotion board at HQMC. He was able (and willing) to act as a confidant and mentor, particularly given not only his longevity in service, but also based on his very recent experience. He not only shared what he'd learned throughout his career, but also the insights he'd learned from the recent promotion panel. I was able to take what I learned from Rick, and use it going forward, but just a couple of years later, I had a SSgt who was applying for the Warrant Officer program, who had some fitness reports written on him that included questionable statements, statements that stood out as being starkly and glaringly counter to what I'd learned.

What I learned on active duty served me very well in the private sector, the biggest lesson being, "...don't get butt hurt when someone says something...". Look beyond the "how" of what's said, to the "what". Don't get so wrapped in the "you were mean to me" emotional response that you miss the gem hidden beyond that will get you over that hump, and allow you to be a better writer. Look beyond your own initial, visceral, emotional response, and closely examine the "what".

Now, if you're posting to social media, you may not care about grammar, spelling, punctuation, etc. It may not matter, and that's fine. If you're not trying to convey a thought or idea, and you're just "sh*t posting", then it really doesn't matter if the reader understands what you're trying to say. 

But what about if you're filling out a ticket, or reporting on an investigation? What if you're actually trying to convey something, because it's "important"? Now, I put the word "important" in quotes, because throughout the passed two decades, I've talked to more than a few in the DFIR community who haven't really grasped how important their communication is, how what they are sharing in a ticket or in a report is actually used by someone else to make a decision, to commit resources (or not), or to levy a fine or punishment. Many analysts never see what's done with their work, they never see corporate counsel or HR, or a regulatory body using what they've written to make decisions.

I've also seen far too many times how a simple, "...what does this mean?" or "...can you clarify which version of Windows you're working with..." is wildly misinterpreted and internalized as, "I'm being called out unfairly."

What Can I Do?
So, what? So, what does this all mean to you, the reader, and what can you do? What I'm going to share here are some of the lessons I've learned over the past three decades...

The first step is to recognize that we can all get better at communicating, and in particular, writing. So, start writing. Comment on posts (Twitter, LinkedIn) rather than simply clicking "Like". Did you like a book you read? Write a review. Ask someone a question about a book they read or about a post they wrote or recommended. 

To get better at writing, it's best to read. If you read something that you enjoyed reading, consider why you enjoyed it. Was it the content itself, or the writing style? If it was the writing style, try emulating that style. Is it more formal, clinical, or perhaps more conversational? Consider what works for you, what do you enjoy, and how can you make that part of how you write.

Do not assume any response or feedback you receive is intended to be negative. Yes, I get it...this is the Internet, and there is a lot of negativity, and when you encounter that, the best thing to do is ignore it. But when you receive feedback on something, particularly when it's sought out, don't immediately assume that it's negative, or that you've done something profoundly wrong. Instead, recognize the negative feelings you're having, take a deep breath...and look beyond those feelings and really try to see the "what" beyond the "how". Look for what's being said, beyond how it's being said, or how it makes you feel.

Friday, January 27, 2023

Updates, Compilation

Thoughts on Detection Engineering
I read something online recently that suggested that the role of detection engineering is to reduce the false positive (FPs) alerts sent to the SOC. In part, I fully agree with this; however, "cyber security" is a team sport, and it's really incumbent upon SOC and DFIR analysts to support the detection engineering effort through their investigations. This is something I addressed a bit ago in this blog, first here, and then here

From the second blog post linked above, the most important value-add is the image to the right. This is something I put together to illustrate what, IMHO, should be the interaction between the SOC, DFIR, threat hunting, threat intel, and detection engineering. As you see from the image, the idea is that the output of DFIR work, the DFIR analysis, feeds back into the overall process, through threat intel and detection engineering. Then, both of those functions further feed back into the overall process at various points, one being back into the SOC through the development of high(er) fidelity detections. Another feedback point is that threat intel or gaps identified by detection engineer serve to inform what other data sources may need to be collected and parsed as part of the overall response process.

The overall point here is that the SOC shouldn't be inundated or overwhelmed with false positive (FP) detections. Rather, the SOC should be collecting the necessary metrics (through an appropriate level of investigation) to definitively demonstrate that the detections are FPs, and the feed that directly to the DFIR cycle to collect and analyze the necessary information to determine how to best address those FPs.

One example of the use of such a process, although not related to false positives, can be seen here. Specifically, Huntress ThreatOps analysts were seeing a lot of malware (in particular, but not solely restricted to Qakbot) on customer systems that seemed to be originating from phishing campaigns that employed disk image file attachments. One of the things we did was create an advisory for customers, providing a means to disable the ability for users to just double-click the ISO, IMG, or VHD files and automatically mount them. Users are still able to access the files programmatically, they just can't mount them by double-clicking them.

While this specific event wasn't related to false positives, it does illustrate how taking a deeper look at an issue or event can provide something of an "upstream remediation", approaching and addressing the issue much earlier in the attack chain

If you're into podcasts, Zaira provided me the wonderful opportunity to appear on the Future of Cyber Crime podcast! It was a great opportunity for me to engage with and learn from Zaira! Thank you so much!

Recycle Bin Persistence  
D1rkMtr recently released a Windows persistence mechanism (tweet found here) based on the Recycle Bin. This one is pretty interesting, not just in it's implementation but you have to wonder how someone on the DFIR side of that persistence mechanism would even begin to investigate it. 

I know how I would...I created a RegRipper plugin for it, one that will be run on every investigation automatically, and provide an analysis tip so I never forget what it's meant to show.

recyclepersist v.20230122
(Software, USRCLASS.DAT) Check for persistence via Recycle Bin
Category: persistence (MITRE T1546)

Classes\CLSID\{645FF040-5081-101B-9F08-00AA002F954E}\shell\open\command not found.
Classes\Wow6432Node\CLSID\{645FF040-5081-101B-9F08-00AA002F954E}\shell\open\command not found.

Analysis Tip: Adding a \shell\open\command value to the Recycle Bin will allow the program to be launched when the Recycle Bin is opened. This key path does not exist by default; however, the \shell\empty\command key path does.


Speaking of RegRipper plugins, I ran across this blog post recently about retrieving Registry values to decrypt files protected by DDPE. For me, while the overall post was fascinating in the approach taken, the biggest statement from the post was:

I don’t have a background in Perl and it turns out I didn’t need to. If the only requirement is a handful of registry values, several plugins that exist in the GitHub repository may be used as a template. To get a feel for the syntax, I found it helpful to review plugins for registry artifacts I’m familiar with. After a few moments of time and testing, I had an operational plugin.

For years, I've been saying that if there's a plugin that needs to be created or modified, it's as easy as either creating it yourself, by using copy-paste, or by reaching out and asking. Providing a clear, concise description of what you're looking for, along with sample data, has regularly resulted in a working plugin being available in an hour or so.

However, taking the reigns of the DIY approach as been something that Corey Harrell started doing years ago, and what let to such tools as auto_rip.

Now, this isn't to say that it's always that easy...talking through adding JSON output took some discussion, but the person who asked about that was willing to discuss it, and I think we both learned from the engagement.

Anyone who's followed me for a short while will know that I'm a really huge proponent for making the most of what's available, particularly when it comes to file metadata. One of the richest and yet largely untapped (IMHO) sources of such metadata are LNK files. Cisco's Talos team recently published a blog post titled, "Following the LNK Metadata Trail".

The article is interesting, and while several LNK builders are identified, the post falls just short of identifying toolmarks associated with these builders. At one point, the article turns to Qakbot campaigns and states that there was no overlap in LNK metadata between campaigns. This is interesting, when compared to what Mandiant found regarding two Cozy Bear campaigns separated by 2 years (see figs 5 & 6). What does this say to you about the Qakbot campaigns vs the Cozy Bear campaigns?

Updates to MemProcFS-Analyzer 
Evild3ad79 tweeted that MemProcFS-Analyser has been updated to version 0.8. Wow! I haven't had the opportunity to try this yet, but it does look pretty amazing with all of the functionality provided in the current version! Give it a shot, and write a review of your use of the tool!

OneNote Tools
Following the prevalence of malicious OneNote files we've seen though social media over the past few weeks, both Didier Stevens and Volexity crew have released tools for parsing those OneNote files.

Addendum, 30 Jan: Matthew Green added a OneNote parser/detection artifact to Velocidex.

Sunday, January 15, 2023

Wi-Fi Geolocation, Then and Now

I've always been fascinated by the information maintained in the Windows Registry. But in order to understand this, to really get a view into this, you have to know a little bit about my background. The first computer I remember actually using was a Timex-Sinclair 1000, just like the one in the image shown to the right. You connected it to the TV, programs were created via the keyboard and usually copied from "recipes" in the manual or in a magazine, and the "programs" could be saved to or loaded from a tape in a tape recorder. Yes, you read that right...a tape recorder. I was programming BASIC programs on this system, and then on a Mac IIe. After that, it was the Epson QX-10, and then for a very long time, in high school and then in college (I started college in August, 1985), the TRS-80

The point of all of this is that the configuration of these systems, particularly as we moved to systems running MS-DOS, was handled through configuration files, particularly autoexec.bat and a myriad *.ini files. Even when I started using Windows 3.1 or Windows 3.11 for Workgroups, the same held true...configuration files. We started to see the beginnings of the Registry with Windows 95, and files such as system.dat. 

Even from the very beginning of my experience with the Windows Registry, the amount and range of information stored in this data source has been absolutely incredible. In 2005, Cory Altheide and I published the first paper outlining artifacts associated with USB devices being connected to Windows (Windows XP) systems. What we were looking at at the time was commonalities across systems when the same device was connected to multiple systems, say, to run programs from the thumb drive, or copy files from systems to then take back to a central computer system.

From there, this topic has continued to be explored and unraveled, even as Windows itself continued to evolve and recognize different types of devices (thumb drives, digital cameras, smart phones) based on the protocol used.

In 2009, I wrote a blog post about another artifact stored within the Windows Registry; specifically, MAC addresses of wireless access points that a Windows system had connected to. By tracking this information and mapping the geo-location of those wireless access points based on data recorded in online databases, the idea was that an analyst could track the movements of that system, and hence, the owner. 

Why was this interesting? I'd heard more than a few stories from analysts and investigators who talked about an (former) employee of a company who, usually after the fact, was found to have visited a competitor's offices prior to resigning and accepting employment with that competitor. In one instance, not only did the employee connect their work computer to the Wifi system at a competitor's location, but they also connected to a Starbuck's store Wifi system that morning, next to or close to the competitor's location. With the time stamps of the connections, analysts were then able to use other timeline information to illustrate applications opened and files accessed until the system was shut down again.

I updated the tool I wrote in 2011, and as you can see from the post and comments, there was still interest in this topic at the time. I remember working on the tool, and taking the lat/long coordinates returned by the online database to populate a Google Map URL. So, over the course of about 2 yrs, the interest...or at least, my moving this forward, or at least revisiting it, was still there.

I recently ran across this tweet (I saw it on 15 Jan 2023), which led me to this Github repository.

This is what I love, truly love to something that was of interest at one point is once again on the forefront of someone's mind, to the point where they create a tool, and post it on Github. This truly shows that no matter how much work and effort is put into something at one point, there will always be growth, and different aspects of the early project (the platform, the Registry, the online databases, etc.) will be extended. This also shows that nothing ever really goes away...