Monday, February 06, 2023

Why Lists?

So much of what we see in cybersecurity, in SOC, DFIR, red teaming/ethical hacking/pen testing, seems to be predicated on lists. Lists of tools, lists of books, lists of sites with courses, lists of free courses, etc. CD-based distros are the same way, regardless of whether they're meant for red- or blue-team efforts; the driving factor behind them is often the list of tools embedded within the distribution. For example, the Kali Linux site says that it has "All the tools you need". If you go to the SANS SIFT Workstation site, you'll see the description that includes, "...a collection of free and open-source incident response and forensic tools." Here's a Github site that lists "blue team tools"...but that's it, just a list.

Okay, so what's up with lists, you ask? What's the "so, what?" 

Lists are great...they often show us new tools that we hadn't seen or heard about, possibly tools that might be more effective or efficient for us and our workflows. Maybe a data source has been updated and there's a tool that addresses the new format, or maybe you've run across a case that includes the use of a different web browser, and there's a tool that parses the history for you. So, having lists is good, and familiar...because that's the way we've always done it, right? A lot of folks developing these lists came into the industry themselves at one point, looked around, and saw others posting lists. As such, the general consensus seems to be, "share lists"...either share a list you found, or share a list you've added to.

Lists, particularly checklists, can be useful. They can ensure that we don't forget something that's part of a necessary process, and if we intentionally and purposely manage and maintain that checklist, it can be our documentation; rather than writing out each step in our checklist as part of our case notes/documentation, we can just say, "...followed/completed the checklist version xx.xx, as of Y date...", noting any discrepancies or places we diverged. The value of a checklist depends upon how it's used...if it's downloaded and used because it's "cool", and it's not managed and never updated, then it's pretty useless.

Are lists enough?

I recently ran across a specific kind of list...the "cheat sheet". This specific cheat sheet was a list of Windows Event Log record event IDs. It was different from some other similar cheat sheets I'd seen because it was broken down by Windows Event Log file, with the "event IDs of interest" listed beneath each heading. However, it was still just a list, with the event IDs listed along with a brief description of what they meant.

However, even though this cheat sheet was "different", it was still just a list and it still wasn't sufficient for analysis today. 

Why is that?

Because a simple list doesn't give you the how, nor does it give you the why. Great, so I found a record with that event ID, and someone's list said it was "important", but that list doesn't tell me how this event ID is important to, or used in, my investigation, nor how I can leverage that event ID to answer my investigative questions. The cheat sheet didn't tell me anything about how that specific event ID applies to the investigation at hand.

We have our lists, we have our cheat sheets, and now it's time to move beyond these and start developing the how and why; how to use the entry in an investigation, and why it's important. We need to focus less on simple lists and more on developing investigative goals and artifact constellations, so that we can understand what that entry means within the overall context of our investigation, and what it means when the entry is absent. 
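To make the contrast concrete, here's a minimal sketch of the difference between a bare list entry and one enriched with investigative context. Event ID 4624 (an account was successfully logged on) is a real Windows Security Event Log example; the "why", "how", and "questions" fields are illustrative, my own wording rather than anything authoritative:

```python
# A bare "cheat sheet" entry: event ID and a one-line description.
CHEAT_SHEET = {4624: "An account was successfully logged on"}

# The same entry, enriched with the "how" and "why" (illustrative content).
ENRICHED = {
    4624: {
        "description": "An account was successfully logged on",
        "why": "Establishes when and how an account accessed the system",
        "how": "Pivot on the Logon Type and source workstation/IP fields, "
               "then correlate with other artifacts in the timeline",
        "questions": ["Was this logon interactive, remote, or network?",
                      "Does the source system fit the user's normal pattern?"],
    }
}

def describe(event_id, catalog):
    """Render an entry; enriched entries carry investigative context."""
    entry = catalog.get(event_id)
    if isinstance(entry, dict):
        return f"{event_id}: {entry['description']} -- why it matters: {entry['why']}"
    return f"{event_id}: {entry}" if entry else f"{event_id}: unknown"
```

The point isn't the data structure itself; it's that the enriched form forces you to write down the investigative questions the artifact can answer, which the bare list never does.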

We need to share more about how the various items on our lists are leveraged to reach or further our investigative goals. Instead of a list of tools to use, talk about how you've used one of those tools as part of your investigative process, to achieve your investigative goals.

Having lists or cheat sheets like those we've been seeing perpetuates the belief that it's sufficient to examine data sources in isolation from each other, and that's one of the biggest failings of these lists. As a community, and as an industry, we need to move beyond these ideas of isolation and sufficiency; while they seem to bring about an immediate answer or immediate findings, the simple fact is that neither serves us well when it comes to finding accurate and complete answers.

Sunday, February 05, 2023

Validating Tools

Many times, in the course of our work as analysts (SOC, DFIR, etc.), we run tools...and that's it. But do we often stop to think about why we're running that tool, as opposed to some other tool? Is it because that's the tool everyone we know uses, and we just never thought to ask about another? Not so much the how, but do we really think about the why?

The big question, however, is: do we validate our tools? Do we verify that the tools are doing what they're supposed to do, or do we simply accept the output of the tool without question or critical thought? Do we validate our tools against our investigative goals?

Back when Chris Pogue and I were working PCI cases as part of the IBM ISS X-Force ERS team, we ran across an instance where we really had to dig in and verify our toolset. Because we were a larger team, with varying skill levels, we developed a process for all of the required searches, scans and checks (search for credit card numbers, scans for file names, paths, hashes, etc.) based on Guidance Software's EnCase product, which was in common usage across the team. As part of the searches for credit card numbers (CCNs), we were using the built-in function isValidCreditCard(). Not long after establishing this process, we had a case where JCB and Discover credit cards had been used, but these weren't popping up in our searches.

Chris and I decided to take a look at this issue, and we went to the brands and got test card numbers...card numbers that would pass the necessary checks (BIN, length, Luhn check), but were not actual cards used by consumers. We ran test after test, and none using the isValidCreditCard() returned the card numbers. We tried reaching out via the user portal, and didn't get much in the way of a response that was useful. Eventually, we determined that those two card brands were simply not considered "valid" by the built-in function, so we overrode that function with one of our own, one that included 7 regexes in order to find all valid credit card numbers, which we verified with some help from a friend.
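For illustration, here's a hedged sketch of the kind of validation logic involved: a Luhn (mod 10) check plus brand-aware prefix/length patterns. The regexes below are simplified illustrations, not the seven we actually used, and not the full published IIN tables; the test numbers in the comments are standard published test card numbers, not consumer cards:

```python
import re

def luhn_valid(number: str) -> bool:
    """Standard Luhn (mod 10) check-digit validation."""
    digits = [int(d) for d in number if d.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9      # same as summing the two digits of the product
        checksum += d
    return checksum % 10 == 0

# Simplified BIN/length patterns per brand (illustrative, incomplete).
BRAND_PATTERNS = {
    "visa":       re.compile(r"^4\d{15}$"),
    "mastercard": re.compile(r"^5[1-5]\d{14}$"),
    "discover":   re.compile(r"^6011\d{12}$"),   # e.g. test no. 6011111111111117
    "jcb":        re.compile(r"^35\d{14}$"),
}

def brand_of(number: str):
    """Return the first brand whose pattern matches, else None."""
    for brand, pat in BRAND_PATTERNS.items():
        if pat.match(number):
            return brand
    return None
```

The lesson from the case above is that brand coverage has to be tested explicitly: a number can pass the Luhn check and still be silently dropped if the brand logic doesn't account for it.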

We learned a hard lesson from this exercise, one that really cemented the adage, "verify your tools". If you're seeing (or not, as the case may be) something that you don't expect to see in the output of your tools, verify the tool. Do not assume that the tool is correct, that the tool author knew everything about the data they were dealing with and had accounted for edge cases. This is not to say that tool authors aren't smart and don't know what they're doing...not at all. In fact, it's quite the opposite, because what can often happen is that the data changes over time (we see this a LOT with Windows...), or there are edge cases that the tool simply doesn't handle well.

So we're not just asking about the general "verify your tools" adage; what we're really asking about is, "do you verify your tools against your investigative goals?". The flip side of this is that if you can't articulate your investigative goals, why are you running any tools in the first place?

Not long ago, I was working with someone who was using a toolset built out of open source and free tools. This toolset included a data collection component, middleware (parsed the data), and a backend component for engaging with and displaying the parsed data. The data collection component included retrieving a copy of the WMI repository, and I asked the analyst if they saw any use of WMI persistence, to which they said, "no". In this particular case, open reporting indicated that these threat actors had been observed using WMI for persistence. While the data collection component retrieved the WMI repository, the middleware component did not include the necessary code to parse that repository, and as such, one could not expect to see artifacts related to WMI persistence in the backend, even if they did exist in the repository. 

The issue was that we often expect the tools or toolset to be complete in serving our needs, without really understanding those "needs", nor the full scope of the toolset itself. Investigative needs or goals may not be determined or articulated, and the toolset was not validated against investigative goals, so assumptions were made, including ones that would lead to incomplete or incorrect reporting to customers.

Going Beyond Tool Validation to Process Validation
Not long ago, I included a question in one of my tweet responses: "how would you use RegRipper to check to see if Run key values were disabled?" The point of me asking that question was to determine who was just running RegRipper because it was cool, and who was doing so because they were trying to answer investigative questions. After several days of not getting any responses to the question (I'd asked the same question on LinkedIn), I posed the question directly to Dr. Ali Hadi, who responded by posting a YouTube video demonstrating how to use RegRipper. Dr. Hadi then posted a second YouTube video, asking, "did the program truly run or not?", addressing the issue of the StartupApproved\Run key.

The point is, if you're running RegRipper (or any other tool for that matter), why are you running it? Not how...that comes later. If you're running RegRipper thinking that it's going to address all of your investigative needs, then how do you know? What are your "investigative needs"? Are you trying to determine program execution? If so, the plugin Dr. Hadi illustrated in both videos is a great place to start, but it's nowhere near complete. 

You see, the plugin will extract values from the keys listed in the plugin (which Dr. Hadi illustrated in one of the videos). That version includes the StartupApproved\Run key in the plugin, as well, as it was added before I had a really good chance to conduct more comprehensive testing with respect to that key and its values. I've since removed the key (and the other associated keys) from the plugin and moved them to a separate plugin, with associated MITRE ATT&CK mapping and analysis tips.

As you can see from Dr. Hadi's YouTube video, it would be pretty elementary for a threat actor to drop a malware executable in a folder, and create a Run key value that points to it. Then, create a StartupApproved\Run key value that disables the Run key entry so that it doesn't run. What would be the point of doing this? Well, for one, to create a distraction so that the responder's attention is focused elsewhere, similar to what happened with this engagement.

If you are looking to determine program execution and you're examining the contents of the Run keys, then you'd also want to include the Microsoft-Windows-Shell-Core%4Operational Event Log, as the event records indicate when the key contents are processed, as well as when execution of individual programs (pointed to by the values) began and completed. This is a great way to determine program execution (not just "maybe it ran"), and to see what may have been run via the RunOnce key, as well.

The investigative goal is to verify program execution via the Run/RunOnce keys, from both the Software and NTUSER.DAT hives. A tool we can use is RegRipper, but even so, this will not allow us to actually validate program execution; for that, we need a process that incorporates the Microsoft-Windows-Shell-Core%4Operational Event Log, as well as the Application Event Log, looking for Windows Error Reporting or Application Popup events. For any specific programs we're interested in, we'd need to look at artifacts that include "toolmarks" of that program, looking for any file system, Registry, or other impacts on the system.
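As a sketch of what that correlation step might look like once the Shell-Core event log has been parsed: event IDs 9707 ("command started") and 9708 ("finished") are the ones commonly cited for Run/RunOnce processing, but the record layout below (one dict per record) is an assumption about your parser's output, not a specification, so verify it against your own tooling:

```python
def pair_run_key_events(records):
    """Pair 9707/9708 records into execution intervals.

    records: iterable of dicts with 'event_id', 'command', and a
    sortable 'timestamp' field (assumed layout from an upstream parser).
    """
    started = {}      # command -> start timestamp, awaiting a 9708
    executions = []
    for rec in sorted(records, key=lambda r: r["timestamp"]):
        if rec["event_id"] == 9707:
            started[rec["command"]] = rec["timestamp"]
        elif rec["event_id"] == 9708 and rec["command"] in started:
            executions.append({
                "command": rec["command"],
                "began": started.pop(rec["command"]),
                "ended": rec["timestamp"],
            })
    return executions
```

Pairing the start and finish records is what moves the finding from "a value exists in the Run key" to "this program actually executed, between these two times".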

If you're going to use a tool in SOC or DFIR work, understand the why; what investigative questions or goals will the tool help you answer/achieve? Then, validate that the tool will actually meet those needs. Would those investigative goals be better served by a process, one that addresses multiple aspects of the goal? For example, if you're interested in IP addresses in a memory dump, searching for the IP address (or IP addresses, in general) via keyword or regex searches will not be comprehensive, and will lead to inaccurate reporting. In such cases, you'd want to use Volatility, as well as bulk_extractor, to look for indications of network connections and communications.
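To illustrate the point about keyword/regex searches for IP addresses, here's a small sketch showing why a naive IPv4 regex over raw memory strings over-matches; the sample text is made up:

```python
import re

# A naive IPv4 pattern, typical of simple keyword/regex sweeps.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

sample = (
    "Connected to 10.2.3.4; "
    "file version 2.10.175.22; "          # a version string, not an IP
    "garbage bytes: 999.999.999.999"      # octets out of valid range
)

hits = IPV4.findall(sample)
# The pattern matches the version string and the impossible octets
# alongside the real address -- and it says nothing about whether any
# "hit" was ever part of an actual network connection, which is why
# structured parsing (Volatility, bulk_extractor) is needed.
```

The false hits are only half the problem; the deeper issue is that a string match carries no context about connection state, process ownership, or direction.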

Monday, January 30, 2023

Soft Skills: Writing


Like math in middle school, this is one of those subjects we pushed back on, telling ourselves, "I'll never have to use this...", and then quite shockingly finding that it's amazing how much writing we actually do. However, are we doing it well, given the particular circumstances of the writing? We "write" on social media, not being overly concerned about things like grammar, spelling, or even word choice, falling back on the old, "I know what I meant...", or blaming auto-correct for the miscommunication.

I'll be the first to admit, I'm not an "expert" at writing, nor am I "the best". But I will say that I am intentional in my writing, and this is something that's led me to...not been the result of...maintaining a blog, and publishing several books, with others in the hopper.

Writing is a necessary skill that many who need to, do not intentionally engage in, and of those who do, very few accept criticism or feedback well. For myself, I have a long history in my career of having to write, which started in college in the mid-to-late '80s. I was an engineering major, and my English professor said that I wrote "like an English major". To this day, I still don't know what that meant, because I was constantly switching verb tenses in my writing, which was reflected in my grades.

While on active duty, I had to write, and because it was the military, writing was part of what we did, so there was no getting away from it, no shrinking back and retreating when someone had recommendations. It started in training, with things like operations orders, and continued out "in the fleet", progressing into JAG manual investigations, fitness reports, pro-con assessments, etc. There was all kinds of writing, and there was a LOT of feedback, whether you wanted or liked it, or not.

Another thing that was clear about the writing in that environment was that not everyone received the same feedback, and not everyone took the feedback they received the same way. Very early on in my career, I learned some "truths" about writing fitness reports, particularly from knowledgeable individuals. I reported to my first unit in May 1990, and not long after, a CWO-2 named "Rick" returned from a promotion board at HQMC. He was able (and willing) to act as a confidant and mentor, particularly given not only his longevity in service, but also based on his very recent experience. He not only shared what he'd learned throughout his career, but also the insights he'd learned from the recent promotion panel. I was able to take what I learned from Rick, and use it going forward, but just a couple of years later, I had a SSgt who was applying for the Warrant Officer program, who had some fitness reports written on him that included questionable statements, statements that stood out as being starkly and glaringly counter to what I'd learned.

What I learned on active duty served me very well in the private sector, the biggest lesson being, "...don't get butt hurt when someone says something...". Look beyond the "how" of what's said, to the "what". Don't get so wrapped up in the "you were mean to me" emotional response that you miss the gem hidden beyond it, the one that will get you over that hump and allow you to be a better writer. Look beyond your own initial, visceral, emotional response, and closely examine the "what".

Now, if you're posting to social media, you may not care about grammar, spelling, punctuation, etc. It may not matter, and that's fine. If you're not trying to convey a thought or idea, and you're just "sh*t posting", then it really doesn't matter if the reader understands what you're trying to say. 

But what about if you're filling out a ticket, or reporting on an investigation? What if you're actually trying to convey something, because it's "important"? Now, I put the word "important" in quotes, because throughout the past two decades, I've talked to more than a few in the DFIR community who haven't really grasped how important their communication is, how what they are sharing in a ticket or in a report is actually used by someone else to make a decision, to commit resources (or not), or to levy a fine or punishment. Many analysts never see what's done with their work; they never see corporate counsel, HR, or a regulatory body using what they've written to make decisions.

I've also seen far too many times how a simple, "...what does this mean?" or "...can you clarify which version of Windows you're working with..." is wildly misinterpreted and internalized as, "I'm being called out unfairly."

What Can I Do?
So, what? So, what does this all mean to you, the reader, and what can you do? What I'm going to share here are some of the lessons I've learned over the past three decades...

The first step is to recognize that we can all get better at communicating, and in particular, writing. So, start writing. Comment on posts (Twitter, LinkedIn) rather than simply clicking "Like". Did you like a book you read? Write a review. Ask someone a question about a book they read or about a post they wrote or recommended. 

To get better at writing, it's best to read. If you read something that you enjoyed, consider why you enjoyed it. Was it the content itself, or the writing style? If it was the writing style, try emulating that style. Is it more formal, clinical, or perhaps more conversational? Consider what works for you, what you enjoy, and how you can make that part of how you write.

Do not assume any response or feedback you receive is intended to be negative. Yes, I get it...this is the Internet, and there is a lot of negativity, and when you encounter that, the best thing to do is ignore it. But when you receive feedback on something, particularly when it's sought out, don't immediately assume that it's negative, or that you've done something profoundly wrong. Instead, recognize the negative feelings you're having, take a deep breath...and look beyond those feelings and really try to see the "what" beyond the "how". Look for what's being said, beyond how it's being said, or how it makes you feel.

Friday, January 27, 2023

Updates, Compilation

Thoughts on Detection Engineering
I read something online recently that suggested that the role of detection engineering is to reduce the false positive (FP) alerts sent to the SOC. In part, I fully agree with this; however, "cyber security" is a team sport, and it's really incumbent upon SOC and DFIR analysts to support the detection engineering effort through their investigations. This is something I addressed a bit ago in this blog, first here, and then here.

From the second blog post linked above, the most important value-add is the image to the right. This is something I put together to illustrate what, IMHO, should be the interaction between the SOC, DFIR, threat hunting, threat intel, and detection engineering. As you see from the image, the idea is that the output of DFIR work, the DFIR analysis, feeds back into the overall process, through threat intel and detection engineering. Then, both of those functions further feed back into the overall process at various points, one being back into the SOC through the development of high(er) fidelity detections. Another feedback point is that threat intel, or gaps identified by detection engineering, serve to inform what other data sources may need to be collected and parsed as part of the overall response process.

The overall point here is that the SOC shouldn't be inundated or overwhelmed with false positive (FP) detections. Rather, the SOC should be collecting the necessary metrics (through an appropriate level of investigation) to definitively demonstrate that the detections are FPs, and then feed that directly into the DFIR cycle, to collect and analyze the necessary information to determine how to best address those FPs.

One example of the use of such a process, although not related to false positives, can be seen here. Specifically, Huntress ThreatOps analysts were seeing a lot of malware (in particular, but not solely restricted to Qakbot) on customer systems that seemed to be originating from phishing campaigns that employed disk image file attachments. One of the things we did was create an advisory for customers, providing a means to disable the ability for users to just double-click the ISO, IMG, or VHD files and automatically mount them. Users are still able to access the files programmatically, they just can't mount them by double-clicking them.

While this specific event wasn't related to false positives, it does illustrate how taking a deeper look at an issue or event can provide something of an "upstream remediation", approaching and addressing the issue much earlier in the attack chain.

If you're into podcasts, Zaira provided me the wonderful opportunity to appear on the Future of Cyber Crime podcast! It was a great opportunity for me to engage with and learn from Zaira! Thank you so much!

Recycle Bin Persistence  
D1rkMtr recently released a Windows persistence mechanism (tweet found here) based on the Recycle Bin. This one is pretty interesting, not just in its implementation, but you also have to wonder how someone on the DFIR side of that persistence mechanism would even begin to investigate it.

I know how I would...I created a RegRipper plugin for it, one that will be run on every investigation automatically, and provide an analysis tip so I never forget what it's meant to show.

recyclepersist v.20230122
(Software, USRCLASS.DAT) Check for persistence via Recycle Bin
Category: persistence (MITRE T1546)

Classes\CLSID\{645FF040-5081-101B-9F08-00AA002F954E}\shell\open\command not found.
Classes\Wow6432Node\CLSID\{645FF040-5081-101B-9F08-00AA002F954E}\shell\open\command not found.

Analysis Tip: Adding a \shell\open\command value to the Recycle Bin will allow the program to be launched when the Recycle Bin is opened. This key path does not exist by default; however, the \shell\empty\command key path does.


Speaking of RegRipper plugins, I ran across this blog post recently about retrieving Registry values to decrypt files protected by DDPE. For me, while the overall post was fascinating in the approach taken, the biggest statement from the post was:

I don’t have a background in Perl and it turns out I didn’t need to. If the only requirement is a handful of registry values, several plugins that exist in the GitHub repository may be used as a template. To get a feel for the syntax, I found it helpful to review plugins for registry artifacts I’m familiar with. After a few moments of time and testing, I had an operational plugin.

For years, I've been saying that if there's a plugin that needs to be created or modified, it's as easy as either creating it yourself, by using copy-paste, or by reaching out and asking. Providing a clear, concise description of what you're looking for, along with sample data, has regularly resulted in a working plugin being available in an hour or so.

However, taking the reins of the DIY approach has been something that Corey Harrell started doing years ago, and what led to such tools as auto_rip.

Now, this isn't to say that it's always that easy...talking through adding JSON output took some discussion, but the person who asked about that was willing to discuss it, and I think we both learned from the engagement.

Anyone who's followed me for a short while will know that I'm a really huge proponent for making the most of what's available, particularly when it comes to file metadata. One of the richest and yet largely untapped (IMHO) sources of such metadata are LNK files. Cisco's Talos team recently published a blog post titled, "Following the LNK Metadata Trail".

The article is interesting, and while several LNK builders are identified, the post falls just short of identifying toolmarks associated with these builders. At one point, the article turns to Qakbot campaigns and states that there was no overlap in LNK metadata between campaigns. This is interesting, when compared to what Mandiant found regarding two Cozy Bear campaigns separated by 2 years (see figs 5 & 6). What does this say to you about the Qakbot campaigns vs the Cozy Bear campaigns?

Updates to MemProcFS-Analyzer 
Evild3ad79 tweeted that MemProcFS-Analyzer has been updated to version 0.8. Wow! I haven't had the opportunity to try this yet, but it does look pretty amazing with all of the functionality provided in the current version! Give it a shot, and write a review of your use of the tool!

OneNote Tools
Following the prevalence of malicious OneNote files we've seen through social media over the past few weeks, both Didier Stevens and the Volexity crew have released tools for parsing those OneNote files.

Addendum, 30 Jan: Matthew Green added a OneNote parser/detection artifact to Velocidex.

Sunday, January 15, 2023

Wi-Fi Geolocation, Then and Now

I've always been fascinated by the information maintained in the Windows Registry. But in order to understand this, to really get a view into this, you have to know a little bit about my background. The first computer I remember actually using was a Timex-Sinclair 1000, just like the one in the image shown to the right. You connected it to the TV, programs were created via the keyboard and usually copied from "recipes" in the manual or in a magazine, and the "programs" could be saved to or loaded from a tape in a tape recorder. Yes, you read that right...a tape recorder. I was programming BASIC programs on this system, and then on an Apple IIe. After that, it was the Epson QX-10, and then for a very long time, in high school and then in college (I started college in August, 1985), the TRS-80.

The point of all of this is that the configuration of these systems, particularly as we moved to systems running MS-DOS, was handled through configuration files, particularly autoexec.bat and a myriad *.ini files. Even when I started using Windows 3.1 or Windows 3.11 for Workgroups, the same held true...configuration files. We started to see the beginnings of the Registry with Windows 95, and files such as system.dat. 

Even from the very beginning of my experience with the Windows Registry, the amount and range of information stored in this data source has been absolutely incredible. In 2005, Cory Altheide and I published the first paper outlining artifacts associated with USB devices being connected to Windows (Windows XP) systems. What we were looking at at the time was commonalities across systems when the same device was connected to multiple systems, say, to run programs from the thumb drive, or copy files from systems to then take back to a central computer system.

From there, this topic has continued to be explored and unraveled, even as Windows itself continued to evolve and recognize different types of devices (thumb drives, digital cameras, smart phones) based on the protocol used.

In 2009, I wrote a blog post about another artifact stored within the Windows Registry; specifically, MAC addresses of wireless access points that a Windows system had connected to. By tracking this information and mapping the geo-location of those wireless access points based on data recorded in online databases, the idea was that an analyst could track the movements of that system, and hence, the owner. 

Why was this interesting? I'd heard more than a few stories from analysts and investigators who talked about an (former) employee of a company who, usually after the fact, was found to have visited a competitor's offices prior to resigning and accepting employment with that competitor. In one instance, not only did the employee connect their work computer to the Wifi system at a competitor's location, but they also connected to a Starbuck's store Wifi system that morning, next to or close to the competitor's location. With the time stamps of the connections, analysts were then able to use other timeline information to illustrate applications opened and files accessed until the system was shut down again.

I updated the tool I wrote in 2011, and as you can see from the post and comments, there was still interest in this topic at the time. I remember working on the tool, and taking the lat/long coordinates returned by the online database to populate a Google Map URL. So, over the course of about 2 yrs, the interest...or at least, my moving this forward, or at least revisiting it, was still there.
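The mapping step the tool performed can be sketched very simply: given a lat/long pair for an access point (the lookup against an online wardriving-style database is assumed to have already happened), build a Google Maps URL. The BSSID and coordinates below are made up for illustration:

```python
def maps_url(lat: float, lon: float) -> str:
    """Build a Google Maps URL for a coordinate pair."""
    return f"https://www.google.com/maps?q={lat},{lon}"

# Hypothetical access point sightings: BSSID -> (lat, lon), as might be
# recovered from the Registry and resolved via an online database.
sightings = {
    "00:11:22:33:44:55": (38.8895, -77.0353),
}

for bssid, (lat, lon) in sightings.items():
    print(bssid, maps_url(lat, lon))
```

Plotting a series of these, ordered by the connection time stamps from the Registry, is what turns a list of access points into a movement timeline.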

I recently ran across this tweet (I saw it on 15 Jan 2023), which led me to this Github repository.

This is what I love, truly love...to see that something that was of interest at one point is once again at the forefront of someone's mind, to the point where they create a tool and post it on GitHub. This truly shows that no matter how much work and effort is put into something at one point, there will always be growth, and different aspects of the early project (the platform, the Registry, the online databases, etc.) will be extended. This also shows that nothing ever really goes away...

Saturday, December 31, 2022

Persistence and LOLBins

Grzegorz/@0gtweet tweeted something recently that I thought was fascinating, suggesting that a Registry modification might be considered an LOLBin. What he shared was pretty interesting, so I tried it out.

First, the Registry modification:

reg add "HKLM\System\CurrentControlSet\Control\Terminal Server\Utilities\query" /v LOLBin /t REG_MULTI_SZ /d 0\01\0LOLBin\0calc.exe

Then the command to launch calc.exe:

query LOLBin

Now, I've tried this on a Windows 10 system and it works great, even though Terminal Services isn't actually running on this system. Running just the "query" command on both Windows 10 and Windows 11 systems (neither with Terminal Services running) results in the same output on both:

Invalid parameter(s)

Running the "query" command with different parameters (i.e., "process", "user", etc.) proxies that command to the appropriate entry based on the value in the Registry, as illustrated in figure 1.

Fig 1: query key values

As such, running "query user" runs quser.exe, and you see the same output as if you simply ran "quser". 
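On the DFIR side, the value data written by the 'reg add' command above is a REG_MULTI_SZ. Here's a minimal sketch of decoding the raw value data, assuming the standard on-disk encoding (UTF-16LE strings, NUL-separated, double-NUL terminated); the sample bytes are constructed to mirror the command above, and the meaning of the individual elements is my reading of the observed behavior, not documented:

```python
def parse_multi_sz(raw):
    """Decode raw REG_MULTI_SZ bytes into a list of strings.

    Assumes UTF-16LE, NUL-separated, double-NUL terminated data,
    which is the standard registry multi-string encoding.
    """
    text = raw.decode("utf-16-le").rstrip("\x00")
    return text.split("\x00") if text else []

# Bytes mirroring: reg add ... /t REG_MULTI_SZ /d 0\01\0LOLBin\0calc.exe
raw = "0\x001\x00LOLBin\x00calc.exe\x00\x00".encode("utf-16-le")
entries = parse_multi_sz(raw)
# entries -> ['0', '1', 'LOLBin', 'calc.exe']; the last element is the
# program that "query LOLBin" proxies to.
```

A RegRipper-style check would walk every value under the Utilities subkeys and flag any decoded entry whose final element isn't one of the expected built-in utilities.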

Note that the Utilities key has two other subkeys, in addition to "query"; "change" and "reset", as illustrated in fig. 2.

Fig. 2: Utilities subkeys

So, I thought, what if I change the key path from "query", and make the same modification (via the 'reg add' command above) to the "change" subkey...would that have the same effect? Well, I tried it with an elevated command prompt, for both the "change" and "reset" subkeys, and got "Access is denied." both times. Okay, so even at an Admin level, we can only use the "query" subkey, anyway.

So what?
Part of what makes this particular persistence so insidious (IMHO) is how it can be launched. When we use the Run keys (or a Scheduled Task, or a Windows service) for persistence, an analyst may not have any trouble seeing the launch mechanism for the program/malware. In fact, even outside of Registry analysis, some SOC consoles will allow the analyst to see notable events, and even provide enough process lineage to allow the analyst to deduce that a notable event occurred.

To Grzegorz's point, this is an interesting LOLBin, because query.exe exists on the system by default. IMHO, this is important because when I started learning computers over 40 yrs ago (yes, circa 1982), all we had was the command line; as such, understanding which commands were available on the system, as well as things like STDOUT, STDERR, file redirection, etc., were all just part of what we learned. Now, 40 yrs later, we have entire generations of analysts (SOC, DFIR, etc.) who "grew up" without ever touching a command line. So, when I saw Grzegorz's tweet, the first thing I did was go to the command prompt and type "query", and saw the response listed above. Easy peasy...but how many analysts do that? What's going to happen if a SOC analyst sees telemetry for "query user"? Or, what happens if a DFIR analyst sees a Prefetch file for query.exe? What assumptions will they make, and how will those assumptions drive the rest of their analysis and response?

What is a way to use this? Well, let's say you gain access to a system...and this is just hypothetical...and run a script that enables RDP (if it's not already running), enables StickyKeys, writes a Trojan to an alternate data stream (thanks to Dr. Hadi for some awesome research!!) and then creates a value beneath the "query" key for the Trojan. That way, if your activity on other systems gets discovered, you have a way back into the infrastructure that doesn't require authentication. Connect to the system via the Remote Desktop Client, access StickyKeys so that you get a System-level command prompt, and you type "query LOLBin". Boom, you're back in!

As you might expect, I did write a RegRipper plugin for this persistence mechanism, the output of which makes it pretty straightforward to see any changes that may have been made. For example, "normal" output looks like the following:

utilities v.20221231
(System) Get TS Utilities subkey values
Category: persistence - T1546

ControlSet001\Control\Terminal Server\Utilities\change
LastWrite time: 2018-04-12 09:20:04Z
logon           0 1 LOGON chglogon.exe
port            0 1 PORT chgport.exe
user            0 1 USER chgusr.exe
winsta          1 WINSTA chglogon.exe

ControlSet001\Control\Terminal Server\Utilities\query
LastWrite time: 2018-04-12 09:20:04Z
appserver       0 2 TERMSERVER qappsrv.exe
process         0 1 PROCESS qprocess.exe
session         0 1 SESSION qwinsta.exe
user            0 1 USER quser.exe
winsta          1 WINSTA qwinsta.exe

ControlSet001\Control\Terminal Server\Utilities\reset
LastWrite time: 2018-04-12 09:20:04Z
session         0 1 SESSION rwinsta.exe
winsta          1 WINSTA rwinsta.exe

Analysis Tip: The "query" subkey beneath "\Terminal Server\Utilities" can be used for persistence. Look for unusual value names.
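One way to put the Analysis Tip into practice is to diff the value names you find against the defaults shown in the plugin output above. A hedged Python sketch (the function and set names are mine, not RegRipper's):

```python
# Hunting sketch: compare the value names found beneath the
# Utilities\query subkey against the default set shown in the RegRipper
# output above. Any extra value name is a candidate persistence entry.
DEFAULT_QUERY_VALUES = {"appserver", "process", "session", "user", "winsta"}

def find_suspect_values(observed_value_names):
    """Return value names not present in the known-default set."""
    return sorted(set(n.lower() for n in observed_value_names) - DEFAULT_QUERY_VALUES)
```

Run against a hive where the "LOLBin" value from earlier in this post had been added, that value name would stand out immediately.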


Keeping Grounded

As 2022 comes to a close, I reflect back over the past year, and the previous years that have gone before. I know we find it fascinating to hear "experts" make predictions for the future, but I tend to believe that there's more value in reflecting on and learning from the past.

Years ago, I remember hearing about something in legal circles referred to as "the CSI effect". In short, the unrealistic portrayal of "forensics" on TV shows had influenced public opinion. People would watch an hour-long crime drama and what they saw set the expectation in their minds of what "forensics" should be, and this unrealistic expectation made it difficult for prosecutors to convince some juries of their evidence.

Over the holiday season, a "bomb cyclone" across the US combined with the number of folks wanting to travel to cause travel delays with the airlines, as one might expect. Planes needed to be deiced, but at some locations, travel was simply impossible. However, one airline in particular experienced heavier than usual delays and cancellations, to the point where the Transportation Secretary took notice. For several days, the evening national news covered this story, focusing in on the failures of the airline, and how stranded passengers were standing in long lines just to seek assistance from the airline's customer service. As each day went by and media reports highlighted how the cascading failures were snowballing and impacting travelers, all of this served to create a sense of negativity toward the airline management. Everyone I spoke with over the holidays had the same negative perspective of the airline's management.

On the morning of 28 Dec 2022, I saw the following post on LinkedIn:

Erin's message served as a stark reminder that there's often more to the story, that regardless of what we see being reported in the media, there are often stories that are not covered and reported, and that do not make it into the public eye. News outlets have a limited amount of time to cover a hand-picked menu of events of the day, so we have to be conscious of "collection bias", and ask whether what we're seeing and hearing is playing into a narrative that we assume is correct.

The point here is to remain grounded, and as we roll over into the New Year, this is a good opportunity to make a resolution to remain grounded, and to seek out accountability partners and mentors to help us remain grounded. Don't be so focused on the negative aspects of an event that we lose sight of the positive things that happen, and that the folks who make those positive things happen need our support more than someone we believe is to blame needs our anger. 

Sunday, December 25, 2022

Why I love RegRipper

Yes, yes, you're probably thinking, "you wrote it, dude", and while that's true, that's not the reason why I really love RegRipper. Yes, it's my "baby", but there's so much more to it than that. For me, it's about flexibility and utility. At the beginning of 2020, there was an issue with the core Perl module that RegRipper is built on...all of the time stamps were coming back as all zeros. So, I tracked down the individual line of code in the specific module, and changed it...then recompiled the EXEs and updated the Github repo. Boom. Done. I've written plugins during investigations, based on new things I found, and I've turned around working plugins in under an hour for folks who've reached out with a concise request and sample data. When I've seen something on social media, or something as a result of engaging in a CTF, I can tweak RegRipper; add a plugin, add capability, extend current functionality, etc. Updates are pretty easy. Yes, yes...I know what you're going to say..."...but you wrote it." Yes, I did...but more importantly, I'm passionate about it. I see far too few folks in the industry who know anything about the Registry, so when I see something on social media, I'll try to imagine how what's talked about could be used maliciously, and write a plugin.

And I'm not the only one writing plugins. Over the past few months, some folks have reached out with new plugins, updates, fixes, etc. I even had an exchange with someone the other day that resulted in them submitting a plugin to the repo. Even if you don't know Perl (a lot of folks just copy-paste), getting a new plugin is as easy as sending a clear, concise description of what you're looking for, and some sample data.

Not long ago, a friend asked me about JSON output for the plugins, so I've started a project to create JSON-output versions of the plugins where it makes sense to do so. The first was for the AppCompatCache...I still have a couple of updates to do on what information appears in the output, but the basic format is there. Here's an excerpt of what that output currently looks like:

{
  "pluginname": "appcompatcache_json",
  "description": "query\\parse the appcompatcache\\shimcache data source",
  "key": "ControlSet001\\Control\\Session Manager",
  "value": "AppCompatCache",
  "LastWrite Time": "2019-02-15 14:01:26Z",
  "members": [
    { "value": "C:\\Program Files\\Puppet Labs\\Puppet\\bin\\run_facter_interactive.bat",
      "data": "2016-04-25 20:19:03" },
    { "value": "C:\\Windows\\System32\\FodHelper.exe",
      "data": "2018-04-11 23:34:32" },
    { "value": "C:\\Windows\\system32\\regsvr32.exe",
      "data": "2018-04-11 23:34:34" }
  ]
}

Yeah, I have some ideas as to how to align this output with other tools, and once I settle on the basic format, I can continue creating plugins where it makes sense to do so.
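To illustrate why JSON output is useful downstream, here's a hedged sketch of a consumer; the field names mirror the appcompatcache_json excerpt above, but since the format is still settling, treat the schema (and the helper function name) as assumptions of mine:

```python
import json

# Hypothetical consumer of a JSON-output RegRipper plugin; each "members"
# entry becomes a (timestamp, source, description) timeline row.
excerpt = '''
{
  "pluginname": "appcompatcache_json",
  "value": "AppCompatCache",
  "members": [
    { "value": "C:\\\\Windows\\\\System32\\\\FodHelper.exe",
      "data": "2018-04-11 23:34:32" },
    { "value": "C:\\\\Windows\\\\system32\\\\regsvr32.exe",
      "data": "2018-04-11 23:34:34" }
  ]
}
'''

def to_timeline_rows(doc):
    d = json.loads(doc)
    return [(m["data"], d["pluginname"], m["value"]) for m in d["members"]]
```

The point is that once the output is structured, feeding it into a timeline or a SIEM becomes a few lines of glue code rather than a parsing exercise.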

Recently, we've been seeing instances of "Scheduled Task abuse", specifically of the RegIdleBackup task. To be clear, while some have been seeing it recently, it's not "new"...Fox IT, part of the NCC Group, reported on it last year. It's also been covered more recently here in reference to GraceWire/FlawedGrace. Of all the reporting that's out there on this issue, what hasn't been addressed is how the modification to the task is performed; is the task XML file being modified directly, or is the API being used, such that the change would also be reflected in the Task subkey entry in the Registry?

From the RegIdleBackup task XML file:

<Actions Context="LocalSystem">
    <ComHandler>
        <ClassId>{ca767aa8-9157-4604-b64b-40747123d5f2}</ClassId>
    </ComHandler>
</Actions>

So, we see the COM handler listed in the XML file, and that's the same CLSID that's listed in the Task entry in the Software hive. About 2 years ago, I updated a plugin that parsed the Scheduled Tasks from the Software hive, so I went back to that plugin recently and added additional code to look up CLSIDs within the Software hive and report the DLL they point to; here's what the output looks like now (from a sample hive):

Path: \Microsoft\Windows\Registry\RegIdleBackup
URI : Microsoft\Windows\Registry\RegIdleBackup
Task Reg Time : 2020-09-22 14:34:08Z
Task Last Run : 2022-12-11 16:17:11Z
Task Completed: 2022-12-11 16:17:11Z
User   : LocalSystem
Action : {ca767aa8-9157-4604-b64b-40747123d5f2} (%SystemRoot%\System32\regidle.dll)

The code I added does the look up regardless of whether the CLSID is the only entry listed in the Action field, or if arguments are provided, as well. For example:

Path: \Microsoft\Windows\DeviceDirectoryClient\HandleWnsCommand
URI : \Microsoft\Windows\DeviceDirectoryClient\HandleWnsCommand
Task Reg Time : 2020-09-27 14:34:08Z
User   : System
Action : {ae31b729-d5fd-401e-af42-784074835afe} -WnsCommand (%systemroot%\system32\DeviceDirectoryClient.dll)

The plugin also reports if it's unable to locate the CLSID within the Software hive.
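The lookup itself is straightforward to model; here's an illustrative Python sketch (the dict stands in for the Software hive's CLSID entries, populated with the RegIdleBackup pair from the output above, and the function name is mine, not the plugin's):

```python
# Sketch of the CLSID -> DLL lookup the updated plugin performs; the dict
# stands in for the Software hive's CLSID entries. The RegIdleBackup
# CLSID/DLL pair below comes from the plugin output shown above.
CLSID_INPROC = {
    "{ca767aa8-9157-4604-b64b-40747123d5f2}": r"%SystemRoot%\System32\regidle.dll",
}

def resolve_action(action_field):
    """Split a task Action field into CLSID, optional args, and the DLL."""
    parts = action_field.split(None, 1)
    clsid = parts[0].lower()
    args = parts[1] if len(parts) > 1 else ""
    dll = CLSID_INPROC.get(clsid, "<CLSID not found in Software hive>")
    return clsid, args, dll
```

As in the plugin output, the lookup works whether the Action field is a bare CLSID or a CLSID followed by arguments, and reports when the CLSID can't be found.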

So, what this means is that the next time I see a system that's been subject to an attack that includes Scheduled Task abuse, I can check to see if the issue impacts the Software hive as well as the Task XML file, and get a better understanding of the attack, beyond what's available in open reporting.

Finally, I can quickly create plugins based on testing scenarios; for some things, like parsing unallocated space within hive files, this is great for doing tool comparison and validation. However, when it comes to other aspects of DFIR, like extracting and parsing specific data from the Registry, there really aren't many other tools to use for comparison.

Interestingly enough, if you're interested in running something on a live system, I ran across reg_hunter a bit ago; it seems to provide a good bit of the same functionality provided by RegRipper, including looking for null bytes, RLO control characters in key and value names, looking for executables in value data, etc.

Monday, November 28, 2022

Post Compilation

Investigating Windows Systems
It's the time of year again when folks are looking for stocking stuffers for the DFIR nerd in their lives, and my recommendation is a copy of Investigating Windows Systems! The form factor for the book makes it a great stocking stuffer, and the content is well worth it!

Yes, I know that book was published in 2018, but when I set out to write the book, I wanted to do something different from the recipe of most DFIR books to that point, including my own. I wanted to write something that addressed the analysis process, so the book is full of pivot and decision points, etc. So, while artifacts may change over time...some come and go, others are new or change in format over time, and still others appear suddenly...it's the analysis process that doesn't change.

For example, chapter 4 addresses the analysis of a compromised web server, one that includes a memory dump. One of the issues I've run into over the past couple of years, since well after the book was published, is that there are more than a few DFIR analysts who seem to believe that running a text search of a memory dump for IP addresses is "sufficient"; it's not. IP addresses are not often stored in ASCII format; as such, you'd likely want to use Volatility and bulk_extractor to locate the specific structures that include the binary representation of the IP address. As each tool looks for different structures, I recommend using them both...just look at ch 4 of IWS and see how different the information is between the two tools.
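To see why the ASCII search fails, consider how an IPv4 address is actually stored in most network structures...as four packed bytes, not a dotted-quad string:

```python
import socket

# An IPv4 address in a memory structure is typically four packed bytes,
# not the dotted-quad ASCII string; a plain text search for the string
# form misses those occurrences entirely.
ip = "10.2.3.4"
packed = socket.inet_aton(ip)       # four raw bytes
ascii_form = ip.encode("ascii")     # b'10.2.3.4' -- what a text search finds

assert packed == bytes([10, 2, 3, 4])
assert packed != ascii_form
```

Tools like Volatility and bulk_extractor look for the structures that contain the packed form, which is why they surface addresses a string search never will.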

There's a lot of really good content in the book, such as "file system tunneling", covered beginning on pg 101. 

While some of the images used as the basis of analysis in the book are no longer available online, several are still available, and the overall analysis process applies regardless of the image.

Speaking of analysis processes, I ran across this blog post recently, and it touched on a couple of very important concepts, particularly:

This highlights the risk of interpreting single artefacts (such as an event record, MFT entry, etc) in isolation, as it doesn't provide any context and is (potentially) subject to misinterpretation.

Exactly! When we view artifacts in isolation, we're missing critical factors such as context, and in a great many instances, grossly misinterpreting the "evidence". This misinterpretation happens a lot more than we'd like to think, not due to a lack of visibility, but due to it simply being the DFIR culture.

Another profound statement from the author was:

...instead of fumbling and guessing, I reached out to @randomaccess and started discussing plausible scenarios.

Again...exactly! Don't guess. Don't spackle gaps in analysis over with assumption and speculation. It's okay to fumble, as long as you learn from it. However, most importantly, there's no shame in asking for help. In fact, it's quite the opposite. Don't listen to that small voice inside of you that's giving you excuses, like, "...oh, they're too busy...", or "...I could never ask them...". Instead, listen to the roaring Gunnery Sergeant Hartmann (from "Full Metal Jacket") who's screaming at you to reach out and ask someone, Private Joker!!

For me, it's very validating to see others within the industry advocating the same approach I've been sharing for several years. Cyber defense is a team sport, folks, and going it alone just means that we, and our customers, are going to come up short.

Tools for Memory Analysis
In addition to the tools for memory analysis mentioned earlier in this blog post, several others have popped up over time. For example, here are two:


Now, I haven't tried either one of these tools, but they seem pretty great. 

Additional Resources:
CyberHacktics - Win10 Memory Analysis

Proactive Defense
"Proactive defense" means moving "left of bang", taking steps to inhibit or even obviate the threat actor, before or shortly after they gain initial access. For example, TheHackerNews recently reported on the Black Basta Ransomware gang, indicating that one means of gaining access is to coerce or trick a user into mounting a disk image (IMG) file and launching the VBS script embedded within it, to initially infect the system with Qakbot. Many have seen a similar technique to infect systems with Qakbot, sending ISO files with embedded LNK files. 

So, think about it...do your users require the ability to mount disk image files simply by double-clicking them? If not, consider taking these steps to address this issue; doing so will still allow your users to programmatically access disk image files, but will prevent them from mounting them by double-clicking, or by right-clicking and choosing "Mount" from the context menu. This quite literally cuts the head off of the attack, stopping the threat actor in their tracks. 

Taking proactive security steps...creating an accurate asset inventory (of both systems and applications), reducing your attack surface, and configuring systems beyond the default...means that you're going to have higher fidelity alerts, with greater context, which in turn helps alleviate alert fatigue for your SOC analysts. 

Open Reporting
Lots of us pursue/review open reporting when it comes to researching issues. I've done this more than a few times, searching for unique terms I find (i.e., Registry value names, etc.), first doing a wide search, then narrowing it a bit to try to find more specific information. 

However, I strongly caveat this approach, in part due to open reporting like this write-up on Raspberry Robin, specifically due to the section on Persistence. That section starts with (emphasis added by me):

Raspberry Robin installs itself into a registry “run” key in the Windows user’s hive, for example:

However, the key pointed to is "software\microsoft\windows\currentversion\runonce\". The Run key is very different from the RunOnce key, particularly regarding how it's handled by the OS. 

Within that section are two images, neither of which is numbered. The caption for the second image reads:

Raspberry Robin persistence process following an initial infection and running at each machine boot

Remember where I bolded "user's hive" above? The simple fact that persistence is written to a user's hive means that the process starts the next time that user logs in, not "at each machine boot".

Open reporting can be very valuable during analysis, and can provide insight that an analyst may not have otherwise. However, open reporting does need to be reviewed with a critical eye, and not simply taken at face value.

Sunday, November 27, 2022

Challenge 7 Write-up

Dr. Ali Hadi recently posted another challenge image, this one (#7) being a lot closer to a real-world challenge than a lot of the CTFs I've seen over the years. What I mean by that is that in the 22+ years I've done DFIR work, I've never had a customer pose more than 3 to 5 questions that they wanted answered, certainly not 51. And, I've never had a customer ask me for the volume serial number in the image. Never. So, getting a challenge that had a fairly simple and straightforward "ask" (i.e., something bad may have happened, what was it and when??) was pretty close to real-world. 

I will say that there have been more than a few times where, following the answers to those questions, customers would ask additional questions...but again, not 37 questions, not 51 questions (like we see in some CTFs). And for the most part, the questions were the same regardless of the customer; once whatever it was was identified, questions of risk and reporting would come up, was any data taken, and if so, what data?

I worked the case from my perspective, and as promised, posted my findings, including my case notes and timeline excerpts. I also added a timeline overlay, as well as MITRE ATT&CK mappings (with observables) for the "case".

Jiri Vinopal posted his findings in this tweet thread; I saw the first tweet with the spoiler warning, and purposely did not pursue the rest of the thread until I'd completed my analysis and posted my findings. Once I posted my findings and went back to the thread, I saw this comment:

"...but it could be Windows server prefetching could be disabled..."

True, the image could be of a Windows server, but that's pretty trivial to check, as illustrated in figure 1.

Fig 1: RRPro plugin output

Checking to see if Prefetching is enabled is pretty straightforward, as well, as illustrated in figure 2.

Fig 2: Prefetcher Settings via System Hive

If prefetching were disabled, one would think that the *.pf files would simply not be created, rather than having several of them deleted following the installation of the malicious Windows service. The Windows Registry is a hierarchical database that includes, in part, configuration information for the Windows OS and applications, replacing the myriad configuration and ini files from previous versions of the OS. A lot of what's in the Registry controls various aspects of the Windows eco-system, including Prefetching.

In addition to Jiri's write-up/tweet thread of analysis, Ali Alwashali posted a write-up of analysis, as well. If you've given the challenge a shot, or think you might be interested in pursuing a career in DFIR work, be sure to take a look at the different approaches, give them some thought, and make comments or ask questions.

Remediations and Detections
Jiri shared some remediation steps, as well as some IOCs, which I thought were a great addition to the write-up. These are always good to share from a case; I included the SysInternals.exe hash extracted from the AmCache.hve file, along with a link to the VT page, in my case notes.

What are some detections or threat hunting pivot points we can create from these findings? For many orgs, looking for new Windows service installations via detections or hunting will simply be too noisy, but monitoring for modifications to the /etc/hosts file might be something valuable, not just as a detection, but for hunting and for DFIR work.
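A minimal sketch of what hosts-file monitoring might look like as a hunting check (the baseline content here is hypothetical, and a mismatch is a pivot point, not proof of evil):

```python
import hashlib

# Hunting sketch: hash the hosts file content and compare to a recorded
# baseline; any mismatch warrants a closer look at what was added.
def hosts_changed(content: bytes, baseline_sha256: str) -> bool:
    return hashlib.sha256(content).hexdigest() != baseline_sha256

# Hypothetical baseline recorded when the system was known-good.
baseline = hashlib.sha256(b"# default hosts\n").hexdigest()
```

The same comparison works for DFIR after the fact: hash the hosts file from the image and compare it against a known-good copy from a gold build.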

Has anyone considered writing Yara rules for the malware found during their investigation of this case? Are there any other detections you can think of, for either EDR or a SIEM?

Lessons Learned
One of the things I really liked about this particular challenge is that, while the incident occurred within a "compressed" timeframe, it did provide several data sources that allowed us to illustrate where various artifacts fit within a "program execution" constellation. If you look at the various artifacts...UserAssist, BAM key, and even ShimCache and AmCache artifacts...they're all separated in time, but come together to build out an overall picture of what happened on the system. By looking at the artifacts together, in a constellation or in a timeline, we can see the development and progression of the incident, and then by adding in malware RE, the additional context and detail will build out an even more complete picture.
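The "constellation" idea can be made concrete with a trivial sketch; the artifact rows below are made up for illustration, but the point is that sorting per-artifact timestamps into one view is what reveals the progression:

```python
# Illustrative only: individual artifacts (UserAssist, BAM, AmCache, etc.)
# each carry one timestamp; sorting them into a single timeline turns
# isolated data points into a constellation. Timestamps are invented.
events = [
    ("2022-11-10 14:05:01", "AmCache", "SysInternals.exe hash recorded"),
    ("2022-11-10 14:03:22", "UserAssist", "user launched SysInternals.exe"),
    ("2022-11-10 14:03:25", "BAM", "SysInternals.exe execution recorded"),
]

def build_timeline(rows):
    # ISO-style "YYYY-MM-DD HH:MM:SS" strings sort correctly as text.
    return sorted(rows)
```

Viewed in isolation, each of those rows is ambiguous; viewed together, they show program execution developing over time.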

A couple of thoughts...

DFIR work is a team effort. Unfortunately, over the years, the "culture" of DFIR has been one that has developed into a bit of a "lone wolf" mentality. We all have different skill sets, to different degrees, as well as different perspectives, and bringing those to bear is the key to truly successful work. The best (and I mean, THE BEST) DFIR work I've done during my time in the industry has been when I've worked as part of team that's come together, leveraging specific skill sets to truly deliver high-quality analysis.

Thanks to Dr. Hadi for providing this challenge, and thanks to Jiri for stepping up and sharing his analysis!

Sunday, November 20, 2022

Thoughts on Teaching Digital Forensics

When I first started writing books, my "recipe" for how to present the information followed the same structure I saw in other books at the time. While I was writing books to provide content along the lines of what I wanted to see, essentially filling in the gaps I saw in books on DFIR for Windows systems, I was following the same formula other books had used to that point. At the time, it made sense to do this, in order to spur adoption.

Later, when I sat down to write Investigating Windows Systems, I made a concerted effort to take a different approach. What I did this time was present a walk-through of various investigations using images available for download on the Internet (over time, some of them were no longer available). I started with the goals (where all investigations must start), and shared the process, including analysis decisions and pivot points, throughout the entire process.

Okay, what does this have to do with teaching? Well, a friend recently reached out and asked me to review a course that had been put together, and what I immediately noticed was that the course structure followed the same formula we've seen in the industry for years...a one-dimensional presentation of single artifacts, one after another, without tying them all together. In fact, it seems that many materials simply leave it to the analyst to figure out how to extrapolate a process out of the "building blocks" they're provided. IMHO, this is why we see a great many analysts manually constructing timelines in Excel, after an investigation is "complete", rather than building one from the very beginning to facilitate and expedite analysis, validation, etc.

Something else I've seen is that some courses and presentations address data sources and artifacts one-dimensionally. We see this not only in courses, but also in other presented material, because this is how many analysts learn, from the beginning. Ultimately, this approach leads to misinterpretation of data sources (ShimCache, anyone??) and misuse of artifact categories. Joe Slowik (Twitter, LinkedIn) hit the nail squarely on the head when he referred to IoCs as "composite objects" (the PDF should be required reading). 

How something is taught also helps address misconceptions; for example, I've been saying for some time now that we're doing ourselves and the community a disservice when we refer to Windows Event Log records solely by their event ID; I'm not the only one to say this, Joachim Metz has said it, as well. The point is that event IDs, even within a single Windows Event Log, are NOT unique. However, it's this reductionist approach that also leads to misinterpretation of data sources; we don't feel that we can remember all of the nuances of different data sources, and rather than looking to additional data sources on which to build artifact constellations and verification, we reduce the data source to the point where it's easiest to understand.

So, we need a new approach to teaching this topic. Okay, what would this approach look like? First, it would start off with core concepts of validation (through artifact constellations), and case notes. These would be consistent throughout, and the grade for the final project would be heavily based on the existence of case notes.

This approach is similar to the Dynamics mechanical engineering course I took during my undergraduate studies. I was in the EE program, and we all had to "cross-pollinate" with both mechanical and civil engineering. The professor for the Dynamics course would give points for following the correct process, even if one variable was left out. What I learned from this was that trying to memorize discrete facts didn't work as well as following a process; it was more correct to follow the process, even if one angular momentum variable was left out of the equation. 

The progression of this "new" course would include addressing, for example, artifact categories; you might start with "process execution" because it's a popular one. You might build on something that persists via a Run key value...the reason for this will become apparent shortly. Start with Prefetch files, and be sure to include outlier topics like those discussed by Dr Ali Hadi. Be sure to populate and maintain case notes, and create a timeline from the file system and Prefetch file metadata (embedded time stamps), doing this from the very beginning.

Next, go to Windows Event Logs. If the system has Sysmon installed, or if Process Tracking is enabled (along with the Registry mod that enables full command lines) in the Security Event Log, add those records to the timeline. As the executable is being launched from a Run key (remember, we chose such an entry for a reason, from above), be sure to add pertinent records from the Microsoft-Windows-Shell-Core%4Operational.evtx Event Log. Also look for WER or "Application Popup" (or other errors) that may be available from the Application Event Log. Also look for indications of malware detections in logs associated with AV and other monitoring tools (i.e., SentinelOne, Windows Defender, Sophos, WebRoot, etc.). Add these to the timeline.

Moving on to the Registry, we clearly have some significant opportunities here, as well. For example, looking at the ShimCache and AmCache.hve entries for the EXE (if available), we have an opportunity to clearly demonstrate the true nature and value of these artifacts, correcting the misinterpretations we so often see when artifacts are treated in isolation. We also need to bring in additional resources and Registry keys, such as the StartupApproved subkeys, etc.

We can then include additional artifacts like the user's ActivitiesCache.db, SRUM.db, etc., artifacts, but the overall concept here is to change the way we're teaching, and ultimately doing DF work. Start with a foundation that requires case notes and artifact constellations, along with an understanding of how this approach leads and applies to validation. Change the approach by emphasizing first principles from the very beginning, and keeping them part of the education process throughout, so that it becomes part of the DFIR culture.

Monday, November 14, 2022

RegRipper Value Proposition

I recently posted to LinkedIn, asking my network for their input regarding the value proposition of RegRipper; specifically, how is RegRipper v3.0 of "value" to them, how does it enhance their work? I did this because I really wanted to get the perspective of folks who use RegRipper; what I do with RegRipper could be referred to as both "maintain" and "abuse". Just kidding, but the point is that I know, beyond the shadow of a doubt, that I'm not a "typical user" of RegRipper...and that's the perspective I was looking for.

Unfortunately, things didn't go the way I'd hoped. The direct question of "what is the value proposition of RegRipper v3.0" was not directly answered. Other ideas came in, but what I wasn't getting was the perspective of folks who use the tool. As such, I thought I'd try something a little different...I thought I'd share my perspective.

From my perspective, and based on the original intent of RegRipper when it was first released in 2008, the value proposition for RegRipper consists of:

Development of Intrusion Intel
When an analyst finds something new, either through research, review of open reporting, or through their investigative process, they can write a plugin to address the finding, and include references, statements/comments, etc.

For example, several years ago, I read about Project Taj Mahal, and found it fascinating how simple it was to modify the Registry to "tell" printers to not delete copies of printed jobs. This provides an investigator the opportunity to detect a potential insider threat, just as much as it provides a threat actor with a means of data collection. I wrote a plugin for it, and now, I can run it either individually, or just have it run against every investigation, automatically.

Extending Capabilities
Writing a plugin means that the capabilities developed by one analyst are now available to all analysts, without every analyst having to experience the same investigation. Keep in mind, as well, that not all analysts will approach investigations the same way, so one analyst may find something of value that another analyst might miss, simply because their perspectives and backgrounds are different.

Over the years, a number of folks in the community have written plugins, but not all of them have opted to include those plugins in the Github repo. If they had, another analyst, at another organization, could run the plugin without ever having to first go through an investigation that includes those specific artifacts. The same is true within a team; one analyst could write a plugin, and all other analysts on the team would have access to that capability, without having to have that analyst there with them, even if that analyst were on PTO, parental leave, or had left the company. 

As a bit of a side note, writing things like RegRipper plugins or Yara rules provides a great opportunity when it comes to things like performance evaluations, KPIs, etc.

Retention of "Corporate Knowledge"
A plugin can be written and documented (comments, etc.) such that it provides more than just the basic information about the finding; as such, the "corporate knowledge" (references, context, etc.) is retained and available to analysts, even when the plugin author is unavailable. The plugin can be modified and maintained across versions of Windows, if needed.

All of these value propositions lead to greater efficiency, effectiveness, and accuracy for analysts, providing greater context, letting them get to actual analysis faster, and reducing overall costs. 

Now, there are other "value propositions" for me, but they're unique to me. For example, all I need to do is consult the CPAN page for the base module, and I can create a tool (or set of tools) that I can exploit during testing. I've also modified the base module, as needed, to provide additional information that can be used for various purposes.

I'm still very interested to understand the value proposition of RegRipper to other analysts.

Monday, October 31, 2022

Testing Registry Modification Scenarios

After reading some of the various open reports regarding how malware or threat actors "use" the Registry, manipulating it to meet their needs, I wanted to see what the effects or impacts of these actions might "look like" from a dead-box, DFIR perspective, looking solely at the Registry. I wanted to start with an approach similar to what I experienced during my time in IR, particularly the early days, before EDR, and before things like Sysmon or enabling Process Tracking in the Security Event Log. That seemed appropriate, given what appears to be the sheer number of organizations with limited visibility into their infrastructures. For those orgs that have deployed Sysmon, the current version (v14.1) has three event IDs (12, 13, and 14) that pertain to the Registry.

The first scenario I looked at was from this Avast write-up on Raspberry Robin's Roshtyak component; in the section titled "Indirect registry writes", the article describes the persistence mechanism of renaming the RunOnce key, adding a value, then renaming the key back to "RunOnce", apparently in an effort to avoid rules/filters that look specifically for values being added to the RunOnce key. As most analysts are likely aware, the purpose of the RunOnce key is to launch executables exactly once; when the key is enumerated, each value is read and deleted, and the executable it pointed to is launched. In the past, I've read about malware launched from the RunOnce key that, once executed, re-writes a value to that key, essentially allowing the RunOnce key and the malware together to act as if the malware were launched from the Run key.
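A rule that matches the literal RunOnce key path at write time never fires on this sequence, because the value is written while the key bears a different name. Purely as a sketch (the event format below is made up for illustration, not any particular EDR's schema), here's one way detection logic could replay the renames and attribute the write back to the key's final name:

```python
# Hypothetical event stream illustrating the Roshtyak "indirect write":
# rename RunOnce, write the value under the new name, rename it back.
RUNONCE = r"HKLM\Software\Microsoft\Windows\CurrentVersion\RunOnce"

events = [
    {"op": "rename_key", "old": RUNONCE, "new": RUNONCE + "2"},
    {"op": "set_value", "key": RUNONCE + "2", "name": "Calc", "data": "calc.exe"},
    {"op": "rename_key", "old": RUNONCE + "2", "new": RUNONCE},
]

def attribute_writes(events):
    """Map each set_value to the key's name after all subsequent renames."""
    flagged = []
    for i, ev in enumerate(events):
        if ev["op"] != "set_value":
            continue
        path = ev["key"]
        # replay any renames that occur after this write
        for later in events[i + 1:]:
            if later["op"] == "rename_key" and later["old"].lower() == path.lower():
                path = later["new"]
        flagged.append((path, ev["name"]))
    return flagged

# a naive literal-path rule gets no hits...
naive = [e for e in events if e["op"] == "set_value"
         and e["key"].lower() == RUNONCE.lower()]
print(naive)  # []
# ...but replaying the renames attributes the write back to RunOnce
print(attribute_writes(events))
```

The point isn't this particular code, it's that path-literal detections are brittle against rename games; correlating rename events is one way around that.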

I wanted to perform this testing from a purely dead-box perspective, without using EDR tools or relying on the Windows Event Logs. Depending upon your configuration, you could perhaps look to the Sysmon Event Log, or, if the system had been rebooted, to the Microsoft-Windows-Shell-Core%4Operational.evtx Event Log and Events Ripper to surface unusual executables.

For reference, information on the Registry file format specification can be found here.

The first thing I did was use "reg save" to create a backup of the Software hive. I then renamed the RunOnce key, added a value named "Calc", and renamed the key back to "RunOnce", all via RegEdit. I then closed RegEdit and used "reg save" to create a second copy of the Software hive. I then opened RegEdit, deleted the value, and saved a third copy of the Software hive.

During this process, I did not reboot the system; rather, I 'simulated' a reboot of the system by simply deleting the added value from the RunOnce key. Had the system been rebooted, there would likely be an interesting event record (or two) in the Microsoft-Windows-Shell-Core%4Operational.evtx Event Log.

Finally, I created a specific RegRipper plugin to extract explicit information about the key from the hive file.

First Copy - Software
So, again, the first thing I wanted to do was create a baseline; in this case, based on the structure for the key node itself. 

Fig 1: Software hive, first copy output

Using the API available from the Perl Parse::Win32Registry module, I wrote a RegRipper plugin to assist me in this testing. I wanted to get the offset of the key node; that is, the location within the hive file for the node itself. I also wanted to get both the parsed and raw information for the key node. This way, I could not only see the parsed data from within the structure of the key node itself, but I could also see the raw, binary structure, as well.
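For reference, the fields the plugin reports are laid out in the key node ("nk") structure described in the format specification. Here's a minimal sketch of parsing those fields, in Python rather than the Perl the plugin is actually written in, run against a synthetic record (the offsets follow the spec; the byte string is test data, not a real hive):

```python
import struct

EPOCH_DELTA = 11644473600  # seconds between 1601-01-01 and 1970-01-01

def parse_nk(cell):
    """Parse the fields of interest from a key node ("nk") record."""
    if cell[0:2] != b"nk":
        raise ValueError("not a key node")
    flags, filetime = struct.unpack_from("<HQ", cell, 0x02)
    parent, n_subkeys = struct.unpack_from("<II", cell, 0x10)
    n_values, = struct.unpack_from("<I", cell, 0x24)
    name_len, = struct.unpack_from("<H", cell, 0x48)
    name = cell[0x4C:0x4C + name_len].decode("ascii", "replace")
    last_write = filetime / 10_000_000 - EPOCH_DELTA  # Unix seconds
    return {"flags": flags, "last_write": last_write, "parent": parent,
            "subkeys": n_subkeys, "values": n_values, "name": name}

# build a synthetic nk record: name "RunOnce", 1 value, LastWrite at Unix epoch
cell = bytearray(0x4C + 7)
cell[0:2] = b"nk"
struct.pack_into("<HQ", cell, 0x02, 0x20, EPOCH_DELTA * 10_000_000)
struct.pack_into("<I", cell, 0x24, 1)
struct.pack_into("<H", cell, 0x48, 7)
cell[0x4C:0x4C + 7] = b"RunOnce"

node = parse_nk(bytes(cell))
print(node["name"], node["values"], node["last_write"])  # RunOnce 1 0.0
```

The Parse::Win32Registry module handles all of this for you; the sketch just shows where the parsed fields (name, value count, LastWrite time) live relative to the raw structure.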

Second Copy - Software2
After renaming the RunOnce key, adding a value, and re-renaming the key back to "RunOnce", I saved a second copy of the Software hive, and ran the plugin to retrieve the information illustrated in figure 2.

Fig 2: Plugin output, second copy, Software hive

We can see between figures 1 and 2 that there are no changes to the offset, the location of the key within the hive file itself. In fact, the only changes we do see are the LastWrite time (which is to be expected), and the number of values, which is now set to 1.

Third Copy - Software3
The third copy of the Software hive is where I had deleted the value that had been added. Again, this was intended to simulate rebooting the system, and did not account for the malware adding a reference to itself back to the RunOnce key once it was launched.

Figure 3 illustrates the output of the plugin run against the third copy of the Software hive.

Fig 3: Plugin output, third copy, Software hive

Again, the offset/location of the key node itself hasn't changed, which is to be expected. Deleting the value changes the number of values to "0", and adjusts the key LastWrite time accordingly. 

I then ran a plugin that retrieves deleted keys and values from unallocated space within the hive file against the third copy of the Software hive, opened the output in Notepad++, searched for "calc", and found the output shown in figure 4 below. I could have used regslack, from Jolanta Thomassen (go here to see Jolanta's thesis from 2008), but chose a RegRipper plugin simply because I was already using RegRipper.

Fig 4: output from third copy, Software hive

Unfortunately, value nodes contain neither time stamps, nor a reference back to the original key node (parent key offset) to which they were a member, as described in sections 4.1.1 and 4.1.2 of the Registry file format specification for key nodes; value node structures are described in sections 4.4.1 and 4.4.2. 
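To illustrate what is (and isn't) in a value node, here's a similar sketch against a synthetic "vk" record, again following the format specification. Note that nothing in the structure carries a timestamp or a pointer back to the parent key node:

```python
import struct

def parse_vk(cell):
    """Parse a key value ("vk") record: no timestamp, no parent pointer."""
    if cell[0:2] != b"vk":
        raise ValueError("not a value node")
    name_len, data_size, data_off, data_type, flags = \
        struct.unpack_from("<HIIIH", cell, 0x02)
    name = cell[0x14:0x14 + name_len].decode("ascii", "replace") \
        if name_len else "(Default)"
    return {"name": name, "data_size": data_size, "type": data_type}

# synthetic vk record: REG_SZ value named "Calc"; data size 18 would fit
# "calc.exe" as a null-terminated UTF-16 string
cell = bytearray(0x14 + 4)
cell[0:2] = b"vk"
struct.pack_into("<HIIIH", cell, 0x02, 4, 18, 0, 1, 1)  # REG_SZ = 1
cell[0x14:0x18] = b"Calc"

info = parse_vk(bytes(cell))
print(info)
```

This is why a value recovered from unallocated space, like the one in figure 4, can't on its own be tied to a time or to the key it once belonged to.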

As we can see from this testing, there's not much in the Registry hive file itself that would lead us to believe anything unusual had happened. While we might have an opportunity to see some of this activity via the transaction logs, that would depend a great deal upon how long after the activity the incident was discovered, the amount of usage on the system, etc. It appears that discerning this specific activity would require a combination of malware RE, EDR, Windows Event Log records, etc.

Next, I'll take a look at at least one of the scenarios presented in this Microsoft blog post.

Addendum, 1 Nov: Maxim Suhanov reached out to me about running "yarp-print --deleted" to get a different view of deleted data within the hive, and I found some anomalous results that I simply cannot explain. As a result, I'm going to completely re-run the tests, fully documenting each step, and provide the results again.

Tuesday, October 18, 2022

Data Collection

During IR engagements, like many other analysts, I've seen different means of data exfiltration. During one engagement, the customer stated that they'd "...shut off all of our FTP servers...", but apparently "all" meant something different to them, because the threat actor found an FTP server that hadn't been shut off and used it to first transfer files out of the infrastructure to that server, and then from the server to another location. This approach may have been taken due to the threat actor discovering some modicum of monitoring going on within the infrastructure, and possibly being aware that FTP traffic going to a known IP address would not be flagged as suspicious or malicious.

During another incident, we saw the threat actor archive collected files and move them to an Internet-accessible web server, download the archives from the web server and then delete the archives. In that case, we collected a full image of the system, recovered about a dozen archives from unallocated space, and were able to open them; we'd captured the command line used to archive the files, including the password. As a result, we were able to share with the customer exactly what was taken, and this allowed us to understand a bit more about the threat actor's efforts and movement within the infrastructure.

When I was first writing books, the publisher wanted me to upload manuscripts to their FTP site, and rather than using command line FTP or a particular GUI client utility, they provided instructions for me to connect to their FTP site via Windows Explorer. What I learned from that was that evidence of the connection to the FTP site appeared in my shellbags. Very cool. 

Okay, so those are some ways to get data off of a system; what about data collection? What are some different ways that data can be collected?

Earlier this year, Lina blogged about performing clipboard forensics, which is not something I'd really thought about (not since 2008, at least), as it was not something I'd ever really encountered. MITRE does list the clipboard as a data collection technique, and a bit of research revealed that malware targeting crypto wallets will either grab the contents of the clipboard, or replace the contents with the attacker's own crypto wallet address. 

Perl has a module for interacting with the Windows clipboard, as do other programming languages, such as Python. This makes it easy to interact with the clipboard, either extracting data from it, or 'pasting' data into it. You can view the contents of the clipboard by hitting "Windows Logo + V" on your keyboard.

Fig 1: Clipboard Settings
But, wait...there's more! More recent versions of Windows allow you to not only enable a clipboard history, maintaining multiple items in your clipboard, but also to sync your clipboard across devices! So, if you have a Microsoft account, you can sync the clipboard contents across multiple devices, which is an interesting means of data exfiltration! 

Both of these settings manifest as Registry values, so they can be queried, or even set by threat actors if the user hasn't already done so. For example, a threat actor could enable the clipboard history on a system, and then later harvest whatever the user had copied. 

Digging into Lina's blog post led me to this ThinkDFIR post on "Clippy history", and just like it says there, once the clipboard history is enabled, the %AppData%\Microsoft\Windows\Clipboard folder is created. Much like the ThinkDFIR post describes, if the user pins an item in their clipboard, additional data is created in the Clipboard folder, including JSON files that contain time stamps, all of which can be used by forensic analysts. The files that contain the saved data are encrypted, however; per the comments on the ThinkDFIR post, there does seem to be a tool available for decrypting and viewing the contents, but I haven't tried it.

Suffice it to say that while the system is active, it's possible to have malware running via an autostart location or as a scheduled task that retrieves the contents of the clipboard as a data collection technique. Lina pointed out another means of performing clipboard "forensics"; beyond memory analysis, parsing the user's ActivitiesCache.db file may offer some possibilities.

Additional Resources
Cellebrite - Syncing Across Devices - Logging into multiple systems using the same Microsoft ID
ForensicFocus - An Investigator's Goldmine

Fig 2: Printer Properties Dialog
Printers And KeepPrintedJobs
Another means for collecting data to gain insight into an organization is by setting printers to retain copies of print jobs, rather than deleting them once the job is complete. This is a particularly insidious means of data collection, because it's not something admins, analysts, or responders usually check for; even for those of us who've been in the industry for some time, the general understanding is that print jobs are deleted by default once they've completed.

So this can be used for data collection, but has it been? According to MITRE, it has, by a group referred to as "TajMahal". This group has been observed using the Modify Registry technique as a means of data collection, specifically setting the KeepPrintedJobs attribute via the Registry. The printer properties dialog is visible in figure 2, with the KeepPrintedJobs attribute setting highlighted. 
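The checkbox in figure 2 maps to a bit in the printer's Attributes value beneath the Print\Printers key in the Registry; in winspool.h, PRINTER_ATTRIBUTE_KEEPPRINTEDJOBS is 0x100. A triage sketch (the printer names and the other attribute bits below are made up for illustration, standing in for data you'd pull from the hive or "reg query"):

```python
# PRINTER_ATTRIBUTE_KEEPPRINTEDJOBS, from winspool.h
KEEP_PRINTED_JOBS = 0x0100

# printer name -> Attributes DWORD; sample values, not from a live system
printers = {
    "OfficePrinter": 0x0840,
    "SuspectPrinter": 0x0840 | KEEP_PRINTED_JOBS,  # retention bit set
}

# flag any printer configured to keep copies of completed jobs
flagged = [name for name, attrs in printers.items()
           if attrs & KEEP_PRINTED_JOBS]
print(flagged)  # ['SuspectPrinter']
```

A check like this is cheap to fold into a collection script or a RegRipper-style plugin, which is exactly why this technique shouldn't stay "something nobody looks for".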

While there isn't a great deal of detail around the TajMahal group's activities, Kaspersky mentioned their capability for stealing data in this manner in April 2019. The story was also picked up by Wired.com, SecureList, and Schneier on Security.

Fig 3: Print Job Files
The spool (.spl) and shadow (.shd) files are retained in the C:\Windows\System32\spool\PRINTERS folder, as illustrated in figure 3. The .spl file is an archive, and can be opened in archive tools such as 7-Zip. Within that archive, I found the text of the page I'd printed out (in my test of the functionality) in the file "Documents\1\Pages\1.fpage\[0].piece" within the archive.
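Since the .spl file in my test was an ordinary archive, listing and extracting the page content is straightforward; here's a sketch using Python's zipfile module, with an in-memory archive standing in for a real spool file (the internal path mirrors what I found in my test):

```python
import io
import zipfile

# build an in-memory zip that mimics the structure of the .spl file,
# with page content at the path observed in testing
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("Documents/1/Pages/1.fpage/[0].piece", "printed page text")

# open it the way you'd open a recovered spool file and pull the pages
with zipfile.ZipFile(buf) as z:
    pages = [n for n in z.namelist() if "/Pages/" in n]
    text = z.read(pages[0]).decode()

print(pages, text)
```

In practice you'd pass the path to the recovered .spl file to ZipFile instead of the BytesIO object; 7-Zip works just as well for a quick manual look.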

I did some quick Googling for an SPL file viewer, more out of interest than any actual need. I found a few references, including an Enscript from OpenText, but nothing I really felt comfortable downloading.

There are more things in heaven and earth, dear reader, than are dreamt of in your philosophy...or your forensics class. As far as data collection goes, there are password stealers like Predator the Thief that try to collect credentials from a wide range of applications, and then there's just straight up grabbing files, including PST files, contents of a user's Downloads folder, etc. But then, there are other ways to collect sensitive data from users, such as from the clipboard, or from files they printed and then deleted...and thought were just...gone.