Showing posts sorted by relevance for query trust record. Sort by date Show all posts
Showing posts sorted by relevance for query trust record. Sort by date Show all posts

Monday, August 07, 2023

Ransomware Attack Timeline

The morning of 1 Aug, I found an article in my feed about a ransomware attack against a municipality; specifically, Montclair Township in New Jersey. Ransomware attacks against municipalities are not new, and they can be pretty devastating to staff and citizenry, as well, and this is even before a ransom is paid. Services are impacted or halted, and I've even seen reports where folks lost considerable amounts of money because they weren't able to get the necessary documentation to process the purchase of a home.

I decided to dig a bit and see what other information I could find regarding the issue, and the earliest mention I could find was this page from 6 June 2023 that includes a link to a video message from the mayor, informing everyone of a "cyber incident". I also found this article from North Jersey dot com, reporting on the mayor's message. Two days later, this article from The Record goes into a bit more detail, including a mention that the issue was not related to the MOVEit vulnerability.

At this point, it looks as if the incident occurred on 5 June 2023. As anyone who's investigated a ransomware attack likely knows, the fact that files were encrypted on 5 June likely means that the threat actor was inside the environment well prior to that...2 days, 2 weeks, 2 months. Who knows. If access was purchased from an IAB, it could be completely variable, and as time passes and artifacts oxidize and decay, as the system just goes about operating, it can become harder and harder to determine that initial access point in time. 

What caught my attention on 28 July was this article from Montclair Local News stating that had a bit of a twist on the terminology used in such incidents; rather, should I say, another twist. Yes, these are referred to many times as a "cyber incident" or "cyber attack" without specifically identifying it as ransomware, and in this instance, there's this quote from the article (emphasis added):

To end a cyber attack on the Montclair Township’s IT Department, the township’s insurer negotiated a settlement of $450,000 with the attackers.

It's not often that a ransom paid is referred to as a settlement, at least not in articles I've read. I can't claim to have seen all articles associated with such "cyber attacks", but at the same time, I haven't seen this turn of phrase to refer to the ransom payment.

Shortly after the above statement, the article goes on to say:

Some data belonging to individual users remains to be recovered...

Ah, yes...a lot of times you'll see folks say, "...don't trust the bad guy...", because there's no guarantee that even paying for the decryptor that you'll get all of your data back. This statement would lead us to believe that this is one of those instances.

Another quote from the article:

To guard against future incidents, the township has installed the most sophisticated dual authentication system available to its own system and it is currently up and running.

Does this say something about the attack? Does this indicate that the overall issue, the initial infection vector, was thought to be some means of remote access that was not protected via MFA?

Something else this says about the issue - 5 June to 28 July is almost 8 full weeks. Let's be conservative here and assume that the reporting on 28 July is not up-to-the-minute, and say that the overall time between encrypted files and ransom (or "settlement") paid is 7 weeks; that's still a long time to be down, not being able to operate a business or a government, and this doesn't even address the impacted services, and the effect upon the community.

I know that one article mentions a "settlement" or what's more commonly known as a ransom payment, but where does that money really come from?

Municipalities (local governments, police departments, etc.) getting ransomed is nothing new. Newark was hit with ransomware in April 2017; yes, that was 6 yrs ago, multiple lifetimes in Internet years, but shouldn't that have served as a warning?

Monday, April 03, 2023

On Validation

I've struggled with the concept of "validation" for some time; not the concept in general, but as it applies specifically to SOC and DFIR analysis. I've got a background that includes technical troubleshooting, so "validation" of findings, or the idea of "do you know what you know, or are you just guessing", has been part of my thought processes going back for about...wow...40 years.

Here's an example...when setting up communications during Team Spirit '91 (military exercises in South Korea), my unit had a TA-938 "hot line" with another unit. This is exactly what it sounds like...it was a directly line to that other unit, and if one end was picked up, the other end would automatically ring. Yes, a "Bat phone". Just like that. Late one evening, I was in the "SOC" (our tent with all of the communications equipment) and we got a call that the hot line wasn't working. We checked connections, checked and replaced the batteries in the phone (the TA-938 phones took 2 D cell batteries, both facing the same direction), etc. There were assumptions and accusations thrown about as to why the phone wasn't working, as my team and I worked through the troubleshooting process. We didn't work on assumption; instead, we checked, rechecked, and validated everything. In the end, we found nothing wrong with the equipment on our end; however, the following day, we did find out what the issue was - at the other end, there was only one Marine in the tent, and that person had left the tent for a smoke break during the time of the attempted calls.

We could have just said, "oh, it's the batteries...", and replaced them...and we'd have the same issue all over again. Or, we could have just stated, "...the equipment on the other end was faulty/broken...", and we would not have made a friend of the maintenance chief from that unit. There were a lot of assumptions we could have made, conclusions we could have jumped to...and we'd have been wrong. We could have made stated findings that were trusted, and resulted in decisions being made, assets and resources being allocated, etc., all for the wrong reason. The end result is that my team and I (especially me, as the officer) would have lost credibility, and the trust and confidence of our fellow team members, and our commanding officer. As it was, validating our findings led to the right decisions being made, which were again validated during the exercise after action meetings.

Okay, so jump forward 32 years to present day...how does this idea of "validation" apply to SOC and DFIR analysis? I mean, this seems like such an obvious thing, right? Of course we validate our findings...but do we, really?

Case Study #1
A while back, I attended a conference during which one of the speakers walked through a PCI investigation they'd worked on. As the speaker walked through their presentation, they talked about how they'd used a single artifact, a ShimCache entry for the malware, to demonstrate program execution. This single artifact was used as the basis of the finding that the malware had been on the system for four years.

For those readers not familiar with PCI forensic investigations, the PCI Council specifies a report format and "dashboard", where the important elements of the report are laid in a table at the top of the report. One of those elements is "window of compromise", or the time between the original infection and when the breach was identified and remediated. Many merchants track the number of credit card transactions they process on a regular basis, including not only during periods of "regular" spending habits, but also off-peak and peak/holiday seasons, and as a result, the "window of compromise" can give the merchant, the bank, and the brand an approximate number of potentially compromised credit card numbers. As you'd imagine, given any average, the number of compromised credit card numbers would be much greater over a four year span than it would for, say, a three week "window of compromise". 

As you'd expect, analysts submitting reports rarely, if ever, find out the results of their work. I was a PCI forensic analyst for about three and a half years, and neither I nor any of my teammates (that I'm aware of) heard what happened to a merchant after we submitted our reports. Even so, I cannot imagine that a report with a "window of compromise" of four years was entirely favorable.

But that begs the question - was the "window of compromise" really four years? Did the analyst validate their finding using multiple data sources? Something I've seen multiple times is that malware is written to the file system, and then "time stomped", often using time stamps retrieved from a native system file. This way, the $STANDARD_INFORMATION attribute time stamps from the $MFT record for the file appear to indicate that the file is "long lived", and has existed on the system for quite some time. This time stomping occurs before the Application Compatibility functionality of the Windows operating system creates an entry for the file, and the last modification time that's recorded for the entry is the one that's "time stomped". As a result, a breach that occurred in May 2013 and was discovered three weeks later ends up having the malware itself being reported as placed on the system in 2009. What impact this had, or might have had on a merchant, is something that we'll never know.

Misinterpreting ShimCache entries has apparently been a time-honored tradition within the DFIR community. For a brief walk-through (with reference links) of ShimCache artifacts, check out this blog post.

Case Study #2
In the spring of 2021, analysts were reporting, based solely on EDR telemetry, that within their infrastructure threat actors were using the Powershell Set-MpPreference module to "disable Windows Defender". This organization, like many others, was tracking such things as control efficacy (the effectiveness of controls) in order to make decisions regarding actions to take, and where and how to allocate resources. However, these analysts were not validating their findings; they were not checking the endpoints themselves to determine if Windows Defender had, in fact, been disabled, and if the threat actor's attempts had actually impacted the endpoints. As it turns out, that organization had a policy at the time of disabling Windows Defender on installation, as they had chosen another option for their security stack. As such, stating in tickets that threat actors were disabling Windows Defender, without validating these findings, led to quite a few questions, and impacted the credibility of the analysts

Artifacts As Composite Objects
Joe Slowik spoke at RSA in 2022, describing indicators, or technical observables, as "composite objects". This is an important concept in DFIR and SOC analysis, as well, and not just in CTI. We cannot base our findings on a single artifact, treating it as a discrete, atomic indicator, such as an IP address just being a location, or tied to a system, or a ShimCache entry denoting time of execution. We cannot view a process command line within EDR telemetry, by itself, as evidence of program execution. Rather, we need to recognize that artifacts are, in fact, composite objects; in his talk, Joe references Mandiant's definition of indicators of compromise, which can help us understand and visualize this concept. 

Composite objects are made up of multiple elements. An IP address is not just a location, as the IP address is an observable with context. Where was the IP address observed, when was it used, and how was it used? Was it the source of an RDP, or a type 3 login? If the IP address was the source of a successful login, what was the username used? Was the IP address the source of a connection seen in web server or VPN logs? Is it the C2 address? 

If we consider a ShimCache entry, we have to remember that (a) the entry itself does NOT explicitly demonstrate program execution, and that (b) the time stamp is mutable. That is, what we see could have been modified before we saw it. For example, we often see analysts hold up a ShimCache entry as evidence of program execution, often as the sole indicator. We have to understand and remember that the time stamp associated with a ShimCache entry is the last modification time for the entry, taken from the $STANDARD_INFORMATION attribute within the MFT. I've seen several instances where the file is placed on the system and then time stomped (the time stamp is easily mutable) before the entry was added to the Application Compatibility database. This is all in addition to understanding that an entry in the ShimCache does NOT mean that the file was executed. Note that the same is true for AmCache entries, as well.

We can validate indicators of compromise by including them in constellations, including them alongside other associated indicators, as doing so increases fidelity and brings valuable context to our analysis. We see this illustrated when performing searches for PCI data within acquired images; if you just search for a string of 16 characters starting with "4", you're going to get a LOT of results. If you look for strings of characters based on a bank ID number (BIN), length of the string, and if it passes the Luhn check, you're still going to get a lot of results, but not as many. If you also search for the characteristics associated with track 1 and track 2 data, your search results are going to be a smaller set, but with much higher fidelity because we've added layers of context. 

Cost
So the question becomes, what is the cost of validating something versus not validating it? What is the impact or result of either? This seems on the surface like it's a silly question, maybe even a trick question. I mean, it looks that way when I read back over the question after typing it in, but then I think back to all the times I've seen when something hasn't been validated, and I have to wonder, what prevented the analyst from validating their finding, rather than simply basing their finding on a single artifact, out of context?

Let's look at a simple example...we receive an alert that a program executed, based on SIEM data or EDR telemetry. This alert can be based on elements of the command line, process parentage, or a combination thereof. Let's say that based on a number of factors and reliable sources, we believe that the command line is associated with malicious activity.

What do you report?

Do you report that this malicious thing executed, or do you investigate further to see if the malicious thing really did execute, and executed successfully? How would be go about investigating this, what data sources would be look to? 

As you're thinking about this, as you're walking through this exercise, something I'd like you to keep in mind is that question, what would prevent you from actually examining those data sources you identify? Is there some "cost" (effort, time, other resources) that prevent you from doing so?

Tuesday, July 31, 2012

Links and Updates

Blogosphere
Cory Harrell has another valuable post up, this one walking through an actual root cause analysis.  I'm not going to steal Corey's thunder on this...the post or the underlying motive...but suffice to say that performing a root cause analysis is critical, and it's something that can be done in an efficient manner.  There's more to come on this topic, but this sort of analysis needs to be done, it can be done effectively and efficiently, and if you don't do it, it will end up being much more expensive for you in the long run.

Jimmy Weg started blogging a bit ago, and in very short order has already posted several very valuable articles.  Jimmy's posts so far a very tutorial in nature, and provide a great deal of information regarding the analysis of Volume Shadow Copies.  He has a number of very informative posts available, to include a very recent post on using X-Ways to cull evidence from VSCs.

Mari has started blogging, and her inaugural post sets the bar pretty high.  I mentioned in this blog post that blogging is a great way to get started with respect to sharing DFIR information, and that even an initial blog post can lead to further research.  Mari's post included enough information to begin writing another parser for the Bing search bar artifacts, if you're looking to write one that presents the data in a manner that's usable in your reporting format.

David Nides recently posted to his blog regarding what he discussed in his SANS Forensic Summit presentation.  David's post focuses exclusively on log2timeline as the sole means for creating timelines, as well as some of what he sees as the short-comings with respect to analyzing the output of Kristinn's framework.  However, that's not what struck me about David's post...rather, what caught my attention was statements such as, for the average DFIR professional who is not familiar with CLI.

Now, don't think that I'm on the side of the fence that feels that every DFIR "professional" must be well-versed in CLI tools.  Not at all.  I don't even think that it should a requirement that DFIR folks be able to program.  However, I do see something...off...about a statement that includes the word "professional" along with "not familiar with CLI".

I've worked with several analysts throughout my time in the infosec field, and I've taught (and teach) a number of courses.  I have worked with analysts who have used *only* Linux, and done so at the command line...and I have engaged with paid professionals tasked with analysis work who are only able to use one commercial analysis framework.  So, while I am concerned by repeated statements in a post that seem to say, "...this doesn't work, because Homey don't play that...", I am also familiar with the reality of it.

Speaking of the SANS Forensic Summit, the Volatility blog has a new post up that is something of a different perspective on the event.  Sometimes it can be refreshing to get away from the distractions of the cartoons, and it's always a good idea to get different perspectives on events. 

Tools
The folks over at TZWorks have put together a number of new tools.  Their Jump List parser works for both the *.automaticDestinations-ms Jump Lists and the *.customDestinations-ms files, as well.  There's a Prefetch file parser, USB storage parser, and a number of other very useful utilities freely available.  The utilities are very useful, and are available for a variety of platforms (Win32- and 64-bit, Linux, Mac OS X).

If you're not familiar with the TZWorks.net site, take a look and bookmark it.  If you're downloading the tools for use, be sure to read the license agreement.  Remember, if you're reporting on your analysis properly, you're identifying the tools (and the versions) that you used, and relying on these tools for commercial work may come back and bite you.

Andrew posted to the ForensicsArtifacts.com site recently regarding MS Office Trust Records, which appear to be generated when a user trusts content via MS Office 2010.  Andrew, co-creator of Registry Decoder, pointed out that Mark Woan's RegExtract parses this information, and shortly after reading his post, I wrote a RegRipper plugin to extract the information, and then created another version of that plugin to extract the data in TLN format.  This information is very valuable, as it is an indicator of explicit user activity...when opening a document from an untrusted source, the user must click the "Enable Editing" button that appears in the application in order to proceed with editing it.  Clearly, this requires some additional testing to determine actions that cause this artifact to be populated, etc., but for now, it clearly demonstrates user access to resources (i.e., network drives, external drives, files, etc.).  In the limited testing that I've done so far, the time stamp associated with the data appears to be when the document was created on the system, not when the user clicked the "Enable Editing" button.  What I've done is downloaded a document (MS Word .docx) to my desktop via Chrome, recorded the date and time of the download, and then opened the file.  When the "Enable Editing" button is present in the warning ribbon at the top of the document, I will wait up to an hour (sometimes more) to click the button and record the time I did so.  Once I do, I generally close the document.  I then reboot the system and use FTK Imager to get a copy of the NTUSER.DAT hive, and run the plugin.  In every case so far, what I've seen is that the time stamp associated with the values in the key correlate to the creation time of the file, further evidenced by running "dir /tc".

Tuesday, January 01, 2013

BinMode

I've recently been working on a script to parse the NTFS $UsnJrnl:$J file, also known as the USN Change Journal.  Rather than blogging about the technical aspects of what this file is, or why a forensic analyst would want to parse it, I thought that this would be a great opportunity to instead talk about programming and parsing binary structures.

There are several things I like about being able to program, as an aspect of my DFIR work:
- It very often allows me to achieve something that I cannot achieve through the use of commercially available tools.  Sometimes it allows me to "get there" faster, other times, it's the only way to "get there".
- I tend to break my work down into distinct, compartmentalized tasks, which lends itself well to programming (and vice versa).
- It gives me a challenge.  I can focus my effort and concentration on solving a problem, one that I will likely see again and will already have an automated solution for solving when I see it.
- It allows me to see the data in its raw form, not filtered through an application written by a developer.  This allows me to see data within the various structures (based on structure definitions from MS and others), and possibly find new ways to use that data.

One of the benefits of programming is that I have all of this code available, not just as complete applications but also stuff I've written to help me perform analysis.  Stuff like translating time values (FILETIME objects, DOSDate time stamps, etc.), as well as a printData() function that takes binary data of an arbitrary length and translates it into a hex editor-style view, which makes it easy to print out sections of data and work with them directly.  Being able to reuse this code (even if "code reuse" is simply a matter of copy-paste) means that I can achieve a pretty extensive depth of analysis in fairly short order, reducing the time it takes for me to collect, parse, and analyze data at a more comprehensive level than before.  If I'm parsing some data, and use the printData() function to display the binary data in hex at the console, I may very well recognize a 64-bit time stamp at a regular offset, and then be able to add that to my parsing routine.  That's kind of how I went about writing the shellbags.pl plugin for RegRipper.

I've also recently been looking at IE index.dat files in a hex editor, and writing my own parser based on the MSIE Cache File Format put together by Joachim Metz.  So far, my initial parser works very well against the index.dat file in the TIF folder, as well as the one associated with the cookies.  But what's really fascinating about this is what I'm seeing...each record has two FILETIME objects and up to three DOSDate (aka, FATTime) time stamps, in addition to other metadata.  For any given entry, all of these fields may not be populated, but the fact is that I can view them...and verify them with a hex editor, if necessary.

As a side note regarding that code, I've found it very useful so far.  I can run the code at the command line, and pipe the output through one or more "find" commands in order to locate or view specific entries.  For example, the following command line gets the "Location : " fields for me, and then looks for specific entries; in this case, "apple":

C:\tools>parseie.pl index.dat | find "Location :" | find "apple" /i

Using the above command line, I'm able to narrow down the access to specific things, such as purchase of items via the Apple Store, etc.

I've also been working on a $UsnJrnl (actually, the $UsnJrnl:$J ADS file) parser, which itself has been fascinating.  This work was partially based on something I've felt that I've needed to do for a while now, and talking to Corey Harrell about some of his recent findings has renewed my interest in this effort, particularly as it applies to malware detection.

Understanding binary structures can be very helpful.  For example, consider the target.lnk file illustrated in this write-up of the Gauss malware.   If you parse the information manually, using the MS specification...which should not be hard because there are only 0xC3 bytes visible...you'll see that the FILETIME time stamps for the target file are nonsense (Cheeky4n6Monkey got that, as well).  As you parse the shell item ID list, based on the MS specification, you'll see that the first item is a System folder that points to "My Computer", and the second item is a Device entry whose GUID is "{21ec2020-3aea-1069-a2dd-08002b30309d}".  When I looked this GUID up online, I found some interesting references to protecting or locking folders, such as this one at LIUtilities, and this one at GovernmentSecurity.org.  I found this list of shell folder IDs, which might also be useful.

The final shell item, located at offset 0x84, is type 0x06, which isn't something that I've seen before.  But there's nothing in the write-up that explains in detail how this LNK file might be used by the malware for persistence or propagation, so this was just an interesting exercise for me, as well as for Cheeky4n6Monkey, who also worked on parsing the target.lnk file manually.  So, why even bother?  Well, like I said, it's extremely beneficial to understand the format of various binary structures, but there's another reason.  Have you read these posts over on the CyanLab blog? No?  You should.  I've seen shortcut/LNK files with no LinkInfo block, only the shell item ID list, that point to devices; as such, being able to parse and understand these...or even just recognize them...can be very beneficial if you're at all interested in determining USB storage devices that had been connected to a system.  So far, most of these devices that I have seen have been digital cameras and smart phone handsets.

Everything
Okay, right about now, you're probably thinking, "so what?"  Who cares, right?  Well, this should be a very interesting, if not outright important issue for DFIR analysts...many of whom want to see everything when it comes to analysis.  So the question then becomes...are you seeing everything?  When you run your tool of choice, is it getting everything?

Folks like Chris Pogue talk a lot about analysis techniques like "sniper forensics", which is an extremely valuable means for performing data collection and analysis.  However, let's take another look at the above question, from the perspective of sniper forensics...do you have the data you need?  If you don't know what's there, how do you know?

If you don't know that Windows shortcut files include a shell item ID list, and what that data means, then how can you evaluate the use of a tool that parses LNK files?  I'm using shell item ID lists as an example, simply because they're so very pervasive on Windows 7 systems...they're in shortcut files, Jump Lists, Registry value data.  They're in a LOT of Registry value data.  But the concept applies to other aspects of analysis, such as browser analysis.  When you're performing browser analysis in order to determine user activity, are you just checking the history and cookies, or are you including Registry settings ("TypedURLs" key values for IE 5-9, and "TypedURLsTimes" key values on Windows 8), bookmarks, and session restore files?  When performing USB device analysis on Windows systems, are you looking for all devices, or are you using checklists that only cover thumb drives and external hard drives?

I know that my previous paragraph covers a couple of different levels of granularity, but the point remains the same...are you getting everything that you need or want to perform your analysis?  Does the tool you're using get all system and/or user activity, or does it get some of it?

Can we ever know it all?
One of the aspects of the DFIR community is that, for the most part, most of us seem to work in isolation.  We work our cases and exams, and don't really bother too much with asking someone else, someone we know and trust, "hey, did I look at everything I could have here?" or "did I look at everything I needed to in order to address my analysis goals in a comprehensive manner?"  For a variety of reasons, we don't tend to seek out peer review, even after cases are over and done.

But you know something...we can't know it all.  No one of us is as smart or experienced as several or all of us working together.  This can be close collaboration, face-to-face, or online collaboration through blogs, or sites such as the ForensicsWiki, which makes a great repository, if it's used.

Choices
Finally, a word about choices in programming languages to use.  Some folks have a preference.  I've been using Perl for a long time, since 1999.  I learned BASIC in the '80s, as well as some Pascal, and then in the mid-'90s, I picked up some Java as part of my graduate studies.  I know some folks prefer Python, and that's fine.  Some folks within the community would like to believe that there are sharp divides between these two camps, that some who use one language detest the other, as well as those who use it.  Nothing could be further from the truth.  In fact, I would suggest that this attempt to create drama where there is none is simply a means of masking the fact that some analysts and examiners simply don't understand the technical aspects of the work that's actually being done.

Resources
Forensics from the Sausage Factory - USN Change Journal
Security BrainDump - Post regarding the USN Change Journal
OpenFoundry - Free tools; page includes link to a Python script for parsing the $UsnJrnl:$J file