Thursday, May 22, 2014

Book Writing: To Self-Publish, or Not

The CEIC Conference is going on as I write this, and Suzanne Widup's author panel went on yesterday.  I'm not at the conference, so like many others, I live vicariously through what gets Tweeted about the conference, as well as about specific portions of the conference, such as the panel.

I saw a question posted to Twitter, in which the tweeter asked, "for the panel, why not self-publish like RTFM?"

My initial thought was, you need to consider the members of the panel and the books they've written or co-authored; those titles really don't lend themselves too well to a format similar to RTFM, which, in some cases, is described as a collection of notes and tips bound into a book.  For example, I don't think I could see Hacking Exposed Computer Forensics in a format similar to RTFM.  As such, the question is essentially an apples-to-oranges comparison.  While self-publishing definitely has its place, it may not always be the best option for the material, nor for the author.  But that doesn't mean that there aren't publishing possibilities out there that would be very well suited to a format similar to RTFM.

I've addressed this topic before, but it is a good question, and certainly bears addressing.  Essentially, the choice of whether to go with a publisher or to self-publish comes down to how much time and effort you want to invest in getting the book published...at least, that's my perspective.  Other perspectives might be about what the author gets out of it, or how much someone has to pay for the book.  Writing a book is tough enough as it is...having someone there to do some of those things that need to be done (e.g., formatting, illustrations, copy editing, printing) in order for the book to be available to others means that the author can focus time and effort on writing the book, and not have to stop and figure out how to get something done, or find a resource.

A friend of mine told me that her husband publishes CDs for his band on his own, rather than going through a production firm.  That means that he does everything himself; in some cases this makes perfect sense.  In others, such as writing a DFIR book, maybe not so much.  Most of us don't like to write as it is; if you self-publish, would you have someone review your materials for grammar, consistency, and technical accuracy?  If so, would it be someone you pay for that service?  Where's that money going to come from?  How will you handle illustrations or figures?

As such, consider this... if self-publishing were the sole route available, we'd likely have far fewer books available in the DFIR field.  Or, maybe another way to put it is that if self-publishing really were that easy, we'd have more books.  In the years that I've been involved in writing books, I've seen a fairly good number of folks start down the road and not make it very far, for a variety of reasons.  In some cases, it's due to the realization that there's much more to writing a book than simply having an idea.  When the publisher comes back and gives you a bunch of forms to fill out, and requests a market analysis and a detailed outline, with a swag on a word count, the reality of the situation becomes readily apparent.  I've seen people stop there.  I've also seen one instance where the author got past the point of signing a contract, and the publisher came back later and modified the contract, almost doubling the word count for the final manuscript, but made no other changes to the contract, including the delivery date. The author simply walked away.

I've read a number of the reviews for RTFM, and to be honest, the book sounds like a fantastic idea; it was apparently originally intended to be an accumulation of someone's notes to be passed on to their team.  In the right hands, something like that can be extremely useful, and I can relate; when I was in grad school in '95-ish, I taught myself Java programming and relied heavily on O'Reilly's "...in a Nutshell" books for tips and guidance.  I found them very useful, because I wasn't looking for the basics of programming, or the basics of Java programming, to be explained to me...I just wanted the bare bones stuff, with no fluff.  Material that might be better suited to an RTFM-like format might be something like what's found here.

Self-publishing simply isn't for everyone...the audience for a book like this is pretty limited.  I can see books like this for using other tools, but I think that one of the strengths of RTFM is that there's the base assumption that anyone purchasing the book is familiar with both operating at the command line, and with Linux.  While there are certain segments of the DFIR community that would strongly suggest that that's exactly how it should be, the fact is that this is far from reality.

Resources
Self-publishing a book: 25 things you need to know - I strongly suggest that you read them all
Lulu.com - self-publishing company
How to self-publish - a guide, with pictures

Saturday, May 17, 2014

Artifacts

I received a request right before WFA 4/e hit the streets...after the writing and editing was complete and while the printed book was being shipped...to "talk about anti-forensics".  Unfortunately, at this point, I still haven't heard any more than just that, but I've had more than a couple of instances where knowledge of artifacts and Windows structures has allowed me to gather valuable data for analysis, even when the bad guy took steps, however unknowingly, to remove other artifacts.  I say "unknowingly" because sometimes the steps taken may not specifically be intended to be "anti-forensic" in nature, but may still have that effect.

Something that I've found over the years is that even when steps are taken to remove indications of activity, there may still be artifacts available that can prove valuable to an analyst.  While the analyst may not be able to answer THE question that they have, there may be data available that will still provide insight into the case and allow other questions to be answered.  For example, if the intruder accessed a system via RDP and removed or obscured some valuable data source (e.g., cleared the Windows Event Log) and the question you have is, "where did they access this system from?", you might not be able to answer that question.  However, using other data, you would be able to show when they were active on the system, what they were doing at the time, and even demonstrate access to other systems.

To quote Blade: "When you understand the nature of a thing, you know what it's capable of."  I know, I know...but I really wanted to work that quote into this post.  ;-)  What I'll do now is take a look at some of the things I and others have seen, and provide some thoughts as to other data sources that would be of value.

FTP Via Windows Explorer
I've seen the native ftp.exe client used on systems in a variety of cases, and not only to exfiltrate data.  Back when I was doing PCI forensic analysis, we saw a good deal of SQL injection activity, some of which would use echo to create an FTP script on a system, and then launch that script using the -s switch with ftp.exe.  The use of ftp.exe to infiltrate or exfil data can leave artifacts in the Registry, and for workstation systems, there will be an application prefetch file created.  On XP systems, the last accessed time on the file will be updated, and there will very likely be a value created in the user's MUICache key for ftp.exe.

However, you can use Windows Explorer to connect to an FTP server.  My publisher used to have me do this in order to transfer chapters, and I've seen this used a number of times on various cases.  The interesting thing about this is that while it involves interaction via the GUI shell, it leaves far fewer artifacts than using the command line utility.  In fact, having looked at several cases where this technique is used, the only place that I've found artifacts of this activity is in the user's shellbag artifacts.  I've discussed these artifacts before, so I won't go into a great deal of detail here.  Suffice it to say, shellbags can be a great resource, demonstrating access to network resources such as shares (even C$ shares), MTP devices (digital cameras and smartphones), FTP sites, etc., providing artifacts of activity that you might not find anywhere else on the system.

Clearing Windows Event Logs
Ever access a system and find out that the records in the Security and System Event Logs only go back a day or two?  One of the things I talk about in my books and presentations is that while it's easy to assume that the default configuration of the Event Logs caused them to roll over, it's also pretty trivial to check and see if there was some other reason for this, such as a user clearing the Event Logs.  If this happens, you'll likely see a record in the Security Event Log indicating that this happened, so look for the appropriate event ID (517 or 1102).  I've seen intruders do this, and I've seen admins who are responding to and troubleshooting an "incident" do this, as well.  Many times when the Event Log is cleared, you'll see a user accessing the Event Viewer (usually visible via the UserAssist data) just prior to that time.
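As a minimal sketch of that check, assume the Security Event Log records have already been parsed by some other tool into simple dictionaries.  The field names used here (event_id, time, user) are my own and purely hypothetical; the only things taken from the paragraph above are the two event IDs.

```python
# Event IDs mentioned above for a cleared Security Event Log:
# 517 (XP/2003 .evt format) and 1102 (Vista and later .evtx format).
LOG_CLEARED_IDS = {517, 1102}


def find_log_cleared(records):
    """Return the records whose event ID indicates the log was cleared.

    `records` is assumed to be an iterable of dicts with hypothetical
    keys 'event_id', 'time' (Unix epoch, UTC) and 'user'.
    """
    hits = [r for r in records if r.get("event_id") in LOG_CLEARED_IDS]
    # Sort oldest first so the result drops straight into a timeline.
    return sorted(hits, key=lambda r: r["time"])


if __name__ == "__main__":
    sample = [
        {"event_id": 4624, "time": 1400000300, "user": "bob"},
        {"event_id": 1102, "time": 1400000200, "user": "admin"},
        {"event_id": 517, "time": 1400000100, "user": "admin"},
    ]
    for rec in find_log_cleared(sample):
        print(rec["time"], rec["event_id"], rec["user"])
```

Any hits can then be lined up against UserAssist data for the Event Viewer to see who was at the keyboard just before the log was cleared.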

When the Event Log is cleared, that doesn't mean that all data goes away.  You can try to recover Windows Event Log records using Willi Ballenthin's EVTXtract, or depending on what you're trying to illustrate, you can look to other data.  For example, I've had instances when the Windows Event Logs have been cleared, but I've been able to demonstrate a user's windows of activity over time using other sources of data, such as the Registry, VSCs, etc.

The Power of Mini-, Micro-, and Nano-Timelines
Daniel Garcia recently added a review of WFA 4/e to the Amazon page for the book (thanks again, Daniel, for taking the time to do that, I greatly appreciate it); in that review he mentioned mini-timelines.  Interestingly enough, I use this technique all the time.  Many times, I'll grab some information and start putting together a timeline from a small subset of data sources, in order to get an idea of what's going on, and then once I have that info, I'll kick off a heftier process and let that run while I'm analyzing what I have.  Or, as is often the case, the results of analyzing the mini-timeline will provide me with the direction for my next steps.  This allows me to see things that I might have missed had I included voluminous amounts of file system metadata, Windows Event Logs, etc., and goes back to the technique of using overlays that I mentioned over two years ago.  This technique has proved useful in a number of cases.  For example, if someone is in a data center acquiring data, I can send them a batch script (similar to auto_rip) that runs various tools (RegRipper, etc.) and have them ship me the output of the tools. This allows me to start analysis while the bulk of the data is in transit, and when it shows up, I'm ready to start my focused analysis.  Or, they can acquire the data and once it's been verified, send me subsets of the data (Registry hive files, Windows Event Logs, etc.) in a secure archive, allowing me to begin analysis on a few KB of data while the full archive (several hundred GB of data) is en route.

Not long ago, I collected the NTUSER.DAT, USRCLASS.DAT and index.dat files from three user profiles within an image.  These profiles were thought to be active during the time of the incident, so I parsed the Registry hives with RegRipper, and the index.dat files with a custom tool, and created a micro-timeline that showed me not just times of activity, but patterns of activity that I would have missed had I included all of the data (file system, WEVTX, Registry hives, other user profiles, etc.) available within the image.  The results of this analysis allowed me to then focus my analysis on the more inclusive timeline and develop a much clearer picture of the activity that was the focus of my interest.
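To illustrate the idea (this is a rough sketch, not the tooling I actually used), assume the output of each tool has already been converted to the five-field TLN format, time|source|host|user|description, with the time as a Unix epoch value.  The function names and the sample events below are made up for the example.

```python
def parse_tln(line):
    """Split one TLN line into (epoch, source, host, user, description)."""
    # Split on the first four pipes only, so a description that itself
    # contains '|' is kept intact.
    time_str, source, host, user, desc = line.rstrip("\n").split("|", 4)
    return int(time_str), source, host, user, desc


def mini_timeline(lines, start=None, end=None):
    """Merge TLN lines from any number of sources into one sorted list.

    `start` and `end` are optional Unix epoch bounds (inclusive) used to
    narrow the timeline to a window of interest.
    """
    events = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        event = parse_tln(line)
        if start is not None and event[0] < start:
            continue
        if end is not None and event[0] > end:
            continue
        events.append(event)
    return sorted(events, key=lambda e: e[0])


if __name__ == "__main__":
    # Hypothetical events from two small data sources.
    registry = ["1400000200|REG|HOST1|bob|UserAssist - iexplore.exe run"]
    browser = ["1400000100|IE|HOST1|bob|visited: http://example.com/a|b"]
    for event in mini_timeline(registry + browser):
        print("|".join(str(field) for field in event))
```

Because only a handful of sources go in, the patterns of activity aren't buried under file system metadata; the same events can be folded into the fuller timeline later.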

Browser Analysis
When we hear 'browser analysis', most of us think about data sources such as index.dat files or SQLite databases, and tools like IEF.  But there are other potentially valuable data sources available to us, such as cookie files, bookmarks/favorites, and session recovery files.

If the user is using IE and you're interested in their activity during a specific point in time, you may have options available to you to get the information you're looking for.  For example, the TypedURLs key (and TypedURLsTime key, if they're using Windows 8) may prove fruitful, particularly when used in conjunction with VSCs.  If IE crashed (for whatever reason) while the user was browsing the web, you'll have the Travelog files available, and these can provide much more insight into what the user was doing than an index.dat record would.

The IE session restore files are structured storage/OLE format, and Yogesh has an EnScript available for parsing them.  I've used strings to get the data that I want, and MiTeC's Structured Storage Viewer to view the contents of individual streams within the file.  Python has a good module for parsing OLE files (I really haven't found anything that works as well in Perl, and have written some of my own stuff), and it shouldn't take too much effort to put a parser together for these files.  What's really fascinating about these files is that within a timeline, you may see where the user launched IE (UserAssist data), accessed a particular site (TypedURLs key, index.dat data), but at that point, you really can't tell too much about what they did, or what sort of interaction they had with the page, or pages, that they visited.  If you're lucky and there's a session saved in a Travelog file, then you can see what they were doing at the time of the crash.  I've seen commands sent to database servers via default stored procedures.  So, these files can be a rich source of data.
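For anyone who wants to try the "strings" approach without a separate utility, here's a rough Python equivalent.  It does not parse the structured storage/OLE format at all; it simply pulls printable ASCII and UTF-16LE runs out of a blob of bytes.  The function name and the minimum length of 6 are my own choices for the sketch.

```python
import re


def extract_strings(data, min_len=6):
    """Return (offset, text) for printable ASCII and UTF-16LE runs in `data`."""
    found = []
    # Runs of printable ASCII bytes.
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # The same characters stored as UTF-16LE (each followed by a null byte).
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    for match in ascii_re.finditer(data):
        found.append((match.start(), match.group().decode("ascii")))
    for match in wide_re.finditer(data):
        found.append((match.start(), match.group().decode("utf-16-le")))
    # Order by offset so related strings stay near each other.
    return sorted(found)


if __name__ == "__main__":
    blob = (
        b"\x00\x01http://example.com\x00\x02"
        + "select * from users".encode("utf-16-le")
    )
    for offset, text in extract_strings(blob):
        print(offset, text)
```

To run it against a real file, read the file in binary mode and pass the bytes in.  This is a quick triage step; viewing the individual streams is still the better way to understand where a given string lives.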

For other browsers, here's information on session restore functionality:
Chrome User Data Directory  (here's a tip for restoring the last session from the command line)
Firefox - Mozilla Session Restore

Summary
My point in all this is that while in most cases we really want to see all of the data, there are times when we either don't need everything, or as is often the case, everything simply isn't available and we have to make the best use of what we have.  For example, if I simply want to see when a user was active on a system, over time, I wouldn't need everything from the system, and I wouldn't need everything from the user profile.  All I'd need to get started are the two Registry hives, browser history files, and maybe the Jump Lists.  The total size of this data is much less than the full image, and it's even smaller if I can get someone on site to run the tools and just send me the data.

Friday, May 16, 2014

Updates

Exploit Artifacts
Corey is back with yet another of his amazing exploit artifacts blog posts!  This time around, the post has to do with Silverlight exploits from 2013; even so, this is something (providing exploit artifacts) that's been talked about for a long time, and Corey's one of the only ones (THE only one that I know of) who crosses the boundaries and self-imposed obstacles between vuln devs/pen testers and the IR community.  I know that there are others who are documenting artifacts associated with exploits with a particular focus on exploit kits, but Corey is the only one that I'm aware of that's doing as complete a job, and sharing that information publicly.

What I really like about Corey's approach is that he's targeting those areas most likely to yield results.  Looking into the USN change journal entries has been fruitful for many an investigator, particularly if they can get access to the system within a relatively short time after the incident occurs.

I also like the fact that Corey clearly documents what he did, as well as what he found.  He also points out a couple of "tidbits" that he found that require further examination and testing before they're discussed a bit more fully.

WFA 4/e Reviews
I saw recently that there is another review of WFA 4/e posted to Amazon; thanks to Daniel for sharing his thoughts!  I greatly appreciate all reviews, regardless of how they're perceived, and I really appreciate reviews like Daniel's because he took the time to actually read some of the new information that was added to this edition.

Volatility Plugin Manager
Via David Cowen's blog, it seems that there's a GUI plugin manager available for Volatility now, written by Andrew Nind.  This looks like a great idea, and I look forward to seeing what people think of it.

RegRipper Updates
I've been working on some updates to RegRipper, not so much based on community-wide input but more based on some discussions that I've had with a very few (like, 3 or 4) folks within the DFIR community.  There've been no comments (that I've seen) regarding the usefulness or value of the alerting function that was added to the tool last year.

Output Formats
Earlier this year, I exchanged some emails with Willi Ballenthin, as he had put the effort in to look at the code for RegRipper, and he came up with a means for allowing users to create their own output formats.

Ultimately, given the wide variation in available data and possible formats, I just thought that it would be easier to provide a switch to allow users to select the 'regular' default output format that is available now, or choose between CSV, TLN, or bodyfile output formats.  The switch will be incorporated into the CLI tool, and for the time being, nothing will change with GUI...the default output format will be the 'regular' default output that we see now.  However, that may change in the future, once I get the necessary modifications completed.

The move to provide bodyfile output was based solely on discussions with two people, but I hope that others will find it useful.

Adding .csv output is something that has been asked for in the past, and the caveat for this output format is that it's going to be very different across the various plugins. As anyone who's looked at the output of the plugins knows, what RegRipper extracts from the Run key is different from what it extracts from the UserAssist subkeys.

One thing I've wanted to do for some time is consolidate plugin output formats within the plugin, rather than create new plugins for each output format.  I've found a number of plugins to be extremely valuable during targeted threat engagements, particularly when anti-forensics measures have been employed.  I've used this combination of plugins to develop mini- or even micro-/nano-timelines that have led to significant findings with respect to intruder activity and the scope of their reach, even when other potentially valuable resources were not available.  By combining the output of these plugins from various systems using the TLN format, I'm able to see a clear progression of intruder activity across multiple systems and user accounts, and achieve a significant level of visibility.  However, as things are now, if I modify one plugin, I have to be sure to incorporate that modification into the version of the plugin that produces TLN output, if there is one.  By incorporating the various output formats into a single plugin, the entire process of maintaining the plugins is simplified.
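The maintenance argument is easier to see in code.  RegRipper itself is written in Perl, and this is not its code; it's a small Python sketch of the design, with made-up function names and sample data.  The data is collected once, and the output format is only a choice of formatter (bodyfile is left out to keep the sketch short).

```python
import csv
import io


def get_run_entries():
    """Stand-in for the part of a plugin that reads the hive.

    Returns hypothetical (epoch, value_name, data) tuples, where epoch
    represents the key LastWrite time; a real plugin would pull these
    from the Registry.
    """
    return [(1400000000, "Updater", "C:\\Temp\\upd.exe")]


def to_regular(entries):
    return "\n".join(f"{name} -> {data}" for _, name, data in entries)


def to_csv(entries):
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for epoch, name, data in entries:
        writer.writerow([epoch, name, data])
    return buf.getvalue().rstrip("\n")


def to_tln(entries, host="", user=""):
    # TLN: time|source|host|user|description
    return "\n".join(
        f"{epoch}|REG|{host}|{user}|Run - {name}: {data}"
        for epoch, name, data in entries
    )


FORMATTERS = {"regular": to_regular, "csv": to_csv, "tln": to_tln}


def run_plugin(output_format="regular"):
    """Collect the data once, then render it in the requested format."""
    entries = get_run_entries()
    return FORMATTERS[output_format](entries)
```

With this layout, a fix to the collection logic is made in one place and every output format picks it up, which is the whole point of not maintaining a separate TLN version of each plugin.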

Is there any value in incorporating the l2t_csv format, as well?

Incorporate Artifact Categories
I know that others have talked about artifact categories before, and that one variation was included in the original SANS DFIR poster that was published.  However, I've taken a slightly different approach to the identification of artifact categories, one that is more along the lines of what Corey discussed when he released auto_rip, and what I listed in the HowTo blog posts from last July.  With respect to the Registry specifically, I'm thinking that it would be very useful to group artifacts within categories, so that they're more easily understood and remembered.

Multiple Hives
Something that I've already started incorporating into plugins is combining functionality within a single plugin to query multiple hives.  Within RegRipper, this doesn't happen automatically; what this means is that there are some plugins that can be run against the NTUSER.DAT and Software hives, or as with a new plugin I wrote recently (and haven't included in the plugin archive yet) to address Adam's latest autostart discovery, the same plugin could be run against the NTUSER.DAT or System hive.

To be clear, this does not mean that RegRipper will correlate data from across multiple hives...RegRipper's foundational design doesn't allow for that.  What it means is that one plugin can be run against several hives.
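Here's one way to picture that, as a Python sketch rather than anything taken from RegRipper: each plugin lists the hives it can be run against, and a lookup returns the plugins that apply to whichever hive is in hand.  The plugin names in the table are hypothetical.

```python
# Hypothetical plugin registry: plugin name -> hives it may be run against.
PLUGIN_HIVES = {
    "userassist": {"NTUSER.DAT"},
    "run": {"NTUSER.DAT", "Software"},
    "new_autostart": {"NTUSER.DAT", "System"},
}


def plugins_for(hive):
    """Return the names of plugins that can be run against `hive`."""
    wanted = hive.lower()
    return sorted(
        name
        for name, hives in PLUGIN_HIVES.items()
        if wanted in {h.lower() for h in hives}
    )


if __name__ == "__main__":
    for hive in ("NTUSER.DAT", "Software", "System"):
        print(hive, plugins_for(hive))
```

Each hive is still processed on its own; nothing here carries data from one hive over to another.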

Artifacts
I ran across an instance recently where Yogesh Khatri's TraveLog research proved to be very beneficial.  Someone was using IE to perform lateral movement, and we weren't sure what they were actually doing...but IE had crashed, and we were able to "see" what was visible in each tab when the browser had its issue.  Very cool.  Thanks to Yogesh for sharing his research...you never know when you're going to be able to make good use of information like that, and if he hadn't shared it, it's unlikely that we would've even known to look at the files.

Tuesday, May 13, 2014

Links

OpenLiveView
Tim Vidas has posted OpenLV, an update to the popular LiveView tool that many of us have used before. When conducting an investigation, there are a number of ways to access acquired images, such as via any number of analysis frameworks (DFF, ProDiscover, Autopsy, etc.) that provide a great deal of functionality for interacting with data.  There are tools for mounting an acquired image as a read-only volume (FTK Imager, etc.), but OpenLV allows you to boot the acquired image.  This can provide a great deal of visibility into the system, allowing the investigator to see what the intruder saw, interact with the system the way the intruder interacted with it, and even verify malware autostart functionality.

Be sure to check out the DFRWS Proceedings, written by Tim, Matthew Geiger, and Brian Kaplan.

EVTXtract
The other day I was answering a question about Windows Event Log analysis, and I ran across Willi Ballenthin's tool, EVTXtract (PDF here).  This tool allows an analyst to recover deleted Windows Event Log records.  The Windows Event Log (.evtx) files follow a binary structure that's much different from the Event Log (.evt) files on Windows XP and 2003, but deleted records can apparently be recovered, at least in some cases.

ThunderBird Parser
Mari has shared her ThunderBird Parser.  Her blog post has some great information...she talks about what issue she faced and how she chose to address it by writing her own code.  Doing this not only helped her understand the underlying data on a much more intimate level, but it also opened that understanding up to other analysts.

Conferences
My conference attendance changed recently, and I am no longer a member of Suzanne Widup's author panel at the SANS DFIR Summit in Austin, TX.  I was really looking forward to speaking on the panel (I've written a book or two), and discussing various topics around writing DFIR books.  In fact, we'd already started addressing some questions in my blog, and I was really looking forward to hearing and addressing other questions.

My not attending the summit has nothing whatsoever to do with any review of my book, and honestly, I'm more than a little shocked that someone would think that, let alone say it out loud to others.

Brian Carrier has opened up the call for papers for the OSDFCon, to be held in Herndon, VA, on 5 Nov. This has always been a great conference to attend (see here), and needs more practitioners to submit presentations.  In fact, I've recommended to Mari that she submit to the conference to give a presentation on the ThunderBird email parser, or any of the other tools she's written.  I've already submitted two presentation ideas.

I'm also looking for thoughts and ideas for other conferences with a CfP that I can submit to.  CEIC is out because it's already come and gone.  If anyone has any thoughts regarding a conference (or conferences) that are specific to DFIR, and include topics on addressing targeted threats, I'd greatly appreciate it if you'd comment here or drop me an email.  Thanks.

Tuesday, May 06, 2014

New Stuff

RegRipper Plugins
Corey's busy this week attending Volatility training, but last night sent me a couple of RegRipper plugins he wrote, inspired by what he was learning in the training.  He'd also sent me a third one, which I got the okay to include right after I'd posted the newest release of RegRipper, so I'm including it now.

I've added processor_architecture.pl, winevt.pl, and an updated pagefile.pl to the download archive.  However, I have not updated the appropriate profiles to include the two new plugins, nor have I changed the version number for the download.  Many thanks to Corey for sending those plugins in!  Keep 'em coming!

Malware Hiding Techniques
I've had another article posted on the Dell SecureWorks Research blog.  Part of the purpose of this post was to illustrate how sometimes we make assumptions about how malware (or other artifacts) may have ended up on the system, and while there are times that the assumptions may be correct, when they aren't, the actual method of infection can be a game-changer. The analysis that resulted in these findings was fascinating, to say the least.

After you've read the post, something else to consider from the examples is how they circumvent protections.  For the first example, the assumption many analysts have with respect to the deployed RAT is that it gets on systems as a result of a spearphishing attack.  As such, protections against this infection vector would include email filtering and user education; however, both of these protections are obviated if the user is capable of disabling protections (AV, etc.) and installing the RAT.  As mentioned in the article, network monitoring flagged the system based on C2 communications, and efforts to install endpoint detection technologies were...again...obviated by the user.

With the second example, the malware file itself used an interesting technique to hide itself from casual view on the system, which also worked equally well against some digital analysis techniques.  The carrier file was identified as Vercuser.B, which "cleared the way" for the Poison Ivy infection, by checking for various protection mechanisms (VM, running AV software, etc.).

Something else that isn't mentioned in the blog post is that I initially ran into some analysis roadblocks, or so I thought...but after reaching out to Jamie Levy for some input, she pointed me in the right direction and the analysis went really smoothly.  A good bit of what she helped me with was described in this blog post.  I didn't crowd-source this one, because to be honest, I didn't want to hear what a lot of folks thought, I wanted to hear what an expert knew.

My previous articles published to the Dell SecureWorks Research blog are here, and here.

WFA 4/e
There have been additional reviews of WFA 4/e posted on Amazon; again, thank you to everyone who's taken the time to share their thoughts...I greatly appreciate it.

There has been some discussion on social media regarding the edition number for this book.  While I understand the issue that was raised, I do not control what the publisher chooses to do with respect to numbering the editions. I did, however, get several folks that I trust to look at my outline and planned updates, and give their opinions as to the proposed content.  The book was tech edited by someone knowledgeable and known within the DFIR community. Further, I have asked for feedback on the third edition (as I have for my other books), as well as gone to the community to ask for input regarding what they'd like to see in the next edition.  In both cases, there has been very little of either.  I did receive a request to talk about "anti-forensics", but after I asked the person who asked for that to elaborate and expand on that a bit, I have yet to hear back.

I have to say, I have asked Syngress about their color scheme for the books.  Digital Forensics with Open Source Tools, Windows Forensic Analysis 2/e and 3/e, and Windows Registry Forensics all have the same color scheme, and the same shade of green.  I've been to conferences and given presentations, during which I've stated at the beginning of the presentation (when I have a copy of one of my books on the "who am I" slide) that the books all have the same color scheme and it confuses people.  Then, at the end of the presentation, I ask a question, offering to give away a copy of one of my books to whoever gets the right answer...and inevitably, the winner immediately states that they already have the book, only to find out that they thought they did because they only looked at the color scheme.  That happened at the USA CyberCrime Conference just last week.  So, it does confuse people.

My point is simply this...there's a great deal authors do not control when it comes to working with a publisher.  However, I have tried to address the content issue by reaching out to the community, particularly while developing the outline for the next book or edition, and I have received little input.  I tried to address one of the first questions I received regarding the content for this edition in this blog post, although that came after this book was actually published.

One thing that I hope folks consider doing before commenting or writing a review (good or bad) is actually reading the content of the book.  The two chapters at the end of the book are new material.  In the third edition, ch 8 was "Application Analysis", and in this edition, it's "Correlating Artifacts", which includes information similar to what I posted to this blog in July, 2013.  Chapter 9, "Reporting", is entirely new.