Friday, April 22, 2011

Extending RegRipper (aka, "Forensic Scanner")

I'll be presenting on "Extending RegRipper" at Brian Carrier's Open Source Digital Forensics Conference on 14 June, along with Cory Altheide, and I wanted to provide a bit of background with regards to what my presentation will cover...

In '98-'99, I was working for Trident Data Systems, Inc. (TDS), conducting vulnerability assessments for organizations.  One of the things we did as part of this work was run ISS's Internet Scanner (now owned by IBM) against the infrastructure; either a full, broad-brush scan or just very specific segments, depending upon the needs and wants of the organization.  I became very interested in how the scanner worked, and began to note differences in how the scanner would report its findings based on the level of access we had to the systems within the infrastructure.  Something else I noticed was that many of the checks run by the scanner were the result of the ISS X-Force vulnerability discovery team.  In short, a couple of very smart folks would discover a vulnerability, add a means of scanning for that vulnerability via the Internet Scanner framework, and roll it out to thousands of customers.  Within fairly short order, a check could be rolled out to hundreds or thousands of analysts, none of whom had any prior knowledge of the vulnerability, nor had to invest the time to investigate it.  This became even more clear as I started to create an in-house (albeit proprietary) scanner to replace the use of Internet Scanner, due in large part to significant issues with inaccurate checks, and the need to adapt the output.  I could create a check to be run, and give it to an analyst going on-site; they wouldn't need to have any prior knowledge of the issue, nor would they have to invest time in discovery and analysis, but they could run the check and easily review and understand the results.

Other aspects of information security also benefit from the use of scanners.  Penetration testing and web application assessments benefit from scanners that include frameworks for providing new and updated checks to be run, and many of the analysts running the scanners have no prior knowledge of the checks that are being run.  Nessus (from Tenable) is a very good example of this sort of scanner; the plugins run by the scanner are text-based, providing instructions for the scanner.  These plugins are easy to open and read, and provide a great deal of information regarding how the checks are constructed and run.

Given all of the benefits derived from scanners in other disciplines within information security, it just stands to reason that digital forensic analysis would also benefit from a similar framework.

The forensic scanner is not intended to replace the analyst; rather, it is intended as a framework for documenting and retaining the institutional knowledge of all analysts on the team, and for removing the tedium of looking for the "low-hanging fruit" that likely exists in most, if not all, exams.

A number of commercially available forensic analysis applications (EnCase, ProDiscover) have scripting languages and scanner-like functionality; however, in most cases, this functionality is based on proprietary APIs and, in some cases, scripting languages (ProDiscover uses Perl as its scripting language, but the API for accessing the data is unique to the application).

A scanner framework is not meant to replace the use of commercial forensic analysis applications; rather, the scanner framework would augment and enhance the use of those applications, by providing an easy and efficient means for educating new analysts, as well as "sweeping up" the "low-hanging fruit", leaving the deeper analysis for the more experienced analysts.

This scanner framework would be based on easily available tools and techniques.  For example, the scanner would be designed to access acquired images mounted read-only via the operating system (Linux mount command) or via freely available applications (Windows - FTK Imager v3.0, ImDisk, vhd/vmdk, etc.); that way, the scanner can make use of currently available APIs (via Perl, Python, etc.) in order to access data within the acquired image, and do so in a "forensically sound manner" (i.e., not making any changes to the original data).
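On Linux, for instance, mounting a raw (dd) image read-only takes nothing more than the stock mount command. The offset below is an assumption (first partition starting at sector 63, common on Windows XP-era disks) and should be verified against the actual partition table with a tool such as mmls or fdisk:

```shell
# Mount a raw image read-only; offset = 63 sectors * 512 bytes/sector.
# Verify the actual partition offset first (e.g., mmls image.dd).
sudo mount -t ntfs -o ro,loop,offset=32256 image.dd /mnt/image

# ...run the scanner against /mnt/image...

sudo umount /mnt/image
```

The ro option is what keeps the access "forensically sound"; the scanner only ever sees a read-only view of the file system.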

The scanner is not intended to run in isolation; rather, it is intended to be used with other tools as part of an overall process.  The purpose of the scanner is to provide a means for the retention, efficient deployment, and proliferation of institutional digital forensic knowledge.

Some benefits of a forensic scanner framework such as this include, but are not limited to, the following:

1.  Knowledge Retention - None of us knows everything, and we all see new things during examinations.  When an analyst sees or discovers something new, a plugin can be written or updated.  Once this is done, that knowledge exists, regardless of the state of the analyst (she goes on vacation, leaves for another position, etc.).  Enforcing best practice documentation of the plugin ensures that as much knowledge as possible is retained along with the application, providing an excellent educational tool, as well as a ready means for adapting or improving the plugin.

2.  Establish a career progression - When new folks are brought aboard a team, they have to start somewhere.  In most cases, particularly with consulting organizations, skilled/experienced analysts are hired, but as the industry develops, this won't always be the case.  The forensic scanner provides an ancillary framework for developing "home grown" expertise where inexperienced analysts are hired.  Starting the new analysts off in a lab environment and having them begin learning the necessary procedures by acquiring and verifying media puts them in an excellent position to run the scanner.  For example, the analyst either goes on-site and conducts acquisition, or acquires media sent to the lab, and prepares the necessary documentation.  Then, they mount the acquired image and run the scanner, providing the more experienced analyst with the path to the acquired image and the report.

This framework also provides an objective means for personnel assessment; managers can easily track the plugins that are improved or developed by various analysts.

3.  Teamwork - In many environments, development of plugins likely will not occur in a vacuum or in isolation.  Plugins need to be reviewed, and can be improved based on the experience of other analysts.  For example, let's say an analyst runs across a Zeus infection and decides to write a plugin for the artifacts.  When the plugin is reviewed, another analyst mentions that Zeus will load differently based on the permissions of the user upon infection.  The plugin can then be documented and modified to include the additional conditions.

New plugins can be introduced and discussed during team meetings or through virtual conferences and collaboration, but regardless of the method, it introduces a very important aspect of forensic analysis...peer review.

4.  Ease of modification - One size does not fit all.  There are times when analysts will not be working with full images, but instead will only have access to selected files from systems.  A properly constructed framework will provide the means necessary for accessing and scanning these limited data sets, as well.  Also, reporting of the scanner can be modified according to the needs of the analyst, or organization.

5.  Flexibility - A scanner framework is not limited to just acquired images.  For example, F-Response provides a means of access to live, remote systems in a manner that is similar to an acquired image (i.e., much of the same API can be used, as with RegRipper), so the framework used to access images can also be used against systems accessed via F-Response.  As the images themselves would be mounted read-only in order to be scanned, Volume Shadow Copies could also be mounted and scanned using the same scanner and same plugins.

Another means of flexibility comes about through the use of "idle" resources.  What I mean by that is that many times, analysts working on-site or actively engaged in analysis may be extremely busy, so running the scanner and providing the output to another, off-site analyst who is not actively engaged frees up the on-site team and provides answers/solutions in a timely and efficient manner.  Or, data can be provided and the off-site analyst can write a plugin based on that data, and that plugin can be run against all other systems/images.  In these instances, entire images do not have to be sent to the off-site analyst, as this takes considerable time and can expose sensitive data.  Instead, only very specific data is sent, making for a much smaller data set (KB as opposed to GB).


Alex said...

Great post. RegRipper is an amazing tool, and this really highlights the value of its modularity.

Corey Harrell said...

Great post and this sheds some light on what you had in mind when you mentioned the forensic scanner in the Win4n6 group. Having the ability to scan a system (or image) could help locate artifacts of interest quicker or even provide answers to initial questions. RegRipper has become one of the initial tools I run and it has dramatically reduced the amount of time I need to perform registry analysis. Extending the tool by parsing other areas on a system is a great idea and I can see how this would be beneficial to different types of investigations.

It’s too bad that I won’t be able to make the trip to the conference, and will thus miss your presentation on it.

H. Carvey said...


I'm thinking about posting my slides...but the problem with that is that it won't show my demo, and I don't put everything in my slides (I don't read from them like a script)...they're just placeholders.

I'm hoping to get to PFIC 2011, as there's another opportunity.

H. Carvey said...

I guess my biggest concern is, does any of this make sense? I mean, does what I describe in the blog post seem useful to you?

Anonymous said...

Harlan, it's a great idea. I am looking forward to the release of the tool.

I am just curious how easy/difficult it is to create a new plugin?

H. Carvey said...

> I am just curious how easy/difficult it is to create a new plugin?

If you know Perl, it's pretty easy. The whole point of what I was referring to in the post is that once you mount the image as a volume, it becomes just as if it's part of the file system, so all of the usual Perl stuff works, without using any proprietary APIs.

For example, opening a file is as easy as using the Perl open() function. Need it opened to read it in binary mode? No problem...binmode(). Need to get a directory listing? It's all done through the native Perl API.
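To make that concrete, here is a minimal sketch of the kind of access described above, using only core Perl functions; the mount point and hive path are placeholders of my own choosing, not part of the actual tool:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first $n bytes of a file, opened read-only in binary mode
# via the standard open() and binmode() functions.
sub read_header {
    my ($path, $n) = @_;
    open(my $fh, '<', $path) or return undef;
    binmode($fh);
    read($fh, my $buf, $n);
    close($fh);
    return $buf;
}

# Get a directory listing through the native Perl API -- opendir(),
# readdir() -- with no proprietary calls involved.
sub list_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or return ();
    my @entries = grep { !/^\.\.?$/ } readdir($dh);
    closedir($dh);
    return @entries;
}

# Hypothetical usage against an image mounted read-only at /mnt/image:
my $mnt = $ARGV[0] || "/mnt/image";
if (-d $mnt) {
    my $hdr = read_header("$mnt/Windows/System32/config/SOFTWARE", 4);
    print "Hive signature: $hdr\n" if defined $hdr;  # "regf" for a hive
    print "$_\n" for list_dir($mnt);
}
```

Because the mounted image looks like any other part of the file system, these few lines are really all the "framework" a plugin author needs to know.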

I've already got several plugins working just fine, so there are already examples, just as I did with RegRipper. In fact, that's the real kicker here...this is just taking the idea of RegRipper and extending it from just single hive files to an entire image.
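A sketch of what such an engine loop might look like follows; the directory layout and the plugin convention (each plugin file's last expression is a code reference taking the mount point) are my assumptions for illustration, not a description of the actual tool:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical scanner engine: run every .pl plugin in $plugin_dir
# against the mount point of a read-only mounted image, collecting
# each plugin's findings into a report.
sub run_scanner {
    my ($mnt, $plugin_dir) = @_;
    opendir(my $dh, $plugin_dir) or die "No plugin dir $plugin_dir: $!";
    my @plugins = sort grep { /\.pl$/ } readdir($dh);
    closedir($dh);

    my @report;
    for my $name (@plugins) {
        my $path  = File::Spec->catfile($plugin_dir, $name);
        my $check = do $path;   # file's last expression: a code ref
        # One broken plugin should not stop the entire scan.
        my $out = eval { $check->($mnt) };
        push @report, $@ ? "$name: ERROR $@" : "$name: $out";
    }
    return @report;
}
```

The eval() around each plugin call is the important design choice: a malformed or failing check gets logged in the report rather than killing the scan of everything else.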

Other Perl modules can be added, as well. For example, let's say that you want to parse through IE index.dat files; simply use the Win32::UrlCache module.

H. Carvey said...

> I am looking forward to the release of the tool.

That's something I really haven't figured out if, how, or when I would do. This is a lot of work and development, to bring something like this into use by a more general population.

One thing I'd been thinking of is simply releasing the engine and showing the API, and then consulting for plugin development.

We'll have to see how it goes, and what kind of interest there is in this...

Kalyan said...

Any information on when the engine will be released to the public?

H. Carvey said...


See my last comment just above yours...I'd appreciate your thoughts on how this might be done.

Corey Harrell said...

> I guess my biggest concern is, does any of this make sense? I mean, does what I describe in the blog post seem useful to you?

I’ve been using vulnerability scanners for years so I understand some of their features. The scanners verify the existence of vulnerabilities (missing patches, configuration errors) through the use of plugins or verifying the security configuration of a system using baselines. Extending this ability to digital forensics makes a lot of sense and I agree with the benefits you outlined. Another benefit I see is being able to reduce the amount of data thus allowing you to find your answers quicker.

Let’s take the example of determining the files accessed by a person. Having a framework to scan a system would be very useful. You could enable the plugins to parse certain registry keys in the ntuser.dat hive while at the same time performing checks on the system such as the contents of the Recent folder. On top of this, the tool could be used to scan through the VSCs which would further show the users’ activity. This would quickly reduce the amount of data in an image I would have to go through and allow me to focus on the activity of interest. This is just one example but the same concept applies to other types of investigations including malware infections, fraud cases, policy violations, etc…

I would find a lot of use in having this ability in a tool.

Anonymous said...

This idea makes a lot of sense - I've recently started work on something similar for our IR team, taking a standard set of triage data from tools (MIR and EnCase Enterprise, mostly, but extensible) and having modules to compartmentalize the analysis of each type of data (Registry, event logs, prefetch, file system lists, etc.). The framework will produce a list of actionable ("known malware match") and interesting findings ("here are the scheduled jobs..."), and it will incorporate both so-called best practices ("look at data x, y, z") and specific known indicators of compromise (hashes, file lists, etc.).

I'm excited to see you, and I presume Cory, taking on something similar, turning it into a community effort.


H. Carvey said...


Thanks for the comment, and for letting me know that others are thinking about this, as well as putting effort toward something like this.

Because of the flexibility, I'm taking a similar approach as you are, in that some plugins will look for specific things (indications of a Zeus infection) while others will be more general (list event ID 7035 event records with user instead of System SIDs...).

> I'm excited to see you, and I presume Cory, taking on something similar, turning it into a community effort.

Not sure why you think Cory's involved in this (although that would be awesome!). Also, I'm not sure how this would be a "community effort", as RegRipper seems to have some use, but there are very, very few who've written their own plugins, and even fewer who've provided them to the community.

Anonymous said...

My mistake - I misread the beginning where you said you'd be presenting "along with".

H. Carvey said...

Cory's doing his own presentation...we're just presenting at the same conference...

Girl, Unallocated said...

I wish I could attend the conference. I am a fan (is that the right term here?) and really appreciate all your work. Like others have mentioned, I use RegRipper and it has proved to be one of the more useful tools in my arsenal. You have inspired this non-programmer to seriously dive into Perl scripting.

H. Carvey said...

Thanks. Too bad about the conference...not that I'd know who you are or recognize you if you did show up... ;-)

Hans Heins said...

Great idea.

It will be important that all the scripts dealing with different artifacts are well documented, and maybe it will be necessary to have online documentation centers for the different kinds of artifacts. I think that will help in developing a state-of-the-art 'forensic analysis factory'.

Hans Heins

H. Carvey said...


While I fully agree, that's really up to the author.

In many of my RegRipper plugins, I write what the plugin does, with a bit more detail than the short description available via the plugin API itself. I also include references, particularly to applicable MS KB articles.