Thursday, July 11, 2013

Programming and DFIR

I was browsing through an online list recently and I came across an older post that I'd written, that had to do with tools.  In it, I'd made the statement, "Tweaked my browser history parser to add other available data to the events, giving me additional context."  This brought to mind just how valuable even the smallest modicum of programming skill can be to an analyst.

This statement takes understanding data structures a step further because we're not simply recognizing that, say, a particular data structure contains a time stamp.  In this case, we're modifying code to meet the needs of a specific task.  However, simply understanding basic programming principles can be a very valuable skill for DFIR work, in general, as the foundational concepts behind programming teach us a lot about scoping, and programming in practice allows us to move into task automation and eventually code customization.
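As a rough illustration of what "recognizing that a data structure contains a time stamp" can look like in code, here is a minimal Python sketch. It assumes the time stamp is a 64-bit Windows FILETIME (a count of 100-nanosecond intervals since January 1, 1601 UTC) stored little-endian. The function name `filetime_to_datetime` is made up for this example and is not taken from any particular tool.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME values count 100-nanosecond intervals from this date (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw: bytes) -> datetime:
    """Convert 8 raw little-endian bytes holding a FILETIME into a datetime.

    Hypothetical helper for illustration only.
    """
    if len(raw) != 8:
        raise ValueError("a FILETIME is exactly 8 bytes")
    # "<Q" = little-endian unsigned 64-bit integer
    (ticks,) = struct.unpack("<Q", raw)
    # 10 ticks of 100 ns each make one microsecond
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
```

Once the eight bytes are located in a structure, something like `filetime_to_datetime(data[offset:offset + 8])` turns them into a readable value, where `data` and `offset` depend entirely on the structure being parsed.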

David Cowen has been doing very well on his own blog-a-day-for-a-year challenge, and recently posted a blog regarding some DFIR analyst milestones that he outlined. In this post, David mentions that milestone 11 includes "basic programming".  This could include batch file programming, which is still alive and well, and extremely valuable...just ask Corey Harrell.  Corey's done some great things, such as automating the exploitation of VSCs, through batch files.

My programming background goes back to the early '80s, programming BASIC on the Timex-Sinclair 1000 and Apple IIe.  In high school, I learned some basic Pascal on the TRS-80, and then in college, moved on to BASIC on the same platform.  Then in graduate school, I picked up some C (one course), some M68K assembly, and a LOT of Java and MatLab, to the point that I used both in my thesis.  This may seem like a lot, but none of it was really very extensive.  For example, when I was programming BASIC in college, my programs included one that displayed the Punisher skull on the screen and played the "Peter Gunn theme" in the background, and another one that interfaced with a temperature sensor to display fluctuations on the screen.  In graduate school, the C programming course required as part of the MSEE curriculum really didn't have us do much more than open, write to or read from, and then close a file.  Some of the MatLab stuff was a bit more extensive, as we used it in linear algebra, digital signal processing and neural network courses.  But we weren't doing DFIR work, nor anything close to it.

The result of this is not that I became an expert programmer...rather, take a look at something David had said in a recent blog post, specifically that an understanding of programming helps you put your goals into perspective and reduce the scope of the problem you are trying to solve.  This is the single most valuable aspect of programming experience...being able to look at the goals of a case, and break them down into compartmentalized, achievable tasks.  Far too many times, I have seen analysts simply overwhelmed by goals such as, "Find all bad stuff", and even when going back to the customer to get clarification as to what the goals of the case should be, they still are unable to compartmentalize the tasks necessary to complete the examination.

Task Automation
There's a lot that we do that is repetitive...not just in a single case, but if you really sit down and think about the things you do during a typical exam, I'm sure that you'll come across tasks that you perform over and over again.  One of the questions I've heard at conferences, as well as while conducting training courses, is, "How do I fully exploit VSCs?"  My response to that is usually, "What do you want to do?"  If your goal is to run all the tools that you ran against the base portion of the image against the available VSCs, then you should consider taking a look at what Corey did.  As far as I can see, and from my experience, batch scripting such as this is still one of the most effective means of automating these tasks, and there is a LOT of information and sample code freely available on the Interwebs for automating an almost infinite number of tasks.

If batch scripting doesn't provide the necessary flexibility, there are scripting languages (Python, Perl) that might be more suitable, and there are a number of folks in the DFIR community with varying levels of experience using these languages, so don't be afraid to reach out for assistance.
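To make the idea of "run the same tools against every VSC" concrete, here is a minimal Python sketch of that kind of loop. It is not Corey's batch script, and everything specific in it is an assumption for illustration: the mount points, the tool names (`toolA.exe`, `toolB.exe`) and their arguments are all hypothetical. It defaults to a dry run that only prints what would be executed.

```python
import subprocess


def build_commands(mount_points, tool_templates):
    """Pair every mounted VSC with every tool.

    Each template is a list of arguments in which "{root}" is replaced
    by the mount point of the VSC being processed.
    """
    commands = []
    for mount in mount_points:
        for template in tool_templates:
            commands.append([part.format(root=mount) for part in template])
    return commands


def run_all(commands, dry_run=True):
    """Run each command in turn; with dry_run=True, only print it."""
    results = []
    for argv in commands:
        if dry_run:
            print("would run:", " ".join(argv))
            results.append(None)
        else:
            completed = subprocess.run(argv, capture_output=True, text=True)
            results.append(completed.returncode)
    return results


if __name__ == "__main__":
    # Hypothetical mount points and tools; substitute your own.
    vscs = ["C:\\vsc\\vsc1", "C:\\vsc\\vsc2"]
    tools = [
        ["toolA.exe", "{root}\\Users"],
        ["toolB.exe", "--in", "{root}"],
    ]
    run_all(build_commands(vscs, tools))
```

The point of separating `build_commands` from `run_all` is that you can review the full list of commands before anything touches the evidence, which is the same "break it down into discrete tasks" habit described above.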

Code Customization
There's a good deal of open source code out there that allows us to do the things we do.  In other cases, a tool that we use may not be open source, but we do have open source code that allows us to manipulate the output of the tool into a format that is more useful, and more easily incorporated into our analysis process.  Going back to the intro paragraph to this post, sometimes we may need to tweak some code, even if it's to simply change one small portion of the output from a decimal to hex when displaying a number.  Understanding some basic coding lets us not only be able to see what a tool is doing, but it also allows us to adjust that code when necessary.
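As a small, hedged example of that kind of tweak, the Python sketch below shows a one-line change that switches a value from decimal to hex in a tool's output. The function name `format_value` and the field name "Flags" are invented for illustration and do not come from any specific tool.

```python
def format_value(name, value, as_hex=False):
    """Format one line of output for a numeric field.

    Hypothetical helper: as_hex=False keeps the original decimal display,
    as_hex=True shows the same number as zero-padded, 8-digit hex.
    """
    if as_hex:
        return f"{name}: 0x{value:08X}"
    return f"{name}: {value}"


if __name__ == "__main__":
    print(format_value("Flags", 4096))               # Flags: 4096
    print(format_value("Flags", 4096, as_hex=True))  # Flags: 0x00001000
```

The change itself is tiny, but being able to find the line that prints the value, and knowing what a format specifier does, is what makes the tweak possible.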

Being able to customize code as needed also means that we can complete our analysis tasks in a much more thorough and timely manner.  After all, for "heavy lifting", or highly repetitive tasks, why not let the computer do most of the work?  Computers are really good at doing the same thing, over and over again, so why not take advantage of that?

While there is no requirement within the DFIR community (at large) to be able to write code, programming principles can go a long way toward developing our individual skills, as well as developing each of us into better analysts.  My advice to you is:

Don't be overwhelmed when you see code...try opening the code in a text viewer and just reading it.  Sure, you may not understand Perl or C or Python, but most times, you don't need to understand the actual code to figure out what it's doing.

Don't be afraid to reach out for help and ask a question.  Have a question about some code?  Reach out to the author.  Many times, folks crowdsource their questions, reaching to the "community" as a whole, and that may work for some.  However, I've had much better success by reaching directly to the coder...I can usually find their contact info in the headers of the code they wrote.  Who better to answer a question about some code than the person who wrote it?

Don't be afraid to ask for assistance in writing or modifying code.  From the very beginning (circa 2008), I've had a standing offer to modify RegRipper plugins or create custom plugins...all you gotta do is ask (provide a concise description of what's needed, and perhaps some sample data...).  That's it.  I've found that in most cases, getting an update/modification is as simple as asking.

Make the effort to learn some basic coding, even if it's batch scripting.  Program flow control structures are pretty consistent...a for loop is a for loop.  Just understanding programming can be so much more valuable than simply allowing you to write a program.


Gorecki said...


I’m a person who is looking from the outside in, meaning, I’m *just* beginning to explore the computer forensics world. I have nearly 30 years of development and networking background, largely Windows based (as early as Windows 3.0). What I do every day within a govt facility is extract information from thousands of users and their systems, assuring/enforcing security compliance and looking for bad stuff, much of which strongly resembles things an examiner/analyst does, I think?

The reason I’m posting is I find it interesting that you seem to focus on the use of languages that aren’t indigenous to the Windows platform but your focus is Windows. Granted on a disk/image that isn’t the running system I can see that as I’ve found myself mounting windows partitions on Linux boxes to find/recover information as the tools available out of the box are very robust.

For example, built into pretty much every remotely modern system is Windows Scripting Host, which allows the use of vbscript/jscript and affords massive amounts of WMI (Windows Management Instrumentation) classes to dig very deeply into a system, and these are relatively simple languages to learn. Powershell, which is growing in use and extremely powerful, leads me to my favorite for years now (since Beta 1), the .NET Framework. As a developer I get huge amounts of mileage out of it, as it affords some relatively easy coding as well as allowing a developer to get as deep in the weeds as they can manage. Every system that has any version of the .NET Framework Runtime on it already has a command line compiler sitting there ready to use.

So leading back to my original statement, I’m very new to this entire field in scope and trying to figure out what’s what, and I would like to understand if there is a particular reason for your preferences? In windows I write/use Windows stuff, in Linux I write/use Linux stuff, but that’s me.

Hopefully it’s clear I’m trying to figure out the lay of the land as I’m just trying to figure out if this field has something to offer me. So far my research has been relative to walking around a dark room looking for a light switch. ;)


Anonymous said...

Harlan, for a non-programmer, can you recommend a book or on-line resource to get started in batch file scripting? Something along the lines that can teach the individual to get to a place where Corey is with his recent posts about auto_rip. I understand the importance of what you are saying and would like some guidance in the correct direction. Who better to ask other than Mr. Reg Ripper!
Thank you for your time.

H. Carvey said...


...recommend a book or on-line resource...

In all seriousness, Google. Start by considering what it is that you want to do...what are your goals? What do you want to achieve?

At the beginning of the post, I mentioned that one of the benefits of understanding programming is that you learn to break things down into discrete tasks. Start that way...what do you want to do?

From there, search for what you need.


H. Carvey said...


I originally taught myself Perl because I was working as an infosec consultant, and we had a lull in work, but the networking guys had a lot of work, and were looking for someone to code Perl.

As time passed and I began doing more work specific to Windows, I found that there were a number of libraries that afforded me access to a wide range of artifacts and structures on live Windows systems.

As you can see from some of my books, including "Perl Scripting for Windows Security", I'm adept at WMI coding, albeit via Perl.

I've stuck with Perl simply because it works for me. In the past, I started down the road of expanding my horizons to learn Python, but those efforts were overcome by events...but now, I'm back on that path.

In short, my response is I simply use what works for me.

Gorecki said...


In short, my response is I simply use what works for me.

Ah, clear as a bell!

If you have the time, look for a C/C++ beginning class/tutorial. The C language is what the majority of languages' syntax structure is based on. Java, Perl, PHP, javascript, C#, the list goes on and on. Once you have a grasp of the foundation it's relatively easy to transition to different languages.

Anonymous said...

While programming knowledge is useful to make tools or scripts, I would like to suggest that the benefit for the CF analyst is more fundamental.

Programming is error-prone, and it takes special effort to identify and eradicate or mitigate those errors. (Weinberg's Psychology of Computer Programming is one useful source for non-technical information on this topic.) This affects code written by an intruder, by the computer-forensic analyst himself *as*well*as* the programmer who wrote the tools this CFA is using to do his job. (This includes libraries and other code APIs.) Add to that the experience of bug fixing, and fixing bug fixes -- and with that the knowledge that software X v. 2.1 may easily do things differently than X v. 2.2
A sound knowledge of programming and the difficulties and realities associated with it brings healthy scepticism into the life of every CFA.

Further experience will probably help get past the endianness barrier to data understanding, as well as give familiarity with systems APIs: to know where to go for information about what a computer program can (legitimately) do on the system. This affects timestamp behaviour, user and resource access rights, and much else that is of forensic importance.

Corey Harrell said...


Outstanding post and I couldn't agree more. I don't come from a programming background since networking was my focus. My initial steps into scripting were when I needed to automate a task. It took a little bit of work but since that time I think I grew so much professionally due to learning how to script.


I started with batch scripting since it's extremely easy and met my needs. Here are a few references

Ken Pryor said...

Another great post, Harlan. I did a small amount of BASIC programming and wrote quite a few batch files back in the DOS 5-6 days, but nothing in recent times. I just seem to have a problem wrapping my head around some of the concepts. I do intend to try learning Python again sometime, but I have too many other things on my plate at the moment.