Tuesday, February 28, 2012

More Win

A good friend of mine reached out to me last week, in order to ask for my assistance with something.  They had an issue that they were working on, and it turned out that Facebook chat artifacts played an important role in that issue.  Fortunately, I had taken a personal day, and had some time to devote to this interesting issue; interesting because I'd never had to parse FB chat messages before.

So, armed with a description of what my friend was looking at, and what they wanted to see with respect to the output, I started coding.  I knew that Facebook chat messages are saved in JSON format, which is easily parsed via Perl.  I did have a little bit of an issue parsing the time stamp associated with the chat messages, so I reached out to Andrew Case, who provided some valuable insight and resources.  Apparently, the chat time stamps are in Unix epoch format, with milliseconds.  The resulting 10.3 floating point format (ten digits followed by a decimal and three more digits) is multiplied by 1000 to get a 13 digit string.  As such, converting the time stamp into something readable was pretty straightforward: divide by 1000 to get back to seconds since the epoch, and format that as a date.
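For anyone who wants to see the conversion spelled out, here is a minimal sketch.  It is in Python rather than the Perl used in fb.pl, and it assumes the 13 digit value is a count of milliseconds since the Unix epoch, in UTC.  The function name fb_time_to_utc is made up for illustration; the test value is the "time" field from the redacted sample in the comments.

```python
from datetime import datetime, timedelta, timezone

# Start of the Unix epoch, as a timezone-aware UTC value.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def fb_time_to_utc(ms):
    """Convert a 13 digit millisecond epoch value to a UTC datetime.

    Assumption: the value counts milliseconds since 1970-01-01 00:00:00 UTC.
    """
    # Adding integer milliseconds avoids the rounding that can creep in
    # when dividing by 1000 and going through a float.
    return EPOCH + timedelta(milliseconds=int(ms))


if __name__ == "__main__":
    # "time" value taken from the redacted sample record.
    print(fb_time_to_utc(1271214259989).isoformat(timespec="milliseconds"))
```

Run as-is, this prints 2010-04-14T03:04:19.989+00:00.  Whether to report UTC or convert to a local time zone depends on the case at hand.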

I had an initial script working pretty quickly, and got that script out the door to my friend in just over an hour...I then cleaned up the code a bit, added some comments and documentation, and had a cleaner version sent off about 90 minutes after I started writing code.  The script (fb.pl) is posted here.

The script was run against over 1400 Facebook chat messages that had been exported via EnCase.  The script accessed each file, with each file containing one chat message, and parsed out the pertinent data into CSV format.  The results were available in minutes...or more accurately, "about a minute".  How long would it have taken to do this by hand?  More importantly, how boring would that have been, and how much else could you have accomplished instead of doing all of that by hand?
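The overall approach (one file per message, in; one CSV row per message, out) can be sketched as follows.  This is not fb.pl; it is a Python illustration built on assumptions: that each exported file holds a single record shaped like the redacted sample in the comments, that the "for (;;);" string at the front has to be stripped before the rest will parse as JSON, and that the unredacted files are otherwise valid JSON.  The function names and CSV columns are my own choices.

```python
import csv
import json
import sys
from datetime import datetime, timedelta, timezone
from pathlib import Path

PREFIX = "for (;;);"  # string seen at the start of the sample record
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
FIELDS = ["file", "time_utc", "from_name", "to_name", "text"]  # assumed columns


def parse_chat(raw):
    """Return one dict per message found in a single exported record."""
    raw = raw.strip()
    if raw.startswith(PREFIX):
        raw = raw[len(PREFIX):]
    record = json.loads(raw)
    rows = []
    for entry in record.get("ms", []):
        msg = entry.get("msg", {})
        stamp = msg.get("time")
        when = ""
        if stamp is not None:
            # Time stamps are treated as milliseconds since the Unix epoch.
            when = (EPOCH + timedelta(milliseconds=int(stamp))).isoformat(
                timespec="milliseconds")
        rows.append({
            "time_utc": when,
            "from_name": entry.get("from_name", ""),
            "to_name": entry.get("to_name", ""),
            "text": msg.get("text", ""),
        })
    return rows


def export_csv(src_dir, out):
    """Parse every file in src_dir, write CSV to out, return the row count."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for path in sorted(Path(src_dir).iterdir()):
        if not path.is_file():
            continue
        try:
            rows = parse_chat(path.read_text(encoding="utf-8", errors="replace"))
        except (ValueError, TypeError, AttributeError) as err:
            # Report files that do not parse and keep going.
            print(f"skipped {path.name}: {err}", file=sys.stderr)
            continue
        for row in rows:
            row["file"] = path.name
            writer.writerow(row)
            count += 1
    return count


if __name__ == "__main__" and len(sys.argv) > 1:
    export_csv(sys.argv[1], sys.stdout)
```

Pointing it at a directory of exported files and redirecting the output to a .csv file gives one row per message.  Files that fail to parse are reported on stderr rather than silently dropped, so they can be reviewed by hand.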

So how is this useful?  Well, for me, I had an opportunity to work on something interesting, that helped someone else out.  I didn't have to view all of the data...nor did I want to, to be honest.  I just saw enough to allow me to write code that would parse the data.  And I have a neat little piece of code that I can reuse in the future.

My friend got to work on something else while I was writing the code, and once they had it, what could have been hours of tedious, boring manual work was done in "about a minute".  And they have code that they can reuse when similar issues pop up in the future.

Lessons
If you're faced with something that's a lot of tedious, manual work, it's likely to be boring and result in mistakes...so automate it if you can.  Computers are really good at doing stuff...particularly boring, repetitive stuff...really fast.  People aren't.  Oh, and the code can be reused later.

If you don't know how to automate something, or if it would take you a while to figure it out, try reaching out to another analyst.  You'd be surprised at what some folks have already done, or what they're willing to do.

Get more done faster...by asking for assistance, you can offload something; in this case, that 'something' was developing an automated approach to processing large (potentially massive) amounts of data.  Getting someone's help means that you can often focus on something else while they work on that problem.

Banging your head against the wall on a problem, or "chewing on it" in an attempt to force that problem into submission, is a really good way to waste a lot of time.  Sure, you get the joy of knowing you accomplished something yourself, but what's the likelihood that another analyst, perhaps even one you know, has already gone down that road, or something close to it?  Or maybe has an idea that helps you get from point A to point B faster?  After all, what do you have to lose?

Hopefully what this means is that at some point in the near future, my friend and I will be able to sit down and share a moment over a nice micro-brew.

By the way, in case you were wondering...the image I included in this blog post has absolutely nothing to do with anything...I just thought it was funny!  ;-)

6 comments:

Jimmy_Weg said...

It would be interesting to see the format of the EnCase output. Evidently, it wasn't terribly user friendly, though I know of no forensic "suite" that handles Facebook chats, per se. You can carve them into virtual files, but that doesn't clean them of code or interpret dates at all (including presenting the correct date), nor does it correctly parse out message direction.

On the other hand, Internet Evidence Finder handles them exceptionally well, though it's rather expensive. A script in 90 minutes? I'm impressed!

Anonymous said...

Harlan,

Nice insight into the power of automation on tedious tasks. I liked the fact that you showed that if you don't know how to do something, there will be someone else in the community who can provide valuable information and resources. Sometimes we forget that there are numerous collaborators in the community that can help us with difficult tasks.

Andrew Case said...

"If you're faced with something that's a lot of a tedious, manual work, it's likely to be boring and result in mistakes..."

The large amount of time is definitely mind numbing, but the mistakes are worse. One of the main reasons I try to automate everything, from analysis/parsing to a report of some type, is simply to remove the chance of human error, as it's so easy to copy/paste wrong or manually enter something wrong from a tool that doesn't have export capabilities.

H. Carvey said...

Jimmy,

The output looks like (redacted):

for (;;);{"t":"msg","c":"p_","ms":[{"type":"msg","msg":{"text":
"","time":1271214259989,"clientTime":1271214259501,"msgID":""},
"from":,"to":,"from_name":"",
"to_name":"","from_first_name":"","to_first_name":""}]}

Anonymous said...

Did your friend try IEF v5 first? I've had a lot of success w/ it. www.jadsoftware.com

Still interesting and cool to do on your own instead of using COTS. For those who may be inspired to attempt this, what's a good place to start learning how to attack these problems with Perl?

Anonymous said...

http://www.howtogeek.com/102420/geeks-versus-non-geeks-when-doing-repetitive-tasks-funny-chart/