I ran across an interesting post yesterday on the Offensive Computing blog about YARA, a malware ID and classification framework. Interestingly enough, it ships for Linux, Windows, and as a version you can run via Python. The user manual that's available with YARA is short enough to be a quick read, and clear enough to give you a pretty good idea of how to get started using it.
Is anyone using it? How are you using it? The interesting thing is that YARA seems to use almost Snort-like rules for classifying files, which intuitively leads to some pretty incredible flexibility. For example, perhaps with some more detail as to where to look in files for some of these "signatures" and what different values can mean, YARA looks like a very good way to get one step ahead of AV products, even though we'd still be one step behind the malware itself. What a lot of folks are seeing these days is that their commercial grade AV products are capable of protecting themselves from variants A through F of a particular piece of malware, but then they get hit with variant G, or AB.
While YARA won't quarantine or delete the malware, it will help you classify it.
Another means of using YARA would be in conjunction with some of the new modules for Volatility that allow you to extract executable images from memory dumps. Extract the image, create a signature, and share it. You never know who else is using Volatility (or any of the other memory analysis tools) that may run across something similar.
Finally, I thought about perhaps turning YARA around and using it as a means of going beyond file hashes. It's very hard to keep up with the latest versions of file hashes, particularly when so many things can change when MS releases patches. Using YARA, perhaps we can extend file signature analysis, and use this to perform data reduction...instead of asking for all "bad files" and relying on a perhaps incomplete list of rules, we could ask for all "good files" and then look at what's left over...
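To make the "look at what's left over" idea a bit more concrete, here's a minimal sketch in plain Python. It does not use YARA itself; the rule table, the `has_wide` helper, and the `leftovers` function are hypothetical stand-ins for a real rule set, and the single "known good" rule (an MZ header plus version-info strings) is only for illustration, since strings like these are trivial to forge.

```python
from typing import Callable, Dict, List

# A hypothetical "rule" is just a predicate over a file's raw bytes.
Rule = Callable[[bytes], bool]


def has_wide(data: bytes, text: str) -> bool:
    """True if `text` appears as a UTF-16LE string (the encoding used in PE version resources)."""
    return text.encode("utf-16-le") in data


# Illustrative "known good" rules only; a real set would combine many more checks.
GOOD_RULES: Dict[str, Rule] = {
    "ms_version_info": lambda d: (
        d[:2] == b"MZ"
        and has_wide(d, "CompanyName")
        and has_wide(d, "Microsoft Corporation")
    ),
}


def leftovers(files: Dict[str, bytes], rules: Dict[str, Rule] = GOOD_RULES) -> List[str]:
    """Return the names of files that match none of the 'known good' rules."""
    return sorted(
        name
        for name, data in files.items()
        if not any(rule(data) for rule in rules.values())
    )
```

Feed it a mapping of file names to contents and whatever comes back is the reduced set that still needs a closer look.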
Thoughts?
Addendum: It looks like Jamie over at Mandiant is going to be doing something similar to YARA with Memoryze, by adding the ability to use Snort rules to detect malware in memory. Speaking of malware analysis, check out ZeroWine...this looks REALLY cool!
Harlan,
Are you reading my email again?
I agree with you on using Snort-like rules for classification. We need faster ways to filter/classify all the data.
Peter has been doing some work in this area for several months. Our first goal is to classify something in memory as good or bad. To do this, we are leveraging what people already know how to use. For example, when classifying certain parts of a file we use PEiD's public database. When it comes to memory, we are going to release a tool at Blackhat DC to use Snort signatures as a filter for strings in memory. You can read about that here: http://blog.mandiant.com/archives/133
Now I am going to go change all my passwords. ;-)
How would you determine known good programs using YARA? I'd be interested in knowing if an exe is packed, considering how common it is for malware, while at the same time it doesn't seem to be too common for legitimate software.
How would you determine known good programs using YARA?
The same way you would to find the bad stuff...write rules.
I'd be interested in knowing if an exe is packed, considering how common it is for malware, while at the same time it doesn't seem to be too common for legitimate software.
That's pretty easy, in a number of ways. This is described in Chapter 6, Executable File Analysis, of my book, "Windows Forensic Analysis".
Jamie,
The only issue I see with this type of approach is, who writes and maintains rules? How do you know how valid the rules are? I'm not saying that there should be one main repository...I'm saying that this could become an issue where there are rules posted that do not have...shall we say, the "rigor" put into their development and they're ineffective. Ineffective tools can quickly end up gathering dust on the shelf.
In a lot of ways, I see this going something like this...some consultant company has a handful of consultants that use the tool, and one or two folks who maintain the signatures. They stay plugged into the community so that some information is shared but no one shares 100%. Every now and then, someone may see what tools the consultant is using, think, "oh, cool" and give it a shot themselves...but beyond that, I'm afraid that a lot of very useful tools end up not being deployed where they need to be due to that initial knowledge hurdle.
We can work to make it NOT happen, though...
Yes Harlan, but exactly what kind of rules? There are obviously far more legitimate files than malicious, so I would think you'd have your work cut out for you. That is unless there was something unique to good files. The only thing I can think of is searching for files with strings like:
CompanyName
Microsoft Corporation
Is that what you had in mind, or is there some other efficient (and perhaps more reliable) way to detect known good files with YARA? Or are you going to save that for your WFA2. :)
Anonymous,
Can I call you "Anonymous"?
There's really so much more to it than that. If you understand the PE file format, there are other things, too. What about Entry Point Analysis, such as what's done w/ PeID, as Jamie mentioned?
The more checks you have, the fewer false positives you have. Sure, you can search for compressed EXEs using section names like UPX0 and UPX1. But you can also weed out legitimate code using things like entry point analysis, etc.
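As a rough illustration of the section-name check, here's a short Python sketch that walks the PE headers by hand and looks for UPX0 and UPX1. The function names are my own for this example, it's only one check among the several you'd want, and a packer can rename its sections, so treat a miss as "not proven" rather than "not packed".

```python
import struct
from typing import List


def pe_section_names(data: bytes) -> List[str]:
    """Return the section names of a PE image, or [] if it doesn't look like one."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return []
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\x00\x00" or len(data) < pe_off + 24:
        return []
    # COFF header follows the signature: NumberOfSections at +6,
    # SizeOfOptionalHeader at +20 (both relative to the signature).
    (num_sections,) = struct.unpack_from("<H", data, pe_off + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_off + 20)
    table = pe_off + 24 + opt_size  # section table starts after the optional header
    names = []
    for i in range(num_sections):
        raw = data[table + 40 * i: table + 40 * i + 8]  # 40-byte entries, 8-byte name
        if len(raw) < 8:
            break  # truncated section table
        names.append(raw.rstrip(b"\x00").decode("ascii", "replace"))
    return names


def looks_upx_packed(data: bytes) -> bool:
    """True if both default UPX section names are present."""
    return {"UPX0", "UPX1"} <= set(pe_section_names(data))
```

Calling `looks_upx_packed(open(path, "rb").read())` on a file gives you one data point to weigh alongside entry point analysis and whatever other checks you have.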
I'm glad to see some discussion about YARA here :)
Regarding the topic of packer detection, one thing that I have in mind is to port PEiD's public database to YARA. It would be very easy to do, let's see if I manage to make some time.