I've discussed LNK files in this blog a number of times over the years, mostly focusing on the file format. In one instance (from 5 years ago), the focus was on parsers, and specifically about determining devices that had been connected to the system, as when it comes to illicit images/videos, such things can determine the difference between "possession" and "production". This sort of information can also be used to address such topics as data exfiltration.
Last year, I started looking at LNK files (here, and then shortly thereafter, here...) from a different perspective; that of LNK files being sent as email attachments, or somehow being delivered to a "victim".
So, what? Why does this matter? Most users of Windows systems are very familiar with LNK files, or "Windows shortcuts", as they pertain to user actions on a system, or software installations. Both of these generally result in an LNK file being created. And if you follow malware write-ups or threat intel blog posts, you'll very likely see mention of some malware variant or targeted adversary creating an LNK file in a user's StartUp folder as a means of remaining persistent on the system. These LNK files are usually created on the compromised system, through the use of the local API, so other than their location, there's nothing too terribly interesting about them.
However, when an adversary sends a Windows shortcut file as an email attachment (or through some other mechanism) that LNK file contains information about their (the adversary's) system. The folks at JPCERT/CC mentioned this about a year and half ago (the folks at NViso Labs said something similar, as well); if the LNK file was created (via the MS API) on the adversary's system, then it likely contains the NetBIOS name of the system, the MAC address, the volume serial number, and the code page number (although in the research I've done so far, I have yet to find an LNK file with the code page number populated). While anyone of these values may be seen as trivial, all of them together can be employed to great effect by researchers. For example, using the NetBIOS name, MAC address, and VSN (such as illustrated in this blog post), I wrote a Yara rule that I deployed as a VirusTotal retro-hunt, in order to look for other examples; I ended up with several dozen responses in very short order. An example of such a rule is the "shellphish" Yara rule illustrated in this blog post. In the case of the "shellphish" rule, the LNK file that I initially looked at also included a SID in the Property Store Data Block, and including it in the retro-hunt served to minimize false positives (i.e., hits on files that contained the other byte sequences, but were not what I was looking for).
Using an LNK file parser available online, I parsed a LNK file with an embedded, base64-encoded Powershell command, and saw the following:
[Distributed Link Tracker Properties]
NetBIOS name: RAA6AFwAYQBhAC4Aã\ J G
Droid volume identifier: 69426764-4841-414d-0000-000000000032
Droid file identifier: e09ff242-e6d1-9945-a75d-f512f84510dc
Birth droid volume identifier: 8e3691fc-e7be-bc11-6200-1fe2f6735632
Birth droid file identifier: e09ff242-e6d1-9945-a75d-f512f84510dc
MAC address: f5:12:f8:45:10:dc
UUID timestamp: 03/17/3700 (13:54:42.551) [UTC]
UUID sequence number: 10077
The file was moved to a new volume.
Okay, so, this is very different. Usually, the "NetBIOS name" is 16 bytes, which is what my own parser looks for; however, in this case, what is the "machine ID" consumes a good bit more than 16 bytes, as illustrated in figure 1.
|Fig 1: Hex dump from LNK file; Tracker Data Block|
In this case, as you can see in figure 1, the "machine ID" is 31 bytes in size, with the first 24 bytes being "RAA6AFwAYQBhAC4AdgBiAHM", and the last 7 being all zeros. As the size of the Tracker Data Block (4 bytes at 0xca3) remains consistent at 96 bytes, the larger-than-expected machine ID presents a parsing problem. The "droids" start at offset 0xcd2, and are identical, so due to the larger machine ID, the Tracker Data Block itself "ends" part way through that section of data, at offset 0xd02. This leaves the remaining 15 bytes of the final droid (which ends at 0xd11) unparsed...notice how the math works out? The machine ID is 15 bytes larger than it should be, and as such, everything is "off" (per the MS file format documentation) by 15 bytes.
Normally, my parser returns output that appears as follows:
Machine ID : admin-pc
New Droid ID Time : Tue Sep 19 12:57:20 2017 UTC
New Droid ID Seq Num : 11190
New Droid Node ID : 08:00:27:3a:ce:83
Birth Droid ID Time : Tue Sep 19 12:57:20 2017 UTC
Birth Droid ID Seq Num: 11190
Birth Droid Node ID : 08:00:27:3a:ce:83
The above output is from an LNK file that fully meets the file format specification. During my research into some of these malicious LNK files, I added the ability to provide debugging information so that anomalies would be more apparent, rather than "hidden" behind the parsing process.
From the LNK file from which the hex dump in figure 1 was extracted, we can still extract and manually parse the droids; as the ObjectIDs are based on the UUID version 1 format (discussed in this blog post), we can use this information to our advantage in our research, extracting the node IDs (MAC addresses), etc. Interestingly enough, though, the larger machine ID doesn't seem to have an effect on the command embedded within the LNK file; having sections of the file format that do not comply with the specification doesn't seem to affect the performance of the LNK file.
I began wondering if the larger machine ID field might be a "toolmark" of some kind, an artifact of whatever process or application that had been used to create the LNK file. I also began looking at a range (I hesitate to qualify the range with "wide"...) of LNK files found online, particularly those that ran external commands (encoded Powershell commands, bitsadmin commands, etc.), as well as looking for tools, applications, or processes that would allow someone to create arbitrary, malicious LNK files. At this point in time, I haven't yet found one that produces "toolmarks" similar to what's seen in figure 1.
The machine ID/NetBIOS name, user SID, VSN, and MAC address are all artifacts from the adversary's system that remain within the LNK file when it is sent to a target. The folks at ProofPoint discussed one or two means of deploying LNK files, which could result in some very interesting artifacts. I'm sure that if enough specifically-malicious LNK files were examined, we'd be able to pull out "toolmarks" from such things as specific applications/tools or processes used to create (or modify) the LNK files, or perhaps even identify misbehaving applications that create LNK files on non-Windows systems.