I've posted on the topic of data exfiltration before (here, etc.) but often it's a good idea to revisit the topic. After all, it was almost two years ago that we saw the first instance of ransomware threat actors stating publicly that they'd exfiltrated data from systems, using this as a secondary means of extortion. Since then, we've continued to see this tactic used, along with other tertiary means of extortion based on data exfiltration. We've also seen several instances where the threat actor ransom notes have stated that data was exfiltrated but the public "shaming" sites were noticeably empty.
As long as I've been involved in what was first referred to as "information security" (later referred to as "cyber security"), data exfiltration has been a concern to one degree or another, even in the absence of clearly-stated and documented analysis goals. With the advent of PCI forensic investigations (circa 2007), "data exfiltration" became a formalized and documented analysis goal for every investigation, whether the merchant asked for it or not. After all, what value was the collected data if the credit card numbers were extracted from memory and left sitting on the server? Data exfiltration was/is a key component necessary for the crime, and as such, it was often assumed without being clearly identified.
One of the challenges of determining data exfiltration is visibility; systems and networks may simply not be instrumented in a manner that allows us to determine if data exfiltration occurred. By default, Windows systems do not offer many data sources and artifacts that demonstrate data exfiltration in either a definitive or secondary manner. While some do exist, they are very often not clearly understood or investigated by those who then state, "...there was no evidence of data exfiltration observed..." in their findings.
Many years ago, I responded to an incident where an employee's home system had been compromised and a keystroke logger installed. The threat actor observed through the logs that the employee had remote access to their work infrastructure, and proceeded to use the same credentials to log into the corporate infrastructure. These were all Windows XP and 2003 systems, so artifacts (logs and other data sources) were limited in comparison to more modern versions of Windows, but we had enough indicators to determine that the threat actor had no idea where they were. The actor conducted searches that (when spelled correctly) were unlikely to prove fruitful...the corporate infrastructure was for a health care provider, and the actor was searching for terms such as "banking" and "password". All access was conducted through RDP, and as such, there were a good number of artifacts populated when the actor accessed files.
At that point, data exfiltration could have occurred through a number of means. The actor could have opened a file, and taken a picture or screen capture of their own desktop...they could have "exfiltrated" the data without actually "moving" it.
Jump forward a few years, and I was working on an APT investigation when EDR telemetry demonstrated that the threat actor had archived files...the telemetry included the password used in the command line. Further investigation led us to a system with a publicly-accessible IIS web server, albeit without any actual formal web sites being served. Web server logs illustrated that the threat actor downloaded zipped archives from that system successfully, and file system metadata indicated that the archive files were deleted once they'd been downloaded. We carved unallocated space and recovered a dozen accessible archives, which we were able to open using the password observed in EDR telemetry.
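As a rough illustration of the web server log review described above, here is a minimal Python sketch that reads IIS logs in the W3C extended format and pulls out successful GET requests for archive files. It is not the tooling used on that engagement. The list of extensions, the function names, and the sample log lines (which use a documentation-range IP address) are all made up for the example, and the `sc-bytes` field is only present if it was enabled in the logging configuration.

```python
from typing import Dict, Iterable, Iterator, List

# Assumption: file extensions worth flagging; adjust to the case at hand.
ARCHIVE_EXTENSIONS = (".zip", ".rar", ".7z")


def parse_w3c_log(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per log entry, keyed by the names in the #Fields directive."""
    fields: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # The #Fields directive can reappear mid-file when logging restarts,
            # so the column layout is refreshed every time it is seen.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        # Skip entries that do not match the current column layout.
        if fields and len(values) == len(fields):
            yield dict(zip(fields, values))


def successful_archive_downloads(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Return entries where an archive was requested via GET and served (200/206)."""
    hits = []
    for entry in parse_w3c_log(lines):
        uri = entry.get("cs-uri-stem", "").lower()
        if (entry.get("cs-method") == "GET"
                and entry.get("sc-status") in ("200", "206")
                and uri.endswith(ARCHIVE_EXTENSIONS)):
            hits.append(entry)
    return hits


# Fabricated sample data, for illustration only.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 8.5
#Fields: date time cs-method cs-uri-stem c-ip sc-status sc-bytes
2020-01-01 10:00:00 GET /index.htm 203.0.113.5 404 1245
2020-01-01 10:05:12 GET /a1.zip 203.0.113.5 200 10485760
2020-01-01 10:06:40 GET /a2.zip 203.0.113.5 200 20971520
""".splitlines()

if __name__ == "__main__":
    for hit in successful_archive_downloads(SAMPLE_LOG):
        print(hit["date"], hit["time"], hit["c-ip"],
              hit["cs-uri-stem"], hit.get("sc-bytes", "n/a"))
```

A hit in the logs only shows that the server answered the request; as in the case above, it carries far more weight when paired with other data sources, such as file system metadata and the EDR telemetry showing the archives being created.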
In another instance, we observed that the threat actor had acquired credentials and was able to access OWA, both internally and externally. What we saw the threat actor do was access OWA from inside the infrastructure, create a draft email, attach the data to be exfiltrated to the email, and then access the email from outside of the infrastructure. At that point, they'd open the draft email, download the attachment, and delete the draft email.
When I first began writing books, my publisher had an interesting method for transferring manuscript files. They sent me instructions for accessing their FTP site via Windows Explorer (as opposed to the command line), which left remnants on the system well beyond the lifetime of the book itself.
My point is that there are a number of ways to exfiltrate data from systems, and the ability to detect data exfiltration can be extremely limited without the necessary visibility. However, there are data sources on Windows systems that can provide definitive indications of data exfiltration (e.g., BITS upload jobs, web server logs, email, network connections/pcaps in memory dumps and hibernation files, etc.), as well as potential indications of data exfiltration (e.g., shellbags, SRUM, etc.). These data sources are relatively easy (almost trivial) to check, and in doing so, you'll have a comprehensive approach to addressing the issue.