File Copy Detection Limits and Nuances

File copy detection is a hot topic today, and PA File Sight is a great solution. However, it's important to understand the limitations of the technology. Note that the discussion below assumes the optional Endpoint is installed on client computers.

Copy Operation Doesn't Exist

At the file system level, copying a file involves reading file data from a source file and writing it back out to a destination file - there is no "copy" function at this level. Tracking the file data while it is inside another process's memory is not possible with today's operating systems.
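To make this concrete, here is a minimal Python sketch of what a copy amounts to at this level. The function name `copy_by_read_write` and the chunk size are our own choices for illustration, not anything taken from PA File Sight or a particular copy tool.

```python
def copy_by_read_write(src_path, dst_path, chunk_size=64 * 1024):
    """Copy a file as a series of reads on one file and writes on another."""
    total = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)   # a read on the source file
            if not chunk:                  # empty result means end of file
                break
            dst.write(chunk)               # a write on the destination file
            total += len(chunk)
    return total                           # number of bytes copied
```

Between the read and the write, the data exists only in the process's own memory, which is the part an outside observer cannot follow.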

Watch Input and Output

Since the file data can't be seen or tracked while it is inside a process, the best that can be done is watching what files a process reads and writes. If a file named A is read from one location and then a file named A is written out to a different location a short time later, we can assume that this was a file copy operation.

For example, if we see Explorer.exe reading \\Server\Share\FileA.txt and then writing out C:\MyFiles\FileA.txt, we would say FileA.txt was copied. The same assumptions apply to XCOPY.EXE, ROBOCOPY.EXE, the command prompt (CMD.EXE), etc. Using this technique we can assume file copying is taking place.
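The heuristic described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not how PA File Sight is implemented: the `FileEvent` record, the `infer_copies` function, and the 30-second window are all hypothetical.

```python
from dataclasses import dataclass
from ntpath import basename, dirname  # parses Windows-style paths on any OS


@dataclass
class FileEvent:
    timestamp: float  # seconds since some reference point
    process: str      # e.g. "Explorer.exe"
    operation: str    # "read" or "write"
    path: str


def infer_copies(events, window_seconds=30.0):
    """Pair a write with an earlier read of the same file name by the same process."""
    recent_reads = {}  # (process, file name) -> (timestamp, path) of the latest read
    copies = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        key = (ev.process.lower(), basename(ev.path).lower())
        if ev.operation == "read":
            recent_reads[key] = (ev.timestamp, ev.path)
        elif ev.operation == "write" and key in recent_reads:
            read_time, read_path = recent_reads[key]
            moved = dirname(read_path).lower() != dirname(ev.path).lower()
            if moved and ev.timestamp - read_time <= window_seconds:
                copies.append((ev.process, read_path, ev.path))
    return copies


events = [
    FileEvent(0.0, "Explorer.exe", "read", r"\\Server\Share\FileA.txt"),
    FileEvent(2.0, "Explorer.exe", "write", r"C:\MyFiles\FileA.txt"),
]
copies = infer_copies(events)  # one inferred copy of FileA.txt
```

Note that the match depends entirely on the file name, the process, and the time window. A write under a different name, or one that happens long after the read, would not be reported.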


However, consider these scenarios:

1. A process named PKZip.exe or 7-Zip.exe reads in \\Server\Share\FileA.txt and \\Server\Share\FileB.txt and writes its output under C:\MyFiles\. Since we (humans) have experience with compression applications like PKZip and 7-Zip, we would assume this was also a file copy. But what if it was a process named F1234.exe and it writes out C:\MyFiles\ABCD.efg? Was that a file copy?

2. Chrome.exe reads \\Server\Shared\File.docx but then doesn't write anything back out to disk. Chrome doesn't allow external applications (like PA File Sight or others) to see what URL it is using, so it could be using any web site or service to exfiltrate the file.

3. Word.exe reads in \\Server\Share\File.docx and then an hour later writes out C:\MyFiles\File.docx. Was that a copy? What if the content of the file was edited before saving? What if a different save filename was used instead of File.docx?

Using the Trusted Application rules could help in these scenarios. For example:

Solution for 1 - Use application whitelisting to control which applications can run, and don't add 7-Zip.exe to the list.

Solution for 2 - Don't allow Chrome.exe, Brave.exe, FireFox.exe, Edge.exe, etc. to read .docx (and others like .xlsx) files.

Solution for 3 - Don't allow Word.exe to write .docx files to a local drive (C:, D:, etc.) or an external drive (E:, a USB drive for example), thus only allowing saving back to the server.
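As a rough illustration of solutions 2 and 3, the Python sketch below models deny rules as plain data and checks a file operation against them. It does not reflect PA File Sight's actual Trusted Application rule format. `DenyRule`, `is_blocked`, and `is_local_drive` are hypothetical names, and treating every drive-letter path as local is a simplification (a mapped network drive also has a letter).

```python
from dataclasses import dataclass
from ntpath import splitdrive, splitext  # parses Windows-style paths on any OS


@dataclass(frozen=True)
class DenyRule:
    processes: frozenset      # lowercase process names the rule applies to
    operation: str            # "read" or "write"
    extensions: frozenset     # lowercase file extensions, with the dot
    local_only: bool = False  # if True, applies only to drive-letter paths


def is_local_drive(path):
    """True for paths like C:\\..., False for UNC paths like \\\\Server\\Share\\..."""
    drive, _ = splitdrive(path)
    return len(drive) == 2 and drive[1] == ":"


def is_blocked(rules, process, operation, path):
    """Return True if any deny rule matches this process, operation, and path."""
    ext = splitext(path)[1].lower()
    for rule in rules:
        if (process.lower() in rule.processes
                and operation == rule.operation
                and ext in rule.extensions
                and (not rule.local_only or is_local_drive(path))):
            return True
    return False


RULES = [
    # Solution 2: browsers may not read Office documents.
    DenyRule(frozenset({"chrome.exe", "brave.exe", "firefox.exe", "edge.exe"}),
             "read", frozenset({".docx", ".xlsx"})),
    # Solution 3: Word may not save .docx files to drive-letter paths.
    DenyRule(frozenset({"word.exe"}), "write", frozenset({".docx"}), local_only=True),
]
```

With these rules, Word.exe saving to C:\MyFiles\File.docx would be blocked, while saving back to \\Server\Share\File.docx would still be allowed.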

In the end, these ideas and techniques can probably solve 95% of scenarios, but they also highlight that digital content is very difficult to corral.
