Content Scanning

You can use content scanning to scan files in order to detect and prevent exfiltration of data, such as credit card information, banking routing numbers and national identity numbers.

From Agent version 3.4.x, the content scanning component requires DLL files from the Microsoft redistributable package (2022). If the Microsoft redistributable package, which includes the C:\Windows\system32\vcruntime140.dll file, is not already installed on your computer, the agent bundle installation process deploys the necessary DLL files silently, and a system restart may be necessary.

If a Microsoft redistributable package is partially installed or an older version of the package is installed, it is advisable to install the most recent package from the Microsoft website (https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170) before initiating the agent installation.

You must enable content scanning at the Realm level (Agent Realm > Advanced Settings > Interactions > Enable Content Scanning).

From version 2.7.0.x, Content Scanning is also supported for Mac.

Some of the features are available on request. Contact your Proofpoint representative.

Content Scanning and Detector Sets

You can choose what you want to scan from the list of Detectors. For example, there are Detectors to identify Personally Identifiable Information (PII), such as name, address, credit/debit card number and Social Security number.

(Agent Realm > Advanced Settings > Interactions > Detector Sets > Choose Values)

Detectors, Detector Sets and Classifiers are defined and maintained in the Data Loss Prevention application.

To use Detectors in content scanning, in the Agent Realm, you must select the Detector Set with the Detectors you want. Detector Sets are groups of detectors that are assigned to a specific Agent Realm.

You select the content you want to scan using Indicator/Detector Name. Indicator/Detector Name is an attribute of the Indicator entity and is derived from the Detector Set created in the Data Loss Prevention application.

The number of Detectors you select impacts the endpoint memory and CPU. It is suggested that you enable only the Detectors you'll need.

Exact Data Matching (EDM) detectors are not supported.

CPU Resources

Scanning time impacts the endpoint CPU resources. The faster the scan, the higher the impact on CPU. You can balance scan time against CPU usage with the Use of Content Scanning CPU Resources options:

  • Scan time optimized: Fastest scanning time with highest impact on CPU

  • Scan time favor: Fast scan time with high impact on CPU

  • Balanced: Long scanning time with low impact on CPU

  • CPU optimized: Longest scanning time with lowest impact on CPU

This option is configured from the Advanced Settings of the Agent Realm in the Admin application. Select Interactions > Enable Content Scanning > Use of Content Scanning CPU Resources.

Content Scanning Snippets

Snippets contain the matched content detected, plus 20 characters before and after. This additional information helps you understand the context of the scanned content and is useful for validation. Snippets are reported as part of Activity in Explorations.
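As an illustration of the snippet window described above, the following minimal sketch takes a match and keeps up to 20 characters on each side. It is not the Agent's implementation: the function names and the simplified card-number pattern are hypothetical and stand in for a real Detector.

```python
import re
from typing import List

CONTEXT_CHARS = 20  # characters kept before and after the match, per the description above

# Hypothetical stand-in for a Detector: a simplified 16-digit card number pattern.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def build_snippet(text: str, start: int, end: int, context: int = CONTEXT_CHARS) -> str:
    """Return text[start:end] plus up to `context` characters on each side."""
    left = max(0, start - context)          # do not run past the start of the text
    right = min(len(text), end + context)   # do not run past the end of the text
    return text[left:right]


def find_snippets(text: str) -> List[str]:
    """Return one snippet per match of the illustrative pattern."""
    return [build_snippet(text, m.start(), m.end()) for m in CARD_PATTERN.finditer(text)]
```

When a match sits near the start or end of the text, the snippet is simply shorter on that side.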

(Agent Realm > Advanced Settings > Interactions > Enable Snippets)

Snippets might be included as metadata if Activity data is exported to a SIEM.

Content Scanning Thresholds

Thresholds and their applied actions let you set limits so that you have control over the user experience. You can configure what the Agent does when Content Scanning fails because thresholds were exceeded or other content scanning related failures occurred. (See Thresholds.)

If a timeout occurs, the content scan will not be performed. You can see the reason for the timeout in the Exploration view in the Internal field.

You can select one of the following options if the timeout occurs because the file size limit was exceeded. This option is set in the Advanced Settings of the Agent Realm.

These options are currently available on request. Contact your Proofpoint representative.

  • File Size Limit: You can limit the size of a file scanned.

    If a file exceeds the file size defined in the File Size Limit option for content scanning, the content is not scanned and the action defined at the Realm level is used. You can select one of the following:

    • Apply Action from Prevention Rule: The action defined at the Rule level is applied.

    • Block and Assign End User Notification: The file is blocked when the file size is exceeded, and you can assign a specific End User Notification to display when this occurs.

    • Prompt and Assign End User Notification: The user is prompted to provide a response when the file size is exceeded, and you can assign a specific End User Notification to display when this occurs.

  • Number of files in Bulk: You can limit the number of files scanned in a bulk copy/move. This option is available with content scanning prevention rules only.

  • Time Extraction Limit: You can set the amount of time allowed for the file text to be extracted from a file.

  • Text Analysis Time Limit: You can set the amount of time allowed for the text to be analyzed.
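The File Size Limit behavior described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the enum, the function name and the returned strings ("scan", "block", "prompt") are hypothetical and are not product identifiers.

```python
from enum import Enum


class RealmOption(Enum):
    """The three Realm-level choices listed above (enum names are illustrative)."""
    APPLY_RULE_ACTION = "Apply Action from Prevention Rule"
    BLOCK_AND_NOTIFY = "Block and Assign End User Notification"
    PROMPT_AND_NOTIFY = "Prompt and Assign End User Notification"


def resolve_oversize_action(file_size_mb: float, limit_mb: float,
                            realm_option: RealmOption, rule_action: str) -> str:
    """Return what happens to a file, given its size and the Realm-level option."""
    if file_size_mb <= limit_mb:
        return "scan"                 # within the limit: content is scanned as usual
    # The limit is exceeded, so the content is not scanned and the Realm option decides.
    if realm_option is RealmOption.APPLY_RULE_ACTION:
        return rule_action            # fall back to the action defined at the Rule level
    if realm_option is RealmOption.BLOCK_AND_NOTIFY:
        return "block"
    return "prompt"
```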

If you have previously enabled content scanning, you must check the new options and make sure the values are set as shown in the table below.
If content scanning fails due to file size limit, check the values in the Agent Realm are set according to the table below.
It is recommended that you use the default values.

This table describes the values.

Option                                  Default   Min    Max
File Size Limit                         30 MB     1 MB   1 GB
Number of Files in Bulk                 100       1      5000
Time Extraction Limit (in minutes)      3         1      10
Text Analysis Time Limit (in minutes)   3         1      10
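If you keep a record of the values configured in your Agent Realm, a quick range check against the table can help spot a value that was set outside the documented limits. The sketch below is a hypothetical helper, not a product API: the setting keys are made up for illustration, and it assumes 1 GB equals 1024 MB.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Threshold:
    default: int
    minimum: int
    maximum: int


# Values copied from the table above. Key names are hypothetical.
# Assumption: 1 GB is treated as 1024 MB.
THRESHOLDS: Dict[str, Threshold] = {
    "file_size_limit_mb": Threshold(30, 1, 1024),
    "files_in_bulk": Threshold(100, 1, 5000),
    "time_extraction_limit_minutes": Threshold(3, 1, 10),
    "text_analysis_time_limit_minutes": Threshold(3, 1, 10),
}


def out_of_range(settings: Dict[str, int]) -> List[str]:
    """Return the names of settings whose value falls outside the documented range."""
    problems = []
    for name, value in settings.items():
        limits = THRESHOLDS[name]
        if not limits.minimum <= value <= limits.maximum:
            problems.append(name)
    return problems
```

For example, passing `{"files_in_bulk": 6000}` reports `files_in_bulk`, because the documented maximum is 5000.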

Content Scanning and Detection Rules

Content Scanning for detection is supported for Windows Agents from version 1.2.4.4 and for Mac Agents from version 2.0.0.142.

To set up content scanning for detection rules:

  • Make sure Content Scanning is enabled (Agent Realm > Advanced Settings > Interactions > Enable Content Scanning).

  • Make sure the triggers you want to scan are turned on for the Agent Realm (Agent Realm > Advanced Settings > Interactions > Scan Triggers for Detection Rules > Choose Values). (See Content Scanning Detectors.)

  • Set up a detection rule with what you want to scan for.

Content Scanning Detectors

You select activity categories that will trigger a content scan for reporting with detection rules.

These settings do not control the content scanning within prevention rules. Content scanning for prevention rules depends on the DLP detector configuration within prevention rules.

  • Copy to USB: Detect copying or moving a file (or a folder with files) to a USB device. (Exit point for data exfiltration.)

  • Web File Sync: Detect copying or moving a file (or a folder with files) to one of the supported local sync folders (Dropbox, Google Drive, Box, iCloud and OneDrive). (Exit point for data exfiltration.)

  • Web File Upload: Detect file upload via browser. (Exit point for data exfiltration.)

  • Document Open: Detect when a file is opened.

  • Web File Download: Detect file download via browsers. (This is an entry point, after which all downloaded files are tracked.)

  • Print: Content is scanned and printing is detected according to the detection rule you define.

Content Scanning and Prevention Rules

To set up content scanning for prevention rules:

  • Make sure Content Scanning is enabled and the triggers and detectors you want to scan are turned on for the Agent (Agent Realm > Advanced Settings > Interactions > Enable Content Scanning).

  • Set up a prevention rule with what you want to scan for.

  • Add the prevention rule to the relevant Agent Policy.

  • Assign this Agent Policy to the Agent Realm.

From Mac Agent 3.0.0.98, Content Scanning for prevention is also supported for Mac and available upon request. Contact your Proofpoint representative.

Content scanning for prevention scans source files to determine which files to block. When there is no source file available for scanning, the file will be blocked to prevent exfiltration of data.
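The fail-closed behavior described above (block when there is no source file to scan) can be sketched as follows. This is a simplified illustration, not the Agent's logic: the names are hypothetical, the detector is passed in as a plain callable, and real prevention rules may apply other actions, such as prompting the user.

```python
from enum import Enum
from typing import Callable, Optional


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"


def decide(source_content: Optional[str],
           contains_sensitive_data: Callable[[str], bool]) -> Decision:
    """Decide whether to block, given the source content (None if unavailable)."""
    if source_content is None:
        # No source file available for scanning: block to prevent exfiltration.
        return Decision.BLOCK
    if contains_sensitive_data(source_content):
        return Decision.BLOCK
    return Decision.ALLOW
```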

Content scanning for prevention is supported for:

  • USB

  • Cloud Sync Folder

  • Web File Upload (Windows only)

  • Local Printer (Windows only). From Windows version 3.7, Print to File and Network Printer are also supported.

  • Printer (printer name, printer type, user name) (Windows only)

To enable content scanning for prevention rules with MIP, from the Agent Realm, select Advanced Settings > MIP Integration.

Content Scanning in Explorations

You can create explorations that let you review when scanned content is detected or blocked.

When a file is blocked, a Blocked indication appears next to the activity in the Analytics application dashboards.

If a file was uploaded because of user justification in the end user notification, a Prompted indication appears next to the activity in the Analytics application dashboards.