Understanding DSPM Scanning Compute Costs

Overview

When DSPM scans your environment, it deploys short-lived compute resources directly into your cloud infrastructure to perform the scan. These resources run inside your own account and are billed to you by your cloud provider as part of your normal IaaS usage. This document explains why that compute exists, what affects how long it runs, and how to monitor those costs.

Why DSPM Creates Compute in Your Environment

DSPM scans data at rest (files, database tables, object storage, and file shares) by reading and classifying content to identify sensitive data. To do this without your data leaving your environment, DSPM deploys temporary scanner infrastructure within your own cloud account.

This is by design. Your data never transits to Proofpoint’s infrastructure, classification work happens inside your environment, and only metadata (findings, inventory records, and risk signals) is returned to DSPM.

Scanner infrastructure is provisioned at the start of a scan and deprovisioned when the scan completes. You are billed only for the time those resources are running.

What Resources Are Created

The resources created depend on your cloud provider and environment configuration:

  • AWS:  EC2 instances, Lambda functions, and supporting networking resources provisioned in your AWS account.
  • Azure:  Container Apps instances running in a dedicated resource group within your Azure subscription.
  • GCP:  Compute Engine instances and Cloud Run functions provisioned within your Google Cloud project.
  • On-premises:  A persistent scanner container deployed in your environment. Runtime costs depend on your infrastructure model.

If you have configured AWS, Azure, or GCP as your sidecar provider, you will also host scanner containers in that cloud account for SaaS application scanning.

All cloud resources are tagged and grouped so they can be tracked separately from the rest of your cloud spend.

How This Appears on Your Bill

Scanning compute shows up as standard IaaS charges from your cloud provider, the same as any other workload running in your account. There is no separate DSPM line item; charges appear under the resource types that were provisioned (e.g., Container Apps, EC2, Lambda, VPC egress).

Because DSPM resources are consistently named and tagged, you can isolate them in your cloud provider’s cost tooling. The following step-by-step guides for Azure and AWS walk through how to set up monitoring and alerts so you are notified before spend reaches a threshold you define:

What Drives Your Costs

Compute cost is a function of how long the scanners run, which is determined by how much work they need to do. Two main categories of variables drive this: the data itself and your scan configuration. Infrastructure factors outside DSPM can also extend runtime.

Your Data

| Factor | Why It Matters | Cost Impact |
|---|---|---|
| Total files | Each file requires processing time. | Linear |
| File sizes | Larger files take longer to process. | Low–Moderate |
| File types | PDFs and images with OCR require more processing than plain text. Structured formats like .xls are also more resource-intensive. | High |
| Daily changes | More changes mean more incremental scan volume. | Linear |

Your Configuration

| Setting | Trade-off | Cost Impact |
|---|---|---|
| Full scan frequency | More frequent scans provide fresher data but increase cost. | High |
| OCR enabled | Scans text in images and scanned documents but adds significant processing time. | High |
| Memory allocation | Higher memory speeds up scans but increases the per-second cost. | Varies |
| AI-based classifiers | More accurate than pattern-based detection but more compute-intensive. | High |
| Concurrent workers | More workers finish scans faster but consume more compute simultaneously. | Varies |
| Scan scope | Filtering by path, file type, or date reduces total work done. | Reduces cost |
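To make the scan-frequency trade-off concrete, the sketch below counts how many files are processed over a 30-day period under two schedules. It assumes runtime grows roughly in proportion to the number of files processed, and it uses made-up values for the file count and daily change rate. The function and parameter names are illustrative and are not DSPM settings.

```python
def files_processed(total_files, daily_change_rate, days, full_scan_every):
    """Count files processed over `days` days.

    A full scan runs on day 0 and then every `full_scan_every` days.
    Every other day runs an incremental scan of changed files only.
    All inputs are illustrative assumptions, not DSPM settings.
    """
    processed = 0
    for day in range(days):
        if day % full_scan_every == 0:
            processed += total_files  # full scan touches every file
        else:
            processed += round(total_files * daily_change_rate)  # incremental
    return processed


TOTAL_FILES = 1_000_000      # hypothetical environment size
DAILY_CHANGE_RATE = 0.01     # hypothetical: 1% of files change per day

weekly_full = files_processed(TOTAL_FILES, DAILY_CHANGE_RATE, days=30, full_scan_every=7)
single_full = files_processed(TOTAL_FILES, DAILY_CHANGE_RATE, days=30, full_scan_every=30)

print(f"Weekly full scans:    {weekly_full:,} files processed")
print(f"One full + increments: {single_full:,} files processed")
```

With these example inputs, weekly full scans process 5,250,000 files in 30 days, while one full scan followed by daily incrementals processes 1,290,000, roughly a quarter of the work.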

Infrastructure Factors

The following factors are outside your direct DSPM configuration but can still affect how long compute runs.

  • Source system throughput:  Scan speed is bounded by how quickly data can be read from the source. Throttling by the data source (e.g., Microsoft Graph API rate limits for M365) can extend the wall-clock time that compute resources remain active.
  • Source system load:  On-premises file shares or databases under heavy load may respond more slowly, extending scanner runtime.
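The effect of a throughput bound can be sketched with a simple model: the scan moves at the slower of what the workers can process and what the source will serve, while every active worker is billed for the full wall-clock time. The function, its parameters, and all figures below are hypothetical and chosen only to show the shape of the relationship.

```python
def scan_runtime_and_cost(total_gb, workers, gb_per_worker_hour,
                          source_cap_gb_per_hour, price_per_worker_hour):
    """Return (wall_clock_hours, compute_cost) for one scan.

    Effective throughput is the smaller of the combined worker throughput
    and the rate the source system will serve. All parameters are
    hypothetical illustration values.
    """
    worker_throughput = workers * gb_per_worker_hour
    effective = min(worker_throughput, source_cap_gb_per_hour)
    hours = total_gb / effective
    # Every worker stays active (and billed) for the whole wall-clock time.
    cost = hours * workers * price_per_worker_hour
    return hours, cost


results = {}
for workers in (2, 4, 8):
    results[workers] = scan_runtime_and_cost(
        total_gb=1000,
        workers=workers,
        gb_per_worker_hour=25,
        source_cap_gb_per_hour=100,   # e.g. an API rate limit on the source
        price_per_worker_hour=0.25,
    )
    hours, cost = results[workers]
    print(f"{workers} workers: {hours:.1f} h wall-clock, cost {cost:.2f}")
```

With these made-up inputs, going from 2 to 4 workers halves the wall-clock time at the same cost. Going from 4 to 8 workers leaves the wall-clock time unchanged, because the source is now the bottleneck, and doubles the cost.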

Cost Is Based on Runtime, Not Data Volume

A common assumption is that scanning cost is a simple function of the number of files or total data volume. That is not how DSPM scanning cost works: file count and size matter only through their effect on how long the scanners run.

Cost is determined by how long the compute resources run, not by a per-file or per-TB metric. This means:

  • A scan of fewer, larger files with OCR enabled may cost more than a scan of many small plain-text files.
  • Two scans of the same file count but different content types can have meaningfully different costs.
  • Scan configuration choices (AI classifiers, OCR, full vs. incremental) have a larger impact on cost than raw file count alone.
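The first point above can be illustrated with a small runtime-based calculation. The per-file processing times and the hourly compute price are invented for the example; real values depend on your data, configuration, and cloud pricing.

```python
# Hypothetical average processing time per file, in seconds.
SECONDS_PER_FILE = {
    "plain_text": 0.05,
    "pdf_ocr": 6.0,      # scanned PDF with OCR enabled
}
PRICE_PER_COMPUTE_HOUR = 0.50  # hypothetical price, any currency


def scan_cost(file_counts):
    """Cost = total compute time x hourly price; there is no per-file fee."""
    seconds = sum(SECONDS_PER_FILE[kind] * count
                  for kind, count in file_counts.items())
    return seconds / 3600 * PRICE_PER_COMPUTE_HOUR


many_small_text = scan_cost({"plain_text": 500_000})
few_ocr_pdfs = scan_cost({"pdf_ocr": 20_000})

print(f"500,000 plain-text files: {many_small_text:.2f}")
print(f"20,000 OCR PDFs:          {few_ocr_pdfs:.2f}")
```

Under these assumptions the scan with 25 times fewer files costs more, because the compute runs longer.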

There is no fixed cost calculator for DSPM scanning because the combination of data characteristics and configuration choices is too variable to produce a reliable estimate in advance. The best way to understand your scanning costs is to run an initial scoped scan, observe the actual charges in your cloud provider’s billing tools, and use that baseline to calibrate future scan configurations.
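One way to keep a baseline usable is to record each scoped scan's file count, runtime, and billed cost, then compare unit rates between configurations. The sketch below does this for two hypothetical scoped scans of the same data, one with OCR disabled and one with OCR enabled. The class, field names, and all figures are illustrative assumptions. The rates describe only the scans that were measured and are not a forecast for other data.

```python
from dataclasses import dataclass


@dataclass
class ScanBaseline:
    """Observed figures for one scoped scan (all values entered by hand)."""
    label: str
    files_scanned: int
    runtime_hours: float
    billed_cost: float  # read from your cloud billing tool after the scan

    @property
    def cost_per_1k_files(self) -> float:
        return self.billed_cost / self.files_scanned * 1000

    @property
    def files_per_hour(self) -> float:
        return self.files_scanned / self.runtime_hours


# Hypothetical observations from two scoped scans of the same folder.
ocr_off = ScanBaseline("OCR off", files_scanned=50_000, runtime_hours=2.0, billed_cost=1.50)
ocr_on = ScanBaseline("OCR on", files_scanned=50_000, runtime_hours=9.0, billed_cost=6.75)

for scan in (ocr_off, ocr_on):
    print(f"{scan.label}: {scan.files_per_hour:,.0f} files/h, "
          f"{scan.cost_per_1k_files:.3f} per 1,000 files")
```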

Managing and Monitoring Costs

To stay informed of scanning costs and avoid surprises:

| Recommendation | Details |
|---|---|
| Set up budget alerts before your first large scan | Configure a budget alert scoped to the DSPM resource group or tag so you are notified if charges approach a threshold you define. |
| Start with a scoped scan | For large environments, run an initial scan against a subset of data (a single site, folder path, or data store) before scanning the full environment. This gives you a cost baseline before committing to full scope. |
| Review scan configuration choices | OCR and AI classifiers are the highest-cost options. Evaluate whether they are needed for every data store or scan type. |
| Use incremental scans where possible | After an initial full scan establishes your baseline inventory, incremental scans only process new or changed files and typically require substantially less compute. |
| Monitor post-scan | Review your cloud provider’s cost tools after each scan to understand what drove the charges and whether configuration adjustments are warranted. |
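For AWS, a post-scan review can also be scripted. The sketch below uses the Cost Explorer `get_cost_and_usage` API through boto3 to total recent charges for tagged resources, grouped by service. The tag key and value shown are placeholders, not the tags DSPM applies; substitute the tag you see on the DSPM resources in your account. It also assumes that tag has been activated as a cost allocation tag in AWS Billing, and it ignores result pagination for brevity.

```python
from datetime import date, timedelta

TAG_KEY = "dspm-scanner"   # placeholder: replace with the real tag key
TAG_VALUE = "true"         # placeholder: replace with the real tag value


def build_cost_query(start: date, end: date, tag_key: str, tag_value: str) -> dict:
    """Build keyword arguments for Cost Explorer's get_cost_and_usage."""
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},  # End is exclusive
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        "Filter": {"Tags": {"Key": tag_key, "Values": [tag_value]}},
        "GroupBy": [{"Type": "DIMENSION", "Key": "SERVICE"}],
    }


def total_by_service(response: dict) -> dict:
    """Sum the daily amounts in a get_cost_and_usage response per service."""
    totals = {}
    for day in response["ResultsByTime"]:
        for group in day["Groups"]:
            service = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[service] = totals.get(service, 0.0) + amount
    return totals


def fetch_scan_costs(days_back: int = 7) -> dict:
    """Query Cost Explorer for the last `days_back` days of tagged spend."""
    import boto3  # imported here so the helpers above work without AWS access

    end = date.today()
    start = end - timedelta(days=days_back)
    client = boto3.client("ce")
    response = client.get_cost_and_usage(
        **build_cost_query(start, end, TAG_KEY, TAG_VALUE)
    )
    return total_by_service(response)
```

Calling `fetch_scan_costs()` with credentials that can read Cost Explorer returns a dictionary such as service name to total cost, which shows which resource types drove the charges for a given scan window.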

Please refer to the following cloud budget guides: