Supported File Types

DSPM supports a broad set of file types for content extraction, classification, and scanning. This document lists the supported formats, describes coverage for configuration files, and identifies the source-code file types that are not analyzed.

Standard File Types Supported

Documents & Text Files

File Types: .txt, .html, .xhtml, .xml, .pdf, .doc, .docx, .rtf, .odt, .epub, .pages, .wpt, .abw, .sxw, .wpd

Notes: Fully supported for content extraction

Spreadsheets

File Types: .xls, .xlsx, .csv, .tsv, .ods, .numbers, .xlsm, .xlt, .xltx, .sxc

Notes: Fully supported

Presentations

File Types: .ppt, .pptx, .odp, .key, .pps, .ppsx, .sxi

Notes: Fully supported

Email Files

File Types: .msg, .eml, .mbox, .pst, .ost

Notes: Metadata and message body extraction supported

Structured & Markup Data

File Types: .json, .yaml, .yml, .xml, .html, .xhtml, .rss, .atom, .svg

Notes: Common for configuration files; fully supported

Images

File Types: .jpg, .png, .gif, .tiff, .tif, .bmp, .webp, .ico, .psd, .svg

Notes: OCR/extraction supported where applicable

Archives / Compressed Files

File Types: .zip, .tar, .gz, .bz2, .7z, .jar, .rar, .xz, .lzma, .z

Notes: DSPM recursively scans supported file types inside archives
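
To illustrate what recursive archive scanning means in practice, here is a minimal Python sketch. It is not DSPM's implementation: the names list_scannable, make_zip, SUPPORTED, and ARCHIVES are hypothetical, the extension sets are a small assumed subset of the lists in this document, and only ZIP-based containers are handled. A depth limit is included as one plausible guard against deeply nested archives.

```python
import io
import zipfile

# Assumption: a small illustrative subset of the extensions listed in this document.
SUPPORTED = {".txt", ".csv", ".json", ".pdf", ".docx"}
# Assumption: this sketch only descends into ZIP-based containers.
ARCHIVES = {".zip", ".jar"}


def extension(name: str) -> str:
    """Return the lowercased final extension of a path, or "" if there is none."""
    base = name.rsplit("/", 1)[-1]
    dot = base.rfind(".")
    return base[dot:].lower() if dot != -1 else ""


def list_scannable(data: bytes, prefix: str = "", max_depth: int = 5) -> list[str]:
    """Return paths of supported files inside an in-memory ZIP archive,
    descending into nested ZIP archives up to max_depth levels."""
    found = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            path = f"{prefix}{info.filename}"
            ext = extension(info.filename)
            if ext in ARCHIVES and max_depth > 0:
                # Recurse into the nested archive; "!/" marks the archive boundary.
                found.extend(list_scannable(zf.read(info), path + "!/", max_depth - 1))
            elif ext in SUPPORTED:
                found.append(path)
    return found


def make_zip(files: dict[str, bytes]) -> bytes:
    """Build a ZIP archive in memory (helper for the demonstration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buf.getvalue()


if __name__ == "__main__":
    inner = make_zip({"notes.txt": b"hello", "app.py": b"print(1)"})
    outer = make_zip({"report.csv": b"a,b", "nested/inner.zip": inner})
    print(list_scannable(outer))
```

In the demonstration, the nested notes.txt is reported with a path that records which archive contains it, while app.py is skipped because its extension is not in the assumed supported set.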

Other / Dynamic Extensions

Description: Logs and other text-based dynamic extensions

Notes: Supported when text extraction is possible

Source Code File Support

DSPM scans configuration-oriented file types for secrets, credentials, and sensitive values. It does not perform full static code analysis.

Supported (Configuration-Oriented) Source Code Formats

File Type        Description
.xml             XML configuration/markup
.yaml, .yml      YAML configuration files
.json            JSON configuration and policy files
.html            HTML files
.env             Environment variable files
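
As a rough illustration of scanning a configuration file for sensitive values without static code analysis, the following Python sketch flags variables in .env-style text whose names suggest a secret. The function name find_sensitive_keys and both regular expressions are assumptions made for this example; they do not describe DSPM's actual detection rules, which are not specified in this document.

```python
import re

# Hypothetical key-name pattern; a real scanner would use a much richer rule set.
SENSITIVE_KEY = re.compile(r"(password|passwd|secret|token|api[_-]?key)", re.IGNORECASE)
# Matches lines such as `NAME=value` or `export NAME=value`.
ENV_LINE = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$")


def find_sensitive_keys(env_text: str) -> list[str]:
    """Return the names of variables whose key looks sensitive and whose value is non-empty."""
    hits = []
    for line in env_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        match = ENV_LINE.match(line)
        if not match:
            continue  # not a KEY=value line
        key = match.group(1)
        value = match.group(2).strip().strip("'\"")
        if value and SENSITIVE_KEY.search(key):
            hits.append(key)
    return hits


if __name__ == "__main__":
    sample = "# local settings\nDB_HOST=localhost\nDB_PASSWORD=hunter2\nexport API_KEY='abc123'\nEMPTY_TOKEN=\n"
    print(find_sensitive_keys(sample))
```

The sketch reports only variable names, never the values, which is one reasonable way to avoid copying a secret into scan results.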

Source Code Formats Not Supported

These programming-language file types are not analyzed:

Language                   File Types
Python                     .py
JavaScript / TypeScript    .js, .ts
Java                       .java
C / C++                    .c, .cpp
Go                         .go
Ruby                       .rb
PHP                        .php
React (JSX / TSX)          .jsx, .tsx
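
The two source-code tables can be expressed as a simple extension lookup, for example to pre-filter a repository before submitting it for scanning. This Python sketch is only an illustration: the function classify and its return labels are hypothetical, and the extension sets are copied from the tables in this section.

```python
from pathlib import PurePosixPath

# Extensions copied from the tables in this section.
CONFIG_FORMATS = {".xml", ".yaml", ".yml", ".json", ".html", ".env"}
NOT_ANALYZED = {".py", ".js", ".ts", ".java", ".c", ".cpp", ".go", ".rb", ".php", ".jsx", ".tsx"}


def classify(filename: str) -> str:
    """Return "config-scanned", "not-analyzed", or "other" for a file path."""
    name = PurePosixPath(filename).name
    suffix = PurePosixPath(name).suffix.lower()
    if not suffix and name.startswith("."):
        # pathlib reports no suffix for a dotfile such as ".env",
        # so treat the whole name as the extension.
        suffix = name.lower()
    if suffix in CONFIG_FORMATS:
        return "config-scanned"
    if suffix in NOT_ANALYZED:
        return "not-analyzed"
    # Anything else must be checked against the standard file type lists.
    return "other"


if __name__ == "__main__":
    for path in ["deploy/values.YAML", ".env", "src/main.py", "report.pdf"]:
        print(path, "->", classify(path))
```

A result of "other" does not mean the file is unsupported; it only means the extension appears in neither source-code table.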

Additional Notes

  • File support is based on Apache Tika extraction capabilities.
  • Archives are scanned recursively.
  • Unsupported binary formats may be ingested but not analyzed.