Enterprise DLP
Enable Optical Character Recognition
Enable optical character recognition (OCR) to scan images for sensitive information
with Enterprise Data Loss Prevention (E-DLP).
On May 7, 2025, Palo Alto Networks is introducing new Evidence Storage and Syslog Forwarding service IP
addresses to improve performance and expand availability for these services
globally.
You must allow these new service IP addresses on your network
to avoid disruptions for these services. Review the Enterprise DLP
Release Notes for more
information.
| Where Can I Use This? | What Do I Need? |
|---|---|
|
Or any of the following licenses that include the Enterprise DLP license
|
Strengthen your security posture to prevent accidental data misuse, loss, or theft by
enabling optical character recognition (OCR) for Enterprise Data Loss Prevention (E-DLP). Enabling
OCR allows Enterprise DLP to scan images in files for sensitive
information that matches your Enterprise DLP data profiles.
Enterprise DLP supports detection of sensitive data in images containing
alphanumeric English characters, including non-English languages written in
alphanumeric English characters. Additionally, Enterprise DLP supports
inspection of image files where the image includes handwritten text and inspection
of images scanned by another device. Enterprise DLP measures images in
pixels and achieves optimal performance when the image resolution is between 50 x 50 and
1,800 x 1,800 pixels.
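As a rough illustration, the optimal resolution range above can be expressed as a simple pre-check. The sketch below is not part of any Enterprise DLP API; the function and constant names are hypothetical, and it assumes the 50 to 1,800 pixel range applies to width and height independently.

```python
# Hypothetical helper for illustration only; not an Enterprise DLP API.
MIN_SIDE_PX = 50     # lower bound of the documented optimal range
MAX_SIDE_PX = 1800   # upper bound of the documented optimal range


def in_optimal_ocr_range(width_px: int, height_px: int) -> bool:
    """Return True when both dimensions fall within 50 to 1,800 pixels.

    Assumption: the documented range applies to each dimension separately.
    """
    return all(MIN_SIDE_PX <= side <= MAX_SIDE_PX for side in (width_px, height_px))
```

A check like this could flag images that are worth resizing before upload. Images outside the range might still be inspected, only with less reliable results.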
Due to the nature of OCR technology, detection accuracy relies on the quality of the
image being inspected. The image properties and clarity have a direct impact on Enterprise DLP detection accuracy that can result in both false positive and
false negative detections. OCR detection works best when images have:
- High Resolution and Pixel Density (DPI): Lack of image clarity can prevent characters from being accurately distinguished or identified. Example: cl being interpreted as d, or the letters o and O being interpreted as a zero (0).
- No Image Noise and Artifacts: Random speckles, dust, or digital grain often found in scanned documents or compressed JPEGs can distort how Enterprise DLP interprets characters in images. Example: Black speckles or dust next to a P character causing it to look like an R, or a period (.) looking like a comma (,).
- High Contrast and No Background Interference: OCR detection works best with high-contrast images where the text being inspected clearly stands out from its background. Enterprise DLP can't effectively scan image text when there is a colored background, watermarks, or poor lighting. Example: Background and text colors that are too similar, or shadows and glare obscuring text in an image.
- No Image Skew and Orientation Distortion: OCR scans in horizontal and vertical rows. Image skews and orientation distortions can cause row misalignment that disrupts the logical flow of data within the image. This can result in text being cut and combined in unexpected ways. Skews and distortions greater than 15° typically result in detection issues. Example: An image of a credit card at a 45° angle might prevent Enterprise DLP from accurately detecting the full credit card number.
| OCR Support | |
|---|---|
| Maximum Image Size | 10 MB |
| File Inspection Limitations | Enterprise DLP inspects the first 5 images per inspected file. |
| Supported Image File Types | Supports the OCR inspection for all supported Image File Types. |
| Supported File Types | |
Enterprise DLP does not support OCR for Microsoft Visio XML drawing (.vdx)
files that need rendering to display. For example, OCR can't inspect a .vdx file
if the XML is the drawing representation.
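To make the size and per-file limits concrete, the following sketch estimates which embedded images in a file fall within the documented OCR limits. It is an illustration only: `EmbeddedImage` and `ocr_candidates` are hypothetical names, "10 MB" is assumed to mean 10 × 1024 × 1024 bytes, and oversized images are assumed to still count toward the first five. The source does not confirm either assumption.

```python
from dataclasses import dataclass

# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 10 * 1024 * 1024
# Documented limit: only the first 5 images per file are inspected.
MAX_IMAGES_PER_FILE = 5


@dataclass
class EmbeddedImage:
    """Hypothetical record describing one image found inside a file."""
    name: str
    size_bytes: int


def ocr_candidates(images: list[EmbeddedImage]) -> list[EmbeddedImage]:
    """Return images that are among the first five and within the size limit.

    Assumption: an oversized image still occupies one of the five slots.
    """
    first_five = images[:MAX_IMAGES_PER_FILE]
    return [img for img in first_five if img.size_bytes <= MAX_IMAGE_BYTES]
```

For example, in a document with seven embedded images, this sketch would never report the sixth and seventh as candidates, so sensitive content is best placed where it falls within the first five images of a file.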
1. Log in to Strata Cloud Manager.
2. Select Configuration > Data Loss Prevention > Detection Methods > Optical Character Recognition.
3. Enable Optical Character Recognition (OCR).
   You can enable OCR for Prisma Access, NGFW, SaaS Security, Email DLP, and Endpoint DLP.