Enterprise DLP
About Enterprise DLP
Enterprise Data Loss Prevention (E-DLP) is a set of tools and processes to protect sensitive information
from exfiltration.
Where Can I Use This? | What Do I Need? |
---|---|
|
Or any of the following licenses that include the Enterprise DLP license
|
Enterprise Data Loss Prevention (E-DLP) is a set of tools and processes that allow you to protect
sensitive information against unauthorized access, misuse, extraction, or sharing.
Enterprise DLP is a cloud-based service that uses supervised machine learning algorithms to
classify sensitive documents into Financial, Legal, Healthcare, and other categories
to guard against exposure, data loss, and data exfiltration.
Its data patterns identify the sensitive information in traffic flowing through your
network and protect it from exposure.
Enterprise DLP allows you to protect sensitive data in the following ways:
- Prevent uploads and downloads of file and non-file based traffic from leaking sensitive data to unsanctioned web applications—Discover and conditionally stop sensitive data leaks to untrusted web applications.
- Monitor uploads and downloads to sanctioned web applications—Discover and monitor sensitive data when it’s uploaded to sanctioned corporate applications.
Enterprise DLP is enabled through a cloud service. This helps you inspect content and
analyze the data in the correct context so that you can accurately identify sensitive
data and secure it to prevent incidents. Enterprise DLP supports over 1,000
predefined data patterns and 20 predefined data profiles. Enterprise DLP is
designed to automatically make new patterns and profiles available to you for use in
Security policy rules as soon as they’re added to the cloud service.
- Data Patterns—Help you detect sensitive content and how that content is being shared or accessed on your network.
  Predefined data patterns and built-in settings make it easy for you to protect data that contains certain properties (such as document title or author), credit card numbers, regulated information from different countries (such as driver’s license numbers), and third-party DLP labels. To improve detection rates for sensitive data in your organization, you can supplement predefined data patterns by creating custom data patterns that are specific to your content inspection and data protection requirements. In a custom data pattern, you can also define regular expressions and data properties to look for metadata or attributes in the file’s custom or extended properties and use them in a data profile.
- Data Profiles—Power the data classification and monitoring capabilities available on your managed firewalls to prevent data loss and mitigate business risk.
  Data profiles are a collection of data patterns used to scan for a specific object or type of content. To perform content analysis, the predefined data profiles have data patterns that include industry-standard data identifiers, keywords, and built-in logic in the form of machine learning, regular expressions, and checksums for legal and financial data patterns. When you use the data profile in a Security policy rule, the firewall can inspect the traffic for a match and take action.
  After you choose the data patterns to use (either predefined or custom), you manage the data profiles from the Panorama™ management server or Strata Cloud Manager. You can use a predefined data profile, or create a new profile and add data patterns to it. You then create Security policy rules and apply the profiles to them. For example, if a user uploads a file and data in the file matches the criteria in the policy rules, the managed firewall either creates an alert notification or blocks the file upload.
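As a rough illustration of how a regular expression and a checksum can work together in a financial data pattern, the following Python sketch finds credit card candidates with a regex and keeps only those that pass the Luhn checksum. It is not the Enterprise DLP implementation; the regex, the function names, and the choice of the Luhn algorithm are assumptions made for this example.

```python
import re

# Illustrative candidate regex (an assumption, not a predefined pattern):
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return regex candidates (separators removed) that pass the checksum."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

In this sketch the checksum acts as a second filter: a 16-digit order number that happens to match the regex is dropped unless it also satisfies the checksum.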
Enterprise DLP generates a DLP incident when traffic matches a data
profile associated with a Security policy rule. The log entry contains detailed
information regarding the traffic that matches one or more data patterns in the data
profile. The log details enable forensics by allowing you to verify when matched data
generated an alert notification or when Enterprise DLP blocked traffic.
You can view the snippets in the data filtering logs. By default, data masking partially
masks the snippets to prevent the sensitive data from exposure. You can completely mask
the sensitive information, unmask snippets, or disable snippet extraction and
viewing.
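The three snippet display options can be pictured with a small Python sketch. This is only an illustration: the source does not state which characters partial masking hides, so keeping the last four characters visible is an assumption, and `mask_snippet`, `mode`, and `visible` are hypothetical names.

```python
def mask_snippet(value: str, mode: str = "partial", visible: int = 4) -> str:
    """Mask a matched value for display.

    mode="partial": hide everything except the last `visible` characters
                    (assumed behavior, for illustration only).
    mode="full":    hide every character.
    mode="none":    return the value unmasked.
    """
    if mode == "none":
        return value
    if mode == "full":
        return "*" * len(value)
    if mode == "partial":
        keep = min(visible, len(value))
        cut = len(value) - keep
        return "*" * cut + value[cut:]
    raise ValueError(f"unknown mode: {mode}")
```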
Data Classification with Large Language Models (LLM) and Context-Aware Machine Learning
Sensitive data exfiltration can manifest in diverse formats and traverse numerous
channels within an organization's infrastructure. Traditional data loss prevention
solutions adopt a one-size-fits-all approach to preventing exfiltration of sensitive
data that often proves insufficient for organizations aiming to ensure comprehensive
security. This creates noise and distraction, impacting your security
administrators' ability to investigate and resolve real security incidents when they
occur.
Enterprise DLP uses various artificial intelligence (AI) and machine learning
(ML) driven methods to improve detection accuracy for different file formats and
techniques.
- Regex Data Patterns Enhanced With Large Language Models (LLM) and ML Models to Improve Detection Accuracy
  Enterprise DLP augments data patterns traditionally reliant on regular expression matching with ML classifiers. These data patterns undergo training using diverse data sets, with LLMs used to establish ground truth. This integration significantly enhances accuracy and reduces false positives across 350+ classifiers that detect PII, GDPR, Financial, and many other categories. Predefined regex data patterns enhanced with ML capabilities are marked as Augmented with ML. Additionally, users can report false positive detections against the DLP incident where the false positive occurred to facilitate model retraining for improved accuracy.
  For example, patterns like credit card numbers or bank account numbers can vary in length and pose a challenge for strict content-matching approaches, often yielding a large number of false positive detections. In such cases all pattern matches, such as the detection of a 12-digit credit card number, undergo further processing by specialized ML models designed to comprehend the context of sensitive data occurrences. LLMs enable the generation of high-quality training and testing data, resulting in best-in-class detection accuracy.
- Predefined AI-Powered Document and Image Classifiers
  Enterprise DLP uses Deep Neural Network (DNN) based document classifiers that interpret the semantics of inspected documents, analyze their context, and accurately classify them into financial, healthcare, legal, and source code categories across all potential data loss vectors. When you enable Optical Character Recognition (OCR), you can use the predefined data patterns that are Augmented with ML, which use DNN-based models for image classification, to immediately start driving better detection accuracy across categories such as Driver’s Licenses, Passports, and National ID to protect sensitive information.
- Train Your Own AI-Powered ML Models
  Your organization might have customized documents that pose a significant risk of exfiltration. For example, Merger & Acquisition documents or proprietary source code might demand unique detection models specific to your organization. Enterprise DLP lets you train your own AI model by uploading custom document types. This allows your organization to curate an ML detection model that accurately identifies documents specific to your organization. This privacy-preserving algorithm ensures that your sensitive information isn't used to train any predefined AI-powered document type detections. All custom documents you upload to Enterprise DLP, and subsequent training of the AI-powered ML model, are specific and unique to your organization.
Additional Detection Accuracy
To further improve detection accuracy and reduce false positives, you can also
specify:
- Proximity keywords—An asset is assigned a higher accuracy probability when a keyword is within a 200-character distance of the expression. If a document has a 16-digit number immediately followed by Visa, that's more likely to be a credit card number. But if Visa is the title of the text and the 16-digit number is on the last page of the 22-page document, that's less likely to be a credit card number.
  Proximity keywords are not case-sensitive. Multiple proximity keywords for a single data pattern are supported.
- Confidence levels—The confidence level reflects how confident Enterprise DLP is when detecting matched traffic. Enterprise DLP determines the confidence level by inspecting the distance of regular expressions to proximity keywords.
  - Low—A proximity keyword included in the custom or predefined regex data pattern isn't found within 200 characters of the regular expression match, or isn't present in the inspected traffic at all.
    When the match criteria specify a Low confidence level, Enterprise DLP still inspects for up to three matches with a High confidence level.
  - High—A proximity keyword included in the custom or predefined regex data pattern is within 200 characters of the regular expression match.
    When the match criteria specify a High confidence level, Enterprise DLP still inspects for up to three matches with a Low confidence level.
  Additionally, custom data patterns that don't include any proximity keywords to identify a match always have both Low and High confidence level detections.
- Basic and weighted regular expressions—A regular expression (regex for short) describes how to search for a specific text pattern and then display the match occurrences when a pattern match is found. There are two types of regular expressions: basic and weighted.
  - A basic regular expression searches for a specific text pattern. When a pattern match is found, the service displays the match occurrences.
  - A weighted regular expression assigns a score to a text entry. When the score threshold is exceeded, the service returns a match for the pattern.
    To reduce false positives and maximize the search performance of your regular expressions, you can assign scores using the weighted regular expression builder when you create data patterns to find and calculate scores for the information that’s important to you. Scores count toward a match threshold: when the threshold is exceeded, for example because enough expressions from a pattern match an asset, the asset is flagged as a match for the pattern.
    For more information, including a use case and best practices, see Configure Regular Expressions.
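The proximity keyword, confidence level, and weighted regular expression behavior described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the service's implementation: the function names are hypothetical, the 200-character distance is measured from the edges of the regex match, each occurrence of a weighted expression is assumed to add its score once, and the case of patterns without proximity keywords isn't modeled.

```python
import re

PROXIMITY_WINDOW = 200  # character distance stated in the text above


def confidence_levels(text: str, pattern: str, keywords: list[str]):
    """Return (match, "High" | "Low") for every regex match in text.

    A match is High when any proximity keyword lies entirely within
    PROXIMITY_WINDOW characters before or after the match (assumed
    measurement). Keyword comparison is case-insensitive.
    """
    results = []
    for m in re.finditer(pattern, text):
        start = max(0, m.start() - PROXIMITY_WINDOW)
        end = m.end() + PROXIMITY_WINDOW
        window = text[start:end].lower()
        near = any(kw.lower() in window for kw in keywords)
        results.append((m.group(), "High" if near else "Low"))
    return results


def weighted_match(text: str, weighted_exprs, threshold: int):
    """Score text against (regex, score) pairs.

    Each occurrence adds the expression's score (an assumption). The
    pattern matches only when the total exceeds the threshold.
    """
    total = 0
    for expr, score in weighted_exprs:
        total += score * len(re.findall(expr, text, flags=re.IGNORECASE))
    return total, total > threshold
```

For instance, a 16-digit number directly after the word Visa is reported as High in this sketch, while the same number 300 characters away from the keyword is reported as Low.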