About Custom Document Types

Learn more about how Enterprise Data Loss Prevention (E-DLP) uses custom documents you upload to prevent exfiltration of sensitive data.
Where Can I Use This?
  • NGFW (Managed by Panorama or Strata Cloud Manager)
  • Prisma Access (Managed by Panorama or Strata Cloud Manager)
What Do I Need?
  • Enterprise Data Loss Prevention (E-DLP) license
    Review the Supported Platforms for details on the required license for each enforcement point.
Or any of the following licenses that include the Enterprise DLP license:
  • Prisma Access CASB license
  • Next-Generation CASB for Prisma Access and NGFW (CASB-X) license
  • Data Security license
Enterprise Data Loss Prevention (E-DLP) supports upload and detection of custom documents containing intellectual property that you want to protect from exfiltration. You can upload a custom document type to Enterprise DLP, or use a predefined document type, to classify and detect standardized documents and prevent exfiltration of sensitive data. Custom document types uploaded to Enterprise DLP serve as match criteria in data profiles. You can combine them with predefined Machine Learning-based data patterns to apply additional ML-based detection algorithms, complemented by confidential or sensitive data specific to your organization.
Enterprise DLP uses Indexed Document Matching and Trainable Classifiers to fingerprint and index the custom documents you upload. It then scans for and detects documents that completely or partially match what you have already uploaded.
  • Indexed Document Matching (IDM): Fingerprints documents to create a document type for documents commonly used by your organization. Uploading multiple documents allows you to create a custom document repository that you can use in a data profile.
  • Trainable Classifiers: A supervised machine learning model that analyzes document types for classification. As you upload more custom documents, Enterprise DLP continuously trains the ML model to more accurately distinguish the sensitive data to inspect for and prevent from being exfiltrated (Positive Training Documents) from the data to ignore (Negative Training Set). A set of custom documents uploaded for Trainable Classifiers is referred to as a custom document model.
Using IDM and Trainable Classifiers to detect sensitive data is powerful and enables Enterprise DLP to continuously improve its detection capabilities by indexing unstructured text in your documents.
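To make the idea of complete and partial matching concrete, the following Python sketch fingerprints a document as a set of hashed word shingles and measures how much of an indexed document appears in scanned content. This is only an illustration of the general fingerprinting technique. It is not the algorithm Enterprise DLP uses, and the function names (fingerprint, overlap) and the shingle size are assumptions made for this example.

```python
import hashlib
import re


def fingerprint(text: str, k: int = 5) -> set[str]:
    """Return the set of SHA-256 hashes of all k-word shingles in the text.

    Hypothetical helper for illustration; not an Enterprise DLP API.
    """
    words = re.findall(r"\w+", text.lower())
    if not words:
        return set()
    if len(words) < k:
        # Short documents produce a single shingle made of all their words.
        shingles = [" ".join(words)]
    else:
        shingles = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    return {hashlib.sha256(s.encode("utf-8")).hexdigest() for s in shingles}


def overlap(indexed: set[str], candidate: set[str]) -> float:
    """Fraction of the indexed document's shingles that appear in the candidate."""
    if not indexed:
        return 0.0
    return len(indexed & candidate) / len(indexed)


if __name__ == "__main__":
    agreement = (
        "The licensee agrees to pay the licensor a royalty of five percent "
        "on net sales of the product"
    )
    indexed = fingerprint(agreement)

    # An exact copy shares every shingle with the indexed document.
    print(overlap(indexed, fingerprint(agreement)))

    # An excerpt pasted into other text shares only some shingles.
    excerpt = "Reminder: the licensee agrees to pay the licensor a royalty as discussed"
    print(overlap(indexed, fingerprint(excerpt)))
```

In this sketch, a score of 1.0 corresponds to a complete match and a score between 0 and 1 to a partial match. A real deployment would compare the score against a match threshold, which is a design choice outside the scope of this example.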
  • IDM Examples
    • Examples of different types of custom documents where IDM can be successfully applied are:
      • Standardized forms or documents specific to your business or organization
      • Patent documents
      • Specific business agreements
      • Specific intellectual property documents
    • Examples of different types of custom documents where IDM is less successful because they are too generic or not specific to your organization:
      • Generic whitepapers
      • Generic datasheets
      • Image or graphic-heavy documents with little text
  • Trainable Classifier Examples
    • Examples of different types of custom Positive Training Documents:
      • Proprietary product source code
      • Proprietary product formulas
      • Pre-release earnings, sales estimates, or accounting documents
      • Confidential marketing plans
      • Patient medical records
      • Customer purchasing documents and patterns
      • Confidential legal documents and Merger & Acquisition documents
      • Proprietary manufacturing methods
    • Examples of different types of custom Negative Training Documents:
      • Proprietary code from open source projects
      • Non-proprietary product information
      • Details of published annual accounts
      • Published marketing collateral and advertising copy
      • Healthcare documents
      • Publicly available consumer data
      • Publicly available materials and press releases
      • Industry standards and research
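The role of positive and negative training documents can be sketched with a very small supervised text classifier. The Python example below trains a naive Bayes model on two labeled sets and then classifies new text. It is a teaching sketch only: the class name TinyClassifier, its methods, and the sample sentences are all hypothetical, and nothing here describes the actual model Enterprise DLP trains.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class TinyClassifier:
    """Two-class multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self, positive_docs: list[str], negative_docs: list[str]):
        if not positive_docs or not negative_docs:
            raise ValueError("both training sets must contain at least one document")
        self.counts = {
            "positive": Counter(t for d in positive_docs for t in tokenize(d)),
            "negative": Counter(t for d in negative_docs for t in tokenize(d)),
        }
        self.doc_counts = {
            "positive": len(positive_docs),
            "negative": len(negative_docs),
        }
        self.vocab = set(self.counts["positive"]) | set(self.counts["negative"])

    def _log_score(self, label: str, tokens: list[str]) -> float:
        # Log prior: share of training documents carrying this label.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        total = sum(self.counts[label].values())
        for t in tokens:
            # Add-one smoothing keeps unseen tokens from zeroing the score.
            score += math.log((self.counts[label][t] + 1) / (total + len(self.vocab)))
        return score

    def is_sensitive(self, text: str) -> bool:
        """Return True when the positive class scores higher than the negative class."""
        tokens = [t for t in tokenize(text) if t in self.vocab]
        return self._log_score("positive", tokens) > self._log_score("negative", tokens)


if __name__ == "__main__":
    model = TinyClassifier(
        positive_docs=[
            "confidential pre-release earnings forecast for internal finance review",
            "confidential merger plan and unreleased sales estimates",
        ],
        negative_docs=[
            "published annual report and press release for investors",
            "public marketing brochure and published press release",
        ],
    )
    print(model.is_sensitive("draft of confidential earnings estimates"))
    print(model.is_sensitive("published press release about annual report"))
```

The negative set matters as much as the positive one in this sketch: without examples of content to ignore, the model has nothing to contrast sensitive vocabulary against.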
For example, suppose your organization both buys and sells software, and you want to detect only instances of sensitive customer data contained in invoices for software that you sell. In this case, you can upload a copy of your organization's invoice as a custom document type for fingerprinting.
However, custom document types are less effective if you want to detect receipts for software your organization purchases, because the format varies too much between the software vendors your organization purchases from. Greater document variance results in less accurate detection of matched traffic.