Prisma SaaS uses supervised machine learning
algorithms to classify sensitive documents into three top-level
categories: Financial, Legal, and Healthcare. A top-level category
may contain sub-categories; for example, a financial accounting
document falls into a sub-category of the Financial top-level
category.
The Palo Alto Networks Data Science team collects
large numbers of documents for each category that serve as the foundation
for classification. The labeled data is then split into training,
testing, and verification data sets. The training data set is used to
learn the classification model, the testing data set is used to tune
the model, and the verification data set is used to evaluate the model.
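The three-way split described above can be sketched as follows. This is a minimal illustration, not the Data Science team's actual procedure: the function name split_labeled_data, the 70/15/15 proportions, and the fixed random seed are all assumptions made for the example.

```python
import random


def split_labeled_data(documents, train_frac=0.70, test_frac=0.15, seed=0):
    """Shuffle labeled documents and split them into three disjoint sets.

    The fractions are hypothetical; the source does not state the real ratios.
    """
    shuffled = list(documents)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)
    train = shuffled[:n_train]                  # used to learn the model
    test = shuffled[n_train:n_train + n_test]   # used to tune the model
    verify = shuffled[n_train + n_test:]        # used to evaluate the model
    return train, test, verify


# Toy labeled data: (document text, top-level category) pairs.
categories = ("Financial", "Legal", "Healthcare")
labeled = [(f"document {i}", categories[i % 3]) for i in range(20)]

train, test, verify = split_labeled_data(labeled)
print(len(train), len(test), len(verify))
```

Keeping the verification set separate from the set used for tuning means the final evaluation is run on documents the model has not been adjusted against.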
The labeled training data generates features. The feature text is
tokenized into n-gram words, and stop words, special characters,
punctuation, and similar noise are removed. The classifier converts
the features using a vector space model and generates a
high-dimensional document-feature matrix, then identifies significant
features to reduce the matrix dimension. For each significant feature,
Prisma SaaS computes a term frequency-inverse document frequency
(TF-IDF) weight and normalizes the weight to remove the effect of
differing document lengths. At the end of data preprocessing, the
labeled documents have been transformed into labeled feature vectors
that are fed into the supervised machine learning algorithms.
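The preprocessing steps above (tokenization into n-grams, stop-word removal, TF-IDF weighting, and length normalization) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the Prisma SaaS implementation: the stop-word list, the function names tokenize and tfidf_vectors, the choice of unigrams plus bigrams, the smoothed IDF formula, and L2 normalization are all assumptions. The sketch also removes stop words before forming n-grams, and it omits the significant-feature selection step because the source does not describe its criterion.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list; the list Prisma SaaS uses is not documented here.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}


def tokenize(text, max_n=2):
    """Lowercase the text, drop punctuation and stop words, emit 1..max_n-grams."""
    words = [w for w in re.findall(r"[a-z0-9]+", text.lower())
             if w not in STOP_WORDS]
    grams = []
    for n in range(1, max_n + 1):
        grams.extend(" ".join(words[i:i + n])
                     for i in range(len(words) - n + 1))
    return grams


def tfidf_vectors(texts, max_n=2):
    """Return one L2-normalized TF-IDF weight dictionary per document."""
    counts = [Counter(tokenize(text, max_n)) for text in texts]
    # Number of documents that contain each term.
    doc_freq = Counter(term for c in counts for term in c)
    n_docs = len(texts)
    vectors = []
    for c in counts:
        # Smoothed IDF (one common variant): log((1 + N) / (1 + df)) + 1
        weights = {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, tf in c.items()
        }
        # L2 normalization removes the effect of document length.
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors


docs = [
    "The quarterly balance sheet and income statement.",
    "The patient record and the treatment plan.",
    "The balance of the contract is due.",
]
vectors = tfidf_vectors(docs)
```

In this toy corpus the term "balance" appears in two documents and "sheet" in only one, so "sheet" receives the higher weight in the first document's vector. Each resulting dictionary is a sparse feature vector that could be paired with its category label and passed to a supervised classifier.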
To improve detection rates for sensitive data in your organization, you can
define the machine learning data pattern match criteria to identify these
sensitive assets in your cloud apps and protect them from exposure.
By default, the machine learning category is always enabled and
is applied to all your cloud apps. To change this setting, you must
be an administrator with a Super Admin role or an Admin with access
to All Apps.
Enable or disable the machine learning data pattern.
By default, the machine learning data pattern is always
enabled. If you have a Super Admin account or an Admin account with
access to All Apps, you can disable a machine learning data pattern.
Enable the data pattern by clicking the on/off toggle.