Use the following workflow to configure a Data Filtering profile. This example shows a Data Filtering profile for detecting Social Security Numbers and a custom pattern in .doc and .docx documents.
Data Filtering Configuration Example
Create a Data Filtering security profile.
Objects > Security Profiles > Data Filtering
Enter a name and description for the profile. In this example, the name is DF_Profile1 with the description
Detect Social Security Numbers
(Optional) If you want to collect data that is blocked by the filter, select the
You must set a password as described in
Step 2 if you are using the data capture feature.
(Optional) Secure access to the data filtering logs to prevent other administrators from viewing sensitive data.
When you enable this option, you will be prompted for the password when you view logs in
Monitor > Logs > Data Filtering.
Device > Setup > Content-ID.
Manage Data Protection
in the Content-ID Features section.
Set the password that will be required to view the data filtering logs.
Define the data pattern that will be used in the Data Filtering Profile.
In this example, we will use the keyword
and will set the option to search for SSNs with dashes (for example, 987-65-4320).
It is helpful to set the appropriate thresholds and define keywords within documents to reduce false positives.
From the Data Filtering Profile page click
drop-down. You can also configure data patterns from
Objects > Custom Signatures > Data Patterns.
For this example, name the Data Pattern signature Detect SS Numbers and add the description Data Pattern to detect Social Security numbers.
enter 3. See
Weight and Threshold Values for more details.
(Optional) You can also set
that will be subject to this profile. In this case, you specify a pattern in the custom patterns
field and set a weight. You can add multiple match expressions to the same data pattern profile. In this example, we will create a
named SSN_Custom with a custom pattern of confidential (the pattern is case sensitive) and use a weight of 20. We use the term confidential in this example because we know that our Social Security Word docs contain this term, so we match on it specifically.
Specify which applications to filter and set the file types.
This will detect any supported application, such as web-browsing, FTP, or SMTP. If you want to narrow down the application, you can select it from the list. For applications that use SSL, such as Microsoft Outlook Web App, you will need to enable decryption. Also make sure you understand the naming for each application. For example, Outlook Web App, which is the Microsoft name for this application, is identified as the application outlook-web in the PAN-OS list of applications. You can check the logs for a given application to identify the name defined in PAN-OS.
to only scan doc and docx files.
Specify the direction of traffic to filter and the threshold values.
Both. Files that are uploaded or downloaded will be scanned.
In this case, an alert will be triggered if 5 instances of Social Security Numbers and 1 instance of the term confidential exist. The formula is 5 SSN instances with a weight of 3, which equals 15, plus 1 instance of the term confidential with a weight of 20, which equals 20, for a total of 35.
50. The file will be blocked when the weighted total of SSN matches and matches of the term confidential in the file reaches the threshold of 50 (the threshold is a weighted score, not a count of instances). In this case, if the doc contains 1 instance of the word confidential with a weight of 20, that contributes 20 toward the threshold, and if the doc has 15 Social Security Numbers with a weight of 3, that contributes 45. Add 20 and 45 and you have 65, which exceeds the block threshold of 50.
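The weight and threshold arithmetic above can be sketched in a few lines of Python. This is only an illustration for planning test documents, not firewall code: the function names are hypothetical, the alert threshold of 35 is an assumption inferred from the total in the example, and treating a threshold as reached when the score is equal to or greater than it is also an assumption.

```python
# Weights taken from the example in this procedure.
SSN_WEIGHT = 3
CONFIDENTIAL_WEIGHT = 20

# Assumption: the alert threshold equals the total of 35 worked out above.
ALERT_THRESHOLD = 35
# Block threshold from the example.
BLOCK_THRESHOLD = 50


def weighted_score(ssn_count: int, confidential_count: int) -> int:
    """Return the weighted total for the two patterns in the example."""
    return ssn_count * SSN_WEIGHT + confidential_count * CONFIDENTIAL_WEIGHT


def expected_action(ssn_count: int, confidential_count: int) -> str:
    """Estimate which threshold a document would reach.

    Assumption: a threshold is reached when the score is >= the threshold.
    """
    score = weighted_score(ssn_count, confidential_count)
    if score >= BLOCK_THRESHOLD:
        return "block"
    if score >= ALERT_THRESHOLD:
        return "alert"
    return "none"


if __name__ == "__main__":
    # 5 SSNs and 1 "confidential": 15 + 20 = 35
    print(weighted_score(5, 1), expected_action(5, 1))
    # 15 SSNs and 1 "confidential": 45 + 20 = 65
    print(weighted_score(15, 1), expected_action(15, 1))
```

A helper like this makes it easier to decide how many unique SSNs a test document needs before you upload it.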
Attach the Data Filtering profile to the security rule.
Policies > Security
and select the security policy rule to which to apply the profile.
Click the security policy rule to modify it and then click the
tab. In the
drop-down, select the new data filtering profile you created and then click
to save. In this example, the data filtering rule name is
Test the data filtering configuration.
If you have problems getting Data Filtering to work, check the Data Filtering log or the Traffic log to verify the application that you are testing with, and make sure your test document has the appropriate number of unique Social Security Number instances. For example, an application such as Microsoft Outlook Web App may seem to be identified as web-browsing, but if you look at the logs, the application is
outlook-web. Also increase the number of SSNs or instances of your custom pattern to make sure you are reaching the thresholds.
When testing, you must use real Social Security Numbers and each number must be unique. Also, Custom Patterns, such as the word confidential in this example, are case sensitive. To keep your test simple, you may want to test using just a data pattern first, then test the SSNs.
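Before uploading a test document, it can help to sanity-check its text against the points in this note. The following is a minimal sketch, assuming you have already copied the document text into a Python string. The function name and the regular expression are illustrative only: the expression checks the dashed NNN-NN-NNNN layout and cannot tell whether a number is a real SSN, and it is not the pattern the firewall uses.

```python
import re

# Illustrative pattern for the dashed layout only (NNN-NN-NNNN).
# Assumption: this is NOT the firewall's matcher and does not validate SSNs.
SSN_DASHED = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def summarize_test_text(text: str) -> dict:
    """Count unique dashed SSN-like strings and exact-case 'confidential'."""
    matches = SSN_DASHED.findall(text)
    unique = set(matches)
    return {
        "unique_ssn_like": len(unique),
        "duplicate_ssn_like": len(matches) - len(unique),
        # str.count is case sensitive, so "Confidential" is not counted.
        "confidential_exact": text.count("confidential"),
    }


if __name__ == "__main__":
    sample = "confidential 987-65-4320 987-65-4320 Confidential 987-65-4321"
    print(summarize_test_text(sample))
```

If the summary reports duplicates or a zero count for the lowercase term, adjust the document before testing the thresholds.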
Access a client PC in the trust zone of the firewall and send an HTTP request to upload a .doc or .docx file that contains the exact information you defined for filtering.
Create a Microsoft Word document with one instance of the term confidential and five Social Security numbers with dashes.
Upload the file to a website. Use an HTTP site unless you have decryption configured, in which case you can use HTTPS.
Monitor > Logs > Data Filtering
Locate the log that corresponds to the file you just uploaded. To help filter the logs, use the source of your client PC and the destination of the web server. The action column in the log will show
reset-both. You can now increase the number of Social Security Numbers in the document to test the block threshold.