Reports

Learn about AI Red Teaming reports using Prisma AIRS.
Where Can I Use This?
  • Prisma AIRS (AI Red Teaming)
What Do I Need?
  • Prisma AIRS AI Red Teaming License
  • Prisma AIRS AI Red Teaming Deployment Profile
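Every AI Red Teaming scan generates a report on completion. The report contains a Risk Score, overall metrics, and all successful attack prompts along with the compromised responses. Reports are available for the following scan types:
  • Red Teaming using Attack Library
  • Red Teaming using Agent
  • Red Teaming using Custom Prompt Sets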

Red Teaming using Attack Library Report

Red Teaming using Attack Library reports will give you the following information for each scan:
  • Overall Attack Success Rate—This chart will show the percentage of total attacks that were successful.
  • Risk Score—This is the overall risk score assigned to the AI system based on the findings of the attack library scan. It indicates how susceptible the system is to safety and security risks. A higher Risk Score indicates that the AI system is more vulnerable to safety and security attacks. The Risk Score ranges from 0 to 100, where 0 indicates practically no risk and 100 indicates very high risk. The number of successful attacks and their severity determine the Risk Score.
  • Attacks by Severity—This chart will show the breakdown of successful attacks by severity. Each attack prompt is run multiple times to account for the probabilistic nature of LLMs. Even if the same attack succeeds on multiple attempts, it is counted only once in all metrics.
  • Attacks by Category—This table and chart will show you the success rate of attacks across the categories you picked when starting the scan.
  • Attack Details—All attack prompts that have one or more compromised responses will be shown in this table along with their severity and category. Click View Details to see all responses for that attack, including the ones marked as compromised.
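
The count-once rule described above can be illustrated with a short Python sketch. The record layout (`prompt`, `severity`, `category`, `attempts`) and the severity and category labels are hypothetical and are not a Prisma AIRS export format. The sketch only shows how repeated runs of one prompt collapse into a single successful attack when the metrics are computed.

```python
from collections import Counter

# Hypothetical scan records: one entry per attack prompt, with one boolean
# per attempt (True means the response was marked as compromised).
attacks = [
    {"prompt": "p1", "severity": "HIGH", "category": "Jailbreak", "attempts": [True, True, False]},
    {"prompt": "p2", "severity": "LOW", "category": "Jailbreak", "attempts": [False, False, False]},
    {"prompt": "p3", "severity": "HIGH", "category": "Prompt Injection", "attempts": [False, True, False]},
    {"prompt": "p4", "severity": "MEDIUM", "category": "Prompt Injection", "attempts": [False, False, False]},
]

def summarize(attacks):
    # An attack counts as successful once, no matter how many attempts succeeded.
    successful = [a for a in attacks if any(a["attempts"])]
    total_by_category = Counter(a["category"] for a in attacks)
    success_by_category = Counter(a["category"] for a in successful)
    return {
        # Percentage of attack prompts with at least one compromised response.
        "success_rate": 100 * len(successful) / len(attacks) if attacks else 0.0,
        # Successful attacks split by severity.
        "by_severity": dict(Counter(a["severity"] for a in successful)),
        # Success rate per category selected for the scan.
        "rate_by_category": {
            c: 100 * success_by_category[c] / n for c, n in total_by_category.items()
        },
    }

summary = summarize(attacks)
```

In this sample, `p1` succeeds on two of three attempts but contributes a single successful attack, so the overall success rate is 2 of 4 prompts.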

Red Teaming using Agent Report

Red Teaming using Agent reports will give you the following information for each scan:
  • Overall Attack Success Rate—This chart will show the percentage of total attacks that were successful.
  • Risk Score—As in Red Teaming using Attack Library reports, these reports include an overall Risk Score that indicates how susceptible the AI system is to safety and security risks. The Risk Score is calculated from the number of agent-crafted attack goals that succeeded and the number of techniques needed to achieve them. The Agent always starts with simpler techniques and progressively makes the attacks more sophisticated. The level of complexity needed for a goal to succeed is also factored into the Risk Score.
  • Goals and Attack Metrics—Next to the Risk Score, you can see the number of unique attack goals that the agent attempted to achieve and how many of them were successful. For each goal, the agent tries multiple attack trees. The total number of attacks and the number of successful attacks are shown as well.
  • Attack Details—This section shows the conversation that the agent had with the target in order to achieve the goal. All compromised responses are marked in the conversation.
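
As a rough illustration of how the goal and attack counts relate, the sketch below aggregates a hypothetical agent scan result. The `goals` structure is invented for this example, attack trees are flattened into a single list of attempts per goal, and treating a goal as successful when at least one of its attacks succeeded is an assumption, not documented Prisma AIRS behavior. The Risk Score itself is not computed here because its formula is not specified.

```python
# Hypothetical agent scan result: each goal lists the outcome of every attack
# the agent tried for it (True means the attack produced a compromised response).
goals = [
    {"goal": "g1", "attacks": [False, False, True]},
    {"goal": "g2", "attacks": [False, False]},
    {"goal": "g3", "attacks": [True]},
]

def goal_metrics(goals):
    return {
        "goals_attempted": len(goals),
        # Assumption: a goal is successful if any of its attacks succeeded.
        "goals_successful": sum(1 for g in goals if any(g["attacks"])),
        "total_attacks": sum(len(g["attacks"]) for g in goals),
        "successful_attacks": sum(sum(g["attacks"]) for g in goals),
    }

metrics = goal_metrics(goals)
```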

Red Teaming using Custom Prompt Sets Report

Red Teaming using Custom Prompt Sets reports will give you the following information for each scan:
  • Overall Attack Success Rate—This chart will display the percentage of total attacks that were successful.
  • Attack Success Rate by Prompt Set—This table will display the total prompts in each prompt set along with the number of total attacks, successful attacks, and failed attacks for the selected prompt sets.
  • Successful Attacks by Severity—This chart will display the breakdown of successful attacks by severity. Each attack prompt is run multiple times to account for the probabilistic nature of LLMs. Even if the same attack succeeds on multiple attempts, it is counted only once in all metrics.
  • Successful Attacks by Semantic Category—This chart will display the distribution of successful attacks across semantic categories.
  • Attack Details—All attack prompts that have one or more compromised responses will be shown in this table along with their severity and category. Select View Details to see all responses for that attack, including the ones marked as compromised.
The image below illustrates a custom attack report:
For each report, you can access Conversation Details:
For each report, you can access details about the scan. Select Scan Details in the upper right portion of the scan to display additional information: