Learn about AI Red Teaming reports using Prisma AIRS.
Where Can I Use This?
What Do I Need?
Prisma AIRS (AI Red Teaming)
Prisma AIRS AI Red Teaming License
Prisma AIRS AI Red Teaming Deployment Profile
Every AI Red Teaming scan generates a report on completion. The report contains a
Risk Score, overall metrics, and all successful attack prompts along with the compromised
responses. Supported scans include:
Red Teaming using Attack Library
Red Teaming using Agent
Red Teaming using Custom Prompt Sets
In addition to the completed scan reports, AI Red Teaming also lets you view the
Error Logs and Scan Reports for in-progress, failed, and partially completed
scans.
Red Teaming using Attack Library Report
Red Teaming using Attack Library reports will give you the following information for
each scan:
Overall Attack Success Rate—This chart will show the percentage of total
attacks that were successful.
Risk Score—This is the overall risk score assigned to the AI system based
on the findings of the attack library scan. It indicates the safety and security
risk susceptibility of the system. A higher risk score indicates that the AI
system is more vulnerable to safety and security attacks. The Risk Score ranges
from 0 to 100, where 0 means practically no risk and 100 means very high risk.
The number of successful attacks and their severity determine the risk score.
Attacks by Severity—This chart will show you the split of successful
attacks by severity. Each attack prompt is run multiple times to account for the
probabilistic nature of LLMs. Even if the same attack is successful on multiple
attempts, it is counted only once for all metrics.
Attacks by Category—This table and chart will show you the success rate
of attacks across the categories you picked when starting the scan.
Attack Details—All attack prompts that have one or more compromised
responses will be shown in this table along with their severity and category. You
can click View Details to look at all the responses for that attack and the ones
that were marked as compromised.
The following is a sample attack library report for the Brand Reputation
Risk scan category:
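To make the relationship between these metrics concrete, the sketch below computes a severity-weighted risk score in the 0 to 100 range from a list of attack attempts, applying the same deduplication rule described above (repeated successes for one prompt count once). The severity weights and the normalization are assumptions chosen for illustration only; the actual Prisma AIRS scoring formula is not published in this document.

```python
# Illustrative only: severity weights and normalization are assumptions,
# not the actual Prisma AIRS scoring formula.
SEVERITY_WEIGHTS = {"low": 1, "medium": 2, "high": 3, "critical": 4}

def risk_score(attempts):
    """attempts: list of (prompt_id, severity, compromised) tuples.
    Repeated successes for the same prompt count once, matching the
    report's deduplication rule."""
    prompts = set()      # unique (prompt_id, severity) pairs that were run
    successful = {}      # prompt_id -> severity, deduplicated across attempts
    for prompt_id, severity, compromised in attempts:
        prompts.add((prompt_id, severity))
        if compromised:
            successful[prompt_id] = severity
    if not prompts:
        return 0
    max_possible = sum(SEVERITY_WEIGHTS[s] for _, s in prompts)
    actual = sum(SEVERITY_WEIGHTS[s] for s in successful.values())
    return round(100 * actual / max_possible)
```

For example, a high-severity prompt that succeeds on two of its attempts contributes its weight only once, so the score reflects unique compromised prompts rather than raw attempt counts.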
Red Teaming using Agent Report
Red Teaming using Agent reports will give you the following information for each
scan:
Overall Attack Success Rate—This chart will show the percentage of
total attacks that were successful.
Risk Score—Similar to Red Teaming using Attack Library reports, these
reports also have an overall Risk Score indicating the safety and security
risk susceptibility of the AI system. The Risk Score is calculated from
the number of attack goals crafted by the agent that were successful and
the number of techniques that had to be used to achieve them. The agent
always starts with simpler attack techniques and progressively makes the
attacks more sophisticated. The level of complexity needed for a goal to
succeed is also factored into the risk score.
Goals and Attack Metrics—Next to the Risk Score, you can see the number
of unique attack goals that the agent attempted to achieve and how many
were successful. For each goal, the agent tries multiple attack trees; the
total number of attacks and the number of successful attacks are shown as
well.
Attack Details—In this section, you can see the conversation that the
agent has with the target in order to achieve the goal. All compromised
responses are marked in the conversation.
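The agent scoring described above can be sketched as follows: because the agent escalates from simpler to more sophisticated techniques, a goal that succeeds at a low complexity level suggests higher risk than one that only succeeds after heavy escalation. The specific weighting and the `max_level` parameter below are assumptions for illustration; the published description only states which factors the score accounts for.

```python
# Illustrative only: the actual agent risk formula is not published.
# This sketch assumes goals broken by simpler techniques contribute more risk.
def agent_risk_score(goal_results, max_level=5):
    """goal_results: one (succeeded, level_needed) entry per unique goal,
    where level_needed is the technique-complexity level (1 = simplest)
    at which the goal first succeeded, or None if it never succeeded."""
    if not goal_results:
        return 0
    total = 0.0
    for succeeded, level in goal_results:
        if succeeded:
            # A goal that falls to a simple technique indicates higher risk.
            total += (max_level - level + 1) / max_level
    return round(100 * total / len(goal_results))
```

Under these assumptions, a goal achieved at level 1 contributes full weight, a goal achieved only at the maximum level contributes a small fraction, and failed goals contribute nothing.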
Red Teaming using Custom Prompt Sets Report
Red Teaming using Custom Prompt Sets reports will give you the following information
for each scan:
Overall Attack Success Rate—This chart will display the percentage of
total attacks that were successful.
Attack Success Rate by Prompt Set—This table will display the total
prompts in the prompt set along with the number of total attacks, successful
attacks, and failed attacks for selected prompt sets.
Successful Attacks by Severity—This chart will display the split of
successful attacks by severity. Each attack prompt is run multiple times to
account for the probabilistic nature of LLMs. Even if the same attack is
successful on multiple attempts, it is counted only once for all metrics.
Successful Attacks by Semantic Category—This chart will display the
distribution of successful attacks across semantic categories.
Attack Details—All attack prompts that have one or more compromised
responses will be shown in this table along with their severity and category.
Select View Details to look at all the responses for that attack and the ones
that were marked as compromised.
The image below illustrates a custom attack report:
For each report, you can access Conversation Details:
For each report, you can access details about the scan. Select Scan
Details in the upper right portion of the scan to display additional
information:
Partial Scan Reports
The AI Red Teaming partial scan reports provide you with enhanced
visibility and control over your AI red teaming operations when scans encounter
issues or complete with limited responses. This feature addresses common enterprise
scenarios where target AI systems may experience intermittent availability, network
connectivity issues, or rate limiting that prevents complete scan execution.
When you initiate a scan, AI Red Teaming tracks the simulation progress and
response collection separately.
It distinguishes between four completion states that determine the actions
available to you.
Failed scans occur when the simulation cannot complete due
to critical errors or target unavailability.
Partially complete scans represent successful simulations
where the target system responded to a subset of attacks above your
configured threshold.
In Progress scans are currently executing their attack
simulation against the target system, with real-time error visibility
available on the progress page.
Complete scans indicate full simulation and response
collection without significant errors.
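The four completion states above can be summarized as a small decision rule based on whether the simulation finished, whether it hit a critical error, and what fraction of attacks received responses. The exact threshold semantics are an assumption here; the documentation only says that partially complete scans received responses to a subset of attacks above your configured threshold.

```python
# Illustrative only: state names follow the documentation, but the exact
# threshold semantics are an assumption for this sketch.
def scan_state(finished, had_critical_error, responses, attacks, threshold=0.5):
    """Classify a scan into one of the four documented completion states:
    In Progress, Failed, Partially Completed, or Complete."""
    if not finished:
        return "In Progress"
    if had_critical_error or attacks == 0:
        return "Failed"          # critical error or target unavailability
    response_rate = responses / attacks
    if response_rate >= 1.0:
        return "Complete"        # full simulation and response collection
    if response_rate >= threshold:
        return "Partially Completed"
    return "Failed"
```

In this sketch, a scan whose target answered 6 of 10 attacks with a 0.5 threshold would be classified as Partially Completed and would therefore be eligible for report generation.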
When the status of a scan is Partially Completed, select
Generate Report, which provides two options:
View Error Logs and Generate
Report.
When you choose to generate reports from partially complete
scans, AI Red Teaming presents you with credit cost information before processing,
enabling you to review error logs and make informed decisions about report
generation value. Once you confirm report generation, AI Red Teaming deducts credits
according to your scan type configuration and produces a comprehensive report
clearly marked with partial completion indicators and response percentage
metrics.
Select Generate Report to generate a detailed report.
Error Logs
Error logs integrate directly into your existing scan workflow through two
primary access points within the AI Red Teaming interface.
During active scan execution, you can monitor real-time error
information on the Scans page, enabling you to make informed
decisions about continuing or terminating problematic scans.
After scan completion, comprehensive error logs become accessible
through the Scans page, providing detailed diagnostic information for
troubleshooting and root cause analysis.
When the status of a scan is Failed, you can select
View Error Logs from the Actions
column on the Scans page.
When the status of a scan is Partially Complete, select
Generate Report, which provides two options:
View Error Logs and Generate
Report. Select View Error Logs to review the error
logs.
You can also view the Error Logs for a scan that is In Progress from the
Scans page.