Red Teaming using Custom Prompt Sets Report
Red Teaming using Custom Prompt Sets reports provide the following information
for each scan:
- Overall Attack Success Rate—This chart will display the percentage of
total attacks that were successful.
- Attack Success Rate by Prompt Set—This table will display the total
prompts in each selected prompt set along with the total number of attacks,
successful attacks, and failed attacks.
- Successful Attacks by Severity—This chart will display the split of
successful attacks by severity. Each attack prompt is run multiple times to
account for the probabilistic nature of LLMs. Even if the same attack is
successful on multiple attempts, it will be counted only once for all metrics.
- Successful Attacks by Semantic Category—This chart will display the
distribution of successful attacks across semantic categories.
- Attack Details—All attack prompts that have one or more compromised
responses will be shown in this table along with their severity and category.
Select View Details to see all the responses for that attack and which of
them were marked as compromised.
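
The counting rule described above (each attack prompt is run several times but counted only once) can be illustrated with a short sketch. This is not the product's implementation or export format: the `Attempt` record, its field names, the `summarize` helper, and the severity labels are all hypothetical, and the sketch assumes that an attack counts as successful when at least one of its attempts yields a compromised response.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    """One run of an attack prompt (hypothetical record shape)."""
    prompt_id: str
    prompt_set: str
    severity: str
    compromised: bool


def summarize(attempts):
    """Collapse repeated attempts so each attack prompt is counted once."""
    attacks = {}  # prompt_id -> [prompt_set, severity, succeeded]
    for a in attempts:
        entry = attacks.setdefault(a.prompt_id, [a.prompt_set, a.severity, False])
        # Assumption: an attack is successful if any attempt was compromised.
        entry[2] = entry[2] or a.compromised

    total = len(attacks)
    successful = [e for e in attacks.values() if e[2]]
    rate = 100.0 * len(successful) / total if total else 0.0
    return {
        "total_attacks": total,
        "successful_attacks": len(successful),
        "failed_attacks": total - len(successful),
        "success_rate_pct": rate,
        "successful_by_severity": dict(Counter(e[1] for e in successful)),
    }


# Illustrative data only: four attack prompts, some run more than once.
attempts = [
    Attempt("p1", "set-a", "HIGH", True),
    Attempt("p1", "set-a", "HIGH", True),   # succeeds again: still one success
    Attempt("p1", "set-a", "HIGH", False),
    Attempt("p2", "set-a", "LOW", False),
    Attempt("p2", "set-a", "LOW", False),
    Attempt("p3", "set-b", "LOW", True),
    Attempt("p4", "set-b", "MEDIUM", False),
]
summary = summarize(attempts)
print(summary)
```

In this sample, prompt `p1` succeeds on two of its three attempts but contributes a single successful attack, so two of the four attacks are successful.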
The image below illustrates a custom attack report:
For each report, you can access Conversation Details:
For each report, you can access details about the scan. Select Scan
Details in the upper right portion of the scan to display additional
information: