Incidents and Alerts

Learn about the incidents and alerts that Prisma SD-WAN generates and how to manage them.
Where Can I Use This?
  • Prisma SD-WAN
What Do I Need?
  • Prisma SD-WAN license
Prisma SD-WAN generates alerts and incidents when the system reaches system-defined or customer-defined thresholds or when there is a fault in the system. The Overview tab lists events by category and shows whether they are Critical, Warning, or Informational. It also displays Incidents by Priority, Your Top Incidents, and Your Top Alerts.
Use incidents and alerts to troubleshoot the system.
An alert may or may not be an indication of a fault in the network. An alert is raised when the system reaches system-defined or customer-defined thresholds.
An incident is an indication of a fault in the system. Incidents are raised and cleared and vary in severity:
  • Critical—Whole or part of a network is down and requires immediate action.
  • Warning—Impacts the network and needs immediate attention.
  • Informational—Network is degraded and needs attention soon.
Use the Settings tab to set up incident policies, which manage event code suppression based on the classifications and action attributes you configure. You can use incident policy rules to suppress or escalate incidents that arise during a scheduled time period. You can also change the default priority of system-generated incidents to a priority level that better matches your business requirements.
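The way a scheduled policy rule suppresses or escalates an incident can be sketched as follows. This is an illustrative model only, not the Prisma SD-WAN API: the `Incident` and `PolicyRule` classes, the `apply_rules` function, the "first matching rule wins" behavior, and the event code `EXAMPLE_CODE_A` are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from datetime import datetime


@dataclass(frozen=True)
class Incident:
    code: str            # event code (placeholder value, not a real code)
    priority: int        # 1 (highest) to 5 (lowest)
    raised_at: datetime
    suppressed: bool = False


@dataclass(frozen=True)
class PolicyRule:
    name: str
    codes: frozenset     # event codes this rule applies to
    start: datetime      # start of the scheduled window (inclusive)
    end: datetime        # end of the scheduled window (exclusive)
    action: str          # "suppress" or "escalate"
    new_priority: int = 1  # priority assigned when the action is "escalate"

    def matches(self, incident: Incident) -> bool:
        in_window = self.start <= incident.raised_at < self.end
        return incident.code in self.codes and in_window


def apply_rules(incident: Incident, rules: list) -> Incident:
    """Return a copy of the incident with the first matching rule applied.

    Assumption: rules are evaluated in order and the first match wins.
    """
    for rule in rules:
        if not rule.matches(incident):
            continue
        if rule.action == "suppress":
            return replace(incident, suppressed=True)
        if rule.action == "escalate":
            return replace(incident, priority=rule.new_priority)
    return incident


# Hypothetical maintenance window that suppresses one event code overnight.
maintenance = PolicyRule(
    name="maintenance-window",
    codes=frozenset({"EXAMPLE_CODE_A"}),
    start=datetime(2024, 1, 1, 22, 0),
    end=datetime(2024, 1, 2, 2, 0),
    action="suppress",
)
inside = apply_rules(
    Incident("EXAMPLE_CODE_A", 3, datetime(2024, 1, 1, 23, 30)), [maintenance]
)
outside = apply_rules(
    Incident("EXAMPLE_CODE_A", 3, datetime(2024, 1, 2, 9, 0)), [maintenance]
)
```

In this sketch, `inside` is marked as suppressed because it was raised during the window, while `outside` is returned unchanged.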

Filter Alerts and Incidents

Filter and sort alerts and incidents by various parameters so that you can take appropriate action on the events that require attention. Select the Filter widget on the Troubleshooting page to filter alerts and incidents.
Acknowledge indicates that you are aware of the incident but may not be taking any action at this time. You can acknowledge only unresolved incidents. Acknowledging an incident lets you display and focus on the incidents that still require attention. You can select one incident, or several at once (bulk acknowledge).
Unacknowledge reverses an acknowledgment and returns the incident to the unacknowledged state. You can unacknowledge only acknowledged incidents. You can select one or more incidents to unacknowledge.
Filter and sort alerts and incidents based on the following criteria:
  • Filter By—Filter alerts and incidents by their status:
    • Show Resolved—Displays only resolved incidents when the fault causing the incident is removed.
    • Include Acknowledged—Displays acknowledged and unacknowledged incidents.
    • Show Only Acknowledged—Displays only acknowledged incidents.
    • Show Only Suppressed—Displays only suppressed incidents.
    • Include Suppressed—Displays suppressed and unsuppressed incidents.
      The acknowledged and suppressed filters apply only to incidents, not to alerts. When you filter for acknowledged incidents, you can unacknowledge the incidents that are displayed.
  • Sort By—Sort alerts and incidents by time, to display the latest first, or by severity.
  • Sites—Filter alerts and incidents by site, based on:
    • Site—Name or address search.
    • Viewing—Traffic volume, initiation failure, transaction failure.
    • Site type—Branch or data center.
    • Admin state of the site—Active, monitor, or disabled.
  • Severity—Sort alerts and incidents based on the following severity categories:
    • Critical—Whole or part of a network is down and requires immediate action.
    • Warning—Impacts the network and needs immediate attention.
    • Informational—Degrades the network and needs attention soon.
  • Priority—Sort alerts and incidents based on the priority level:
    • Priority 1 (P1)
    • Priority 2 (P2)
    • Priority 3 (P3)
    • Priority 4 (P4)
    • Priority 5 (P5)
  • Category—Sort alerts and incidents based on the following options:
    • Network—Indicates network faults.
    • Device—Indicates device hardware, software, interface, or registration issues.
    • Cellular—Indicates cellular issues.
    • Application—Indicates application issues.
    • Policy—Indicates policy issues.
    • Branch HA—Indicates spoke HA issues.
    • Authentication—Indicates authentication failures.
    • User ID—Indicates User ID issues.
  • Code—Sort alerts and incidents based on the alert and incident event codes.
  • Time—Sort alerts and incidents by time to display the latest first.
  • Correlation ID—Correlation ID is a system-generated ID for a raised incident. An incident is associated with raise and clear states. There can be multiple incidents with the same event code in either a raised or cleared state at any given time. Using the correlation ID, you may distinguish between incidents with the same event code. When an incident is cleared, the correlation ID indicates that the specific incident is cleared. This ID is always associated with an incident even if the incident is cleared or resolved.
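The status filters, the sort options, and the role of the correlation ID can be sketched as follows. This is a minimal illustration, not the product's implementation: the `Event` class, the function names, the placeholder event codes, and the default view (unresolved, unacknowledged, unsuppressed) are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Lower number sorts first.
SEVERITY_ORDER = {"critical": 0, "warning": 1, "informational": 2}


@dataclass
class Event:
    correlation_id: str   # unique per raised incident
    code: str             # event code (placeholder values below)
    severity: str         # "critical", "warning", or "informational"
    raised_at: datetime
    acknowledged: bool = False
    suppressed: bool = False
    resolved: bool = False


def filter_events(events, *, show_resolved=False,
                  include_acknowledged=False, only_acknowledged=False,
                  include_suppressed=False, only_suppressed=False):
    """Apply the status filters; the default view is an assumption."""
    result = []
    for e in events:
        if e.resolved != show_resolved:
            continue
        if only_acknowledged:
            if not e.acknowledged:
                continue
        elif e.acknowledged and not include_acknowledged:
            continue
        if only_suppressed:
            if not e.suppressed:
                continue
        elif e.suppressed and not include_suppressed:
            continue
        result.append(e)
    return result


def sort_events(events, by="time"):
    """Sort newest first, or by severity with newest first within a severity."""
    newest_first = sorted(events, key=lambda e: e.raised_at, reverse=True)
    if by == "severity":
        # sorted() is stable, so the newest-first order survives within a severity.
        return sorted(newest_first, key=lambda e: SEVERITY_ORDER[e.severity])
    return newest_first


def find_by_correlation_id(events, correlation_id) -> Optional[Event]:
    """Pick out one specific incident, even when others share its event code."""
    return next((e for e in events if e.correlation_id == correlation_id), None)


events = [
    Event("c-1", "EXAMPLE_CODE", "warning", datetime(2024, 1, 1, 10)),
    Event("c-2", "EXAMPLE_CODE", "critical", datetime(2024, 1, 1, 11), acknowledged=True),
    Event("c-3", "OTHER_CODE", "critical", datetime(2024, 1, 1, 12)),
    Event("c-4", "OTHER_CODE", "informational", datetime(2024, 1, 1, 9), resolved=True),
]

default_view = sort_events(filter_events(events))
with_acknowledged = sort_events(filter_events(events, include_acknowledged=True), by="severity")
```

Here `c-1` and `c-2` share the event code `EXAMPLE_CODE`, so only the correlation ID tells them apart, which is the case `find_by_correlation_id` is meant to illustrate.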

Event Correlation of Incidents

The event engine performs multiple functions, such as incident correlation, suppression, and escalation, depending on the network conditions and the event policy rules that the administrator configures. Automatically correlating incidents into a single event improves the operational efficiency of the app-fabric, and event policies give you comprehensive control over the event framework.
The controller analyzes the incoming incidents from the ION devices to determine whether they are related, and then aggregates related incidents into a single incident in real time. For example, if the controller receives multiple VPN down incidents, it analyzes the incidents in real time, determines whether they are related, and generates a single Secure Fabric Link incident for the event, while suppressing the original incidents.
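The aggregation described above can be sketched as follows. This is a simplified model, not the controller's actual logic: it assumes that incidents are "related" when they share a site, and the `Incident` class, the `correlate` function, the `min_children` threshold, and the codes `VPN_DOWN` and `SECURE_FABRIC_LINK_DOWN` are hypothetical placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Incident:
    correlation_id: str
    code: str            # placeholder event codes, not real ones
    site: str
    suppressed: bool = False


def correlate(incidents, child_code="VPN_DOWN",
              parent_code="SECURE_FABRIC_LINK_DOWN", min_children=2):
    """Aggregate related child incidents into one parent incident per site.

    Assumption: incidents with the same code at the same site are related.
    The original incidents are marked suppressed; the new parents are returned.
    """
    by_site = defaultdict(list)
    for incident in incidents:
        if incident.code == child_code:
            by_site[incident.site].append(incident)

    parents = []
    for site, children in by_site.items():
        if len(children) < min_children:
            continue  # a lone incident is left as-is
        for child in children:
            child.suppressed = True
        parents.append(Incident(f"agg-{site}", parent_code, site))
    return parents


incoming = [
    Incident("c-1", "VPN_DOWN", "branch-1"),
    Incident("c-2", "VPN_DOWN", "branch-1"),
    Incident("c-3", "VPN_DOWN", "branch-1"),
    Incident("c-4", "VPN_DOWN", "branch-2"),
]
aggregated = correlate(incoming)
```

In this sketch, the three `branch-1` incidents are suppressed and replaced by one aggregated incident, while the single `branch-2` incident is left untouched.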