Incident Customization for Raise and Clear Conditions

Learn how to customize the raise and clear conditions for incidents.
Where Can I Use This?
  • Prisma Access
The Unified Incident Framework gives you fine-grained control over how network incidents are detected, raised, and cleared across your Prisma Access environments. You can define custom thresholds and monitoring parameters for critical network events so that your monitoring strategy aligns with your operational requirements and service level objectives.
You can now define specific conditions that must be met before an incident is raised or cleared, including customizable time windows, event frequency thresholds, and state persistence requirements. This dynamic approach minimizes incident volume by preventing transient network anomalies from triggering unnecessary incidents, while simultaneously ensuring sustained critical issues receive immediate, high-priority attention.
For tunnel and BGP connection monitoring, you can specify how long a resource must remain in a down state before an incident is raised and how long it must be operational before an existing incident is cleared. This helps prevent incident fatigue by filtering out brief connectivity interruptions that resolve on their own.
For site long duration monitoring, you can customize the parameters for both raising and clearing incidents related to prolonged site capacity overutilization. An incident is raised if capacity utilization exceeds a set threshold for the specified Minimum Breach Hours Per Day over the set number of Minimum Breach Days. To clear an incident, utilization must remain below the threshold for the required Minimum Breach Hours Per Day across the designated number of Minimum Days to Clear. In both cases, you control alert sensitivity by adjusting the duration and frequency required to transition between states.
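The down-state persistence behavior described above for tunnels and BGP connections can be pictured as a small state tracker. The following is a minimal sketch, not the Strata Cloud Manager implementation: the class name `DownIncidentTracker`, its fields, and the idea of feeding it timestamped status samples are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DownIncidentTracker:
    """Hypothetical model of a down incident with persistence requirements."""
    raise_after_s: float               # resource must stay down this long before raising
    clear_after_s: float               # resource must stay up this long before clearing
    raised: bool = False
    _state_up: Optional[bool] = None   # last observed state (None until first sample)
    _since: float = 0.0                # time at which the current state began

    def observe(self, now_s: float, is_up: bool) -> bool:
        """Record a status sample and return whether the incident is raised."""
        if is_up != self._state_up:
            # State changed: restart the persistence timer.
            self._state_up = is_up
            self._since = now_s
        held_for = now_s - self._since
        if not self.raised and not is_up and held_for >= self.raise_after_s:
            self.raised = True
        elif self.raised and is_up and held_for >= self.clear_after_s:
            self.raised = False
        return self.raised
```

With illustrative values of 600 seconds to raise and 480 seconds to clear, a one-minute outage never raises the incident, because the timer restarts as soon as the resource comes back up. The incident is raised only after the resource has been down for the full 600 seconds, and it clears only after the resource has then stayed up for the full 480 seconds.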
The framework uses a longest-match algorithm to determine which incident settings apply to a particular resource. This lets you build a hierarchy of monitoring policies that ranges from global defaults to site-specific or tunnel-specific configurations, so you can apply stricter monitoring to critical infrastructure while keeping more relaxed thresholds for less sensitive components. See Incident Setting Resolution for more information.
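To illustrate the general idea of longest-match resolution, here is a minimal sketch in which the setting with the most matching criteria wins. The data model (a `match` dictionary per setting, with an empty dictionary acting as a global default), the setting names, and the function `resolve_setting` are hypothetical; the actual matching rules are described in Incident Setting Resolution.

```python
from typing import Optional

# Hypothetical settings: each has a name and the resource attributes it matches.
# An empty "match" acts as a global default.
SETTINGS = [
    {"name": "global-default", "match": {}},
    {"name": "site-nyc", "match": {"site": "nyc"}},
    {"name": "nyc-tunnel-1", "match": {"site": "nyc", "tunnel": "tunnel-1"}},
]


def resolve_setting(resource: dict, settings: list) -> Optional[dict]:
    """Return the applicable setting with the most matching criteria."""
    applicable = [
        s for s in settings
        if all(resource.get(key) == value for key, value in s["match"].items())
    ]
    if not applicable:
        return None
    # The most specific setting is the one with the most match criteria.
    return max(applicable, key=lambda s: len(s["match"]))
```

In this sketch, a resource on site `nyc` with tunnel `tunnel-1` resolves to `nyc-tunnel-1`, another tunnel on the same site falls back to `site-nyc`, and a resource on any other site falls back to `global-default`.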
Here are the Prisma Access incident codes that support customization:
  • Tunnel down:
    • INC_RN_PRIMARY_WAN_TUNNEL_DOWN
    • INC_RN_SECONDARY_WAN_TUNNEL_DOWN
    • INC_RN_ECMP_TUNNEL_DOWN
    • INC_SC_PRIMARY_WAN_TUNNEL_DOWN
    • INC_SC_SECONDARY_WAN_TUNNEL_DOWN
  • Tunnel flaps:
    • INC_RN_PRIMARY_WAN_TUNNEL_FLAP
    • INC_RN_SECONDARY_WAN_TUNNEL_FLAP
    • INC_RN_ECMP_TUNNEL_FLAP
    • INC_SC_PRIMARY_WAN_TUNNEL_FLAP
    • INC_SC_SECONDARY_WAN_TUNNEL_FLAP
  • BGP down:
    • INC_RN_PRIMARY_WAN_BGP_DOWN
    • INC_RN_SECONDARY_WAN_BGP_DOWN
    • INC_RN_ECMP_BGP_DOWN
    • INC_SC_PRIMARY_WAN_BGP_DOWN
    • INC_SC_SECONDARY_WAN_BGP_DOWN
  • BGP flaps:
    • INC_RN_PRIMARY_WAN_BGP_FLAP
    • INC_RN_SECONDARY_WAN_BGP_FLAP
    • INC_RN_ECMP_BGP_FLAP
    • INC_SC_PRIMARY_WAN_BGP_FLAP
    • INC_SC_SECONDARY_WAN_BGP_FLAP
  • Site Long Duration:
    • INC_RN_SITE_LONG_DURATION_CAPACITY_EXCEEDED_THRESHOLD
    • INC_SC_SITE_LONG_DURATION_CAPACITY_EXCEEDED_THRESHOLD

Configure Incident Raise and Clear Conditions

  1. Navigate to Incidents > Incidents > Settings.
  2. Select New Custom to create a new incident setting.
  3. Enter the Setting Name and Description.
  4. Select the Product, Incident Category, Incident Subcategory, and Incident Code.
  5. If you change the Product after selecting the other fields, those fields are reset.
  6. Select the Object Type and the condition associated with it.
  7. Select the action that Strata Cloud Manager takes when the conditions above are met. Select Raise or Suppress and set the priority. The severity of the incident is derived from the incident code.
  8. Configure the raise and clear conditions of an incident.
    If an incident supports customization, you can configure the specific conditions that must be met before the incident is raised or cleared, including customizable time windows, event frequency thresholds, and state persistence requirements.
    Here is an example of the INC_RN_PRIMARY_WAN_BGP_DOWN incident that supports customization of raise and clear conditions.
    In this example, the incident is configured to be raised when the BGP status remains 'down' for a minimum of 1 minute. Conversely, the incident is cleared once the BGP status is consistently 'up' for a period of at least 5 minutes.
    Clicking Revert to default values restores the raise and clear conditions to their original settings. For instance, the default configuration dictates that this incident is raised when BGP remains down for at least 10 minutes and cleared when BGP is up for a minimum of 8 minutes.
    Here is another example, for the incident INC_RN_SITE_LONG_DURATION_CAPACITY_EXCEEDED_THRESHOLD. In this example, the incident is raised when capacity utilization exceeds 80% for at least 1 hour per day for 1 day within a 2-day evaluation window. You can customize all of these parameters. The number of breach days must be less than or equal to the evaluation window.
    Consider the incident INC_RN_PRIMARY_WAN_TUNNEL_FLAP. This incident is raised when a primary WAN tunnel transitions from UP to DOWN a specified number of times within a defined window. In this example, Strata Cloud Manager raises the incident if the tunnel flaps three times within ten minutes, counting only from when the tunnel is initially UP. You can adjust these thresholds; for instance, a 15-minute duration allows a flap count between 3 and 9. The clear time interval must always be greater than or equal to the raise time interval. For example, if you increase the raise window to 18 minutes, you must set the clear interval to at least 18 minutes as well.
  9. Select the notification profile.
  10. Save Setting.
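
The parameter constraints and the site long duration example from step 8 can be sketched in code. This is an illustration under stated assumptions, not the product's evaluation logic: the function names, the use of hourly utilization samples, and the choice to count breach hours without requiring them to be consecutive are all hypothetical.

```python
from typing import List


def validate_site_params(min_breach_days: int, evaluation_window_days: int) -> None:
    """Breach days must be less than or equal to the evaluation window."""
    if min_breach_days > evaluation_window_days:
        raise ValueError("breach days must not exceed the evaluation window")


def validate_flap_intervals(raise_window_min: int, clear_interval_min: int) -> None:
    """The clear interval must be greater than or equal to the raise window."""
    if clear_interval_min < raise_window_min:
        raise ValueError("clear interval must be at least the raise window")


def should_raise_site_incident(
    daily_hourly_utilization: List[List[float]],
    threshold_pct: float,
    min_breach_hours_per_day: int,
    min_breach_days: int,
) -> bool:
    """Decide whether to raise, given one list of hourly samples per day.

    The number of inner lists is treated as the evaluation window in days
    (an assumption of this sketch).
    """
    validate_site_params(min_breach_days, len(daily_hourly_utilization))
    breach_days = 0
    for day in daily_hourly_utilization:
        # Count the hours in this day that exceed the utilization threshold.
        breach_hours = sum(1 for pct in day if pct > threshold_pct)
        if breach_hours >= min_breach_hours_per_day:
            breach_days += 1
    return breach_days >= min_breach_days
```

Using the values from the example above (80% threshold, 1 breach hour per day, 1 breach day, 2-day window), a window in which a single hour on the second day reaches 85% satisfies the raise condition in this sketch. Likewise, `validate_flap_intervals(18, 18)` passes, while an 18-minute raise window paired with a shorter clear interval is rejected.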