Best Practices and Recommendations

This section lists best practices and recommendations for applying SLAs when configuring Performance Policy.
Where Can I Use This?
  • Prisma SD-WAN
What Do I Need?
  • Active Prisma SD-WAN license.
  • Physical and virtual ION devices running software version 6.3.1 and higher.
Performance Policy provides a flexible framework for the assurance of Application and Network SLAs. In this section we will review sample policy rules for several common use cases along with general guidelines for implementation. Performance Policy is supported on ION device versions 6.3.1 and higher. The following are the recommended best practices when configuring Performance Policy:
  1. Simple Policy Sets: Use simple policy stacks unless the modular flexibility of advanced stacks is required.
  2. Rule Order: Performance Policy evaluates rules in an explicit order, so more specific rules (app match, path match, DC Group, etc.) must be placed at the top of the policy set and less specific rules at the bottom. Any match field left empty is treated as match-all.
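
The ordering guidance above can be illustrated with a minimal sketch. It assumes top-down, first-match evaluation, which the explicit ordering implies but the text does not spell out. The `Rule` class, its field names, and `first_match` are hypothetical illustrations, not part of any Prisma SD-WAN API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule model; a field left as None matches everything."""
    name: str
    app: Optional[str] = None
    path: Optional[str] = None
    dc_group: Optional[str] = None


def matches(rule: Rule, app: str, path: str, dc_group: Optional[str]) -> bool:
    """A rule matches when every non-empty field equals the flow's value."""
    flow = {"app": app, "path": path, "dc_group": dc_group}
    return all(
        getattr(rule, field) is None or getattr(rule, field) == value
        for field, value in flow.items()
    )


def first_match(rules: Sequence[Rule], app: str, path: str,
                dc_group: Optional[str]) -> Optional[Rule]:
    """Assumed behavior: walk the set top-down and return the first hit."""
    for rule in rules:
        if matches(rule, app, path, dc_group):
            return rule
    return None


# Most specific rule first, match-all rule last.
policy_set = [
    Rule("voice-on-mpls", app="voice", path="mpls"),
    Rule("voice-any-path", app="voice"),
    Rule("default"),
]
```

With this ordering, `first_match(policy_set, "voice", "mpls", "dc1")` returns the `voice-on-mpls` rule. If the list is reversed, the empty `default` rule matches every flow first and the specific rules are never reached, which is why less specific rules belong at the bottom.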
  3. Migration of LQM and APT thresholds from Advanced Menu: Prior to the availability of Performance Policy in 6.3.1, performance-based path selection was configured through the Advanced menu. As of 6.3.1, this configuration is no longer used by the device, and the rules must be configured in a performance policy set applied to the site.
  4. Functional Limits for Forward Error Correction (FEC) and Packet Duplication: FEC and Packet Duplication are adaptive and will only invoke when a Prisma SD-WAN VPN path exceeds the packet loss threshold specified in the SLA. As FEC or Packet Duplication is invoked, additional resources are required for processing the packet recovery information. The maximum VPNs actively encoding recovery information per platform are listed below:
    ION Model   Max VPNs Branch   Max VPNs DC
    1000        8                 N/A
    1200        8                 N/A
    1200-S      8                 N/A
    2000        8                 N/A
    3000        16                32
    3200        16                32
    5200        32                128
    7000        32                128
    9000        64                256
    9200        64                256
    • The branch ION determines if the SLA will be met in both the inbound and outbound direction on a per path basis. In the case that inbound (from the Data Center) loss exceeds the SLA, the branch ION sends an in-band instruction attached to a packet to the Data Center ION instructing it to invoke FEC for the affected flow.
    • If the number of VPNs actively invoking FEC and Packet Duplication meets the platform limit (above) then no further VPNs will be able to encode or decode recovery information.
    • When an ION simultaneously applies Forward Error Correction (FEC) and Packet Duplication on traffic from the same VPN, this counts as a single VPN instance.
    • ION Device version 6.3.2 or higher is recommended when using Forward Error Correction.
    • ION Device version 6.4.1 or higher is required when using Packet Duplication.
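
As a rough illustration of the counting rules above, the following sketch checks how many more VPNs could start encoding recovery information on a given platform. The limits are copied from the table in this item. The function names, the `role` argument, and the idea of tracking VPNs by an identifier string are assumptions made for the example, not device behavior or an API.

```python
from typing import Dict, Iterable, Optional, Tuple

# model -> (max VPNs as branch, max VPNs as DC); None stands for N/A.
MAX_RECOVERY_VPNS: Dict[str, Tuple[int, Optional[int]]] = {
    "1000": (8, None), "1200": (8, None), "1200-S": (8, None),
    "2000": (8, None), "3000": (16, 32), "3200": (16, 32),
    "5200": (32, 128), "7000": (32, 128),
    "9000": (64, 256), "9200": (64, 256),
}


def active_recovery_vpns(fec_vpns: Iterable[str],
                         dup_vpns: Iterable[str]) -> int:
    """A VPN running both FEC and Packet Duplication counts once."""
    return len(set(fec_vpns) | set(dup_vpns))


def remaining_capacity(model: str, role: str,
                       fec_vpns: Iterable[str],
                       dup_vpns: Iterable[str]) -> int:
    """Return how many more VPNs may begin encoding recovery information."""
    branch_limit, dc_limit = MAX_RECOVERY_VPNS[model]
    limit = branch_limit if role == "branch" else dc_limit
    if limit is None:
        raise ValueError(f"ION {model} has no limit listed for role {role!r}")
    return max(limit - active_recovery_vpns(fec_vpns, dup_vpns), 0)
```

For example, an ION 3000 branch with FEC active on `vpn1` and `vpn2` and Packet Duplication active on `vpn2` and `vpn3` has three active VPNs, leaving room for 13 more before the limit of 16 is reached.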
  5. Policy Rule Configuration Limits: Each ION device model varies in system resources depending on the targeted use case for the appliance.
    • For Performance Policy there are two important metrics to consider: the total number of rules and the number of specific application IDs matched per rule.
    • Multiply the total number of rules by the total number of application IDs matched to obtain the Rules x Apps value.
    • The table below is a reference for the maximum validated and recommended rule configurations:
      ION Model   Rule Count   Max Rules x Apps
      1000        30           150
      1200        50           250
      1200-S      200          1275
      2000        50           1275
      3000        255          1275
      3200        255          1275
      5200        255          1275
      9000        255          1275
      9200        255          1275
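
The multiplication described in this item can be sketched as a quick pre-deployment check. The limits below are a subset of rows copied from the table above. How application IDs are counted for a given policy set is not detailed in the text, so the sketch simply takes both numbers as inputs. The function names are hypothetical.

```python
from typing import Dict, Tuple

# model -> (max rule count, max rules x apps); subset of the table rows.
RULE_LIMITS: Dict[str, Tuple[int, int]] = {
    "1000": (30, 150),
    "1200": (50, 250),
    "3000": (255, 1275),
    "9000": (255, 1275),
}


def rules_x_apps(rule_count: int, app_ids_matched: int) -> int:
    """Rules x Apps value: number of rules times application IDs matched."""
    return rule_count * app_ids_matched


def within_limits(model: str, rule_count: int, app_ids_matched: int) -> bool:
    """True when both the rule count and the product fit the listed limits."""
    max_rules, max_product = RULE_LIMITS[model]
    return (rule_count <= max_rules
            and rules_x_apps(rule_count, app_ids_matched) <= max_product)
```

Under these assumptions, an ION 1000 with 30 rules and 5 application IDs matched sits exactly at its limit of 150, while an ION 3000 with 255 rules and 6 application IDs matched (1530) exceeds the validated maximum of 1275.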
  6. Prerequisites: Ensure that "Use LQM on non-hub paths" is configured on each of the circuit categories used in the network.
    Circuit-specific overrides may be configured.
  7. Application & Network Performance and Reachability Information in Prisma SD-WAN: Prisma SD-WAN uses a combination of real user traffic, reachability probes, service health probes, and link quality monitoring to form an accurate picture of the application and network performance landscape. These perspectives include:
    • Real User Traffic: Prisma SD-WAN measures numerous parameters of each application session including:
      • Init Success / Failure Rate - TCP 3-way Handshake
      • Transaction Success / Failure Rate - TCP Retransmission
      • RTT - Application Round Trip Time
      • SRT - Application Server Response Time
      • NTTn - Time for TCP Window Completion
      • DNS Transaction Time - Round Trip Time
      • Voice MOS
      • Voice and Video Packet Loss
      • Voice and Video Jitter
    • App Reachability Probe: When the system detects a 3-way handshake failure for LAN-initiated traffic, the ION crafts a synthetic probe packet that mimics the original failed TCP SYN on that specific path. If the synthetic probe fails to establish a TCP connection, the path is automatically marked as unusable due to App Unreachable for that App/Path/Prefix combination. The probe is then repeated every minute to verify the application reachability status. Once the probe succeeds, the path is again considered for path selection for that App/Path/Prefix combination.
    • L3 Reachability: If all VPNs on a WAN interface go down and there is no inbound traffic, the ION automatically generates traffic to verify the true usability status of the circuit. By default, these endpoints are:
      • Ping 8.8.8.8
      • Ping 8.8.4.4
      • Ping 208.67.222.222
      • HTTPS GET for captive.apple.com
      • HTTPS GET for captive.google.com
      Starting from release 6.4.1, the L3 Reachability probes can optionally be configured to use the results of Service Health Probes to determine the L3 Reachability status of the circuit.
    • Standard VPN Endpoint Liveliness Probes: This is an optional configuration that enables the system to generate probes through a standard VPN tunnel after it is created. There are two types of probes:
      • ICMP
        • Interval between 1 and 30 seconds.
        • Failure Count between 3 and 300: the number of consecutive failures before the Standard VPN is marked as down.
        • IP Address
      • HTTP
        • Interval between 10 and 3600 seconds.
        • Failure Count between 3 and 300: the number of consecutive failures before the Standard VPN is marked as down.
        • HTTP Status Codes: a response with a matching HTTP status code is considered up; a response that does not match marks the Standard VPN as down.
        • URL of the HTTP content.
    • Standard VPN IKE DPD: DPD or Dead Peer Detection is a keepalive method used to determine the liveliness of the IKE peer.
    • VPN Keep-Alives: Prisma SD-WAN VPNs utilize VPN Keep-Alives to ascertain their up/down status. The default configuration generates a Keep-Alive every second and identifies a VPN as down when it loses 3 consecutive Keep-Alives. This can be tuned to an aggressive 100 ms Keep-Alive interval with a minimum failure count of 3, resulting in 300 ms to detect a down path.
    • Link Quality Monitoring: Link Quality Monitoring (LQM) provides automatic and continuous path monitoring for Branch to Data Center and Branch to Branch Gateway VPN connections, assessing Latency, Loss, Jitter, and link MOS. LQM results are visible in the user interface and can serve as App/Network SLA criteria in Performance Policy, enabling performance-based path selection, FEC or Packet Duplication, and incident generation. LQM can be disabled at the circuit category or site circuit definition.
    • ADEM: Autonomous Digital Experience Monitoring (ADEM) provides always-on monitoring for business-critical applications using the ION as a remote network sensor.
    • Service Health Probes: Introduced in release 6.4.1, Service Health Probes provide the capability to configure health checks for specific endpoints and monitor performance metrics across the underlay, Prisma SD-WAN VPN overlay, and Standard VPNs. Each circuit can monitor up to 8 health probe endpoints simultaneously across all path types. The results of these health probes are monitored under the circuit health, with optional incident generation. These metrics can also influence path selection and be utilized in a performance policy rule (with failover time as low as 1000ms) under the Probe SLA type, as well as to determine the L3 Reachability status of the circuit. The supported probe configurations are:
      • HTTP/S
        • HTTP/S Transaction Time; Includes content download
        • HTTP/S Transaction Failure Rate
        • HTTP/S Code Response
        • HTTP/S Content Validation
        • HTTPS Allow Invalid Certificate
      • DNS
        • DNS Transaction Response Time
        • DNS Transaction Failure Rate
      • ICMP
        • Round-trip Latency
        • Round-trip Loss
        • Round-trip Jitter
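
The timing figures given earlier in this item for VPN Keep-Alives and Standard VPN Endpoint Liveliness Probes reduce to simple arithmetic and range checks, sketched below. The ranges are the ones quoted in the text. The function names are hypothetical, and treating detection time as interval multiplied by failure count follows the 100 ms x 3 = 300 ms example in the text.

```python
from typing import Dict, Tuple

# Allowed ranges quoted in the text, as (minimum, maximum).
PROBE_INTERVAL_S: Dict[str, Tuple[int, int]] = {
    "icmp": (1, 30),
    "http": (10, 3600),
}
FAILURE_COUNT_RANGE: Tuple[int, int] = (3, 300)


def detection_time_ms(interval_ms: int, failure_count: int) -> int:
    """Time to declare a path down: interval times consecutive failures."""
    if interval_ms <= 0 or failure_count <= 0:
        raise ValueError("interval and failure count must be positive")
    return interval_ms * failure_count


def valid_liveliness_probe(kind: str, interval_s: int,
                           failure_count: int) -> bool:
    """Check a Standard VPN liveliness probe against the quoted ranges."""
    if kind not in PROBE_INTERVAL_S:
        raise ValueError(f"unknown probe type: {kind!r}")
    low, high = PROBE_INTERVAL_S[kind]
    count_low, count_high = FAILURE_COUNT_RANGE
    return (low <= interval_s <= high
            and count_low <= failure_count <= count_high)
```

With the default Keep-Alive settings (one per second, three consecutive losses), `detection_time_ms(1000, 3)` gives 3000 ms. The aggressive setting, `detection_time_ms(100, 3)`, gives the 300 ms mentioned above.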