Configure Autoscale for AI Runtime Security Firewalls

Learn how to configure autoscale for AI Runtime Security Firewalls.
Where Can I Use This?
  • Prisma AIRS AI Runtime Security
Autoscaling lets your software firewalls automatically scale out and in with real-time traffic demand. Your security posture stays robust during traffic surges, and you keep costs down during periods of low demand.
How it Works
The firewall publishes specific performance metrics to your cloud provider (such as Amazon CloudWatch), which then triggers scaling events.
  • Scale-Out: When traffic increases and thresholds are met, new firewall instances are provisioned to handle the load.
  • Scale-In: When traffic decreases and firewalls are deactivated, the system automatically removes them from your inventory and returns their licenses to the global pool, where they are available for future scaling events.
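The threshold behavior described above can be illustrated with a minimal sketch. It is not the cloud provider's or SCM's actual logic; the function names, the one-instance step size, and the sample thresholds are assumptions chosen for illustration.

```python
def scaling_decision(metric_value: float, scale_out_at: float, scale_in_at: float) -> str:
    """Classify one metric sample as 'scale-out', 'scale-in', or 'hold'.

    Hypothetical helper: the real evaluation is done by the cloud provider's
    autoscaling service, not by code like this.
    """
    if scale_in_at >= scale_out_at:
        raise ValueError("scale-in threshold must be below the scale-out threshold")
    if metric_value >= scale_out_at:
        return "scale-out"
    if metric_value <= scale_in_at:
        return "scale-in"
    return "hold"


def next_instance_count(current: int, decision: str, minimum: int, maximum: int) -> int:
    """Apply a decision one instance at a time (assumed step size), clamped to the allowed range."""
    step = {"scale-out": 1, "scale-in": -1, "hold": 0}[decision]
    return max(minimum, min(maximum, current + step))


# Example: 85% utilization against an assumed 80% scale-out threshold.
decision = scaling_decision(85.0, scale_out_at=80.0, scale_in_at=20.0)
print(decision, next_instance_count(2, decision, minimum=1, maximum=4))
```

Keeping the scale-in threshold well below the scale-out threshold leaves a "hold" band between them, which helps avoid repeatedly adding and removing instances when the metric hovers near a single value.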
Scaling Models
During deployment, you can choose between two scaling models:
  • Static: Maintains a fixed number of firewall instances regardless of traffic fluctuations.
  • Dynamic: Automatically adjusts the number of instances based on selected performance metrics, providing fine-grained control over your infrastructure.
The firewall publishes autoscaling metrics to the monitoring service of the cloud where it is deployed. In SCM, you choose one or more autoscaling metrics and a corresponding threshold for each to trigger scale-in and scale-out actions.
Available autoscaling metrics:
  • Dataplane CPU Utilization (%): Monitors dataplane CPU usage and measures the traffic load on the firewall.
  • Dataplane Packet Buffer Utilization (%): Monitors dataplane buffer usage and measures buffer utilization. If you have a sudden burst in traffic, monitoring your buffer utilization allows you to ensure that the firewall does not deplete the dataplane buffer, which results in dropped packets.
  • GlobalProtect™ Gateway Active Tunnels: Monitors the number of active GlobalProtect sessions on a firewall deployed as a GlobalProtect gateway. Use this metric if you use this VM-Series firewall as a VPN gateway to secure remote users. Check the datasheet for the maximum number of active tunnels supported for your firewall model.
  • GlobalProtect Gateway Tunnel Utilization (%): Monitors the active GlobalProtect tunnels on a gateway and measures tunnel utilization. Use this metric if you use this VM-Series firewall as a VPN gateway to secure remote users.
  • panSessionConnectionsPerSecond: Monitors the rate of new connections established per second.
  • panSessionThroughputKbps: Monitors the throughput in Kbps.
  • panSessionThroughputPps: Monitors the number of packets per second.
  • Sessions Active: Monitors the total number of sessions that are active on the firewall. An active session is a session that is in the flow lookup table for which packets will be inspected and forwarded, as required by policy.
  • Session Utilization (%): Monitors the TCP, UDP, ICMP and SSL sessions that are currently active and the packet rate, new connection establish rate, and firewall throughput to determine session utilization.
  • SSLProxyUtilization (%): Monitors the percentage of SSL forward proxy sessions with clients for SSL/TLS decryption.
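When more than one metric is selected, each metric carries its own threshold. The sketch below shows one plausible way to combine several metrics into a single decision. The combination policy (scale out if any metric breaches, scale in only if all are low), the `MetricRule` type, and the threshold values are assumptions for illustration; the source does not state how multiple metrics are combined.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MetricRule:
    """Hypothetical pairing of a metric name with its two thresholds."""
    name: str
    scale_out_at: float
    scale_in_at: float


def evaluate(rules: List[MetricRule], samples: Dict[str, float]) -> str:
    """Combine the latest sample of every selected metric into one decision.

    Assumed policy: any single overloaded metric justifies scaling out,
    but scaling in requires every metric to be below its scale-in threshold.
    """
    observed = [(rule, samples[rule.name]) for rule in rules]
    if any(value >= rule.scale_out_at for rule, value in observed):
        return "scale-out"
    if observed and all(value <= rule.scale_in_at for rule, value in observed):
        return "scale-in"
    return "hold"


# Illustrative thresholds only; choose values that suit your firewall model.
rules = [
    MetricRule("Dataplane CPU Utilization (%)", scale_out_at=80.0, scale_in_at=20.0),
    MetricRule("panSessionThroughputKbps", scale_out_at=800_000.0, scale_in_at=100_000.0),
]
print(evaluate(rules, {"Dataplane CPU Utilization (%)": 85.0,
                       "panSessionThroughputKbps": 50_000.0}))
```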

Configure Autoscaling in Strata Cloud Manager (SCM)

For new deployments using the SCM workflow, follow these steps to enable autoscaling:
  1. Access the Workflow: Log in to Strata Cloud Manager and navigate to AI Security > AI Runtime Firewall.
  2. Add Firewall: Click the Add Firewall (+) icon and select your Cloud Service Provider (AWS, Azure, or GCP).
  3. Define Parameters: Proceed through the workflow until you reach the Parameters section.
  4. Set Scaling Type: In the Firewall Scaling subsection, select Dynamic.
  5. Configure Dynamic Metrics:
    • Instance Range: Specify the minimum and maximum number of firewalls to deploy.
    • CloudWatch Namespace: Enter the custom namespace for metric reporting.
    • Update Interval: Set the frequency (1–60 minutes) for autoscaling checks.
    • Select Metrics: Choose the metrics (e.g., Dataplane CPU Utilization, Session Throughput) and thresholds that will trigger scale-in or scale-out actions.
  6. Apply and Deploy: Click Apply and complete the Terraform template generation to finalize the setup.
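Before entering values in the workflow, it can help to sanity-check them. The following is a minimal sketch of such a check, based only on the constraints stated in the steps above (an instance range, a namespace, and a 1 to 60 minute update interval). The `DynamicScalingSettings` class and its field names are hypothetical and do not correspond to an SCM data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DynamicScalingSettings:
    """Hypothetical container for the values entered in the Firewall Scaling subsection."""
    min_firewalls: int
    max_firewalls: int
    cloudwatch_namespace: str
    update_interval_minutes: int
    # Metric display name -> threshold; names and values are illustrative.
    metric_thresholds: Dict[str, float] = field(default_factory=dict)


def validate(settings: DynamicScalingSettings) -> List[str]:
    """Return a list of human-readable problems; an empty list means the values look consistent."""
    problems = []
    if settings.min_firewalls > settings.max_firewalls:
        problems.append("minimum instance count exceeds maximum")
    if not settings.cloudwatch_namespace.strip():
        problems.append("CloudWatch namespace is empty")
    if not 1 <= settings.update_interval_minutes <= 60:
        problems.append("update interval must be between 1 and 60 minutes")
    if not settings.metric_thresholds:
        problems.append("no autoscaling metric selected")
    return problems


settings = DynamicScalingSettings(
    min_firewalls=1,
    max_firewalls=4,
    cloudwatch_namespace="my-firewall-metrics",  # hypothetical namespace
    update_interval_minutes=5,
    metric_thresholds={"Dataplane CPU Utilization (%)": 80.0},
)
print(validate(settings))
```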

Managing Brownfield Deployments via API

If you have an existing (brownfield) deployment that does not rely on the current Terraform workflow, you can configure autoscaling using the Strata Cloud Manager API.
Prerequisites
  • The firewall must be successfully onboarded and attached to SCM.
  • You must have a valid OAuth 2.0 access token for API authentication.
Procedure
  1. Update Candidate Configuration (POST): Use the API to define your CloudWatch or Azure Advanced metrics for a specific folder.
    • Endpoint: https://api.strata.paloaltonetworks.com/config/device/v1/autoscale
    • Payload: Include your cloud-specific settings (namespace, timeout/update intervals) in the JSON body.
  2. Verify Settings (GET): Before making changes live, use a GET request to confirm that SCM has registered the updates correctly in the candidate configuration.
  3. Push Configuration: API updates only change the candidate configuration. You must log in to the SCM interface, navigate to Configuration > Folders and Snippets, and click Push Config to make the changes operational on your firewalls.
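The POST and GET steps above can be sketched with the Python standard library. Only the endpoint URL and the use of an OAuth 2.0 bearer token come from this page. The `folder` query parameter and every key in the example payload are assumptions, not the documented schema, so consult the Strata Cloud Manager API reference for the actual request format.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import urlencode

AUTOSCALE_URL = "https://api.strata.paloaltonetworks.com/config/device/v1/autoscale"


def build_autoscale_request(token: str, folder: str,
                            settings: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request against the autoscale endpoint.

    With `settings`, this is the POST that updates the candidate configuration;
    without it, this is the GET used to verify what SCM has registered.
    Passing the folder as a query parameter is an assumption.
    """
    url = f"{AUTOSCALE_URL}?{urlencode({'folder': folder})}"
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    data = None
    method = "GET"
    if settings is not None:
        data = json.dumps(settings).encode("utf-8")
        headers["Content-Type"] = "application/json"
        method = "POST"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Hypothetical payload: these key names are placeholders, not the real schema.
example_settings = {"namespace": "my-firewall-metrics", "update_interval": 5}

post_request = build_autoscale_request("ACCESS_TOKEN", "My Folder", example_settings)
get_request = build_autoscale_request("ACCESS_TOKEN", "My Folder")
print(post_request.get_method(), post_request.full_url)
# To send a request: urllib.request.urlopen(post_request)
```

Neither request makes the change live. The configuration still has to be pushed from the SCM interface as described in step 3.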

Important Licensing Considerations

For autoscaling to properly manage licenses in brownfield AWS deployments, SCM must be able to monitor the associated resources.
  • Requirement: You must apply a specific metadata tag to your AWS Auto Scaling Group (ASG) or individual EC2 instances.
  • Tag Key/Value: paloaltonetworks.com-monitored : enable
This tag allows the system to automatically release licenses back to your available pool whenever an instance is terminated during a scale-in event.
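One way to apply the tag programmatically is through the AWS SDK for Python (boto3). The sketch below assumes you pass in boto3 clients created elsewhere. The helper names and the ASG name in the usage note are hypothetical, and setting `PropagateAtLaunch` so that new instances inherit the tag is a choice made here, not a requirement stated on this page.

```python
from typing import Iterable

MONITOR_TAG_KEY = "paloaltonetworks.com-monitored"
MONITOR_TAG_VALUE = "enable"


def asg_monitor_tag(asg_name: str) -> dict:
    """Build the tag entry in the shape expected by the Auto Scaling create_or_update_tags call."""
    return {
        "ResourceId": asg_name,
        "ResourceType": "auto-scaling-group",
        "Key": MONITOR_TAG_KEY,
        "Value": MONITOR_TAG_VALUE,
        # Assumption: copy the tag onto instances the group launches later.
        "PropagateAtLaunch": True,
    }


def tag_auto_scaling_group(autoscaling_client, asg_name: str) -> None:
    """Tag an ASG; `autoscaling_client` is expected to be boto3.client('autoscaling')."""
    autoscaling_client.create_or_update_tags(Tags=[asg_monitor_tag(asg_name)])


def tag_instances(ec2_client, instance_ids: Iterable[str]) -> None:
    """Tag individual EC2 instances; `ec2_client` is expected to be boto3.client('ec2')."""
    ec2_client.create_tags(
        Resources=list(instance_ids),
        Tags=[{"Key": MONITOR_TAG_KEY, "Value": MONITOR_TAG_VALUE}],
    )
```

With boto3 installed and AWS credentials configured, a call such as `tag_auto_scaling_group(boto3.client("autoscaling"), "my-firewall-asg")` would tag a group with that (hypothetical) name.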