WildFire Appliance Cluster Management

To manage a WildFire appliance cluster, you need to understand cluster capabilities and the following management recommendations.
Cluster operation and configuration
Configure all cluster nodes identically to ensure consistency in analysis and appliance-to-appliance communication:
  • All cluster nodes must run the same version of PAN-OS (PAN-OS 8.0.1 or later). Panorama must run the same software version as the cluster nodes or a newer version. Firewalls do not require a particular software version to submit samples to a WildFire appliance cluster; any release that supports submitting samples to a WildFire appliance works.
  • Cluster nodes inherit their configuration from the controller node, with the exception of interface configuration. Cluster members monitor the controller node configuration and update their own configurations when the controller node commits an updated configuration. Worker nodes inherit settings such as content update server settings, WildFire cloud server settings, the sample analysis image, sample data retention time frames, analysis environment settings, signature generation settings, log settings, authentication settings, and Panorama server, DNS server, and NTP server settings.
  • When you manage a cluster with Panorama, the Panorama appliance pushes a consistent configuration to all cluster nodes. Although you can change the configuration locally on a WildFire appliance node, Palo Alto Networks does not recommend that you do this, because the next time the Panorama appliance pushes a configuration, it replaces the running configuration on the node. Local changes to cluster nodes that Panorama manages often cause Out of Sync errors.
  • If the cluster node membership list differs on the two controller nodes, the cluster generates an Out of Sync warning. To avoid a condition where both controller nodes continually update the out-of-sync membership list for the other node, cluster membership enforcement stops. When this happens, you can synchronize the cluster membership lists from the local CLI on the controller and controller backup nodes by running the operational command request high-availability sync-to-remote running-configuration. If there is a mismatch between the primary controller node’s configuration and the configuration on the controller backup node, the configuration on the primary controller node overrides the configuration on the controller backup node. On each controller node, run show cluster all-peers and compare and correct the membership lists.
  • A cluster can have only two controller nodes (primary and backup); attempts to locally add a third controller node to a cluster fail. (The Panorama web interface automatically prevents you from adding a third controller node.) The third and all subsequent nodes added to a cluster must be worker nodes.
  • A characteristic of HA configurations is that the cluster distributes and retains multiple copies of the database, queuing services, and sample submissions to provide redundancy in case of a cluster node failure. Running the additional services required to provide redundancy for HA has a minimal impact on throughput.
  • The cluster automatically checks for duplicate IP addresses used for the analysis environment network.
  • If a node belongs to a cluster and you want to move it to a different cluster, you must first remove the node from its current cluster.
  • Do not change the IP address of WildFire appliances that are currently operating in a cluster. Doing so causes the associated firewalls to deregister from the node.
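The out-of-sync check described above amounts to comparing the membership lists reported by the two controller nodes (for example, from each controller's show cluster all-peers output). A minimal Python sketch of that comparison, with hypothetical inputs and function name:

```python
# Illustrative comparison of the membership lists reported by each controller
# node. The function name and data shapes are hypothetical, for illustration
# only; this is not part of the WildFire appliance software.
def membership_diff(primary_members, backup_members):
    """Return nodes that appear on only one controller's list (both empty when in sync)."""
    p, b = set(primary_members), set(backup_members)
    return {"only_on_primary": sorted(p - b), "only_on_backup": sorted(b - p)}
```

If either list in the result is non-empty, the membership lists disagree and must be corrected and resynchronized.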
Cluster data retention policies
Data retention policies determine how long the WildFire appliance cluster stores different types of samples.
  • Benign and grayware samples—The cluster retains benign and grayware samples for 1 to 90 days (default is 14).
  • Malicious samples—The cluster retains malicious samples for a minimum of 1 day (default is indefinite—never deleted). Malicious samples may include phishing verdict samples.
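The retention rules above can be sketched as a small policy function. This is an illustrative Python sketch, not WildFire code; the function name, verdict strings, and parameters are assumptions:

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical sketch of the retention policy described above; names are
# illustrative, not part of any WildFire API.
BENIGN_GRAYWARE_DEFAULT_DAYS = 14   # configurable 1-90 days
MALICIOUS_DEFAULT_DAYS = None       # default: retained indefinitely

def expiry_date(verdict: str, received: datetime,
                benign_grayware_days: int = BENIGN_GRAYWARE_DEFAULT_DAYS,
                malicious_days: Optional[int] = MALICIOUS_DEFAULT_DAYS) -> Optional[datetime]:
    """Return when a sample becomes eligible for deletion, or None if kept forever."""
    if verdict in ("benign", "grayware"):
        if not 1 <= benign_grayware_days <= 90:
            raise ValueError("benign/grayware retention must be 1-90 days")
        return received + timedelta(days=benign_grayware_days)
    if verdict in ("malicious", "phishing"):
        if malicious_days is None:
            return None  # default: never deleted
        if malicious_days < 1:
            raise ValueError("malicious retention minimum is 1 day")
        return received + timedelta(days=malicious_days)
    raise ValueError(f"unknown verdict: {verdict}")
```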
Networking
No communication between WildFire appliance clusters is allowed. Nodes communicate with each other within a given cluster, but do not communicate with nodes in other clusters.
All cluster members must:
  • Use a dedicated cluster management interface for cluster management and communication (enforced in Panorama).
  • Have a static IP address in the same subnet.
  • Use low-latency connections between cluster nodes; latency should not exceed 500 ms.
Dedicated cluster management interface
The dedicated cluster management interface enables the controller nodes to manage the cluster and is a different interface than the standard management interface (Ethernet0). Panorama enforces configuring a dedicated cluster management interface.
If the cluster management link goes down between the two controller nodes in a two-node configuration, services and sample analysis continue to run on the controller backup node even though it has no management communication with the primary controller node. Because the backup node cannot tell whether the primary controller node is still functional (a split-brain condition), it must continue to provide cluster services in case the primary controller node is down. When the cluster management link is restored, the data from each controller node is merged.
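The merge step after link restoration can be pictured with a minimal sketch. The union-with-primary-precedence rule here is an assumption for illustration (mirroring the rule that the primary controller's configuration overrides the backup's); it is not the appliance's documented merge algorithm:

```python
# Illustrative sketch of the post-split-brain merge: while the management link
# is down, both controllers keep serving, and once the link is restored their
# independently accumulated data sets are merged. Merge-by-union with primary
# precedence is an assumption for illustration only.
def merge_after_link_restore(primary_data: dict, backup_data: dict) -> dict:
    """Union the records accumulated independently on each controller.

    On key conflicts, keep the primary controller's record.
    """
    merged = dict(backup_data)
    merged.update(primary_data)  # primary wins on conflict
    return merged
```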
DNS
You can use the controller node in a WildFire appliance cluster as the authoritative DNS server for the cluster. (An authoritative DNS server serves the actual IP addresses of the cluster members, as opposed to a recursive DNS server, which queries the authoritative DNS server and passes the requested information to the host that made the initial request.)
Firewalls that submit samples to the WildFire appliance cluster should send DNS queries to their regular DNS server, for example, an internal corporate DNS server. The internal DNS server forwards the DNS query to the WildFire appliance cluster controller (based on the query’s domain). Using the cluster controller as the DNS server provides many advantages:
  • Automatic load balancing—When the cluster controller resolves the service advertisement hostname, it returns the cluster nodes in random order, which organically balances the load across the nodes.
  • Fault tolerance—If one cluster node fails, the cluster controller automatically removes it from the DNS response, so firewalls send new requests to nodes that are up and running.
  • Flexibility and ease of management—When you add nodes to the cluster, the controller updates the DNS response automatically, so you don't need to make any changes on the firewalls; requests automatically go to the new nodes as well as the existing nodes.
The DNS records are intentionally not cached: a successful DNS lookup returns a TTL of 0, and an NXDOMAIN response likewise has a TTL and minimum TTL of 0. This is useful to know when troubleshooting DNS behavior.
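The load-balancing and fault-tolerance behaviors above can be sketched in a few lines. This is a hypothetical Python illustration of an authoritative responder's logic; the node IPs, record shape, and function names are assumptions, not WildFire internals:

```python
import random

# Hypothetical cluster node addresses, for illustration only.
CLUSTER_NODES = ["10.0.1.11", "10.0.1.12", "10.0.1.13"]

def dns_answer(healthy_nodes, ttl=0):
    """Return A records in random order with TTL 0 so clients do not cache them.

    The random order spreads firewall submissions organically across nodes."""
    shuffled = random.sample(healthy_nodes, k=len(healthy_nodes))
    return [{"type": "A", "address": ip, "ttl": ttl} for ip in shuffled]

def on_node_failure(nodes, failed_ip):
    """Fault tolerance: a failed node is simply dropped from future answers."""
    return [ip for ip in nodes if ip != failed_ip]
```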
Administration
You can administer WildFire clusters using the local WildFire CLI or through Panorama. There are two administrative roles available locally on WildFire cluster nodes:
  • Superreader—Read-only access.
  • Superuser—Read and write access.
Firewall registration
WildFire appliance clusters push a registration list that contains all of the nodes in a cluster to every firewall connected to a cluster node. When you register a firewall with an appliance in a cluster, the firewall receives the registration list. If you add a standalone WildFire appliance that already has connected firewalls to a cluster, those firewalls also receive the registration list when the appliance becomes a cluster node.
If a node fails, the connected firewalls use the registration list to register with the next node on the list.
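The failover behavior above can be sketched as follows. The function name and the is_reachable callback are hypothetical, for illustration only; they are not part of PAN-OS:

```python
# Illustrative failover logic for a firewall holding the cluster's
# registration list: try nodes in list order and register with the first
# one that responds.
def register_with_cluster(registration_list, is_reachable):
    """Return the first reachable node from the registration list."""
    for node in registration_list:
        if is_reachable(node):
            return node
    raise ConnectionError("no cluster node reachable")
```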
Data Migration
To provide data redundancy, WildFire appliance nodes in a cluster share database, queuing service, and sample submission content; however, the precise location of this data depends on the cluster topology. As a result, WildFire appliances in a cluster undergo data migration or rearrangement whenever the topology changes. Topology changes include adding and removing nodes, as well as changing the role of an existing node. Data migration can also occur when databases are converted to a newer version, as with the upgrade from WildFire 7.1 to 8.0.
You can view data migration status by issuing status commands from the WildFire CLI. Migration can take several hours, depending on the quantity of data on the WildFire appliances.
