Saturday, 12 May 2012

Gratuitous ARP

For many years I knew that, in most cases, a host has to resolve an IP address to a physical address before it can start sending traffic to a destination on the network. And of course I saw ARP packets in Wireshark captures, but didn't pay too much attention to them. Only a couple of years ago did I start asking questions about the way ARP works.

I was looking into different network load balancing and HA solutions, software ones like BalanceNG and hardware ones like the old Nortel Alteons. Most of them are based on moving MAC addresses between nodes, virtual MACs and virtual IPs. The details of the router redundancy protocols are not relevant here; what's important is that after a failover has occurred, traffic destined to the failed node's MAC address will not reach its destination. The hosts should somehow re-resolve the MAC for the virtual IP. And as you know, hosts keep an ARP cache, which is there for a reason. So how does the failover occur almost seamlessly then? That was the question I had at the time.

The answer is a Gratuitous ARP packet. In simple terms, this broadcast packet refreshes the hosts' ARP caches and lets them immediately switch from the old MAC for some IP to a new MAC. In this post I will mostly collect quotes from standards related to gratuitous ARP packets.

A Gratuitous ARP [23] is an ARP packet sent by a node in order to spontaneously cause other nodes to update an entry in their ARP cache. A gratuitous ARP MAY use either an ARP Request or an ARP Reply packet. In either case, the ARP Sender Protocol Address and ARP Target Protocol Address are both set to the IP address of the cache entry to be updated, and the ARP Sender Hardware Address is set to the link-layer address to which this cache entry should be updated. When using an ARP Reply packet, the Target Hardware Address is also set to the link-layer address to which this cache entry should be updated (this field is not used in an ARP Request packet).
In either case, for a gratuitous ARP, the ARP packet MUST be transmitted as a local broadcast packet on the local link. As specified in [16], any node receiving any ARP packet (Request or Reply) MUST update its local ARP cache with the Sender Protocol and Hardware Addresses in the ARP packet, if the receiving node has an entry for that IP address already in its ARP cache. This requirement in the ARP protocol applies even for ARP Request packets, and for ARP Reply packets that do not match any ARP Request transmitted by the receiving node [16].
RFC 2002
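The receiver-side rule from the quote above can be sketched in a few lines of Python. This is a toy model, not how any real IP stack is implemented: the `ArpCache` class, its method names and the addresses (taken from documentation ranges) are all made up for illustration. The only behaviour it tries to show is "update the entry if, and only if, one already exists for that IP".

```python
class ArpCache:
    """Toy ARP cache mapping IPv4 address strings to MAC address strings."""

    def __init__(self):
        self.entries = {}

    def learn(self, ip, mac):
        # Ordinary resolution: store the mapping learned from an ARP reply.
        self.entries[ip] = mac

    def on_arp_packet(self, sender_ip, sender_mac):
        # Rule from the quote: any received ARP packet (request or reply)
        # updates the cache, but only if the IP is already in it.
        if sender_ip in self.entries:
            self.entries[sender_ip] = sender_mac
            return True
        return False


cache = ArpCache()
cache.learn("192.0.2.10", "00:00:5e:00:53:01")          # virtual IP, old node
cache.on_arp_packet("192.0.2.10", "00:00:5e:00:53:02")  # gratuitous ARP after failover
cache.on_arp_packet("192.0.2.99", "00:00:5e:00:53:03")  # unknown IP, no entry created
```

After the failover announcement the host keeps a single entry for the virtual IP, now pointing at the new MAC, which is exactly why the switch-over looks seamless.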
ARP packet structure (IPv4 over Ethernet)

I guess a short paragraph about ARP won't hurt. ARP is the Address Resolution Protocol, used in every Ethernet network carrying IPv4 packets. Strictly speaking it can be used for different L2 and L3 protocols, but the point is that it is an analogue of DNS for resolving L3 addresses into L2 addresses. (Note that ARP is not used for IPv6; the Neighbour Discovery Protocol serves a similar purpose there.)

ARP packets can be of two types: requests and replies. The type is specified in the op field of the packet (see the packet structure diagram above). When a host needs to resolve an IP address to a MAC address, it sends a broadcast frame with an ARP request asking the host with the given IP address to reply with its MAC. The target host answers with an ARP reply providing its MAC address. Very simple, and at the same time very vulnerable. But that's a different story.

So the most important packet fields used in the exchange are the Sender Hardware Address (sha) and Sender Protocol Address (spa), and the same pair of fields for the target: tha and tpa.
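To make the field layout more concrete, here is a minimal sketch that packs and unpacks the 28-byte ARP payload for IPv4 over Ethernet with Python's `struct` module. The helper names (`pack_arp`, `unpack_arp`, `mac_to_bytes`) and the example addresses are my own assumptions for illustration; `bytes.hex(":")` needs Python 3.8 or later.

```python
import socket
import struct

# htype, ptype, hlen, plen, op, sha, spa, tha, tpa in network byte order
ARP_FORMAT = "!HHBBH6s4s6s4s"


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def pack_arp(op, sha, spa, tha, tpa):
    """Build an ARP payload: op 1 is a request, op 2 is a reply."""
    return struct.pack(
        ARP_FORMAT,
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6, 4,     # hardware and protocol address sizes
        op,
        mac_to_bytes(sha), socket.inet_aton(spa),
        mac_to_bytes(tha), socket.inet_aton(tpa),
    )


def unpack_arp(payload):
    """Decode an ARP payload into a dict keyed by the field names used above."""
    _, _, _, _, op, sha, spa, tha, tpa = struct.unpack(ARP_FORMAT, payload)
    return {
        "op": op,
        "sha": sha.hex(":"), "spa": socket.inet_ntoa(spa),
        "tha": tha.hex(":"), "tpa": socket.inet_ntoa(tpa),
    }


# "Who has 192.0.2.2? Tell 192.0.2.1"
request = pack_arp(1, "00:00:5e:00:53:01", "192.0.2.1",
                   "00:00:00:00:00:00", "192.0.2.2")
fields = unpack_arp(request)
```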

A Gratuitous ARP packet (sometimes called an ARP Announcement) is normally an ARP request packet with tha := 00:00:00:00:00:00 and spa == tpa, where spa is set to the IP address whose entry should now be updated in the ARP caches of all hosts on the network. In the Wireshark packet details below, my router sends an announcement to all hosts on the network that its MAC address is 00:26:44:84:7e:4a.

Frame 770: 42 bytes on wire (336 bits), 42 bytes captured (336 bits)
Ethernet II, Src: ThomsonT_84:7e:4a (00:26:44:84:7e:4a), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
    Destination: Broadcast (ff:ff:ff:ff:ff:ff)
    Source: ThomsonT_84:7e:4a (00:26:44:84:7e:4a)
    Type: ARP (0x0806)
Address Resolution Protocol (request/gratuitous ARP)
    Hardware type: Ethernet (1)
    Protocol type: IP (0x0800)
    Hardware size: 6
    Protocol size: 4
    Opcode: request (1)
    [Is gratuitous: True]
    Sender MAC address: ThomsonT_84:7e:4a (00:26:44:84:7e:4a)
    Sender IP address: (
    Target MAC address: 00:00:00_00:00:00 (00:00:00:00:00:00)
    Target IP address: (
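A frame like the one in the capture can be rebuilt byte by byte. The sketch below is only an illustration: the function names are mine, and since the router's IP address is not shown above, 192.0.2.1 from the documentation range stands in for it. The `is_gratuitous` check is the simple spa == tpa heuristic from the text, not Wireshark's actual dissector logic.

```python
import socket
import struct

BROADCAST = b"\xff" * 6
ZERO_MAC = bytes(6)
ETHERTYPE_ARP = 0x0806


def gratuitous_request(mac, ip):
    """Build a broadcast gratuitous ARP request: tha is zero, spa == tpa."""
    sha = bytes.fromhex(mac.replace(":", ""))
    spa = socket.inet_aton(ip)
    eth = BROADCAST + sha + struct.pack("!H", ETHERTYPE_ARP)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1) + sha + spa + ZERO_MAC + spa
    return eth + arp


def is_gratuitous(frame):
    """Heuristic: an ARP frame whose sender and target IP are the same."""
    if frame[12:14] != struct.pack("!H", ETHERTYPE_ARP):
        return False
    spa = frame[28:32]   # 14-byte Ethernet header + 8 fixed ARP bytes + 6-byte sha
    tpa = frame[38:42]
    return spa == tpa


# MAC taken from the capture; the IP address is a placeholder assumption.
frame = gratuitous_request("00:26:44:84:7e:4a", "192.0.2.1")
```

The resulting frame is 42 bytes long, the same size Wireshark reports for frame 770.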
We need to note here that an ARP Announcement doesn't have to be an ARP request; it can be an ARP reply as well. The Wireshark wiki article has some more technical details on this.
Note that some devices will respond to the gratuitous request and some will respond to the gratuitous reply. If one is trying to write software for moving IP addresses around that works with all routers, switches and IP stacks, it is best to send both the request and the reply. These are documented by RFC 2002 and RFC 826. Software implementing the gratuitous ARP function can be found in the Linux-HA source tree. A request may be preceded by a probe to avoid polluting the address space. For an ARP Probe the Sender IP address field is 0.0.0.0. ARP probes were not considered by the original ARP RFC.
Wireshark Wiki article on Gratuitous ARP
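Following the advice to send both forms, here is a rough sketch that builds the request and the reply variants of the announcement. It is not the Linux-HA code; the function names are hypothetical. In the reply the target hardware address repeats the sender's MAC, as RFC 2002 describes, and in the request it is left as zeros. The `announce` helper uses a raw `AF_PACKET` socket, so it only works on Linux and needs root (or CAP_NET_RAW); it is defined here but not called.

```python
import socket
import struct


def garp_frames(mac, ip):
    """Return [request, reply] gratuitous ARP frames for the given MAC and IP."""
    sha = bytes.fromhex(mac.replace(":", ""))
    spa = socket.inet_aton(ip)
    eth = b"\xff" * 6 + sha + struct.pack("!H", 0x0806)  # broadcast, EtherType ARP
    frames = []
    # op 1 = request (tha unused, zeros); op 2 = reply (tha = announced MAC)
    for op, tha in ((1, bytes(6)), (2, sha)):
        arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, op) + sha + spa + tha + spa
        frames.append(eth + arp)
    return frames


def announce(ifname, mac, ip):
    """Send both variants on an interface. Linux only, requires raw socket privileges."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as s:
        s.bind((ifname, 0))
        for frame in garp_frames(mac, ip):
            s.send(frame)


# Example addresses from documentation ranges, not from a real network.
request, reply = garp_frames("00:00:5e:00:53:02", "192.0.2.10")
```

On a test machine something like `announce("eth0", "00:00:5e:00:53:02", "192.0.2.10")` would put both frames on the wire, where the interface name is of course an assumption too.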
As I mentioned in the intro, gratuitous ARP messages have quite important applications: they are essential for router redundancy protocols such as VRRP, HSRP and GLBP, and for link aggregation techniques (interface teaming).
In bonding version 2.6.2 or later, when a failover occurs in active-backup mode, bonding will issue one or more gratuitous ARPs on the newly active slave. One gratuitous ARP is issued for the bonding master interface and each VLAN interfaces configured above it, provided that the interface has at least one IP address configured.
Linux Kernel documentation about Network Interface Bonding
Receive Load Balancing is achieved through an intermediate driver by sending Gratuitous ARPs on a client by client basis using the unicast address of each client as the destination address of the ARP Request (also known as a Directed ARP). This is considered client load balancing and not traffic load balancing. When the intermediate driver detects a significant load imbalance between the physical adapters in an SLB team, it will generate G-ARPs in an effort to redistribute incoming frames. The intermediate driver (BASP) does not answer ARP Requests; only the software protocol stack provides the required ARP Reply. It is important to understand that receive load balancing is a function of the number of clients that are connecting to the server via the team interface.
SLB Receive Load Balancing attempts to load balance incoming traffic for client machines across physical ports in the team. It uses a modified Gratuitous ARP to advertise a different MAC address for the team IP Address in the sender physical and protocol address. This G-ARP is unicast with the MAC and IP Address of a client machine in the target physical and protocol address respectively. This causes the target client to update its ARP cache with a new MAC address map to the team IP address. G-ARPs are not broadcast because this would cause all clients to send their traffic to the same port. As a result, the benefits achieved through client load balancing would be eliminated, and could cause out of order frame delivery. This receive load balancing scheme works as long as all clients and the teamed server are on the same subnet or broadcast domain.
Source: Broadcom Gigabit Ethernet Teaming Services Manual
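The directed G-ARP described in the Broadcom manual differs from the broadcast form in two places: the Ethernet destination is the client's unicast MAC, and the target fields carry the client's MAC and IP. A minimal sketch of that idea follows. The round-robin assignment of clients to ports is purely my assumption to keep the example short; the manual does not say how the driver chooses a port, and all names and addresses here are hypothetical.

```python
import socket
import struct
from itertools import cycle


def mac_bytes(mac):
    return bytes.fromhex(mac.replace(":", ""))


def directed_garp(port_mac, team_ip, client_mac, client_ip):
    """Unicast ARP request telling one client that team_ip is at port_mac."""
    sha, tha = mac_bytes(port_mac), mac_bytes(client_mac)
    eth = tha + sha + struct.pack("!H", 0x0806)   # destination is the client only
    arp = (struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
           + sha + socket.inet_aton(team_ip)      # sender: port MAC, team IP
           + tha + socket.inet_aton(client_ip))   # target: client MAC and IP
    return eth + arp


def rebalance(team_ip, port_macs, clients):
    """Assign clients to team ports round-robin (an assumed policy)."""
    ports = cycle(port_macs)
    return {
        client_ip: directed_garp(next(ports), team_ip, client_mac, client_ip)
        for client_ip, client_mac in clients.items()
    }


frames = rebalance(
    "192.0.2.1",
    ["00:00:5e:00:53:a1", "00:00:5e:00:53:a2"],
    {
        "192.0.2.11": "00:00:5e:00:53:11",
        "192.0.2.12": "00:00:5e:00:53:12",
        "192.0.2.13": "00:00:5e:00:53:13",
    },
)
```

Each client ends up with a different MAC cached for the same team IP, which is what spreads the incoming traffic across the physical ports.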

Ramaximus said...

Just don't forget that most clever sysadmins keep gratuitous ARPs ignored for security. Thus anything based on resolving a shared IP will fail until the ARP cache expires.
The solution is a 'shared MAC'. With a shared MAC there's no need for renewing arp caches on all the hosts in the broadcast domain. And it takes just a single frame for switches to learn new location for the mac.
This is how BalanceNG works. Some routers do it too.