
Linux on System z: Network Link Aggregation


Whatever your usage scenario for Linux on System z, networking is an integral part of that environment, and the most important requirements are availability and bandwidth. The System z network adapter, the Open Systems Adapter (OSA), sets a high standard in both areas, but customers are requesting solutions that go beyond the capabilities of a single interface. A technique called link aggregation, or channel bonding in Linux terminology, lets you bundle multiple physical interfaces into one logical interface, thereby increasing availability or bandwidth beyond the capabilities of the current network adapter technology.

This article describes the Linux network channel bonding concept and the link aggregation capabilities of the z/VM 5.3 Virtual Switch. For block devices, a similar approach known as multipathing supports I/O failover and path load sharing. Fault tolerance for network connections can also be implemented using a Virtual IP Address (VIPA) in conjunction with dynamic routing. The multipathing and VIPA concepts aren't discussed here.

Network Channel Link Aggregation

Link aggregation is designed to overcome two problems with network connections: bandwidth limitations and lack of redundancy. A Network Interface Card (NIC) is represented by a network interface in Linux. This network interface provides the I/O functionality to the TCP/IP network stack. This one-to-one relationship limits the throughput the TCP/IP stack can achieve to the capacity of a single NIC. A proven solution to such a resource shortage is to bundle two or more network interfaces into a single (virtual) network interface that the network stack can use. The benefits of link aggregation are:

  • Increased bandwidth and load balancing: The capacity of multiple links is combined into one logical link.
  • Increased availability and fault tolerance: The failure of a single link (port-cable-port) in a group need not result in a network failure from the application's perspective.

Key aspects for all link aggregation implementations are:

  • Transparency: The link aggregation method provides an interface compatible with conventional interfaces. The higher network layers and Media Access Control (MAC) clients don't need to be aware of the link aggregation, and no special coding is needed to exploit the aggregated interface.
  • Frame distribution policy: The method by which frames are assigned to the links in a group. A conversation-preserving policy must neither reorder frames that belong to a given conversation nor cause frame duplication.
  • Link monitoring: A mechanism must be established to detect whether each link is alive. This mechanism should be simple and reliable, because link monitoring is a constant process.
  • Configuration: Individual links may be automatically allocated to link aggregation groups. Dynamic configuration of the link aggregation groups simplifies administration. The scope of application for such link aggregation solutions is either switch to switch, which isn't covered here, or switch to server. In the switch-to-server case, the connection between a server's network adapter and the switch would otherwise be a Single Point of Failure (SPOF).

The Linux bonding driver implements link aggregation on the basis of NICs. The Institute of Electrical and Electronics Engineers (IEEE) 802.1AX standard also applies in this scenario, and the bonding driver's module parameters correspond to several of the aspects listed above, as the sketch below illustrates.
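
To make the mapping concrete, the following sketch shows how these aspects might be expressed as bonding driver module parameters in an /etc/modprobe.conf entry; the parameter values and the file location are examples and vary by distribution and driver version:

   alias bond0 bonding
   options bonding mode=802.3ad miimon=100 xmit_hash_policy=layer3+4

Here, mode selects the frame distribution behavior (802.3ad requests the IEEE dynamic link aggregation negotiated with the switch), miimon enables MII-based link monitoring with an interval in milliseconds, and xmit_hash_policy determines how outgoing conversations are spread across the slave links.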

Bonding

The Linux bonding driver is the typical method for aggregating multiple network interfaces into a single logical bonded interface. To the Linux kernel, the bonding device behaves like a normal network device, thereby satisfying the transparency requirement. Further, it's possible to configure an IEEE 802.1Q Virtual LAN (VLAN) interface on top of a bond interface using the 802.1Q driver. Donald Becker originally developed the bonding driver, contained in the Beowulf (high-performance parallel computing clusters of inexpensive PC hardware) patches for the Linux 2.0 kernel. Like other device drivers, the bonding driver must be enabled during kernel configuration. Usually, bonding driver support is built as a module, and all major distributions configure it this way. Figure 1 shows a typical bonding setup.
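
As a minimal sketch, assuming bonding and 802.1Q VLAN support are built as modules, loading the bonding module creates a default bond0 device (its options typically come from a configuration entry such as the one shown earlier, or can be passed on the command line), and a VLAN interface can then be stacked on top of the bond; the interface name and VLAN ID are arbitrary examples:

   modprobe bonding
   modprobe 8021q
   vconfig add bond0 100    # creates the VLAN interface bond0.100 on top of the bond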


A device working under the control of a bonding device is called a slave device; the process of bringing a slave device under the control of a bonding device is called enslaving. The bonding driver constantly monitors availability of its slave devices and detects link failures. No modification of the routing table is necessary to disable non-working slave devices; the concept of bonding doesn’t require a dynamic routing daemon.
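
A minimal enslaving sketch might look as follows, assuming a bond0 device and two OSA-backed interfaces named eth1 and eth2 (names and addresses are examples only); the IP configuration is assigned to the bonding device, and the slaves are then brought under its control either with the ifenslave tool or, on newer kernels, through sysfs:

   ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
   ifenslave bond0 eth1 eth2
   # alternatively, via the sysfs interface:
   echo +eth1 > /sys/class/net/bond0/bonding/slaves
   echo +eth2 > /sys/class/net/bond0/bonding/slaves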

The bonding device appears as an ordinary networking interface marked as (bonding) MASTER in the ifconfig command output. It carries an IP address and a network mask; its MAC address is inherited from the first slave device (see the exception in the fail_over_mac discussion). The slave devices are still visible as networking interfaces and are marked as (bonding) SLAVE, but they don't carry an IP address; the bonding driver can change their hardware (MAC) addresses during the enslave process or at other events. The ifconfig output doesn't show which slaves are associated with which master. For that view, consult the read-only bond status file in the /proc/net/bonding directory. Certain operating modes of the bond device require specific switch functionality and configuration.
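
To check which slaves belong to which bond and the link state the driver has detected for each, the per-bond status file can be read; the exact contents depend on the driver version and the configured mode (bond0, eth1 and eth2 are again example names):

   cat /proc/net/bonding/bond0
   ifconfig bond0          # the MASTER flag marks the bonding device
   ifconfig eth1           # enslaved devices carry the SLAVE flag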
