Feb 1 ’06

The 10 Commandments of TCP/IP Performance

by Editor in z/Journal

The 10 commandments of TCP/IP performance are a distillation of hard-won experience. Monitoring and tuning TCP networks on the mainframe is complex for the basic reason that each network is a mixture of many applications and pieces of hardware. Like an onion, each connection contains layers of protocols and subprotocols that must be decoded to make sense of the traffic patterns. Making sense of it all is the first step to tuning and improving performance.

For example, if you’re trying to find a problem with an IP printer, you may need to understand the IP protocol, the TCP protocol, the Line Printer Remote/Line Printer Daemon (LPR/LPD) protocol, and then the hardware implementation of the printer. The following 10 commandments of TCP/IP performance diagnostics have served us well to tune the TCP stack and to find performance problems.

1. Thou shalt monitor thy application backlog queue: Consider the application backlog queue as “low-hanging fruit.” It’s easy to monitor, can help users a great deal, and can solve problems that may appear to be quite difficult. The TCP application may have a maximum number of connections that can be active simultaneously; let’s say five. Then, there’s another maximum number of connections defined, let’s say 10, that are waiting for those five to finish. This is the backlog queue. When the 16th user tries to make a connection, he can’t. The connection is dropped.

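So, you have the following: connections 1 through 5 are active, connections 6 through 15 wait in the backlog queue, and connection 16 is dropped.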
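The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the five-active, 10-queued example; the function name and its parameters are hypothetical and don't correspond to any TCP stack setting.

```python
def classify_attempts(attempts, max_active=5, backlog_limit=10):
    """Split simultaneous connection attempts into active, queued, and dropped.

    max_active and backlog_limit are the example values from the text,
    not defaults of any real application.
    """
    active = min(attempts, max_active)
    queued = min(max(attempts - max_active, 0), backlog_limit)
    dropped = attempts - active - queued
    return {"active": active, "queued": queued, "dropped": dropped}


# Sixteen users connect at once: five are served, 10 wait, one is dropped.
print(classify_attempts(16))  # {'active': 5, 'queued': 10, 'dropped': 1}
```

Any attempt beyond the sum of the two limits lands in the dropped count, which is the number to watch.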

We once worked with an installation that used an accounting application. Sometimes, the user received a “Connection Refused” message when they tried to connect to the application; other times, the session initiation would just hang.

Technical support was stumped. The users were angry and the problem had been escalated to quite a high level. We were able to look at the application backlog queues to see that they were being exceeded and connections were dropped. Figure 1 shows the backlog queues in the Netstat All display.

When the user connection was in the backlog queue, the session initiation appeared to “hang.” When the user connection was dropped because the queue was exceeded, the “Connection Refused” message was received. The real problem was that the application took too long to process requests and didn’t complete connections in a timely manner.

 

2. Thou shalt not kill thy network by many short connections: TCP works by creating a virtual circuit between the two ends of the connection: the remote host and the local host. The remote and local hosts are also known as “client” and “server” or “local address” and “foreign address.” When the two ends need to talk to each other using the TCP protocol, a connection is established, which lasts for a period of time bounded by the open and close. This connection is called a “virtual circuit.” All the TCP protocol functions occur in the context of this virtual circuit.

During the open sequence, TCP packets flow back and forth with various bits of the header turned on. The header is the first 20 or so bytes of the TCP packet that carry various pieces of control information. IP adds a header, too. In the open sequence, one side first sends a packet with the SYN flag on. In response, the other side allocates buffers and other resources and replies with a SYN/ACK, which the first side then acknowledges. This is called the SYN, SYN/ACK, ACK sequence or the TCP three-way handshake. At the end of the handshake, if it’s properly concluded, the connection or virtual circuit is ready for data transmission. During the close sequence, several packets flow back and forth, too.

If you have many short connections, you’re adding overhead for the open and close handshakes. You can tune your TCP usage by checking the applications that have many connections from the same address pairs in a short time. Maintaining persistent connections may decrease the CPU usage and network traffic.
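One way to look for such applications is to count how many connections each address pair opens within a short time window. The following is a minimal sketch, assuming you have already extracted connection start times and addresses from your own monitor or trace; the record layout, the function name, and the window and threshold values are all hypothetical.

```python
from collections import defaultdict


def chatty_pairs(connections, window=60.0, threshold=3):
    """Return address pairs that opened at least `threshold` connections
    within any span of `window` seconds.

    connections: iterable of (start_time_seconds, remote_addr, local_addr).
    """
    starts_by_pair = defaultdict(list)
    for start_time, remote, local in connections:
        starts_by_pair[(remote, local)].append(start_time)

    flagged = {}
    for pair, times in starts_by_pair.items():
        times.sort()
        left = 0
        busiest = 0
        # Sliding window over the sorted start times.
        for right, current in enumerate(times):
            while current - times[left] > window:
                left += 1
            busiest = max(busiest, right - left + 1)
        if busiest >= threshold:
            flagged[pair] = busiest
    return flagged
```

Pairs that show up here are candidates for a persistent connection instead of a new open and close for every exchange.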

3. Thou shalt drop unused connections: This seems like a “no-brainer.” If you’re finished using a TCP connection, then the application should close it! What is so hard about this? Actually, quite a bit. Often, applications are developed using code generators that “shield” the programmer from the perceived intricacies of raw sockets code. So, it can be quite difficult to tell in the program if the socket is actually being closed or not. We’ve seen applications, such as Lightweight Directory Access Protocol (LDAP) servers, run out of sockets and abend because connections weren’t closed. There may be parameters for the application to time out idle connections or to do keep-alive. The keep-alive probe will drop unused connections. Dropping unused connections and eliminating errors on the TCP network can save CPU time used by the TCP stack on the mainframe.
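Where the application code is under your control, the close can be made explicit. The sketch below uses Python’s standard socket module as an illustration only; it isn’t taken from any of the applications described here, and the helper names are hypothetical. The `with` statement closes the socket even if an error occurs, and SO_KEEPALIVE asks the stack to probe idle connections.

```python
import socket


def make_client_socket():
    """Create a TCP socket with keep-alive probing requested."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


def demo():
    """Open a socket, check the keep-alive option, and let `with` close it."""
    with make_client_socket() as sock:
        keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
        # A real client would connect, send, and receive here.
    # After the with block the socket is closed; fileno() reports -1.
    return keepalive_on, sock.fileno()
```

How often keep-alive probes are sent, and how long an idle connection survives, depends on the stack and application configuration.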

4. Thou shalt honor thy TCP duplicate ACKs and thy TCP retransmissions: What is a duplicate ACK? If a packet is lost, the receiver keeps acknowledging the last in-sequence data it received, so each later packet that arrives triggers the same acknowledgment again. When the sender gets three duplicate acknowledgments, it retransmits the missing packet. Figure 2 shows a diagram of a packet loss scenario. Note that Segment 2 was lost, duplicate ACKs with the ACK number 100 were sent, and finally, Segment 2 was retransmitted.
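The scenario can be traced with a small simulation. This is a simplified sketch, assuming fixed 100-byte segments numbered from 1 and sequence numbers starting at 0, so that losing Segment 2 produces ACK number 100 as in the scenario described; the function names are hypothetical and real TCP stacks handle many more cases.

```python
def receiver_acks(arriving_segments, segment_size=100):
    """Return the cumulative ACK number sent after each arriving segment."""
    received = set()
    next_expected = 0  # first byte the receiver is still waiting for
    acks = []
    for segment in arriving_segments:
        received.add(segment)
        # Advance past every segment that is now contiguous.
        while (next_expected // segment_size + 1) in received:
            next_expected += segment_size
        acks.append(next_expected)
    return acks


def fast_retransmit_point(acks, dup_threshold=3):
    """Return (index, ack) where the third duplicate ACK is seen, else None."""
    last_ack, duplicates = None, 0
    for index, ack in enumerate(acks):
        if ack == last_ack:
            duplicates += 1
            if duplicates == dup_threshold:
                return index, ack
        else:
            last_ack, duplicates = ack, 0
    return None


acks = receiver_acks([1, 3, 4, 5])        # Segment 2 never arrives
print(acks)                               # [100, 100, 100, 100]
print(fast_retransmit_point(acks))        # (3, 100)
```

The first ACK of 100 is the normal acknowledgment of Segment 1; the next three are duplicates, and the third one is the sender’s cue to retransmit the segment starting at byte 100, which is Segment 2.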

 

When monitoring your TCP network, you may find counters called TCP Retransmits, TCP Retransmit Timers, and TCP Duplicate Acknowledgments. These counters may appear in the output of the Netstat STATS command or you may see them while interrogating the SNMP MIB. They may all be related and indicate problems with network congestion. Duplicate acknowledgments indicate packets are either lost or received out of sequence. When three duplicate ACKs are received, the packet is retransmitted. If there are many duplicate ACKs, you may want to find out which addresses and subnets are having the problems. Duplicate ACKs can affect network response time, called round trip time. You may want to see if either round trip time or, more likely, round trip variance, is affected by duplicate ACKs.

If there’s excessive round trip variance, then the user may be frustrated by erratic response time. You need to determine which remote addresses have duplicate acknowledgments. After you find which addresses are having problems, you may want to see if they have anything in common such as the same subnet, time of day, socket application, and route/set of hardware.
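Grouping the per-address counts by subnet is one quick way to look for something in common. The following is a minimal sketch using Python’s standard ipaddress module; the /24 prefix length, the function name, and the sample addresses and counts in the usage note are assumptions for illustration, not measurements.

```python
import ipaddress
from collections import Counter


def dup_acks_by_subnet(dup_acks_per_address, prefix_length=24):
    """Total duplicate-ACK counts per subnet, largest first.

    dup_acks_per_address: mapping of remote IPv4 address -> duplicate ACK count.
    prefix_length: assumed subnet size; adjust to match your network.
    """
    totals = Counter()
    for address, count in dup_acks_per_address.items():
        subnet = ipaddress.ip_network(f"{address}/{prefix_length}", strict=False)
        totals[str(subnet)] += count
    return totals.most_common()
```

For example, `dup_acks_by_subnet({"10.3.32.37": 120, "10.3.32.40": 12, "10.7.1.5": 3})` returns `[("10.3.32.0/24", 132), ("10.7.1.0/24", 3)]`, which points at one subnet as the place to start looking.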

We’ve seen situations where one device had more than 49,000 duplicate acks. This was an IP printer that was out of paper. Consider the impact on your network of many such devices!

5. Thou shalt relate thy TCP resets to the cause: A RESET packet is sent by TCP to abort a connection. The fact that you have resets may or may not indicate a network problem. For example, a user may have gone away and left the connection idle. The application may have a keep-alive process that terminates the connection after a period of idle time. In this instance, the RESET to close the connection would be proper and indicate no problem. On the other hand, if an application is refusing connections because it’s out of resources, then you may see many RESETs.

In monitoring your TCP network, you may find some counters called Established Resets and Resets Out. Established Resets is the number of connections that were reset and Resets Out is the number of segments sent with the RESET flag on. Investigating the cause of resets can help you find many types of problems.

6. Thou shalt not fail to watch your TCP attempt fails: You may find in monitoring your TCP network some counters called Connection Attempts Failed, Connection Attempts Dropped, or Connection Attempts Discarded. These counters may appear in the output of the Netstat STATS command or you may see them while interrogating the Simple Network Management Protocol (SNMP) Management Information Base (MIB).

These counters mean a remote host IP address has tried to connect to an application on the mainframe and the connection failed. It could be that the application the remote users want to get to is inactive or doesn’t exist. One cause of degradation on TCP networks is unnecessary traffic. Sometimes PCs or other types of hardware on the network do “broadcast” type queries to many devices on the LAN and even to the mainframe to ask for applications that are PC-based.

Figure 3 shows SYN packets sent to start a connection to port 445, which didn’t even exist on the mainframe. A SYN packet will be responded to by a SYN-ACK packet if the application open is successful. In this case, an RST packet responded to each SYN packet. The RST packet indicates the session couldn’t be established. Each time this occurs, the Connection Attempts Dropped or Connection Attempts Discarded counters will increment.

In Figures 3 and 4, you see SYN packet 75 responded to by RST packet 76, SYN packet 148 responded to by RST packet 149, and SYN packet 225 responded to by RST packet 226. If this is a mistake and happens thousands of times a day because some PCs aren’t properly configured, then consider how much unnecessary CPU time the TCP stack may be taking for needless error recovery.
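If you can export a trace into simple records, pairing each SYN with the RST that answers it is straightforward. The sketch below is an illustration only: the record layout is hypothetical, the addresses in the usage note are made up, and only the packet numbers and port 445 come from the figures described above.

```python
from collections import Counter


def refused_attempts(packets):
    """Pair each SYN with the RST that answers it and count refusals per port.

    packets: iterable of (packet_number, source, destination, server_port, flags),
    where server_port is the port on the host being connected to.
    """
    pending = {}          # (client, server, port) -> SYN packet number
    pairs = []
    refused = Counter()
    for number, source, destination, port, flags in packets:
        if flags == "SYN":
            pending[(source, destination, port)] = number
        elif "RST" in flags:
            # The RST travels from server back to client.
            key = (destination, source, port)
            if key in pending:
                pairs.append((pending.pop(key), number))
                refused[port] += 1
    return pairs, refused
```

Fed the six packets from the figures, with a PC at a made-up address sending each SYN to port 445, it returns the pairs (75, 76), (148, 149), and (225, 226) and a count of three refusals for port 445.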

Notice TCP port 445, which is used for SMB (Server Message Block) protocol file sharing in Windows NT/2000/XP. In Windows NT, it ran on top of NetBIOS over TCP/IP, which used ports 137 to 138 (UDP) and 139 (TCP). In Windows 2000/XP/2003, Microsoft added the ability to run SMB directly over TCP/IP, without the extra layer of NetBIOS over TCP by using TCP port 445. This port is often misused by hackers!

 

7. Thou shalt delve deeply into User Datagram Protocol (UDP) No Ports errors: The UDP equivalent of TCP Attempts Failed is UDP No Ports. These counters may appear in the output of the Netstat STATS command or you may see them while interrogating the SNMP MIB. UDP No Ports means some packets were sent for a UDP port that wasn’t available. It may be there’s a UDP application that isn’t active. If all UDP sockets are active, then it may be that UDP traffic is coming in at too high a rate for a particular port. We’ve seen this error correlate with the ICMP Destination Unreachable error, subtype Port Unreachable.

The ICMP Destination Unreachable message has several subtypes that indicate the type of error. These subtypes are as follows:

0 = Network Unreachable

1 = Host Unreachable

2 = Protocol Unreachable

3 = Port Unreachable

4 = Fragmentation Needed But Do Not Fragment Set

5 = Source Route Failed

6 = Destination Network Unknown

7 = Destination Host Unknown

8 = Source Host Isolated

9 = Network Access Prohibited

10 = Host Access Prohibited

11 = Network Unreachable for Type of Service

12 = Host Unreachable for Type of Service

13 = Communication Administratively Prohibited (administrative filtering prevents packet from being forwarded)

14 = Host Precedence Violation (indicates the requested precedence isn’t permitted for the combination of host or network and port)

15 = Precedence Cut Off in Effect (precedence of datagram is below the level set by the network administrators).

Figure 5 shows packets with subtype 3, or Port Unreachable.

These Internet Control Message Protocol (ICMP) packets are generated because the IP address 10.3.32.37 port 10139 tried to access port 161 on the same machine. No application was listening on port 161, so this generated an ICMP error. Since this port happened to be for UDP, it also generated a UDP No Ports error.
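When reading a raw trace, the subtype is the second byte of the ICMP header, right after the type byte (3 for Destination Unreachable). The following minimal sketch decodes those two bytes using the subtype list above; the function name is hypothetical and the sample bytes in the usage note are made up rather than taken from Figure 5.

```python
import struct

DESTINATION_UNREACHABLE = 3

SUBTYPES = {
    0: "Network Unreachable",
    1: "Host Unreachable",
    2: "Protocol Unreachable",
    3: "Port Unreachable",
    4: "Fragmentation Needed But Do Not Fragment Set",
    5: "Source Route Failed",
    6: "Destination Network Unknown",
    7: "Destination Host Unknown",
    8: "Source Host Isolated",
    9: "Network Access Prohibited",
    10: "Host Access Prohibited",
    11: "Network Unreachable for Type of Service",
    12: "Host Unreachable for Type of Service",
    13: "Communication Administratively Prohibited",
    14: "Host Precedence Violation",
    15: "Precedence Cut Off in Effect",
}


def describe_unreachable(icmp_header):
    """Return the subtype name for a Destination Unreachable message, else None."""
    # ICMP header layout: type (1 byte), code (1 byte), checksum (2 bytes).
    icmp_type, code, _checksum = struct.unpack("!BBH", icmp_header[:4])
    if icmp_type != DESTINATION_UNREACHABLE:
        return None
    return SUBTYPES.get(code, f"Unknown subtype {code}")
```

For example, `describe_unreachable(bytes([3, 3, 0, 0]))` returns `"Port Unreachable"`, while an echo request (type 8) returns `None`.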

8. Thou shalt address the reason for your IP address errors: Consider problems that may appear in IP traffic. You may find in monitoring your TCP network some counters that may be called IP Address Errors. One especially suspicious circumstance is when the IP address errors and IP inbound discards counters show the same value. Packets are coming in with an “unknown” address and are being discarded.

What kinds of packets might these be? Figure 6 shows many packets coming into the mainframe with an address of 255.255.255.255. This is a broadcast address; by setting the address to all ones (255.255.255.255), all hosts on the network receive the broadcast. These packets may not contain data that the mainframe can understand or is interested in, so the packets are discarded. Why send packets just to have them discarded?
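To gauge how much of the inbound traffic is of this kind, you can count the destination addresses in a trace that equal the all-ones broadcast address. This is a minimal sketch using Python’s standard ipaddress module; the function name is hypothetical and the input is assumed to be a list of destination addresses you have already extracted.

```python
import ipaddress

LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def count_broadcasts(destinations):
    """Count destination addresses equal to the all-ones broadcast address."""
    return sum(
        1 for address in destinations
        if ipaddress.IPv4Address(address) == LIMITED_BROADCAST
    )


print(int(LIMITED_BROADCAST) == 0xFFFFFFFF)  # True: all 32 bits are ones
```

Comparing this count with the IP Address Errors counter over the same interval is one way to check whether broadcasts account for the discards.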

9. Thou shalt not convert thy applications directly from multi-dropped Synchronous Data Link Control (SDLC): When many short segments are sent, there’s overhead associated with the traffic; each packet contains at least 40 bytes in IP and TCP headers. If you send a short segment, then a high proportion of the packet is overhead. Figure 7 shows an application that was ideal for a multi-dropped SDLC link. It was converted directly to a TCP application keeping the same message lengths. On a TCP virtual circuit, this kind of application is quite resource-intensive and may have poor response time.
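The proportion is easy to work out from the 40-byte figure. The sketch below assumes minimum-size headers with no options; the function name and the message lengths in the usage note are illustrative and aren’t taken from Figure 7.

```python
HEADER_BYTES = 40  # minimum IP header (20) plus minimum TCP header (20)


def header_overhead(payload_bytes, header_bytes=HEADER_BYTES):
    """Fraction of each packet taken up by IP and TCP headers."""
    return header_bytes / (header_bytes + payload_bytes)


for payload in (10, 100, 1460):
    print(f"{payload:>5} data bytes: {header_overhead(payload):.0%} overhead")
```

With a 10-byte message, 40 of the 50 bytes on the wire are headers, or 80 percent; with 1,460 bytes of data, the same headers are under 3 percent of the packet.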

10. Thou shalt not use two packets when one will do: We’ve seen applications send two packets for each transmission with the second packet having only a protocol flag set on! The protocol flag could have been combined with the flags in the first packet. This is a small mistake, but when you make it a million times a day, it becomes a big mistake.
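Because TCP flags are individual bits in the header, combining them is a bitwise OR. The following toy sketch folds a flag-only packet into the data packet before it. The flag values are the standard TCP header bits, but the function, the packet representation, and the choice of PSH and FIN in the example are assumptions; the applications we saw aren’t identified by flag here.

```python
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def coalesce(packets):
    """Fold flag-only packets into the preceding data packet.

    packets: list of (flags, payload_length) sent in one direction.
    SYN and RST packets are left alone in this simplified sketch.
    """
    merged = []
    for flags, payload_length in packets:
        follows_data = bool(merged) and merged[-1][1] > 0
        if follows_data and payload_length == 0 and not flags & (SYN | RST):
            previous_flags, previous_length = merged[-1]
            merged[-1] = (previous_flags | flags, previous_length)
        else:
            merged.append((flags, payload_length))
    return merged


before = [(PSH | ACK, 100), (FIN | ACK, 0)]
print(coalesce(before) == [(FIN | PSH | ACK, 100)])  # True
```

Each packet avoided also saves at least 40 bytes of IP and TCP headers, so a million of them a day is at least 40 million bytes of header traffic that never has to be built, sent, or processed.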

Conclusion

Tuning TCP/IP is like death by a thousand paper cuts. There can be thousands of small mistakes. When we do tuning, we work little by little, fixing each one the best we can. The results have been quite satisfying. As problems are fixed, we can see a reduction in the overhead CPU usage for the TCP stack on the mainframe. Throughput and response time for the applications should also improve. It’s a task well worth undertaking.