The 10 commandments of TCP/IP performance are a distillation of hard-won experience. Monitoring and tuning TCP networks on the mainframe is complex because each network is a mixture of many applications and pieces of hardware. Like an onion, each connection contains layers of protocols and subprotocols that must be decoded to make sense of the traffic patterns. Making sense of it all is the first step to tuning and improving performance.
For example, if you’re trying to find a problem with an IP printer, you may need to understand the IP protocol, the TCP protocol, the Line Printer Remote/Line Printer Daemon (LPR/LPD) protocol, and then the hardware implementation of the printer. The following 10 commandments of TCP/IP performance diagnostics have served us well in tuning the TCP stack and finding performance problems.
1. Thou shalt monitor thy application backlog queue: Consider the application backlog queue as “low-hanging fruit.” Monitoring it is easy, can help users a great deal, and can solve problems that may appear to be quite difficult. The TCP application may have a maximum number of connections that can be active simultaneously; let’s say five. A second limit, let’s say 10, caps the number of connections that can wait for one of those five to finish. This waiting area is the backlog queue. With five connections active and 10 waiting, the 16th user who tries to connect can’t; the connection is dropped.
So, you have the following:
- Current backlog: The current number of connections in backlog
- Maximum in backlog: The maximum number of connections allowed in backlog at a time
- Exceed backlog: The total number of connections dropped by the listener because the backlog limit was exceeded
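The arithmetic of the example above (five active connections, a backlog of 10, and a 16th user who is dropped) can be sketched as a small simulation. This is a toy model, not how any particular TCP stack implements its listener; the `Listener` class, its method names, and the user labels are all hypothetical and exist only to show how the three counters relate.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Listener:
    """Toy model of a listener: a few active slots plus a bounded backlog."""
    max_active: int = 5       # assumed limit on simultaneously active connections
    max_backlog: int = 10     # "Maximum in backlog"
    exceed_backlog: int = 0   # "Exceed backlog": running total of dropped connections
    active: set = field(default_factory=set)
    backlog: deque = field(default_factory=deque)

    @property
    def current_backlog(self) -> int:
        """'Current backlog': connections waiting right now."""
        return len(self.backlog)

    def connect(self, user: str) -> str:
        """Admit, queue, or drop a new connection request."""
        if len(self.active) < self.max_active:
            self.active.add(user)
            return "active"
        if len(self.backlog) < self.max_backlog:
            self.backlog.append(user)
            return "queued"
        self.exceed_backlog += 1
        return "dropped"

    def finish(self, user: str) -> None:
        """Complete an active connection and promote the oldest waiter, if any."""
        self.active.remove(user)
        if self.backlog:
            self.active.add(self.backlog.popleft())


listener = Listener()
results = [listener.connect(f"user{n}") for n in range(1, 17)]
print(results.count("active"), results.count("queued"), results[-1])
print(listener.current_backlog, listener.max_backlog, listener.exceed_backlog)
```

In this model the first five users become active, the next 10 wait in the backlog, and the 16th is dropped, which raises the exceed counter to 1. A waiting user is served only when `finish` frees an active slot, so a slow application keeps the backlog full.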
We once worked with an installation that used an accounting application. Sometimes, users received a “Connection Refused” message when they tried to connect to the application; other times, the session initiation would just hang.
Technical support was stumped. The users were angry and the problem had been escalated to quite a high level. We were able to look at the application backlog queues to see that they were being exceeded and connections were dropped. Figure 1 shows the backlog queues in the Netstat All display.
When the user connection was in the backlog queue, the session initiation appeared to “hang.” When the user connection was dropped because the queue was exceeded, the “Connection Refused” message was received. The real problem was that the application took too long to process its work and didn’t complete connections in a timely manner.
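The reasoning in this case can be written down as a simple check over two successive readings of the backlog counters. The sketch below is only an illustration: the `BacklogSample` type, the `diagnose` function, and the sample numbers are hypothetical, and it assumes the counters have already been collected by hand or by a script from a display such as Netstat All.

```python
from typing import NamedTuple, Optional


class BacklogSample(NamedTuple):
    """One reading of a listener's backlog counters (hypothetical structure)."""
    current: int    # current backlog
    maximum: int    # maximum allowed in backlog
    exceeded: int   # running total of connections dropped


def diagnose(previous: BacklogSample, latest: BacklogSample) -> Optional[str]:
    """Map a change in the counters to the symptom users are likely to report."""
    if latest.exceeded > previous.exceeded:
        # New drops since the last reading: users see "Connection Refused".
        return "refused"
    if latest.current > 0:
        # Connections are waiting: session initiation appears to hang.
        return "hang"
    return None


# Illustrative values only, not measurements from a real system.
earlier = BacklogSample(current=4, maximum=10, exceeded=0)
later = BacklogSample(current=10, maximum=10, exceeded=3)
print(diagnose(earlier, later))
```

Comparing the exceed counter between readings matters because it is a running total; a nonzero value alone does not show that connections are being dropped now.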
2. Thou shalt not kill thy network by many short connections: TCP works by creating a virtual