Feb 21 ’11
CSI Database: The Forensics of DB2 Network Traffic
Forensics is a collection of tests and techniques used together to uncover detailed evidence that can be used to help reconstruct an event. This article describes how network-related forensic tools can be useful to DB2 for z/OS.
Victims and Suspects
Network access to DB2 for z/OS is provided by Distributed Data Facility (DDF), a built-in component. Any application server or client, running on z/OS or any platform, can use TCP/IP to access DB2 through DDF. In this environment, there are many potential victims of bad network service and performance (see Figure 1).
DDF runs as a z/OS address space named ssidDIST, which opens TCP ports and listens for connections. Every component in a multi-tier application path can be a suspect when things go wrong. Was it the network, client, application, or DB2 server?
Crime scene investigations use data from sources such as bank account records and interviews to get a broader understanding of victims, other events, and any patterns that might connect them.
When investigating networks, TCP/IP traffic analysis figures for a z/OS Logical Partition (LPAR) indicate who has been active, how, when, and where. Traffic passing through DDF TCP ports can be broken down for insight into the behavior of DB2 for z/OS and its connection partners:
- Which remote node had the most traffic with one or any DB2 DDF task in the last minute, hour, day, or since DB2 started?
- Which DDF task is the busiest by bytes? By connections?
- How long are most TCP/IP connections to a DB2 DDF task—milliseconds for Web transactions and hours for persistent connections? Is this as expected? How does this compare to other DDF tasks?
- Which network interface carries the most traffic to and from this DDF?
A problem is reported with a TCP/IP connection between client A and DB2 B. But what else are client A and DB2 B involved in simultaneously? Knowing that can help you determine where to start investigating. For example:
- There are thousands of active connections with DB2 B. Is there some kind of limit?
- There are no connections with DB2 B. Is having zero connections occasionally normal for DB2 B? Historical performance data will tell you. Or, are there zero connections because DB2 B is troubled?
- Is there connectivity between the DB2 B LPAR and client A? Is client A accessible? Does it have any other connections with that LPAR? Does it have any connections at all?
TCP/IP connection lists provide this kind of information about each connection and partner. Sorting connection lists is a fast way to spot the top talkers, longest running connections, longest idle connections, and connections stuck in unhealthy TCP/IP states. All these are possibly useful clues.
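As a rough illustration of how sorting a connection list surfaces clues, here is a minimal Python sketch. The `Connection` record, its field names, and the set of "suspect" states are all hypothetical; a real monitor or netstat report will use its own field names and state spellings.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """One row of a connection list (hypothetical layout)."""
    remote_ip: str
    local_port: int
    state: str            # TCP state name, spelled here in an assumed style
    bytes_total: int      # bytes in + bytes out
    age_seconds: float    # time since the connection was set up
    idle_seconds: float   # time since the last packet


# Assumption: states that deserve a second look when a connection lingers in them.
SUSPECT_STATES = {"CLOSE_WAIT", "FIN_WAIT_2", "LAST_ACK"}


def top_talkers(conns, n=3):
    """Connections carrying the most bytes, busiest first."""
    return sorted(conns, key=lambda c: c.bytes_total, reverse=True)[:n]


def longest_running(conns, n=3):
    """Oldest connections first."""
    return sorted(conns, key=lambda c: c.age_seconds, reverse=True)[:n]


def longest_idle(conns, n=3):
    """Connections that have been silent the longest."""
    return sorted(conns, key=lambda c: c.idle_seconds, reverse=True)[:n]


def stuck(conns, min_idle=300.0):
    """Connections sitting in a suspect state for at least min_idle seconds."""
    return [c for c in conns
            if c.state in SUSPECT_STATES and c.idle_seconds >= min_idle]


if __name__ == "__main__":
    sample = [
        Connection("10.1.1.5", 446, "ESTABLISHED", 5_000_000, 7200.0, 2.0),
        Connection("10.1.1.9", 446, "CLOSE_WAIT", 12_000, 900.0, 850.0),
        Connection("10.1.2.7", 446, "ESTABLISHED", 80_000, 30.0, 0.1),
    ]
    for conn in stuck(sample):
        print("possible stuck connection:", conn.remote_ip, conn.state)
```

Each function is just a different sort key or filter over the same list, which is why a sortable connection list is such a quick first pass.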
A Lab Tour
Forensic analysis uses tools such as microscopes and mass spectrometers to examine the most promising evidence. For DB2 TCP/IP connection problems, key examination tools are IP packet stream viewing and IP packet inspection. After comparing connections on a list, you can home in on a single suspicious connection and use packet stream viewing to show what’s logically happening over the connection.
Packet stream viewers use data from both the IP packet headers and packet content to provide chronological and annotated lists of all IP packets flowing over a connection—preferably in real-time (see Figure 2). Most DB2 DDF packet streams you see include a mix of these kinds of packet flows:
- TCP/IP protocol activity
- DDF partner setup
- Distributed Relational Database Architecture (DRDA) protocol activity
- Application SQL.
TCP/IP protocol activity: These packet flows do the mechanics of setting up and taking down TCP/IP connections. While irrelevant to DB2 itself, these activities are highly relevant to DB2 user problems:
- A user or application can’t connect to DDF. Is this a network or an application problem? If you see the TCP three-way handshake complete in the packet stream, then the underlying TCP connection was successfully set up and the problem most likely lies at the application level. But if the handshake failed to complete, suspect the network path first.
- A successfully established connection is stopping unexpectedly. The packet stream shows whether the DDF server is abnormally terminating the connection (by sending a reset flag, or RST) or the client is ending it in an orderly way (by sending a finished flag, or FIN).
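The two checks above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular product works: it assumes each packet has already been reduced to a hypothetical `(direction, flags)` pair, where direction is `"C"` for client-to-server or `"S"` for server-to-client and flags is a set of TCP flag names.

```python
def handshake_completed(packets):
    """Return True if SYN, SYN+ACK, ACK appear in order (three-way handshake)."""
    stage = 0
    for direction, flags in packets:
        if stage == 0 and direction == "C" and flags == {"SYN"}:
            stage = 1                      # client asked to connect
        elif stage == 1 and direction == "S" and flags == {"SYN", "ACK"}:
            stage = 2                      # server accepted
        elif (stage == 2 and direction == "C"
              and "ACK" in flags and "SYN" not in flags):
            return True                    # client confirmed
    return False


def how_it_ended(packets):
    """Report the first RST or FIN seen, and which side sent it."""
    sides = {"C": "client", "S": "server"}
    for direction, flags in packets:
        if "RST" in flags:
            return "reset by " + sides[direction]
        if "FIN" in flags:
            return "orderly close started by " + sides[direction]
    return "still open"


if __name__ == "__main__":
    stream = [
        ("C", {"SYN"}),
        ("S", {"SYN", "ACK"}),
        ("C", {"ACK"}),
        ("C", {"PSH", "ACK"}),
        ("S", {"RST"}),
    ]
    print(handshake_completed(stream), "/", how_it_ended(stream))
```

A stream containing only repeated client SYN packets fails the first check, which is the classic signature of a connection attempt that never got an answer.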
DDF partner setup: Once a TCP/IP connection is established, DB2 DDF must control the logical introductions and negotiations with its remote partner. The partner can be another DB2 database, a Java Database Connectivity (JDBC) application, an ordinary workstation, a data aggregator product, or DB2 Connect in one of its many modes. Partner setup includes exchanging product names and versions, security mechanisms, and security check details. Only after both ends agree on the communication conditions can any data start to flow. Proposal and acknowledgement packets, or the lack of them, can be seen in the packet stream.
DRDA protocol activity: DB2 DDF communicates over TCP/IP using an open application protocol called DRDA. The DDF address space listens for DRDA commands from requesters, invokes DB2 on their behalf, generates a DRDA reply, and sends this reply to the requester. DRDA is a highly structured protocol built on Distributed Data Management (DDM) commands that provide its command and reply structure. Some DDM commands map directly to SQL commands; others do specific distributed database tasks. The sequence of all DDM commands is clearly visible from a packet stream.
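To give a feel for how a viewer can list DDM commands, here is a minimal Python sketch that walks the DRDA framing in a TCP payload. It assumes the usual layout of a 6-byte DSS header (length, the marker byte 0xD0, a format byte, a correlation ID) followed by a DDM length and code point. The handful of code point values shown are quoted from memory of the published DRDA specification and should be verified against it. Segmented or encrypted DSS blocks are ignored, and `build_dss` is a hypothetical helper used only to fabricate test input.

```python
import struct

# Assumed subset of DDM code points; verify against the DRDA specification.
CODE_POINTS = {
    0x1041: "EXCSAT",   # exchange server attributes
    0x106D: "ACCSEC",   # access security
    0x106E: "SECCHK",   # security check
    0x2001: "ACCRDB",   # access relational database
    0x200E: "RDBCMM",   # commit
}


def ddm_commands(payload: bytes):
    """Return the DDM command name (or hex code point) of each DSS in payload."""
    names = []
    offset = 0
    while offset + 10 <= len(payload):
        dss_len, marker, _fmt, _corr = struct.unpack_from(">HBBH", payload, offset)
        if marker != 0xD0 or dss_len < 10:
            break                          # not DRDA framing, or unsupported form
        _ddm_len, code_point = struct.unpack_from(">HH", payload, offset + 6)
        names.append(CODE_POINTS.get(code_point, f"0x{code_point:04X}"))
        offset += dss_len                  # DSS length includes its own header
    return names


def build_dss(code_point: int, corr_id: int, body: bytes = b"") -> bytes:
    """Hypothetical helper: build one request DSS for demonstration only."""
    ddm = struct.pack(">HH", 4 + len(body), code_point) + body
    return struct.pack(">HBBH", 6 + len(ddm), 0xD0, 0x01, corr_id) + ddm


if __name__ == "__main__":
    stream = build_dss(0x1041, 1) + build_dss(0x106D, 2)
    print(ddm_commands(stream))
```

Unknown code points are reported in hex rather than guessed at, which keeps the listing honest when the table is incomplete.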
Application SQL: The motive of any DB2 TCP/IP connection is to eventually access a database. Packet stream viewing reveals which packets contain the payload of application SQL queries and their responses. With multi-tier applications, sometimes the exact SQL codes and states are accidentally or deliberately not externalized to users, hindering problem determination. These are always visible in a packet stream. DB2 authorization problems are one example that can cause SQL error responses and apparent failures at the user interface end, even though the underlying TCP/IP connection is fine.
Packet stream viewing also shows you the context of individual SQL request and response packets and their related packets. For example, packet stream viewing lets you pair commit request packets with their matching end unit of work packets; these are important in distributed database processing, where a logical unit of work can span multiple SQL requests and multiple databases.
From the timestamps in the IP packet headers, TCP response time metrics can be calculated. A key metric for DB2 is TCP server response time—the time difference between when the packet containing the SQL request was received and when the first packet containing the response was sent. This is how long DB2 took to respond to the request. Comparing this with other metrics such as network Round Trip Time (RTT) and data transfer time indicates whether the mainframe or the network is contributing more to poor response time.
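The server response time calculation can be sketched as follows. This is a simplified illustration under stated assumptions: each packet is a hypothetical `(timestamp_seconds, direction, payload_length)` tuple as seen at the server, with direction `"in"` for packets received and `"out"` for packets sent; packets without payload (pure acknowledgements) are skipped; and a multi-packet request is timed from its last packet.

```python
def server_response_times(packets):
    """Return one response time per request/response exchange, in seconds.

    Response time = time the first response data packet was sent
                    minus time the (last) request data packet was received.
    """
    times = []
    request_time = None
    for timestamp, direction, payload_length in packets:
        if payload_length == 0:
            continue                       # pure ACKs carry no request or response
        if direction == "in":
            request_time = timestamp       # remember the latest request packet
        elif direction == "out" and request_time is not None:
            times.append(timestamp - request_time)
            request_time = None            # later response packets are transfer time
    return times


if __name__ == "__main__":
    trace = [
        (0.000, "in", 120),    # SQL request arrives
        (0.0005, "out", 0),    # server ACK, ignored
        (0.250, "out", 800),   # first response packet
        (0.251, "out", 800),   # rest of the response
        (1.000, "in", 90),     # next request
        (1.040, "out", 200),   # its response
    ]
    for seconds in server_response_times(trace):
        print(f"{seconds * 1000:.1f} ms")
```

Only the first response packet counts toward server response time; the remaining response packets belong to data transfer time, which is the distinction that separates mainframe delay from network delay.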
Packet stream analysis can become quite complex. If the problem is that you don’t have a TCP/IP connection, there is no connection to trace; you have to trace at a broader level, such as all traffic to the DDF port, then search for your connection attempt among all the other packets. In addition, one TCP/IP connection can be supporting simultaneous work for many database users, with functions such as thread pooling, connection concentration, and different Database Access Thread (DBAT) processing modes.
Under the Microscope
In contrast to packet stream viewing, IP packet inspection concentrates on the data content of one individual packet. Packet inspection products will decode the packet data content, which means translating it from hex to something easily readable by a DB2 person. Packet inspection reveals the full text of SQL queries and their matching result set values and SQL communication area.
Other details available from packet inspection include:
- Client accounting fields such as end-user workstation and user ID name, end-user ID password flags (and password values if not encrypted)
- DB2 plan and package names
- Product release details of remote DB2 subsystems.
The Logical Unit-of-Work Identifier (LUWID) is also available; it’s used to correlate IP partner details with DB2 threads, which Workload Manager (WLM) uses to assign priorities to application work arriving from DDF.
Obscuring the Evidence
Forensics can only do its best with what it’s given. Often, this isn’t perfect; fingerprints can be wiped off, bleach can be used to destroy DNA, or crowds may accidentally trample footprints.
Packet encryption will obviously obscure evidence for well-intended activities such as in-house network forensics as well as the ill-intended activities that it’s deployed against. Commercial mainframe packet stream viewing and inspection products don’t decrypt any packet content data. There are two approaches to mainframe TCP/IP encryption. The following summarizes their effects on network investigations:
SSL/TLS provides security and encryption for TCP connections. Generally, the TCP applications perform the SSL processing, and use different TCP ports for their normal and encrypted connections. HTTP and HTTPS are the best-known examples, but DDF also supports SSL.
TCP SSL connections still transmit their IP and TCP packet headers in the clear. They must; otherwise, every router in the network would need to be able to decrypt and re-encrypt them. Only their TCP packet content is encrypted. This means TCP connection setup, traffic statistics calculation, and response time calculations can continue to be visible because these are based on fields in the TCP headers.
One frequent use of packet stream viewing is during the initial setup of the SSL environment. Connection failures between a TCP application and partners are varied and common when doing the complex certificate setup tasks SSL requires. SSL handshake failures and their exact reasons can be highlighted with packet stream viewing.
IPSec securely connects private IP hosts over insecure public networks, possibly including the Internet. IPSec does this by encrypting the data and using Virtual Private Network (VPN) tunnels. The IP stack performs IPSec processing; applications such as DDF aren’t even aware of it. IPSec can protect all IP protocols, not only TCP.
When TCP connections use IPSec, their packets no longer appear on the network as TCP; they are carried inside special IPSec IP protocols, such as Encapsulating Security Payload (ESP), which have their own IP protocol numbers. Packet stream viewing recognizes these IPSec protocol packets (the IP packet header is always in the clear) and can identify and decode events such as key exchange negotiation failures. However, when the TCP header is encrypted along with the data, information normally obtained from the TCP headers isn’t available.
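Because the IP header stays in the clear, a viewer can tell ordinary TCP from IPSec traffic just by reading the protocol field. The Python sketch below illustrates the idea for IPv4 only. It relies on the standard assigned protocol numbers (6 for TCP, 17 for UDP, 50 for ESP, 51 for AH) and on key exchange conventionally using UDP port 500; NAT traversal and IPv6 are deliberately left out. `build_ipv4` is a hypothetical helper that fabricates a bare header for demonstration.

```python
import struct


def classify(ip_packet: bytes) -> str:
    """Name the protocol carried by an IPv4 packet, using only clear-text headers."""
    if len(ip_packet) < 20 or ip_packet[0] >> 4 != 4:
        return "not IPv4"
    header_len = (ip_packet[0] & 0x0F) * 4   # IHL field, in 32-bit words
    protocol = ip_packet[9]                  # protocol field of the IPv4 header
    if protocol == 6:
        return "TCP"
    if protocol == 50:
        return "IPSec ESP"
    if protocol == 51:
        return "IPSec AH"
    if protocol == 17:
        if len(ip_packet) >= header_len + 4:
            src_port, dst_port = struct.unpack_from(">HH", ip_packet, header_len)
            if 500 in (src_port, dst_port):
                return "IKE key exchange (UDP 500)"
        return "UDP"
    return f"IP protocol {protocol}"


def build_ipv4(protocol: int, payload: bytes = b"") -> bytes:
    """Hypothetical helper: a minimal 20-byte IPv4 header plus payload."""
    header = bytearray(20)
    header[0] = 0x45          # version 4, header length 5 words
    header[9] = protocol
    return bytes(header) + payload


if __name__ == "__main__":
    print(classify(build_ipv4(6)))
    print(classify(build_ipv4(50)))
    print(classify(build_ipv4(17, struct.pack(">HH", 500, 500))))
```

For an ESP packet the sketch stops at the label: with the contents encrypted, there are no readable TCP ports or flags left to report.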
The CSI Effect
Network traffic forensics has it much easier than the complex, life-changing investigations resolved every week on television shows such as “CSI,” where well-groomed people generate a breakthrough with a single lab test. Instead of the gamut of human behavior, network forensics deals only with a small set of far more predictable, repetitious activities. IP connection setup and application data transfers follow set rules, standards, and protocols. Network and packet data reveal the hidden information you need to help nail the culprits of DB2 connection and performance problems. So, make friends with your mainframe networking group. They’re your expert crime lab consultants.