Jan 1 ’04
An Introduction to Web Services & Performance Issues
Are you having trouble making robust legacy applications talk to hot, new Web applications? Trying to manage business transactions with your partners when everyone is running their proprietary software on different platforms? Simple, flexible interoperability is the “Holy Grail” behind Web services. However, simplicity does not come without cost. This article will introduce you to Web services, outline the key performance issues that characterize Web services-based applications, and help you begin to explore the possibilities for monitoring the performance of such complex, distributed applications.
What exactly are Web services? They are programs that are accessed over the Web, independent of the tools used to create them and independent of the operating system on which they run. Web services are interesting because they offer a way to provide programs that are platform-independent.
A Web service supplies “discovery” information about itself, which means that you can query a Web service and have it tell you the names of the functions it provides, the arguments accepted by each function, and the return values for the functions. In other words, to be able to use a Web service as part of your programs, you don’t have to know much other than it exists!
Simply stated, a Web service offers one or more methods you can invoke, using open standards (including HTTP and XML) for communication. For those readers who are not up to speed on object-oriented programming, a method is just a function or procedure that is associated with an object. In other words, when a method is invoked, it simply executes an action of some kind.
The driving force behind Web services is to provide components that can communicate with each other regardless of the language they were written in or the operating system on which they run. Web services make their methods available for use or consumption by other programs. To consume a Web service, a program sends a request over HTTP, using HTTP GET, HTTP POST, or, more commonly, the Simple Object Access Protocol (SOAP). Given that this article is an introduction to Web services and performance, I refer to Microsoft’s .NET environment, primarily because .NET “insulates” the Web service developer from the coding details of SOAP, which is particularly comfortable for the newcomer. However, the SOAP “transaction” that .NET constructs is always available for viewing, as it is in other Web services development environments.
With respect to the universal discovery mechanism provided by Web services, .NET also automatically creates a Web Services Description Language (WSDL) document, which other applications (or users) can read to understand how to use the Web service. Just as .NET hides the SOAP coding details from the new Web services developer, it also insulates the developer from the details of creating the WSDL document; again, it is important to know that the document exists in any case.
Web services also use Universal Description, Discovery, and Integration (UDDI). UDDI is a kind of Yellow Pages for Web services that you can use to find a Web service or to ensure that your Web service can be found.
The SOAP standard is overseen by the World Wide Web Consortium (W3C) and is based on XML. The standard consists of three parts:
- An envelope that defines rules for describing a message and how to process it
- A set of encoding rules for instances of the application-defined data types
- A convention for representing remote procedure calls and responses.
If you think that hand coding a Web service following the SOAP standard sounds complex and fairly tedious, you’re right! Again, the .NET environment insulates the developer from those details. If you are interested in learning more about the SOAP specification, go to www.w3.org/TR/soap12-part0/.
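To make the envelope idea concrete, here is a minimal sketch in Python of what a toolkit does on the developer's behalf: it wraps a method call in a SOAP 1.2 envelope and unwraps it again at the other end. The application namespace, the `PlotPoint` method, and the helper names are assumptions made for illustration; this is not the code that .NET generates.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
APP_NS = "http://example.com/graph"                  # hypothetical application namespace


def build_request(method, params):
    """Wrap a method call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_request(message):
    """Recover the method name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(message)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]  # first child of Body is the call
    method = call.tag.split("}", 1)[1]             # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return method, params


# Hypothetical method and arguments, used only to exercise the helpers.
message = build_request("PlotPoint", {"x": -1, "y": -2})
method, params = read_request(message)
```

Note that the parameter values come back as text; turning them into typed values again is the job of the encoding rules listed above.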
WSDL is used to create documents that describe the methods supported by a Web service, the arguments that the methods accept, and what the Web service returns. The WSDL document tells the program (or the programmer) what’s needed to consume a Web service. The format of a WSDL document is defined by an XML schema, that is, XML that specifies the format of an XML document. As I said before, the .NET development environment hides the details of building the WSDL document, but if you are interested in obtaining more information, go to www.w3.org/TR/wsdl12/.
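As a rough illustration of how a program can read a WSDL document to learn what a service offers, the following Python sketch lists the operations in a heavily trimmed WSDL 1.1 fragment. The fragment, the service name, and the message names are all hypothetical; a real WSDL document also carries types, bindings, and service addresses.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Hypothetical, heavily trimmed WSDL fragment for a service with two methods.
SAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="GraphService">
  <portType name="GraphPort">
    <operation name="PlotPoint">
      <input message="PlotPointRequest"/>
      <output message="PlotPointResponse"/>
    </operation>
    <operation name="ClearGraph">
      <input message="ClearGraphRequest"/>
      <output message="ClearGraphResponse"/>
    </operation>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Return {operation name: (input message, output message)}."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for op in root.iter(f"{{{WSDL_NS}}}operation"):
        inp = op.find(f"{{{WSDL_NS}}}input")
        out = op.find(f"{{{WSDL_NS}}}output")
        operations[op.get("name")] = (inp.get("message"), out.get("message"))
    return operations


operations = list_operations(SAMPLE_WSDL)
```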
With UDDI, we know what Web services are available for use. If you want others to use a Web service you’ve built, you would use UDDI to list your service so that others can find it. Similarly, you would use UDDI to find Web services to use in your own applications. A word of caution: In some discussions, UDDI can divert people from the real issues, such as the need for careful performance design when building a distributed application.
FACTORS AFFECTING THE PERFORMANCE OF WEB SERVICES
Face it: Web services are a form of remote computing, so their performance is affected by many factors, including latency and bottlenecks in the network; intermediaries (if present); and your service provider. Some of these factors, such as network servers you do not administer, are out of your control. However, because XML is used as the message format, XML itself is likely to be the main contributor to longer response times.
XML, more than any other specification, is key to Web services, especially in large organizations, such as governments, retailers, and banks that might best exploit XML technologies. In addition, developers are increasingly using XML in non-Web services environments.
Unfortunately, XML performance is not discussed much in the industry literature. Thus, a comprehensive and detailed overview would be extremely valuable, but that is beyond the scope of this introductory article. XML is human-readable and verbose, being both text-encoded and metadata-encoded (metadata is “descriptive” data that describes the “real” data). While XML is great in terms of flexibility and maintainability, it’s just terrible in terms of performance. The use of XML presents several performance challenges:
- TRANSMISSION: The longer the SOAP message, the longer the transmission time.
- PROCESSING: The parsing, binding, validation, and transformation of XML/SOAP messages will add to the response time.
- PERSISTENCE: RDBMS/XML databases (the databases referenced by XML) must stay “resident” for the life of the SOAP message. This might imply that the XML message has to “live” beyond the life of a process.
The killer in this is transmission: An XML message can be 10 to 20 times larger than the equivalent binary representation. For example, if we need to pass the values of a single point on a graph (an x-position and a y-position), we would merely pass the binary values (-1 and -2, in this case) to an “ordinary” called procedure. However, for a Web services method, we must encode the parameters with self-defining metadata to describe the actual parameter values (see Figure 1).
XML processing refers to several XML activities, such as parsing, schema validation, binding, and transformation. XML processing is quite CPU-, memory-, and I/O-intensive. Parsing and schema validation involve a lot of character encoding/decoding and string processing, which can require significant system resources. XML validation ensures that an XML document follows a predefined structure, which is an absolute necessity. However, it is usually three to four times slower than XML parsing!
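The size difference is easy to see with a small Python sketch that encodes the same point both ways. The element names in the XML form are illustrative assumptions and are not taken from Figure 1.

```python
import struct

x, y = -1, -2

# Binary form: two 32-bit signed integers, 8 bytes in total.
binary_form = struct.pack("<ii", x, y)

# Hypothetical XML form of the same point; the element names are illustrative.
xml_form = (
    '<?xml version="1.0"?>'
    f"<point><x>{x}</x><y>{y}</y></point>"
).encode("utf-8")

# How many times larger the XML form is than the binary form.
ratio = len(xml_form) / len(binary_form)
```

Even this bare XML form is several times the size of the binary one, and wrapping it in a SOAP envelope with namespace declarations widens the gap further.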
XML transformation deals with changing data from one XML structure to another, or changing data from XML to some other format. It is key when integrating different applications and probably presents the greatest performance challenge.
Just because Web services are a relatively new application paradigm, this does not mean that classic performance best practices do not apply—they do! Three common-sense practices exist:
- PROACTIVE STRATEGY: Define and address performance-oriented requirements right in the design phase; i.e., software performance engineering (SPE). SPE is the discipline that identifies the best practices and the hardware and software needed for good performance.
- DEFINITIVE STRATEGY: Train engineering and QA staff to write optimized XML Web services. Follow this practice regardless of how stringent the performance requirements are.
- REACTIVE STRATEGY: Analyze bottlenecks and weak links that may be causing performance problems. Once issues are identified, resolve them by analyzing and tuning the code or by changing the design.
REDUCING THE SIZE OF XML
We can use traditional ZIP/GZIP compression or a variety of similar tools. While ZIP compression can yield 10:1 compression, both endpoints must understand and abide by the same compression algorithm. In addition, the compressed XML completely loses its human readability and requires additional processing cycles.
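A minimal sketch of the trade-off, using Python's standard `gzip` module: the sender compresses the document and the receiver must decompress it before parsing. The sample document is a hypothetical, highly repetitive payload, so it compresses far better than typical messages would.

```python
import gzip

# Hypothetical payload: the same small record repeated many times.
record = "<point><x>-1</x><y>-2</y></point>"
document = ("<points>" + record * 500 + "</points>").encode("utf-8")

# Sender side: compress before transmission (costs CPU cycles).
compressed = gzip.compress(document)

# Receiver side: must apply the matching algorithm before it can parse.
restored = gzip.decompress(compressed)

savings = 1 - len(compressed) / len(document)
```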
Compact encoding, such as WAP Binary XML (WBXML), can reduce the number of bytes to a great extent and also eases parsing overhead at both endpoints. Unfortunately, the industry has not yet established an encoding compression standard.
OPTIMIZING XML VALIDATION
Assuming the application has been written to specification, turn validation on when an incoming XML document originates outside the application. Once such a document has been validated at the boundary, turn validation off when it is exchanged among the components of the application itself. One important caveat: skipping validation is not a viable option for documents exchanged between different applications. Studies have shown that validation adds two to three times more processing compared with handling an XML document that is not validated.
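The idea can be sketched in Python as a receive routine that validates only at the application boundary. The `validate` function below is a simple structural check standing in for real schema validation, and the `from_outside` flag and the `<point>` structure are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

REQUIRED_FIELDS = ("x", "y")  # hypothetical structure expected of a <point> document


def validate(root):
    """Stand-in for schema validation: check the expected structure."""
    if root.tag != "point":
        raise ValueError(f"unexpected root element: {root.tag}")
    for name in REQUIRED_FIELDS:
        if root.find(name) is None:
            raise ValueError(f"missing element: {name}")


def receive(message, from_outside):
    """Parse a message, validating only when it crosses the application boundary."""
    root = ET.fromstring(message)
    if from_outside:
        validate(root)  # pay the validation cost once, at the edge
    return root
```

Components inside the application would call `receive(message, from_outside=False)` and pay only the parsing cost, relying on the check already performed at the edge.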
Much of what has already been discussed has centered on application performance, issues that can negatively impact an application’s performance, and the possible remedies. However, without measurement, we cannot manage such applications.
Believe it or not, it is potentially easier to monitor application availability and performance when using Web services than for traditional distributed computing environments. There are several reasons for this:
- SOAP provides a clear starting point for defining and subsequently monitoring “transaction” performance. Content formatted in XML is easily parsed by software and readily understood by humans. Measurement tools can then access and act on the detailed application information contained within SOAP messages. Potentially, workload characterization can reach a detailed level of granularity that would otherwise require detailed knowledge of application logic and a willingness to make significant source code modifications.
- WSDL files provide information necessary for automated setup and dynamic discovery of new services. Though WSDL files are meant to expose Web services for use by external consumers, their mere existence implies the ability to discover new services as soon as the Web service becomes available. This makes it possible for measurement tools to discover the services they need to monitor and to adapt automatically to changes in the Web services available in their environments. Thus, setup and operation can, in principle, be automated to a degree that would be very difficult to achieve in other environments.
- Passive surveillance of SOAP messages via “software observers” is a new technique developed specifically for Web services environments. The basic idea is to insert a data capture module into the chain of handlers that process incoming and outgoing SOAP messages. Its main function is to copy XML tags and variables (e.g., name of Web service being invoked, IP address of system invoking the service, length of the incoming SOAP message, and arrival time). These tags and variables are copied from the object prepared by the SOAP engine during parsing to a monitoring object. Subsequently, this new object is processed asynchronously by other components of the performance monitoring subsystem. This strategy should keep overhead very low.
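The “software observer” technique in the last item can be sketched as follows. This is a simplified Python model under stated assumptions: the handler chain, the `context` dictionary standing in for the object prepared by the SOAP engine, and all function names are hypothetical and do not represent any particular SOAP engine's API.

```python
import queue
import threading
import time

observations = queue.Queue()  # hand-off point between the request path and the monitor


def observer_handler(context):
    """Copy a few fields from the parsed message; do no analysis here."""
    observations.put({
        "service": context["service"],
        "client_ip": context["client_ip"],
        "length": len(context["raw_message"]),
        "arrived": time.time(),
    })
    return context  # pass the message on unchanged


def business_handler(context):
    """Placeholder for the handler that actually runs the Web service."""
    context["response"] = "ok"
    return context


def run_chain(context, handlers):
    """Pass the message through each handler in order."""
    for handler in handlers:
        context = handler(context)
    return context


def monitor(results):
    """Runs on its own thread so analysis never delays a request."""
    while True:
        item = observations.get()
        if item is None:  # sentinel: stop monitoring
            break
        results.append(item)


results = []
worker = threading.Thread(target=monitor, args=(results,))
worker.start()
run_chain(
    {"service": "PlotPoint", "client_ip": "192.0.2.10", "raw_message": "<Envelope/>"},
    [observer_handler, business_handler],
)
observations.put(None)
worker.join()
```

The observer only copies values onto a queue, so the work added to the request path is small; everything else happens on the monitoring thread.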
SUMMARY AND FUTURE
In the future, we would expect to see more articles that address the details surrounding the performance of SOAP/XML. In addition, we hope to see case study-type articles that examine the not-so-obvious “gotchas” in implementing XML-based solutions. As with the development of any complex application, great care must be taken to ensure that efficient design and coding strategies are employed. The use of Web services is no exception. While you might at first think that Web services applications impose difficulties in terms of performance management, the monitoring opportunities this environment offers would seem to contradict that conclusion. Alternative measurement strategies are emerging with the proliferation of Web services applications. Analysts will no doubt carefully study the different strategies. From a performance perspective, Web services have again confirmed that we live in interesting times!
If you wish to learn more about Web services and performance, I’ve listed several interesting works for your continued study.
- Creating and Using Web Services With the .NET Framework and Visual Studio.Net, Rick Strahl, www.westwind.com/presentations/dotnetwebservices/DotNetWebServices.asp
- Creating Web Services From Existing Applications, Sanat Gersappa, www.indiawebdevelopers.com/technology/XML/creating_web_services.asp
- “Enabling Your Key Services as Web Services,” White Paper, www.webenable.com/business/enabling_key_services.pdf
- Creating Web Services for DB2 UDB for OS/390 Stored Procedures Using WebSphere Studio Version 5, Peter Xu, www7b.software.ibm.com/dmdd/library/techarticle/0304xu/0304xu.html
- Creating Web Services With Visual C++.NET, Mark Schmidt, Richard Simon, www.informit.com/isapi/product_id~%7B9DA03878-E791-441A-86A3-18518F79F2E6%7D/-content/index.asp
- Creating Web Services in Java, Mark Wutka, http://docs.rinet.ru:8083/JSol/ch23.htm
- Creating and Consuming Web Services in Visual Basic, Scott Seely, Deon Schaffer, Eric A. Smith, Addison-Wesley, www.amazon.com/exec/obidos/tg/detail/-/0672321564/102-0675553-092147?v=glance#productdetails