IT Management

When assessing where to deploy your applications, many factors go into the decision. Some are financial, some are political and, hopefully, some are technical. Middleware that is supported across many platforms has largely leveled the playing field for functional requirements. Many middleware products, including web servers, databases and application servers, provide a common set of APIs and functions so that you can host them on your platform of choice.

A fit-for-purpose assessment is a common way to evaluate platform selection factors. However, the results are often skewed toward the consultant’s preferred platform, which leaves you wondering whether the results are trustworthy or whether there is a conflict of interest. If you do not have the expertise to perform the assessment within your own organization, you may default to a simple cost comparison, which can lead to disastrous results.

There are some simple operational requirement assessments you can perform to help with your platform decision making. Operational requirements, sometimes called non-functional requirements, have to do with the qualities of service that the platform provides. Assuming that your middleware is supported on the platforms being compared, you can look at how the platform runs in several key areas. The list of major operational requirements includes performance, availability, scalability, security and many others.

Performance

There are many aspects to performance, but the two major areas are response time and throughput. If your application has end users, you will be concerned about how long they have to wait for a response from your system. You can assess the platform architecture to determine its ability to handle requests efficiently. I have even done “paper and pencil” exercises to estimate path length across each component of the application to get a ballpark idea of the performance. While these estimates are not foolproof, they generally give a good indication of whether a platform can meet your response time SLAs.
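A paper-and-pencil estimate of this kind can also be written down as a few lines of Python. The sketch below is only an illustration: it assumes the request passes through each component once, in sequence, and that you have a rough service time for each. The component names, the timings and the 100 ms SLA are all hypothetical.

```python
def estimate_response_time_ms(component_times_ms):
    """Sum the estimated time spent in each component on the request path.

    Assumes the request visits every component once, one after another.
    """
    return sum(component_times_ms.values())


def meets_sla(component_times_ms, sla_ms):
    """Return True if the estimated response time is within the SLA."""
    return estimate_response_time_ms(component_times_ms) <= sla_ms


# Hypothetical per-component estimates, in milliseconds.
request_path = {
    "web_server": 5.0,
    "app_server": 20.0,
    "database": 40.0,
    "network": 10.0,
}

total_ms = estimate_response_time_ms(request_path)
print(f"Estimated response time: {total_ms} ms")
print(f"Meets 100 ms SLA: {meets_sla(request_path, sla_ms=100.0)}")
```

An estimate like this ignores queuing and contention, so it is best treated as a lower bound on response time rather than a prediction.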

Throughput is a measurement of how much work your system can perform in a given amount of time. Typically, this is used to assess batch workloads where there is no end user. You might have a limited batch window for processing, and you need to complete all of your work within it. The capacity of the system, the speed of the processors and the amount of parallelism are factors that affect your throughput. It may seem intuitive that more parallelism will give your workload higher throughput, but this is not always the case. Many workloads have bottlenecks where locks are held, which serializes the work in that specific area and limits the benefit of adding parallelism.
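One common way to reason about this limit is Amdahl's law, which is not named above but models the same effect: if some fraction of the work must run serially (for example, while a lock is held), adding workers speeds up only the remainder. The sketch below assumes you can estimate that serial fraction; the 10 percent figure is hypothetical.

```python
def amdahl_speedup(serial_fraction, workers):
    """Speedup over a single worker under Amdahl's law.

    serial_fraction: share of the work that cannot run in parallel (0.0 to 1.0).
    workers: number of parallel workers (1 or more).
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0.0 and 1.0")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


# Hypothetical batch job in which 10 percent of the work is serialized.
for workers in (1, 8, 16, 64):
    print(workers, round(amdahl_speedup(0.10, workers), 2))
```

With a 10 percent serial fraction, doubling from 8 to 16 workers raises the speedup only from about 4.7 to 6.4, and no number of workers can push it past 10.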

Availability

You probably have very specific SLAs for allowed downtime of your system. Not meeting your availability requirements can be costly in many ways. You might lose customers and have your brand tarnished by periods of downtime. You might incur regulatory fines if processing does not complete on time.

Platform availability is much more than just mean time between failures (MTBF). It also includes how quickly you can recover from a failure and how much redundancy is in place to minimize the impact of a failure. Sometimes the MTBF includes the built-in redundancy, but you should check that with the platform vendor. Either way, you should assess how quickly the failed system can be repaired and the impact of a failure. Some systems automatically isolate failures and report them before there is any impact. Some even have spare components that sit idle and can replace the failed component when needed.
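The relationship between MTBF, repair time and redundancy can be sketched with the standard steady-state availability formula, MTBF / (MTBF + MTTR), where MTTR is the mean time to repair. The example assumes that redundant components fail independently and that the system stays up while at least one is working, which real platforms only approximate. All figures are hypothetical.

```python
HOURS_PER_YEAR = 8760


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of a single component."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_availability(component_availability, copies):
    """Availability when any one of `copies` identical components suffices.

    Assumes failures are independent, which is an idealization.
    """
    return 1.0 - (1.0 - component_availability) ** copies


def annual_downtime_hours(avail):
    """Expected hours of downtime per year at the given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


# Hypothetical component: fails every 1,000 hours on average, 1 hour to repair.
single = availability(mtbf_hours=1000.0, mttr_hours=1.0)
pair = redundant_availability(single, copies=2)

print(f"Single: {single:.6f}, about {annual_downtime_hours(single):.2f} h/year down")
print(f"Pair:   {pair:.9f}, about {annual_downtime_hours(pair):.4f} h/year down")
```

Halving the repair time improves availability by roughly as much as doubling the MTBF, which is why recovery speed deserves as much attention as failure rates.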

Scalability

The flip side of performance is scalability. Can your system continue to maintain your performance objectives as it grows? Or is there degradation in performance that causes you to miss your SLAs? Does the system scale by adding more nodes (scale out) or increasing the size of a node (scale up)? All of these questions (and more) pertain to growing your system. And, of course, you should also assess how well your applications scale. Some applications are designed poorly and will not scale even if given more system resources.

Most vendors will publish scalability metrics that you can use. If you double your capacity, you should be able to double your workload without seeing any appreciable performance degradation. This is referred to as linear scalability and should be the goal of any system that claims to be scalable.
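One simple way to check published or measured figures against this goal is to compare the growth in throughput with the growth in capacity. In the sketch below, a ratio of 1.0 corresponds to linear scalability and anything lower indicates degradation. The function name and the sample numbers are hypothetical.

```python
def scaling_efficiency(base_capacity, base_throughput, new_capacity, new_throughput):
    """Ratio of throughput growth to capacity growth.

    1.0 means linear scaling; values below 1.0 mean the added capacity
    is not being fully converted into additional throughput.
    """
    capacity_ratio = new_capacity / base_capacity
    throughput_ratio = new_throughput / base_throughput
    return throughput_ratio / capacity_ratio


# Hypothetical measurement: doubling from 4 to 8 nodes.
linear = scaling_efficiency(4, 1000.0, 8, 2000.0)
sublinear = scaling_efficiency(4, 1000.0, 8, 1800.0)

print(f"Linear case:    {linear:.2f}")
print(f"Sublinear case: {sublinear:.2f}")
```

The same calculation works for scale-up if capacity is expressed as processors or memory per node rather than as a node count.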

Security

Computer system security is frequently in the news, and not in a good way. No one wants to see their company on the front page because of a security breach. That kind of publicity can ruin your reputation and negatively impact your bottom line. Many aspects of security are outside the scope of platform selection. Having a good security management system and procedures to go with it will go a long way. But your choice of platform architecture can also have security implications.

At the platform level, the most common security criterion is encryption. You should assess how encryption is performed. Is it done in software or hardware? Is it built into the chip or provided by an add-on card? What encryption algorithms are used? Does the platform support industry standards like PKI? While many of these operational requirements can be stretched a little, security requirements are usually fairly rigid, so you must be familiar with your requirements and validate that they are met.