Storage

If you follow the conversation around software-defined data centers, you’re eventually going to encounter the term “elastic” in conjunction with computing, networking or storage. No, elastic doesn’t refer to some sort of new media for storing data or to some sort of slingshot method for transporting data faster than the speed of light. Nothing so grand. 

Elastic has to do with provisioning resources and services to support workloads, according to the experts, and it works somewhat like the “Goldilocks and the Three Bears” story you heard when you were a kid. Provision a resource to a workload based on its normal requirements and you will likely experience periods when high demand outstrips resource availability: The resource will be insufficient to handle the load. Alternatively, you could provision the resource for peak operating requirements, but then, most of the time, the resource will be over-provisioned, with capacity going to waste.

Elastic resource provisioning achieves “just right” status in this Goldilocks tale by shifting with demand. When you need more, you get more. When you need less, some of the resource you’ve been provisioned is reclaimed for provisioning elsewhere. 
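
To make the idea concrete, here is a minimal Python sketch of what such a provisioning loop might do; the function, headroom figure and capacity bounds are invented for illustration and aren't drawn from any particular product.

# A deliberately simple control loop (hypothetical, not from any vendor's product):
# size the allocation to current demand plus a little headroom, bounded above
# and below so it never starves the workload or hoards the shared pool.

def elastic_allocation(demand_tb: float,
                       headroom: float = 0.2,
                       floor_tb: float = 10.0,
                       ceiling_tb: float = 1000.0) -> float:
    """Return a capacity allocation (TB) that tracks demand plus headroom."""
    target = demand_tb * (1.0 + headroom)
    return max(floor_tb, min(ceiling_tb, target))

if __name__ == "__main__":
    for demand in (80.0, 250.0, 400.0, 120.0, 60.0):   # simulated workload swings
        print(f"demand={demand:6.1f} TB -> allocation={elastic_allocation(demand):6.1f} TB")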

In a perfect world, such elasticity would be automated and optimized. It would be delivered by some uber-intelligent resource controller, whether centralized or federated, spanning all infrastructure components. It would be easy to set up and configure and would work across all brands of technology, regardless of which vendor’s name is on the outside of the box.

Well, I’m all for that. Let’s bring on this elasticity thing. IBM announced “elastic storage” last May, but alas, it was only a code name for something as yet to be determined. When the company talked about it, it acknowledged that other vendors had gotten there first: Amazon was calling part of its vision “Elastic Block Store,” while EMC had announced “Elastic Cloud Storage” earlier in the year. The situation was ripe for producing another dollop of confusion to add to that parfait of confusion called the software-defined data center.

With the term elastic storage, IBM was, in fact, referring to its General Parallel File System (GPFS) and to products based on it. GPFS is great stuff for optimizing data access and eliminating chokepoints in distributed data access paths. I guess you could twist that into something elastic insofar as it enables the allocation of additional capacity to a global namespace on demand. You basically get a file server of limitless capacity, which might have some benefits in, say, Big Data analytics.
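
As a rough illustration of what allocating capacity to a global namespace on demand means, here is a toy Python model; it is not GPFS code, and the class and node names are made up.

# A toy model of a global namespace that grows on demand; this illustrates
# the concept only, and every name here is invented.

class GlobalNamespace:
    def __init__(self) -> None:
        self.nodes: dict[str, float] = {}   # storage node name -> capacity in TB

    def add_node(self, name: str, capacity_tb: float) -> None:
        """Attach another storage node; the namespace simply gets bigger."""
        self.nodes[name] = capacity_tb

    @property
    def total_capacity_tb(self) -> float:
        return sum(self.nodes.values())

ns = GlobalNamespace()
ns.add_node("nsd01", 500.0)
ns.add_node("nsd02", 500.0)                 # capacity added on demand, no remount
print(ns.total_capacity_tb)                 # 1000.0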

Being an aficionado of the Linear Tape File System (LTFS), which enables tape libraries to become active storage repositories for file data, I particularly like the pairing of LTFS and GPFS: the two can be integrated to create a hierarchical storage management solution for files. That kind of thing provides genuine elasticity by continuously freeing up performance storage, moving less active data to cheaper, “slower” capacity tiers built from capacious serial-attached SCSI (SAS)/serial ATA (SATA) disk and tape.
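
That kind of tiering boils down to a simple rule: if a file hasn't been touched in a while, demote it to a cheaper tier. The Python sketch below captures the idea; the threshold, paths and field names are hypothetical, and a real implementation would also move the data and leave a recall stub behind.

# A hypothetical hierarchical storage management (HSM) policy in miniature:
# files untouched for longer than a threshold are demoted from the
# performance tier to a capacity tier.
import time
from dataclasses import dataclass

@dataclass
class FileRecord:
    path: str
    last_access: float          # epoch seconds
    tier: str = "performance"

def migrate_cold_files(files: list[FileRecord], max_idle_days: float = 90.0) -> None:
    cutoff = time.time() - max_idle_days * 86400
    for f in files:
        if f.tier == "performance" and f.last_access < cutoff:
            f.tier = "capacity"  # e.g., capacious SAS/SATA disk or LTFS tape

files = [FileRecord("/gpfs/proj/archive.dat", time.time() - 200 * 86400),
         FileRecord("/gpfs/proj/active.log", time.time() - 2 * 86400)]
migrate_cold_files(files)
print([(f.path, f.tier) for f in files])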

Still, even in this context, the term elastic tends to be a bit too stretchy to be meaningful. IBM describes three different types of software-defined storage, all of which are supposed to manifest the quality of elasticity. One is the GPFS storage. The second is SmartCloud VSC, which is storage virtualized with SAN Volume Controller plus capacity-optimizing technology in the form of Storwize compression. The third is the IBM XIV Storage System, an array whose controller runs the XIV software IBM purchased a couple of years ago, providing other forms of capacity and storage service allocation and deallocation.

From where I’m standing, the lynchpin of any elastic storage play comes down to one thing: the ability to see the status of resources—both physical kit and software services—at any given moment. This is a lot more challenging than it sounds.

For years, distributed storage vendors have gone out of their way to isolate their gear from competitors by limiting its ability to be managed in common with peers bearing other logos. According to Forrester, the only way to achieve any kind of efficiency in storage was to purchase everything from a single manufacturer, assuming it had a deep enough bench of products to cover all the storage flavors you needed. The challenge was that even vendors with deep product benches tended to have very different architectures across those products, usually because the products were acquired from third-party vendors or sold through OEM rebranding agreements. Bottom line: Lacking a common pedigree, the products usually provided no coherent method for common or unified resource management.

This will need to be addressed before elasticity can go mainstream in the enterprise. IBM is paving a path forward through its support for the OpenStack architecture, which enables unified plumbing and service management (albeit only for those who embrace OpenStack). Frankly, I would rather see a hell-bent-for-leather effort around a truly universal management paradigm that leverages the World Wide Web Consortium's (W3C's) RESTful protocols.
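
To show what a truly universal RESTful management interface might look like, here is a small Python sketch; the endpoint URL and JSON field names are assumptions made for the example, not any existing product's API.

# A sketch of a vendor-neutral RESTful status query: every array, whatever
# the logo, would expose current status as plain JSON over HTTP.
import json
import urllib.request

def get_resource_status(base_url: str, resource_id: str) -> dict:
    """Fetch the live status of one storage resource from a REST endpoint."""
    with urllib.request.urlopen(f"{base_url}/storage/resources/{resource_id}") as resp:
        return json.load(resp)

# Usage, assuming such an endpoint existed:
# status = get_resource_status("https://mgmt.example.com/api/v1", "pool-07")
# print(status["capacity_used_pct"], status["health"])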

Word up to IBM: the real innovator in RESTful management was the team out of Seagate that became X-IO's REST engineering team. Those engineers recently got their pink slips as X-IO shifted its focus toward moving boxes rather than innovating in management technology, so there's plenty of talent available if you want to do something with 2009's Project Zero, which IBM announced as an initiative to REST-enable all of its products going forward. The initiative was aptly named, since no big REST announcements have been forthcoming from Big Blue since the original Project Zero launch.

Imagine what could be done with real-time RESTful monitoring and management combined with elastic provisioning, all coordinated by Watson. The mind boggles.