Somehow, I think Lewis Carroll would have loved the Linux world; it’s often seen but usually absurd. No matter what I plan to talk about, other interesting things always come to mind when this column is due. This time we’ll discuss an upcoming show and some interesting technology coming to Linux by way of the High-Performance Computing (HPC) community. We’ll also look at the Alice project at Carnegie Mellon University (CMU), an engaging way to teach programming skills that is now available in the Linux on System z environment.
First, the Ohio LinuxFest 2010 dates have been announced; it will be held Sept. 10-12 in Columbus, OH. It’s a large, volunteer-run show focused on Linux on all platforms and on open source projects in general. Registration rates haven’t yet been set, but they’ve historically been very low. The interesting connection with Linux on System z is that the Columbus area is littered with substantial mainframe and Linux on System z sites (for example, Nationwide Insurance), and there seems to be a grassroots movement to cover open source efforts outside the Intel world, including some sessions on z/VM open source tools and Linux on System z management. If you’re thinking of attending a Linux-oriented conference this year, it’s an inexpensive way to make your conference dollars count. To learn more, visit www.ohiolinux.org.
Second, Linus Torvalds approved a series of kernel commits that introduced a fascinating new high-performance distributed file system called Ceph to the mainstream Linux kernel sources. Ceph (more details at http://ceph.newdream.net) is a file system that provides petabyte-scale parallel, network-based data storage across multiple servers and storage technologies. It’s designed to work in environments with different kinds of physical storage; it has automatic data location balancing (i.e., it will automatically reshuffle data to optimize performance) and fault management (with petabyte-scale data sets, something is bound to be broken almost all the time). Ceph also adds a policy component that lets you tell the file system, for example, that “any file stored in this directory needs to be replicated in at least three physical locations that can’t share a cabinet, power supply, or disk shelf.”
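To make that kind of placement policy a little more concrete, here is a minimal Python sketch of the idea. It is not Ceph’s actual placement algorithm or API; the Device class, the place_replicas function, and the sample device names are all hypothetical and invented for illustration. The sketch ranks devices by a hash of the file path and device name, then greedily keeps the first three that share no cabinet, power supply, or disk shelf.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """A storage device and the failure domains it sits in (hypothetical model)."""
    name: str
    cabinet: str
    power_supply: str
    disk_shelf: str


def _score(file_path: str, device: Device) -> int:
    """Deterministic pseudo-random score for a (file, device) pair."""
    digest = hashlib.sha256(f"{file_path}|{device.name}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place_replicas(file_path, devices, copies=3):
    """Pick `copies` devices that share no cabinet, power supply, or disk shelf.

    Greedy illustration only: it can fail on layouts where a smarter
    search would still find a valid placement.
    """
    # Rank devices by hash so each file gets its own stable ordering.
    ranked = sorted(devices, key=lambda d: _score(file_path, d), reverse=True)
    chosen = []
    for candidate in ranked:
        conflict = any(
            candidate.cabinet == c.cabinet
            or candidate.power_supply == c.power_supply
            or candidate.disk_shelf == c.disk_shelf
            for c in chosen
        )
        if not conflict:
            chosen.append(candidate)
        if len(chosen) == copies:
            return chosen
    raise ValueError("greedy pass could not satisfy the placement policy")


# Hypothetical inventory: three cabinets, each with its own power supply.
DEVICES = [
    Device("osd0", "cab-a", "psu-1", "shelf-1"),
    Device("osd1", "cab-a", "psu-1", "shelf-2"),
    Device("osd2", "cab-b", "psu-2", "shelf-3"),
    Device("osd3", "cab-b", "psu-2", "shelf-4"),
    Device("osd4", "cab-c", "psu-3", "shelf-5"),
    Device("osd5", "cab-c", "psu-3", "shelf-6"),
]

if __name__ == "__main__":
    for device in place_replicas("/archive/report.dat", DEVICES):
        print(device.name, device.cabinet, device.power_supply, device.disk_shelf)
```

Because the ranking depends only on the file path and the device names, any client can recompute where a file’s copies live without consulting a central table, which is the general appeal of hash-based placement. A real system also has to handle device weights, failures, and rebalancing, all of which this sketch ignores.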
Ceph was originally developed at the University of California Santa Cruz. A document available at http://ceph.newdream.net/weil-thesis.pdf describes the function of Ceph in detail and provides sample performance numbers, which show near-wire-speed performance across all kinds of different loads and failure scenarios. Here at Sine Nomine, we have it operational with both Intel and Linux on System z and are continuing to investigate how to get HSM function working.
Last, the Alice project at CMU. The brainchild of Caitlin Kelleher of Washington University and championed by the late Randy Pausch, Alice provides an exploratory environment for application programming targeted at 3D objects and simulation software such as the game “The Sims” (in fact, Electronic Arts donated several motion and object libraries). Recognizing the growing availability of the System z platform, and the use of virtual machines to deploy collaborative and innovative environments, Alice software packages (in Red Hat Package Manager [RPM] format) are now available for Linux on System z. Alice has been used to introduce the ideas of what is programmable and the logic of programming to non-programmers; it may be just the thing to introduce to your manager to show him or her what is doable in a short time frame. Information about Alice is available at www.cmu.edu/homepage/computing/2009/winter/alice-3-software.shtml.
In the next issue, we’ll take another look at RHEL 6.1 for System z, some improvements in the kernel API for scheduling processes as groups, and a few other goodies tossed in by our friends in Boeblingen, Germany. Also, as promised, we’ll provide more details on xCAT in an upcoming column.