Sep 6 ’12
Should Auld Programming Languages Be Forgot?
Before you answer, consider this quote from the FAQ section of Google’s blog post about the first release of its fascinating new programming language, Go:
Programming today involves too much bookkeeping, repetition, and clerical work. As Dick Gabriel says, “Old programs read like quiet conversations between a well-spoken research worker and a well-studied mechanical colleague, not as a debate with a compiler. Who'd have guessed sophistication bought such noise?” The sophistication is worthwhile—no one wants to go back to the old languages—but can it be more quietly achieved?
As someone who was peripherally involved in the creation of today’s ubiquitous object-oriented programming languages, and who has also written production code in COBOL, FORTRAN, C, and a little-known gem called MODEL 204 User Language, my answer to whether it’s time for most IT shops with mainframes to toss away old programming languages is: not yet. Why? Ah, thereby hangs a tale.
The Means Don’t Always Justify the Object
What follows is a grossly generalized hop, skip, and jump over the history of programming languages. It starts by recognizing a rough dividing line around 1990 between pre-1990 “straight-line” languages and post-1990 “semi-object-oriented” languages. Today, if you’re a mainframe programmer (or even if you’re tending old Windows C programs), in most cases you’re doing straight-line programming. If you’re a Web programmer, by and large, you’re doing object-oriented programming.
At a 20,000-foot level, the difference between the two is as follows: Straight-line programming says to the machine executing the program, do this, then do this, then do this. It’s an intuitive way of programming. Object-oriented programming requires a mind shift: It says, view a program as a set of independent pieces of code (“objects”) sending messages to each other, with the program itself shifting from one object state to another. Such a program is assumed to run forever; an example is software on one machine handling communications from other machines. The machine waits for an input, changes state and responds appropriately depending on that input, then waits for the next input.
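The contrast can be sketched in a few lines of Go (the language under discussion). This is a minimal illustration, not anyone's real code; the function and type names are invented for the example.

```go
package main

import "fmt"

// Straight-line style: do this, then this, then this,
// top to bottom, and the program ends.
func straightLine() string {
	total := 0
	for _, n := range []int{1, 2, 3} {
		total += n // one step after another
	}
	return fmt.Sprintf("total=%d", total)
}

// Object/state style: a long-lived object waits for inputs,
// shifts state depending on each one, and responds; it is
// assumed to run "forever."
type connState int

const (
	idle connState = iota
	active
)

type connection struct{ state connState }

func (c *connection) handle(input string) string {
	switch {
	case c.state == idle && input == "open":
		c.state = active
		return "opened"
	case c.state == active && input == "data":
		return "handled data"
	case c.state == active && input == "close":
		c.state = idle
		return "closed"
	default:
		return "ignored"
	}
}

func main() {
	fmt.Println(straightLine())
	c := &connection{}
	for _, in := range []string{"open", "data", "close"} {
		fmt.Println(c.handle(in))
	}
}
```

The first function runs once and finishes; the second is a loop that never conceptually ends, reacting to whatever arrives next.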
The virtues of object-oriented programming have been rehashed so often we tend to forget there are many problems this approach didn’t solve. Among the key virtues of object-oriented programming:
• It’s superb for representing Windows-type user interfaces with their icons and “virtual screens.”
• It provides a way of improving program quality by “proving” the correctness of a piece of code without needing to revisit it in another program; it does this by isolating the program as a separate “class.” For example, many of today’s object-oriented programming languages allow “assertions” in classes to double-check error-handling.
• It helps decouple software from a particular machine; e.g., a Java program can run with its own little “virtual machine” that’s easy to transfer online from one machine or operating system to another.
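The second virtue, isolating code behind a class boundary and double-checking it with assertions, can be sketched in Go. Go has neither classes nor an assert statement, so this is only an analogue: a struct whose state changes only through methods, with a panic-based invariant check standing in for a class assertion. All names are illustrative.

```go
package main

import "fmt"

// account is a self-contained "class": its balance can only be
// changed through its methods, so the correctness of that state
// can be argued in one place rather than in every program using it.
type account struct{ balance int }

// checkInvariant plays the role of a class assertion: it
// double-checks a condition that should always hold.
func (a *account) checkInvariant() {
	if a.balance < 0 {
		panic("invariant violated: negative balance")
	}
}

func (a *account) withdraw(n int) error {
	if n > a.balance {
		return fmt.Errorf("insufficient funds")
	}
	a.balance -= n
	a.checkInvariant() // assert after every state change
	return nil
}

func main() {
	a := &account{balance: 100}
	if err := a.withdraw(30); err != nil {
		fmt.Println(err)
	}
	fmt.Println(a.balance)
}
```

Because no other code can touch `balance` directly, the invariant check inside the methods covers every program that ever uses the type.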
However, for various reasons, object-oriented programming languages as implemented were and are deficient in two related areas:
• Data processing
• Hardware and software parallelism
Not Yet Solved
The fact that “object” is sometimes used for a large, amorphous block of data shouldn’t blind us to the fact that, to an object-oriented program, an “object” is, or should be, a neat, separable bundle of code and its data. That separation keeps programming simple and lets the data, as well as the code, be “proved correct.” It’s also a real problem because, to scale to today’s petabytes, data must be shared and compressed.
The solution of object-oriented languages is to punt: essentially, to insert SQL or variants such as XML Query Language (XQL) in the middle of an object class, and wait for the query result to come back from the enterprise database’s straight-line program. It’s called the object-relational mismatch, and it continues to cost in coding time and query delays. Moreover, over the last decade, a generation of programmers has sprung up who haven’t a clue how to optimize querying and think it doesn’t matter.
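The mismatch shows up as translation boilerplate between the two worlds. The Go sketch below uses stand-in string rows rather than a real database connection, to keep the example self-contained; the `customer` type and `rowToCustomer` function are invented for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// customer is the object-oriented side: a neat, typed bundle of data.
type customer struct {
	id   int
	name string
}

// A query result comes back from the relational side as flat rows of
// untyped columns. The "mismatch" is this translation layer, which
// someone must write by hand or generate with a mapping framework.
func rowToCustomer(row []string) (customer, error) {
	if len(row) != 2 {
		return customer{}, fmt.Errorf("expected 2 columns, got %d", len(row))
	}
	id, err := strconv.Atoi(row[0])
	if err != nil {
		return customer{}, fmt.Errorf("bad id %q: %v", row[0], err)
	}
	return customer{id: id, name: row[1]}, nil
}

func main() {
	// Stand-in for a result set returned by an embedded SQL query.
	rows := [][]string{{"1", "Ada"}, {"2", "Grace"}}
	for _, r := range rows {
		c, err := rowToCustomer(r)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%d: %s\n", c.id, c.name)
	}
}
```

Every object type touched by a query needs a layer like this, and none of it tells the programmer anything about whether the query itself is efficient.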
Parallelism is a tricky subject, but it boils down to three kinds to worry about: concurrency, recursive parallelism, and all other parallelism.
• Concurrency is when you can’t truly run things simultaneously but want to fake it as much as possible by interleaving processes. There’s a lot of that in data processing, where updates must be coordinated because data is shared between processes.
• Recursive parallelism is when you have to keep dividing the process into subprocesses and run them in parallel in order to scale. The problem here is that, physically, you quickly run out of room to subdivide.
• The other parallelism is the easy kind: Everything is isolated, so you just run as many copies at once as you can.
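The three kinds can be sketched concretely in Go, whose goroutines happen to make the distinctions easy to see. This is an illustrative sketch with invented function names, not a claim about how any particular system is built.

```go
package main

import (
	"fmt"
	"sync"
)

// Concurrency: many goroutines interleave updates to shared data,
// so access must be coordinated (here, with a mutex).
func sharedCounter(n int) int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	count := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return count
}

// Recursive parallelism: keep dividing the work, run the halves in
// parallel, and combine the results on the way back up.
func parallelSum(nums []int) int {
	if len(nums) <= 2 {
		total := 0
		for _, v := range nums {
			total += v
		}
		return total
	}
	mid := len(nums) / 2
	ch := make(chan int)
	go func() { ch <- parallelSum(nums[:mid]) }()
	right := parallelSum(nums[mid:])
	return <-ch + right
}

// The easy kind: each task is fully isolated, so copies simply run
// side by side with no coordination beyond waiting for them all.
func isolatedSquares(nums []int) []int {
	out := make([]int, len(nums))
	var wg sync.WaitGroup
	for i, v := range nums {
		wg.Add(1)
		go func(i, v int) {
			defer wg.Done()
			out[i] = v * v // writes to distinct slots; nothing shared
		}(i, v)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(sharedCounter(100))
	fmt.Println(parallelSum([]int{1, 2, 3, 4, 5}))
	fmt.Println(isolatedSquares([]int{1, 2, 3}))
}
```

Note how different the coordination machinery is in each case: a lock for the first, a divide-and-combine channel for the second, and almost nothing for the third. A language or program that treats all three alike gives the tuner nothing to work with.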
Object-oriented programming provides no way to distinguish between the three types or handle them differently, so it doesn’t really add anything in terms of customizing a program to handle new hardware features or tune for a particular kind of parallelism. Moreover, because it doesn’t handle data processing that well, it doesn’t do well at handling concurrency in particular.
This isn’t to say straight-line programming languages do that much better in and of themselves. However, because they’ve had 20 to 30 extra years to sort out how to handle concurrency and recursive parallelism, anyone replacing a front-end querying program written in PL/C with an Enterprise JavaBean is going to see a performance and scalability hit that may well grow over time. Moreover, the decrease in query-optimization skills as you phase out straight-line programmers is a major long-term cost. You may not see that cost, given short-run accounting, but it’s there.
What’s the Solution?
Frankly, the solution is for companies such as Google to start focusing on the problem. Google’s Go is entirely admirable and worth using within the bounded world of Web and object-oriented programming. However, neither in the FAQ nor in other efforts such as Python is there real recognition of the problems of object-oriented programming with respect to data processing and parallelism, much less an effort to solve those problems. Quick note: There are some promising advances in multi-core hardware parallelism support in Go, but they haven’t been proved out yet.
Until that happens, however, the solution for IT is simply to accept that it will continue to see some need for the data-processing and concurrency/recursive-parallelism scalability of so-called “legacy” mainframe and straight-line programs, and that means some future for COBOL and the like. It’s certainly possible, now, to provide veneers and re-engineering solutions that move the result to another platform, translate it into another straight-line programming language (as C++ can be made to be), or wrap it in a Web service veneer, so you don’t necessarily have to deal with all the outdated quirks of straight-line programming languages. However, it’s not worth doing that for every legacy program; and, in any case, it’s important to preserve the querying skills of past COBOL and MODEL 204 User Language generations.
The combined Veryant and Google announcements deliver a clear message: It’s not yet time to throw the oldster out with the bath water. Until some major advances in object-oriented programming languages arrive, the best IT strategy is to do nothing and do it with great skill. More precisely, we should anticipate developing new straight-line code for the foreseeable future and preserve COBOL and other straight-line programs and skills where appropriate. Should auld programming languages be forgot? Sorry, not yet.