January 2001   
Into the New Millennium with...MIDI?
Keeping an Old Friend Young


Well, this time it's for real. The new millennium. No more arguments about when it begins--it's here. The Y2K warnings and the Y2K jokes can be put away forever; the census has been taken, the presidential election circus (I hope!) is over and we're now in 2001, indisputably the 21st century.

Arthur Clarke and Stanley Kubrick were a bit overoptimistic about what we were supposed to have by this year: There's no permanent space station above the Earth, and no Howard Johnson's restaurant in it (or anywhere else for that matter), and no one's building a nuclear-powered HAL/IBM-controlled starship to take humans to Jupiter's moons, or even our own. In his earlier writings, Clarke was right on the money about communications satellites covering the Earth, but he was dead wrong about the role of garbage at the turn of the 21st century: He saw it as a potential fuel source, not something to be delivered over microwave, twisted copper pair, coaxial cable and optical fiber to every home and office in every corner of the globe by the petabyte. And he never foresaw the culture of the personal computer, the ubiquity of the Internet, the dynamic forces that are pulling the entertainment industry apart and putting it (maybe) back together in an entirely new way...or MIDI.

MIDI? "Why would anyone want to talk about MIDI in 2001?" I hear you cry. "I thought MIDI was like so last century!" Well, it was. But it's a little early to be digging its grave and dancing (no doubt using downloaded 124bpm loops) on it.

I'll admit it. I'm a MIDI-holic. Yes, my name is Paul, and I use MIDI. A lot. I compose with it, perform with it, mix with it, process with it, teach it and, yes, write about it. And except for the last one or two items, I'll bet most of you do many of the same things. It's become so commonplace, so mundane that we don't even think about it anymore. And it's true that it's not very exciting, compared with the tools we now have for manipulating real audio. But even though messing around with MIDI data may not be as immediately gratifying as running old Rick James samples through Acid, it's still an important part of what our industry is all about (especially among those apparently dwindling numbers of us who value originality).

But MIDI is old stuff, right? And nothing's happening with it, right? Yes, it is old stuff (the MIDI Specification, after literally hundreds of changes in its 17-plus years, is still referred to as "1.0"), but to say that it's moribund is to ignore some very important work that's being done today to keep it useful in the age of digital video, T3 Internet connections and gigahertz desktop computers.

A lot of the work, not surprisingly, has been on the consumer side of things, where marketers and manufacturers see potential numbers that exceed by orders of magnitude the size of the professional audio and music markets. MIDI is still viewed as an efficient and highly flexible way of handling music for games, Web sites and similar applications that require either low bandwidth or a high degree of interactivity. It's still a lot easier--and more convincing, if it's done right--to make a MIDI file instantaneously change the mood of a piece of background music in a game than it is with digital recordings, no matter how many tracks you might have to play with. And when a game designer has used up all available CPU speed and RAM on polygon generation and has forgotten to leave any room for audio, there's always enough space to slip in a MIDI file. As for the Internet, when you are dealing with typical dial-up connections (which most people still have), any audio file, even after you've crunched it through the compression algorithms of MP3 or Real Audio, goes down the pipeline way slower than a MIDI file.
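To put rough numbers on that dial-up comparison, here is a back-of-the-envelope sketch. Every figure in it is my own assumption for illustration (a nominal 56 kbit/s modem, a three-minute piece, a 128 kbit/s MP3, a 40 KB MIDI file), and it ignores protocol overhead entirely.

```python
def download_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and line noise."""
    return size_bytes * 8 / link_bits_per_second


# All of these values are assumptions chosen for illustration only.
MODEM_BPS = 56_000           # nominal dial-up rate, bits per second
SONG_SECONDS = 180           # a three-minute piece of music
MP3_BITRATE = 128_000        # MP3 encoding rate, bits per second
MIDI_FILE_BYTES = 40 * 1024  # a fairly busy Standard MIDI File

mp3_bytes = MP3_BITRATE * SONG_SECONDS / 8
mp3_time = download_seconds(mp3_bytes, MODEM_BPS)
midi_time = download_seconds(MIDI_FILE_BYTES, MODEM_BPS)

print(f"MP3:  {mp3_bytes / 1e6:.2f} MB, about {mp3_time / 60:.1f} minutes to download")
print(f"MIDI: {MIDI_FILE_BYTES / 1024:.0f} KB, about {midi_time:.1f} seconds to download")
```

Under those assumptions the compressed audio takes longer to arrive than it does to play, while the MIDI file is there in a few seconds.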

While many consumers still associate MIDI with the cheesy FM sounds of early PC sound cards, even the cheapest "wavetable"-based chipsets of today sound a lot more respectable than that. (Wavetable is actually a misnomer for these devices, because they are, in fact, sample-based, and true wavetable synthesis is something completely different. But I won't get into that now.)

Much of the credit for the improvements can be taken by the MIDI industry's adoption--through its administrative body, the MIDI Manufacturers Association (MMA, www.midi.org)--of Downloadable Samples Level 1 (DLS-1). The significance of DLS-1, which is now almost four years old, is that instead of being stuck with the sounds a manufacturer puts into a synthesizer chip's ROM, or the 128 sounds in the original General MIDI specification, a composer or sound designer can create custom sounds in the form of samples. These can be downloaded as a block into dedicated RAM on the chip and then called up quickly and polyphonically from a MIDI file. In many ways, this makes for the best of both worlds: A 2MB sound set and a few hundred kilobytes of MIDI data can provide literally hours of high-quality, completely interactive music. (Another technology that follows the same general idea is Beatnik, Thomas Dolby Robertson's contribution to music on the Web.)

But DLS-1 didn't solve everybody's problems. Even before it was developed, Creative Technology, the parent company of E-mu and Ensoniq, was working on its own version of this concept, calling it "SoundFonts," which was similar to DLS-1 but with more advanced performance features.

DLS-1 and SoundFonts threatened to cancel each other out, until Creative and the rest of the MIDI industry (as well as the MIT Media Lab and some other interested parties) came up with a higher-functioning standard that was acceptable to everyone, and not proprietary to anyone (as SoundFonts was). This is now known, not surprisingly, as DLS Level 2. The major improvements in DLS-2 are dynamic filters and matrix-based modulation, two features that are essential to any professional-level sampler or synthesizer. DLS-2 was formally adopted by the MMA in the summer of 1999 and has reached beyond the MIDI community to become part of the MPEG-4 standard, where it is called "Structured Audio Sample Bank Format."

The first DLS-2 chips are about to hit the market, and one manufacturer claims that by the end of 2001, 40 to 60% of all computers being made will have DLS-2 sounds built right into the motherboard. On the game side, Microsoft is supporting the new standard in its upcoming Xbox platform.

Running parallel to DLS and DLS-2 has been the adoption of General MIDI Level 2. Before the ink was even dry on the original General MIDI Specification, which was supposed to ensure a high degree of file compatibility across many synthesis platforms (again, mainly in the consumer realm), Roland and Yamaha announced "extensions" to GM that were, of course, incompatible with each other. These extensions gave their devices more polyphony, effects like reverb and chorus, and an expanded sound palette. Other manufacturers of domestic keyboards and low-cost sound modules wanted to be able to improve the capabilities of their products, as well, but didn't want to have to license technology from Roland or Yamaha, or invent their own. So, they clamored for a nonproprietary expansion to GM.

GM Level 2 (formally adopted in November 1999) increases the minimum polyphony of an instrument from 16 to 32 voices, and it defines more controllers, and defines them more precisely, than the original spec did. For example, the new spec includes a formula for mapping MIDI volume controller values to amplitude in dB. (This is largely in response to a survey I designed in the early '90s on behalf of the MMA, in which it was found that controllers were being used very differently by different manufacturers.)
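For the curious, here is a minimal sketch of what such a mapping looks like. The curve usually quoted from the MMA's recommended practice is 40 times the base-10 log of the controller value divided by 127; I'm treating that as an assumption here, so check the published spec before building anything on it. The function name is mine.

```python
import math


def cc7_to_db(value: int) -> float:
    """Map a MIDI volume controller value (0-127) to gain in dB.

    Assumes the curve gain = 40 * log10(value / 127); verify this
    against the published GM Level 2 documents before relying on it.
    """
    if not 0 <= value <= 127:
        raise ValueError("controller values are 7-bit: 0 through 127")
    if value == 0:
        return float("-inf")  # full mute; the logarithm is undefined at zero
    return 40.0 * math.log10(value / 127.0)


for v in (127, 100, 64, 32, 1):
    print(f"CC7 = {v:3d} -> {cc7_to_db(v):7.2f} dB")
```

On this curve a controller value of 64 lands roughly 12 dB below full volume, not at half the decibel range, which is exactly the sort of thing manufacturers used to disagree about.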

GM Level 2 also mandates and defines effects and significantly increases the number of available sounds, both instrumental and "rhythm," or percussive, using Bank Select commands to augment the 128 program changes. The advantage of a GM-2 instrument over a DLS-1 instrument is simply that there is no sound set at all to download. So in applications where there isn't time or RAM for a downloadable sound set, music can play instantly; even at dial-up connection speeds, a MIDI file playing over the Internet is indistinguishable from one playing over a MIDI cable.
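Here is a small sketch of how a bank message extends the reach of a program change. In MIDI 1.0, Bank Select is controller 0 (the most significant byte) paired with controller 32 (the least significant byte), and it takes effect at the next Program Change. The helper name is my own, and the bank value 0x79 in the usage lines is the number commonly cited for GM-2 melodic banks; treat it as an assumption.

```python
def bank_select_and_program(channel: int, bank_msb: int, bank_lsb: int, program: int) -> bytes:
    """Build the raw bytes for Bank Select (CC 0 and CC 32) plus Program Change."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15 (shown to users as 1-16)")
    for name, value in (("bank_msb", bank_msb), ("bank_lsb", bank_lsb), ("program", program)):
        if not 0 <= value <= 127:
            raise ValueError(f"{name} must be a 7-bit value, 0-127")
    control_change = 0xB0 | channel  # Control Change status byte for this channel
    program_change = 0xC0 | channel  # Program Change status byte for this channel
    return bytes([
        control_change, 0x00, bank_msb,  # Bank Select, most significant byte
        control_change, 0x20, bank_lsb,  # Bank Select, least significant byte
        program_change, program,         # the bank applies to this program change
    ])


# Assumed example: a variation bank of program 0 on the first channel.
message = bank_select_and_program(channel=0, bank_msb=0x79, bank_lsb=1, program=0)
print(message.hex(" "))
```

Two 7-bit bank bytes times 128 programs is far more address space than any sound set will ever fill, which is the point: the numbering scheme stops being the bottleneck.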

The first units to adopt GM Level 2 come from two companies: Roland, interestingly enough, in the latest models of its Sound Canvas line, which started the whole General MIDI movement; and Korg, surprisingly, in its high-end Triton rack. More are expected to follow. Be sure to check out the product intros at winter NAMM this month.

But things are happening at the other end of MIDI, too--the professional end. The lowly MIDI cable, with its 31,250-bit/second speed, is ridiculously slow compared to today's networking and busing capabilities, and that fact has not been lost on the MIDI developer community. While MIDI over SCSI never was practical (SCSI is fast, but it works in spurts, which is okay for buffered digital audio, but not okay for the real-time control that MIDI requires), there have been strong efforts to incorporate MIDI with the newest networking protocols: USB and IEEE-1394, or FireWire.
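Just how slow is that? A quick sketch of the arithmetic: each MIDI byte goes down the cable as 10 bits (a start bit, eight data bits and a stop bit), so the wire time for any message follows directly from the bit rate. The function and constant names are mine.

```python
MIDI_BITS_PER_SECOND = 31_250
BITS_PER_BYTE_ON_WIRE = 10  # 1 start bit + 8 data bits + 1 stop bit


def wire_time_ms(num_bytes: int) -> float:
    """Time in milliseconds to send num_bytes down a standard MIDI cable."""
    return num_bytes * BITS_PER_BYTE_ON_WIRE / MIDI_BITS_PER_SECOND * 1000


note_on = wire_time_ms(3)                      # status byte + note number + velocity
chord_plain = wire_time_ms(10 * 3)             # ten full 3-byte note-ons
chord_running_status = wire_time_ms(1 + 10 * 2)  # one status byte, then ten data pairs

print(f"One note-on:                   {note_on:.2f} ms")
print(f"Ten-note chord:                {chord_plain:.2f} ms")
print(f"Ten notes with running status: {chord_running_status:.2f} ms")
```

A single note-on takes just under a millisecond, so the last note of a fat ten-finger chord arrives several milliseconds after the first, before any interface or computer has added a delay of its own.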

USB MIDI interfaces have been around since early 1999. After Apple released the first USB Macintoshes, manufacturers like Emagic, Roland, Steinberg and Mark of the Unicorn scrambled to put out USB-compatible MIDI interfaces. Now there are a dozen or more on the market, from simple palm-sized 1-in, 1-out boxes to rackmount multicable interfaces with SMPTE and audio I/O. Happily, a standard method for putting MIDI on a USB cable is defined by the USB Implementers Forum (USB-IF, www.usb.org). Unhappily, the MIDI Manufacturers Association never endorsed the USB MIDI spec--and you'll see why in a moment.

USB has been very successful in replacing, or at least displacing, many of the disparate computer-networking formats like serial, parallel, PCI or SCSI ports. Printers, modems, scanners, removable media drives and gadgets we didn't even know we needed just a couple of years ago are now using USB cables. There are great advantages to USB, such as the ability to connect up to 127 devices of all kinds to a single computer (using bridges and hubs), automatic configuration (no more IRQ or SCSI ID nightmares), the ability to "hot-swap" devices, and higher potential throughput than any of the formats it replaces, with the exception of SCSI.

So what's the problem with MIDI? According to Jim Wright at IBM Research, a longtime member of the MMA Technical Standards Board and chairman of the organization's working group concerned with new transports, USB has timing problems that make it problematic for MIDI. He has conducted tests comparing "classic" (i.e., serial, parallel, PCI or PCMCIA) interfaces against USB interfaces, looking at their round-trip latency (the amount of time it takes for a MIDI event to get in and out of the interface) and their jitter (the variation in the latency). He found the latency in the USB interfaces to be between seven and eight milliseconds, about three times that of the classic interfaces. This is not in itself an insurmountable problem, because musicians adjust to small latencies in sound sources quite well--a bass player and a lead guitarist standing seven feet away from each other usually have no trouble staying together.
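To make the two terms concrete, here is a minimal sketch of how you might boil a set of round-trip measurements down to a latency figure and a jitter figure. The sample numbers are invented for illustration and are not Wright's data; reporting jitter as both peak-to-peak spread and standard deviation is my own choice, not necessarily his method.

```python
from statistics import mean, pstdev


def latency_stats(round_trips_ms: list[float]) -> dict[str, float]:
    """Summarize round-trip times: average latency plus two views of jitter."""
    return {
        "mean_ms": mean(round_trips_ms),
        "jitter_peak_to_peak_ms": max(round_trips_ms) - min(round_trips_ms),
        "jitter_stdev_ms": pstdev(round_trips_ms),
    }


# Invented sample data, in milliseconds, for illustration only.
classic_samples = [2.4, 2.6, 2.5, 2.7, 2.3]
usb_samples = [7.1, 7.9, 7.4, 8.0, 7.1]

classic = latency_stats(classic_samples)
usb = latency_stats(usb_samples)

for label, stats in (("classic", classic), ("USB", usb)):
    print(label, {key: round(value, 2) for key, value in stats.items()})
```

The mean tells you how late everything is; the spread tells you how unevenly late it is, and that second number is the one that matters for feel.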

But the jitter in USB interfaces was also much higher than in the older interfaces--about twice as high, meaning (to continue our analogy) that the two players could at any given moment be five feet away from each other, and the next moment be 10 feet away--and constantly moving. In another analogy, which Wright likes to use, imagine playing a slightly arpeggiated guitar chord: The jitter could make it sound as if one of your fingers jerked slightly while you were playing the chord. And for tight grooves and thick MIDI data streams with lots of aftertouch or controllers, this level of jitter is really unacceptable. Wright also found that when you add audio to the USB stream, the jitter goes up another 50%--so it's three times what MIDI musicians have had to deal with in the past.
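The distance analogy is easy to check with a little arithmetic. This sketch assumes sound travels about 1,125 feet per second in room-temperature air, which is my round figure; the function names are mine too.

```python
SPEED_OF_SOUND_FT_PER_S = 1125.0  # assumed: dry air at roughly room temperature


def feet_to_ms(feet: float) -> float:
    """Acoustic delay, in milliseconds, across a given distance."""
    return feet / SPEED_OF_SOUND_FT_PER_S * 1000


def ms_to_feet(milliseconds: float) -> float:
    """Distance sound covers in a given number of milliseconds."""
    return milliseconds / 1000 * SPEED_OF_SOUND_FT_PER_S


print(f"7 feet apart    = {feet_to_ms(7):.1f} ms of delay")
print(f"7.5 ms latency  = {ms_to_feet(7.5):.1f} feet apart")
print(f"5 ft vs. 10 ft  = {feet_to_ms(10) - feet_to_ms(5):.1f} ms of wobble")
```

So a steady seven or eight milliseconds is a bandmate a few steps away, which nobody minds. It's the bandmate who keeps lurching back and forth across the stage who ruins the groove.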

Why is this the case? Well, the USB developers, according to Wright, came to the MIDI community very late in their development stage, and thus the MMA and its Japanese counterpart, AMEI, didn't have much of a chance to give their input about how MIDI on USB was going to be handled (although Roland, acting on its own, got involved much earlier). On a USB cable, MIDI uses asynchronous timing (that is, there's no underlying clock as there is with, say, AES/EBU digital audio), which means if there's a lot of traffic on the line, then the MIDI data will be delivered in fits and starts, and there's no guaranteed delivery time, even under the best of circumstances. (The same is true for a standard MIDI cable, but preventing this is what multiport interfaces are for!)

Audio on USB, on the other hand, uses isochronous timing, which means the delivery time is guaranteed. So the problem is further compounded by the fact that because they use different timing schemes, MIDI and audio data on the same USB cable can easily lose sync with each other. Getting MIDI and audio to work together in perfect sync is something software and hardware developers have labored hard for years to achieve, and now we're potentially seeing all those efforts being tossed away.

The interface manufacturers are not unaware of these problems--it's this very issue that's behind the huge advertising campaign that MOTU has been running promoting its "MTS," a proprietary system of time-stamping MIDI events as they enter the USB cable to overcome USB's timing problems. Time-stamping of MIDI events has never really been necessary before, because the latency and jitter of the synthesizers themselves have been greater than that of any delays in the MIDI network (or the resolution of MIDI itself, for that matter), but that's no longer true with USB. Emagic has followed MOTU's lead and is using its own version of time-stamping, and Steinberg is reportedly planning something similar.
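The general idea behind time-stamping can be shown in a few lines. This is a toy simulation of the concept only; it is not how MTS or any other vendor's scheme actually works, and every name and number in it is an assumption. Events are stamped where they enter the cable, the cable delays each one by a different amount, and the far end either plays them as they arrive or holds each one until its stamp plus a fixed playout delay.

```python
import random


def transport_delays(count: int, base_ms: float, jitter_ms: float, seed: int = 1) -> list[float]:
    """Simulate a link that adds base_ms plus a random 0..jitter_ms to each event."""
    rng = random.Random(seed)
    return [base_ms + rng.uniform(0.0, jitter_ms) for _ in range(count)]


def play_on_arrival(stamps_ms: list[float], delays_ms: list[float]) -> list[float]:
    """No time stamps: each event sounds whenever it happens to arrive."""
    return [stamp + delay for stamp, delay in zip(stamps_ms, delays_ms)]


def play_by_timestamp(stamps_ms: list[float], delays_ms: list[float], playout_delay_ms: float) -> list[float]:
    """Time-stamped: hold each event until its stamp plus a fixed playout delay."""
    played = []
    for stamp, delay in zip(stamps_ms, delays_ms):
        arrival = stamp + delay
        due = stamp + playout_delay_ms
        played.append(max(arrival, due))  # an event that arrives late cannot be played early
    return played


def timing_spread(stamps_ms: list[float], played_ms: list[float]) -> float:
    """Peak-to-peak variation in how late the events sounded."""
    offsets = [played - stamp for stamp, played in zip(stamps_ms, played_ms)]
    return max(offsets) - min(offsets)


stamps = [0.0, 125.0, 250.0, 375.0, 500.0]  # five eighth notes at an assumed 240 bpm
delays = transport_delays(len(stamps), base_ms=7.0, jitter_ms=2.0)

spread_raw = timing_spread(stamps, play_on_arrival(stamps, delays))
spread_stamped = timing_spread(stamps, play_by_timestamp(stamps, delays, playout_delay_ms=10.0))

print(f"Jitter without time stamps: {spread_raw:.2f} ms")
print(f"Jitter with time stamps:    {spread_stamped:.2f} ms")
```

The trade-off is visible in the playout delay: to iron out the jitter, the receiver has to wait at least as long as the worst-case trip, so you buy steadiness with a bit more constant latency. It also only works if both ends agree on the stamp format and the clock.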

But it's the same old song: None of these solutions are compatible with each other, which negates the entire philosophy of MIDI and USB. MOTU's MTS works only if you have the company's software and hardware and not with Emagic's hardware or Steinberg's software, and vice versa, et cetera, ad infinitum.

It's the computer manufacturers who are potentially in the best position to do something about this, and perhaps they will. Mac OS X might include time-stamping in its MIDI drivers, according to some sources. Doug Wyatt, the developer of the Opcode MIDI System, the best software driver for multiport MIDI on the Macintosh (and the primary casualty in the train wreck Gibson has made of that poor company--more on this next month), is reportedly leading the OS X MIDI team, but Apple isn't saying much about it just yet. (And, sad to say, its corporate track record on MIDI support has been consistently pretty miserable.)

Similarly, according to Jim Wright, the Windows Streaming MIDI API already has a 1ms time-stamping feature built in, but it works only on output, not on input. Microsoft's DirectMusic supports time-stamping (at a far greater resolution: 100 nanoseconds!), but apparently none of the hardware interface makers are taking advantage of this yet.

Last April, the USB-IF (led by Intel) announced USB 2.0, in which throughput is increased by a factor of (take a deep breath) 40. Will it solve the timing problems? Until someone comes out with a USB 2.0 computer and a USB 2.0 MIDI interface and someone (else) tests them, we won't know.

IEEE-1394, though it's more expensive, seems to hold a lot more promise for the future of MIDI. I'll talk about that next month, as well as some new and proposed enhancements to the Standard MIDI File spec and (dare it be whispered) the possibility of MIDI 2.0.

Starting early in the new year, Mix's Web site, mixonline.com, will have a whole new look. It's going to be much more tightly integrated with the sites of our sister magazines under the Intertec/Primedia corporate umbrella, including Millimeter, Sound & Video Contractor, Broadcast Engineering, Video Systems, Entertainment Design and, of course, Electronic Musician, as well as other services like Digibid.com. There will be some new features, and some current features will be discontinued. I'll continue to work with Mix Online, but I may not be as visible a presence. I'll still be writing this column, and I can still be reached at mixonline@gis.net, so please stay in touch. And thanks to the many thousands of you who have made developing and running Mix Online so much fun these last three years.

"Insider Audio" columnist Paul D. Lehrman thanks Jim Wright, Tom White and Rick Cohen for their help, and promises to let go of his MIDI cables when you pry them out of his cold, dead hands. Or whenever he converts his entire studio to 1394, whichever happens first. Read about his latest multimedia adventures at antheil.org.

These materials copyright ©2001 by Paul D. Lehrman and Intertec Publishing