AT&T Rock: Audio Innovation

Since the mid-1970s AT&T researchers have been exploring ways to deliver CD-quality music over communications networks.

The key is "audio compression," and the key to that is "perceptual coding." Audio compression is necessary because compact-disc-quality sound requires roughly 1,400,000 bits of information per second, while networks can deliver roughly 128,000 bits per second or less to homes and businesses. Closing that gap calls for a "compression" of about 11 to 1.
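The arithmetic behind that ratio can be checked with a short sketch. The sampling rate, sample size, and channel count below are the standard compact disc values and are not stated in this article; the variable and function names are purely illustrative.

```python
# Standard compact disc audio parameters (assumed here; not given in the article).
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
BITS_PER_SAMPLE = 16      # bits in each sample
CHANNELS = 2              # stereo

# Bit rate the article quotes as deliverable to homes and businesses.
NETWORK_BIT_RATE = 128_000


def compression_ratio(source_bps, target_bps):
    """Return how many times smaller the target bit rate is than the source."""
    return source_bps / target_bps


# Uncompressed CD bit rate: roughly the 1,400,000 bits per second quoted above.
cd_bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
ratio = compression_ratio(cd_bit_rate, NETWORK_BIT_RATE)

print(f"CD bit rate: {cd_bit_rate:,} bits per second")
print(f"Compression needed: about {ratio:.1f} to 1")
```

With these values the ratio comes out at just over 11 to 1, in line with the figure quoted above.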

If you'd like to see some AT&T Labs compression technology in action, go to the a2b music website. You'll be able to download free software that will play digital music encoded with the latest perceptual coding algorithms. You can select and purchase songs, download the compressed files, and play them on your PC.

The a2b music approach incorporates AT&T Labs encryption technology, making it safe for musicians and publishers to transmit music over the web. It also employs a flexible music licensing system that controls how music is used and distributed over the web.

Early compression experiments tried waveform coding, but the programmers couldn't reduce the bit rate much below about 400,000 bits per second while still retaining the quality of the CD original. That still left way too many bits to be funneled into the network.

What You Can't Hear Won't Hurt You

Perceptual coding takes an entirely different approach. It works by identifying "irrelevancies" in music that cannot be heard by the human ear. When the music is stripped of this excess information, music encoders can work at much lower bit rates and still deliver sound quality indistinguishable from the original. The "compressed" audio sounds as good as the CD, but its file size is much, much smaller.
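As a very rough illustration of the idea, the sketch below splits a signal into frequency components and throws away any component far quieter than the loudest one. This is a toy stand-in only: real perceptual coders such as PAC use detailed models of human hearing, and the fixed 60 dB cutoff, the test signal, and all function names here are assumptions made for the example, not anything taken from AT&T's encoders.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform: time samples -> frequency bins."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def drop_quiet_bins(spectrum, floor_db=60.0):
    """Zero every bin more than floor_db below the strongest bin.

    A fixed cutoff is a crude, hypothetical substitute for a real
    psychoacoustic model of what the ear can and cannot hear.
    """
    peak = max(abs(c) for c in spectrum)
    threshold = peak * 10 ** (-floor_db / 20)
    return [c if abs(c) >= threshold else 0j for c in spectrum]


N = 64
# A loud tone in bin 4 plus a tone in bin 10 that is 80 dB quieter.
signal = [
    math.sin(2 * math.pi * 4 * t / N) + 0.0001 * math.sin(2 * math.pi * 10 * t / N)
    for t in range(N)
]

spectrum = dft(signal)
kept = drop_quiet_bins(spectrum)
retained = sum(1 for c in kept if c != 0)

print(f"{retained} of {N} frequency bins retained")
```

Only the bins carrying the loud tone survive, so far fewer values need to be stored or transmitted. A real encoder decides what to discard based on how the ear masks one sound with another, not on a single fixed level.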

Music Over The Internet

AT&T researchers have been at the forefront of audio compression from the beginning, and have been awarded many fundamental patents. AT&T audio encoders, from PXFM to ASPEC to PAC, have consistently been among the best, both in bit rate reduction and delivered sound quality. AT&T researchers have been active participants in setting standards for music encoding, and have contributed both to MPEG layer 3 ("MP3") and more recently to MPEG Advanced Audio Coding, which incorporates many algorithms from AT&T's PAC encoder.

The Sound Squeeze Team

Audio compression depends on many skills and sciences. The algorithms rely on fundamental research and experiments performed by mathematicians and acousticians both inside and outside of AT&T. The implementation of the algorithms as working computer programs has benefited from the contributions of computer scientists within AT&T. And the final "tweaks" -- the tuning of the coder to produce the finest sound quality at the lowest bit rate -- depend on "golden ear" engineers and musicians who have been involved with music recording and playback from its inception.

AT&T's work in acoustical research reaches back to the invention of the telephone, and today covers a broad spectrum of areas -- including speech synthesis and speech recognition.

Your Computer Can Talk!

Creating natural-sounding speech synthetically has been an elusive goal for speech researchers. The human voice and language, with their nuances of intonation and prosody, have been nearly impossible to capture with mathematical models that generate wholly electronic sounds. First-generation text-to-speech applications have been as intelligible as the technology would allow, and have been used successfully for many years.

Researchers have been working to evolve text-to-speech past its current harsh, robotic delivery. A new approach uses actual vocalizations as the basis for turning text into natural speech. Sampled sounds are extracted from a speech database, and text is converted to warmer, more natural-sounding phrases.

Try typing your favorite rock lyrics into this text-to-speech demo from AT&T Labs.