Download 13 second Sample Sound
Original wav: 1.75 MB
MP3 and OGG:
A 128MB portable MP3 player can store about 136 minutes of good-fidelity audio (at 128kbps). If a player supports Ogg, it could hold 200 minutes of similar-quality audio (at 87kbps).
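The playing times quoted above can be checked with a little arithmetic. This is only a rough sketch: it assumes binary units (1 MB = 1024 x 1024 bytes, 1 kbps = 1024 bits per second), which is my guess at how the figures were worked out, and the function name is made up for illustration.

```python
# Rough check of how much audio fits on a player at a constant bitrate.
# Assumption (not stated in the text): binary units are used, i.e.
# 1 MB = 1024 * 1024 bytes and 1 kbps = 1024 bits per second.

def minutes_of_audio(storage_mb: float, bitrate_kbps: float) -> float:
    """Return the playing time in minutes that fits in storage_mb."""
    storage_bits = storage_mb * 1024 * 1024 * 8
    seconds = storage_bits / (bitrate_kbps * 1024)
    return seconds / 60


for name, kbps in (("MP3", 128), ("Ogg Vorbis", 87)):
    minutes = int(minutes_of_audio(128, kbps))  # whole minutes, rounded down
    print(f"{name} at {kbps}kbps: about {minutes} minutes on a 128MB player")
```

With decimal units the results come out a few minutes lower, so treat the figures as approximate either way.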
Ogg Vorbis is an
excellent new sound compression format--the next MP3. Ogg Vorbis files sound
better than mp3s for a given file size, or, alternately, are smaller
than mp3s for a given sound quality. If you see a file with an "ogg" extension,
it's probably Vorbis audio.
The degree to which Ogg Vorbis sounds better is particularly evident at low bitrates. Listen to the 56kbps (7 KB/sec) files on the left and hear for yourself! (If you can't tell the difference, you badly need better headphones or speakers. Or, you're deaf, in which case you aren't reading this.)
There are caveats for converting MP3s to Oggs. See here.
See also FLAC, a 100% free lossless audio encoder. With FLAC you can store perfect copies of your audio in almost half the space of uncompressed WAV files.
More Information
Note on terminology: Vorbis is the name of the audio compression; Ogg is the name of a larger project to create an open multimedia system. However, Vorbis is the only completed part of this system, so many people treat Ogg and Vorbis as synonyms.
Artifacts
Artifacts are specific defects in the quality of audio or video. How an artifact sounds or looks depends on the type of data. For example, TV in North America is transmitted using NTSC, an analog data format. There is a specific set of common defects in NTSC signals. These include "static" (where random, constantly moving noise is overlaid on the picture), "ghosting" (where a faint copy of the picture can be seen, usually to the right of the main picture), and over/under saturation (where colors are far more or less intense than they should be). These might be called "NTSC artifacts", or in the case of static, which affects all analog signals in a similar way, "analog artifacts".
Digital signals don't suffer from static in the same way as analog signals; they have their own kinds of artifacts. For example, have you ever been watching a TV show with a perfectly clear picture, when all of a sudden little square blocks of color or garbage appeared here and there on the picture? These blocks, which disappear slowly or abruptly within a few seconds, are probably defects in the MPEG video stream used.
It is possible for any piece of data to get defects in it, but how the defects actually look (or, for audio, how they sound) depends on the kind of data and the kind of defect. When you see those little squares of garbage, it's a good bet you're seeing MPEG video artifacts, because that's what it looks like when there are small errors in digital MPEG data.
Below: original image (0.4KB with lossless compression)
Below: JPEG compressed image (2.1KB)
Not all artifacts are caused by errors in the data; some are by design, or
rather, by limitations in design. Error-free (i.e. non-corrupted) JPEG images
have obvious artifacts, for example, when trying to compress an image with
sharp edges, such as the one on the left--especially when there is a color
change as well. JPEG is designed to work best for smooth gradients of color and
brightness; when there are sharp changes in either, clear artifacts appear,
and not much compression is achieved.
I suppose I must apologise; whenever I start talking about something, I feel I must explain it fully before moving on. Hopefully you now understand what artifacts are so we can get on with talking about Ogg and mp3. Almost. But first:
Even though CDs can hold a lot of data (such as the uncompressed full text of an encyclopedia), regular 74-minute audio CDs hold uncompressed data (in other words, the data on your store-bought CD might be identical to the digital master at the recording studio). Uncompressed audio takes up a lot of space (i.e. digital bits). Digital compression is any technique that makes your digital data take up less space (fewer bits); with compression you can easily fit 10 times as much audio in the same space.
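To put a number on "a lot of space", here is a small sketch of the arithmetic. It assumes the standard audio CD format (44,100 samples per second, 16 bits per sample, two channels), which the text above doesn't spell out; the function names are my own.

```python
# How much data does uncompressed CD audio take up?
# Assumption: standard CD audio, i.e. 44,100 samples per second,
# 16 bits per sample, 2 channels (stereo).

SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def cd_bitrate_kbps() -> float:
    """Bitrate of uncompressed CD audio in kbps (1 kbps = 1000 bits/s)."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS / 1000


def cd_size_mb(minutes: float) -> float:
    """Size of `minutes` of CD audio in MB (1 MB = 1,000,000 bytes)."""
    bytes_per_second = SAMPLE_RATE_HZ * (BITS_PER_SAMPLE // 8) * CHANNELS
    return bytes_per_second * minutes * 60 / 1_000_000


print(f"CD audio bitrate: {cd_bitrate_kbps():.1f} kbps")
print(f"74 minutes of CD audio: {cd_size_mb(74):.0f} MB")
print(f"Bitrate after 10x compression: {cd_bitrate_kbps() / 10:.0f} kbps")
```

A tenfold reduction lands in the same neighbourhood as the 128kbps files people commonly trade, which is where the "10 times as much audio" figure comes from.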
Compression can be either lossy (meaning some of the data is lost or distorted) or lossless (meaning that after compression, the original data can be perfectly recovered from the compressed data.) Pieces of software used specifically for compression of audio or video are known as codecs. Of course, lossy codecs generally have better compression than lossless codecs. MP3 and Ogg are both lossy.
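The defining property of lossless compression, getting back exactly what you put in, is easy to demonstrate. The sketch below uses Python's standard zlib module purely as a stand-in for "some lossless compressor"; it is a general-purpose compressor, not an audio codec, and the sample text and function name are made up.

```python
import zlib

# Lossless round trip: compress, decompress, and confirm nothing changed.
# zlib is a general-purpose lossless compressor used here only as an
# illustration; it has nothing to do with how FLAC, MP3 or Vorbis work.


def round_trip(data: bytes) -> tuple[int, bytes]:
    """Compress data and return (compressed size, decompressed data)."""
    packed = zlib.compress(data)
    return len(packed), zlib.decompress(packed)


text = ("the quick brown fox jumps over the lazy dog " * 50).encode("ascii")
packed_size, restored = round_trip(text)

print(f"original: {len(text)} bytes, compressed: {packed_size} bytes")
print("perfectly recovered:", restored == text)
```

A lossy codec gives up that last guarantee in exchange for a smaller file.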
A lossless codec is concerned with how to represent the same data in fewer bits. For example, consider Morse code. Morse code is a lot like digital data, because there are only two data values: "dot" and "dash". This is comparable to digital data, in which everything is fundamentally represented with bits, which are zeros and ones. The main difference is that Morse code has a third data value, nothing (neither dot nor dash), which is used to separate letters and words. But let's ignore that for a minute.
Normally, every character (spaces, letters, and punctuation) in uncompressed text data in a computer uses the same number of bits. For example, ASCII text requires 7 bits per character, which gives 128 combinations of 7 zeros/ones. It's stored this way because fixed-size data is easier for a computer to deal with. But Morse code is variable size, and the most common English letters use fewer "Morse bits" (dots/dashes). The most common letter of all, "E", is just one dot, and the second most common, "T", is just a dash.
When a computer transmits text uncompressed, it always uses at least 7 bits per character (often 8 bits to account for special symbols and accented characters, or 16 bits to send large character sets like Chinese.) But if you are transmitting ordinary English in Morse code, it is essentially more efficient than a computer, because more common letters use fewer Morse bits. So you usually save bits (except when transmitting unusual data like a series of numbers or punctuation characters). Thus Morse code can be thought of as a simple kind of compression. In Morse code, the words are compressed losslessly, because you don't lose any of the letters when sending Morse code. However, in a sense Morse is a lossy compression because some information that may have been in the original text is lost, such as the case (uppercase/lowercase).
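The comparison between Morse and ASCII can be tried out directly. The sketch below counts "Morse bits" (dots and dashes) for the letters of a word and compares that with 7 bits per letter for ASCII. As in the text, it ignores the gaps Morse needs between letters and words, so it flatters Morse somewhat; the function names are my own.

```python
# Count dots and dashes per word and compare with 7-bit ASCII.
# Simplification (same as in the text): the silent gaps that separate
# Morse letters and words are ignored, and only letters are counted.

MORSE = {
    "a": ".-",   "b": "-...", "c": "-.-.", "d": "-..",  "e": ".",
    "f": "..-.", "g": "--.",  "h": "....", "i": "..",   "j": ".---",
    "k": "-.-",  "l": ".-..", "m": "--",   "n": "-.",   "o": "---",
    "p": ".--.", "q": "--.-", "r": ".-.",  "s": "...",  "t": "-",
    "u": "..-",  "v": "...-", "w": ".--",  "x": "-..-", "y": "-.--",
    "z": "--..",
}
ASCII_BITS_PER_CHAR = 7


def morse_symbols(text: str) -> int:
    """Total dots and dashes needed for the letters in text."""
    return sum(len(MORSE[ch]) for ch in text.lower() if ch in MORSE)


def ascii_bits(text: str) -> int:
    """Bits needed for the same letters at 7 bits each."""
    return sum(ASCII_BITS_PER_CHAR for ch in text.lower() if ch in MORSE)


for word in ("tea", "the", "jazz"):
    print(f"{word}: {morse_symbols(word)} Morse symbols, {ascii_bits(word)} ASCII bits")
```

Note that uppercase and lowercase input give the same count, which is exactly the loss of case mentioned above.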
A lossy codec can make data smaller in at least three ways:
As an example of (1), if a very smart image codec was trying to compress a picture of my face, it might determine that the pimples were not important data and therefore discard them. Then, when it later decompresses the image in order to display it, it would need to reconstruct the missing parts of the image, perhaps by guessing that there might have been pimples in the missing parts, and inserting what it thinks they might have looked like. Of course, real codecs aren't smart enough to know what they are compressing, so they would use fancy fourth-year-university statistical analysis and such to figure out what bits we humans might not notice are missing.
As an example of (2), the aforementioned very smart image codec might figure that it could store my face in fewer bits if my hair was spaced according to a certain mathematical formula. So it might move around the hairs just slightly here and there--trying to do it only a bit, so no one notices--to make it resemble some formula or pattern, and then feed the result to some lossless compressor that is known to compress that pattern very well.
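The idea in (2), nudging the data slightly so that a lossless compressor handles it better, can be shown with a toy. This is not how JPEG, MP3 or Vorbis actually work; it is a minimal sketch in which made-up 8-bit "samples" (a sine wave plus a little noise) are snapped to multiples of 8 and then handed to zlib. All names and numbers here are invented for illustration.

```python
import math
import random
import zlib

# Toy illustration of "distort slightly, then compress losslessly".
# The data, the step size and the use of zlib are all assumptions made
# for this sketch; real codecs use far more sophisticated methods.


def make_samples(n: int = 4000, seed: int = 1) -> bytes:
    """Made-up 8-bit samples: a sine wave plus a little random noise."""
    rng = random.Random(seed)
    return bytes(
        min(255, max(0, int(128 + 100 * math.sin(i / 20) + rng.uniform(-3, 3))))
        for i in range(n)
    )


def quantize(samples: bytes, step: int = 8) -> bytes:
    """Lossy step: snap every sample down to a multiple of step."""
    return bytes((s // step) * step for s in samples)


original = make_samples()
distorted = quantize(original)

exact_size = len(zlib.compress(original, 9))
lossy_size = len(zlib.compress(distorted, 9))
max_error = max(abs(a - b) for a, b in zip(original, distorted))

print(f"lossless only:        {exact_size} bytes")
print(f"distorted + lossless: {lossy_size} bytes")
print(f"largest change to any sample: {max_error} (out of 255)")
```

The distorted version compresses smaller because the small random wiggles, which a lossless compressor must preserve faithfully, have been flattened into a more regular pattern.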
While a lossy codec could discard any part of the data, as a rule it tries to discard only the least important parts of the data. Of course, what is least important depends on how the compressed data will be used. The aforementioned very smart image codec would not be a very good choice at all, for example, if the compressed images were going to be used for a scientific study of pimples.
Though it is an ISO standard, MP3 encoding is covered by patents exercised by the Fraunhofer Institute. At one time there were several freeware developers creating MP3 software, but once MP3 became popular, Fraunhofer began using legal means to rid the internet of such free programs. Now developers have to pay for any MP3-encoding program they give away. You can still find free MP3 encoders in the dark corners of the net, but, at least in the U.S., they are illegal. (MP3 players are still free, however; the patent does not cover playback.)
As a GPL software developer, and a free thinker, I really don't like patents--especially in the technology field, where the US patent office grants long-term monopolies for the most basic technologies of the computer age. Unfortunately the European Union seems to be moving ever closer to US-style patent policies as well. Perhaps the worst thing for developers is, it's tough to know whether you're infringing somebody's patent. Imagine how you would feel if you created something--but didn't realize there was a patent pending, or granted--and were then made to pay someone else to give it away!
Ogg Vorbis audio was developed to be 100% patent free, and as far as anybody knows, Vorbis infringes no patents.
MP3 and Ogg Vorbis encoders analyze sounds and throw away much of the sound data. This fact, combined with other compression techniques, makes Ogg and MP3 files much smaller than the original data from the CD or other source.
But if they discard data, how can a compressed file sound the same as the original? The answer is that human ears cannot sense every bit of the sound data; they can discern only part of the sound, its "essence", if you will. Both MP3 and Ogg are based on acoustic models of what the ear can hear. They try to discard data, and distort data to aid compression, only when they think our ears won't notice the difference.
Either codec can be told to compress data by any desired amount, but at some point it becomes impossible to discard only data the ear cannot hear. For a high-bitrate MP3 such as 160kbps, only the best-trained ears listening on the best equipment can hear the difference between the MP3 and the original, but as the bitrate is decreased, the data loss and distortions become noticeable to more and more people. Eventually, if the bitrate is low enough, the sound becomes indistinct and you practically can't make out individual words or sounds.
The types of artifacts depend on the codec used. I have noticed the following MP3 artifacts in particular:
In contrast, at the lowest rate my encoder allows (48kbps), Ogg audio seems to lack swishing completely, and more high frequency sound is retained. The only problems I've noticed at this rate are
I hesitate to even call these artifacts, because when you think about it, this is the best-case scenario: when trying to fit so much data into a tiny space, some of it must be discarded. Thus the best that can be hoped for is to discard the least audible parts of the sound, and leave the remainder without distortion. Vorbis comes pretty close to this ideal, so I have nothing but praise for it. In fact, if you're planning to listen to music in a noisy environment, you may as well use 48kbps Ogg files, since you wouldn't be able to pick out the subtle missing elements anyway.
Newer versions of Winamp support Ogg Vorbis (a plug-in is available for old versions), and there's new software coming out left and right that supports it. Unfortunately I haven't seen any portable Ogg players yet, but I understand there will be one soon: the Neuros Digital Audio Computer. For Winamp 2.x users, a quick way to make Oggs is to use this plug-in. See here for the list of Vorbis software at Vorbis.com.