Digital files don't wear out, right? This is one of the big advantages
of the medium, particularly in studio situations: people love the warmth
of tape, but it's fragile and it loses a tiny bit of fidelity every time
you play it, let alone when you make a copy. If you read a lot of studio
how-to articles (a guilty pleasure of mine), a common theme is the
engineer who records on tape for the sound, then immediately dumps it
into Pro Tools for actual editing and mixing. And of course you can make
a perfect copy of a digital file, whereas there's no such thing in
analog.
With one exception: back when DRM'd music sales were the norm, the
typical way to remove that DRM was to burn the file to a CD and re-rip
to MP3 format. This was seen as kind of a kludge, because the process
involves conversion to a lossless .WAV format and then back into lossy
psychoacoustic compression. In theory, every time this happens, the
latter step means a loss of information, and thus fidelity.
But how much of a loss? I started wondering this when I went to make a
CD for a fellow dance student from some MP3 files I'd gotten from More Than A Stance. I didn't
know how he planned to play them or how tech-savvy he was, so audio CDs
seemed like a better choice than audio files on a data CD. But if he
decided to rip the CDs back, how bad would the quality hit be? I decided
to find out.
Using some shell scripting (first PowerShell, then old-fashioned batch
files--never use a computer without at least one scripting option,
kids), I sent a couple of MP3 files through a conversion roundtrip a few
hundred times. My choices were "Beam Katana Chronicles" from the No More
Heroes soundtrack and a remix of the Jackson Five's "Life of the Party"
from DJ D.L.'s Soul Movement II, picking these particular tracks for a
few reasons:
- Both tracks are relatively close to the real-world case I was
  trying to figure out, with the latter being an actual dance track.
- Both were layered compositions, with plenty of detail to lose
  during conversion.
- Both included strong percussion tracks with plenty of hi-hat and
  snare--the kinds of high-frequency transient noises that easily smear
  and blur under psychoacoustic compression.
I used LAME to do the decoding and encoding at a 256kbps bitrate. On the
first test, I actually ran the file out to a separate .wav and back. The
second time, I figured out how to pipe the stdout from one LAME instance
to the stdin of a second, and just bounced it between two MP3 files,
which was much faster.
The results were surprising. Here's a table with some samples (caution:
may be loud), which I'll summarize below.
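iterations  track           audio
original    No More Heroes  [audio sample]
original    DJ D.L.         [audio sample]
50          No More Heroes  [audio sample]
50          DJ D.L.         [audio sample]
100         No More Heroes  [audio sample]
100         DJ D.L.         [audio sample]
500         No More Heroes  [audio sample]
500         DJ D.L.         [audio sample]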
At under 10 iterations, I can't tell the difference between the original
and the re-encoded file.
At 30-50, it's subtle--there's a little bit of swirliness around the
high end, and the transients are a little blurry, but nothing more than
you'd expect from, say, a turntable. It's not until you hit 100
iterations--that's 100 times going from an MP3 file to a WAV and
back--that it starts to become noticeable. At that point, there's some
definite artifacting, and you can start to hear a little bit of pumping
in the volume after each peak. Even so, it's not much beyond the
extremes of dynamic compression that have emerged from the loudness
wars, and if you snuck it into my playlist I wouldn't guarantee that I'd
pick it out. Once you get beyond 100, it becomes more obvious that
something's broken. By 500, there's some real glitchiness going on when
the track hits full volume--surprisingly, much more in the NMH track
than the J5, although the latter also has its "underwater washing
machine" moments.
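Everything above is judged by ear. As a rough cross-check, one could decode each generation back to WAV and measure how far it has drifted from the decoded original. The sketch below is an illustration I did not run as part of this test: the function names are hypothetical, and it assumes 16-bit PCM WAV files that are sample-aligned with matching channel counts (encoder delay can break that assumption). A residual level is also a crude yardstick, since psychoacoustic codecs deliberately put their error where it's hard to hear.

```python
import math
import sys
import wave
from array import array


def read_pcm16(path: str) -> array:
    """Read a 16-bit PCM WAV file into an array of interleaved samples."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError(f"{path}: expected 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    samples = array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return samples


def rms(values) -> float:
    """Root-mean-square level of a sequence of samples."""
    if len(values) == 0:
        return 0.0
    return math.sqrt(sum(v * v for v in values) / len(values))


def residual_db(reference_path: str, test_path: str) -> float:
    """Level of (test - reference) relative to the reference, in dB.

    More negative means the two files are closer together.
    """
    ref = read_pcm16(reference_path)
    test = read_pcm16(test_path)
    n = min(len(ref), len(test))  # ignore any trailing length mismatch
    ref, test = ref[:n], test[:n]
    ref_level = rms(ref)
    if ref_level == 0:
        raise ValueError("reference is silent or empty")
    diff_level = rms([t - r for r, t in zip(ref, test)])
    if diff_level == 0:
        return float("-inf")  # the files are sample-identical
    return 20 * math.log10(diff_level / ref_level)
```

Plotting `residual_db` against the iteration count would show whether the decay is steady or whether it takes off somewhere past 100 generations, which is what my ears suggest.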
There are a few holes in my experiment that would be interesting
to test:
- I used a symmetrical encoding and decoding process, with the same
  codec feeding into itself. It would be interesting to see how a mix of
  two or more encoders would change these results. It's likely that this
  would accelerate the decay rate, but would it be enough to overcome the
  sizeable margin in this test?
- Likewise, this was a test of high-bitrate encoding--simply because
  that's the scenario most people would realistically encounter. I'm
  guessing the minimum bitrate for most people is 192kbps, and anything
  you buy these days is usually higher. But yes, at lower bitrates I'm
  guessing this is dramatically more detrimental.
- Finally, this is a test of MP3. I like MP3, and I think the folks
  behind LAME have done about as good a job with it as they could, but it
  is a last-generation compression format. It'd be interesting to see how
  Ogg Vorbis, AAC, or WMA would stack up against it.
Still, I have to admit this is far better performance than I expected
going in, and I was cheering for LAME to begin with. I think we can
safely reach the conclusion that for limited, real-world cases of
digital dubbing, there's no serious impact on sound quality that wasn't
already lost in the first MP3 encoding. Burn and rip away!
Mile Zero is the personal website of Thomas Wilburn. All statements and
opinions here are my own, and do not represent the views or policies
of my employers at Congressional Quarterly, Ars Technica, or other
publications.