Feb 25, 2005

I wanted to try compressing audio with JPEG.

Yes, you heard me right. I wanted to use an image compression scheme on audio. I was a little curious about such a synesthesic compression experience. The question is, how well would JPEG do on audio files?

JPEG is a block-based transform codec. Basically, this means it treats an image as an array of blocks of pixels (usually 8×8). It then applies a transform (the DCT) to each block and quantizes the heck out of the high-frequency information, thereby reducing the amount of information in the coefficient domain. After this, some type of lossless entropy coding (Huffman, usually) is applied to the coefficients. And voila, you have a compressed image.
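To make the transform-and-quantize step concrete, here is a minimal pure-Python sketch of an 8×8 DCT followed by a crude quantizer. It is not how a real JPEG encoder is written: the function names are mine, and the quantizer uses a made-up step size that simply grows with frequency instead of JPEG's standard quantization tables.

```python
import math

N = 8  # JPEG's usual block size


def dct_1d(v):
    """Orthonormal DCT-II of a sequence."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        out.append(scale * s)
    return out


def idct_1d(c):
    """Inverse of dct_1d (DCT-III with matching scaling)."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(s)
    return out


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct_2d(block):
    """Separable 2D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    return transpose([dct_1d(c) for c in transpose(rows)])


def idct_2d(coeffs):
    rows = [idct_1d(r) for r in coeffs]
    return transpose([idct_1d(c) for c in transpose(rows)])


def quantize(coeffs, step):
    # Assumption: step size grows with the frequency index (u + v).
    # Real JPEG uses a full 8x8 table scaled by the quality setting.
    return [[round(coeffs[u][v] / (step * (1 + u + v))) for v in range(N)]
            for u in range(N)]


def dequantize(q, step):
    return [[q[u][v] * step * (1 + u + v) for v in range(N)]
            for u in range(N)]


# A smooth gradient block: most of its energy sits in a few low frequencies.
block = [[100 + 4 * x + 2 * y for x in range(N)] for y in range(N)]
q = quantize(dct_2d(block), step=4)
nonzero = sum(1 for row in q for c in row if c != 0)
print("nonzero coefficients after quantization:", nonzero, "of", N * N)
```

The entropy coder then only has to spend bits on the handful of coefficients that survive quantization, which is where the savings come from.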

Audio compression is similar, but takes advantage of psychoacoustic models of the ear. It also treats the data in blocks, and allocates bits to each transform coefficient based on acoustic masking models and the like.


First, I converted my test WAV file (Handel) into a picture (as seen on the right). I simply chose an arbitrary size and normalized the audio data to fit the image specifications. The audio data here is row major. We see some sinusoidal patterns in the image, as expected.
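For the curious, the conversion can be sketched roughly like this. It is a reconstruction under assumptions, not the script I actually ran: the function names, the min/max normalization to 8-bit grey levels, the mid-grey padding and the width of 64 are all placeholders, and a synthetic sine tone stands in for the Handel clip.

```python
import math


def samples_to_image(samples, width):
    """Map audio samples to rows of 8-bit grey levels, in row-major order."""
    lo, hi = min(samples), max(samples)
    span = (hi - lo) or 1  # avoid dividing by zero on silent input
    pixels = [round(255 * (s - lo) / span) for s in samples]
    # Pad the last row with mid-grey so every row has `width` pixels.
    pixels += [128] * ((-len(pixels)) % width)
    rows = [pixels[i:i + width] for i in range(0, len(pixels), width)]
    return rows, lo, hi


def image_to_samples(rows, lo, hi, count):
    """Undo samples_to_image, dropping the padding."""
    span = hi - lo
    flat = [p for row in rows for p in row][:count]
    return [lo + p * span / 255 for p in flat]


# Stand-in for real audio: a 440 Hz tone sampled at 8 kHz.
tone = [int(12000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(1000)]
rows, lo, hi = samples_to_image(tone, width=64)
print(len(rows), "rows of", len(rows[0]), "pixels")
```

Note that the normalization alone is already lossy: squeezing the samples into 256 grey levels throws away precision before JPEG ever touches the data.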

Then, I applied JPEG compression. Afterwards, I decompressed the JPEG and read back the data into a WAV file. The following are the results as WAV files.

The original can be heard here: original.wav [36 kB].

JPEG Quality   Size of JPEG   Compression Ratio   WAV File
99             16 kB          2.25                out99.wav
95             11 kB          3.27                out95.wav
93             10 kB          3.6                 out93.wav
90             8 kB           4.5                 out90.wav
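The compression ratios are just the original size (36 kB) divided by the JPEG size. A tiny sketch of the arithmetic, with a helper name of my own choosing:

```python
def compression_ratio(original_kb, compressed_kb):
    """Original size divided by compressed size, rounded to two decimals."""
    return round(original_kb / compressed_kb, 2)


ORIGINAL_KB = 36
jpeg_sizes_kb = {99: 16, 95: 11, 93: 10, 90: 8}  # quality -> JPEG size in kB

for quality, size_kb in jpeg_sizes_kb.items():
    print(quality, compression_ratio(ORIGINAL_KB, size_kb))
```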

The audio degrades horribly at qualities of 90 and below. This is probably because JPEG quantizes each block independently, with no special care taken at block boundaries, so discontinuities tend to show up near block edges. That is unacceptable for audio coding.
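One way to see why blocking is awkward for row-major audio is to list which samples end up in a single 8×8 block. This sketch does not prove the explanation above, it only illustrates the layout; the function name and the image width of 256 are hypothetical.

```python
def block_sample_indices(block_row, block_col, width, block=8):
    """Time indices of the samples covered by one block of a row-major image."""
    indices = []
    for y in range(block):
        row_start = (block_row * block + y) * width + block_col * block
        indices.extend(range(row_start, row_start + block))
    return indices


# Assumed image width of 256 pixels, top-left block.
idx = block_sample_indices(0, 0, width=256)
print(idx[:8])   # first run of 8 consecutive samples
print(idx[8:16]) # next run starts a full image row (256 samples) later
```

Each block is made of eight short runs of consecutive samples, with each run a whole image row apart in time, so a block edge cuts the audio stream every eight samples.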

However, we do get some reasonable results at higher quality levels. There is slight degradation at the 99 quality level, but it is not too annoying. Of course, the compression ratios are horrible compared to audio-centric compression schemes like MP3 or AC3. For reference, a zipped version of the original is about 22 kB, so even the quality-99 JPEG (16 kB) is indeed saving more than lossless entropy coding alone would.
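If you want to reproduce the lossless baseline on your own data, something like the following should do. It is a sketch under assumptions: it uses Python's `zlib` (the same DEFLATE algorithm that zip normally uses, minus the archive container), the helper name is mine, and the synthetic tone is only a stand-in, so its ratio says nothing about the Handel clip.

```python
import math
import zlib


def deflate_ratio(data):
    """Original size divided by DEFLATE-compressed size."""
    return len(data) / len(zlib.compress(data, 9))


# Stand-in for real audio: 8-bit samples of a 400 Hz tone at 8 kHz.
raw = bytes(127 + int(100 * math.sin(2 * math.pi * 400 * n / 8000))
            for n in range(4000))
print("lossless ratio on the test tone:", round(deflate_ratio(raw), 2))
```

Feeding in the raw bytes of a real WAV file instead of `raw` gives the number to compare against the JPEG sizes in the table.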

2 Responses to “Synesthesic Compression”

  1. min Says:

    stop being so smart yo

  2. Tippy Says:

    Yeah, i try to get him to stop. It is hard, he’s on an image compression spree. He tried it on our fish. Unfortunately, it didn’t work out so well. Unlike the audio file, our fish degraded horribly for JPEG qualities 99 and below…
