Modern studio-quality audio interfaces can support sampling rates up to 192KHz, more than four times the 44.1KHz used for CDs. Can you hear the difference?
Recently at Mission I’ve been testing the digital audio interface in the Gemini Amplifiers. Gemini are a series of powered amplifiers with a digital audio interface on board. You can connect a computer via USB and Gemini shows up in your software as an audio interface. You can record and play back directly through it without having to use a sound card or a separate USB audio interface.
Since we expect the product to be used in professional studio environments, we have used an audio interface that supports the range of sampling rates and bit depths found in the modern recording studio. The XMOS USB is a 16-core microcontroller with USB 2.0 at up to 480Mbps and a built-in DAC for audio at up to 192KHz and 24 bit. Here’s the data sheet on the controller.
Up to now most of my testing has focused on a 48KHz sampling rate. Before moving on to more analysis at the higher sampling rates, I thought I should do some research to familiarize myself with the use cases and testing methods for them. This led me down an interesting and entertaining sidetrack.
It turns out that there is quite a debate on the subject of sampling rates. On the Audiophile side of the fence, proponents are convinced they can hear huge differences between different sampling rates. On the Engineering side, people are equally convinced that physics and biology clearly prevent any human being from being able to perceive a difference in audio at sampling rates beyond the CD quality 44.1KHz 16bit format. So which is right? Let’s find out.
My investigation started with the article “24/192 Music Downloads and why they make no sense” by Monty Montgomery from Xiph.org. It is well worth a read, and provides a solid grounding in how the digital audio process works. There’s also a follow-up video which explains why the stair-step picture of a digital representation of an analog waveform is incorrect. The video includes a really well put together demo which is entertaining as well as informative.
The next step was to test myself to see if I could perceive any difference. This proved a little harder than expected, but I eventually came up with a system that should be at least a decent starting point. My first test was just to download some audio in different formats from the web and see if I could obviously tell any difference. There are a few high definition audio vendors that provide free test files.
I used these from Naim Audio – Test files
and these from EClassical.com – Test files
I played the files from iTunes on my iMac. The Mac is a 2013 model using Intel High Definition Audio for its internal audio output. The HDA spec indicates support for 192KHz and 32 bit, but it looks like OS X Core Audio only implements up to 96KHz and 24 bit, because that is the maximum I could select in the Audio MIDI Setup application. I used my KRK KNS 8400s, which are an honest set of monitor headphones with a nice flat response. The frequency response of the headphones is listed as 5Hz to 23KHz, which exceeds the 20Hz – 20KHz range typically cited as the limits of human hearing. Like most people, my high frequency perception has deteriorated with age and I cannot hear a 20KHz sine wave at typical sound pressure levels, so the headphones should be fine. Being able to perceive ultrasonics in music is at the center of the whole discussion anyway, so this will do for now. Later I’ll try something with a higher frequency range and see if it makes any difference.
This first test revealed that I could not tell any difference. Although I was playing the files knowing which was which, I could not say with certainty that I could tell one from the other: they sounded the same. I needed a better test to see if this was really the case, so next I downloaded ABX Tester. This is a handy little Mac app that lets you compare two files and then test yourself to see if you can tell which is which. I tried it three times with the audio files linked above to see if I could identify 16 bit 44.1KHz vs 24 bit 96KHz. I could not. In three tries I scored 40%, 60% and 60%.
To eliminate the headphones, I repeated the ABX test but this time I connected the output from the Mac to my HiFi. My Yamaha HTR pre-amp has a feature called Pure Direct that disables the processing and tone features and just sends the signal straight through to the power amp which is a Krell KSA 100 driving Martin Logan electrostatic speakers. Ok, I’ll admit I spent some time on the Audiophile side of the house myself. I reran the ABX test and scored 20%, 80%, and 60%. The mean of the headphone and speaker tests comes out at 53.3%, which is pretty much exactly what we would expect for guessing.
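As a quick check on that arithmetic, the sketch below averages the six run scores reported above. It also estimates how likely a pure guesser would be to do at least this well, under the assumption (not stated in this post) that each ABX run consisted of 5 trials. The scores all being multiples of 20% is consistent with that, but the trial count is a guess, and the helper name `p_at_least` is just for illustration.

```python
from math import comb

# Run scores reported in the post, as percentages.
headphone_scores = [40, 60, 60]
speaker_scores = [20, 80, 60]
all_scores = headphone_scores + speaker_scores

mean_score = sum(all_scores) / len(all_scores)

# ASSUMPTION: 5 trials per run (hypothetical; the post does not say).
TRIALS_PER_RUN = 5
total_trials = TRIALS_PER_RUN * len(all_scores)
total_correct = round(sum(s / 100 * TRIALS_PER_RUN for s in all_scores))


def p_at_least(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Chance that someone guessing at random gets at least this many right.
p_value = p_at_least(total_correct, total_trials)

print(f"mean score: {mean_score:.1f}%")
print(f"{total_correct}/{total_trials} correct, "
      f"probability of guessing this well or better: {p_value:.2f}")
```

A large probability here means the result is what guessing would be expected to produce; only a small one (conventionally below 0.05) would suggest a real ability to tell the files apart.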
At this point I had a few concerns. The first was that something was going on that meant the files were all the same (or at least not getting the full benefit of the difference) and my tests lacked a control. Another concern was that since I didn’t know for sure how the original files were put together, maybe I was missing something. So in the next step I created my own files. I used Logic Pro X and put together a short song. I set Logic to record at 96KHz, 24 bit. I used some hi-hat and crash cymbals at the top and bass at the bottom. I mixed it about as I normally would, but made sure not to limit or filter frequencies outside 20Hz – 20KHz. I then bounced the song down to the following formats without making any changes to the original recording. On all the PCM files I used interleaved format, no dithering, and disabled normalizing.
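For a sense of scale, the raw data rate of each uncompressed bounce can be worked out as sample rate times bit depth times channel count. The sketch below does that and compares each result with the 64Kbps MP3. It assumes stereo (two channels), which is not stated here, and the function name `pcm_kbps` is just for illustration.

```python
# Uncompressed PCM data rate = sample rate x bit depth x channel count.
# ASSUMPTION: stereo (2 channels); the post does not state the channel count.
CHANNELS = 2


def pcm_kbps(sample_rate_hz, bit_depth, channels=CHANNELS):
    """Raw PCM data rate in kilobits per second (1 kbps = 1000 bits/s)."""
    return sample_rate_hz * bit_depth * channels / 1000


# The three uncompressed bounce formats: (sample rate in Hz, bit depth).
formats = {
    "96 kHz / 24 bit": (96_000, 24),
    "48 kHz / 24 bit": (48_000, 24),
    "44.1 kHz / 16 bit": (44_100, 16),
}

MP3_KBPS = 64  # the compressed bounce

for name, (rate, depth) in formats.items():
    kbps = pcm_kbps(rate, depth)
    print(f"{name}: {kbps:.1f} kbps, about {kbps / MP3_KBPS:.0f}x the MP3")
```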
Here is the audio.
96KHz, 24 bit uncompressed:
48KHz, 24 bit uncompressed:
44.1KHz, 16 bit uncompressed:
96KHz, 24 bit Apple Lossless encoder (ALAC)
64Kbps MP3
With the final file, the 64Kbps compressed MP3, I can tell the difference. I identified it 100% of the time with ABX Tester. Listen in particular to the crash cymbal at 0:04. Can you reliably tell the difference between any of the others in a blind test? Let me know in the comments below.
Tools
Original audio files for downloading and running your own ABX test.
96KHz, 24 bit uncompressed
48KHz, 24 bit uncompressed
44.1KHz, 16 bit uncompressed
96KHz, 24 bit ALAC
64Kbps, MP3
ABX Tester for Mac. This is the one I used for blind testing. I found a couple of others for Windows and Linux. I didn’t try these myself, but here are the links:
Lacinato’s cross platform ABX Tester
Foobar 2000 Windows audio player with ABX testing capability
Bitperfect for iTunes. This handy Mac app ensures iTunes always plays files back in their native file format rather than up or down converting. This is useful if you want to use iTunes to do AB listening tests. I used this when doing my non-blind comparisons between file formats. It costs $9.99 on the iTunes store. Since I also use iTunes on my Mac as a music player it seemed like a good investment.
Yes, it is very hard to tell a difference in normal listening, but the sample rate does make a difference when you slow audio down. Slowing down playback lowers the pitch and reduces the effective sample rate.
Let me give an example of how this would affect a sound.
Let’s say you have a song that is 44.1kHz (bit depth doesn’t matter in this case), and you slow it down by 50%. Now the pitch is noticeably lower, and the effective sample rate is cut in half to 22.05kHz. Had you done this with a song that is 96kHz, the sample rate would still be 48kHz, more than you need to cover the full audible spectrum.
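The arithmetic in that example can be sketched as follows. Playing a file back at a fraction of its original speed scales the effective sample rate by the same factor, and the highest frequency the audio can carry (the Nyquist frequency) is half the sample rate. The function name `slowed_playback` and the 20kHz hearing limit used for the comparison are illustrative assumptions.

```python
def slowed_playback(sample_rate_hz, speed):
    """Return (effective sample rate, Nyquist frequency) in Hz after
    playing the audio at `speed` times its original rate (0.5 = half speed)."""
    effective_rate = sample_rate_hz * speed
    return effective_rate, effective_rate / 2


# ASSUMPTION: 20 kHz as the commonly cited upper limit of human hearing.
HEARING_LIMIT_HZ = 20_000

for rate in (44_100, 96_000):
    effective, nyquist = slowed_playback(rate, 0.5)
    covers = nyquist >= HEARING_LIMIT_HZ
    print(f"{rate} Hz at half speed -> {effective:.0f} Hz effective, "
          f"content up to {nyquist:.0f} Hz, reaches 20 kHz: {covers}")
```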
So, really, it isn’t extremely important for regular listening. That said, as I’ve shown, it can be extremely helpful for things like remixing, or even just for fun, as you’re able to slow down the audio and not have it sound compressed.
You can do this the other way too, but that’s not really as important. If you’re on a budget, and can’t record very high quality, you can technically play the song at half tempo and 1 octave down, then double the speed to have it be “higher quality,” but I can’t imagine this being taken advantage of in a professional context.