Why Oversample?

An AI search told @AndyBell that oversampling (a/k/a upsampling; the technically correct term is interpolation) would clear up inaudible frequencies. I assume the AI meant ultrasonics, because it’s a popular notion on the web that this is what high sample rates and oversampling are for: bringing out ultrasonic frequencies. Actually, that’s almost the exact opposite of what oversampling is for. So let’s discuss.

First, the question is almost always not whether to oversample, but where. That is, nearly all DACs will take a “bit perfect” signal you feed them and do their own oversampling internally. However, many DACs will take a signal that is oversampled to DSD resolutions in a computer and pass it right on through to the final analog filter. Even where the DAC does additional oversampling, oversampling first in the computer usually avoids some oversampling steps in the DAC. So regarding whether to oversample, the issue is almost never yes/no, and nearly always computer/DAC. Modern computer CPUs have the capability to run more sophisticated filters than DAC chips, so that’s one possible factor in the decision, but of course the final determination is up to your ears.

Regardless of where it’s done, why has oversampling been standard engineering practice in digital audio since before there were such things as separate DACs (very soon after the introduction of CD players, the DAC chips in them were doing oversampling)?

I’m sure we’ve all heard of the Heisenberg Uncertainty Principle in quantum mechanics. (Rest easy, I’m not about to suggest there’s some weird quantum stuff going on in digital audio.) It says that the more certain you are of the momentum of a particle such as an electron, the less sure you can be of its position, and vice versa. This is because in the math being used, called Fourier analysis, momentum and position are what are called conjugate variables: the more precisely you pin down one, the less precisely you can pin down the other.

Well, Fourier analysis just happens to be the math used in making digital audio filters.

[A momentary aside: Why are we talking about filters? Because they are essential for digital audio. When you sample a signal at, for example, 44.1kHz, the sampling process itself puts energy at and around that frequency into the sampled audio, energy that wasn’t in the original. Now yes, 44.1kHz is way above what we can hear, so why even bother with it? Because it interacts with other frequencies through aliasing and imaging, processes that result in noise and distortion at audible frequencies. Therefore we need to get rid of this energy, along with any other frequencies created by aliasing and imaging that weren’t in the original signal, and we do this with a low-pass filter (one that passes the lower frequencies we want, and gets rid of the higher frequencies that would cause unwanted noise and distortion).]
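
If you want to see aliasing for yourself, here’s a minimal numpy sketch (the 30kHz tone is purely my illustrative choice, standing in for any ultrasonic content):

```python
import numpy as np

fs = 44_100                             # sample rate, Hz
t = np.arange(fs) / fs                  # one second of sample times
tone = np.sin(2 * np.pi * 30_000 * t)   # 30 kHz: above the 22.05 kHz Nyquist limit

# The sampled data cannot represent 30 kHz, so the energy "folds"
# around Nyquist and lands at 44,100 - 30,000 = 14,100 Hz.
spectrum = np.abs(np.fft.rfft(tone))
peak_hz = np.argmax(spectrum)           # 1-second signal, so FFT bin k is exactly k Hz
print(f"Peak found at {peak_hz} Hz")    # -> 14100
```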

Right, so back to Fourier analysis, conjugate variables, and digital audio filters. It happens that digital audio filters have conjugate variables of their own. In lay terms, the more sharply you cut off the undesired high frequencies, which is good for getting rid of frequency-based distortion, the more your filter will “ring” (the technical term is the Gibbs phenomenon - see Wikipedia for discussion), which is a time-based distortion. There is controversy over the audibility of “ringing” (one reason is that it usually occurs at ultrasonic frequencies), but why have this distortion in any case if you can avoid it?
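
Here’s a small scipy sketch that makes the trade-off visible (the tap counts and cutoff are numbers I picked for illustration): same cutoff frequency, but the sharper design rings for much longer.

```python
import numpy as np
from scipy import signal

fs = 44_100

def ringing_ms(numtaps, cutoff_hz):
    """Design a windowed-sinc low-pass and measure how long its impulse
    response stays above -60 dB of its peak (a rough 'ringing' yardstick)."""
    taps = signal.firwin(numtaps, cutoff_hz, fs=fs)
    env = np.abs(taps) / np.abs(taps).max()
    nz = np.nonzero(env > 1e-3)[0]          # sample indices above -60 dB
    return (nz[-1] - nz[0]) / fs * 1_000    # duration in milliseconds

# More taps -> sharper transition band -> longer ringing in time.
print(f"sharp  (1023 taps): {ringing_ms(1023, 21_000):.2f} ms")
print(f"gentle (  63 taps): {ringing_ms(63, 21_000):.2f} ms")
```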

So now let’s look at our 44.1kHz sample rate example. Due to some gentlemen named Nyquist, Shannon, and Whittaker, in this 44.1kHz sampled signal we want to get rid of all the frequencies above 22.05kHz (this is one-half the sample rate, a/k/a the “Nyquist frequency”). But some humans (mainly younger females) can hear up to 20kHz, and we would not want to deny the full spread of higher frequencies to our young Taylor Swift fans. That means our filter needs to go from passing everything to passing nothing within 2.05kHz. That’s a pretty sharp cutoff, and thus hard to achieve without ringing. But if we relax the sharp cutoff, we allow higher frequencies through, and we get aliasing and imaging distortion. (There are our conjugate variables in action.)

But now let’s oversample to a really high number, 768kHz. (We’re sticking with PCM, because introducing the subject of sigma-delta modulation to get DSD is just complexity we don’t need to make the point.) That 22.05kHz cutoff just became 384kHz (remember, half the sample rate), and the 2.05kHz spread that we had between all-pass and no-pass has increased more than 16-fold. So we have got loads of room for our filter to go from letting everything through to cutting everything off. Now it’s much, much easier to avoid both cutting too sharply (time-based distortion) and not cutting sharply enough (frequency-based distortion).
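
To put rough numbers on that, here’s a sketch using scipy’s Kaiser-window design estimate (the 80dB stopband target and band edges are illustrative choices of mine, not anyone’s actual DAC filter):

```python
from scipy import signal

def taps_needed(fs, pass_edge_hz, stop_edge_hz, atten_db=80):
    # kaiserord wants the transition width as a fraction of Nyquist (fs/2)
    width = (stop_edge_hz - pass_edge_hz) / (fs / 2)
    numtaps, _beta = signal.kaiserord(atten_db, width)
    return numtaps

# At 44.1 kHz: pass 20 kHz, stop by 22.05 kHz -- only 2.05 kHz to work with.
print(taps_needed(44_100, 20_000, 22_050))    # -> about 110 taps

# At 768 kHz: pass 20 kHz, stop comfortably below the new 384 kHz Nyquist.
print(taps_needed(768_000, 20_000, 350_000))  # -> about a dozen taps
```

Same stopband depth, a small fraction of the filter order: that’s the “loads of room” in practice.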

So that, finally, is why oversampling is done: It makes good filtering easier. Plain and simple.

7 Likes

Corollary…

Theory of Upsampled Digital Audio
Doug Rife, DRA Labs
Revised May 28, 2002
http://www.mlssa.com/pdf/Upsampling-theory-rev-2.pdf

(Below) Great resource, lots of information available…

:notes: :eye: :headphones: :eye: :notes:

@Jud

Thanks for the clarification!

If I’ve understood you correctly:

  • It’s not a question of if, it’s a question of where
  • My PC has more processing power than my DAC, so it’s the ideal place to do it
  • Depending on how the DAC oversamples (if it is doing a good job already) the difference between PC oversampling and DAC oversampling might not be huge

I think in that other thread, @Agoldnear pointed out that the DAC could introduce jitter (or some such term) and moving the workload to the CPU may avoid this.

I think this has given me a better sense of what to expect when I use upsampling in AV.

Add…

I’m in my 60’s and just did an online hearing test on AudioCheck.net. I could hear the 14kHz tone (just) but nothing above it. Will that affect my ability to discern differences between CPU oversampling and my DAC’s?

Andy

1 Like

Thanks, Jud, for this excellent intro to oversampling! Well written for a simple mind like me :grinning:.

You made me think about the term ,conjugate variables‘ and I had a flashback of the thermodynamic lectures I had to take. Although they hadn‘t been named conjugate variables, but the energy (internal, to be precise) of a system is described with such conjugates: pressure/volume; temperature/entropy, etc.

Never thought that oversampling reminds me of my chemistry lectures :rofl:!

This is a really great forum that I enjoy very much!

1 Like

No… Your hearing acuity is-what-it-is… and the signal presented through your system is-what-it-is… Do you have problems discerning details in the natural world? …Presumably not…

:notes: :eye: :headphones: :eye: :notes:

Understood.

So, in an imaginary scenario, if the DAC performs its upsampling perfectly and the PC does likewise, then there would be no difference in the outputs.

But the likelihood is that the PC, having more processing power and memory, and more advanced filter algorithms, should do a better job than the DAC, and if I connected a meter of some sort to the audio output, it would demonstrate the differences.

But my headphones, cables, ears, and listening style (etc) take the place of that meter, which may affect my perception of the differences. My expectations (aka confirmation bias) may also be a factor.

In my experience, different people discern different things in just about everything they see and hear. It’s what makes us human - and very annoying!

1 Like

The difference will be measurable, but whether you think it’s audible is another matter. So far, you’ve indicated you don’t hear a difference, and that’s absolutely fine. Measurements on ESS DACs by a professional, as I think I indicated in the other thread, turn out best when DSD256 or DSD512 is sent to the DAC using a 5th order modulator (B5 or A5 in Audirvana). If you hear no difference and there is any difficulty or bother associated with oversampling in the computer, then don’t do it. :slightly_smiling_face:

There are potential advantages and disadvantages to both. Making the DAC work harder, if the power supplies aren’t well isolated, could indeed cause electrical noise to enter the DAC’s clock, causing jitter. On the other hand, if there’s a direct electrical connection such as a USB cable between the computer and the DAC and the DAC doesn’t electrically isolate this input well, then electrical noise can enter the DAC’s clock, causing jitter. :slightly_smiling_face: General advice: Whatever sounds best, do it. If you hear no difference, fine, don’t worry.

Bottom line: Probably not.

More detail: Depending on the specific filter and the track being played, there might be quite rare occasions when it would potentially matter.

There are two ways that less-than-ideal filtering of ultrasonic frequencies can affect the audible tones we hear: aliasing and intermodulation.

Aliasing “mirrors” higher frequencies around the filter cutoff, so they “fold down” into audible frequencies. For example, if your low-pass filter is designed to cut off frequencies above 22.05kHz, but it’s sloppily done and allows noise through at 26.05kHz (cutoff +4kHz), then that noise will be “folded down” and appear at 18.05kHz (cutoff -4kHz). You of course wouldn’t hear this noise, since it’s ~4kHz above your hearing range. In that type of circumstance your auditory range would affect whether you hear the noise. But ESS makes a fine DAC, so unless the manufacturer has provided, and you have chosen, a filter that doesn’t cut much, very little of this noise should be getting through. (Of course you could also set Audirvana’s filter to cut very little and achieve the same problem, if for some reason you wanted to do that.) There isn’t a great deal of this noise in the first place, since musical instruments put the great bulk of their energy at audible frequencies, and the sampling noise sits up at 44.1kHz or above, where even a mild filter will take care of it.

Now besides aliasing we mentioned intermodulation. When sound waves or electrical signals pass through anything even slightly nonlinear (and every real-world device is), they mix, producing sum and difference frequencies. So for example, if you have tones at 19kHz and 20kHz, you’ll get artifacts at 39kHz (the sum) and 1kHz (the difference). One is far outside your hearing range (this is even more true for ultrasonic noise) while the other is well within it, and this will be the case for most intermodulation artifacts, meaning that the 14kHz upper limit of your auditory range should make little or no difference to whether you hear this distortion.
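
A toy numpy model of this (I’m using a small quadratic term as a stand-in for whatever real-world nonlinearity does the mixing; the levels are arbitrary):

```python
import numpy as np

fs = 192_000                    # a rate high enough that the 39 kHz product fits
t = np.arange(fs) / fs          # one second
clean = np.sin(2 * np.pi * 19_000 * t) + np.sin(2 * np.pi * 20_000 * t)

# A slightly nonlinear "device": mostly linear, plus a small squared term.
distorted = clean + 0.05 * clean ** 2

spectrum = np.abs(np.fft.rfft(distorted))   # 1-second signal, so bin k is k Hz
for f in (1_000, 19_000, 20_000, 38_000, 39_000, 40_000):
    db = 20 * np.log10(spectrum[f] / spectrum.max())
    print(f"{f:>6} Hz: {db:6.1f} dB")
```

The difference product pops out at 1kHz, squarely in the audible band, even though both original tones sit at the edge of (or beyond) hearing.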

Intermodulation, by the way, accounts for the fact that some of our ability to distinguish between the sounds of different instruments comes from the harmonics they produce, multiples of the fundamental frequency. A trumpet, for example, produces harmonics up into the 30kHz range. A trumpet sounds different to us than a bassoon not because we can hear sounds at 30kHz, but because those harmonics intermodulate with the other frequencies produced by the trumpet and give us its characteristic timbre in the audible range. (There are microphones and speakers that can reproduce harmonics up to the 100kHz range and thus recreate this intermodulation in our listening rooms, but few recordings are made with such microphones, and even fewer of us own such speakers. But that’s fine, as long as the audible-range products of the intermodulation are faithfully recorded and reproduced.)

So: If you think you are hearing a difference, go with whatever sounds better, and if you don’t hear a difference, go with whatever you please. :+1:

2 Likes

Thanks @Jud that’s a really clear explanation.

My last listening session used r8brain for PCM upsampling, but also the plugins I use to get the sound I want. Of course, they make a very audible difference!

It was definitely the best session I have had with this equipment, but I had also switched from AV’s EQ for headphone correction to Blue Cat’s Re-head, which bases its EQ on the reference .wav file for my headphones on AutoEQ. Blue Cat told me this is more accurate than manually setting 10 EQ values, and my experience agrees with them.

Whether the r8brain contributed to this or not I cannot yet say. I will have to do repeated listenings with it on and off to see if I can detect a difference. But since fiddling with my setup, I’m no longer getting interference when upsampling, which is a good thing.

1 Like

Great summary @Jud, kudos! I’ve designed and implemented filters in CPUs and DSPs—including in processing-constrained embedded systems—and you’ve done a solid job of capturing the advantages of oversampling. Moving from 2x to 8x oversampling, for example, greatly reduces the demands on the low-pass filter.

1 Like

You might try FIR filters with Convolution if you are looking for State Of The Art…

Final Thoughts

The sound quality of 44.1 kHz digital audio data can be dramatically improved by employing a “poor” oversampling digital anti-imaging filter having a slow roll-off in place of a “good” digital filter having a fast roll-off and a high stop band attenuation. It was shown that the ultrasonic images output by this “poor” filter are responsible for the improved sound quality, reducing certain forms of non-linear distortion such as that due to the differential non-linearity found in all DACs. There may very well be other, subtler, forms of non-linear distortion in DACs, which may also be reduced by signal-dependent ultrasonic dither.

In any case, there are certainly many other sources of non-linear distortion present in the signal chain. Some may question how such a small reduction in non-linear distortion due to differential non-linearity in DACs can be heard when much larger non-linear distortions are generated by loudspeakers, for example. The answer is that the non-linear distortions in question, like jitter-induced non-linearities, are uniquely digital in origin. Such digital distortions have no counterpart in the analog domain. It can be argued that human hearing is much more sensitive to certain digital forms of distortion as compared to the more common distortions of analog origin. For example, it is widely recognized that very low levels of jitter are audible even in the presence of much larger levels of harmonic distortion generated by loudspeakers.

Theory of Upsampled Digital Audio, Copyright © 2002 by Douglas D. Rife

1 Like

So, in the context of ‘Intermodulation and Timbre’ and the value of upsampling/modulation of lower resolution PCM files… the importance of this subject needs to be expanded on in detail, as it applies to subjective assessment… I forgot this was in my reference library…

HST.725 Music Perception and Cognition, Spring 2009
Harvard-MIT Division of Health Sciences and Technology
Course Director: Dr. Peter Cariani
Lecture 3: What we hear: Basic dimensions of auditory experience

https://ocw.mit.edu/courses/hst-725-music-perception-and-cognition-spring-2009/52563e93de4c8813e614115d8e3500d9_MITHST_725S09_lec03_what.pdf

:notes: :eye: :headphones: :eye: :notes:

1 Like

My experience remains that I am yet to hear the difference between AV upsampling and DAC upsampling.

In the same vein, I have many CDs that I have ripped to lossless FLAC files, but I have listened to, and sometimes purchased, hi-res versions of the same music from Qobuz.

I’m still not hearing a difference.

It could be my system, which is hardly state of the art. It could be my listening ‘technique’ (or whatever you call it), where I get ‘lost’ in the music and am not paying forensic-level attention to any particular aspect of it. Musically, I don’t have perfect pitch, but I have good relative pitch and hear ‘bum’ notes instantly. I was recently playing a friend’s piano and told her which keys needed their tuning corrected… So I’m not unaware of musical glitches, but maybe I don’t hear, or don’t have the gear to hear, the sort of errors upsampling fixes, especially if they are subtle.

I did replace my headphone cable and DAC, and now have a high-quality cable that carries balanced audio.

With this setup, I listened to Beethoven’s 9th (Jarvi version) and found myself gasping at the clarity, especially when the timpani kicked in. But it didn’t seem to matter whether I upsampled or not.

I understand upsampling better, thanks to @Jud and @Agoldnear patiently explaining it to me - thanks!

As a programmer, I don’t find it to be a particularly challenging algorithm - adding zeros is not computationally complex, although the finer details no doubt have their challenges. Compare it to upscaling a photograph, where detail has to be added by the algorithm, and audio upsampling becomes a very simple thing…
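
For the curious, the textbook version really is that short. A sketch (assuming scipy; this is the generic zero-stuff-then-filter recipe, not AV’s or any particular DAC’s actual implementation):

```python
import numpy as np
from scipy import signal

fs, factor = 44_100, 4                     # 44.1 kHz -> 176.4 kHz
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1_000 * t)          # 1 kHz test tone

# Step 1: "add zeros" -- insert factor-1 zeros between the samples.
# This alone creates images of the tone at 43.1, 45.1, 87.2 kHz, ...
stuffed = np.zeros(len(x) * factor)
stuffed[::factor] = x

# Step 2: low-pass at the ORIGINAL Nyquist to remove those images
# (the gain of `factor` makes up for the energy lost to the zeros).
taps = signal.firwin(255, fs / 2, fs=fs * factor)
upsampled = factor * signal.lfilter(taps, 1.0, stuffed)

# Or, in one library call: signal.resample_poly(x, up=factor, down=1)
```

No detail is invented anywhere: the filter only removes the images the zero-stuffing created, which is why it’s nothing like photo upscaling.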

To me, that means that the challenge faced by the DAC, when presented with an audio stream needing upsampling, is not algorithmic. The main challenge is time - the DAC is receiving a constant stream of data and simply has to process and output it in real time, otherwise all sorts of horribleness ensues.

AV, however, has the luxury of buffering its audio stream - there’s a slight delay when it starts to upsample, before it sends the results to the DAC. Provided its buffer is big enough, it will always have data ready to send to the DAC…

Then again, DACs are built for this very purpose. I have yet to experience a DAC failing to deliver audio on time…

Maybe I am lucky and my cables and power supplies (etc) are not causing any interference, so there’s just nothing different to hear. If I was that bothered, I suppose I could feed the audio output into something that could measure/log the audio and then compare things for differences, but I think that would be OTT, especially as I am more than happy with what I’m hearing.

I wonder if those who hear differences have tried a blind experiment - someone else turns upsampling off/on or adjusts the settings and the listener has to identify which is the upsampled version. That would be a worthwhile challenge, as confirmation bias and cognitive dissonance can affect human judgement…

Andy

1 Like

@AndyBell
I believe you mentioned somewhere that you are aligned with ASR dogma… If so you may have missed this:

Just keep in mind that no contextual harmonic, dynamic or spatial information is created in the interpolation… The original encoded source data remains… What is created in the increase of sample rate is an electrical waveform presented to the D/A output circuitry that is more revealing of the source encoding and more representative of a pure analog signal.

In the case of ESS DAC chipsets, all signals are homogenized by virtue of the Hyperstream DSP architecture… Not a bad thing, it is what it is… This, in concert with the adroitness of the DAC platform component topology (they do have character), is the lowest common denominator in the assessment… All other influences are system-specific and highly subjective in nature… So, your experience is what it is, and all that matters is how you feel about the playback experience in the context of appreciating the music you enjoy listening to.

Nothing that you are listening to in playback is real-time… It’s all the result of a flow of mechanisms.

I hope my input was helpful in getting to a place of playback/listening equanimity.
:notes: :eye: :headphones: :eye: :notes:

Andy,
you are not alone. You should know that there are very well-respected server manufacturers who design their servers for bit-perfect playback without oversampling and prefer to let the DAC do this work. It is one strong point of Audirvana to offer nice upsampling algorithms, but if you look over the fence you will see other camps who also have valid arguments.

2 Likes

Not particularly. I certainly don’t think upsampling adds information to the audio, not in the way upscaling adds it to a photo. You and @Jud helped me to figure out what audio upsampling does.

My programming background and tendency towards concrete analysis bias me towards data and testability. But I also have very definite artistic biases (richness in music, deeply saturated colour in art/photography) that are nothing more than personal taste.

I absolutely love what I’m hearing - the artistic part of me is very happy indeed.

I’m just a little frustrated that I can’t hear the difference between upsampling in AV and letting the DACs (I have two ESS and one FiiO R2R) do it. I accept it’s probably just me or the limitations (or even abilities) of my equipment.

I think I will just stick with no upsampling and the EQ settings that I like (which certainly make an audible difference) and not worry about it anymore.

I do appreciate your patience in explaining these things to me - a month ago I was listening to music on an iPod. To say my current setup is an order of magnitude better would be a huge understatement…

1 Like

There are two difficulties with trying to hear differences:

(1) Our auditory systems aren’t built to hold perfect copies in memory for longer than about 4 seconds under the circumstances of an A/B test. Any longer and the “B” sample will simply replace the “A” sample in memory, leaving nothing to compare “B” to. So listening to two sequential passages and trying to compare them is attempting the impossible. Thus it’s no use trying to get an objective comparison; we’re stuck with what are in actuality memories of our subjective impressions.

This is for most people; those who have trained themselves through hundreds or thousands of hours and know what they’re listening for can do pattern matching to what sounds “right,” and humans are really, really good at pattern matching.

(2) Upsampling doesn’t eliminate “glitches.” What it does is help reduce harmonic distortion, intermodulation distortion, and noise. So first you have to know what these sound like in order to listen for lower levels of them; and second, the levels produced by your DAC’s upsampling may be so low that any differences may be inaudible for humans in general, not just you. I simply do it because I figure starting out at the lowest possible distortion and noise is a good thing, whether or not I can actually hear a difference.

Now subjectively I love the sound I’m getting, but I’ll never claim this as some sort of objective truth for everyone.

So sit back and enjoy your music, whichever way you want to listen. :+1:

2 Likes

There is a level of fantasy in the assertions regarding the audibility of harmonic distortion, intermodulation distortion and noise in the playback of any given music production at any sample rate, because these anomalies are not steady-state signals and are spurious in reality… We don’t listen to pure sine waves or steady-state multi-tones… However, harmonic EMF and RF ‘hum’ and ‘hiss’ are steady-state influencers… Neither of these two influencers has anything to do with up-sampling filter performance.

There is a level of pragmatism in this statement; however, this perspective is from the point of view of the modulation of PCM to DSD in a system that provides an unfettered 1-bit PDM signal path, unlike the ESS chipset and other PCM-centric systems and 1-bit ladder DACs.

It is entirely possible that, given today’s state-of-the-art DAC platform design dogma, inexpensive DAC platform manufacturers have reached a point of design adroitness, especially surrounding on-chip DSP and the component topology in concert with power and ground-plane resources, where they have mooted the benefits of PCM file upsampling done in the computer prior to sending the interpolated signal to a PCM-centric DAC.

It appears the current trajectory is heading toward this design dogma of vertically integrated interpolation … especially in DAC platforms that are modulating PCM to 1-bit PDM (DSD).

So the value proposition of doing upsampling/modulation in the computer, prior to sending the signal to the DAC platform, will be device-specific as we move forward in digital-audio playback of PCM signals on PCM-centric DAC platforms.

:notes: :eye: :headphones: :eye: :notes:

1 Like

There’s software available that will add them to an audio file you supply:

https://distortaudio.org/

Harmonic distortion and noise are explicitly mentioned. Adding harmonic distortion adds intermodulation distortion as well. Various levels and types of jitter can be added, so you can hear the analog effects.

1 Like

This is sandboxing the signals in isolation… this is not music… Any spurious harmonic peaks in music do not exist in isolation, and I posit that the great majority of audiophiles are unable to identify any specific intermodulation distortion in complex signals, especially aliasing beyond 22kHz, in the context of the hearing acuity of any given human subject.

I will suggest that folks with generally good hearing are capable of discerning spatial cues with relative acuity; however, depending on the relative sub-harmonics produced by the Doppler effect in natural-world experience, localization is rather ambiguous… Reverberation is one elemental example… In the natural world, a reverberation has an infinite number of contextual harmonics.

:notes: :eye: :headphones: :eye: :notes: