An AI search told @AndyBell that oversampling (a/k/a upsampling; the technically correct term is interpolation) would clear up inaudible frequencies. I assume the AI means ultrasonics, because it’s a popular notion on the web that this is what high sample rates and oversampling are for: bringing out ultrasonic frequencies. Actually that’s almost the exact opposite of what oversampling is for. So let’s discuss.
First, the question is almost always not whether to oversample, but where. That is, nearly all DACs will take a “bit perfect” signal you feed them and do their own oversampling internally. However, many DACs will take a signal that is oversampled to DSD resolutions in a computer and pass it right on through to the final analog filter. Even where the DAC does additional oversampling, oversampling first in the computer usually avoids some oversampling steps in the DAC. So regarding whether to oversample, the issue is almost never yes/no, and nearly always computer/DAC. Modern computer CPUs can run more sophisticated filters than DAC chips can, so that’s one possible factor in the decision, but of course the final determination is up to your ears.
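If you’re curious what “oversampling first in the computer” looks like in practice, here’s a minimal sketch in Python with NumPy/SciPy (my choice of tools, nothing any particular player or DAC prescribes; the audio array is just a stand-in for real samples):

```python
import numpy as np
from scipy.signal import resample_poly

fs_in = 44_100                        # source sample rate (CD)
factor = 8                            # oversample 8x -> 352.8 kHz
audio_44k1 = np.random.randn(fs_in)   # stand-in for one second of real audio

# resample_poly zero-stuffs by `factor`, then applies its own low-pass
# (anti-imaging) filter; in other words it interpolates, it does not
# conjure up any new ultrasonic content.
audio_352k8 = resample_poly(audio_44k1, up=factor, down=1)

print(len(audio_44k1), len(audio_352k8))   # 44100 -> 352800 samples
```

Feed the result to the DAC at the higher rate and you’ve taken some (or all) of the oversampling work away from the DAC chip.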
Regardless of where it’s done, why has oversampling been standard engineering practice in digital audio since before there were such things as separate DACs? (Very soon after the introduction of the CD player, the DAC chips inside were already oversampling.)
I’m sure we’ve all heard of the Heisenberg Uncertainty Principle in quantum mechanics. (Rest easy, I’m not about to suggest there’s some weird quantum stuff going on in digital audio.) It says that the more certain you are of the momentum of a particle like an electron, for example, the less sure you can be of its position, and vice versa. This is because in the math being used, called Fourier analysis, momentum and position are what are called conjugate variables - the more precisely you pin down one, the less precisely you can know the other.
Well, Fourier analysis just happens to be the math used in making digital audio filters.
[A momentary aside: Why are we talking about filters? Because they are essential for digital audio. When you sample a signal at, for example, 44.1kHz, that frequency is sitting there in the sampled audio, and it wasn’t there in the original. Now yes, 44.1kHz is way above what we can hear, so why even bother with it? Because it interacts with other frequencies through aliasing and imaging, processes that result in noise and distortion at audible frequencies. Therefore we need to get rid of this frequency, and any other frequencies created by aliasing and imaging that weren’t in the original signal, and we do this with a low-pass filter (one that passes the lower frequencies we want, and gets rid of the higher frequencies that would cause unwanted noise and distortion).]
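To make the imaging point concrete, here’s a toy demonstration in Python/SciPy (the 4x factor, the 1kHz tone, and the 511-tap filter are all illustrative choices of mine, nothing canonical):

```python
import numpy as np
from scipy.signal import firwin, lfilter

fs = 44_100
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1_000 * t)      # one second of a clean 1 kHz tone

# Naive 4x upsampling by zero-stuffing: the new rate is 176.4 kHz, but
# "images" of the 1 kHz tone now sit at 43.1, 45.1 and 87.2 kHz,
# frequencies that were never in the original.
up = 4
stuffed = np.zeros(len(tone) * up)
stuffed[::up] = tone

# A low-pass FIR (cutoff ~20 kHz at the new rate) keeps the audio band
# and removes the images.
lp = firwin(numtaps=511, cutoff=20_000, fs=fs * up)
clean = lfilter(lp, 1.0, stuffed) * up    # gain of `up` restores the amplitude

def ultrasonic_peak(x, rate):
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1 / rate)
    return spec[freqs > 20_000].max()

print(ultrasonic_peak(stuffed, fs * up))  # large: the images are there
print(ultrasonic_peak(clean, fs * up))    # orders of magnitude lower: filtered out
```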
Right, so back to Fourier analysis, conjugate variables, and digital audio filters. It happens that digital audio filters have conjugate variables of their own. In lay terms, the more sharply you cut off the undesired high frequencies, which is good for getting rid of frequency-based distortion, the more your filter will “ring” (the technical term is the Gibbs phenomenon - see Wikipedia for discussion), which is a time-based distortion. There is controversy over the audibility of “ringing” (one reason being that it usually occurs at ultrasonic frequencies), but why have this distortion at all if you can avoid it?
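If you’d like to see that trade-off in numbers rather than take my word for it, here’s a rough sketch (Python/SciPy again; the 90dB stopband target and the 0.1%-of-peak “still ringing” threshold are arbitrary choices of mine) showing that the narrower the transition band, the longer the filter’s impulse response keeps ringing:

```python
import numpy as np
from scipy.signal import firwin, kaiserord

fs = 44_100

def ring_ms(transition_hz, atten_db=90):
    # Kaiser-window estimate of the FIR length needed for this transition band
    numtaps, beta = kaiserord(atten_db, transition_hz / (fs / 2))
    h = firwin(numtaps | 1, 20_000, window=("kaiser", beta), fs=fs)
    # crude "ringing" measure: how long the impulse response stays above
    # 0.1% of its peak value, in milliseconds
    ringing = np.abs(h) > 1e-3 * np.abs(h).max()
    return 1_000 * ringing.sum() / fs

print(ring_ms(2_050))    # sharp cutoff (pass 20 kHz, gone by 22.05 kHz): rings for milliseconds
print(ring_ms(10_000))   # relaxed cutoff: rings for a fraction of that
```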
So now let’s look at our 44.1kHz sample rate example. Due to some gentlemen named Nyquist, Shannon, and Whittaker, in this 44.1kHz sampled signal we want to get rid of all the frequencies above 22.05kHz (this is one-half the sample rate, a/k/a the “Nyquist frequency”). But some humans (mainly younger females) can hear up to 20kHz, and we would not want to deny the full spread of higher frequencies to our young Taylor Swift fans. That means our filter needs to go from passing everything to passing nothing within 2.05kHz. That’s a pretty sharp cutoff, which makes ringing hard to avoid. But if we relax the sharp cutoff, we allow higher frequencies through, and we get aliasing and imaging distortion. (There are our conjugate variables in action.)
But now let’s oversample to a really high number, 768kHz. (We’re sticking with PCM, because introducing the subject of sigma-delta modulation to get DSD is just complexity we don’t need to make the point.) That 22.05kHz cutoff just became 384kHz (remember, half the sample rate), and the 2.05kHz spread that we had between all-pass and no-pass has grown to 364kHz, more than 170-fold. So we have loads of room for our filter to go from letting everything through to cutting everything off. Now it’s much, much easier to avoid both cutting too sharply (time-based distortion) and not cutting sharply enough (frequency-based distortion).
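To put some rough numbers on that (my own back-of-envelope, using the same Kaiser-window estimate as the sketch above, with the same arbitrary 90dB stopband target):

```python
from scipy.signal import kaiserord

def taps_needed(passband_hz, nyquist_hz, atten_db=90):
    # transition band expressed as a fraction of the Nyquist frequency
    width = (nyquist_hz - passband_hz) / nyquist_hz
    numtaps, _ = kaiserord(atten_db, width)
    return numtaps

# 44.1 kHz: pass to 20 kHz, gone by 22.05 kHz -> roughly 120-odd taps
print(taps_needed(20_000, 22_050))
# 768 kHz: pass to 20 kHz, gone by 384 kHz -> a dozen or so taps
print(taps_needed(20_000, 384_000))
```

And because the dozen-or-so taps run at the much higher sample rate, that filter’s impulse response is also vastly shorter in time, which is exactly the ringing trade-off we were trying to escape.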
So that, finally, is why oversampling is done: It makes good filtering easier. Plain and simple.