I suspect that our belief in the perfection of CD audio is entirely the fruits of the efforts of 80s marketing firms tasked with selling CD players.
There is a general perception that digital audio sampled at 44.1kHz @ 16 bits per sample is as good as is necessary to replicate any piece of audio happening in nature. The reasoning goes that human pitch perception tops out at something like 22kHz and, given the Nyquist theorem, we need 2 samples for every cycle to faithfully represent a waveform, so that gives us a standard of around 44kHz. This seems reasonable so long as the assumptions are valid.
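The arithmetic behind that reasoning is tiny, but worth making explicit. A sketch, assuming the 22kHz figure for the upper limit of pitch perception (really 22.05kHz, working backwards from the CD standard):

```python
# Nyquist reasoning: to capture frequencies up to f_max,
# you need a sample rate of at least 2 * f_max.
f_max_hz = 22_050            # assumed upper limit of human pitch perception
sample_rate_hz = 2 * f_max_hz
print(sample_rate_hz)        # 44100 – the CD standard
```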
The basic assumption that human pitch perception limits at something like 22kHz is, perhaps, fine. Mine limits well below that in normal circumstances. I can’t say for certain it’s a good or bad assumption, but human hearing is far too complex to apply such simple limitations.
If you look at how our brains localise sounds, that is, discern the direction of the source, you’ll find a possible flaw in CD audio.
Like many of the features of the brain, there are lots of strategies employed simultaneously, sometimes even redundantly, to solve any particular task. To locate the direction of a sound source there are a couple of mechanisms. They work with varying degrees of precision and reliability in different circumstances. One mechanism works by Inter-aural Delay.
A sound is a perception, but the physical thing being detected is a pressure wave. That pressure wave arrives at each ear, but often at different times. If the sound source is directly in front of or behind you, the pressure wave hits each ear simultaneously, so the inter-aural delay is 0. If the sound source is in any other direction there will be a non-zero inter-aural delay. If there were a way to measure this delay, it could be used to determine the direction of the sound source.
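A toy model makes the geometry concrete. This is my own simplification, not something from the text: treat the ears as two points separated by head width d, so a plane wave arriving from angle theta (0 = straight ahead) reaches the far ear later by (d·sin(theta))/c:

```python
import math

def interaural_delay_s(theta_deg, head_width_m=0.21, speed_of_sound_mps=343.0):
    """Toy plane-wave model: extra path length to the far ear is
    d * sin(theta), so the delay is that distance over the speed of sound.
    head_width_m and speed_of_sound_mps are assumed round figures."""
    return head_width_m * math.sin(math.radians(theta_deg)) / speed_of_sound_mps

print(interaural_delay_s(0))    # 0.0 – straight ahead, no delay
print(interaural_delay_s(90))   # ~612 microseconds – sound directly to one side
```

The delay grows smoothly from zero (dead ahead) to a maximum of a few hundred microseconds (directly to one side), which is why it can encode direction at all.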
Things get a bit complicated here, but the idea as I’ve described it is pretty complete. Neurons are surprisingly slow. One of them firing takes a pretty long time, so measuring tiny time differences isn’t straightforward. Here’s how it’s done:
Our brains have a type of neuron that takes two signal spikes. If the spikes happen at the same time, the neuron fires. So it’s like a ‘simultaneous spikes detector’. Imagine the two inputs were ears, left and right, and let’s say there was an inter-aural delay of 1 second on the left ear (so the sound reached the right ear 1 second ahead of the left). The neuron would obviously not fire. But then imagine that there was some way of adding an artificial delay of 1 second into the right input. You would essentially have a ‘1 second inter-aural delay on the left’ detector.
If sound events were turned into spikes that were fed into this circuit, the circuit would light up when you have an inter-aural delay of 1 second on the left.
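The circuit can be sketched in a few lines. This is a deliberately crude, Jeffress-style simplification of my own (spikes as timestamps, the detector as a tolerance check), not a claim about real neurons:

```python
def detector_fires(left_spike_t, right_spike_t, right_delay, tolerance=1e-3):
    """Coincidence detector: add a fixed artificial delay to the right
    input, then fire only if the two spikes now line up (within tolerance)."""
    return abs(left_spike_t - (right_spike_t + right_delay)) < tolerance

# Sound reaches the right ear 1 second before the left: a 1 s left IAD.
left_t, right_t = 1.0, 0.0
print(detector_fires(left_t, right_t, right_delay=0.0))  # False: spikes don't coincide
print(detector_fires(left_t, right_t, right_delay=1.0))  # True: this is the '1 s left IAD' detector
```

A bank of these, each with a different built-in delay, turns a timing difference into a question of *which* detector lit up.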
Since the above circuit can only detect one inter-aural delay, the brain needs a few of them. Actually we have roughly 1 per degree. This is where the shortcomings of CD might start to show. The largest IAD occurs when a sound is directly to the left (or right) and the smallest is zero, corresponding to directly ahead (or behind). The angle between forward and right is 90 degrees, and we have 1 of those circuits for each degree, so that means we can detect IADs (in 800Hz sound events) of 6.94µs (625/90). CD audio samples are spaced at one every 22.68µs. My numbers could well be off. IAD varies a lot for lots of reasons. This page says that the minimum detectable IAD is 10µs, which is still much smaller than 1 CD audio sample.
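The comparison is easy to check. A sketch using the rough figures above (a maximum IAD of 625µs is an assumption on my part, and, as I said, these numbers could well be off):

```python
max_iad_s = 625e-6               # assumed largest inter-aural delay, ~625 microseconds
iad_per_degree_s = max_iad_s / 90  # one detector circuit per degree over a quarter turn
cd_sample_spacing_s = 1 / 44_100   # time between CD audio samples

print(iad_per_degree_s * 1e6)      # ~6.94 microseconds per degree
print(cd_sample_spacing_s * 1e6)   # ~22.68 microseconds between samples
```

So under these assumptions the timing resolution the brain uses sits several times finer than the spacing between CD samples.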
At least to an extent. It wouldn’t be able to tell you the difference between a sound coming from ahead versus from behind. This is where one of the other strategies comes in handy. I think the shape of our ears, for example, distorts sounds differently if they come from the front than from the back. Our brains can detect that distortion.