Audirvana Studio Review: Optimizing Sound Quality Through Software

mhsmit · April 17, 2025, 8:31pm

[UPDATED APRIL 19, with new detailed notes on my reference tracks at the end]

Introduction

After years of using Audirvana 3.5, I recently upgraded to Audirvana Studio despite my initial hesitation about its subscription model. What I discovered through careful optimization of its upsampling settings was nothing short of revelatory – improvements that rival or even exceed what I’ve experienced from significant hardware upgrades.

My audio journey has involved meticulous system building: clean power supplies, carefully selected power cables, and acoustic room treatment. Yet it was only after replacing some of my switching power supplies with linear ones (LPS) that I could truly hear the profound impact of Audirvana’s digital processing settings. My system now reveals subtleties in sound that I never knew existed in my familiar reference tracks. [UPDATE APRIL 19: Although linear power supplies on my digital components resulted in better sound to my ears, there are good switch-mode power supplies (SMPS) as well. In fact, my Benchmark DAC3 uses an internal SMPS and it sounds amazing. Also: I have achieved a cleaner system by replacing the “cheap” wall warts on my USB hub, hard disk and central Zyxel Ethernet switch with still-inexpensive Meanwell switching power supplies.]

In my experience, each element in an audio system has the potential to “deviate” from the ideal reproduction of music. My ultimate goal has always been to approach real-life sound – to make instruments and voices sound as they do in real life. This typically means aiming for a flat frequency response, which is what recording studios calibrate their monitoring systems to achieve. With a frequency response that’s anything but flat, I find that a system becomes “selective”: some recordings sound great, but others can sound terrible. While every human ear is different, and orchestral music poses different challenges than rhythmic pop or electronic music, a neutral foundation allows the recording itself – not the playback system – to determine the character of the sound.

Understanding Audirvana’s Upsampling Parameters

For those unfamiliar with digital audio processing, Audirvana offers exceptional control over how your music is upsampled before reaching your DAC. Each parameter affects different aspects of sound reproduction, and finding the optimal settings for your specific system can unlock remarkable improvements in sound quality.

Beyond frequency response, I’ve found that I’m particularly sensitive to the accuracy of the soundstage – the precise positioning of instruments and voices against their background. This is where the time domain becomes crucial, which leads us to one of Audirvana’s most important settings.

SoX Filter Phase (Min. Phase to Linear)

This setting determines how the upsampling filter balances between time-domain accuracy and natural timbre. It was one of the first parameters I optimized years ago because its effects are relatively easy to hear:

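At 100% (Linear Phase): All frequencies are delayed by the same amount, so there is no phase distortion, which benefits rhythmic precision and precise positioning. However, the filter rings symmetrically before and after each transient, and this pre-ringing can make acoustic instruments sound slightly less natural.
At 0% (Minimum Phase): Removes pre-ringing, so all ringing follows the transient, which preserves the natural timbre and attack of instruments. The delay now varies with frequency near the filter cutoff, which can slightly compromise precise positioning and timing.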

After extensive listening to various music genres, I settled on 66% as my preferred setting. This balanced approach maintains solid imaging while preserving the natural character of acoustic instruments. I recommend establishing this setting first before moving on to other parameters, as it forms the foundation of how your music will sound spatially.

SoX Filter Bandwidth (% Nyquist)

This setting determines where the upsampling filter begins to roll off high frequencies. At 100%, it would theoretically extend to the Nyquist frequency (half the sampling rate), but this can introduce significant issues. In my experience, setting the bandwidth too high can create interference and distortion in some tracks, resulting in noticeable harshness in upper mids and highs. This artificial edginess contributes to what many people refer to as “digital sound” – the very reason many listeners still prefer vinyl over digital.

Most DACs have their own internal filtering, but they rarely offer the fine-tuning capabilities that Audirvana provides. Without the ability to adjust these parameters, listeners are stuck with whatever compromise the manufacturer decided was “best” for the average system.

I settled on 94.5% as the optimal setting. When I experimented with lower values, I noticed a significant loss of “freshness” in string instruments and high-frequency content. This sweet spot preserves all the air and sparkle while avoiding the harsh artifacts that can make digital sound fatiguing. With this setting, orchestral strings maintain their natural shimmer without the artificial edginess that can make long listening sessions uncomfortable.
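
As a rough illustration of what the percentage means in hertz, here is a small Python sketch. It assumes the setting is simply a percentage of the source file's Nyquist frequency, as the parameter label suggests; the function name is made up for this example and is not part of Audirvana or SoX.

```python
def passband_edge_hz(sample_rate_hz: float, bandwidth_percent: float) -> float:
    """Frequency where the roll-off starts, assuming the setting is a % of Nyquist."""
    nyquist_hz = sample_rate_hz / 2.0  # half the sampling rate
    return nyquist_hz * bandwidth_percent / 100.0


# Compare a few bandwidth settings for a CD-quality (44.1 kHz) file.
for percent in (90.0, 94.5, 99.0):
    edge = passband_edge_hz(44100, percent)
    print(f"{percent:5.1f}% of Nyquist -> roll-off starts near {edge:.1f} Hz")
```

Under that assumption, 94.5% on a 44.1kHz file puts the start of the roll-off at roughly 20.8kHz, about 1.2kHz below the 22.05kHz Nyquist frequency.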

SoX Filter Length

This parameter sets the maximum length of the filter, i.e. how many samples of the signal it spans in the time domain. Longer filters allow a steeper, more accurate frequency response, but they also ring for longer, which can affect transients and spatial cues.

Through extensive testing, I discovered that even single-digit changes in this value could affect how instruments were positioned in the soundstage. For example, in Philip Glass’s “Anthem Pt. 3” from Powaqqatsi:

At 23300: The panning synth moved clearly from left to right, but the snare drum lost some of its natural reverb
At 23290: The synth became slightly less precise in its movement
At 23296: Perfect balance – clear instrument positioning with natural reverb and decay

The precision required here was stunning – at 23297, timpani attacks became less clear but had better integration with the orchestra. At 23295, the timpani attacks were precisely defined but the pan flutes lost some of their velvety quality.
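
The general trade-off between filter length, steepness and ringing can be sketched with a textbook windowed-sinc low-pass filter in Python. This is not SoX's actual filter design, and the tap counts are far smaller than the values discussed above; the function names and the cutoff of 0.2 (as a fraction of the sample rate) are arbitrary choices for illustration.

```python
import math


def windowed_sinc_lowpass(num_taps: int, cutoff: float) -> list:
    """Linear-phase low-pass FIR taps; cutoff is a fraction of the sample rate (0..0.5)."""
    center = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        x = n - center
        # Ideal low-pass impulse response (a sinc), with the x == 0 case handled.
        if x == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * x) / (math.pi * x)
        # A Hann window tapers the truncated sinc smoothly to zero at both ends.
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at 0 Hz


def gain_at(taps, freq: float) -> float:
    """Magnitude of the filter's frequency response at freq (fraction of sample rate)."""
    re = sum(t * math.cos(2.0 * math.pi * freq * n) for n, t in enumerate(taps))
    im = sum(t * math.sin(2.0 * math.pi * freq * n) for n, t in enumerate(taps))
    return math.hypot(re, im)


short = windowed_sinc_lowpass(31, cutoff=0.2)
long_ = windowed_sinc_lowpass(255, cutoff=0.2)
for name, taps in (("31 taps", short), ("255 taps", long_)):
    pre_ring = (len(taps) - 1) // 2  # samples of ringing before the main peak
    print(f"{name}: gain just above cutoff = {gain_at(taps, 0.22):.4f}, "
          f"pre-ringing span = {pre_ring} samples")
```

In this toy comparison the longer filter cuts off more sharply just above the cutoff, but its impulse response also starts ringing many more samples before the main peak.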

SoX Filter Anti-Aliasing

This setting determines how aggressively the resampler works to prevent aliasing artifacts. Aliasing artifacts occur when the digital conversion process creates unwanted frequencies that weren’t in the original signal. Think of it like a mirror reflection – frequencies that should be eliminated are instead “folded back” into the audible range, creating distortions and artificial sounds.

When upsampling, the mathematical process can create these spurious frequencies, and the anti-aliasing filter works to suppress them. Theoretically, 100% anti-aliasing should provide the cleanest sound by eliminating all of these artifacts.

Counterintuitively, I found that 96% sounded significantly better than 100%. At 100%, I experienced what I can only describe as “nervous smearing” in complex orchestral passages. At 97-99%, a certain digital flatness set in. At 96%, there was a perfect balance – string articulation remained precise while the music gained an emotional quality that was previously missing.

This perfectly demonstrates how “theoretical perfection” in digital processing doesn’t always translate to musical satisfaction. The 96% setting allowed just enough of the natural character of instruments to come through without sounding artificially processed. Excessive filtering, while mathematically “perfect,” can actually create its own subtle artifacts that affect the natural decay of harmonics and the micro-dynamics that give music its emotional impact.
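
The “mirror” picture of aliasing can be put into numbers with a few lines of Python. The sketch assumes ideal sampling with no filtering at all, which is the worst case and not what any real resampler does; the function name is hypothetical.

```python
def folded_frequency(freq_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a tone shows up after sampling with no filtering."""
    remainder = freq_hz % sample_rate_hz
    # Anything above Nyquist is mirrored back below it.
    return min(remainder, sample_rate_hz - remainder)


# Tones above the 22.05 kHz Nyquist limit of a 44.1 kHz stream fold back down.
for tone_hz in (10000.0, 25000.0, 30000.0):
    alias_hz = folded_frequency(tone_hz, 44100.0)
    print(f"{tone_hz:.0f} Hz appears at {alias_hz:.0f} Hz")
```

For example, an unfiltered 25kHz component in a 44.1kHz stream would land at 19.1kHz, well inside the audible range, which is why some filtering is needed in the first place.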

DSD Sigma-Delta Modulator Filter Types

[UPDATE APRIL 19: I have found that my Benchmark DAC3 sounds the most “magical” when I enable conversion to DSD64. I’ve tried “just” PCM upsampling to 2x the sampling rate or to maximum sampling rate (192kHz). But subsequent conversion to DSD sounds even better to my ears. It adds an “analog” and “realistic” feel to the music. Some have commented that the DSD64 limit of my DAC is sub-optimal. But I think it sounds excellent. I’ve also been really happy with my SACD collection, which also uses DSD64. I think that sometimes we should not let ourselves be guided by numbers only: It’s our enjoyment of the music that counts.]

When converting PCM to DSD, Audirvana first upsamples the PCM signal using SoX, then converts this high-rate PCM to a 1-bit DSD stream through sigma-delta modulation. The sigma-delta modulator’s filter type affects how this conversion happens by shaping the quantization noise (the unavoidable errors introduced when converting to a 1-bit format).
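
To give a feel for what sigma-delta modulation does, here is a minimal first-order modulator in Python. It is only a textbook sketch: the modulators in Audirvana are higher order (4th through 8th for Type A, as described below in this section) and shape the noise far more effectively. The function name and the test signal are assumptions made for illustration.

```python
def first_order_sigma_delta(samples):
    """Convert samples in the range [-1, 1] into a stream of +1/-1 values (1 bit each)."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Accumulate the difference between the input and the previous output bit.
        integrator += x - feedback
        # The 1-bit quantizer only keeps the sign of the accumulated error.
        bit = 1.0 if integrator >= 0.0 else -1.0
        bits.append(bit)
        feedback = bit
    return bits


dsd64_rate_hz = 64 * 44100  # DSD64 runs at 64 times the CD sample rate
bits = first_order_sigma_delta([0.5] * 6400)  # constant input level of 0.5
print(f"DSD64 bit rate: {dsd64_rate_hz} Hz")
print(f"first bits: {bits[:8]}")
print(f"average of bit stream: {sum(bits) / len(bits):.3f} (input level was 0.5)")
```

The individual bits look nothing like the input, but their average tracks it. The quantization error is pushed into the fast bit-to-bit fluctuations, which is the noise shaping that the filter types below handle in different ways.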

Audirvana offers three distinct filter types (A, B, and C) for DSD conversion, each with different approaches to noise shaping:

Type A comes in different orders (4th through 8th). Higher orders push more noise to higher frequencies outside the audible range but can affect timing and natural sound:

Type A (4th order) consistently provided the most natural, engaging sound in my tests
Higher orders (5th-8th) created a “blacker” background but introduced strange artifacts – acoustic guitars lost focus, bass became woolly, and orchestral strings started to sound more like a synthesizer than individual instruments
The tradeoff became clear: increasing technical “cleanliness” at the expense of musical engagement

Type B takes a different approach to noise shaping that is more aggressive in the midrange:

It offered an interesting alternative – wider, deeper soundstage but lighter tonal balance
It seemed to create a “hole” in the lower mids that made music sound less emotionally engaging
While it cleaned up some “confusion” in orchestral mids, it made music sound too “lightweight”

Type C uses a more recent, single-stage algorithm designed to balance noise shaping with natural decay:

It sounded cleaner but more sterile, lacking the warmth and emotional connection that Type A provided
It maintained good positioning (especially with xylophone and percussion) but instruments had less “body”
The timpani that was clearly audible during crescendos with Type A became masked with Type C

These differences were far from subtle. With each filter type, the entire character of the music changed – not just technical aspects but the emotional connection as well.

The Optimization Process

The journey to these optimal settings was methodical and required patience, but it was fun overall. Here’s what I learned:

Use reference tracks you know intimately. I selected seven tracks across different genres that I’ve heard hundreds of times, each testing different aspects of reproduction.
Change one parameter at a time. I began with filter bandwidth, then length, then anti-aliasing, and finally the DSD filter type.
The approach-overshoot-return method. Often, the best setting was found by moving in one direction until the sound degraded, then stepping back slightly. Without going “too far,” I wouldn’t have known where the optimal point was.
Trust your emotional response. Technical “perfection” sometimes sounds less engaging. When evaluating the DSD filter types, I consistently returned to Type A because it simply made music more moving, even if Type C sounded “cleaner.”
Listen for specific elements. In “Powaqqatsi,” I focused on the snare drum’s attack and reverb, the panning synthesizer’s movement, and how well I could still hear the timpani during crescendos.

[UPDATE APRIL 19:]

Enjoy the music. I find that I cannot find optimal settings when I’m tired, nor when I’m not actually enjoying what I’m listening to. The goal of optimizing my system isn’t to reach some technical solution or “mathematical perfection”, but to maximize musical enjoyment. I get a kick when an instrument or voice suddenly causes goosebumps. The optimization process should be fun. The moment you feel it becomes tedious… perhaps it’s time for a break because I feel that’s when we stop enjoying the actual music.
Take your time. I find that sometimes I have to revisit a change the next day. When we focus on one thing during a long listening session, we can forget about the “overall picture.” And… never get stuck on a single track. I don’t think I have any single track that I would be able to calibrate my system on. After each change, I force myself to re-audition all of my seven reference tracks.
Revisit prior changes. The various parameters of the SoX upsampler change the sound in different ways. An improvement to one parameter can reveal even more resolution in another. It’s a bit like focusing multiple pairs of binoculars in series: You go back and forth until suddenly everything falls into place and the system sounds “magical.”
Revisit after hardware changes. I have found that, as I improved my hardware and especially after I cleaned up my power supplies, some of my setting preferences changed because I had previously used them to mask external distortion. For example: a nasty “grit” in the upper mids might have been caused by bad power (possibly introduced by LED transformers on the same circuit, for example). You may have compensated for this by over-filtering these frequencies. But after the external distorting factor is gone, you can re-focus them and achieve fine-tuned results that had remained masked before.
Make notes. I find it extremely helpful to jot down notes during my reference track auditioning sessions. The recent optimization process took weeks, and my auditory memory isn’t perfect. For example, I would not have been able to write this review without going back to my extensive notes about how each setting affected me.
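
The approach-overshoot-return method from the list above can be sketched as a simple search in Python. Listening impressions are not numbers, so the sketch assumes you turn your notes into a score for each setting; the `score` function here is a made-up stand-in with its best value at 96, purely to show the mechanics.

```python
def approach_overshoot_return(start, step, score, max_steps=50):
    """Step a setting in one direction until the score drops, then return the last best value."""
    best_value, best_score = start, score(start)
    value = start
    for _ in range(max_steps):
        value += step
        current = score(value)
        if current < best_score:
            # Overshoot: the sound got worse, so the previous value was the sweet spot.
            break
        best_value, best_score = value, current
    return best_value


def toy_score(setting):
    """Hypothetical listening score that peaks at a setting of 96."""
    return -(setting - 96) ** 2


print(approach_overshoot_return(start=100, step=-1, score=toy_score))
print(approach_overshoot_return(start=90, step=1, score=toy_score))
```

In practice each call to `score` is a listening pass over all reference tracks, and the search is repeated for each parameter in turn, going back to earlier ones when a later change shifts the balance.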

Real-World Results

The improvements from these optimized settings were dramatic across my reference tracks:

Dire Straits - “Love Over Gold”: The acoustic guitar now has a focused position with natural warmth. Mark Knopfler’s voice reveals subtle details I’d never noticed before, with sibilants that sound natural rather than harsh. The xylophone panning across the soundstage at the end is startlingly realistic. The CD-quality version now comes remarkably close to my MoFi SACD version.

Philip Glass - “Anthem Pt. 3” from Powaqqatsi: The opening snare drum now displays clear positioning with natural reverb extending to the back of the soundstage. The panning synth moves precisely from left to right against a stable orchestral backdrop. Most impressively, during crescendos, individual instruments maintain their separation instead of smearing together. The pan flutes have a new “velvety” quality where I can almost hear the breath passing through them.

James Newton Howard - “Evacuation” from I Am Legend: The cellos at the beginning now sound like individual instruments rather than a homogeneous mass. The choir and strings occupy distinct spaces in the soundstage rather than blending together confusingly. The deep bass remains powerful during complex passages without masking other elements.

John Ottman - “They’ll Remember You” from Valkyrie: This track, which always sounded somewhat confused on my system, now presents clear separation between choir and orchestra when played with inverted polarity. The female soloist has a precise position at the front of the soundstage with natural presence.

Recommendations for Other Users

While my exact settings may not be optimal for every system, here’s my suggested approach for Audirvana users:

Start with SoX Filter Phase: Begin by determining your preferred balance between rhythmic precision and natural timbre. For predominantly acoustic or orchestral music, try settings between 50-70%. For electronic or rhythm-focused music, higher settings (80-100%) might be preferable. Pay attention to soundstage precision versus the natural sound of acoustic instruments.
Experiment with SoX Anti-Aliasing: Try 96% instead of 100% and listen for improvements in naturalness, particularly in string and vocal performances. This single adjustment can dramatically reduce digital harshness.
Fine-tune Filter Length: Begin around 20000-25000 and adjust in small increments while listening to complex musical passages. Focus on instrument separation during crescendos and the natural decay of sounds.
For DSD conversion: Start with Type A (4th order) as a baseline before trying other types. Higher orders aren’t necessarily better – they often trade natural sound for technical cleanliness.
System synergy matters: Clean power makes these differences more audible. If possible, use a linear power supply for your digital components.
Be methodical but trust your ears: Technical “perfection” doesn’t always sound best. If a setting makes you connect more emotionally with the music, that’s the right setting.

[UPDATE APRIL 19: After my recent Audirvana optimizations, I find myself glued to Qobuz, listening even to music in genres that I don’t usually listen to. When you suddenly find yourself wanting to keep listening (or wanting to turn up the volume), that is a sign that the optimization was actually successful. It’s an awesome thing. And one reason why I wrote this review. I hope others will benefit and I definitely hope that Damien and his team at Audirvana will be able to keep up their great work. In fact… I’m hoping one day to have Audirvana SoX upsampling in my B&W-equipped car as well. This is the one area where Roon ARC has beaten them.]

Alternatives I’ve Recently Explored

Before committing to Audirvana Studio’s subscription model, I explored several alternatives to ensure I was making the right choice:

Roon: While the latest version does offer upsampling options, it lacks the granular control that Audirvana provides. In my system and to my ears, Roon sounded quite good but didn’t come close to the results I achieved with Audirvana’s fine-tuned settings. The ability to adjust parameters by single digits makes a substantial difference that Roon’s more limited controls can’t match.

JPlay for iOS: This software enables bypassing the computer entirely, streaming directly to a network player. Even without upsampling capabilities, JPlay sounded remarkably good – better than Roon to my ears and surprisingly close to my Mac mini running Audirvana. For listeners seeking a simplified approach to high-quality audio, JPlay offers an excellent entry point. However, in direct comparison to my optimized Audirvana setup, JPlay sounded clean but “less analog” – missing that final layer of natural reproduction that properly configured upsampling can provide.

The Role of AI in My Optimization Journey

A crucial part of my recent optimization process was leveraging Claude AI to better understand these complex parameters. While I had a general understanding of what each setting controlled, Claude provided deeper technical explanations of what was happening in the digital domain.

For example, our discussions about anti-aliasing helped me understand why 100% might not be ideal – how over-aggressive filtering could affect natural harmonic decay patterns. Claude suggested specific things to listen for when adjusting each parameter (such as the positioning of instruments, reverb characteristics, and bass definition), which helped me focus my listening and make more informed adjustments.

This collaboration between human critical listening and AI technical knowledge proved remarkably effective in navigating the complex landscape of digital audio processing.

Conclusion

Audirvana Studio’s depth of configurability allowed me to optimize the digital signal path specifically for my system in ways I never imagined possible. The subscription cost, which I initially resisted, now seems wholly justified by the sonic improvements I’ve achieved.

What’s particularly striking is that these software optimizations have provided improvements comparable to significant hardware upgrades. Even with identical components, the difference between default and optimized settings is immediately apparent and musically significant.

For anyone serious about getting the most from their digital audio system, I can’t recommend Audirvana Studio highly enough. Its ability to be fine-tuned to your specific system makes it an essential tool for audiophiles seeking the highest sound quality from digital sources.

My System Details

Equipment Used:

Ethernet Switch: Uptone Audio EtherRegen, powered by Uptone JS-2 Linear Power Supply (LPS) at 9VDC [UPDATE APRIL 19: I actually have it set to 12V, as recommended by Uptone. But I haven’t noticed an audible difference between 9V and 12V and I have a fan that cools the EtherRegen so it doesn’t get hot either way.]
Player: Mac mini (late 2012) with Uptone Audio DC-Conversion / Linear Fan Controller Kit (MMK), powered by Audiophonics LPSU200 LPS
Operating system: macOS Monterey (installed with OpenCore Legacy Patcher)
USB renderer: Sonore UltraRendu 1.2 running SonicOrbiter software 2.9 powered by Uptone JS-2 LPS
DAC: Benchmark DAC3 with Uptone IsoRegen USB reclocker powered by SBooster LPS (set to 6.5V) [in my initial review I incorrectly wrote that the IsoRegen was also powered by the JS-2 LPS] and connected using an Uptone USPCB adapter
Amplifiers: Two Classé CT-M600 mono blocks
Speakers: Bowers & Wilkins 802 D2

Cables Used: (these matter enormously)

Power cord for Uptone JS-2 LPS: DR Acoustics (Quebec Canada) Back Moon 1.8m
Power cords for Audiophonics LPS, Benchmark DAC and Classé amps: DR Acoustics Red Moon 1.8m
Ethernet cable between EtherRegen and Mac mini and EtherRegen to UltraRendu: Supra CAT8 1m
USB cable from Mac mini to IsoRegen: DR Acoustics Digital Marvel 1m
Balanced Interconnects from DAC to amps: DR Acoustics Red Moon 1.5m
Distribution strip for amps: Furutech E-TP60E NCF Rhodium
Distribution strip for LPSes: Supra MD-06EU MK3
Power cord to wall and between distribution strips: Supra Lorad 2.5 SPC

Other Equipment Used: (these made a noticeable difference too)

Dedicated power line on its own phase, separated from switching devices, with 2.5mm² wire from the fuse to a Furutech Rhodium wall outlet
AHP Klangmodul fuse holder
AHP gold-plated copper fuse
Furutech fuses in the Classé mono blocks

Optimized Audirvana Settings:

SoX Filter Bandwidth (% Nyquist): 94.5
SoX Filter Max. Length: 23296
SoX Filter Anti-Aliasing (%): 96
SoX Filter Phase (Min. Phase to Linear): 66
DSD Sigma-Delta Modulator Filter Type: A (4th order)
Safe volume reduction before DSD upsampling: -4dB
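
For reference, the -4dB volume reduction can be translated into a plain amplitude factor with the standard decibel formula. The Python sketch below only shows that arithmetic; the function name is made up, and how Audirvana applies the reduction internally is not something this example claims to know.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


print(f"-4 dB corresponds to an amplitude factor of {db_to_linear(-4.0):.3f}")
```

A -4dB setting scales the signal to roughly 63% of its original amplitude, which leaves headroom for peaks during the DSD conversion.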

Reference Tracks: A Detailed Guide

Selecting proper reference tracks is crucial for system optimization. Below are the tracks I’ve used extensively, what each one reveals about a system, and specific improvements I observed after optimizing Audirvana’s settings.

Note that your list of reference tracks will be different. You should use tracks that you are intimately familiar with to optimize your system. We cannot calibrate our system when we don’t know for sure what each recording is supposed to sound like. This requires knowing the tracks very well, and having heard them in different environments and at different times.

1. Les Gestes Délicats - Yves Duteil (44.1kHz 16-bit CD Quality)

Why it’s useful: This intimate recording features voice with accordion, acoustic guitar, piano, and cello in a small setting. It reveals a system’s ability to handle delicate timbres and positioning.

What to listen for:

Accordion opening can sound harsh on lesser systems
Vocal detail and sibilance character
Acoustic guitar positioning
Piano key attacks and decay
Cello warmth and emotion

Improvements after optimization: The opening accordion now has a focused position with newfound warmth I’d never previously noticed. Yves Duteil’s voice reveals micro-details like subtle hesitations and natural breaths. The softness of consonants (“ma fenêtre vers l’avenir”) sounds remarkably natural, while sibilance is present but not exaggerated. The acoustic guitar now stays firmly centered rather than shifting position with certain notes. The piano’s attack transients sound more natural while maintaining clarity. Most impressively, the cello exhibits both improved detail and enhanced emotional warmth – a combination that previously seemed mutually exclusive.

2. Street Fighting Years - Simple Minds (44.1kHz 16-bit CD Quality)

Why it’s useful: This rock track with both acoustic and electronic elements tests how a system handles complex layering, bass definition, and soundstage coherence.

What to listen for:

Opening bass harmonics and plucked character
Wood block positioning and decay
Vocal clarity amid dense instrumentation
Electric guitar texture and aggression
Drum warmth during crescendos

Improvements after optimization: The opening bass now reveals harmonic overtones I’d never clearly heard before, with individual string plucks clearly defined against a blacker background. The wood block maintains stable positioning with natural reverb extending to the rear of the soundstage. Jim Kerr’s voice now has an improved “center image” that sounds like it’s coming from an imaginary center speaker, with improved sibilance that sounds natural rather than harsh. The electric guitar toward the right of the stage maintains its aggressive character but without fatiguing sharpness. Even after hundreds of listens, I’ve discovered new emotional intensity in both vocals and instrumentation that was previously obscured.

3. Anthem Pt. 3 (from Powaqqatsi) - Philip Glass (44.1kHz 16-bit CD Quality)

Why it’s useful: This soundtrack gradually builds to emotional crescendos, testing a system’s ability to maintain separation during complex orchestral passages.

What to listen for:

Snare drum positioning and reverb tail
Left-to-right panning synth clarity
Pan flute timbral accuracy
Timpani definition during crescendos
Instrument separation in complex passages

Improvements after optimization: The opening snare drum now has remarkable clarity with a reverb tail that extends naturally toward the rear of the soundstage. The panning synthesizer moves precisely from left to right without any smearing or vagueness. The pan flutes have gained a velvety quality that makes the performer’s breath almost tangible. During crescendos, individual instruments maintain their separation instead of collapsing into an indistinct mass. The timpani remains clean and powerful even during the loudest sections, while subtle details like the triangle in the background remain clearly audible throughout.

4. Sleepers Beat Theme - Ben-Lukas Boysen (44.1kHz 24-bit Hi-res)

Why it’s useful: This minimalist piano piece with deep bass and electronic elements tests a system’s low-level resolution and bass control.

What to listen for:

Ultra-deep bass stability and decay
Piano timbre and dynamic range
Background harp clarity
Silences between notes (system noise floor)
Natural fade-out at the end

Improvements after optimization: The deep bass opening now has remarkable stability with a clean decay that fades naturally without any “woolliness.” The piano notes have gained character and weight, with improved harmonics that make the instrument sound more real. The harp on the left maintains perfect clarity and sparkle against the dark background. As the final notes die away at the end of the track, they fade smoothly into silence without any digital artifacts or hesitation. The entire track now sounds more emotionally powerful while revealing subtle production details that were previously masked.

5. Evacuation (from I Am Legend) - James Newton Howard (44.1kHz 16-bit CD Quality)

Why it’s useful: This orchestral piece with choir tests how a system handles complex layering of similar frequency ranges and spatial separation.

What to listen for:

Individual cello definition in the opening
String section emotion and articulation
Choir separation from orchestra
Deep bass impact without bloating
Female vocal clarity

Improvements after optimization: The opening cellos now sound like distinct instruments rather than a homogeneous mass. The strings convey heightened emotional impact while maintaining precise articulation. The choir and orchestra now occupy distinct spaces within the soundstage rather than blending together confusingly. The deep bass synthesizer remains powerful during complex passages without overwhelming other elements. The acoustic basses sound deeper, warmer, and more emotionally engaging, while the female voices in the choir have gained an angelic quality with improved clarity.

6. They’ll Remember You (from Valkyrie) - John Ottman (44.1kHz 16-bit CD Quality with INVERTPOLARITY)

Why it’s useful: This orchestral piece with choir and solo vocalist reveals issues with complex harmonic interactions and was particularly challenging before polarity inversion.

What to listen for:

String clarity during complex passages
Choir/orchestra separation
Female solo voice focus and natural timbre
Overall coherence and lack of “smearing”
Emotional impact of the composition

Improvements after optimization: After applying polarity inversion and optimizing upsampling settings, this previously problematic track has been transformed. While I still detect slight smearing between choir and strings during the most complex passages, both elements maintain much better definition. The choir sounds remarkably clean with improved emotional connection. The female soloist at the end of the track now has precise positioning at the front of the soundstage with natural timbre and improved presence. This track demonstrates how both technical adjustments (polarity) and processing optimization can work together to solve challenging recordings.

7. Love Over Gold - Dire Straits (44.1kHz 16-bit CD Quality)

Why it’s useful: This audiophile favorite tests tonal balance, spatial reproduction, and natural timbre across acoustic instruments.

What to listen for:

Opening acoustic guitar warmth and definition
Knopfler’s vocal timbre and sibilance
Percussion spaciousness and natural decay
Xylophone natural harmonics
Left-to-right panning effects

Improvements after optimization: The opening acoustic guitar now has perfect focus with natural warmth and articulation. Mark Knopfler’s voice reveals subtleties I’d previously only noted on SACD, with sibilance that sounds completely natural. The percussion has gained improved spatial definition with clearer placement of each element. Most impressively, the xylophone that engages in a “duet” with the acoustic guitar now has precise positioning, and when it pans across the soundstage at the end of the track, each strike has remarkable clarity and natural harmonics. My optimized Audirvana settings bring the standard CD version remarkably close to my MoFi SACD version of this recording – something I would have thought impossible before.

These reference tracks have served me well because they span different genres and recording techniques while revealing different aspects of system performance. I recommend creating your own set of reference tracks that you know intimately, as they will provide a consistent baseline against which you can evaluate system changes.

Note on AI assistance

I used Claude.ai to help better understand the technical aspects of digital upsampling while optimizing my Audirvana settings over several weeks in March-April 2025. During this process, I documented detailed listening notes for each parameter change across my seven reference tracks, resulting in extensive observations. Claude helped explain technical concepts like anti-aliasing, filter behavior, and sigma-delta modulation, which improved my understanding of what I was hearing. For the review itself, Claude assisted in organizing these findings into a coherent structure while maintaining the specific observations and conclusions from my listening sessions. The descriptions of how each setting affected sound quality, the recommended optimization process, and all listening impressions are based on my actual experiences. I made final edits and refinements to ensure the review accurately reflected my recent Audirvana optimization steps.

Jud · April 17, 2025, 8:52pm

Thanks for your very thorough review. Just to let you know, I was informed by the developer of the modulators used in Audirvana that B7 and C are identical. (I thought I heard differences too.)

AndyLubke · April 17, 2025, 9:45pm

Also thanks for your thorough review. Out of curiosity… did you also try the r8brain upsample algorithm (also present in Audirvana Studio) instead of SoX? I am not claiming one is better than the other (and taste is subjective). But I am curious what your opinion is.

mhsmit · April 17, 2025, 10:17pm

I tried the r8brain upsampler when I first installed Studio in November. For me that was a step down from the SoX settings I had in Audirvāna 3.5. The SoX parameter tweaking described in the review above is the process I went through since then. r8brain doesn’t seem to offer the same kind of granularity, and with the latest SoX tweaks I reached truly great sound, so I didn’t revisit it.

I would imagine that the settings I reached aren’t just personal taste. They probably also would be different with every DAC.

Jud · April 17, 2025, 10:28pm

Here are graphs of Audirvana’s modulator noise profiles at DSD128 and DSD256, as measured by the modulator developer. (He didn’t provide graphs at DSD512 IIRC.)

Ddude003 · April 18, 2025, 4:47am

Pardon me if I missed anything you may have said about physical room correction with bass absorption and/or wide band and/or diffusion devices… It would be interesting to understand where your room baseline is “flat”… Do you use hardware or software room correction?

Agoldnear · April 18, 2025, 5:49am

Error #1 …Unless you recorded the product yourself, you have no baseline for qualitative assessment… purely subjective interpretive bias.

Note: The Benchmark DAC uses their audio-centric SMPS for lower noise than LPS.

Error #2… Using Ethernet protocol through a series of components.

Error #3… Running a software patcher on top of macOS, where the APIs are not native…

Recordings inherently sound different… Contextual harmonics, dynamics and spatialization are not exclusive to any musical genre recording and encoding… they are what they are… “A” 440/441 Hz is the same in all genres; what may vary is the timbre/harmonic complexity/density, and these non-linearities become subjective in nature in the amalgamated playback system and the acoustic environment in which the audition takes place.

What is neutral in any given playback system/environment configuration? Speakers imbue a character that cannot be teased out…
These are not mutually exclusive, they are synergistic, as @Ddude003 alludes…

You have provided great insight into your playback strategy… However, the subjectiveness of your interpretations cannot be removed and must be tempered by the factual exclusivity of your experiential interpretations… All valid for you, a good reference point for some… Thanks for your insights I am sure many will find them of great value…

Edit:
You did not include your PCM up-sampling target sample-rate (we can only presume you are employing a ‘Power of Two’ strategy). Also, you did not provide the PCM modulation target sample-rate for conversion to 1-bit PDM (DSD).

The up-sampling/modulation processing produces a more refined signal to the DAC output circuitry, and the Nyquist Fs aliasing is moved farther out of the range of human hearing. No contextual harmonic, dynamic and spatial information is added to that which was already inherent/imbued in the source PCM encoding(s)… only zeros are added to the file(s) in the processing. However, the Nyquist Fs of the PCM encoding will always be the limiting factor in regard to high-frequency bandwidth, as this is codified in the mastering of the product. So if the source were a 44.1 kHz file, the high-frequency bandwidth will be cut off at approximately 22 kHz in the up-sampled/modulated iteration.
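
As a rough illustration of the arithmetic in the two paragraphs above, here is a minimal Python sketch. The function names are hypothetical, and the assumptions are loud ones: "Power of Two" is taken to mean 2x/4x/8x integer multiples of the source rate, DSD rates are taken as multiples of 44.1 kHz (DSD64 = 64 x 44.1 kHz), and `zero_stuff` shows only the zero-insertion step of upsampling. A real resampler such as SoX follows that step with a low-pass interpolation filter, which is not modelled here, and this is not Audirvana's actual code.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a PCM stream at this sample rate can represent."""
    return sample_rate_hz / 2


def power_of_two_targets(source_rate_hz, factors=(2, 4, 8)):
    """Integer-multiple upsampling targets (assumed meaning of 'Power of Two')."""
    return [source_rate_hz * f for f in factors]


def dsd_rate_hz(multiple, base_rate_hz=44_100):
    """Bit rate of DSD<multiple>, assuming the usual 44.1 kHz base rate."""
    return base_rate_hz * multiple


def zero_stuff(samples, factor):
    """Insert (factor - 1) zeros after every sample.

    This is only the first step of integer-ratio upsampling; it adds no new
    musical information, which is the point made in the post above.
    """
    out = []
    for s in samples:
        out.append(s)
        out.extend([0.0] * (factor - 1))
    return out


if __name__ == "__main__":
    source = 44_100
    print("Source Nyquist (Hz):", nyquist_hz(source))
    print("PCM targets (Hz):", power_of_two_targets(source))
    print("DSD128 rate (Hz):", dsd_rate_hz(128))
    print("DSD256 rate (Hz):", dsd_rate_hz(256))
    print("Zero-stuffed x2:", zero_stuff([1.0, 2.0], 2))
```

Whatever the target rate, the usable audio bandwidth stays capped at the source's own Nyquist frequency (22.05 kHz for a 44.1 kHz file); only the output rate and the position of the images change.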

Amarok1969 · April 18, 2025, 6:10am

Thanks!

Great read, and a good reference starting point for those who are going to fiddle around.
Although I haven’t done it as extensively as you did, my experience is the same. Little tweaks can make a big difference.

The most important message is clear. Trust your ears… They are the one part of the setup that can’t be tweaked, yet they are essential to the total experience.
What sounds great to one could sound ‘awful’ to someone else.

You have triggered my urge to tweak the sound to the utmost again, so I’ll probably find myself immersed in testing several settings again…

Sailor · April 18, 2025, 7:44am

For the layman: what is the difference between the top two graphs (DSD128/DSD256) and the lower two graphs? Also, the y-axis, I guess, is dB, but what is the x-axis? I am surprised how often such undescriptive graphs are shown by highly trained electronics gurus. Such graphs would not survive a rigorous scientific publication process.

Sailor · April 18, 2025, 8:17am

Thank you very much for this in-depth coverage of your Audirvāna ‘journey’, most valuable and well written.

I certainly confirm your observations of JPlay iOS. It is definitely a great alternative in this world of audio players. And its sound is remarkably good.

One question, though, regarding linear power supplies: you run a 2012 Mac model that can be fitted with an LPS. What about newer Mac models that can’t be modified anymore? Would you still connect your computer directly to your DAC?

Fine review!!!

matt · April 18, 2025, 8:24am

Thanks for your impressions
Please disclose your set-up for JPLAY for iOS.
Which network player did you use?

Sailor · April 18, 2025, 9:35am

My setup (though @mhsmit’s setup might be entirely different) is either MinimServer or AssetUPnP for the library running on a Mac mini (last Intel generation) and streaming to a Raspberry Pi running GentooPlayer set to Mpd/UPmpdCli active. The Raspberry Pi is USB-connected to a Devialet.

It’s working quite well, though 24/192 files can cause occasional dropouts as I am using WiFi.

Audi100 · April 18, 2025, 11:05am

I have to admit that I never wasted any noteworthy thought on upsampling at all. To me, upsampling seems to be an attempt to »artificially« pimp up a given source. Am I absolutely wrong with that belief?

matt · April 18, 2025, 11:07am

Not at all…
I am with you, no upsampling for me

mhsmit · April 18, 2025, 12:26pm

Your question about newer Mac minis is one I also pondered over.

I initially worried that my 2012 Mac mini wouldn’t have sufficient performance, but I’ve not encountered any problems with Audirvāna. The OpenCore patcher has worked remarkably well. I boot from a Thunderbolt SSD and the Mac has 16GB RAM. It feels fast. Not for tasks such as video encoding, but as a server I find it great. I also have a second Mac mini that I keep as a backup, should anything go wrong with this one.

I don’t know whether newer Macs without the MMK 12V modification can match the sound quality; I would be interested in hearing from others. I am excited, though, that as my system evolved I started hearing more and more detail and realism, to the point that even the power cord to the Mac mini made a difference. I ignored it at first, upgrading the LPS for the UltraRendu and ISORegen first. I’m an IT guy, and the fact that “upstream” changes still make such a difference is counter-intuitive. Even replacing the SMPS of the Zyxel switch before the EtherRegen with a Meanwell one was audible.

Recently my UltraRendu’s USB port broke and I temporarily returned to using the Mac’s direct USB output. While it sounded good, the UR outmatches it: blacker background, better-defined deep bass. With the help of Joel at the excellent Swedish distributor Meta Wave I now have a new UltraRendu board, and I’m glad to have this excellent sound again. Interestingly, it sounds better streaming from the 2012 Mac mini with Audirvāna than directly from iOS using JPLAY (which still sounds really good as well).

Jud · April 18, 2025, 12:49pm

In a (gentle) word, yes.

Whatever upsampling you don’t do with Audirvana, your DAC will. The CPU in your computer is more capable and can run more sophisticated digital filters than the chip (or, in some more expensive DACs, the FPGA) inside your DAC can. This matters because digital audio needs filtering in order to work, and upsampling makes good filtering easier.

If anyone wants an explanation as to why that is, I can provide it.
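
One common way to see why upsampling makes filtering easier is to look at how much room a filter has between the top of the audio band and the first spectral image. The sketch below is only an illustration under stated assumptions (hypothetical function names, an audio band taken to end at 20 kHz, and the first image taken to start at the sample rate minus that band edge); it is not the explanation offered in the post above, nor a model of any particular DAC.

```python
import math


def image_start_hz(sample_rate_hz, audio_band_hz=20_000):
    """Lowest frequency of the first spectral image around the sample rate."""
    return sample_rate_hz - audio_band_hz


def transition_octaves(sample_rate_hz, audio_band_hz=20_000):
    """Width, in octaves, of the gap between the audio band and the first image.

    A wider gap lets a downstream filter roll off more gently while still
    suppressing the image.
    """
    return math.log2(image_start_hz(sample_rate_hz, audio_band_hz) / audio_band_hz)


if __name__ == "__main__":
    for rate in (44_100, 88_200, 176_400, 352_800):
        print(
            f"{rate} Hz: image starts at {image_start_hz(rate)} Hz, "
            f"gap of {transition_octaves(rate):.2f} octaves"
        )
```

At 44.1 kHz the gap is a fraction of an octave, so the filter has to be very steep; at 352.8 kHz the same audio band leaves about four octaves, so the steep filtering can be done once in software and whatever follows has an easier job.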

mhsmit · April 18, 2025, 12:49pm

Well I know what instruments sound like in real life. And especially the emotion they can convey and goosebumps they can produce. I own a mastering studio and know that some music is mixed to deviate from “hyper realism”, which is fine. But I have found any system that isn’t close to “neutral” to subtract from the enjoyment of a lot of music. It’s like setting an equalizer on a well-calibrated and acoustically treated room. I prefer to treat the room and let the system be as neutral as possible.

Yes, this is absolutely true, and I can testify that the Benchmark DAC3 sounds amazing with its built-in SMPS. On my ISORegen, UltraRendu, and Mac mini, the LPSes resulted in significant improvements to the sound though.

Also, replacing the standard wall warts of my cheap Zyxel switch, a USB hub and a 4TB hard disk with (still cheap) Meanwell SMPSes I bought on Amazon improved the system (I experienced a further reduction in remaining “unevenness” in the upper mids).

You’re right that JPLAY also reaches my UltraRendu over Ethernet. But in my case it streams from iOS over WiFi before it reaches the EtherRegen switch. In the case of the Mac mini the entire chain to the UltraRendu is wired Ethernet. A longer chain which, to my ears, still sounded better. Or maybe it’s Audirvana’s tweaked upsampling that made the essential difference.

I was hesitant about OpenCore too. But I haven’t noticed any degradation from when the same Mac mini was still running Audirvāna 3.5 using 32 bit macOS.

OpenCore just re-enables old hardware by re-installing drivers that Apple removed in later OSes. It’s not an “emulation layer” or anything like that. There’s no reason why an OpenCore-patched Mac would run slower, if the drivers and patches are bug free. A quad core Intel-powered Mac mini with 16GB RAM and SSD is plenty fast for Audirvāna, in my experience.

Jud · April 18, 2025, 12:54pm

If you want to experiment further you might try the inexpensive medical grade CUI SMPSs. They are in the US $30-$40 range and because they are medical grade they have very low leakage current, which is a bane of the cheap wall warts.

Jud · April 18, 2025, 1:00pm

Heh, no they wouldn’t. I have to run out for a bit, but I’ll provide more explanation for these graphs when I get back.

matt · April 18, 2025, 1:03pm

JPLAY for iOS works as a remote control for the UltraRendu. There is no streaming of music files from the iOS device. In your case the UltraRendu gets all music via wired Ethernet, also when you use JPLAY.