I’d like to do more scientific testing of the dynamic rate control approach. The goal is for the method to be transparent, so we need to find the point at which the pitch distortion becomes inaudible to the human ear. And please, no trolling about “accuracy”; that is irrelevant to this test.
I have not found any papers on this, so we’ll have to start with the basics: ABX testing.
The main parameter of dynamic rate control is the delta factor d. The factor d constrains the resampling ratio to the interval [ratio * (1 - d), ratio * (1 + d)].
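For intuition about what d means in musical terms: the worst-case pitch shift is a factor of (1 + d), i.e. 1200 * log2(1 + d) cents. A quick sanity check (the awk one-liners are just my illustration, not part of the test tools):

awk -v d=0.020 'BEGIN { print 1200 * log(1 + d) / log(2) }'   # ~34.3 cents
awk -v d=0.002 'BEGIN { print 1200 * log(1 + d) / log(2) }'   # ~3.5 cents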
Creating test samples: check out RetroArch from Git, cd into audio/test, and build the testing binaries with make. You can then create samples with test-rate-control.sh. Example:
./test-rate-control.sh reference_music.flac reference.wav 0.000 # Assumes reference_music.flac is 44.1kHz and stereo. Resamples to 48kHz with d = 0 (no rate control).
./test-rate-control.sh reference_music.flac drc_020.wav 0.020 # Resamples to 48kHz with 2% pitch deviation (will be very audible).
./test-rate-control.sh reference_music.flac drc_002.wav 0.002 # 0.2% pitch deviation (probably not audible).
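To map out where the threshold lies, it is convenient to generate a whole ladder of d values in one go. A small sketch (the file naming is my own choice):

for d in 0.000 0.001 0.002 0.005 0.010 0.020; do
    ./test-rate-control.sh reference_music.flac drc_${d}.wav ${d}
done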
The script uses test-sinc-highest (well over 100 dB SNR) to resample, and FFmpeg to do raw PCM conversions, etc.
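The raw PCM conversions are roughly of this kind (the file names, sample format and flags here are my assumptions, not necessarily what the script actually does):

ffmpeg -i reference_music.flac -f s16le -ar 44100 -ac 2 input.raw   # decode to raw PCM for the resampler
ffmpeg -f s16le -ar 48000 -ac 2 -i output.raw drc_020.wav           # wrap the resampled raw PCM as WAV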
For each input block of N frames, a resampling ratio is chosen uniformly at random from [ratio * (1 - d), ratio * (1 + d)]. Note that this does not correspond directly to how RetroArch behaves in practice: there, the actual ratio variance is generally far lower than the maximum allowed value, so this test simulates the worst case.
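To make the worst-case behaviour concrete, here is a sketch of the per-block ratio draw (illustration only, not the actual test code; 48000/44100 is roughly 1.08844):

awk -v ratio=1.08844 -v d=0.002 -v blocks=8 \
    'BEGIN { srand(); for (i = 0; i < blocks; i++) print ratio * (1 - d + 2 * d * rand()) }'

Each printed value is uniformly distributed in [ratio * (1 - d), ratio * (1 + d)], one per block.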
ABX tools:
Windows: fb2k’s ABX comparator (duh).
*nix: squishyball from Xiph.Org: http://svn.xiph.org/trunk/squishyball/ (also packaged for Arch).
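If I recall correctly, squishyball is simply pointed at the two files to compare, e.g.:

squishyball reference.wav drc_002.wav

The invocation above is an assumption from memory; check the man page for the exact test-mode flags.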
When ABX-testing, remember to add in a “beep” or something when swapping samples; the random ratio deviations make the files drift slightly in timing relative to the reference, so it is otherwise very easy to notice a difference purely from tiny timing deviations rather than from pitch.
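One way to produce such a beep and splice it in front of a sample is with FFmpeg (the 880 Hz tone, duration and file names are my own choices; the concat filter requires both inputs to share sample rate, channel count and sample format):

ffmpeg -f lavfi -i "sine=frequency=880:duration=0.25" -ar 48000 -ac 2 beep.wav
ffmpeg -i beep.wav -i drc_002.wav -filter_complex concat=n=2:v=0:a=1 drc_002_beep.wav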