I totally missed that DALI added audio decoding ops. As far as I can see, the ops seem to be based on sndfile and do not support GPU decoding. I wonder if they would still improve single-threaded ETL speed on PyTorch or TensorFlow.
Pinging DALI devs @mzient @szalpal, as they might have benchmarked the op against tfio or torchaudio, which are also sndfile-based.