I’m currently using ARI to play audio files (converted to mono WAV with 16bit PCM at 8kHz) via playback from URL. As the audio quality does not really match the standard of 2021, I’m looking for a way to improve the quality in the direction of HD Voice.
So, is there a way to use MP3Player from ARI or what other path could lead into the right direction?
16 bit 8kHz exceeds the public landline network quality (8 bit companded, 8kHz) and significantly exceeds mobile phone quality (vocoded).
If this does not involve the public network, you should convert the audio to the codec used in your network. If it does, you are already exceeding the capabilities of the network.
Asterisk generally requires recordings in other codecs to be raw media, without a WAV or similar wrapper; sox can create such files.
AFAIK mono WAV, 16bit PCM at 8kHz is the Asterisk native SLN format, which should be fine and transcoded on the fly, isn’t it? But I am grateful for any corrections!
Asterisk format suffixes are case sensitive. There is no SLN format. The sln format is not Microsoft WAV, as it has no RIFF wrapper; it is pure media, with no metadata. The Asterisk WAV format is Microsoft WAV format with GSM encoded audio. For Microsoft WAV format with all the attributes you quote, you need the Asterisk wav format.
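To tell which of those two cases a given file falls into, you can look at the first bytes. This is only a rough sketch: the function name has_riff_wave_header is made up for illustration, and it only checks for the RIFF/WAVE signature, not which codec is inside the wrapper.

```python
def has_riff_wave_header(path):
    """Return True if the file starts with a Microsoft RIFF/WAVE header.

    A headerless file (as Asterisk expects for sln) returns False.
    This does not inspect the codec stored inside the wrapper.
    """
    with open(path, "rb") as f:
        header = f.read(12)
    # Bytes 0-3 are "RIFF", bytes 4-7 are the chunk size, bytes 8-11 are "WAVE".
    return len(header) == 12 and header[0:4] == b"RIFF" and header[8:12] == b"WAVE"
```

If this returns False for a file named something.wav, the file is raw media carrying the wrong extension for Asterisk's wav format.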
Ah, all right! Then I communicated that wrongly. I use mono 16bit PCM at 8kHz as raw media with .wav extension. Sorry for the confusion.
Meanwhile I looked up the G.722 specifications, which are 16bit PCM at 16kHz. So I ask myself how to achieve the best possible audio quality with Playback in regard to G.722.
G.722 is more complex than that, and will have lower quality than 16 bit 16kHz signed linear. Asterisk supports files in raw 16 bit, 16kHz, little endian, mono, signed linear, using the filename extension .sln16, so if you record to that format, you will get no additional losses from transcoding to the variant of G.722 that Asterisk supports, beyond those inherent in using G.722.
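If the source is already a 16 bit, 16kHz, mono Microsoft WAV file, getting to .sln16 is mostly a matter of dropping the wrapper. A minimal sketch using only the Python standard library follows; the function name wav_to_sln16 is an assumption for illustration, and it deliberately refuses to resample or downmix, so anything else has to be converted beforehand (for example with sox).

```python
import array
import sys
import wave


def wav_to_sln16(wav_path, sln16_path):
    """Strip the RIFF wrapper from a 16 bit, 16 kHz, mono PCM WAV file.

    Writes raw little-endian signed linear samples and returns the
    number of samples written. Raises ValueError on any other layout.
    """
    with wave.open(wav_path, "rb") as w:
        layout = (w.getnchannels(), w.getsampwidth(), w.getframerate())
        if layout != (1, 2, 16000):
            raise ValueError("expected mono, 16 bit, 16000 Hz; got %r" % (layout,))
        frames = w.readframes(w.getnframes())

    # The wave module hands back native-endian samples on big-endian hosts,
    # so swap them back to little-endian there.
    if sys.byteorder == "big":
        samples = array.array("h")
        samples.frombytes(frames)
        samples.byteswap()
        frames = samples.tobytes()

    with open(sln16_path, "wb") as out:
        out.write(frames)
    return len(frames) // 2
```

The output file then contains nothing but samples, which is what the .sln16 extension promises to Asterisk.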
In that case, it will have been downgraded to G.711 A-Law (as you are in Europe), or GSM by the time it reaches the caller. The service you are getting from the PSTN is 3.1kHz audio, which means 300Hz to 3.4kHz. This is sampled at 8kHz to meet the Nyquist sampling criterion and allow a margin for practically realisable anti-aliasing filters.
The service the mobile network provides is speech, which is more restrictive than 3.1kHz audio.
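The arithmetic behind the 3.1kHz audio figures can be written out as a small sketch. The function name pstn_sampling_numbers is made up for illustration, and the 8 bits per sample is the companded landline figure mentioned earlier in the thread.

```python
def pstn_sampling_numbers(sample_rate_hz=8000, band_low_hz=300,
                          band_high_hz=3400, bits_per_sample=8):
    """Work out the Nyquist-related figures for a band-limited voice channel."""
    nyquist_hz = sample_rate_hz // 2          # highest representable frequency
    return {
        "bandwidth_hz": band_high_hz - band_low_hz,        # the "3.1 kHz" in the name
        "min_sample_rate_hz": 2 * band_high_hz,            # Nyquist criterion
        "nyquist_hz": nyquist_hz,
        "filter_margin_hz": nyquist_hz - band_high_hz,     # room for the anti-aliasing filter
        "bit_rate_bps": sample_rate_hz * bits_per_sample,  # companded 8 bit stream
    }
```

With the defaults this gives a 3100Hz bandwidth, a minimum sample rate of 6800Hz, and 600Hz between the top of the passband and the 4000Hz Nyquist frequency.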