I have the following VoIP setup: a caller using the pyVoIP library and an Asterisk server. I want to play a wave file using the pyvoip.writeAudio() functionality. The Asterisk server will then record the call. I use Asterisk version 18.10.0~dfsg+~cs6.10.40431411-2 and the configuration is straightforward:
sip.conf (all codecs allowed)
disallow=all
allow=all
extensions.conf
exten => 123,1,Answer()
same => n,Wait(1)
same => n,Record(/etc/asterisk/call-${CALLERID(num)}:wav,0,0,k)
same => n,Hangup()
The recorded file is then downloaded by the client via a socket. The main goal is to calculate PESQ values comparing the reference audio with the downloaded recording.
What specifications does the wave file need in order to transfer the audio uncompressed and have Asterisk record the call with high quality? I tried transmitting the audio as raw data, which did not work well. I saw in a different post that I have to transfer the audio as a mono, 8000 Hz, unsigned PCM file. The audio quality was not any better. What codecs would you recommend for VoLTE calls?
I expect to reach a PESQ value of 4.4 or higher if the client and the server are connected via LAN. The audio file I am using can be found here: https://www.signalogic.com/melp/EngSamples/Orig/ENG_M.wav.
Excuse the question; I am new to the audio world and lack experience. I would appreciate any tips, and thank you in advance.
Using allow=all breaks some versions of Asterisk and stands a good chance of breaking the system as a whole by producing excessively long INVITE packets.
What is the highest frequency that your codecs under test will use? Whilst Asterisk’s interpretation of .wav will give you landline quality, which is generally the best achievable for the 3.1kHz audio codecs (8kHz sampling rate) used by phones, the highest quality for a given bandwidth will be achieved using the appropriate .slin* format (.slin, .slin16, etc.; the extension might be sln, rather than slin) for the sampling rate. You will have to turn on high levels of debugging, or even look at the code, to make sure that Asterisk isn’t transcoding a long way round, resulting in a weaker codec being used as an intermediate. slin formats are not .wav compatible. They are raw 16-bit signed linear, in little-endian byte ordering.
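To make the difference between the two containers concrete, here is a minimal Python sketch that strips the header from a 16-bit mono WAV file and writes the samples as headerless signed linear. The function name and file paths are made up for the example. It assumes the input is already 16-bit mono PCM at the intended rate (8 kHz for plain slin) and does no resampling.

```python
import sys
import wave
from array import array


def wav_to_slin(wav_path, raw_path):
    """Write the samples of a 16-bit mono WAV file as headerless slin."""
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM")
        rate = wav.getframerate()
        samples = array("h", wav.readframes(wav.getnframes()))
    # The wave module hands back native-order samples; raw slin is
    # little-endian, so swap on big-endian hosts.
    if sys.byteorder == "big":
        samples.byteswap()
    with open(raw_path, "wb") as raw:
        raw.write(samples.tobytes())
    return rate, len(samples)
```

The returned sample rate is worth checking against the extension you give the raw file, since a headerless file carries no record of its own rate.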
I assume VoLTE applies to normal voice calls. All the codecs used on the public system for the phone service are speech-only codecs, so they will have lower quality than a landline codec of the same bandwidth. Which codec you get may be operator and location dependent.
Thank you for your response. I’ve made adjustments to the SIP configuration to exclusively accept ulaw, as it’s the only codec compatible with the pyVoIP library. Currently, the audio I’m working with is 8-bit 8 kHz unsigned PCM, totaling 132052 bytes. Upon conducting debugging, I can see that the incoming audio utilizes PCMU (G.711 u-law) with a clock rate of 8,000 Hz.
However, when the audio is recorded by Asterisk, it’s stored as 16-bit 8 kHz PCM, nearly doubling in size. Despite these adjustments, the PESQ values I’m obtaining remain unsatisfactory. Do you have any suggestions for better configurations? Additionally, is there a way to specify how Asterisk records the audio to ensure compatibility? Thank you in advance.
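For anyone wanting to check what is actually in the two files before running PESQ, a small sketch using Python's standard wave module is below. The function name is hypothetical. It only reads the header fields and does not judge audio quality.

```python
import wave


def describe_wav(path):
    """Return the header fields that matter when comparing two recordings."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "sample_rate_hz": rate,
            "frames": frames,
            "duration_s": frames / rate,
        }
```

If the reference and the recording report the same sample rate and a similar frame count, the size difference comes from the bits per sample alone, not from extra audio.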
µ-law to slin conversion should exactly double the size and is a lossless conversion (slin to µ-law is lossy, but all the losses are due to the use of µ-law). µ-law to .wav represents slightly more than doubling, as you have to add the .wav metadata. To reduce the size, you could use .ulaw, but this is a raw format, not something with a .wav wrapper.
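The doubling can be seen in a short Python sketch of the standard G.711 µ-law expansion. This is an illustration written for this thread, not Asterisk's own code, and the function names are made up. Each input byte becomes one little-endian signed 16-bit sample.

```python
import struct


def ulaw_to_linear(code):
    """Decode one G.711 mu-law byte to a signed 16-bit sample."""
    code = ~code & 0xFF                      # mu-law bytes are stored inverted
    magnitude = ((code & 0x0F) << 3) + 0x84  # mantissa plus bias
    magnitude <<= (code & 0x70) >> 4         # apply the segment (exponent)
    return 0x84 - magnitude if code & 0x80 else magnitude - 0x84


def ulaw_to_slin(payload):
    """Decode mu-law bytes to little-endian signed 16-bit samples (slin)."""
    samples = [ulaw_to_linear(b) for b in payload]
    return struct.pack("<%dh" % len(samples), *samples)
```

Every µ-law code maps to its own 16-bit value (the two zero codes both decode to 0), so nothing is lost in this direction and the output is exactly twice the input length.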
You should expect scores of 4.3 or 4.4 for a chain like this, where µ-law is the weakest link. Expecting more than 4.4 is unreasonable (and more than 4.5 is impossible, by definition).
If you are in North America (or Japan), µ-law is what the traditional, digital, PSTN uses.
Apart from codec choices, the only other things that would really have an effect are network problems and the choice of test signal.
Your test file is in the format that Asterisk calls .wav, so there should be no excess degradation using either .slin or .wav for the Asterisk capture.
I’m not really sure why you are doing this, especially as you are not using a low bit rate codec.
Most of the things that PESQ is looking for shouldn’t happen with a simple, time domain, codec, like µ-law.