Following up on the severe audio quality issues I described in "Choppy audio with DSP warnings": we have finally been able to capture RTP debug logs during an affected call, and have identified the cause.
The issue is that both Asterisk and our provider’s media server appear to switch their outgoing codec whenever they receive a packet from the other side in a different codec. Because this happens on both ends, some calls get stuck switching codecs back and forth indefinitely.
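To make the suspected feedback loop concrete, here is a minimal Python sketch of two endpoints that each adopt the codec of the last packet they received. It is a deliberately simplified model, not how Asterisk or the provider's media server is actually implemented: the `simulate` function and the lock-step "tick" timing are assumptions made for illustration only.

```python
def simulate(our_first, provider_first, ticks):
    """Toy model: both sides send one packet per tick, then each side
    switches to the codec of the packet it just received."""
    ours, theirs = our_first, provider_first
    history = []
    for _ in range(ticks):
        sent_by_us, sent_by_provider = ours, theirs
        history.append((sent_by_us, sent_by_provider))
        # The packets cross in flight, so each side adopts what the other sent.
        ours = sent_by_provider
        theirs = sent_by_us
    return history


history = simulate("G722", "PCMA", 6)
for tick, (us, provider) in enumerate(history):
    print(f"tick {tick}: we send {us}, provider sends {provider}")
```

In this model the two sides never agree: they simply swap codecs on every tick. The real logs are less regular than that, but show the same kind of flipping.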
Example call - INVITE SDP (from us):
v=0
o=- 1014299248 1014299248 IN IP4 [our IP]
s=Asterisk
c=IN IP4 [our IP]
t=0 0
m=audio 15968 RTP/AVP 9 0 8 101
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=ptime:20
a=maxptime:140
a=sendrecv
Note that G722 is the first-listed codec.
Example call - 200 SDP (from our provider):
v=0
o=- 1014299248 1014299250 IN IP4 [our provider's IP]
s=session
c=IN IP4 [our provider's IP]
t=0 0
m=audio 16476 RTP/AVP 8 0 9 101
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:9 G722/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=ptime:20
a=sendrecv
Note that PCMA is the first-listed codec.
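For anyone who wants to compare the first-listed codec across many calls, the following is a small Python sketch that pulls it out of an SDP body. The function name `first_listed_codec` is my own, and the SDP strings are trimmed to just the `m=` and `a=rtpmap` lines from the example above; it only handles a single audio stream.

```python
def first_listed_codec(sdp):
    """Return (payload_type, encoding_name) for the first format on the m=audio line."""
    payload_types = []
    names = {}
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m=audio "):
            # m=audio <port> <proto> <fmt> <fmt> ...
            payload_types = [int(pt) for pt in line.split()[3:]]
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <encoding name>/<clock rate>
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            names[int(pt)] = encoding.split("/")[0]
    if not payload_types:
        return None
    first = payload_types[0]
    return first, names.get(first, "unknown")


OFFER = """\
m=audio 15968 RTP/AVP 9 0 8 101
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
"""

ANSWER = """\
m=audio 16476 RTP/AVP 8 0 9 101
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:9 G722/8000
a=rtpmap:101 telephone-event/8000
"""

print("offer :", first_listed_codec(OFFER))
print("answer:", first_listed_codec(ANSWER))
```

For this call it reports payload type 9 (G722) for our INVITE and payload type 8 (PCMA) for the provider's 200, which is the mismatch described above.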
Example call - RTP debug logs:
Sent RTP packet to [our provider's IP]:16476 (type 09, seq 005417, ts 000160, len 000160)
Got RTP packet from [our IP]:39125 (type 00, seq 011309, ts 3885110459, len 000170)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006061, ts 062560, len 000170)
Sent RTP packet to [our provider's IP]:16476 (type 09, seq 005418, ts 000320, len 000160)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006062, ts 062720, len 000170)
Got RTP packet from [our IP]:39125 (type 00, seq 011310, ts 3885110619, len 000170)
Sent RTP packet to [our provider's IP]:16476 (type 09, seq 005419, ts 000480, len 000160)
Got RTP packet from [our IP]:39125 (type 00, seq 011311, ts 3885110779, len 000170)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006063, ts 062880, len 000170)
Sent RTP packet to [our provider's IP]:16476 (type 09, seq 005420, ts 000640, len 000160)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006064, ts 063040, len 000170)
Sent RTP packet to [our provider's IP]:16476 (type 09, seq 005421, ts 000800, len 000160)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006065, ts 063200, len 000170)
Got RTP packet from [our IP]:39125 (type 00, seq 011312, ts 3885110939, len 000170)
Got RTP packet from [our IP]:39125 (type 00, seq 011313, ts 3885111099, len 000170)
Got RTP packet from [our provider's IP]:16476 (type 08, seq 004688, ts 000056, len 000160)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006066, ts 063360, len 000170)
Sent RTP packet to [our provider's IP]:16476 (type 08, seq 005422, ts 000960, len 000160)
Got RTP packet from [our provider's IP]:16476 (type 09, seq 004689, ts 000216, len 000160)
[2024-02-19 06:26:35] WARNING[15003]: dsp.c:1469 ast_dsp_silence_noise_with_energy: Can only calculate silence on signed-linear, alaw or ulaw frames :(
Got RTP packet from [our IP]:39125 (type 00, seq 011314, ts 3885111259, len 000170)
Sent RTP packet to [our provider's IP]:16476 (type 09, seq 005423, ts 001120, len 000160)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006067, ts 063520, len 000170)
Got RTP packet from [our IP]:39125 (type 00, seq 011315, ts 3885111419, len 000170)
Got RTP packet from [our provider's IP]:16476 (type 09, seq 004690, ts 000376, len 000160)
Sent RTP packet to [our provider's IP]:16476 (type 09, seq 005424, ts 001280, len 000160)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006068, ts 063680, len 000170)
Got RTP packet from [our provider's IP]:16476 (type 09, seq 004691, ts 000536, len 000160)
Sent RTP packet to [our provider's IP]:16476 (type 09, seq 005425, ts 001440, len 000160)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006069, ts 063840, len 000170)
Got RTP packet from [our provider's IP]:16476 (type 09, seq 004692, ts 000696, len 000160)
Got RTP packet from [our IP]:39125 (type 00, seq 011316, ts 3885111579, len 000170)
Got RTP packet from [our IP]:39125 (type 00, seq 011317, ts 3885111739, len 000170)
Sent RTP packet to [our provider's IP]:16476 (type 09, seq 005426, ts 001600, len 000160)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006070, ts 064000, len 000170)
Got RTP packet from [our provider's IP]:16476 (type 09, seq 004693, ts 000856, len 000160)
Sent RTP packet to [our provider's IP]:16476 (type 09, seq 005427, ts 001760, len 000160)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006071, ts 064160, len 000170)
Got RTP packet from [our IP]:39125 (type 00, seq 011318, ts 3885111899, len 000170)
Got RTP packet from [our IP]:39125 (type 00, seq 011319, ts 3885112059, len 000170)
Got RTP packet from [our provider's IP]:16476 (type 08, seq 004694, ts 001016, len 000160)
Sent RTP packet to [our provider's IP]:16476 (type 08, seq 005428, ts 001920, len 000160)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006072, ts 064320, len 000170)
Got RTP packet from [our IP]:39125 (type 00, seq 011320, ts 3885112219, len 000170)
Got RTP packet from [our provider's IP]:16476 (type 09, seq 004695, ts 001176, len 000160)
[2024-02-19 06:26:35] WARNING[15003]: dsp.c:1469 ast_dsp_silence_noise_with_energy: Can only calculate silence on signed-linear, alaw or ulaw frames :(
...
Note that the first packet we send to our provider is G722 (type 09), while the first packet we receive from our provider is PCMA (type 08). On receiving a PCMA packet, we switch to PCMA; at the same time, on receiving our G722 packet, our provider switches to G722. This switching back and forth goes on forever because both ends are “polite” and willing to switch to the codec used by the other side.
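Spotting these flips by eye in a long log is tedious, so here is a rough Python sketch that scans RTP debug output and reports every payload type change per direction and peer. The regular expression is an assumption based only on the log lines shown above; the helper name `payload_switches` is hypothetical, and other Asterisk versions may format these lines differently.

```python
import re

# Matches e.g. "Got RTP packet from <peer>:<port> (type 08, seq 004688, ..."
LINE_RE = re.compile(
    r"^(Sent|Got) RTP packet (?:to|from) (.+?):(\d+) .*?\(type (\d+), seq (\d+)"
)


def payload_switches(log_text):
    """Return (direction, peer, old_type, new_type, seq) for every change
    of payload type within the same direction and peer."""
    last_type = {}
    switches = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip WARNING lines and anything else
        direction, host, port, ptype, seq = match.groups()
        key = (direction, f"{host}:{port}")
        ptype = int(ptype)
        previous = last_type.get(key)
        if previous is not None and previous != ptype:
            switches.append((direction, key[1], previous, ptype, int(seq)))
        last_type[key] = ptype
    return switches


SAMPLE = """\
Sent RTP packet to [our provider's IP]:16476 (type 09, seq 005421, ts 000800, len 000160)
Sent RTP packet to [our IP]:39125 (via ICE) (type 00, seq 006065, ts 063200, len 000170)
Got RTP packet from [our provider's IP]:16476 (type 08, seq 004688, ts 000056, len 000160)
Sent RTP packet to [our provider's IP]:16476 (type 08, seq 005422, ts 000960, len 000160)
Got RTP packet from [our provider's IP]:16476 (type 09, seq 004689, ts 000216, len 000160)
[2024-02-19 06:26:35] WARNING[15003]: dsp.c:1469 ast_dsp_silence_noise_with_energy: Can only calculate silence on signed-linear, alaw or ulaw frames :(
Sent RTP packet to [our provider's IP]:16476 (type 09, seq 005423, ts 001120, len 000160)
"""

switches = payload_switches(SAMPLE)
for direction, peer, old, new, seq in switches:
    print(f"{direction} {peer}: type {old:02d} -> {new:02d} at seq {seq}")
```

On the excerpt above this reports three changes on the provider leg: we go 09 to 08 and back to 09, while the provider goes 08 to 09. The other leg (type 00 throughout) produces no output.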
As an interim measure, we have restricted our allowed codec list to PCMA only, so that there is nothing to switch between. Going forward, however, we would like to use the codec our provider prefers for a given call where possible, and in particular to use G722 (wideband) audio when it makes sense. Our provider lists codecs in a different order for different calls, presumably to indicate its order of preference for that call.
What is the best way to handle this situation? Given that our provider switches codecs on the fly, should we configure Asterisk not to do the same? In particular, would it be possible to configure Asterisk to use the codec preferred by our provider (PCMA in this example) from the start, rather than initially sending our first-choice codec (G722) and then switching on the fly if we receive something different?