Forum Post Title: How to prevent G.722 (16kHz) from transcoding to 8kHz SLIN before EAGI script?
Hello everyone,
I’m working on a voicebot project using Asterisk, a Python EAGI script, and the Google Cloud Speech-to-Text API. My goal is to maintain a full 16kHz audio pipeline from an incoming call using the G.722 codec all the way to my script for high-quality transcription.
Despite my configuration, I’m seeing Asterisk transcode the audio down to 8kHz, which is causing problems for my speech recognition setup. I’ve hit a wall trying to debug this and would greatly appreciate any insights from the community.
The Core Problem
An incoming call is correctly established using G.722 (16kHz). However, when I check the active channel, Asterisk is transcoding the audio before it even reaches my EAGI application.
Here is the output from core show channel [channel-id], which is the key evidence of the issue:
-- General --
Name: SIP/provider-trunk-0000000b
Type: PJSIP
UniqueID: 1750859102.22
LinkedID: 1750859102.22
Caller ID: [REDACTED_CALLER_ID]
...
NativeFormats: (g722)
WriteFormat: g722
ReadFormat: slin
WriteTranscode: No
ReadTranscode: Yes (g722@16000)->(slin@8000) <-- THE PROBLEM IS HERE
Time to Hangup: 0
Elapsed Time: 0h0m18s
Bridge ID: (Not bridged)
-- PBX --
Context: from-internal
Extension: [REDACTED_DID]
Priority: 24
Application: EAGI
Data: /var/lib/asterisk/agi-bin/voicebox/voicebox.py
As you can see, the ReadTranscode line shows the incoming 16kHz G.722 audio being converted to 8kHz SLIN (slin@8000). My Python script is configured for 16kHz LINEAR16 audio, so this transcoding is the root of my problem.
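To check for this condition without reading the output by eye, here is a minimal Python sketch that scans core show channel output for a transcode that lowers the sample rate. The line layout is assumed from the output pasted above and may differ between Asterisk versions; find_downsampling is a hypothetical helper name, not part of any Asterisk tooling.

```python
import re

# Matches lines such as: "ReadTranscode: Yes (g722@16000)->(slin@8000)"
# (layout assumed from the output shown in this post)
TRANSCODE_RE = re.compile(
    r"^(Read|Write)Transcode:\s*Yes\s*\((\w+)@(\d+)\)->\((\w+)@(\d+)\)"
)

def find_downsampling(channel_output):
    """Return one tuple per transcode path whose target rate is lower than its source rate."""
    hits = []
    for line in channel_output.splitlines():
        match = TRANSCODE_RE.match(line.strip())
        if not match:
            continue
        direction, src, src_rate, dst, dst_rate = match.groups()
        if int(dst_rate) < int(src_rate):
            hits.append((direction.lower(), src, int(src_rate), dst, int(dst_rate)))
    return hits

sample = """WriteTranscode: No
ReadTranscode: Yes (g722@16000)->(slin@8000)"""
print(find_downsampling(sample))
```

For the channel above, this reports a single read-side path from g722 at 16000 Hz to slin at 8000 Hz.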
My Configuration
I am using PJSIP for the trunk. The configuration was originally written for sip.conf; here is the equivalent pjsip.conf configuration.
/etc/asterisk/pjsip.conf
[transport-udp]
type=transport
protocol=udp
bind=0.0.0.0
[provider-trunk]
type=endpoint
context=from-internal
disallow=all
allow=g722,ulaw,alaw ; g722 is preferred
aors=provider-trunk
direct_media=no
rtp_symmetric=yes
force_rport=yes
[provider-trunk]
type=aor
contact=sip:[PROVIDER_IP] ; The static IP of my provider
[provider-trunk]
type=identify
endpoint=provider-trunk
match=[PROVIDER_IP] ; The IP the calls come from
/etc/asterisk/extensions.conf
This is my dialplan that handles the call. It plays a welcome message and then enters a loop to interact with my EAGI script.
[general]
autofallthrough=yes
[from-internal]
exten => _.,1(start),Set(CHANNEL(format)=slin16)
exten => _.,n,Answer()
exten => _.,n,Verbose(1, ReadFormat=${CHANNEL(readformat)})
exten => _.,n,Set(callstart_time=${EPOCH})
exten => _.,n,Ringing
exten => _.,n,Wait(2)
exten => _.,n,Set(i=1)
exten => _.,n,Set(VBDIR=/var/lib/asterisk/agi-bin/voicebox)
exten => _.,n,Set(voicedir=${VBDIR}/incoming/voice-${UNIQUEID})
exten => _.,n,Set(voicefile=${voicedir}/${i}.wav)
exten => _.,n,MixMonitor(${VBDIR}/output.wav,r(${voicefile}))
; Play a welcome message
exten => _.,n,Set(audio=${VBDIR}/short_welcome)
exten => _.,n,Playback(${audio})
; Loop for conversation with the bot
exten => _.,n(loop),While($[${i} < 24])
exten => _.,n,GotoIf($["${hangup}" = "True"]?hangup,1)
exten => _.,n,eagi(${VBDIR}/voicebox.py)
exten => _.,n,StopMixMonitor()
exten => _.,n,Set(i=$[${i} + 1])
exten => _.,n,Set(voicefile=${voicedir}/${i}.wav)
exten => _.,n,MixMonitor(${VBDIR}/output.wav,r(${voicefile}))
exten => _.,n,Background(${audio})
exten => _.,n,EndWhile
exten => _.,n,hangup()
; Hangup logic
exten => h,1,agi(${VBDIR}/upload_audio.py)
exten => h,n,System(rm -rf ${voicedir})
exten => h,n,agi(${VBDIR}/post_call.py)
exten => h,n,hangup()
Python EAGI Script Summary
- voicebox.py: This is the main script called by eagi() in the dialplan. Its primary job is to manage the call flow and invoke the speech recognition script.
- speech.py: This script uses the Google Cloud Speech library. The critical configuration within this script is:
# speech.py snippet
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='sv_SE'
)
This confirms the script is correctly set up to process 16kHz audio.
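For context on why the rate matters: EAGI hands the script the inbound call audio as raw signed-linear samples on file descriptor 3, so sample_rate_hertz has to match whatever rate Asterisk actually writes there. The ReadTranscode line above suggests that is currently 8kHz. Below is a minimal sketch of the reading side, not the actual code in voicebox.py; the function names and the 100 ms chunk size are assumptions made for illustration.

```python
import os
import sys

AUDIO_FD = 3          # EAGI delivers inbound call audio on file descriptor 3
BYTES_PER_SAMPLE = 2  # signed linear, 16-bit mono

def read_agi_env(stream):
    """Read the 'agi_*: value' header lines Asterisk sends, up to the first blank line."""
    env = {}
    for line in stream:
        line = line.strip()
        if not line:
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env

def chunk_bytes(sample_rate, chunk_ms):
    """Number of bytes in chunk_ms milliseconds of 16-bit mono audio."""
    return sample_rate * BYTES_PER_SAMPLE * chunk_ms // 1000

def audio_chunks(fd, size):
    """Yield raw audio chunks (possibly shorter than size) until the pipe is closed."""
    while True:
        data = os.read(fd, size)
        if not data:
            break
        yield data

def main(sample_rate):
    # sample_rate must equal the rate Asterisk writes to fd 3,
    # and the same value must be passed as sample_rate_hertz.
    read_agi_env(sys.stdin)
    for chunk in audio_chunks(AUDIO_FD, chunk_bytes(sample_rate, 100)):
        pass  # hand the chunk to the speech recognizer here
```

At 16kHz a 100 ms chunk is 3200 bytes, while at 8kHz it is 1600 bytes, so feeding 8kHz data to a recognizer configured for 16000 Hz makes the audio appear to play at double speed.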
What I’ve Tried
- Codec Preference: Ensuring g722 is the first and preferred codec in my pjsip.conf endpoint.
- Forcing Channel Format: I added Set(CHANNEL(format)=slin16) at the very beginning of my dialplan, but this seems to be ignored or overridden, as the transcode still happens.
- The Playback Hypothesis: I have been advised that the Playback(${audio}) command could be the culprit. If my short_welcome.wav file is an 8kHz file, Asterisk might be forcing the entire channel to 8kHz to match it. I am in the process of verifying that all my .wav files are saved as 16kHz, 16-bit mono.
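The .wav check from the last item can be scripted with Python's standard wave module. This is a minimal sketch that only handles uncompressed PCM WAV files; wav_format and is_16k_16bit_mono are hypothetical helper names.

```python
import wave

def wav_format(path):
    """Return (sample_rate_hz, bits_per_sample, channels) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()

def is_16k_16bit_mono(path):
    """True if the file is 16kHz, 16-bit, single-channel audio."""
    return wav_format(path) == (16000, 16, 1)
```

Running is_16k_16bit_mono over every prompt file in the voicebox directory (for example via glob.glob on "*.wav") would flag any file still stored at 8kHz.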
My Questions for the Community
- Is the Playback() application the most likely reason for forcing this transcode, even when I’ve set the channel format manually?
- What is the most robust and reliable way to configure my dialplan to guarantee a 16kHz audio path from the endpoint to the EAGI script?
- Are there any other global settings (in asterisk.conf or elsewhere) that could be influencing this behavior and forcing a default 8kHz path?
Thank you in advance for any help or suggestions you can provide!