Asterisk Version: 22.8.0
Channel Driver: PJSIP
AGI: FastAGI (Node.js)
Background
I am building a voice bot system using Asterisk 22.8.0 with PJSIP and a Node.js FastAGI server. Incoming PSTN calls hit my dialplan, get routed to FastAGI, and I record both legs using MixMonitor with the D flag to produce a stereo interleaved raw file (one speaker per channel).
The Problem
My two channel legs are running on different codecs and sample rates:
- Channel 1 (PSTN / caller leg): ulaw → 8kHz, 8-bit µ-law
- Channel 2 (Voice bot / application leg): slin16 → 16kHz, 16-bit signed linear
Asterisk channel stream output confirms this:
-- Streams --
Name: audio-0
Type: audio
State: sendrecv
Group: -1
Formats: (slin16)
My FastAGI executes MixMonitor like this:
AGI Script Executing Application: (MixMonitor) Options: (/Recording/4663.raw,D)
As per Asterisk documentation, the D flag:
Interleaves the audio coming from the channel and the audio going to the channel and outputs it as a 2 channel (stereo) raw stream. You must use the .raw extension.
This creates a single stereo interleaved file: 4663.raw
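To make the interleaved layout concrete, here is a minimal Node.js sketch that splits such a buffer into two mono buffers so each leg can be inspected on its own. It assumes 16-bit little-endian samples laid out as L0 R0 L1 R1 ..., which I have not verified against the MixMonitor source. The function name deinterleaveStereo16 is hypothetical, and the sketch does not say which leg ends up in which channel.

```javascript
// Split an interleaved stereo buffer into two mono buffers.
// Assumption: 16-bit little-endian samples, frame layout L0 R0 L1 R1 ...
function deinterleaveStereo16(interleaved) {
  const bytesPerFrame = 4; // 2 channels * 2 bytes per sample
  const frames = Math.floor(interleaved.length / bytesPerFrame);
  const left = Buffer.alloc(frames * 2);
  const right = Buffer.alloc(frames * 2);
  for (let i = 0; i < frames; i++) {
    // Buffer.copy(target, targetStart, sourceStart, sourceEnd)
    interleaved.copy(left, i * 2, i * 4, i * 4 + 2);
    interleaved.copy(right, i * 2, i * 4 + 2, i * 4 + 4);
  }
  return { left, right };
}

// Tiny synthetic input: frames (1, -1), (2, -2), (3, -3).
const demo = Buffer.alloc(12);
[1, -1, 2, -2, 3, -3].forEach((v, i) => demo.writeInt16LE(v, i * 2));

const { left, right } = deinterleaveStereo16(demo);
console.log(left.readInt16LE(2), right.readInt16LE(2)); // prints "2 -2"
```

With a real recording, the buffer would come from something like fs.readFileSync('/Recording/4663.raw'), and each mono result could be written out and converted separately as a single-channel file.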
Converting with sox — All Attempts Fail
Since the two legs have different sample rates, no single sox command produces correct audio:
Attempt 1 — Treating as 16kHz (slin16):
sox -r 16000 -e signed-integer -b 16 -c 2 4663.raw 4663.wav
→ Audio plays too fast, ulaw leg is double speed
Attempt 2 — Treating as 8kHz (ulaw):
sox -r 8000 -e signed-integer -b 16 -c 2 4663.raw 4663.wav
→ Audio plays too slow and distorted, slin16 leg is half speed
soxi output of the resulting WAV:
Input File : '4663.wav'
Channels : 2
Sample Rate : 8000
Precision : 16-bit
Duration : 00:02:30.50 = 1204000 samples
File Size : 4.82M
Bit Rate : 256k
Sample Encoding: 16-bit Signed Integer PCM
The duration does not match the actual call length at either sample rate.
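Note that the soxi figures only reflect the rate passed to sox in Attempt 2 (1204000 samples / 8000 Hz = 150.5 s), not whatever rate Asterisk actually wrote. As a rough cross-check, this Node.js sketch computes the duration the same raw payload would have at each candidate rate. It assumes two channels of 16-bit samples and that soxi's sample count is per channel, so the payload is 1204000 * 4 bytes. The helper name rawDurationSeconds is hypothetical.

```javascript
// Duration of a headerless PCM file under an assumed layout.
// Assumption: 2 channels, 2 bytes per sample (16-bit linear PCM).
function rawDurationSeconds(byteLength, sampleRate, channels = 2, bytesPerSample = 2) {
  const frames = byteLength / (channels * bytesPerSample);
  return frames / sampleRate;
}

// Payload size reconstructed from the soxi output: 1204000 frames * 2 channels * 2 bytes.
const byteLength = 1204000 * 2 * 2;

for (const rate of [8000, 16000]) {
  console.log(`${rate} Hz -> ${rawDurationSeconds(byteLength, rate).toFixed(2)} s`);
}
// 8000 Hz -> 150.50 s
// 16000 Hz -> 75.25 s
```

If neither figure is close to the real call length, the 16-bit or two-channel assumption is probably wrong as well, not just the sample rate.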
What I Already Tried
Attempt — Force codec via FastAGI before MixMonitor:
SET VARIABLE CHANNEL(audioreadformat) slin16
SET VARIABLE CHANNEL(audiowriteformat) slin16
Result — Asterisk throws a WARNING and ignores it:
WARNING[117679][C-00000024]: func_channel.c:802 func_channel_write_real:
Unknown or unavailable item requested: 'audioreadformat'
WARNING[117679][C-00000024]: func_channel.c:802 func_channel_write_real:
Unknown or unavailable item requested: 'audiowriteformat'
So audioreadformat and audiowriteformat appear to be read-only: the CHANNEL() write handler does not recognize them, so they cannot be set via AGI or dialplan Set().
My Questions
- When MixMonitor uses the D flag and the two legs have different sample rates (ulaw 8kHz vs slin16 16kHz), at what sample rate does Asterisk actually write the interleaved stereo .raw file? Does it upsample, downsample, or just write raw bytes as-is?
- Is there any supported dialplan or AGI method to force both channel legs to the same codec/sample rate before MixMonitor starts, without having to enforce it at the PJSIP endpoint config level?
- If enforcing at the PJSIP endpoint level (disallow=all / allow=ulaw) is the only option, does that cause transcoding overhead on the slin16 bot leg, and is there a way to avoid it?
- Is there a correct sox command to handle a stereo raw file where left and right channels have different native sample rates?
Any help or pointers to the relevant Asterisk source code or documentation would be hugely appreciated. Thank you!