Hi everyone,
I’m working with Asterisk ExternalMedia over WebSocket and I’m seeing packet drops under a specific timing pattern. I’d appreciate some guidance on whether this is expected behavior and how to debug it properly.
Setup (simplified)
- Asterisk ExternalMedia channel (WebSocket transport)
- A dummy WebSocket server sends raw audio packets
- A main media service sits in between:
  - Receives audio packets of optimal_frame_size from the dummy server every 10ms
  - Forwards them as-is to Asterisk
- Audio packets are exactly optimal_frame_size as provided by MEDIA_START
- Codec: slin / ulaw, depending on the call
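For reference, the payload size of one ptime's worth of audio follows from the codec parameters. A minimal Python sketch, assuming 8 kHz mono audio with 1 byte per sample for ulaw and 2 bytes per sample for slin. The names `BYTES_PER_SAMPLE` and `frame_bytes` are illustrative only, and the authoritative value is still whatever MEDIA_START reports:

```python
# Bytes per sample for the two codecs mentioned above (8 kHz mono assumed).
BYTES_PER_SAMPLE = {"ulaw": 1, "slin": 2}


def frame_bytes(codec: str, ptime_ms: int = 20, sample_rate: int = 8000) -> int:
    """Return the number of payload bytes in one frame of ptime_ms milliseconds."""
    samples = sample_rate * ptime_ms // 1000
    return samples * BYTES_PER_SAMPLE[codec]


print(frame_bytes("ulaw"))  # 160
print(frame_bytes("slin"))  # 320
```

If optimal_frame_size matches these values, each packet carries 20ms of audio, so sending one every 10ms delivers audio at twice real time.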
Flow
Dummy WS → Main Service → Asterisk ExternalMedia
Dummy WS ← Main Service ← Asterisk ExternalMedia
Flow control
- When MEDIA_XOFF is received from Asterisk:
  - Main service stops sending media
- When MEDIA_XON is received:
  - Media sending resumes
- No packets are intentionally dropped on the application side
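The gating described above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual service code: the `FlowGate` class, its method names, and the `send` callable are hypothetical, and it assumes the control messages arrive as text beginning with MEDIA_XOFF or MEDIA_XON.

```python
from collections import deque


class FlowGate:
    """Holds frames back while Asterisk has signalled MEDIA_XOFF."""

    def __init__(self, send):
        self._send = send        # hypothetical callable that writes one frame to Asterisk
        self._paused = False
        self._pending = deque()  # frames held while paused (nothing is dropped)

    def on_control(self, message: str) -> None:
        """Handle a text control message received from Asterisk."""
        if message.startswith("MEDIA_XOFF"):
            self._paused = True
        elif message.startswith("MEDIA_XON"):
            self._paused = False
            self._flush()

    def on_frame(self, frame: bytes) -> None:
        """Forward a frame immediately, or hold it if paused."""
        if self._paused:
            self._pending.append(frame)
        else:
            self._send(frame)

    def _flush(self) -> None:
        # Send everything that accumulated during the pause, in order.
        while self._pending and not self._paused:
            self._send(self._pending.popleft())
```

Note that flushing the held frames in one burst on MEDIA_XON is itself faster than real time, so a gate like this limits queue growth but does not pace the audio.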
The problem
- Dummy server sends optimal_frame_size packets every 10ms
- Main service forwards them immediately to Asterisk
- After ~40–50 seconds of call time:
  - Audio starts breaking
  - Packets appear to be dropped
  - Playback becomes choppy
- However:
  - If the same packets are sent every 20ms, everything works perfectly
  - No packet loss
  - No audio issues
This makes me suspect I’m overrunning something internally even though I’m respecting:
- optimal_frame_size
- MEDIA_XOFF / MEDIA_XON
My understanding so far
From documentation / discussions, I understand that:
- The channel driver maintains an internal media queue
- Roughly:
  - ~1000 frames max
  - XOFF at ~900 frames
  - XON at ~800 frames
- Even with XOFF/XON handling, media sent before XOFF takes effect may still overflow the queue if timing is off
This makes me wonder whether:
- Sending packets at 2× the ptime rate (every 10ms instead of every 20ms) is inherently unsafe
- Or whether my understanding of XOFF/XON semantics is incomplete
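As a back-of-envelope check on the queue figures above, the time needed to reach a given depth can be estimated from the send interval. This Python sketch is a deliberately simple model, not Asterisk's actual logic: it assumes one frame is enqueued per packet, one frame is consumed every ptime, and no XOFF pauses occur. The function name `seconds_until_depth` is illustrative.

```python
def seconds_until_depth(target_frames: int, send_interval_ms: float,
                        ptime_ms: float = 20.0) -> float:
    """Seconds for the queue to reach target_frames, or inf if it never grows.

    Assumes one frame in per send_interval_ms and one frame out per ptime_ms.
    """
    fill_rate = 1000.0 / send_interval_ms  # frames per second enqueued
    drain_rate = 1000.0 / ptime_ms         # frames per second consumed
    growth = fill_rate - drain_rate
    if growth <= 0:
        return float("inf")
    return target_frames / growth


print(seconds_until_depth(900, 10))   # 18.0
print(seconds_until_depth(1000, 10))  # 20.0
print(seconds_until_depth(900, 20))   # inf
```

Under these assumptions the ~900-frame threshold is reached after about 18 seconds at a 10ms interval and never at 20ms. The model ignores XOFF pauses and any frames in flight, so it does not by itself explain the ~40–50 second onset.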
Questions
- Is it valid to send optimal_frame_size packets faster than the negotiated ptime? (e.g., 10ms packets when ptime=20)
- Does ExternalMedia assume wall-clock pacing, not just packet size? In other words, is respecting ptime timing mandatory even if frame size is correct?
- Is there any way to introspect or debug the channel driver media queue?
  - Queue depth
  - Frame backlog
  - Drops
  - Debug logs / CLI commands / tracepoints
- Is MEDIA_XOFF intended as a “hard stop” guarantee, or is it best-effort and timing-sensitive?
For production Voice AI integration, what is the recommended approach?
- Should the media engine:
  - Always re-clock audio at ptime?
  - Use a jitter buffer / ring buffer?
  - Treat ExternalMedia like RTP in terms of pacing?
- Architecturally, what’s the recommended way to build a media engine that:
  - Talks to Asterisk ExternalMedia
  - Talks to another media source (Voice AI in production)
  - Handles pacing cleanly without trying to outsmart ptime
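To make the ring buffer option concrete, here is a minimal Python sketch of a bounded frame buffer that could sit between a bursty source and a fixed-rate sender. Everything in it is an assumption for illustration: the `FrameBuffer` name, the drop-oldest policy when full, and the silence fill on underrun. The silence byte would be `b"\x00"` for slin and `b"\xff"` for ulaw.

```python
from collections import deque


class FrameBuffer:
    """Bounded FIFO between a bursty source and a fixed-rate sender."""

    def __init__(self, max_frames: int, frame_size: int,
                 silence_byte: bytes = b"\x00"):
        # deque(maxlen=...) discards the oldest frame when a push overflows it.
        self._frames = deque(maxlen=max_frames)
        self._silence = silence_byte * frame_size
        self.underruns = 0  # how many times pop() had to return silence

    def push(self, frame: bytes) -> None:
        """Store a frame from the source."""
        self._frames.append(frame)

    def pop(self) -> bytes:
        """Return the next frame, or one frame of silence if empty."""
        if self._frames:
            return self._frames.popleft()
        self.underruns += 1
        return self._silence

    def __len__(self) -> int:
        return len(self._frames)
```

A sender that calls `pop()` once per ptime would then always have exactly one frame to send, regardless of how unevenly the source delivers audio.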
What I’m trying to confirm
Whether the correct model is:
“Even if you receive media faster, you must clock audio into Asterisk at ptime, otherwise drops are expected.”
or whether there’s a supported way to safely send faster-than-ptime media using XOFF/XON alone.
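For the first model, re-clocking at ptime could look roughly like the sketch below. It is a minimal asyncio example under stated assumptions, not a recommended implementation: `paced_send` is an illustrative name, `send` is a hypothetical async callable that writes one frame to Asterisk, and `None` in the queue is used as a stop marker.

```python
import asyncio
import time


async def paced_send(frames: asyncio.Queue, send, ptime_ms: float = 20.0) -> None:
    """Send at most one frame per ptime, using absolute deadlines.

    Stops when it dequeues None.
    """
    interval = ptime_ms / 1000.0
    next_deadline = time.monotonic()
    while True:
        frame = await frames.get()
        if frame is None:
            break
        now = time.monotonic()
        if next_deadline < now:
            # The source stalled: restart the clock instead of bursting to catch up.
            next_deadline = now
        else:
            await asyncio.sleep(next_deadline - now)
        await send(frame)
        # Advance from the deadline, not from "now", so sleep jitter does not accumulate.
        next_deadline += interval
```

With a loop like this, frames received every 10ms simply accumulate in the queue and leave at one per ptime, so the queue needs a bound or a drop policy if the source keeps running faster than real time.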