I’m experiencing intermittent audio distortion and lag when running 90+ concurrent calls using chan_websocket with ARI ExternalMedia. Audio plays fine at lower call volumes (30-50 calls) but degrades at scale. I’ve done extensive debugging on my application side and narrowed the issue down to Asterisk’s internal behavior.
Architecture
Client App (TTS/AI)
↓ (WebSocket - audio + mark events)
RTP Engine (Node.js, running in K8s pod, 4 vCPU / 4GB RAM)
↓ (WebSocket per call - binary audio frames)
Asterisk 22.8.2 (Docker container, 16 vCPU / 16GB RAM)
↓ (RTP)
SIP Phone / Caller
- Codec: `slin16` (optimal_frame_size = 640 bytes = 20ms per frame)
- Transport: WebSocket (`chan_websocket`)
- Direction: sending audio TO Asterisk for playback to the caller
- Using `START_MEDIA_BUFFERING` mode for non-opus formats
- Using `MARK_MEDIA`/`MEDIA_MARK_PROCESSED` for playback position tracking
- Using `FLUSH_MEDIA` for barge-in/interruption
What I’ve Verified (Not the Problem)
- Node.js event loop is healthy. My `setInterval(20ms)` timer fires with an average gap of 20.00ms, max 21.2ms. Zero drift.
- No WebSocket backpressure. `ws.bufferedAmount` on the Asterisk-facing sockets stays near 0; data reaches Asterisk immediately.
- Asterisk task processors are clean. `core show taskprocessors` shows 0 items in queue for all pjsip distributors. No backlog.
- Network is fine. No packet loss or significant latency between my RTP engine pod and the Asterisk container.
- My application sends audio in correct frame-aligned multiples (exact multiples of 640 bytes for slin16).
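For reference, the frame-alignment logic from the last point can be sketched like this (a minimal illustration, not my actual engine code; `chunkFrames` is a hypothetical name):

```javascript
// slin16 at 16 kHz, 16-bit mono: 20 ms = 320 samples = 640 bytes per frame.
const FRAME_BYTES = 640;

// Split an incoming audio buffer into exact 640-byte frames, returning any
// trailing remainder so it can be prepended to the next chunk.
function chunkFrames(buf, carry = Buffer.alloc(0)) {
  const data = Buffer.concat([carry, buf]);
  const frames = [];
  let off = 0;
  while (off + FRAME_BYTES <= data.length) {
    frames.push(data.subarray(off, off + FRAME_BYTES));
    off += FRAME_BYTES;
  }
  return { frames, remainder: data.subarray(off) };
}
```

Only the whole frames are written to the socket; the remainder is carried into the next tick, so Asterisk never sees a partial frame.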
What I Observe
XOFF/XON cycling with consistent 2-second XOFF duration
When sending audio at a burst rate (e.g., 5x real-time = 3200 bytes per 20ms tick), I see XOFF lasting exactly ~2030ms every cycle:
```
XOFF lasted 2038ms
XOFF lasted 2029ms
XOFF lasted 2021ms
XOFF lasted 2030ms
XOFF lasted 2036ms
XOFF lasted 2027ms
```
This makes sense given the hardcoded thresholds in `chan_websocket.c`:

```c
#define QUEUE_LENGTH_MAX        1000  // 20 seconds of audio
#define QUEUE_LENGTH_XOFF_LEVEL  900  // XOFF at 18 seconds
#define QUEUE_LENGTH_XON_LEVEL   800  // XON at 16 seconds
```
XOFF→XON gap = 100 frames × 20ms = 2000ms. Matches exactly.
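Spelled out as a quick sanity check (constants copied from the header above; Asterisk drains the queue at real time, one 20ms frame per 20ms):

```javascript
const XOFF_LEVEL = 900; // frames queued when Asterisk raises XOFF
const XON_LEVEL = 800;  // frames remaining when it raises XON again
const FRAME_MS = 20;    // one slin16 frame

// Time to drain from the XOFF threshold down to the XON threshold.
const xoffMs = (XOFF_LEVEL - XON_LEVEL) * FRAME_MS; // 2000 ms
```

The ~20-40ms of extra duration I observe on top of that is presumably scheduling jitter.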
The puzzle: buffer should never run empty, but audio still distorts
During the 2-second XOFF period, there are still 800+ frames (16 seconds of audio) sitting in Asterisk’s internal buffer. The phone should never run out of audio to play. Yet callers hear distortion/lag/choppy audio.
Distortion onset varies with system load
- Call A (first call on a fresh system): audio starts lagging after ~14-15 seconds
- Call B (started while 93 other calls are active): audio starts lagging within 4-5 seconds
- At lower concurrency (30-50 calls): no distortion at all
Different send rates, same problem at scale
| Send Rate | Behavior at 90+ calls |
|---|---|
| 1x (640 bytes/tick) | Choppy — no cushion, micro-gaps between packets |
| 2x (1280 bytes/tick) | Distortion after 14-15 seconds per call |
| 5x (3200 bytes/tick) | Distortion still occurs, XOFF/XON cycling every ~2 sec |
| 4-5x burst with XOFF | Audio quality slightly better but still degrades |
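For clarity, the send rates in the table are produced by a pacer along these lines (hypothetical names; `sendFrame` stands in for whatever writes one binary frame to the `chan_websocket` socket, and `state.paused` is assumed to be flipped by my XOFF/XON handling):

```javascript
// One 20 ms tick of the pacer: send up to `rate` frames of queued audio,
// unless flow control has paused us. Returns the number of frames sent.
function pacerTick(state, queue, sendFrame, rate = 1) {
  if (state.paused) return 0;   // XOFF received: hold frames locally
  let sent = 0;
  while (sent < rate && queue.length > 0) {
    sendFrame(queue.shift());   // one 640-byte binary frame per send
    sent++;
  }
  return sent;
}

// Driven by the 20 ms setInterval mentioned earlier, e.g.:
// setInterval(() => pacerTick(state, queue, f => ws.send(f), 5), 20);
```

So "5x" means up to five 640-byte frames per 20ms tick until XOFF, then nothing until XON.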
My Questions
- Is there a way to configure the XOFF/XON thresholds (`QUEUE_LENGTH_XOFF_LEVEL`, `QUEUE_LENGTH_XON_LEVEL`) without recompiling Asterisk? I don't see any `chan_websocket.conf` setting for buffer sizes.
- Could the frame timer thread be the bottleneck? With 90+ channels each holding 800-900 queued frames, Asterisk needs to pop ~4500 frames per second across all channels. Could the timing thread fall behind under this load, causing uneven frame delivery to the RTP output?
- Is there a known concurrency limit for `chan_websocket` channels? I understand regular SIP/RTP channels can handle hundreds of calls, but `chan_websocket` with `START_MEDIA_BUFFERING` involves additional queue management per channel. Is there a practical ceiling?
- Could `START_MEDIA_BUFFERING` mode behave differently under high concurrency compared to passthrough mode? Should I be sending audio differently?
- Are there any Asterisk configuration settings (e.g., `http.conf`, timer settings, thread pool settings) that could improve `chan_websocket` performance at this scale?
Environment
- Asterisk: 22.8.2 (Docker container)
- Host: 16 vCPU, 16 GB RAM
- Channel driver: `chan_websocket`
- Codec: `slin16`
- Concurrent calls: 90-95
- RTP Engine: Node.js application in a Kubernetes pod (4 vCPU, 4 GB RAM)
Relevant `core show taskprocessors` output (during 90+ calls)
All pjsip distributors show 0 items in queue, max depth 3-5, which appears healthy.
Any insights from the community would be greatly appreciated. Happy to provide additional diagnostics or logs.