I’m building a real-time audio relay between Asterisk and AI/voice services, with WebSockets as the only transport.
Each call has its own WebSocket carrying 20 ms SLIN16 audio frames (16 kHz × 16-bit mono, so 640 bytes per frame).
The system works at low concurrency but breaks under load (~1,000+ calls):
- Audio speeds up or distorts (“chipmunk effect”) due to timer drift and Node.js GC pauses.
- Occasional gaps or bursts when the media queue drains or flushes too quickly.
- Event-loop latency and backpressure make precise 20 ms pacing unreliable.
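The drift itself is easy to reproduce: setInterval only guarantees a *minimum* delay, so every late tick pushes the whole schedule back and the error accumulates. A standalone harness I used to confirm this (illustrative code, not part of the relay):

```js
// drift-check.js: log how far setInterval(20) strays from an ideal 20 ms grid.
const FRAME_MS = 20;

// Simulated GC-like pauses: block the event loop for ~15 ms every 100 ms.
const blocker = setInterval(() => {
  const stop = Date.now() + 15;
  while (Date.now() < stop) { /* busy-wait */ }
}, 100);

const start = process.hrtime.bigint();
let ticks = 0;
const timer = setInterval(() => {
  ticks += 1;
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  // Positive drift = the tick fired late; it never self-corrects.
  console.log(`tick ${ticks}: drift ${(elapsedMs - ticks * FRAME_MS).toFixed(2)} ms`);
  if (ticks >= 250) { clearInterval(timer); clearInterval(blocker); } // ~5 s sample
}, FRAME_MS);
```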
Current setup: Node.js + ws, per-stream media queue, setInterval(20ms) sender.
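For reference, the sender is essentially this shape (simplified; frameQueue and the stop handling are trimmed-down placeholders for the real code):

```js
const WebSocket = require('ws');

// One sender per call. frameQueue is filled from the Asterisk side;
// ws is the outbound socket to the AI/voice service.
function startSender(frameQueue, ws) {
  const timer = setInterval(() => {
    const frame = frameQueue.shift(); // 640-byte SLIN16 frame
    if (frame && ws.readyState === WebSocket.OPEN) {
      ws.send(frame, { binary: true }); // no backpressure check
    }
    // A late tick (GC pause, loop lag) is never re-timed: frames go out
    // late, and the queue bursts on recovery.
  }, 20);
  return () => clearInterval(timer); // called on hangup
}
```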
Goal: smooth, lossless real-time audio at scale (1–2 k concurrent calls), WebSocket-only transport.
Looking for suggestions or examples of:
- Scalable architectures for this (Node.js vs. a Rust/Go media layer).
- Best practices for per-stream pacing and jitter buffering over WebSocket (a sketch of what I’m trying is below this list).
- Experience reports from anyone running large-scale Asterisk → WebSocket audio bridges.
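For the pacing item, the direction I’m experimenting with is the sketch below: schedule every frame against an absolute deadline (epoch + n × 20 ms) instead of chaining setInterval, check ws.bufferedAmount for backpressure (that property is real ws API), and drop missed slots after a stall rather than bursting stale audio. The class name and thresholds are my own placeholders, not a known-good implementation:

```js
const WebSocket = require('ws');

const FRAME_MS = 20;
const MAX_BUFFERED = 64 * 1024; // backpressure threshold: assumption, tune per service

// Absolute-deadline pacer: each tick targets epoch + seq * FRAME_MS, so one
// late tick does not shift every later frame (no cumulative drift).
class FramePacer {
  constructor(ws, frameQueue) {
    this.ws = ws;
    this.queue = frameQueue;
    this.epoch = performance.now(); // global in Node >= 16
    this.seq = 0;
    this.stopped = false;
  }

  start() { this.tick(); }
  stop() { this.stopped = true; }

  tick() {
    if (this.stopped) return;
    const now = performance.now();
    const due = Math.floor((now - this.epoch) / FRAME_MS); // slots elapsed
    if (due > this.seq + 2) {
      // Stalled (GC pause etc.): drop the missed slots instead of bursting
      // old audio, which is what produces the speed-up artifacts.
      this.queue.splice(0, due - this.seq - 1);
      this.seq = due - 1;
    }
    const frame = this.queue.shift();
    if (frame && this.ws.readyState === WebSocket.OPEN &&
        this.ws.bufferedAmount < MAX_BUFFERED) {
      this.ws.send(frame, { binary: true });
    }
    this.seq += 1;
    // Aim the next tick at the absolute deadline, not "20 ms from now".
    const nextAt = this.epoch + this.seq * FRAME_MS;
    setTimeout(() => this.tick(), Math.max(0, nextAt - performance.now()));
  }
}
```

The jitter-buffer half would sit in front of frameQueue (hold ~2–3 frames before starting playout); I’ve left it out to keep the sketch short. Still interested in whether people make this work in Node at 1–2 k streams or move the media path to Rust/Go.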
Asterisk version: 22.6.0.