How to Send Call Media to a Bot via WebSocket and Receive Responses in Asterisk

Hi everyone,

I’m working on a project where I want to integrate Asterisk with a bot through a WebSocket connection. My goal is to:

  1. Stream the audio (RTP) from an active call in Asterisk to a WebSocket server (where the bot processes the audio).
  2. Receive responses (audio or text converted to audio) from the bot via the WebSocket and send them back into the active call.

What I’ve done so far:

  1. Enabled ARI (ari.conf) and verified that I can connect to it successfully.
  2. Set up a Stasis app in extensions.conf to route calls into ARI.
  3. Created a WebSocket server using Node.js to process the audio.

My challenges/questions:

  1. How can I capture the audio stream from the call using ARI (e.g., SnoopChannel)?
  2. What’s the best way to send the audio frames to my WebSocket server in real time?
  3. How do I inject the audio responses (received from the bot via WebSocket) back into the call? Should I use Playback, or is there a better approach?
  4. Do I need to handle any specific audio codec conversions (e.g., G.711 to PCM or vice versa) for this integration?
  5. Are there any examples or best practices for a setup like this?
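
On question 4, a minimal sketch of the conversion in case it helps frame the answers: if the audio reaches the Node.js side as G.711 μ-law (8 kHz, one byte per sample) and the bot expects 16-bit linear PCM, each byte can be expanded with the standard μ-law decode. The function names `ulawToPcm16` and `decodeUlawFrame` are made up for illustration and are not part of Asterisk or any library. This also assumes μ-law rather than A-law. Asking Asterisk for a signed-linear format up front (for example through the `format` parameter of an ARI external media channel) may make this step unnecessary.

```typescript
// Hypothetical helpers for illustration only; not an Asterisk or library API.

// Expand one G.711 mu-law byte to a signed 16-bit linear PCM sample.
function ulawToPcm16(byte: number): number {
  const inverted = ~byte & 0xff;          // mu-law bytes are stored bit-inverted
  const sign = inverted & 0x80;           // top bit: 1 means negative
  const exponent = (inverted >> 4) & 0x07; // 3-bit segment number
  const mantissa = inverted & 0x0f;       // 4-bit step within the segment
  // Rebuild the magnitude, then remove the 0x84 bias added by the encoder.
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

// Decode a whole frame of mu-law bytes (e.g. one RTP payload) to PCM16.
function decodeUlawFrame(frame: Uint8Array): Int16Array {
  const pcm = new Int16Array(frame.length);
  for (let i = 0; i < frame.length; i++) {
    pcm[i] = ulawToPcm16(frame[i]);
  }
  return pcm;
}
```

A 20 ms frame at 8 kHz is 160 μ-law bytes, so `decodeUlawFrame` would return 160 samples (320 bytes of PCM16) per frame to forward over the WebSocket.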

My setup:

  • Asterisk Version: 20.11.0
  • WebSocket Server: Node.js
  • Bot: Google Dialogflow

Any guidance, sample implementations, or suggestions would be greatly appreciated!

Thank you in advance!

This same question (or a very similar one, since AI is also being used to write the posts themselves) is being asked at least once a week at this point, so please read the past threads.