Asterisk to OpenAI Realtime: The Definitive MVP (In Progress)

  1. What is this?

I want to share with the community how I’m building a Minimum Viable Product (MVP) to enable calls from Asterisk FreePBX to OpenAI Realtime models with the lowest possible latency. This will be an easy-to-configure and ready-to-use solution.

OpenAI Realtime Reference:

https://platform.openai.com/docs/guides/realtime

  2. How do I reach the goal, and what is it?

I believe the most user-friendly experience will come from using FreePBX, which provides easy access to features like extensions, trunks, reports, and IVRs. My plan is to build a FreePBX module that includes:

  • An ARI ExternalMedia App
  • OpenAI APIs integration
  • A Control Panel in FreePBX

The Goal:

Enable users to call extension 3000, where an OpenAI Realtime model answers with its voice. The model will listen to your commands, understand them, and respond appropriately with low latency.

  3. Why?
  • To demonstrate how an Asterisk ARI app can integrate with OpenAI’s Realtime provider.
  • To create a foundational project for integrating other AI providers (e.g., Grok 3 Voice) into a single app.
  • To meet new customers who need ARI customizations and professional services.
  4. Who am I?

I’m a Cloud Architect with 15 years of experience working with servers, networks, cloud, and AI. I’ve leveraged AI tools to complete other ARI (Asterisk REST Interface) implementations with AI bots, such as Google DialogFlow and AWS Bedrock. This project is an exciting challenge for me, and I’d love to share it with the community.

  5. Requirements
  • OpenAI API keys and credits, with access to Realtime models.
  • FreePBX installed.
  6. Outcomes/Roadmap
  • Enable seamless, low-latency communication with OpenAI Realtime models via Asterisk.
  • Develop a FreePBX Addon.
  • Build an ARI app using ExternalMedia.
  • Prepare the solution for scalability.
  7. Tested On
  • Debian 12 with FreePBX 17
  8. To-Do List

8.1. Research OpenAI Realtime capabilities and requirements. Done

8.2. Build a backend Node.js app replicating the OpenAI Realtime Playground to validate all required steps. Done

It works! I can send and receive streaming voice over OpenAI WebSockets. Here are the mini-apps that help me test everything:

8.3. Extract all necessary fields for implementation. In Progress

8.4. Build the first ARI app version 0.1.

8.5. Test and debug.

8.6. Create the FreePBX Module. In Progress

8.7. Test and debug.

8.8. Build the second ARI app version 0.2.

8.9. Test and debug.

8.10. Integrate the most popular requested features.

  9. What can you do?
  • Subscribe to this post for updates until the MVP is complete.
  • Help me test it and report bugs or issues to the GitHub repo once it’s ready.
  • Suggest functionalities! I’ll prioritize the most relevant ones for future versions.

Hi @InfinitoCloud ,

I’m interested in your project. How can we get in touch for further discussion?

Thanks,

OK, I’m continuing with the development of the module, step 8.3:

I have just completed an ARI application that integrates the RTP server. It answers the call, receives the RTP audio, records the incoming audio for 10 seconds, saves it as a WAV file, and plays it back by injecting it through RTP towards Asterisk. For people asking how to get ARI to work with RTP:
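To make the receive-and-save step concrete, here is a minimal sketch of what happens to each UDP datagram: strip the RTP header (RFC 3550), decode the payload, and wrap the result in a WAV header. It assumes the ExternalMedia channel uses G.711 μ-law at 8 kHz, which would match the 172-byte packets (12-byte header plus 160-byte payload) in the logs later in this thread. The function names are illustrative and not taken from the repository.

```javascript
// Sketch only: parse one RTP packet, decode G.711 mu-law, build a PCM WAV buffer.

function parseRtpPacket(packet) {
  if (packet.length < 12) throw new Error("RTP packet shorter than fixed header");
  const version = packet[0] >> 6;
  if (version !== 2) throw new Error("Unsupported RTP version " + version);
  const hasPadding = (packet[0] & 0x20) !== 0;
  const hasExtension = (packet[0] & 0x10) !== 0;
  const csrcCount = packet[0] & 0x0f;
  let offset = 12 + 4 * csrcCount; // fixed header plus optional CSRC list
  if (hasExtension) {
    // Extension header: 16-bit profile id, then 16-bit length in 32-bit words.
    offset += 4 + 4 * packet.readUInt16BE(offset + 2);
  }
  // With the padding bit set, the last byte holds the number of padding bytes.
  const end = hasPadding ? packet.length - packet[packet.length - 1] : packet.length;
  return {
    payloadType: packet[1] & 0x7f,
    sequenceNumber: packet.readUInt16BE(2),
    timestamp: packet.readUInt32BE(4),
    ssrc: packet.readUInt32BE(8),
    payload: packet.subarray(offset, end),
  };
}

// Standard G.711 mu-law expansion of one byte to a signed 16-bit sample.
function ulawToPcm16(byte) {
  const u = ~byte & 0xff;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return u & 0x80 ? -magnitude : magnitude;
}

function decodeUlaw(payload) {
  const pcm = Buffer.alloc(payload.length * 2);
  for (let i = 0; i < payload.length; i++) {
    pcm.writeInt16LE(ulawToPcm16(payload[i]), i * 2);
  }
  return pcm;
}

// 44-byte canonical WAV header for uncompressed PCM.
function wavHeader(dataBytes, sampleRate, channels, bitsPerSample) {
  const blockAlign = (channels * bitsPerSample) / 8;
  const h = Buffer.alloc(44);
  h.write("RIFF", 0, "ascii");
  h.writeUInt32LE(36 + dataBytes, 4);
  h.write("WAVE", 8, "ascii");
  h.write("fmt ", 12, "ascii");
  h.writeUInt32LE(16, 16); // size of the fmt chunk
  h.writeUInt16LE(1, 20); // format tag 1 = PCM
  h.writeUInt16LE(channels, 22);
  h.writeUInt32LE(sampleRate, 24);
  h.writeUInt32LE(sampleRate * blockAlign, 28); // byte rate
  h.writeUInt16LE(blockAlign, 32);
  h.writeUInt16LE(bitsPerSample, 34);
  h.write("data", 36, "ascii");
  h.writeUInt32LE(dataBytes, 40);
  return h;
}

// Synthetic 20 ms packet: payload type 0 (PCMU), 160 bytes of mu-law silence.
const rtpHeader = Buffer.from([0x80, 0x00, 0x00, 0x01, 0, 0, 0, 0xa0, 0x12, 0x34, 0x56, 0x78]);
const packet = Buffer.concat([rtpHeader, Buffer.alloc(160, 0xff)]);
const parsed = parseRtpPacket(packet);
const pcm = decodeUlaw(parsed.payload);
const wav = Buffer.concat([wavHeader(pcm.length, 8000, 1, 16), pcm]);
console.log(parsed.payloadType, parsed.payload.length, wav.length);
```

In a real receiver the payloads of many packets would be appended in sequence-number order before the header is written, since the header needs the total data size.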

Hello everyone,

I see you’re discussing the possibility of integrating Asterisk with OpenAI Realtime.

Could you please tell me if it’s possible to use Node.js with ARI to send a media stream to OpenAI so that it can process the call in real-time?

Hello, that’s exactly what I’m doing. Today I was working on the ARI application and, for the first time, I was able to hear the OpenAI Realtime model from an Asterisk call. It still fails when playing the subsequent OpenAI Realtime audio responses, but I’ve made quite a bit of progress and I’m close. Follow the updates to this post; at the end I will share the working ARI application. Greetings.

Fantastic job! This was also my line of research, and you got ahead of me! My research was oriented towards including an n8n flow. I’m following your work and will be available to extend it to create something awesome.

Hi, nice idea; it could be integrated with n8n. I’m close to completing version 0.1 of the ARI app, so stay tuned.

Will this app allow a real-time conversation over the telephone between a user and an OpenAI bot?

Yes, exactly. This ARI application and the subsequent FreePBX module will connect Asterisk with OpenAI Realtime services.

UPDATES!

8.4. Build the first ARI app version 0.1 - DONE!

You can see the ARI app working here:

Also supports interruptions:
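For readers wondering how interruptions (barge-in) can work, here is a minimal sketch of one possible approach, not necessarily what the app does. When the server reports that the caller started speaking, drop any audio not yet played and cancel the response in flight. The event names are the ones used by the OpenAI Realtime API around the time of this thread and may differ in newer API versions; the `state` shape and the function name are hypothetical.

```javascript
// Sketch of barge-in handling. Returns the client events that should be
// sent back over the OpenAI WebSocket; the caller is responsible for sending.
function handleServerEvent(event, state) {
  const outgoing = [];
  switch (event.type) {
    case "response.audio.delta":
      // Model audio arrives base64-encoded; queue it for RTP playback.
      state.playbackQueue.push(Buffer.from(event.delta, "base64"));
      state.responseActive = true;
      break;
    case "input_audio_buffer.speech_started":
      // Caller started talking: discard unplayed audio so playback stops at once.
      state.playbackQueue.length = 0;
      if (state.responseActive) {
        outgoing.push({ type: "response.cancel" });
        state.responseActive = false;
      }
      break;
    case "response.done":
      state.responseActive = false;
      break;
  }
  return outgoing;
}

const state = { playbackQueue: [], responseActive: false };
handleServerEvent({ type: "response.audio.delta", delta: "AAEC" }, state);
const toSend = handleServerEvent({ type: "input_audio_buffer.speech_started" }, state);
console.log(toSend, state.playbackQueue.length);
```

Keeping the handler free of I/O makes it easy to test without a live call; the RTP sender only needs to stop when the queue is emptied.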

Do you have the latest code for this? I see you haven’t pushed any changes to your repo in a while.

love the work <3, nice demo

Hi, yes, the repo is updated now:

Thanks for your demo :)

Hmm, slightly interesting. Now what would make it scads more interesting would be a demo app that, instead of just talking back to the user with general AI responses, would also return a copy of the conversation text transcript using the speech-to-text demo model that OpenAI has online.

Of course, anything is possible. This is a basic application, but it required a lot of features just to start interacting properly with OpenAI RT. From here, you can create AI agents that query external data, AI-powered interactive voice response (IVR) systems, RAG-powered customer service agents, and more.

No audio is heard or produced

asterisk_to_openai_rt# node asterisk_to_openai_rt.js
(node:31353) [DEP0040] DeprecationWarning: The punycode module is deprecated. Please use a userland alternative instead.
(Use node --trace-deprecation ... to show where the warning was created)
[
'This API is using a deprecated version of Swagger! Please see Home · swagger-api/swagger-core Wiki · GitHub for more info'
]
N/A | 2025-04-04T12:23:40.885Z [INFO] Connected to ARI at http://127.0.0.1:8088
N/A | 2025-04-04T12:23:40.888Z [INFO] ARI application "stasis_app" started
N/A | 2025-04-04T12:23:40.890Z [INFO] RTP Receiver listening on 127.0.0.1:12000
N/A | 2025-04-04T12:23:44.708Z [INFO] StasisStart event received for channel 1743769424.9, name: SIP/5000-00000003
N/A | 2025-04-04T12:23:44.709Z [INFO] SIP channel started: 1743769424.9
N/A | 2025-04-04T12:23:44.716Z [INFO] Channel 1743769424.9 answered
N/A | 2025-04-04T12:23:44.718Z [INFO] ExternalMedia channel 1743769424.10 created and mapped to bridge e7014e22-d699-4e5c-825d-1bbd81e07e79
N/A | 2025-04-04T12:23:44.719Z [INFO] Attempting to start OpenAI WebSocket for channel 1743769424.9
N/A | 2025-04-04T12:23:44.752Z [INFO] StasisStart event received for channel 1743769424.10, name: UnicastRTP/127.0.0.1:12000-0x7f3280010100
N/A | 2025-04-04T12:23:44.753Z [INFO] ExternalMedia channel started: 1743769424.10
N/A | 2025-04-04T12:23:44.756Z [INFO] ExternalMedia channel 1743769424.10 added to bridge e7014e22-d699-4e5c-825d-1bbd81e07e79
N/A | 2025-04-04T12:23:45.035Z [INFO] RTP Source assigned for channel 1743769424.9: 127.0.0.1:11430
N/A | 2025-04-04T12:23:45.076Z [INFO] Audio processed for channel 1743769424.9 | RMS: 0.000 | Max sample before: 16, after: 16
C-0000 | 2025-04-04T12:23:45.770Z [INFO] [Client] OpenAI WebSocket connection established for channel 1743769424.9
N/A | 2025-04-04T12:23:45.771Z [INFO] Initializing RTP stream to 127.0.0.1:11430 for channel 1743769424.9
S-0000 | 2025-04-04T12:23:45.776Z [INFO] [Server] First event received for channel 1743769424.9 | Type: session.created | Duration: N/As | Status: Received
S-0001 | 2025-04-04T12:23:45.776Z [INFO] [Server] Session created for channel 1743769424.9 | Duration: N/As | Status: Received
N/A | 2025-04-04T12:23:45.870Z [INFO] RTP stream fully initialized for channel 1743769424.9
N/A | 2025-04-04T12:23:45.871Z [INFO] StreamHandler initialized for channel 1743769424.9 | Ready: true
C-0001 | 2025-04-04T12:23:45.872Z [INFO] [Client] Session updated with VAD settings for channel 1743769424.9 | Threshold: 0.8, Prefix: 300ms, Silence: 700ms
C-0002 | 2025-04-04T12:23:45.877Z [INFO] [Client] Sending audio chunk to OpenAI for channel 1743769424.9 | Size: 9.38 KB | RMS: 0.000
N/A | 2025-04-04T12:23:47.065Z [INFO] Received 100 RTP packets from 127.0.0.1:11430, total bytes: 17200, rate: 16.19 packets/s
N/A | 2025-04-04T12:23:47.085Z [INFO] Audio processed for channel 1743769424.9 | RMS: 0.000 | Max sample before: 24, after: 24
C-0003 | 2025-04-04T12:23:47.877Z [INFO] [Client] Sending audio chunk to OpenAI for channel 1743769424.9 | Size: 9.38 KB | RMS: 0.284
S-0004 | 2025-04-04T12:23:48.169Z [INFO] [Server] Speech started detected for channel 1743769424.9 | Duration: N/As | Status: Received
N/A | 2025-04-04T12:23:49.064Z [INFO] Received 100 RTP packets from 127.0.0.1:11430, total bytes: 17200, rate: 50.03 packets/s
N/A | 2025-04-04T12:23:49.086Z [INFO] Audio processed for channel 1743769424.9 | RMS: 0.288 | Max sample before: 26496, after: 26496
C-0004 | 2025-04-04T12:23:49.878Z [INFO] [Client] Sending audio chunk to OpenAI for channel 1743769424.9 | Size: 9.38 KB | RMS: 0.000
S-0006 | 2025-04-04T12:23:50.356Z [INFO] [Server] Speech stopped detected for channel 1743769424.9 | Duration: N/As | Status: Received
N/A | 2025-04-04T12:23:50.356Z [INFO] RTP stream stopped for channel 1743769424.9
N/A | 2025-04-04T12:23:50.356Z [INFO] Stopped RTP stream due to user speech for channel 1743769424.9
N/A | 2025-04-04T12:23:50.356Z [INFO] Initializing RTP stream to 127.0.0.1:11430 for channel 1743769424.9
N/A | 2025-04-04T12:23:50.357Z [INFO] Finished RTP stream for channel 1743769424.9 | Total duration: 4.59s | Total bytes sent: 3680 | Total packets: 10
S-0011 | 2025-04-04T12:23:50.409Z [INFO] [Server] Response completed for channel 1743769424.9 | Duration: N/As | Audio Fragments: 0 | Text Fragments: 0 | RTP Packets: 5 | RTP Bytes: 1200
C-0005 | 2025-04-04T12:23:50.410Z [INFO] [Client] Cleared OpenAI audio buffer for channel 1743769424.9
N/A | 2025-04-04T12:23:50.456Z [INFO] RTP stream fully initialized for channel 1743769424.9
N/A | 2025-04-04T12:23:50.456Z [INFO] StreamHandler initialized for channel 1743769424.9 | Ready: true
N/A | 2025-04-04T12:23:51.065Z [INFO] Received 100 RTP packets from 127.0.0.1:11430, total bytes: 17200, rate: 49.98 packets/s
N/A | 2025-04-04T12:23:51.106Z [INFO] Audio processed for channel 1743769424.9 | RMS: 0.000 | Max sample before: 16, after: 16
N/A | 2025-04-04T12:23:51.699Z [INFO] Channel 1743769424.9 removed from sipMap at start of StasisEnd
N/A | 2025-04-04T12:23:51.699Z [INFO] Send timeout cleared for channel 1743769424.9
N/A | 2025-04-04T12:23:51.699Z [INFO] RTP stream stopped for channel 1743769424.9
N/A | 2025-04-04T12:23:51.699Z [INFO] StreamHandler stopped for channel 1743769424.9 in StasisEnd
N/A | 2025-04-04T12:23:51.699Z [INFO] Channel 1743769424.9 hung up, checking playback status before cleanup
N/A | 2025-04-04T12:23:51.699Z [INFO] Finished RTP stream for channel 1743769424.9 | Total duration: 1.34s | Total bytes sent: 3680 | Total packets: 10
N/A | 2025-04-04T12:23:51.801Z [INFO] WebSocket closed for channel 1743769424.9 in StasisEnd
N/A | 2025-04-04T12:23:51.809Z [INFO] Bridge e7014e22-d699-4e5c-825d-1bbd81e07e79 destroyed
N/A | 2025-04-04T12:23:51.809Z [INFO] Channel ended: 1743769424.9
N/A | 2025-04-04T12:23:52.207Z [INFO] RTP stream stopped for channel 1743769424.9
C-0006 | 2025-04-04T12:23:52.207Z [INFO] [Client] OpenAI WebSocket connection closed for channel 1743769424.9 | Status: Finished

No audio is heard or produced

asterisk_to_openai_rt/test-tools# node 4_app_voicechat_rt.js
2025-04-04T12:11:15.515Z - Starting connection…
2025-04-04T12:11:15.516Z - Output file initialized: ./output_audio_2025-04-04T12-11-15-514Z.pcm
2025-04-04T12:11:16.923Z - Connection established
2025-04-04T12:11:16.923Z - Audio loaded, total size: 80400 bytes
2025-04-04T12:11:16.926Z - Sent chunk: 4800 bytes
2025-04-04T12:11:17.027Z - Sent chunk: 4800 bytes
2025-04-04T12:11:17.128Z - Sent chunk: 4800 bytes
2025-04-04T12:11:17.220Z - Voice detection started
2025-04-04T12:11:17.229Z - Sent chunk: 4800 bytes
2025-04-04T12:11:17.330Z - Sent chunk: 4800 bytes
2025-04-04T12:11:17.432Z - Sent chunk: 4800 bytes
2025-04-04T12:11:17.533Z - Sent chunk: 4800 bytes
2025-04-04T12:11:17.634Z - Sent chunk: 4800 bytes
2025-04-04T12:11:17.735Z - Sent chunk: 4800 bytes
2025-04-04T12:11:17.836Z - Sent chunk: 4800 bytes
2025-04-04T12:11:17.937Z - Sent chunk: 4800 bytes
2025-04-04T12:11:18.037Z - Sent chunk: 4800 bytes
2025-04-04T12:11:18.137Z - Sent chunk: 4800 bytes
2025-04-04T12:11:18.238Z - Sent chunk: 4800 bytes
2025-04-04T12:11:18.338Z - Sent chunk: 4800 bytes
2025-04-04T12:11:18.438Z - Sent chunk: 4800 bytes
2025-04-04T12:11:18.538Z - Sent chunk: 3600 bytes
2025-04-04T12:11:18.632Z - Audio transcribed (optional): No transcription
2025-04-04T12:11:18.640Z - Audio confirmed
2025-04-04T12:11:18.640Z - Response request sent
2025-04-04T12:11:18.849Z - Audio saved to: ./output_audio_2025-04-04T12-11-15-514Z.pcm
2025-04-04T12:11:19.173Z - Error: Error committing input audio buffer: buffer too small. Expected at least 100ms of audio, but buffer only has 0.00ms of audio.
2025-04-04T12:11:19.173Z - Connection closed
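
A possible reading of the "buffer too small" error above, offered as a guess rather than a confirmed diagnosis: with server-side voice activity detection enabled, the server commits the input buffer on its own when speech ends, so a manual commit sent afterwards finds an empty buffer. One way to guard against that is to track how much audio has been appended since the last commit. The sketch assumes 16-bit mono PCM at 24 kHz, which is consistent with the 4800-byte chunks sent roughly every 100 ms in this log; the 100 ms minimum comes from the error message. The class and function names are hypothetical.

```javascript
// Sketch: avoid committing an input buffer that holds less than 100 ms of audio.

function pcm16DurationMs(byteLength, sampleRate) {
  // 2 bytes per sample, mono.
  return (byteLength * 1000) / (2 * sampleRate);
}

class InputBufferTracker {
  constructor(sampleRate, minCommitMs = 100) {
    this.sampleRate = sampleRate;
    this.minCommitMs = minCommitMs;
    this.pendingBytes = 0;
  }
  // Call after every input_audio_buffer.append that is sent.
  append(byteLength) {
    this.pendingBytes += byteLength;
  }
  // Call after a manual commit, or when the server reports
  // input_audio_buffer.committed (which server VAD triggers by itself).
  markCommitted() {
    this.pendingBytes = 0;
  }
  canCommit() {
    return pcm16DurationMs(this.pendingBytes, this.sampleRate) >= this.minCommitMs;
  }
}

const tracker = new InputBufferTracker(24000);
for (let i = 0; i < 16; i++) tracker.append(4800);
tracker.append(3600); // same chunk sizes as in the log, 80400 bytes in total
const beforeServerCommit = tracker.canCommit();
tracker.markCommitted(); // server VAD committed the buffer for us
const afterServerCommit = tracker.canCommit();
console.log(beforeServerCommit, afterServerCommit);
```

If this is the cause, the fix is either to skip the manual commit while server VAD is active or to disable server VAD and commit manually.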

Hi, I have the same problem here. Any clues? Thanks.

Hi. I have a problem.

asterisk@VM-9d8a061d-0352-4e8a-97c3-2e541c08a880:/asterisk_to_openai_rt$ nodejs asterisk_to_openai_rt.js
[
'This API is using a deprecated version of Swagger! Please see Home · swagger-api/swagger-core Wiki · GitHub for more info'
]
N/A | 2025-04-06T19:22:35.191Z [INFO] Connected to ARI at http://127.0.0.1:8088
N/A | 2025-04-06T19:22:35.204Z [INFO] ARI application "stasis_app" started
N/A | 2025-04-06T19:22:35.210Z [INFO] RTP Receiver listening on 127.0.0.1:12000
N/A | 2025-04-06T19:22:40.171Z [INFO] StasisStart event received for channel 1743967360.75, name: PJSIP/4000-00000030
N/A | 2025-04-06T19:22:40.171Z [INFO] SIP channel started: 1743967360.75
N/A | 2025-04-06T19:22:40.178Z [ERROR] Error in SIP channel 1743967360.75: {
"message": "Write access denied"
}
N/A | 2025-04-06T19:22:42.004Z [INFO] Channel ended: 1743967360.75
N/A | 2025-04-06T19:28:43.739Z [INFO] StasisStart event received for channel 1743967723.80, name: PJSIP/4000-00000033
N/A | 2025-04-06T19:28:43.739Z [INFO] SIP channel started: 1743967723.80
N/A | 2025-04-06T19:28:43.742Z [ERROR] Error in SIP channel 1743967723.80: {
"message": "Write access denied"
}
N/A | 2025-04-06T19:28:57.793Z [INFO] Channel ended: 1743967723.80
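
One likely cause of "Write access denied": Asterisk returns this message when the ARI user is configured as read-only, which blocks anything other than GET requests, including answering the channel. Below is a sketch of the relevant `ari.conf` section. The user name and password are placeholders, and on FreePBX the ARI user is normally managed from the GUI rather than by editing this file by hand, so treat the file layout as an assumption.

```ini
; Sketch of an ARI user that is allowed to issue write requests.
[general]
enabled = yes

[ari_user]            ; hypothetical user name
type = user
read_only = no        ; "yes" restricts the user to read-only (GET) requests
password = CHANGE_ME  ; placeholder, use your own
```

After changing the setting, reload the configuration (or apply the config in FreePBX) and make sure the app connects with that same user.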

I have the same issue: no errors, and no audio either. Has anyone figured out what the issue was, or are there any solutions to try?

Did you find any solution to this issue?