Hello everyone.
Hope you are doing well.
I'm planning to build a conversational AI agent using ARI, but I'm not sure how to build one from scratch.
Any help or advice would be appreciated.
Read the many many previous threads on this topic.
On Thursday 16 January 2025 at 17:33:39, techdev via Asterisk Community wrote:
Currently, I am going to build a Conversational AI Agent using ARI.
But not sure how to build it from scratch.
Have you tried a Google search to see whether anyone has discussed this previously, and possibly provided some clues to how they did it?
Other search engines may provide even more results.
Antony.
This topic will be covered at Astricon this year. Hopefully we will see you there!
I would ask this question:
Do you have a few hundred spare GPUs with a terabyte of VRAM?
I can’t run 14B-size LLMs on my laptop; you need approximately 32 GB of VRAM for that at full precision, and 14B is considered a small model. If the model doesn’t fit in VRAM, you can’t run it entirely on the GPU, and that’s going to be a problem.
Speech-to-text, which will be needed for the LLM to do anything, is also going to be computationally expensive in real time.
I’ve seen demos of AI agents running on big cloud providers dedicated to this type of development, and there’s still 2 or 3 seconds of lag between responses. It’s not quite ready for prime time. That’s not to mention that, with the lousy audio quality of phone calls, AI agents get very confused when the calls aren’t clear. We are currently using AI at my job to do call transcription, and it’s been a lot of trouble getting it to work right. Even then, it’s not right.
All the demos are being done with, no surprise, excellent audio going into the thing. The real world is going to be a problem.
Thanks for all of your many quick replies.
Yes, I’ve already seen several topics. Here is what I have learned so far from this one: How to get real time audio streams of both Calling party and called party independently - #5 by shamnusln.
Regarding step 1, should I get the bridge ID from the bridge list using setInterval and then go to the next step from there?
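Rather than polling the bridge list on a timer, ARI pushes a BridgeCreated event over its events WebSocket when a bridge appears, so you can react to it directly. Here is a minimal Python sketch of the event-filtering side; the event shape follows the ARI events documentation, but the field values are made up for illustration:

```python
import json

def extract_bridge_id(event):
    """Return the bridge id if this ARI event announces a new bridge, else None.

    Listening for BridgeCreated on the /ari/events WebSocket avoids
    polling the bridge list with setInterval.
    """
    if event.get("type") == "BridgeCreated":
        return event.get("bridge", {}).get("id")
    return None

# Example ARI event payload (values are illustrative, not from a real call).
sample = json.loads(
    '{"type": "BridgeCreated", "bridge": {"id": "bridge-1234", '
    '"technology": "simple_bridge"}}'
)
print(extract_bridge_id(sample))  # prints the bridge id from the event
```

In practice you would feed each JSON message from the ARI WebSocket through a filter like this and move to your next step as soon as it returns an id.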
Hey there,
That’s a good question! I have a few suggestions that might help.
If you’re working with limited resources, I recommend checking out something like Ollama. It allows you to switch models in the future when you have access to better resources without requiring significant changes to your code.
Additionally, I’ve built an AI agent with excellent response times, leveraging two GPUs with a total of 32 GB VRAM, along with local STT and TTS capabilities.
If you’d like to see a demo, you can explore bland.ai or check out Vocode, an open-source project.
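To make the Ollama suggestion concrete, here is a small stdlib-only sketch against Ollama's default local HTTP endpoint (`/api/generate` on port 11434). The model name is just an example; pull whatever fits your VRAM:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_generate_request(model, prompt):
    """Build the JSON body for Ollama's /api/generate endpoint.

    stream=False returns one JSON object instead of a stream of chunks,
    which is the simplest thing to start with.
    """
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model, prompt):
    """Send the request; this only works if an Ollama server is running locally."""
    body = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

# "llama3.2:3b" is only an example model tag; swap in a larger one later
# without changing any of this code.
payload = build_generate_request("llama3.2:3b", "Say hello to the caller.")
print(json.dumps(payload))
```

Because the model name is just a string in the request, upgrading to a bigger model later really is a one-line change, which is the point being made above.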
I’m planning to put a Kamailio + rtpengine instance in front of Asterisk so that the caller’s audio includes a background audio file with the classic noise of a standard office: hissing, typing on the keyboard, and similar things. The latency in this case does not give that sense of anguish that silence does.
On Friday 17 January 2025 at 10:22:39, simone686 via Asterisk Community wrote:
I’m planning to put an instance of Kamailio + rtpengine in front of Asterisk
Understood. Do you have a question?
Antony.
Thanks for so many valuable replies for my question.
I read all of your opinions carefully, and I think I can implement a conversational AI agent in two ways: via ARI or via AudioSocket.
Now I am not sure which is the best way for a conversational AI agent.
I did some more research yesterday and found that while there was a lot of material on ARI, there wasn’t enough material on Asterisk AudioSocket.
Given the topic that will be covered at Astricon next month, I am personally more interested in Asterisk AudioSocket.
I’d appreciate it if you could give me any advice about this.
Thank you.
That talk and usage is through ARI and External Media, FYI. We don’t use AudioSocket at Sangoma; it’s a community-supported part of Asterisk.
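For reference, the External Media path mentioned here is driven by a single ARI request: `POST /channels/externalMedia` asks Asterisk to fork call audio over RTP to an external host (the STT/LLM side of the agent). A minimal sketch of building that request follows; the base URL, app name, and host/port are example values, not anything from the talk:

```python
from urllib.parse import urlencode

def external_media_url(base, app, external_host, fmt="slin16"):
    """Build the URL for ARI's POST /channels/externalMedia request.

    Asterisk will send the channel's audio as RTP to external_host and
    hand the channel to the named Stasis app. "slin16" is 16 kHz signed
    linear audio, a common choice for feeding speech-to-text.
    """
    params = {"app": app, "external_host": external_host, "format": fmt}
    return f"{base}/channels/externalMedia?{urlencode(params)}"

# Example values only: a local ARI instance and a local RTP listener.
url = external_media_url("http://127.0.0.1:8088/ari", "ai-agent", "127.0.0.1:9999")
print(url)
```

You would POST to this URL with your ARI credentials and then read the RTP stream arriving on the external host; the reverse direction (TTS audio back into the call) uses the same External Media channel.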
For the conversational AI, what about taking a third-party SIP library, using it to connect to the AI agent, and then just registering it as a SIP extension?
Not sure whether it will work or not.
Any help would be appreciated.