Howto implement speech recognition barge-in with ARI

eyalhasson · June 30, 2017, 1:00pm

Hello,
We have developed an ARI application that uses Google’s Speech Recognition API (we need Hebrew SR, so no other option). The app records the user’s speech and sends it to Google for recognition. Can you give us some tips on how to implement barge-in in this configuration?

Thanks.

eyalhasson · July 2, 2017, 4:06pm

Maybe I’ll ask it this way: Is there a way to get an event from ARI when voice is detected?

jcolp · July 2, 2017, 8:18pm

The TALK_DETECT dialplan function[1] can be used to detect when talking starts and stops on a channel. It raises an ARI event, and it can be set on a channel in ARI using the normal channel variable route.

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+13+Function_TALK_DETECT
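A minimal sketch of the "channel variable route", assuming ARI is reachable at a hypothetical `http://localhost:8088/ari` and that authentication (HTTP basic auth or an `api_key` parameter) is added by the caller. It only builds the `POST /channels/{channelId}/variable` request; the helper name and the threshold values are illustrative, not recommended settings.

```python
from urllib.parse import urlencode, quote
from urllib.request import Request


def build_talk_detect_request(base_url, channel_id, silence_ms, talking_threshold):
    """Build (but do not send) the ARI request that enables TALK_DETECT.

    The value is "silence,talking": per the discussion in this thread, the
    first number is the silence duration that ends a talk burst and the
    second is the energy level treated as talking.
    """
    query = urlencode({
        "variable": "TALK_DETECT(set)",
        "value": f"{silence_ms},{talking_threshold}",
    })
    url = f"{base_url}/channels/{quote(channel_id, safe='')}/variable?{query}"
    # Credentials are deliberately omitted; add them before sending.
    return Request(url, method="POST")


# Hypothetical channel id and thresholds, for illustration only.
req = build_talk_detect_request("http://localhost:8088/ari", "1499231.5", 1200, 128)
print(req.get_method(), req.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) should cause ChannelTalkingStarted and ChannelTalkingFinished events to be delivered to the Stasis application that owns the channel.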

eyalhasson · July 3, 2017, 5:33am

Seems exactly what I’m looking for. I’ll try and update. Thanks!

eyalhasson · July 5, 2017, 5:40am

A little bit tricky. The main problem is that if I start recording on the ChannelTalkingStarted event, I lose the start of the user’s sentence. I can reduce the loss by increasing the sensitivity of the detection, but then I get more false detections. Is there a way to buffer the last second or so?

BTW: I think the documentation on the wiki page above has a mistake in the descriptions of the parameters. If I understand correctly, the first is the duration of silence that is treated as the end of talking, and the second is the energy level that is considered talking.

jcolp · July 5, 2017, 9:06am

There is no way to buffer the last second. The dialplan function is strictly to know when talking starts and ends.

As for the documentation please leave a comment on the wiki page and we’ll look into it.

eyalhasson · July 5, 2017, 3:43pm

I am trying to record the user from the start of the call, and to run recognition from one second before the ChannelTalkingStarted event. The strange thing is that once I start recording the user, prompt playback stops working. What’s going on here?

jcolp · July 5, 2017, 3:46pm

You can’t do two things at once to a channel. Record in ARI is just that: it records the channel as if you were calling Record() in the dialplan. It is not a MixMonitor equivalent. The foundation is there to implement such a thing, though, using a Snoop channel and Record on the Snoop channel.
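The Snoop-plus-Record approach can be sketched as two ARI requests: one to create a snoop channel on the caller's channel, and one to start recording on that snoop channel, which leaves the original channel free for prompt playback. This is only a sketch under assumptions: the base URL, application name, snoop id, and recording name are hypothetical, authentication is omitted, and `spy="in"` is assumed to select the audio coming from the caller.

```python
from urllib.parse import urlencode, quote
from urllib.request import Request

ARI_BASE = "http://localhost:8088/ari"  # assumed ARI address


def ari_post(path, **params):
    """Build (but do not send) an ARI POST request with query parameters."""
    return Request(f"{ARI_BASE}{path}?{urlencode(params)}", method="POST")


def snoop_request(channel_id, app, snoop_id):
    # POST /channels/{channelId}/snoop creates a snoop channel that enters
    # the given Stasis app; the original channel stays free for playback.
    path = f"/channels/{quote(channel_id, safe='')}/snoop"
    return ari_post(path, spy="in", app=app, snoopId=snoop_id)


def record_request(snoop_id, name):
    # POST /channels/{channelId}/record, issued against the snoop channel.
    path = f"/channels/{quote(snoop_id, safe='')}/record"
    return ari_post(path, name=name, format="ulaw", ifExists="overwrite")


# Hypothetical identifiers, for illustration only.
print(snoop_request("1499231.5", "bargein", "snoop-1").full_url)
print(record_request("snoop-1", "utterance-1").full_url)
```

Passing an explicit `snoopId` means the recording request can be built without first parsing the snoop response.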

eyalhasson · July 5, 2017, 4:53pm

I see. Is there any preference between creating and recording a snoop channel or a holding bridge?

jcolp · July 5, 2017, 4:55pm

I don’t understand the question. They are separate things. While in a bridge you also have limited control over the channel.

eyalhasson · July 5, 2017, 4:57pm

I meant that I can record the user either by using a snoop channel or by adding him to a bridge and recording the bridge. I am wondering which is better in terms of resource usage.

jcolp · July 5, 2017, 4:58pm

You’d need to profile and see for your use case what would be better.

eyalhasson · July 5, 2017, 4:59pm

O.K., I’ll try and see. Thanks.

eyalhasson · July 10, 2017, 5:28am

Working well now with snoop channel. Thanks!

gayake.sambhaji · August 21, 2018, 4:04pm

Hi,

Will it be possible for you to share source code of your application? I am curious to know how it is done. You can e-mail it to me on gayake.sambhaji@gmail.com

Thanks!

DS1 · August 27, 2018, 2:55pm

Do you have any advice on how to do the same thing?

eyalhasson · October 30, 2018, 2:20pm

Basically, what we do is create a snoop channel, which immediately starts recording the user, and we save the time the recording started. Then, when we get the ChannelTalkingStarted event on the snoop channel, we save that time, too. When the ChannelTalkingFinished event finally arrives, we stop recording and copy the recorded file starting from one second before the ChannelTalkingStarted event (since we record in ulaw, this is easy to do). Then we send this file to Google speech recognition.
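The trimming step described above could look roughly like this. It is a sketch, not the poster's actual code: it assumes a raw, headerless 8 kHz ulaw recording (one byte per sample, so 8000 bytes per second) and timestamps taken on the same clock. The function name and the `preroll` parameter are made up for illustration.

```python
ULAW_BYTES_PER_SECOND = 8000  # 8 kHz sampling, 1 byte per sample, no header


def trim_ulaw(data: bytes, record_start: float, talk_start: float,
              preroll: float = 1.0) -> bytes:
    """Return the recording starting `preroll` seconds before talking began.

    record_start and talk_start are timestamps in seconds on the same clock
    (for example, time.time() values saved when the recording started and
    when ChannelTalkingStarted arrived).
    """
    # Clamp to zero in case talking was detected within the first second.
    offset_seconds = max(0.0, talk_start - record_start - preroll)
    offset_bytes = int(offset_seconds * ULAW_BYTES_PER_SECOND)
    return data[offset_bytes:]


# Five seconds of dummy audio; talking detected 3 s into the recording.
audio = bytes(5 * ULAW_BYTES_PER_SECOND)
clip = trim_ulaw(audio, record_start=100.0, talk_start=103.0)
print(len(clip) / ULAW_BYTES_PER_SECOND, "seconds kept")
```

Because ulaw uses one byte per sample, any byte offset falls on a sample boundary, which is why a plain slice is enough here.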

civicharles · January 4, 2019, 2:31pm

I think the secret is to jump into the Asterisk internals and see if you can build access to the dialplan speech applications through ARI. That would be ideal.
