Conversation with the phone system, continuous speech recogn

alsacc · April 23, 2013, 9:59am

Hi,
I would like to develop a phone application where the user gets called by the system and engages in a conversation with it. The system talks using pre-recorded audio files, while the user input should be continuously monitored and analyzed by a speech recognition engine. I have some experience with Sphinx, and I have seen that someone got it to work with Asterisk. Still, I haven’t found a way to handle all incoming audio as a stream in order to recognize continuous speech. Is this possible? I’d like to launch an instance of a custom program each time a call is started, and send the audio data to it through sockets or pipes.
Looking forward to your comments about the question, and about the possible limits of the system.
Cheers
Alessandro
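
A minimal sketch of the "one handler per call, audio over a socket" idea from the post above, using only the Python standard library. It is not tied to Asterisk: the names `CallHandler`, `make_server` and `bytes_per_call` are made up for illustration, and the frame size assumes 8 kHz, 16-bit mono audio, which the thread does not state. How the call audio actually reaches the socket is left open here.

```python
import socketserver

FRAME_BYTES = 320  # assumption: 20 ms of 8 kHz, 16-bit mono audio


class CallHandler(socketserver.BaseRequestHandler):
    """One instance is created per TCP connection, i.e. per call."""

    def handle(self):
        total = 0
        while True:
            data = self.request.recv(FRAME_BYTES)
            if not data:  # peer closed the connection: the call ended
                break
            total += len(data)
            # A real handler would pass `data` to the recognizer here.
        # Record how many audio bytes this call delivered.
        self.server.bytes_per_call.append(total)


def make_server(host="127.0.0.1", port=0):
    """Build a threaded server; each connection runs in its own thread."""
    server = socketserver.ThreadingTCPServer((host, port), CallHandler)
    server.daemon_threads = True
    server.bytes_per_call = []
    return server
```

Calling `make_server(port=9000).serve_forever()` would keep the server running; the port number is arbitrary. Threads are used instead of separate processes only to keep the sketch short.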

ambiorixg12 · April 23, 2013, 9:50pm

Wow, it sounds like artificial intelligence. I don’t know if this might help: zaf.github.io/asterisk-speech-recog/ but it’s a really interesting project and I was thinking of something similar.

alsacc · April 24, 2013, 7:28am

Thanks for your answer. There wouldn’t be much natural language processing, mostly a different grammar for each state of the system, that is, for each time the user is prompted for an interaction. The first thing I noticed in your link is “Records from the current channel untill 3 seconds of silence are detected”. That’s no good.
I have found this, which looks very promising:

code.google.com/p/unimrcp/wiki/asteriskUniMRCP

The question could be more general, though: say that I know how to do speech recognition given a stream of audio data. Is it possible in Asterisk to have some custom code that processes the audio data of the phone call in real time? I basically just need that; then it’s a matter of adapting Sphinx or pocketsphinx.
Thanks
Alessandro
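
To make "custom code that processes the audio data in real time" concrete, here is a rough sketch of the receiving side only: it cuts any binary stream (a pipe, a socket file, standard input) into fixed-size frames and hands each one to a callback as soon as it is complete. The audio format (8 kHz, 16-bit signed linear, mono) and the names `read_frames`, `process_call` and `on_frame` are assumptions for illustration, not something Asterisk or Sphinx defines.

```python
SAMPLE_RATE = 8000      # assumption: narrowband telephone audio
BYTES_PER_SAMPLE = 2    # assumption: 16-bit signed linear PCM, mono
FRAME_MS = 20           # frame length in milliseconds

FRAME_BYTES = SAMPLE_RATE * BYTES_PER_SAMPLE * FRAME_MS // 1000  # 320


def read_frames(stream, frame_bytes=FRAME_BYTES):
    """Yield fixed-size frames from a binary stream until EOF."""
    buf = b""
    while True:
        # Pipes and sockets may return fewer bytes than requested,
        # so keep reading until a whole frame has been collected.
        chunk = stream.read(frame_bytes - len(buf))
        if not chunk:
            break
        buf += chunk
        if len(buf) == frame_bytes:
            yield buf
            buf = b""
    if buf:
        yield buf  # trailing partial frame at end of call


def process_call(stream, on_frame):
    """Feed every frame to `on_frame` (e.g. a recognizer); return frame count."""
    count = 0
    for frame in read_frames(stream):
        on_frame(frame)
        count += 1
    return count
```

For example, `process_call(sys.stdin.buffer, recognizer_feed)` would consume audio piped into the program, where `recognizer_feed` stands for whatever function passes raw samples to the decoder.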

ianplain · April 24, 2013, 8:17am

Hi

You might want to look at this: zaf.github.io/asterisk-speech-recog/ We have used it. On single words and very short sentences it’s OK, but it starts to trip up on longer ones.
There is a reason for this, and that’s to do with the way Google does speech recognition: they use a predictive model, which can lead to odd results. googleresearch.blogspot.co.uk/20 … ng-in.html

ianplain · April 24, 2013, 8:55am

This might help a bit. It is for CentOS, but the theory is the same:
cyber-cottage.co.uk/en/2013/02/i … entos-6-3/

nsh · April 24, 2013, 1:24pm

UniMRCP is exactly the thing you need: it handles a continuous stream and processes the data as soon as it arrives. UniMRCP has a pocketsphinx plugin which is easy to set up and which provides the decoding you need. It’s not free from issues, but once you integrate an important patch from

code.google.com/p/unimrcp/issues/detail?id=149

it will give you good accuracy.

If you have any troubles with UniMRCP or Pocketsphinx, feel free to ask.

vijkaush · February 19, 2024, 5:41pm

Hi Sir,
Could you help me achieve the following call flow with Asterisk?
call --> PSTN --> Contact Center (Genesys/Cisco/Avaya) --> Asterisk --> NLP+STT+TTS

vijkaush · February 19, 2024, 5:43pm

Hi,
Were you able to solve this problem?

vijkaush · February 19, 2024, 5:45pm

I am also working on a similar project but am not sure how to achieve it.
I have developed a bot, but I am not sure how a telephone call would be routed to it.
