Asterisk sip trunk real-time audio and speech to text

shihabkb · February 24, 2019, 7:36am

We have installed Asterisk (version 13) on our local Ubuntu 16.04 box. We are able to make calls between two softphones (Ekiga). We are trying to integrate our speech recognition engine with Asterisk so that both callers get real-time transcription. We are planning to write an external program (in Java or Python) that listens on an address like 10.100.99.22:5060. On the Asterisk server we will configure the SIP server and give the IP address of the Java/Python program as the host to register with, so that the Java/Python program will get the RTP stream and we can pass it to our real-time speech recognition engine. I would like to know whether what we are trying to achieve is actually doable. Please correct us if we are doing something wrong. The following is the sample configuration for the SIP trunk. Could you please guide us on this?

register => <<>>

[fooprovider]
type=friend
secret=<<>>
username=<<>>
host=sip.provider.foo
dtmfmode=rfc2833
canreinvite=no
disallow=all
allow=ulaw
allow=alaw
allow=gsm
insecure=port,invite
fromdomain=sip.provider.foo
context=incoming

david551 · February 24, 2019, 10:40am

Seems a lot of people want to do real time transcription, but I don’t think current technology is capable of this without a high error rate. You need to train the recognizer on a considerable amount of speech from a speaker to get good recognition and you need to look ahead to gain more context to properly establish the likely words.

I assume you mean the Asterisk daemon, as you actually seem to want to use Asterisk as a SIP client here.

With your current design you will have to use a SIP stack in your Java program, because SIP registration does not, in any way, set up a media connection.
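To illustrate why a registration alone delivers no audio: the address and port for RTP are negotiated in the SDP body carried by an INVITE and its answer, not by REGISTER. The following is only a minimal sketch of pulling the audio address, port and payload types out of such an SDP body. The function name `parse_sdp_audio` and the sample values are made up for illustration, it only handles IPv4 `c=` lines, and a real program would use a proper SIP stack rather than hand parsing.

```python
def parse_sdp_audio(sdp):
    """Return (ip, port, payload_types) for the first audio stream, or None.

    Illustrative helper only: handles "c=IN IP4" connection lines and a
    plain "m=audio <port> RTP/AVP <pt> ..." media line.
    """
    session_ip = media_ip = port = None
    payloads = []
    seen_media = in_audio = False
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            seen_media = True
            fields = line[2:].split()
            # Only the first audio section is of interest here.
            in_audio = fields[0] == "audio" and port is None
            if in_audio:
                port = int(fields[1])
                payloads = [int(pt) for pt in fields[3:]]
        elif line.startswith("c=IN IP4 "):
            addr = line.split()[2]
            if not seen_media:
                session_ip = addr      # session-level connection address
            elif in_audio:
                media_ip = addr        # media-level address overrides it
    if port is None:
        return None
    return (media_ip or session_ip, port, payloads)


# Hypothetical SDP offer; payload types 0, 8 and 3 are the static RTP
# types for PCMU (ulaw), PCMA (alaw) and GSM.
EXAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=- 1 1 IN IP4 10.100.99.22",
    "s=call",
    "c=IN IP4 10.100.99.22",
    "t=0 0",
    "m=audio 4000 RTP/AVP 0 8 3",
    "a=rtpmap:0 PCMU/8000",
])
```

Calling `parse_sdp_audio(EXAMPLE_SDP)` gives the address and port where the RTP for that call would be sent. A REGISTER request carries no such body, which is why the listener never sees media from registration alone.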

There is a chan_rtp that I believe will send a raw RTP stream. Not many people will have used that.
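If a raw RTP stream does arrive at the external program, each UDP datagram starts with the fixed 12-byte RTP header defined in RFC 3550, followed by the encoded audio. Below is a rough sketch of parsing that header. The function names, the bind address and the port are assumptions for illustration, and how Asterisk is configured to send the stream is not covered here.

```python
import socket
import struct


def parse_rtp(packet):
    """Split one RTP datagram into header fields and payload (RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    offset = 12 + 4 * (b0 & 0x0F)          # skip any CSRC identifiers
    if b0 & 0x10:                           # header extension present
        if len(packet) < offset + 4:
            raise ValueError("truncated header extension")
        ext_words = struct.unpack("!H", packet[offset + 2:offset + 4])[0]
        offset += 4 + 4 * ext_words
    end = len(packet)
    if b0 & 0x20 and end > offset:          # padding: last byte is its length
        end -= packet[-1]
    if end < offset:
        raise ValueError("malformed packet")
    return {
        "payload_type": b1 & 0x7F,          # 0 = PCMU (ulaw), 8 = PCMA (alaw)
        "marker": bool(b1 & 0x80),
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[offset:end],
    }


def listen(host="0.0.0.0", port=4000):
    """Hypothetical receive loop; the port is an arbitrary example value."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        data, _addr = sock.recvfrom(2048)
        rtp = parse_rtp(data)
        # Hand rtp["payload"] to a decoder / recognizer here.
        print(rtp["sequence"], rtp["payload_type"], len(rtp["payload"]))
```

The payload is still encoded in whatever codec was negotiated (for example ulaw), so it would need decoding to linear PCM before most recognizers can use it. Packets can also arrive out of order or go missing, which the sequence number lets you detect.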

As to the rest of the logic for capturing real-time audio, I have given as much detail as I think reasonable for a peer support forum in a recent thread, although EAGI streaming of raw media was used in that case.
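For reference, the EAGI approach hands the script the caller's audio on file descriptor 3, by default as 8 kHz, 16-bit signed linear mono. What follows is a minimal sketch of the reading side only, not the code from the other thread. The names `iter_frames` and `run_eagi` and the 20 ms frame size are illustrative choices, and the recognizer call is left as a comment.

```python
import os
import sys

SAMPLE_RATE = 8000                                    # default EAGI audio rate
BYTES_PER_SAMPLE = 2                                  # 16-bit signed linear
FRAME_MS = 20                                         # illustrative frame size
FRAME_BYTES = SAMPLE_RATE * BYTES_PER_SAMPLE * FRAME_MS // 1000   # 320 bytes


def iter_frames(stream, frame_bytes=FRAME_BYTES):
    """Yield fixed-size audio frames from a binary stream until EOF.

    Short reads are accumulated; a trailing partial frame is dropped.
    """
    buf = b""
    while True:
        chunk = stream.read(frame_bytes - len(buf))
        if not chunk:
            return
        buf += chunk
        if len(buf) == frame_bytes:
            yield buf
            buf = b""


def run_eagi():
    """Entry point to call when Asterisk starts this script via EAGI."""
    # AGI sends "agi_*: value" lines on stdin, terminated by a blank line.
    for line in sys.stdin:
        if not line.strip():
            break
    with os.fdopen(3, "rb", buffering=0) as audio:
        for frame in iter_frames(audio):
            # Pass each 20 ms frame to the speech recognizer here.
            pass
```

Note that EAGI only provides the audio received from the calling channel, so transcribing both parties of a bridged call needs additional arrangement beyond this sketch.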

Your example “SIP trunk” configuration contains several elements of bad practice, and obsolete parameter names. It is the typical sort of configuration created by ITSPs a decade ago and designed to minimise support calls to them, not to be secure. It is also using a SIP channel driver that is no longer supported.

You need to go back to first principles. However, as I said, your actual proposed design wouldn’t use SIP.
