I have an Asterisk box up and running with our Stasis app, and we want to use the Google Speech API or the AWS Transcribe service.
I found some topics on how to set up speech recognition, but here’s the catch: we need realtime transcription of the whole call, not just recognition of a single sentence.
Is there something that I can use to send the call audio via HTTP/2 to Google Speech API or AWS Transcribe in realtime?
I didn’t find anything that can do the “realtime” part yet.
I have used the speech recognition API with Asterisk and it works wonderfully, but only in synchronous mode. Asynchronous mode is possible, but there are some limitations; the following links will help you
@jcolp: could you explain your idea a bit further? We already use ChanSpy, but I don’t get how it can help with live speech recognition.
Also, I found that UniMRCP can do speech recognition via the Google Speech plugin, but it seems that it can’t do “live” speech recognition. Is anyone using this?
As there is nothing built into Asterisk to do live transcription as you desire, the only really sensible way is to send the audio outside of Asterisk to a third-party application, which can then read it and do what it needs to do. The UnicastRTP module, when combined with a Local channel and ChanSpy, can be used to do this. The UnicastRTP module sends the RTP to a given IP address and port[1].
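To make the "third-party application" part a bit more concrete, here is a minimal sketch of a receiver for the RTP stream. It only assumes standard RTP as described in RFC 3550; the port 4000, the names `parse_rtp` and `receive_audio`, and the idea of forwarding the payload to a speech service are illustrative assumptions, not something Asterisk prescribes. The payload is still encoded in whatever format the channel uses (for example ulaw), so it may need decoding before it goes to a recognizer.

```python
import socket
import struct
from typing import Iterator, NamedTuple


class RtpPacket(NamedTuple):
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int
    payload: bytes


def parse_rtp(data: bytes) -> RtpPacket:
    """Split one RTP datagram (RFC 3550) into header fields and payload."""
    if len(data) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    # Skip the optional CSRC identifiers (4 bytes each).
    offset = 12 + 4 * (first & 0x0F)
    if first & 0x10:  # header extension present
        if len(data) < offset + 4:
            raise ValueError("truncated header extension")
        (ext_words,) = struct.unpack("!H", data[offset + 2:offset + 4])
        offset += 4 + 4 * ext_words
    end = len(data)
    if first & 0x20 and end > offset:  # padding: last byte holds the pad length
        end -= data[-1]
    if offset > end:
        raise ValueError("header fields exceed packet length")
    return RtpPacket(second & 0x7F, sequence, timestamp, ssrc, data[offset:end])


def receive_audio(host: str = "0.0.0.0", port: int = 4000) -> Iterator[bytes]:
    """Yield the audio payload of each RTP packet arriving on host:port.

    The port is a hypothetical value; use whatever address the
    UnicastRTP channel was pointed at.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    try:
        while True:
            data, _addr = sock.recvfrom(2048)
            try:
                yield parse_rtp(data).payload
            except ValueError:
                continue  # ignore anything that is not valid RTP
    finally:
        sock.close()
```

Iterating over `receive_audio()` gives one audio payload per packet (typically 20 ms each), which is the shape a streaming recognition client usually wants. UDP can reorder or drop packets, so a real application would probably also look at `sequence` before forwarding.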
I created a repo that uses GCP for text to speech and speech to text, written in Python, with Asterisk dockerized.
I think you could do it by starting from my work and using EAGI instead of AGI:
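For anyone wondering what the EAGI difference looks like in practice, here is a rough sketch (not the code from the repo). With EAGI, Asterisk hands the script the caller's audio on file descriptor 3 as raw 8 kHz, 16-bit signed linear mono, in addition to the usual AGI environment on stdin. The function names, the 100 ms chunk size, and the `main()` wiring are assumptions for illustration; where each chunk goes next (Google Speech, AWS Transcribe, ...) is left as a placeholder.

```python
import os
import sys
from typing import BinaryIO, Dict, Iterator, TextIO

AUDIO_FD = 3          # EAGI delivers the channel audio on this descriptor
SAMPLE_RATE = 8000    # samples per second
BYTES_PER_SAMPLE = 2  # 16-bit signed linear


def read_agi_environment(stdin: TextIO) -> Dict[str, str]:
    """Read the 'agi_*: value' lines Asterisk sends first, up to the blank line."""
    env = {}
    for line in stdin:
        line = line.strip()
        if not line:
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def audio_chunks(stream: BinaryIO, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield raw audio in chunks of roughly chunk_ms until the stream ends."""
    chunk_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE * chunk_ms // 1000
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:
            return  # caller hung up or the channel went away
        yield chunk


def main() -> None:
    env = read_agi_environment(sys.stdin)
    with os.fdopen(AUDIO_FD, "rb") as audio:
        for chunk in audio_chunks(audio):
            # Placeholder: hand the chunk to a streaming recognition client here.
            sys.stderr.write(
                "%s: %d bytes\n" % (env.get("agi_channel", "?"), len(chunk))
            )
```

To try it, call `main()` at the bottom of the script and run the script from the dialplan with the EAGI application. The audio arrives continuously while the script runs, which is what makes a "live" feed to a streaming recognizer possible.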
My goal is to have a realtime speech to text system with my Asterisk.
I followed these instructions (GitHub - alphacep/vosk-asterisk: Speech Recognition in Asterisk with Vosk Server), and as a result new files started to appear in the /var/spool/asterisk/voicemail/default//INBOX folder, for example msg0000.wav, msg0000.gsm, and msg0000.txt.
I assumed the txt files were the call transcripts, but they are almost empty. They contain only info about the files, no transcriptions.
Feels so. Vosk-asterisk implements a module for the Asterisk Speech API (res_speech). You can use it in the dialplan with the SpeechBackground application. It has no relation to voicemail.