Using custom TTS from external server with ARI

We have a custom TTS engine (SMI) which runs on a Windows server, and we need to use it with our ARI application (Asterisk 13). Currently I create a PCM file and make Asterisk play it. The problem is that this takes too much time, since I need to wait for the TTS server to fully synthesize the whole sentence before passing it to Asterisk. I thought of 2 options to solve this, but have no idea how to do either with Asterisk/ARI:

  1. Make Asterisk start playing the file as soon as its first chunk is ready, while I continue writing the rest of the speech to it. I tried that, but it seems Asterisk just uses whatever is in the file at the moment the play command is issued, and ignores the rest.
  2. Stream the results to Asterisk. Could not find anything about how to do that.

Do you have any idea how to implement one of these approaches, or, better yet, another idea?

Eyal Hasson.

I tried playing the stream over HTTP on Asterisk 16 (the TTS engine provides an HTTP stream into which it writes the synthesized audio). It seems Asterisk waits for the whole stream to complete before starting playback, so there is no time saving here. Is there a way to force Asterisk to start playback immediately?

Hi @eyalhasson

Seeing as you’re already using ARI, could you use the new External Media support in ARI? It allows you to write audio into Asterisk instead of playing back file after file. I used it and wrote about it here - and in the next couple of weeks I’ll be demoing tying Dialogflow up to Asterisk using it. You’ll want the AudioServer part of it, which lives on GitHub - you probably wouldn’t even need to change your TTS application at all; just have this audioserver call the URI instead, and use the ability in whatever language you use (it doesn’t have to be Node) to read the response before it’s finished downloading and push it straight back into Asterisk.
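To make the suggestion concrete: creating an External Media channel is a single ARI REST call (the feature landed in Asterisk 16.6, so it isn’t available on 13). A minimal sketch in Python, where the Stasis app name, ARI address, and the `ip:port` of the media server are all placeholder assumptions:

```python
from urllib.parse import urlencode

ARI_BASE = "http://localhost:8088/ari"  # assumption: default ARI HTTP bind


def external_media_params(app, external_host, audio_format="slin16"):
    """Build the query parameters for POST /ari/channels/externalMedia.

    Asterisk will send the channel's audio as RTP to external_host, and it
    accepts RTP back from that peer, so a TTS server can push audio
    straight into the call.
    """
    return {
        "app": app,                      # our Stasis application name
        "external_host": external_host,  # "ip:port" of our UDP media server
        "format": audio_format,          # slin16 = 16-bit signed linear PCM
    }


# Example request URL (authentication omitted; send it with any HTTP client):
params = external_media_params("my-tts-app", "10.0.0.5:5554")
url = ARI_BASE + "/channels/externalMedia?" + urlencode(params)
```

The resulting channel can then be bridged with the caller’s channel like any other ARI channel, and everything written to that UDP socket is heard on the call.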


Hello @danjenkins,

Thanks - I was not aware of this, and it is really something that has been needed for a long time. Does your AudioServer also support streaming into Asterisk?

The code currently isn’t there to do so, but it will be in about 2 weeks’ time (I’ll be giving that talk at ITExpo about using Dialogflow). But if you wanted to do it, you absolutely can: where I read from the stream and then write out to Google’s Speech To Text engine, you’d want to write to it instead - it’s as simple as that, really. Sure, you’d want to do some buffering on the way out too, so that you don’t chuck a load of media down that UDP socket, but in essence it should be that simple. Get your media via HTTP, then pipe the stream from the HTTP response into the audio stream I’m reading from :slight_smile:
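The buffering step mentioned above matters because RTP audio is normally sent in fixed 20 ms frames, while HTTP chunks arrive in arbitrary sizes. A sketch of re-slicing an incoming chunk stream into frame-sized pieces (8 kHz 16-bit mono is assumed; in real use you would also sleep ~20 ms between sends so the UDP socket isn’t flooded):

```python
def frames_20ms(chunks, sample_rate=8000, sample_width=2):
    """Re-slice an arbitrary byte stream into fixed 20 ms audio frames.

    A 20 ms frame of 8 kHz 16-bit mono PCM is 320 bytes; we buffer the
    incoming HTTP chunks and yield exact frame-sized pieces, one per
    future RTP packet.
    """
    frame_bytes = sample_rate * sample_width * 20 // 1000  # 320 at 8 kHz/16-bit
    buf = b""
    for chunk in chunks:
        buf += chunk
        while len(buf) >= frame_bytes:
            yield buf[:frame_bytes]
            buf = buf[frame_bytes:]
    if buf:  # pad the trailing partial frame with silence
        yield buf + b"\x00" * (frame_bytes - len(buf))
```

Feeding this generator from an HTTP response’s chunk iterator gives a steady sequence of 320-byte frames, which is exactly the shape the External Media socket expects.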

Hello @danjenkins,

I started playing with the External Media support as you suggested, using a UDP server written in C#. I am able to send back my audio, but it is not heard on the channel. I think that is because I have to wrap it in the RTP protocol. Am I correct? If so, can you direct me to an example of how to do that?
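You are correct that raw PCM on the socket won’t be heard - External Media speaks RTP, so each frame needs a 12-byte RTP header (RFC 3550) in front of it. A minimal Python sketch (the payload type 118 is an assumption - use whatever Asterisk actually negotiated for the channel’s format; the same idea translates directly to C#):

```python
import struct


def rtp_packet(payload, seq, timestamp, ssrc, payload_type=118):
    """Wrap one audio frame in a minimal 12-byte RTP header (RFC 3550).

    seq increments by 1 per packet; timestamp increments by the number of
    samples per frame (160 for 20 ms at 8 kHz); ssrc is any constant
    32-bit stream identifier chosen by the sender.
    """
    header = struct.pack(
        "!BBHII",
        0x80,                  # version=2, no padding/extension/CSRC
        payload_type & 0x7F,   # marker bit clear, 7-bit payload type
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + payload
```

In use, each 20 ms frame would be wrapped with an incrementing sequence number and timestamp and sent via UDP to the address and port Asterisk is sending its RTP from.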

Hey @eyalhasson, I’m literally just about to write something that does this in Node for my demo at ITExpo next week, so I’ll get back to you soon (or nudge me in a couple of days if I haven’t gotten back to you).

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.