Need help connecting Asterisk to Python and GCP

I have built a Python program that takes in speech, finds something, and gives a response. It uses Google Speech-to-Text. The speech comes in as input from a file recorded from a microphone.
Now I need to do the same with a telephone call. I am using Asterisk to connect the call to the machine. How do I connect my code to Asterisk so that it can listen to the phone call and respond to the caller? I have set up a SIP trunk but am confused about how to write the dialplan for this.
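
For reference, the current file-based flow is along these lines with the google-cloud-speech client library (the encoding, sample rate, and language shown here are placeholders, not necessarily the exact values I use):

```python
# Minimal sketch of file-based recognition with Google Speech-to-Text.
from google.cloud import speech

def transcribe_file(path: str) -> str:
    client = speech.SpeechClient()

    with open(path, "rb") as f:
        audio = speech.RecognitionAudio(content=f.read())

    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,   # placeholder: match the actual recording
        language_code="en-US",     # placeholder
    )

    response = client.recognize(config=config, audio=audio)
    return " ".join(r.alternatives[0].transcript for r in response.results)
```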

So, the phone should answer, play back the recorded sound, and you want to be able to talk as well? Is that your scenario?

Yes, exactly. I am trying to build a voice bot over the phone.

You are lucky: a new module was added to recent Asterisk versions; a couple of years ago we were hopeless. Please check this link for an idea of how the AudioSocket module works: https://github.com/NormHarrison/audiosocket_server
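
In case it helps, here is a rough Python sketch of the receiving side, based on the frame format described in that repo and the Asterisk AudioSocket docs: each frame is one type byte, a 2-byte big-endian length, then the payload; type 0x01 carries the call UUID, 0x10 carries a frame of caller audio, 0x00 means hangup. The host, port, and what you do with the audio are placeholders; on the Asterisk side you point the AudioSocket dialplan application at this address.

```python
# Minimal AudioSocket TCP server sketch: accept a connection from Asterisk,
# read frames, and decide what to do with the caller's audio.
import socket
import struct

HOST, PORT = "0.0.0.0", 9092          # assumption: match your dialplan

def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise if the peer disconnects."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("Asterisk closed the connection")
        buf += chunk
    return buf

def handle_call(conn: socket.socket) -> None:
    while True:
        kind = recv_exact(conn, 1)[0]
        (length,) = struct.unpack(">H", recv_exact(conn, 2))
        payload = recv_exact(conn, length) if length else b""

        if kind == 0x00:              # hangup
            break
        elif kind == 0x01:            # 16-byte call UUID
            print("call uuid:", payload.hex())
        elif kind == 0x10:            # one frame of caller audio
            # Feed `payload` to streaming speech recognition here.
            # To speak back, write frames in the same format, e.g.:
            # conn.sendall(bytes([0x10]) + struct.pack(">H", len(chunk)) + chunk)
            pass

with socket.create_server((HOST, PORT)) as server:
    while True:
        conn, _ = server.accept()
        with conn:
            try:
                handle_call(conn)
            except ConnectionError:
                pass
```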


Thanks man, checking… Now I need to learn how to set this up.

What language are you using for this?

You don’t really have to do that in order to get going. You could define a ConfBridge (instead of using Dial) without any announcements and add arbitrary sound with the originate command (either via the CLI or a call file). It’s even possible to trigger that from the outside.
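
For example, the call-file variant could be triggered from Python roughly like this (the context, extension, and sound names are placeholders, and it assumes an extension that just runs ConfBridge()):

```python
# Sketch: drop an Asterisk call file that originates a Local channel; one leg
# joins the ConfBridge and the other leg plays a sound into it.
import os
import shutil
import tempfile

SPOOL_DIR = "/var/spool/asterisk/outgoing"   # default spool dir, adjust as needed

def play_into_bridge(sound: str = "custom/greeting") -> None:
    callfile = (
        "Channel: Local/joinconf@bot-conf\n"   # assumption: joinconf runs ConfBridge()
        "Application: Playback\n"
        f"Data: {sound}\n"
    )
    # Write the file elsewhere first and then move it in, so Asterisk
    # never picks up a half-written call file.
    fd, tmp = tempfile.mkstemp(suffix=".call")
    with os.fdopen(fd, "w") as f:
        f.write(callfile)
    shutil.move(tmp, os.path.join(SPOOL_DIR, "play_greeting.call"))

play_into_bridge()
```

The call file also has to be readable by the user Asterisk runs as.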

OK, and how do you get audio out of the ConfBridge into a Python script?

You don’t. You control the events with your script. I am doing something similar, but in my case bash is fully sufficient to orchestrate things. That said, you have to write your own code.

OK. I am saying we can get audio out of Asterisk into an external application using the AudioSocket module. You are saying we don’t need this module, and the only hint you are giving us is that you are using bash and ConfBridge to do the job. Can you enlighten us?

It’s pretty basic stuff. You can inject arbitrary audio with the originate command, which can be used via the CLI or AMI.
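
The CLI variant of the same idea, triggered from Python (extension and sound names are placeholders again):

```python
# Sketch: ask a locally running Asterisk to originate a Local channel that
# plays a sound; the other leg of the Local channel joins the ConfBridge.
import subprocess

subprocess.run(
    ["asterisk", "-rx",
     "channel originate Local/joinconf@bot-conf application Playback custom/greeting"],
    check=True,
)
```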

What about getting audio out?

I don’t know what you mean. You’d have to be more precise about what you are trying to do.

The OP wants to access the media stream from his program in real time, as opposed to the normal tactic, which is to record the response and then have the program read the recorded file.

EAGI gives you real-time access to the incoming media, but outgoing media still has to be played as complete files. As noted above, the normal approach to speech with Google is to record responses into a file and then interpret the file, rather than doing on-the-fly interpretation.

I didn’t find clear documentation on the codec used by EAGI, so I don’t know if you get the untranscoded media or something like SLIN. If I were doing this in anger, I would look at the source code to understand what it actually does.
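
For what it’s worth, the EAGI side can be sketched like this in Python: AGI variables and command responses arrive on stdin, commands go out on stdout, and the incoming audio arrives on file descriptor 3. The codec caveat above still applies, so the sample-rate assumption in the comments should be verified:

```python
#!/usr/bin/env python3
# Minimal EAGI sketch: read the AGI environment, answer the call, buffer a
# few seconds of caller audio from fd 3, then play a response file.
import os
import sys

def read_agi_env() -> dict:
    env = {}
    while True:
        line = sys.stdin.readline().strip()
        if not line:                      # blank line ends the AGI environment
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env

def agi(cmd: str) -> str:
    sys.stdout.write(cmd + "\n")
    sys.stdout.flush()
    return sys.stdin.readline().strip()   # e.g. "200 result=0"

env = read_agi_env()
agi("ANSWER")

AUDIO_FD = 3                              # EAGI delivers incoming media here
# Assumption: 8 kHz 16-bit signed linear, i.e. 16000 bytes per second.
buf = b""
while len(buf) < 5 * 16000:               # grab roughly five seconds of audio
    data = os.read(AUDIO_FD, 320)
    if not data:
        break
    buf += data

# Hand `buf` to the existing Google Speech-to-Text code here, pick a reply,
# and write it out as a sound file Asterisk can find; outgoing media has to
# be played as a complete file, per the note above.
agi('STREAM FILE custom/reply ""')
agi("HANGUP")
```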
