We are writing an application (Python/Java) which transcribes an audio stream into text. Our transcription engine accepts WebSocket connections. We want to get the real-time audio stream from an Asterisk call (for example, calls made with the Ekiga softphone) and send it to the transcription engine. Is there any way Asterisk supports getting the real-time audio of a conversation?
You could originate to Monitor() on a local channel and use EAGI to access the stream.
I don’t know if ARI provides more direct access to audio hooks.
You will get better speech to text if the recognizer is allowed to look ahead.
ARI does not currently provide audio access; it is really a control mechanism.
Do you have an example for “You could originate to Monitor() on a local channel and use EAGI to access the stream”? I am a total newbie at this.
I don’t do bespoke design work for free (and no longer for fees, either).
Is a SIP trunk an option for this problem?
Asterisk has no concept of a SIP trunk, and what some people call a SIP trunk would just be another subject for monitoring.
Oh, is that so? I saw a few links that describe SIP trunk support in Asterisk, but I didn’t understand them properly due to my lack of knowledge. Can we do something like port mirroring to get the voice data from a call?
Basically, I didn’t understand what is meant by “Monitor() on a local channel and use EAGI”. Is there any documentation available? Can you please point me to it?
By the way, we can get the recorded audio files, right? I saw an endpoint like /recordings/stored in the documentation.
https://wiki.asterisk.org/wiki/display/AST/Local+Channel
Monitor was a mistake, I meant ChanSpy. https://wiki.asterisk.org/wiki/display/AST/Application_ChanSpy
https://wiki.asterisk.org/wiki/display/AST/Application_EAGI
http://asteriskdocs.org/en/3rd_Edition/asterisk-book-html-chunk/AGI-variants.html#AGI_id262406
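For anyone reading those links and wondering what the EAGI side might look like, here is a minimal Python sketch, not a tested production script. It assumes the usual EAGI behaviour: Asterisk first sends `agi_*: value` header lines on stdin, terminated by a blank line, and makes the channel's inbound audio available on file descriptor 3. The audio format (8 kHz, 16-bit mono signed linear) and the 320-byte chunk size are assumptions to check against your Asterisk version. `send_chunk` is a hypothetical callback; in practice it would be whatever your WebSocket client uses to send binary data.

```python
import os
import sys

EAGI_AUDIO_FD = 3    # EAGI exposes the channel's inbound audio on file descriptor 3
CHUNK_BYTES = 320    # assumed: 20 ms of 8 kHz, 16-bit mono signed linear audio


def read_agi_env(stream):
    """Read the 'agi_*: value' header lines sent at startup, up to the blank line."""
    env = {}
    for line in stream:
        line = line.strip()
        if not line:
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def iter_audio_chunks(audio, chunk_bytes=CHUNK_BYTES):
    """Yield raw audio chunks from a binary file object until end of stream."""
    while True:
        chunk = audio.read(chunk_bytes)
        if not chunk:
            return
        yield chunk


def main(send_chunk):
    """send_chunk is a hypothetical callback, e.g. a WebSocket client's send method."""
    env = read_agi_env(sys.stdin)
    sys.stderr.write("streaming audio for %s\n" % env.get("agi_channel", "?"))
    # Unbuffered, so chunks are forwarded as soon as Asterisk writes them.
    with os.fdopen(EAGI_AUDIO_FD, "rb", buffering=0) as audio:
        for chunk in iter_audio_chunks(audio):
            send_chunk(chunk)
```

A real script would end with a call such as `main(my_socket.send)`, where `my_socket` stands for your own client object, and would be started from the dialplan with EAGI() on the channel that is running ChanSpy.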
@jcolp Can we use /recordings/stored to get the recorded audio?
Please don’t tag me. If I have anything to add to a post, then I will do so.
The recorded audio will only have valid metadata (unless you are using a raw format) at the end of the call, when the recording is terminated or the channel is closed. It will also be subject to stdio buffering, so it will only appear one buffer size at a time.
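To illustrate what reading such a file during the call would involve, here is a rough Python sketch that polls a growing raw (headerless) recording and yields fixed-size frames. The function name `follow_raw`, the 320-byte frame size, and the polling parameters are all illustrative assumptions. Because of the buffering described above, new data would arrive in bursts rather than continuously, so this is not a substitute for a real-time audio hook.

```python
import time


def follow_raw(path, frame_bytes=320, poll_seconds=0.1, idle_limit=50):
    """Yield whole frames from a growing raw audio file.

    Stops after idle_limit consecutive polls with no new data.
    """
    pending = b""
    idle = 0
    with open(path, "rb") as f:
        while idle < idle_limit:
            data = f.read(4096)
            if not data:
                # Nothing new yet; the writer may still be holding a buffer.
                idle += 1
                time.sleep(poll_seconds)
                continue
            idle = 0
            pending += data
            # Hand out complete frames only; keep any remainder for later.
            while len(pending) >= frame_bytes:
                yield pending[:frame_bytes]
                pending = pending[frame_bytes:]
```

Any trailing partial frame is discarded when the reader gives up, which is one more reason to prefer the EAGI approach for live transcription.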