Trying to transcribe a telephone conversation into text

gabriel123 · May 31, 2017, 9:10am

I have a project where I have to do speech-to-text on a telephone conversation between two SIP accounts.

I am able to convert speech to text using Google’s speech recognition engine.

But I need to record and transcribe a telephone conversation into text.

Can someone help me?

david551 · May 31, 2017, 10:39am

Hire a palantype operator. Speaker-independent, continuous voice recognition is still a research topic in reality, let alone with multiple speakers degraded by telephone bandwidth.

gabriel123 · May 31, 2017, 11:19am

I don’t understand your answer.
With this code, I can record a conversation:
exten => _8.,1,SetVar(CALLFILENAME=${EXTEN:1}-${TIMESTAMP})
exten => _8.,2,Monitor(wav,${CALLFILENAME},m)
exten => _8.,3,Dial(ZAP/g1/${EXTEN:1})
exten => _8.,4,Congestion
exten => _8.,104,Congestion

With this code, I can return what I say in text:
exten => 1235,1,Answer()
exten => 1235,n,agi(googletts.agi,"Say something in English, when done press the pound key.",en)
exten => 1235,n(record),agi(speech-recog.agi,en-US)
exten => 1235,n,Verbose(1,Script returned: ${confidence} , ${utterance})

Now I just need to return a full telephone conversation in text

Someone pls help me

david551 · May 31, 2017, 11:27am

The answer was that what you are trying to do is not realistically possible yet, by any process that does not use a human brain to interpret the speech.

Telephones make things harder, as they destroy the distinctions between s, sh, and similar sounds.

(Actually, if you look at subtitles on live TV, you will see that it is not possible to do well, even with a human brain in the circuit to capture the phonemes. You need the ability of human brains to interpret whole passages of speech, to establish context.)

gabriel123 · May 31, 2017, 11:35am

But it is possible. There are many services that interpret speech.
For example, I am using Google speech recognition, which can recognize what you say and return it as text exactly (if the speech is clear).
I’ve already done that.
But I don’t know how to do it for a conversation.

david551 · May 31, 2017, 11:39am

If you still want to try, I would suggest that you code directly to the Google speech API, using the recorded conversation, and ignore the AGI application, which is really there for IVR use. https://cloud.google.com/speech/
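A minimal sketch of that approach, assuming the recording is the mixed, mono, 16-bit PCM WAV file that Monitor(wav,...,m) is expected to produce. It only builds the JSON body for the v1 speech:recognize REST call; the helper name build_recognize_request is hypothetical, and authentication and the HTTP POST itself are left out:

```python
import base64
import io
import json
import wave

# REST endpoint for synchronous recognition (Cloud Speech-to-Text v1).
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(wav_bytes, language_code="en-US"):
    """Build the JSON body for speech:recognize from a mono 16-bit PCM WAV.

    Hypothetical helper for illustration; it is not part of any Google library.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected a mono (mixed) recording")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        sample_rate = wav.getframerate()
        # Raw PCM frames, without the WAV header.
        pcm = wav.readframes(wav.getnframes())
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate,  # telephone audio is usually 8000
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(pcm).decode("ascii")},
    }


if __name__ == "__main__":
    # "call.wav" is a placeholder path for a recorded conversation.
    with open("call.wav", "rb") as f:
        body = build_recognize_request(f.read())
    print(json.dumps(body["config"], indent=2))
```

The resulting dictionary would be sent as JSON in a POST to RECOGNIZE_URL with your own credentials. Synchronous recognition only accepts short audio (roughly a minute), so a full conversation would probably have to be split into chunks or sent through the long-running variant of the API.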

gabriel123 · May 31, 2017, 11:43am

Ok thank you david551

gabriel123 · May 31, 2017, 11:46am

Another question: I am also configuring WebRTC for making calls from an Asterisk server to WebRTC.
I have tried, but when I make a call, it says it is busy.
Can I get some help with it?

david551 · May 31, 2017, 11:54am

Please start a new thread.

WebRTC is not for the faint-hearted, and you have not supplied nearly enough information for someone to realistically help you.

jersonjunior · June 6, 2017, 5:45pm

Look at this:

Speaker labels let you identify which individuals spoke which words in a multi-participant exchange. You can use the information to develop a person-by-person transcript of an audio stream, such as contact to a call center, or to animate an exchange with a conversational robot or avatar. The feature works best for audio files of telephone conversations that involve two people in an extended conversation. For best performance, the audio should be at least a minute in length. (Labelling who spoke and when is sometimes referred to as speaker diarization.)

The feature is optimized for two-speaker scenarios. It can handle up to six speakers, but more than two speakers can result in variable performance. Two-person exchanges are typically conducted over narrowband media, but the feature is supported for the following models:

en-US_NarrowbandModel and en-US_BroadbandModel
es-ES_NarrowbandModel and es-ES_BroadbandModel
ja-JP_NarrowbandModel and ja-JP_BroadbandModel

https://www.ibm.com/watson/developercloud/doc/speech-to-text/output.html#speaker_labels
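As a rough sketch of how speaker labels could be turned into a person-by-person transcript: the code below assumes the service returns word timestamps as [word, start, end] lists and speaker labels as objects with "from", "to" and "speaker" fields, as described in the linked documentation. The function name transcript_by_speaker and the sample data are made up for illustration:

```python
def transcript_by_speaker(timestamps, speaker_labels):
    """Group timestamped words into consecutive per-speaker turns.

    timestamps:     list of [word, start, end]
    speaker_labels: list of {"from": start, "to": end, "speaker": id}
    Both shapes are assumptions about the response format.
    """
    # Look up the speaker id by the (start, end) time of each word.
    speaker_at = {
        (label["from"], label["to"]): label["speaker"]
        for label in speaker_labels
    }
    turns = []
    for word, start, end in timestamps:
        speaker = speaker_at.get((start, end))  # None if no label matches
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(word)  # same speaker keeps talking
        else:
            turns.append((speaker, [word]))  # a new turn starts
    return [(speaker, " ".join(words)) for speaker, words in turns]


if __name__ == "__main__":
    # Invented sample data, not real service output.
    words = [["hello", 0.0, 0.4], ["hi", 0.6, 0.8], ["there", 0.8, 1.1]]
    labels = [
        {"from": 0.0, "to": 0.4, "speaker": 0},
        {"from": 0.6, "to": 0.8, "speaker": 1},
        {"from": 0.8, "to": 1.1, "speaker": 1},
    ]
    for speaker, text in transcript_by_speaker(words, labels):
        print(f"Speaker {speaker}: {text}")
```

Matching on exact start and end times keeps the sketch short; a real integration would likely need to tolerate words that have no matching label.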
