Real-time Call Translation Issue (English ↔ Urdu) using EAGI with Google Cloud

I am working on a requirement where I need real-time bi-directional translation during a live call. The flow should be:

If the caller speaks in English, it should be translated into Urdu and passed to the callee.

If the callee speaks in Urdu, it should be translated into English and passed to the caller.

I have implemented this using an EAGI script with Google Cloud Speech-to-Text for transcription. The EAGI script is invoked after the call is connected, and the dialplan is as follows:

[call_answered_agent]
; ${ARG1} - Spool ID
; ${ARG2} - Unique ID
; ${ARG3} - Exten (dialed number)
; ${ARG4} - Channel Name
exten => s,1,Set(__time_connect=${EPOCH})
same => n,Set(IBDB_ANSWERED(${ARG1})=${ARG2},${time_connect})
same => n,Set(IBDB_ANSWERED2(${ARG1})=${ARG4})
same => n,MixMonitor(${UNIQUEID}.wav)
same => n,EAGI(/usr/ictbroadcast/bin/translate_ict.eagi)
;same => n,GoSub(virtual_queue_log,s,1(${ARG1},${ARG2},${ARG3}))
same => n,Return()

EAGI Script:
The PHP script uses google/cloud-speech to read the channel audio that EAGI provides on file descriptor 3:

#!/usr/bin/php
<?php

require '/usr/ictbroadcast/vendor/autoload.php';

use Google\Cloud\Speech\V1\SpeechClient;
use Google\Cloud\Speech\V1\RecognitionConfig;
use Google\Cloud\Speech\V1\StreamingRecognitionConfig;
use Google\Cloud\Speech\V1\StreamingRecognizeRequest;
use Google\Cloud\Speech\V1\RecognitionConfig\AudioEncoding;

putenv('GOOGLE_APPLICATION_CREDENTIALS=/usr/ictbroadcast/etc/translator_google_key.json');

$speechClient = new SpeechClient();

$config = new RecognitionConfig([
    'encoding'          => AudioEncoding::LINEAR16,
    'sample_rate_hertz' => 8000,
    'language_code'     => 'en-US',
]);

$streamingConfig = new StreamingRecognitionConfig([
    'config'          => $config,
    'interim_results' => true,
]);

$requests = [
    new StreamingRecognizeRequest(['streaming_config' => $streamingConfig]),
];

// EAGI delivers the channel's inbound audio on file descriptor 3.
$stream = fopen('php://fd/3', 'rb');
if (!$stream) {
    fwrite(STDERR, "Failed to open audio stream.\n");
    exit(1);
}

while (!feof($stream)) {
    $chunk = fread($stream, 320); // 20 ms of 8 kHz, 16-bit audio
    if (!$chunk) {
        usleep(100000);
        continue;
    }
    $requests[] = new StreamingRecognizeRequest(['audio_content' => $chunk]);
}

$responses = $speechClient->streamingRecognize($requests);
foreach ($responses as $response) {
    foreach ($response->getResults() as $result) {
        if ($result->getIsFinal()) {
            $transcript = trim($result->getAlternatives()[0]->getTranscript());
            file_put_contents('/tmp/file.txt', "[SPOKEN]: $transcript\n", FILE_APPEND);
        }
    }
}

fclose($stream);
$speechClient->close();

Problem:
When the EAGI script runs, both call legs lose audio:

The agent can't hear the receiver's voice.

The receiver can't hear the agent's voice.

It seems the audio stream is being consumed by the EAGI script without being passed through to the other side of the call.

Question:
How can I capture audio for real-time processing in EAGI while keeping the audio flowing between both call participants? Is there a recommended approach in Asterisk for real-time speech translation that doesn't break the audio path?

An obvious answer would be to make each leg a separate call: one leg to the Urdu speaker, the other to the English speaker, with the EAGI script in the middle doing the relaying.
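As a rough sketch of that topology (the context names, script path, and arguments below are hypothetical, not from the original dialplan): each speaker lands in their own context and is handed to a relay script, so the two parties are never natively bridged and nothing has to tap a bridged stream:

```
; Hypothetical sketch: each speaker is a separate call leg, and the
; relay script in the middle does capture, translation and playback.
[from-english]
exten => s,1,Answer()
 same => n,EAGI(/usr/ictbroadcast/bin/translate_relay.eagi,en,ur)
 same => n,Hangup()

[from-urdu]
exten => s,1,Answer()
 same => n,EAGI(/usr/ictbroadcast/bin/translate_relay.eagi,ur,en)
 same => n,Hangup()
```

Each relay instance would read its own party's audio from FD 3 and play synthesized translations back on its own channel (e.g. via STREAM FILE); how the two instances exchange the translated text between them (a socket, a queue, a shared database) is left to the implementation.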

A FastAGI connection might be preferable to regular AGI, but there is no such thing as “FastEAGI”. An alternative would be to use AudioSocket connections instead.
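For the AudioSocket route: the AudioSocket dialplan application (app_audiosocket, available since Asterisk 18) opens a TCP connection to an external server and exchanges the channel's audio in both directions as 8 kHz 16-bit signed-linear frames, so the server can both capture speech and inject translated audio without breaking the path. In the dialplan, the EAGI line would be replaced by something like `AudioSocket(<uuid>,127.0.0.1:9092)`, where the first argument is a UUID identifying the call to the server (the address and port here are placeholders). On the server side the wire format is simple: each packet is a one-byte type, a two-byte big-endian payload length, and the payload; type 0x10 carries audio, 0x01 the call's UUID, 0x00 a hangup. A minimal PHP sketch of that framing:

```php
<?php
// Sketch of AudioSocket packet framing: 1-byte type, 2-byte big-endian
// payload length, then the payload. The type values below follow the
// AudioSocket protocol (0x10 = 8 kHz signed-linear audio, 0x01 = call
// UUID, 0x00 = hangup). Function names here are illustrative.

const AS_KIND_HANGUP = 0x00;
const AS_KIND_UUID   = 0x01;
const AS_KIND_AUDIO  = 0x10;

// Encode one AudioSocket packet.
function as_encode(int $kind, string $payload): string
{
    return pack('Cn', $kind, strlen($payload)) . $payload;
}

// Decode one packet from the front of $buf; returns [kind, payload, rest],
// or null if the buffer does not yet hold a complete packet.
function as_decode(string $buf): ?array
{
    if (strlen($buf) < 3) {
        return null;
    }
    $hdr = unpack('Ckind/nlen', $buf);
    if (strlen($buf) < 3 + $hdr['len']) {
        return null;
    }
    $payload = substr($buf, 3, $hdr['len']);
    return [$hdr['kind'], $payload, substr($buf, 3 + $hdr['len'])];
}
```

A server loop would accept Asterisk's TCP connection, read the UUID packet, decode 320-byte audio frames (20 ms at 8 kHz), feed them to speech-to-text and translation, and write synthesized audio back with `as_encode(AS_KIND_AUDIO, $pcm)`; writing audio back on the same connection is exactly what EAGI's read-only FD 3 cannot do.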
