Asterisk Audio Stream

I want to get the audio from a call without saving it to an audio file, and use it for speech-to-text.
For example:
I want to capture Bob's response directly from the call, convert it into text, and then use the converted text for further processing.

asterisk/asterisk-external-media (github.com)

This project streams audio to an RTP server and sends the raw data to Google STT.
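
For reference, the receiving side of such a setup can be sketched roughly as below. This is not the project's code, only a minimal illustration of pulling the audio payload out of RTP packets arriving over UDP. The names `rtp_payload` and `receive_audio`, the port `10000`, and the 2048-byte read size are assumptions made for the example.

```python
import socket
import struct

RTP_FIXED_HEADER_LEN = 12  # bytes in the fixed RTP header (RFC 3550)


def rtp_payload(packet):
    """Return the audio payload of one RTP packet, without its header."""
    if len(packet) < RTP_FIXED_HEADER_LEN:
        raise ValueError("packet shorter than the fixed RTP header")
    first = packet[0]
    if first >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    has_padding = bool(first & 0x20)
    has_extension = bool(first & 0x10)
    csrc_count = first & 0x0F

    # Skip the fixed header plus any 4-byte CSRC identifiers
    offset = RTP_FIXED_HEADER_LEN + 4 * csrc_count
    if has_extension:
        # Extension header: 16-bit profile id, then 16-bit length in 32-bit words
        (ext_words,) = struct.unpack_from("!H", packet, offset + 2)
        offset += 4 + 4 * ext_words

    end = len(packet)
    if has_padding:
        # The last byte holds the number of padding bytes, itself included
        end -= packet[-1]
    return packet[offset:end]


def receive_audio(host="0.0.0.0", port=10000):
    """Yield audio payloads from RTP packets received on a UDP socket.

    The host and port are placeholders; use whatever address Asterisk
    is configured to send the media to.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            packet, _addr = sock.recvfrom(2048)
            yield rtp_payload(packet)
```

Each payload yielded by `receive_audio()` stays in memory, so it can be appended to a buffer or forwarded to a streaming STT client without touching the disk.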

To achieve this, you would typically use a technique known as real-time speech recognition or live transcription.

Here's a short example in Python using the Google Cloud Speech-to-Text API and the Twilio API for call handling:

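import os

import requests
from twilio.rest import Client
from google.cloud import speech_v1p1beta1 as speech

# Twilio credentials
TWILIO_ACCOUNT_SID = 'your_twilio_account_sid'
TWILIO_AUTH_TOKEN = 'your_twilio_auth_token'
TWILIO_PHONE_NUMBER = 'your_twilio_phone_number'

# Google Cloud credentials
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path_to_your_google_credentials.json"

# Initialize Twilio client
client = Client(TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN)

# Function to transcribe call
def transcribe_call(call_sid):
    # Look up the first recording attached to the call
    recording = client.calls(call_sid).recordings.list(limit=1)[0]

    # recording.uri is a relative path ending in ".json"; the WAV media
    # is served from the same path with a ".wav" extension
    audio_url = "https://api.twilio.com" + recording.uri.replace(".json", ".wav")

    # RecognitionAudio(uri=...) only accepts gs:// URIs, so download the
    # recording and send the bytes inline instead
    wav = requests.get(audio_url, auth=(TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN))
    wav.raise_for_status()

    # Initialize Google Speech-to-Text client
    speech_client = speech.SpeechClient()

    # Configure audio settings
    audio = speech.RecognitionAudio(content=wav.content)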
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=8000,
        language_code="en-US",
    )

    # Perform transcription
    response = speech_client.recognize(config=config, audio=audio)

    # Print transcription
    for result in response.results:
        print("Transcript: {}".format(result.alternatives[0].transcript))

# Example usage
call_sid = 'your_call_sid'
transcribe_call(call_sid)

This code assumes you have a call SID from Twilio whose recording you want to transcribe. It uses the Twilio Python library to fetch the call recording URL, then uses the Google Cloud Speech-to-Text API to transcribe the recorded audio into text. Finally, it prints the transcription. Replace 'your_twilio_account_sid', 'your_twilio_auth_token', 'your_twilio_phone_number', 'path_to_your_google_credentials.json', and 'your_call_sid' with your actual Twilio and Google Cloud credentials and the specific call SID you want to transcribe.

Best regards,
Danish Hafeez | QA Assistant
[ICTInnovations](https://www.ictinnovations.com)

Is there any other way, such as from the dialplan or by writing an AGI script?
Actually, I'm trying to get the audio in AGI: record the audio stream into a buffer and then pass it to STT.
Or is there a dialplan application that can store the audio stream in a buffer rather than saving the audio to a file, as Record, Monitor, or MixMonitor do?

Yes, you can capture audio in an Asterisk AGI script without saving it to a file by using a tool like sox or arecord to capture the audio stream and then passing it to your speech-to-text (STT) engine.

#!/usr/bin/env python3

import os
import subprocess

def capture_audio(duration):
    # Using arecord to capture audio for the specified duration
    process = subprocess.Popen(["arecord", "-d", str(duration), "-f", "S16_LE", "-r", "16000", "-c", "1", "-t", "raw"], stdout=subprocess.PIPE)
    audio_data, _ = process.communicate()
    return audio_data

def main():
    # Define the duration for capturing audio (in seconds)
    duration = 5

    # Capture audio
    audio_data = capture_audio(duration)

    # Pass the captured audio to your STT engine for processing
    # Replace this line with your code to send the audio data to your STT engine

if __name__ == "__main__":
    main()

You can call this AGI script from your dialplan to capture audio and then process it with your STT engine. Make sure the AGI script is executable and that the Asterisk user has permission to capture audio. You may also need to install sox or alsa-utils (the package that provides arecord) if they are not already on your system.
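
The `audio_data` returned by `capture_audio` is headerless PCM. Some STT engines expect a WAV container instead, and that can be built in memory without writing a file. The sketch below assumes the same format requested from `arecord` above (16-bit little-endian, 16000 Hz, mono). `pcm_to_wav_bytes` is a hypothetical helper name, not part of any library.

```python
import io
import wave


def pcm_to_wav_bytes(pcm, sample_rate=16000, channels=1, sample_width=2):
    """Wrap headerless PCM in a WAV container held entirely in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    # The wave module leaves a caller-supplied buffer open, so it can still be read
    return buffer.getvalue()
```

For example, `pcm_to_wav_bytes(audio_data)` returns bytes that can be sent as the body of an HTTP request to an STT service that accepts WAV uploads.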

I'm using Azure, and Asterisk is deployed on a VM.
ALSA lib confmisc.c:767:(parse_card) cannot find card '0'
ALSA lib conf.c:4745:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:4745:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1246:(snd_func_refer) error evaluating name
ALSA lib conf.c:4745:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5233:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM default
arecord: main:830: audio open error: No such file or directory
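
These errors are what arecord prints when it cannot find a sound card, which is usual on a cloud VM. arecord also reads from a local capture device, not from the call. One possible alternative, sketched below, is to start the script with `EAGI()` instead of `AGI()` in the dialplan, in which case Asterisk makes the caller's audio available to the script on file descriptor 3. The sketch assumes the default EAGI audio format of 16-bit signed linear at 8000 Hz, mono. `read_call_audio` is a hypothetical helper name.

```python
import os

EAGI_AUDIO_FD = 3      # EAGI delivers the inbound call audio on this descriptor
SAMPLE_RATE = 8000     # assumed: 8 kHz signed linear audio
BYTES_PER_SAMPLE = 2   # assumed: 16-bit mono samples


def read_call_audio(seconds, fd=EAGI_AUDIO_FD):
    """Read up to `seconds` of raw PCM from `fd` into an in-memory buffer."""
    wanted = int(seconds * SAMPLE_RATE * BYTES_PER_SAMPLE)
    buffer = bytearray()
    while len(buffer) < wanted:
        chunk = os.read(fd, min(4096, wanted - len(buffer)))
        if not chunk:  # the stream closed, for example because the caller hung up
            break
        buffer.extend(chunk)
    return bytes(buffer)
```

Inside an EAGI script, `audio_data = read_call_audio(5)` would collect about five seconds of the caller's speech in memory, ready to hand to the STT engine with a sample rate of 8000.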
