Asterisk Speech To Text engine

Hi,

Is there any way to convert a customer's voice into text?

There are multiple ways to achieve this, using Azure, Vosk, or any other supported STT engine.

Record the voice and pass the recording to your STT API.
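As a rough illustration of the "record, then hand it to the STT" approach, here is a minimal Python sketch. It assumes the call was recorded to a mono 16-bit WAV file, and that `recognizer` is any object exposing the same three methods as Vosk's `KaldiRecognizer` (`AcceptWaveform`, `Result`, `FinalResult`, the latter two returning JSON with a `text` field). The function name `transcribe_wav` and the chunk size are my own choices, not part of any library.

```python
import json
import wave


def transcribe_wav(path, recognizer, chunk_frames=4000):
    """Feed a recorded WAV file to a recognizer and return the transcript.

    `recognizer` is assumed to behave like Vosk's KaldiRecognizer.
    """
    pieces = []
    with wave.open(path, "rb") as wav:
        # Most speech engines expect mono 16-bit PCM, so fail early otherwise.
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM audio")
        while True:
            data = wav.readframes(chunk_frames)
            if not data:
                break
            # A truthy return value means an utterance has been completed.
            if recognizer.AcceptWaveform(data):
                pieces.append(json.loads(recognizer.Result()).get("text", ""))
    # Flush whatever the recognizer is still buffering.
    pieces.append(json.loads(recognizer.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

With Vosk installed, the recognizer would be created from a downloaded model and the sample rate of the recording. For a cloud STT such as Azure you would replace the loop with that provider's own SDK call.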

Hi,

In Asterisk you can already capture the customer’s voice and pass it to an ASR (Automatic Speech Recognition) engine to get the transcript.

If you want to go further, you could try AgentVoiceResponse (AVR): it not only handles STT (speech-to-text), but also lets you send the transcript to an LLM for processing, then converts the response back with TTS and streams the audio to Asterisk. In this way, you get a real AI-powered voice agent running on your PBX.

:open_book: Documentation: https://wiki.agentvoiceresponse.com

:laptop: GitHub: https://github.com/agentvoiceresponse

:speech_balloon: Join the Discord community: Agent Voice Response

Hi All,

Thank you for your responses. I successfully completed the real-time voice-to-text integration using the Google Speech-to-Text API.

Much appreciated.

Thank you

@hirushi FYI

Are there any free ASR engines one can adopt and test in a lab environment?

COLLINS ONYEGBADO | B.Tech; Msc | CCNA; CCNP
Head, Hardware Maintenance Operations & Networking Unit.
Mobile: +2348064550911 | Voip:4001 | www.fuotuoke.edu.ng
ICT CENTER
FEDERAL UNIVERSITY OTUOKE, BYELSA STATE

On Wed, Sep 3, 2025, 23:58 gcareri via Asterisk Community <notifications@asterisk.discoursemail.com> wrote:

Yes, there are free and open-source ASR engines you can adopt and test in a lab environment. A good starting point is Vosk (the VOSK Offline Speech Recognition API), which supports multiple languages and runs locally without requiring cloud APIs.

In our case with AgentVoiceResponse (AVR), we have already integrated Vosk as one of the ASR modules, so you can test it right away without additional licensing costs. This makes it very convenient for lab environments and PoCs.

If you’re interested, you can find more details in our documentation here:

I’ve published a technical guide on Medium explaining how to integrate Asterisk PBX with a fully local conversational AI stack using AVR, Vosk, Ollama, and Kokoro:

:small_blue_diamond: Vosk – lightweight, offline, multilingual ASR (speech-to-text)

:small_blue_diamond: Ollama – run open LLMs locally (Llama, Mistral, TinyLlama, etc.)

:small_blue_diamond: Kokoro – efficient, natural-sounding local TTS

:small_blue_diamond: AVR (Agent Voice Response) – the integration layer with Asterisk via AudioSocket
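For anyone curious about what arrives on the AudioSocket side, here is a small Python sketch of a frame parser. It is based on my understanding of the AudioSocket wire format (a 1-byte kind, a 2-byte big-endian payload length, then the payload, with kind `0x01` carrying the 16-byte call UUID and `0x10` carrying signed linear PCM audio), so please verify it against the Asterisk documentation before relying on it. The function name `parse_frames` is hypothetical, and this is not how AVR itself is implemented.

```python
import struct

# Frame kinds, as I understand the AudioSocket protocol (an assumption to verify).
KIND_HANGUP = 0x00
KIND_UUID = 0x01
KIND_AUDIO = 0x10
KIND_ERROR = 0xFF

# Header layout: 1-byte kind followed by a 2-byte big-endian payload length.
HEADER = struct.Struct(">BH")


def parse_frames(buffer):
    """Split a byte buffer into complete (kind, payload) frames.

    Returns the list of parsed frames and the unconsumed tail, which the
    caller should prepend to the next chunk read from the socket.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        kind, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((kind, bytes(buffer[offset + HEADER.size:end])))
        offset = end
    return frames, bytes(buffer[offset:])
```

In a real server you would call this on every chunk read from the TCP socket and forward the payloads of `KIND_AUDIO` frames to the ASR engine.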

This approach offers:

:white_check_mark: Full on-premise deployment with no dependency on external APIs

:white_check_mark: Data privacy and compliance, since all processing happens locally

:white_check_mark: Cost control (no per-minute or per-token charges)

:white_check_mark: Flexibility to plug in different ASR, TTS, and LLM providers
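To make the last point a bit more concrete, here is a minimal Python sketch of one conversational turn with swappable stages. The class `VoicePipeline` and its field names are purely illustrative and are not AVR's actual API. Each stage is just a callable, so a Vosk, Ollama, or Kokoro wrapper (or a cloud provider) can be dropped in without touching the rest.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """One turn: caller audio -> text -> reply text -> reply audio."""

    transcribe: Callable[[bytes], str]   # ASR stage (hypothetical wrapper)
    respond: Callable[[str], str]        # LLM stage (hypothetical wrapper)
    synthesize: Callable[[str], bytes]   # TTS stage (hypothetical wrapper)

    def handle_turn(self, caller_audio: bytes) -> bytes:
        text = self.transcribe(caller_audio)
        if not text.strip():
            return b""  # nothing recognized, so there is nothing to say
        return self.synthesize(self.respond(text))
```

Swapping a provider then means passing a different function for one of the three fields, which also makes each stage easy to stub out in a lab.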

Full article here:

Hope you find it useful and happy reading! :rocket:
