Real-time AI models, a price comparison

Hello Asterisk Community!

I was updating pricing information for real-time AI models (speech-to-speech interaction with the model, with no intermediate STT or TTS conversion and low latency) and comparing them to keep the data up to date. Sample: a 1-hour call. Created with Grok 4 Fast. Here it is, updated:

| Model | Economic Version | Cost 1 Hour (USD) | Notes |
|---|---|---|---|
| Google Gemini | 2.5 Flash Live | ~0.68 | Based on 45k audio tokens in/out (25/sec); $3/1M input audio, $12/1M output audio. Source: https://cloud.google.com/vertex-ai/generative-ai/pricing |
| OpenAI Realtime | gpt-realtime-mini | ~1.35 | Based on 45k audio tokens in/out (25/sec); $10/1M input audio, $20/1M output audio. Source: https://openai.com/api/pricing/ |
| Microsoft Azure Speech | Standard Real-time | ~1.92 | STT $1.20/h + TTS ~$0.72/h. Source: https://azure.microsoft.com/en-us/pricing/details/cognitive-services/speech-services/ |
| Hume.ai EVI | Starter | ~4.40 | $3 for 40 min + $0.07/min additional. Source: https://www.hume.ai/pricing |
| ElevenLabs Agents | Starter | ~6.00 | $5 for 50 min, equivalent to $0.10/min. Source: https://elevenlabs.io/pricing |
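For the token-billed rows, the per-hour figures can be reproduced with a few lines. A minimal sketch, assuming (as in the table's notes, not as official figures) that each side speaks ~30 minutes of a 1-hour call and that audio is billed at ~25 tokens per second of speech, i.e. 45,000 audio tokens each way:

```python
# Rough cost-per-hour estimate for token-billed real-time voice models.
# Assumptions: 30 minutes of speech per side, 25 audio tokens per second.
TOKENS_PER_SEC = 25
SPEECH_SECONDS = 30 * 60                          # 30 min per side
AUDIO_TOKENS = TOKENS_PER_SEC * SPEECH_SECONDS    # 45,000 tokens each way

def token_billed_cost(in_price_per_m, out_price_per_m, tokens=AUDIO_TOKENS):
    """USD for one hour, with `tokens` audio tokens each direction."""
    return tokens * (in_price_per_m + out_price_per_m) / 1_000_000

gemini = token_billed_cost(3, 12)    # $3/1M in, $12/1M out  -> ~0.68
openai = token_billed_cost(10, 20)   # $10/1M in, $20/1M out -> ~1.35

print(f"Gemini 2.5 Flash Live: ~${gemini:.2f}/h")
print(f"gpt-realtime-mini:     ~${openai:.2f}/h")
```

Changing the assumed talk time or tokens-per-second rate scales the result linearly, so the estimate is only as good as those two assumptions.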

Today, October 11, 2025, Google Gemini 2.5 Flash Live could be the most affordable option for real-time agents. Good to know!

Greetings!

Looking for Asterisk to AI Realtime Agents Integration? Let's talk: hello@infinitocloud.com

You appear to have used the pricing for text input and output for gpt-realtime-mini, but the prices for audio input and output are much higher. Reading between the lines, the text pricing implies queued processing, and therefore some delay. I assume the audio pricing accounts both for high priority and for the costs, above the core model, of converting speech to tokens and tokens to speech.

You seem to have assumed 100 tokens per minute, but the number of tokens is normally greater than the number of words, and the average rate for conversational American English is about 150 words per minute. On the other hand, you have assumed that one side is talking over the other (and, with the reverse effect, that nothing is done to truncate the response).

Also, there are different models because they are good at different things. A model is not economical if it isn't up to the job required.
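To make the talk-time point concrete, here is a quick sensitivity sketch (my own, with hypothetical splits of who speaks how much) using the gpt-realtime-mini audio prices and the same 25-tokens/sec assumption from the table:

```python
# Sensitivity of the hourly estimate to the assumed talk-time split.
def hourly_cost(user_min, agent_min, in_price_per_m, out_price_per_m,
                tokens_per_sec=25):
    """USD per hour given minutes of user speech (input) and agent speech (output)."""
    in_tokens = user_min * 60 * tokens_per_sec
    out_tokens = agent_min * 60 * tokens_per_sec
    return (in_tokens * in_price_per_m + out_tokens * out_price_per_m) / 1_000_000

# gpt-realtime-mini audio pricing: $10/1M input, $20/1M output
for user_min, agent_min in [(30, 30), (20, 40), (40, 20)]:
    cost = hourly_cost(user_min, agent_min, 10, 20)
    print(f"user {user_min} min / agent {agent_min} min -> ${cost:.2f}/h")
```

Because output audio is billed at twice the input rate here, a chattier agent (20/40 split) costs noticeably more per hour than a chattier user (40/20 split), which is why truncating long responses matters.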

Hi! Yes, you’re right.
The comparison considered the text output for OpenAI Realtime Mini. I've updated the table with additional information about the particular metrics.
I'll also run a couple of 10-minute calls to OpenAI Realtime and Google Dialogflow to check their billing and compare again.
Of course, everyone is free to use whatever tool they want.
Greetings!