I’m using ARI’s ExternalMedia endpoint to create a stream of data. I’ve been using the ulaw format for a while, but now that I’m replacing Google’s STT with Jax Whisper, I’m wondering how performance might change if I use a different format. I listed all of the formats here:

What I want to avoid the most is resampling, since I’m really focusing on speed for this project. So my question is, what happens internally when I use a different format?

My current implementation takes ulaw and converts it into WAV bytes that are then read by FFmpeg, but I’m wondering if there is a more direct approach I might be missing.

It will transcode into the format as requested. Ultimately the source is what determines the quality, so if the call is negotiated as ulaw then bumping up the external media channel to something higher won’t do much.
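For reference, here is a minimal sketch of how the format gets requested when the external media channel is created. It only builds the URL for the POST to the ARI channels/externalMedia resource, using the app, external_host and format query parameters. The base URL, application name and host:port are made-up placeholders, and authentication is left out.

```python
from urllib.parse import urlencode


def external_media_url(base_url, app, external_host, fmt="ulaw"):
    """Build the URL for the ARI request that creates an ExternalMedia channel.

    The request itself must be sent as an HTTP POST with ARI credentials;
    this helper (a hypothetical name) only assembles the query string.
    """
    query = urlencode({
        "app": app,                      # Stasis application that owns the channel
        "external_host": external_host,  # host:port that will receive the media
        "format": fmt,                   # Asterisk transcodes the audio to this format
    })
    return f"{base_url}/channels/externalMedia?{query}"


# Placeholder values, not taken from a real deployment.
url = external_media_url("http://localhost:8088/ari", "stt-app", "127.0.0.1:4000", "slin16")
print(url)
```

If the call itself was negotiated as ulaw, asking for slin16 here only moves the decoding and upsampling into Asterisk; it does not add detail to the audio.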

Is there a file I could look at in the modules to really see what’s happening? I don’t care much about quality, mostly about speed. Are you saying that changing the format param won’t affect the performance significantly? @jcolp Thanks for the reply!

The format chosen does affect performance, because some formats take longer to transcode than others. ulaw is cheap, g722 is cheap, opus is expensive. You can see the translation paths, including translation times, using “core show translation”.
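To illustrate why ulaw is cheap, and as one possible more direct route than going through WAV bytes and FFmpeg: decoding G.711 mu-law is a 256-entry table lookup. The sketch below assumes NumPy is available and that the payload is raw mu-law bytes with any RTP header already stripped. The function and table names are made up for the example. The audio stays at the source sample rate, so any resampling the model needs is a separate step.

```python
import numpy as np


def _build_ulaw_table():
    """Precompute the 16-bit PCM value for each of the 256 mu-law codes."""
    codes = np.arange(256, dtype=np.int32)
    inverted = ~codes & 0xFF              # mu-law bytes are stored bit-inverted
    mantissa = inverted & 0x0F
    exponent = (inverted & 0x70) >> 4
    # Standard G.711 expansion with a bias of 0x84 (132).
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return np.where(inverted & 0x80, -magnitude, magnitude).astype(np.int16)


ULAW_TO_PCM16 = _build_ulaw_table()


def ulaw_to_float32(payload: bytes) -> np.ndarray:
    """Decode raw mu-law bytes to float32 samples in the range [-1.0, 1.0)."""
    codes = np.frombuffer(payload, dtype=np.uint8)
    return ULAW_TO_PCM16[codes].astype(np.float32) / 32768.0


# 0xFF is mu-law silence; 0x00 and 0x80 are the negative and positive extremes.
samples = ulaw_to_float32(bytes([0xFF, 0x00, 0x80]))
print(samples)
```

The table is built once at import time, so the per-packet cost is a single indexed lookup and a float conversion.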

I’m not really sure what you mean by “Is there a file I could look at in the modules to really see what’s happening”, but the codec modules that transcode are in the codecs directory of the source tree. Asterisk gives them audio in one format, they return it in another.

The appropriate choice of codec also depends on where the speaker is. If they are on the other side of the PSTN, on a landline, G.711, or in a few cases G.722, are the only sensible choices.

If they are using a standard mobile phone, that will be the limiting factor on the accuracy of the speech decoding.

Is there a way to change the source format, or to find out which one we are using? @jcolp

Please don’t tag me. If I have anything of value to add to threads then I respond. I also don’t know what “source format” means within this context. If you mean what is being used with a remote side, then that’s dependent on what is allowed (such as in pjsip.conf for the endpoint) and what is supported by the remote side.
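As an illustration of what “allowed” means here, a minimal pjsip.conf sketch follows. The endpoint name is hypothetical and the rest of the endpoint configuration is omitted. With only ulaw allowed, that is the only format Asterisk will offer to or accept from that endpoint.

```ini
; pjsip.conf fragment; "my-endpoint" is a made-up name
[my-endpoint]
type=endpoint
disallow=all
allow=ulaw
```

For a live call, “core show channel” followed by the channel name lists the formats that channel is actually using.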

