I’ve been successful at doing asynchronous semi-realtime transcription with Google Cloud Speech but now I’m trying to make it work with AWS Transcribe, as a way to compare them.
AWS Transcribe only accepts “16-bit Linear PCM encoding” for realtime transcriptions. I’m assuming that this is what Asterisk defines as the slin codec.
So I’ve called externalMedia with both slin and slin16, to no avail.
I know my stack works because when I change the provider to Google Cloud Speech and specify ulaw (which GCS does accept), it works.
BUT, if I change the format to slin or slin16, which GCS also accepts, GCS fails to work (the same result as with AWS Transcribe).
This makes me think that either I’m wrong in thinking that “16-bit Linear PCM encoding” is slin, or I need to fiddle a bit with the packets (endianness?).
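For reference, this is the kind of packet fiddling I have in mind, as a rough sketch only. It assumes the audio reaches my app as RTP packets with a fixed 12-byte header (no CSRC entries, no header extension), that the 16-bit samples are big-endian on the wire, and that the transcription service wants them little-endian. I have not confirmed any of those assumptions, and the function name is just something I made up for illustration.

```python
RTP_HEADER_LEN = 12  # assumption: fixed RTP header, no CSRC list or extension


def rtp_payload_to_le_pcm(packet: bytes) -> bytes:
    """Drop the assumed 12-byte RTP header and swap each 16-bit sample's bytes."""
    payload = packet[RTP_HEADER_LEN:]
    if len(payload) % 2:
        # A 16-bit sample needs two bytes; drop a dangling odd byte.
        payload = payload[:-1]
    swapped = bytearray(len(payload))
    swapped[0::2] = payload[1::2]  # low byte moves to the front
    swapped[1::2] = payload[0::2]  # high byte moves to the back
    return bytes(swapped)
```

The idea would be to run every received packet through this before handing the bytes to the transcription stream. If the samples turn out to be little-endian already, only the header stripping would be needed.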
So, is “16-bit Linear PCM encoding” what Asterisk defines as slin/slin16?
Thanks