Hello, I want to stream both parties' audio separately to a websocket for real-time transcription and diarization (speaker labelling). I am able to record the audio separately using monitor for both agent and customer, but I want to stream the audio.
@shamnusln, can you please help me out with this, as you have achieved it?
Asterisk is not able to do the job directly, as far as I know. You would most likely need a Stasis application or simply start a process that takes the dumped audio files, and streams them to the transcription service.
As an alternative you can process the files offline, after the call.
But without knowing exactly what you're doing, it's not easy to suggest how you can go about getting it done.
I have done this.
To do this you would need an AudioSocket server running. Check AudioSocket in Asterisk.
Create a channel to the AudioSocket server.
Create a bridge.
Create a snoop channel that copies the audio of the party you want to hear.
Put both channels in the bridge.
Your AudioSocket server will then start receiving the audio stream; forward that stream to a transcription service. I used a third-party service, Deepgram.
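The server side of the steps above can be sketched as follows. This is a minimal sketch, not a production implementation: it assumes Asterisk's AudioSocket framing (a 3-byte header of 1-byte kind plus 2-byte big-endian payload length, with kind 0x10 carrying 16-bit 8 kHz signed-linear audio), and the port number is a placeholder. Instead of forwarding to a real transcription service, it just counts audio bytes; you would replace that branch with a push to your Deepgram (or other) websocket.

```python
import asyncio
import struct

# AudioSocket message kinds (per Asterisk's AudioSocket protocol)
KIND_HANGUP = 0x00
KIND_UUID = 0x01
KIND_AUDIO = 0x10


def parse_header(header: bytes):
    """Parse the 3-byte AudioSocket header: 1-byte kind, 2-byte big-endian length."""
    kind, length = struct.unpack("!BH", header)
    return kind, length


async def handle_connection(reader, writer):
    # Replace the KIND_AUDIO branch with a forward to your transcription
    # service; here we only count bytes to keep the sketch self-contained.
    total = 0
    while True:
        header = await reader.readexactly(3)
        kind, length = parse_header(header)
        payload = await reader.readexactly(length) if length else b""
        if kind == KIND_UUID:
            print("call id:", payload.hex())
        elif kind == KIND_AUDIO:
            total += len(payload)  # payload is 16-bit, 8 kHz signed-linear audio
        elif kind == KIND_HANGUP:
            break
    writer.close()
    print("received", total, "bytes of audio")


async def main():
    # Port 9092 is arbitrary; point AudioSocket() in your dialplan at it.
    server = await asyncio.start_server(handle_connection, "0.0.0.0", 9092)
    async with server:
        await server.serve_forever()

# To run the server: asyncio.run(main())
```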
Thanks
Hi
I use Google STT but the quality is not so good.
I tried to use Deepgram, but it worked with nova-2 phonecall only in English.
Any clue how to use it in other languages? (I need it in French, but the general enhanced model that is supposed to work does not work for me. I use externalMedia to send the audio buffer to a websocket, and I forward this audio to Google or Deepgram.)
Thanks for your help
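With Deepgram's streaming API, the model and language are selected via query parameters on the websocket URL, so switching languages is a matter of building the right URL. A minimal sketch of such a builder follows; the parameter names (`model`, `language`, `encoding`, `sample_rate`) follow Deepgram's documented streaming endpoint, but verify model/language availability against their current docs, since not every model supports every language.

```python
from urllib.parse import urlencode


def deepgram_stream_url(model: str, language: str,
                        encoding: str = "linear16",
                        sample_rate: int = 8000) -> str:
    """Build a Deepgram streaming websocket URL.

    Connect to the returned URL with your websocket client, sending an
    'Authorization: Token <API key>' header, then stream raw audio frames.
    """
    params = urlencode({
        "model": model,
        "language": language,
        "encoding": encoding,      # linear16 matches slin from externalMedia
        "sample_rate": sample_rate,
    })
    return "wss://api.deepgram.com/v1/listen?" + params
```

For example, `deepgram_stream_url("nova-2", "fr")` requests French transcription; if a given model rejects the language, the connection handshake will fail with an error rather than silently transcribing as English.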
To do this, you would need to use a raw format, as metadata for, say, .wav doesn't get backfilled until the file is closed.
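This is also why a raw capture is still usable after the call: you can wrap it in a WAV container once the recording is finished and the total length is known. A minimal sketch, assuming a mono 16-bit signed-linear capture (8 kHz for slin, 16 kHz for slin16):

```python
import wave


def slin_to_wav(raw_path: str, wav_path: str, rate: int = 8000) -> None:
    """Wrap a raw signed-linear capture in a WAV container after the call ends."""
    with open(raw_path, "rb") as f:
        pcm = f.read()
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 16-bit samples
        w.setframerate(rate)   # 8 kHz for slin, 16 kHz for slin16
        w.writeframes(pcm)     # header is finalized when the file is closed
```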
Thank you so much. But can I do parallel live transcription for both speakers at the same time?
Sure, you can. You can use two snoop channels and put each into a bridge with externalMedia. A snoop channel receives only one party's audio. See here for an example
or here
Thank you so much for the help. Please let me know if this code is correct or whether I need to make some changes, and if possible, please tell me whether we need to make changes in extensions.conf.

```python
#!/usr/bin/python3
import anyio
import asyncari
import logging
import aioudp
import os
import vosk
import array

# Environment variables for Asterisk ARI configuration
ast_host = os.getenv("AST_HOST", "127.0.0.1")
ast_port = int(os.getenv("AST_ARI_PORT", 8088))
ast_url = os.getenv("AST_URL", "http://%s:%d/" % (ast_host, ast_port))
ast_username = os.getenv("AST_USER", "asterisk")
ast_password = os.getenv("AST_PASS", "asterisk")
ast_app = os.getenv("AST_APP", "hello-world")

# Load Vosk speech recognition model
model = vosk.Model(lang="en-us")

channels = {}


class SnoopChannel:
    def __init__(self, client, parent_channel, direction):
        self.client = client
        self.parent_channel = parent_channel
        self.direction = direction
        self.rec = vosk.KaldiRecognizer(model, 16000)

    async def rtp_handler(self, connection):
        async for message in connection:
            # Strip the 12-byte RTP header, interpret the payload as
            # 16-bit samples and swap to the byte order Vosk expects
            data = array.array('h', message[12:])
            data.byteswap()
            if self.rec.AcceptWaveform(data.tobytes()):
                res = self.rec.Result()
            else:
                res = self.rec.PartialResult()
            print(f"{self.direction} channel result: {res}")

    async def start(self):
        # Each snoop channel gets its own local UDP port for RTP
        self.port = 45000 + len(channels) * 2 + (0 if self.direction == 'in' else 1)
        self.udp = aioudp.serve("127.0.0.1", self.port, self.rtp_handler)
        await self.udp.__aenter__()
        snoop_channel = await self.client.channels.snoopChannel(
            channelId=self.parent_channel.id,
            app=self.client._app,
            spy=self.direction,
            whisper="none"
        )
        media_id = self.client.generate_id()
        await self.client.channels.externalMedia(
            channelId=media_id,
            app=self.client._app,
            external_host='127.0.0.1:' + str(self.port),
            format='slin16'
        )
        bridge = await self.client.bridges.create(type='mixing')
        await bridge.addChannel(channel=[media_id, snoop_channel.id])


async def stasis_handler(objs, ev, client):
    channel = objs["channel"]
    # externalMedia channels also enter Stasis; skip them
    if 'UnicastRTP' in channel.name:
        return
    await channel.answer()
    local_channel_in = SnoopChannel(client, channel, direction='in')
    local_channel_out = SnoopChannel(client, channel, direction='out')
    await local_channel_in.start()
    await local_channel_out.start()
    channels[channel.id] = (local_channel_in, local_channel_out)


async def main():
    async with asyncari.connect(ast_url, ast_app, ast_username, ast_password) as client:
        async with client.on_channel_event("StasisStart") as listener:
            async for objs, event in listener:
                await stasis_handler(objs, event, client)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    anyio.run(main)
```
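Regarding extensions.conf: the only change needed is to route the call into the Stasis application the script registers (`hello-world` by default, from `AST_APP`). A minimal sketch, with a hypothetical context and extension:

```ini
[from-internal]                   ; hypothetical context name
exten => 1234,1,NoOp(Into ARI for live transcription)
 same => n,Stasis(hello-world)    ; must match AST_APP in the Python script
 same => n,Hangup()
```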
Thank you so much @abhinax4991. If possible, can you provide some sample code? I have posted one code snippet in one of the replies; please check whether it is correct.
If possible, can anybody check this code and let me know if I need to make any changes to it?
This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.