Accessing audio stream for processing


I have a SIP-to-SIP channel set up through my Asterisk server. I need to access the audio stream coming from one SIP phone and process it before sending it on the outgoing channel to the dialled SIP phone. Can anybody help me understand how to do this? I understand that this can be achieved with some functions from channel.c, but I do not have any details of the implementation.

Thanks in advance


I think you will need upwards of several man days of consultancy. My guess is that the Jobs forum would be more appropriate.

My guess is that you will need to modify something like dsp.c or write a custom channel based on chan_local.c.

It would help to know why you want to do this, as there may be an easier route.

If you want a fast, authoritative answer and you intend to do the coding yourself, I think you would do better to use the developer mailing lists or IRC channels. The number of people who can answer that sort of question without a lot of research is likely to be in the low double figures, and they are unlikely to be here.

Looking at your other postings, I would suggest investigating whether there is a file format, supported by Monitor(), which can easily be read as it is being written, or creating such a format. I basically mean a raw linear format, rather than one in a WAV wrapper.
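To illustrate the idea of reading a recording while it is still being written, here is a minimal sketch. It assumes the file is headerless 16-bit signed linear audio at 8 kHz mono, so one 20 ms frame is 320 bytes. The function name `follow_frames`, its parameters, and the idle timeout are all made up for the example and are not part of Asterisk.

```python
import time

FRAME_BYTES = 320  # 160 samples * 2 bytes; assumes 8 kHz, 16-bit, mono


def follow_frames(path, idle_timeout=1.0, poll=0.02):
    """Yield complete frames from a raw file that is still being appended to.

    Stops once no new data has arrived for roughly idle_timeout seconds.
    """
    buf = b""
    idle = 0.0
    with open(path, "rb") as f:
        while idle < idle_timeout:
            chunk = f.read(FRAME_BYTES - len(buf))
            if chunk:
                buf += chunk
                idle = 0.0
                if len(buf) == FRAME_BYTES:
                    yield buf
                    buf = b""
            else:
                # Reached the current end of file; wait for the writer.
                time.sleep(poll)
                idle += poll
```

This only works because a raw file has no header that needs the total length up front; a WAV wrapper stores sizes in its header, which is why it is awkward to read before the recording finishes.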

(If you take my earlier response, it might be better to augment chan_local, rather than cloning it, as it does tend to have a special relationship with the rest of Asterisk.)

Thanks for your reply. I will post the question on other forums too.
The reason I want to do this is that I have developed a separate algorithm which, when applied to sound waves, changes them so that the listener hears something other than the original sound (I can't really post the details here because they are proprietary). Essentially, I need to capture the incoming speech, do my processing on it, and then send it on the outgoing channel.

By linear format, you must be implying a format which would allow me to cut the sound wave at any point to yield smaller fragments. Right?

That’s what I mean by “raw”. Linear means that the numerical amplitude is directly proportional to the microphone input voltage, although simple companded formats, like mu-law and a-law would also be useful.

I would be rather surprised if your algorithm doesn’t require a linear input, so you would have to convert anything else to linear.
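As an illustration of converting a companded format to linear, here is a sketch of the standard G.711 mu-law expansion of one byte into a 16-bit signed linear sample. The function name is hypothetical; inside Asterisk the codec translators would normally do this conversion for you.

```python
def ulaw_to_linear(u):
    """Expand one G.711 mu-law byte (0..255) to a signed 16-bit linear sample."""
    u = ~u & 0xFF                  # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07     # 3-bit segment number
    mantissa = u & 0x0F            # 4-bit position within the segment
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def ulaw_bytes_to_linear(data):
    """Expand a whole buffer of mu-law bytes to a list of linear samples."""
    return [ulaw_to_linear(b) for b in data]
```

The quiet segments get fine steps and the loud ones coarse steps, which is why the mu-law value is not directly proportional to the input voltage until it has been expanded like this.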

Our algorithm requires a WAV file as input. I currently run a system in which the user records his voice through an Asterisk dialplan. I then use this recorded voice in my algorithm and play back the transformed wave. However, to do this 'online', I need the feature for which I have requested advice.

Your code may require a .wav. I doubt that the core algorithm does! (On the other hand, I suspect your code may not be able to cope with an MP3 .wav file.)
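To show how thin the WAV wrapper is, here is a small sketch using Python's standard `wave` module to strip the header and hand back the raw PCM samples that the core algorithm would actually work on. The function name `wav_to_raw` is made up for the example. As far as I know, the `wave` module only handles uncompressed PCM, so a compressed .wav would be rejected rather than decoded.

```python
import io
import wave


def wav_to_raw(wav_bytes):
    """Return ((channels, sample_width_bytes, rate), raw_pcm) for a PCM WAV."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        params = (w.getnchannels(), w.getsampwidth(), w.getframerate())
        pcm = w.readframes(w.getnframes())
    return params, pcm
```

Everything the header contributes is the three numbers in `params`; the rest of the file is just the samples.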

The interesting question is whether your algorithm can process the audio 160 samples at a time (one 20 ms frame at 8 kHz). If it can, modifying the voice on the fly should not be a big deal.
If not, say how many samples you need to process at once, and you will get better answers.
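A rough sketch of what processing 160 samples at a time looks like, assuming 16-bit signed little-endian linear audio. The `halve_volume` transform is only a placeholder standing in for the real algorithm, and `process_stream` is a hypothetical name, not an Asterisk function.

```python
import struct

SAMPLES_PER_FRAME = 160                   # 20 ms at 8 kHz
FRAME_BYTES = SAMPLES_PER_FRAME * 2       # 16-bit samples
FRAME_FMT = "<%dh" % SAMPLES_PER_FRAME    # little-endian signed shorts


def halve_volume(samples):
    """Placeholder transform: attenuate every sample by half."""
    return [s // 2 for s in samples]


def process_stream(src, dst, transform):
    """Read frames from src, apply transform, write to dst. Returns frame count."""
    frames = 0
    while True:
        data = src.read(FRAME_BYTES)
        if len(data) < FRAME_BYTES:
            break                         # end of stream or a partial frame
        samples = list(struct.unpack(FRAME_FMT, data))
        out = transform(samples)
        dst.write(struct.pack(FRAME_FMT, *out))
        frames += 1
    return frames
```

If the real transform needs more context than one frame, it would have to keep its own buffer of earlier frames, and that buffering adds delay to the call.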