Codec for Websocket channel when data is 8-bit PCM

I want to use the WebSocket channel driver to handle audio connections to my application, where the data is 8-bit PCM. Which codec should I define in the WebSocket config in the dialplan to support this? Alternatively, if it’s not possible with 8-bit PCM, then I assume 16-bit signed linear PCM should be possible? Is there a list of the possible codec values for the dialplan so that I can see what the options are?

There’s no codec that supports 8-bit linear PCM directly. You don’t mention the sample rate, but the “slin” codec is 16-bit signed linear at an 8 kHz sample rate, which is probably what you want.

You can use the Asterisk CLI command core show codecs to list all of the codecs Asterisk knows about. The “FORMAT” column contains the identifier to use in Asterisk configuration, including the dialplan. Be aware, though, that the ones actually supported by your installation of Asterisk depend on which codec modules are loaded. You can see the list of loaded codec modules using module show like codec. The slin codecs are built in, so they won’t appear in the module list.

Where did you get the 8-bit PCM specification from? My guess is that PCM here is being used to refer to a non-linear coding, whereas the normal assumption, in the Asterisk world, is that PCM refers to linear coding. 8-bit linear PCM doesn’t have enough dynamic range to be useful for telephony.

My guess is that they are referring to G.711, but G.711 has two variants: µ-law and A-law (ulaw and alaw, in Asterisk). If that is the case, you will need to find out which, as the audio will be distorted if you use the wrong one.

“PCM” is used in the full title of G.711.

Actually, it’s “ulaw” not “mulaw” wherever you specify a codec.

I should have remembered that, as it doesn’t read correctly phonetically.

There’s no codec that supports 8-bit linear PCM directly. You don’t mention the sample rate, but the “slin” codec is 16-bit signed linear at an 8 kHz sample rate, which is probably what you want.

Yes, you are correct. Our sample rate is 8 kHz.

Where did you get the 8 bit PCM specification from?

It’s a bit of an unusual case. We’re building an interface between VHF radio and VoIP. I’m told the hardware is what produces the 8-bit PCM samples, and I’m writing the application to act as the interface between the hardware’s digital format and the Asterisk server.

Everything I’ve been reading suggests that the WebSocket channel driver is the best way to handle this process; I may just need to manipulate the audio a bit for compatibility, I think.

Yep, you’re on the right track. Fortunately, converting between 8-bit and 16-bit is simple math. You don’t even have to do it yourself: on a Linux system, you could just run sox to do the conversion.
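If you’d rather do the “simple math” in the application itself, a minimal sketch in Python might look like this. It assumes unsigned 8-bit linear input and 16-bit signed big-endian output; the function name is my own invention, not anything from Asterisk:

```python
import struct

def u8_to_slin16(data: bytes) -> bytes:
    """Convert unsigned 8-bit linear samples (0-255) into 16-bit
    signed linear samples in big-endian (network) byte order."""
    # Re-centre around zero (0..255 -> -128..127), then scale to the
    # 16-bit range by shifting left 8 bits.
    samples = [(b - 128) << 8 for b in data]
    return struct.pack(f">{len(samples)}h", *samples)

# 128 (mid-scale) maps to 0; 0 and 255 map near the 16-bit extremes.
print(u8_to_slin16(bytes([128, 0, 255])).hex())  # 000080007f00
```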

I think you still think that the requirement is for linear encoding. PCM has been used as a telephony term since before Microsoft even existed to invent the .wav format, and in that context it was used for the piecewise-linear, but overall logarithmic, G.711 variants.

For 8 to 16, you’d generally use a lookup table. The same is probably true the other way, given that memory is cheap these days. However, Asterisk knows how to transcode G.711.
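To illustrate the lookup-table idea with the G.711 case: µ-law has only 256 code points, so the whole decode table can be precomputed once. This is a sketch of the standard µ-law expansion (bias 0x84), not anything Asterisk-specific; in practice, Asterisk’s own transcoding would do this for you:

```python
BIAS = 0x84  # standard G.711 mu-law bias (132)

def mulaw_decode(code: int) -> int:
    """Expand one 8-bit mu-law code to a 16-bit signed linear sample."""
    code = ~code & 0xFF                # mu-law bytes are stored inverted
    sign = code & 0x80
    exponent = (code >> 4) & 0x07
    mantissa = code & 0x0F
    sample = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -sample if sign else sample

# Precompute the 256-entry table once...
MULAW_TABLE = [mulaw_decode(c) for c in range(256)]

# ...after which decoding a buffer is just indexing.
def mulaw_to_slin(data: bytes) -> list:
    return [MULAW_TABLE[b] for b in data]

print(MULAW_TABLE[0xFF], MULAW_TABLE[0x00])  # 0 -32124
```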

This is actually one of the many threads where critical information arrives in dribs and drabs. In particular, we have been told “VHF radio”, but not which model, to allow us to search for documentation, and not the actual wording of the interface specification. The biggest problem here is that “PCM” is not a specific enough term.

I can understand the frustration, and this is a topic that I’m learning a lot about as I go, since I’ve not worked in this area before. I also have to tread carefully around commercial-confidentiality limitations.

What I can tell you is that the VHF radio hardware and software is being built by my company for a fairly specialised purpose. It is most definitely not an off-the-shelf commercial product. I can influence the designs of what they output somewhat, but there are hardware limitations and fairly heavy performance metrics that have to be considered. For example, I know the hardware can only produce 12-bit audio samples and not a full 16 bits. Speed and packet size are also highly important at the hardware end, hence the decision to downsize to 8 bits before transmission to my application. We’ll test heavily to see the impact of this downsizing. I think if too much quality is lost, then I should be able to persuade them to send the full 12 bits instead and adjust that, but there are some major bandwidth limitations to be considered. Another option would be to perform the logarithmic encoding ourselves at the hardware end, but that may be an issue given the limited processing power we can call on there.

I’ve done some experimentation with the example audio samples I have and the slin codec is working well for me for now. I’ll continue to implement this so that we can test and develop a baseline for comparison.

Thanks for all the help.

If they have a 12-bit codec, I’d suggest that they should be using G.711 A-law, as it will approach traditional PSTN quality, limited only by the resolution of the codec. (They could change the parameters to get less distortion, but that would not help if they then transcode to G.711 to pass through the phone network, and it would require a special codec at the other end.)

If they stay with 8 bit linear, they should consider adding noise at the sending end, to dither the low order bits.
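For what it’s worth, TPDF dithering before the 16-to-8-bit reduction could be sketched like this in Python; the scaling and the function name are my own assumptions, not anything from the hardware spec:

```python
import random

def dither_to_8bit(samples):
    """Reduce 16-bit signed samples to 8-bit signed, adding triangular
    (TPDF) dither of roughly one 8-bit LSB before truncation."""
    out = []
    for s in samples:
        # One 8-bit LSB is 256 in 16-bit units; summing two uniform
        # randoms gives a triangular distribution of about +/-1 LSB.
        noise = random.randint(-128, 127) + random.randint(-128, 127)
        d = (s + noise) >> 8                # arithmetic shift truncates
        out.append(max(-128, min(127, d)))  # clamp to the 8-bit range
    return out
```

The dither randomises the quantisation error so that low-level signals come through as slightly noisy audio rather than harsh distortion.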

I’m writing a Python application that takes raw 8-bit unsigned samples (at 8 kHz) and passes them into the WebSocket channel in Asterisk that is set up to use the slin codec. Note that these samples are not compressed or anything like that: just straight raw values of 0–255, which, as I currently understand it, is 8-bit PCM audio.

If I pass in the values as-is, then I hear the correct audio, but it’s a bit distorted and sounds sped up, which I assume is the effect of it interpreting every two samples as a single 16-bit sample.

However, I’m not sure how to convert these 8-bit samples into the equivalent 16-bit values, as I’m not sure of the exact format that the codec uses. From my research, I’m assuming that I need to map the 8-bit unsigned integer to a 16-bit signed integer (i.e. 0 → -32768, 255 → 32767). And from my testing, it seems to be a big-endian 2’s-complement format, because I can just hear the audio track mixed in with a lot of static/white noise; anything else and I just get pure static without any hint of the original audio.

So, how do I clean up the converted values to get a better version of the audio with minimal static? I’m quite a novice at dealing with audio samples, so any thoughts or advice would be appreciated.

If you are using AudioSocket with Asterisk, this is how my vibe-coded project implemented it.

Although this is technically a form of PCM, as it is a time-domain digital representation, it is not PCM as understood in the telephony industry, which would be G.711, and encoding it unsigned, with a DC offset, is particularly non-standard: 8-bit linear would normally be in signed format, and probably two’s complement.

I’d suggest the easiest way for you to do the conversion to 16-bit signed linear, network byte order, would be a table lookup, as there are only 256 possible values. However, to do it algorithmically, you would take the unsigned 8-bit value, assign it to a 16-bit signed type, subtract 128, multiply by 256 (or arithmetic-shift left by 8), and then call the host-to-network byte-order function (unless you are already on a big-endian machine and don’t want the code to be portable).
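A sketch of the table-lookup variant in Python, assuming unsigned 8-bit in and big-endian 16-bit out as described above: since each input byte expands to two output bytes, the table maps every possible byte value to a pre-packed 2-byte string (the names here are my own):

```python
import struct

# Each unsigned 8-bit value -> 16-bit signed sample, network byte order.
LUT = [struct.pack(">h", (v - 128) << 8) for v in range(256)]

def u8_to_slin16_lut(data: bytes) -> bytes:
    """Convert a buffer of unsigned 8-bit samples via table lookup."""
    return b"".join(LUT[b] for b in data)
```

Because the byte order is baked into the precomputed table by struct’s “>” modifier, this version is portable without needing an explicit host-to-network conversion at run time.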
