Support for 16K SLIN Audio

Hi:

I’m developing IAX2 audio links that use 16-bit linear audio at sampling rates that are higher than 8K. Obviously, the goal is better sound quality across the network. I want to do this in a way that is well-defined in the IAX2 RFC. I don’t think RFC5456 is clear on how to achieve this, but I want to get input from experts in this field.

Problem

  • I’m using higher-quality audio sampling rates in my IAX2 implementation, but I want to stay fully compatible with existing Asterisk implementations.
  • IAX2 RFC5456 defines a media format for 16-bit linear audio (0x00000040), but it isn’t completely clear on what the sampling rate for this media format should be.
  • Most of the other IAX2 media formats have an implicit sampling rate.
  • I think (not 100% sure) that the Asterisk implementation of the 16-bit linear media format (0x00000040) assumes an 8K sampling rate. Anyone who knows better should correct me.
  • I can see that the Wireshark dissector for IAX2 labels the undocumented media format 0x00008000 as “16K 16-bit linear audio,” so apparently someone has already gone down this path and claimed an unused media format codepoint for this purpose.
  • The RFC defines an information element called SAMPLINGRATE which is obviously relevant, but it never explains how this element should be used.
  • The expert at IANA who is responsible for editing the RFCs and related numeric assignments is hesitant to allocate a new media format for 16K linear audio if the existing SAMPLINGRATE element could be clarified and used for its intended purpose.
  • It’s possible that I’d want linear audio at other, higher sampling rates (for example 22.05K, 44.1K, or 48K), so the point about leveraging SAMPLINGRATE is well taken. This is especially true given that the FORMAT element is limited to 32 bits.
  • Informative point: SIP addresses this problem by allowing sampling rates to be specified in the SIP INVITE message (example: a=rtpmap:96 G729/8000).
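To make the codepoint concern concrete, here is a minimal sketch of how the two values quoted above sit in a 32-bit field. It assumes each media format occupies one bit of that field, which is what the two codepoints suggest; the constant names and the format_bits helper are made up for illustration and are not taken from the RFC or from Asterisk.

```python
# Codepoints quoted in this thread. The names are illustrative only.
FORMAT_SLIN_16BIT = 0x00000040          # 16-bit linear audio (RFC 5456)
FORMAT_SLIN_16K_UNDOCUMENTED = 0x00008000  # labelled "16K 16-bit linear" by Wireshark


def format_bits(mask):
    """Return the bit positions set in a 32-bit media format mask."""
    if not 0 <= mask <= 0xFFFFFFFF:
        raise ValueError("format mask must fit in 32 bits")
    return [bit for bit in range(32) if mask & (1 << bit)]


# Assumption: one bit per format, so a 32-bit field has at most 32 slots.
capability = FORMAT_SLIN_16BIT | FORMAT_SLIN_16K_UNDOCUMENTED
print(format_bits(capability))
```

If that one-bit-per-format assumption holds, every extra sampling rate for linear audio would consume another of the 32 slots, which is the argument for carrying the rate in SAMPLINGRATE instead.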

Proposal

  • The existing SAMPLINGRATE information element will be added as an optional element on the NEW and ACCEPT message (RFC5456 tables 6.2.2 and 6.2.3 respectively).
  • If specified, the SAMPLINGRATE element on the NEW message allows the caller to state the maximum sampling rate it supports for any media format whose sampling rate isn’t implicit. (For example: the SAMPLINGRATE element isn’t relevant for G.711 uLaw because the 8K assumption is implicit in the ITU specification, but it might be relevant for G.729 where different sample rates are allowed).
  • If specified, the SAMPLINGRATE element in the ACCEPT message allows the callee to define the sampling rate that will be used for the call, but only in cases where the media format has flexibility in this regard. The accepted SAMPLINGRATE must not exceed the maximum sampling rate provided by the caller.
  • The default sampling rate for any media format that has flexibility in this regard will be 8K. The most important example is the 16-bit linear media format (0x00000040), which will default to 8K if no SAMPLINGRATE element is used.
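The negotiation rules above can be sketched roughly as follows. This is only an illustration of the proposal, not existing Asterisk behavior. The IE number 0x29 and the 2-octet value in hertz are my reading of RFC5456 and should be checked against the RFC; the function names are hypothetical.

```python
import struct

IE_SAMPLINGRATE = 0x29   # assumed IE number; verify against RFC5456
DEFAULT_RATE_HZ = 8000   # default from the proposal when no element is present


def encode_samplingrate_ie(rate_hz):
    """Encode the element as type, length, 2-octet rate (assumed layout)."""
    if not 0 < rate_hz <= 0xFFFF:
        raise ValueError("rate must fit in an unsigned 16-bit field")
    return struct.pack("!BBH", IE_SAMPLINGRATE, 2, rate_hz)


def choose_accept_rate(caller_max_hz, callee_supported_hz):
    """Callee side: pick the rate to put in ACCEPT.

    caller_max_hz is the SAMPLINGRATE from NEW, or None if it was absent.
    Returns None when the callee supports nothing at or below the ceiling.
    """
    ceiling = DEFAULT_RATE_HZ if caller_max_hz is None else caller_max_hz
    candidates = [r for r in callee_supported_hz if r <= ceiling]
    return max(candidates) if candidates else None


# Caller offers up to 16K; callee supports 8K, 16K, and 48K.
rate = choose_accept_rate(16000, [8000, 16000, 48000])
print(rate, encode_samplingrate_ie(rate).hex())
```

Note that with a 2-octet field the largest rate that can be expressed is 65535 Hz, so 48K fits but anything higher would not.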

Thanks for reading this far. :grinning_face: Any input/thoughts would be welcomed. I realize people don’t care about IAX2 much, but I’m working in the AllStarLink ecosystem where IAX2 is the standard.

And I don’t use Asterisk, so there are no technical constraints. I’m using a device that implements the IAX2 protocol.

Bruce

AST_FORMAT_SLINEAR16 was defined in Asterisk 1.8 or earlier. The include file structure changed some time after that, and I don’t have time to find the definition in current versions, but I assume it is the same.

Remember that IAX2 is largely Asterisk internal structures, so the codes will reflect the internal ones used in the Asterisk backbone (although there may have been extensions, beyond the original bitmap, which might not be supported by IAX). IAX2 has not had significant changes for many years.

Hi:

Thanks very much for this information. That file also provides confirmation that the 0x00000040 16-bit linear format assumes an 8K sample rate. So I’m going to try to get the IANA registry updated to (a) state the sample rate of 8K explicitly and (b) get the 16K format added to the list. I appreciate the help.