Sending Call Metadata Along with Audio Buffers in ARI External Media

Problem Statement:

I am using External Media in Asterisk ARI to send call audio buffers to an RTP server. However, along with the audio data, I also need to send metadata related to the call (such as caller ID, call start time, channel information, etc.) with every packet.

What I Have Tried:

  1. Using RTP Headers: RTP itself does not support custom metadata fields directly.
  2. Custom Payloads: Considered embedding metadata within RTP packets, but Asterisk does not seem to provide a direct way to do this.

Questions:

  • Is there a built-in way in ARI to send call metadata along with RTP packets?
  • Can Asterisk’s External Media module be configured to include additional data in the RTP stream?
  • Would modifying Asterisk’s RTP implementation be necessary, or is there a more straightforward approach?

Any insights or alternative approaches would be greatly appreciated. Thanks!

Any one RTP packet can only have one payload type.
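To illustrate the point, here is a minimal sketch that parses the 12-byte fixed RTP header defined in RFC 3550. The header carries a single 7-bit payload type field, so a packet carrying audio cannot also be tagged as a second kind of payload. The function name and the sample packet are made up for the example.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Parse the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,  # the one and only payload type for this packet
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Hypothetical packet: version 2, payload type 0 (PCMU), 160 bytes of audio.
example = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1234ABCD) + b"\xff" * 160
header = parse_rtp_header(example)
```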

More generally, Asterisk expects SIP to be used for metadata, and at the session level, not at the individual media frame level.

It sends media. That’s it. Any metadata is outside the scope of it and would be done in the ARI application using other mechanisms.
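As a rough sketch of one such "other mechanism": the ARI application could send the metadata once, out of band, as a small JSON message to the media receiver, and let the receiver match it to the RTP stream by source address and port. This assumes the external media channel returned by ARI carries the UNICASTRTP_LOCAL_ADDRESS and UNICASTRTP_LOCAL_PORT channel variables; verify that against your Asterisk version. The function names, the message layout, and the UDP transport are all hypothetical choices for illustration, not anything Asterisk provides.

```python
import json
import socket


def build_sidecar_message(call_channel: dict, media_channel: dict) -> bytes:
    """Build a JSON message tying call metadata to the RTP source address.

    call_channel:  ARI Channel object for the caller's channel.
    media_channel: ARI Channel object returned when creating external media
                   (assumed to include the UNICASTRTP_* channel variables).
    """
    channel_vars = media_channel.get("channelvars", {})
    caller = call_channel.get("caller", {})
    message = {
        # The receiver can match these against the source of incoming RTP.
        "rtp_source_address": channel_vars.get("UNICASTRTP_LOCAL_ADDRESS"),
        "rtp_source_port": channel_vars.get("UNICASTRTP_LOCAL_PORT"),
        "channel_id": call_channel.get("id"),
        "channel_name": call_channel.get("name"),
        "caller_number": caller.get("number"),
        "caller_name": caller.get("name"),
        "creation_time": call_channel.get("creationtime"),
    }
    return json.dumps(message).encode("utf-8")


def send_sidecar(message: bytes, host: str, port: int) -> None:
    """Send the metadata once over UDP to a (hypothetical) control port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (host, port))
```

UDP is used here only to keep the sketch short; an HTTP request or a WebSocket message would work the same way and gives you delivery guarantees.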

Thanks for your response, @david551. Could you elaborate on what you mean by using SIP for metadata in this context?

Currently, I’m receiving proper metadata from AGI in my ARI application. However, the issue is how to transmit this metadata to the RTP server when using External Media in ARI.

Since External Media only sends raw audio frames over RTP, I’m unsure how SIP fits into this workflow. Are you suggesting that metadata should be exchanged via SIP signaling before the ARI application takes over? If so, how would that work alongside ARI and AGI?

Would appreciate any insights on how to integrate SIP for metadata handling here.

Thanks for the clarification @jcolp. Since External Media only handles media transmission, what are the other mechanisms typically used in an ARI application to send metadata alongside the RTP stream?

Is there an industry-standard or commonly adopted approach for this? Should I look into separate signaling via SIP, WebSockets, or some other method? Would love to know what’s considered the best practice for this use case.

ARI is a set of APIs. You use the APIs to retrieve information or do things. There are also events that contain information. You can retrieve channel details, which include caller ID:

https://docs.asterisk.org/Asterisk_22_Documentation/API_Documentation/Asterisk_REST_Interface/Channels_REST_API/#get
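For example, a minimal sketch of that lookup using only the Python standard library might look like this. The base URL, the credentials, and the helper names are assumptions for illustration; the fields read from the response (id, name, state, caller, creationtime) follow the Channel model in the ARI documentation linked above.

```python
import base64
import json
import urllib.parse
import urllib.request

ARI_BASE = "http://localhost:8088/ari"  # assumption: default HTTP port and /ari prefix


def channel_url(base: str, channel_id: str) -> str:
    """Build the URL for GET /channels/{channelId}."""
    return f"{base}/channels/{urllib.parse.quote(channel_id, safe='')}"


def fetch_channel(base: str, channel_id: str, user: str, password: str) -> dict:
    """Fetch the Channel object from ARI using HTTP basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        channel_url(base, channel_id),
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extract_metadata(channel: dict) -> dict:
    """Pick out the call metadata mentioned in the question."""
    caller = channel.get("caller", {})
    return {
        "channel_id": channel.get("id"),
        "channel_name": channel.get("name"),
        "state": channel.get("state"),
        "caller_number": caller.get("number"),
        "caller_name": caller.get("name"),
        "creation_time": channel.get("creationtime"),
    }
```

Usage would be along the lines of `extract_metadata(fetch_channel(ARI_BASE, channel_id, "ariuser", "secret"))`, where the user and password are placeholders for whatever is configured in ari.conf.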

I would highly suggest going over the ARI documentation, looking at what exists, experimenting, and understanding how it works.

I mean that the whole purpose of SIP, and its SDP payload, is to provide metadata about media streams.

"RTP server" isn’t a term people would normally use. RTP is generally used in conjunction with other protocols and in a completely symmetric way, so there is no distinction between client and server at the RTP level.

This is the answer! anshvert, you can retrieve any metadata you need by querying the REST API; check the docs.