High-volume constant-tone alarms distort over RTP in Asterisk 20.2.1

Hi all,

I’m experiencing an issue when broadcasting alarm sounds over RTP multicast from Asterisk 20.2.1 to IP speakers.

Problem description:

  • Alarm sounds are usually single-tone or constant dual-tone signals.

  • When played at high volume, the tone intermittently drops or “pumps” – amplitude momentarily dips and then recovers.

  • Speech announcements or more complex audio do not exhibit this behavior.

  • Playing the same sound directly on my PC (ALSA/desktop player) works fine – no distortion.

Files tested:

  • GSM source files (original alarm tones) → decoded and re-encoded to G.711 μ-law for RTP

  • 16-bit PCM WAV at 8 kHz mono → Asterisk encodes to μ-law RTP

  • SoX generated tones:

    • Single-tone 1000 Hz high-volume

    • Dual-tone 900+1300 Hz

    • Sweep tone 700–1200 Hz
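For anyone wanting to reproduce the three test signals without SoX, here is a minimal Python sketch using only the standard library. It is not the poster's actual SoX invocation; the function names, the default amplitude of 0.8, and the choice to split the dual-tone amplitude evenly between the two sines are all assumptions made for illustration.

```python
import math
import struct
import wave

RATE = 8000  # samples per second, matching the 8 kHz mono files described above


def sine(freq_hz, seconds, amplitude=0.8):
    """Return float samples in [-1, 1] for a constant tone."""
    n = int(RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / RATE) for i in range(n)]


def dual_tone(f1_hz, f2_hz, seconds, amplitude=0.8):
    """Sum two sines, each at half the amplitude so the peak stays <= amplitude."""
    a = sine(f1_hz, seconds, amplitude / 2)
    b = sine(f2_hz, seconds, amplitude / 2)
    return [x + y for x, y in zip(a, b)]


def sweep(f_start_hz, f_end_hz, seconds, amplitude=0.8):
    """Linear sweep; the phase is the integral of the instantaneous frequency."""
    n = int(RATE * seconds)
    out = []
    for i in range(n):
        t = i / RATE
        phase = 2 * math.pi * (
            f_start_hz * t + (f_end_hz - f_start_hz) * t * t / (2 * seconds)
        )
        out.append(amplitude * math.sin(phase))
    return out


def write_wav(path, samples):
    """Write float samples as 16-bit signed PCM, mono, 8 kHz."""
    ints = [int(round(s * 32767)) for s in samples]
    pcm = struct.pack("<%dh" % len(ints), *ints)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(RATE)
        w.writeframes(pcm)


if __name__ == "__main__":
    # Output filenames are hypothetical.
    write_wav("tone-1000.wav", sine(1000, 5))
    write_wav("tone-900-1300.wav", dual_tone(900, 1300, 5))
    write_wav("sweep-700-1200.wav", sweep(700, 1200, 5))
```

Lowering `amplitude` is the equivalent of step 3 in the list of things tried below (reducing the SoX volume).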

Observed:

  • GSM source tones also exhibit distortion when broadcast

  • 16-bit PCM WAV tones still produce noticeable “pumping” at high volume

  • Wireshark RTP logs show no packet loss or jitter significant enough to explain this

Things I have considered / tested:

  1. File format compatibility – all files converted to 16-bit PCM 8kHz mono WAV or μ-law WAV

  2. Re-encoding / bypassing GSM → μ-law pipeline

  3. Adjusting SoX volume to lower levels

  4. Checked RTP buffers / softmix → simple_bridge switching in logs

Questions:

  • Is this a known behavior with constant-frequency tones in Asterisk when broadcasting via RTP to IP speakers?

  • Could the distortion be caused by Asterisk’s audio pipeline, codec quantization, or the IP speaker’s AGC/limiter?

  • Are there recommended settings, workarounds, or best practices for broadcasting high-volume alarm tones reliably through Asterisk?

  • Any suggestions to tune Asterisk’s streaming or format handling for stable playback of constant tones?

Thanks in advance for your advice!

You haven’t really stated how you’re doing things, but if you’re really just playing a file then it is possible that Asterisk is not touching the audio at all. Sending it out over RTP doesn’t transcode. Transcoding only occurs when codecs differ.

Did you… play the RTP stream in Wireshark?

Actually, I hadn’t played the RTP packets from Wireshark before. I have now listened to the RTP stream in Wireshark, and the distortion occurs there too.
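Since the distortion is already audible in the captured stream, it can help to measure it rather than judge it by ear. The sketch below assumes the decoded stream has been saved as a 16-bit mono WAV file and reports the RMS level of each 20 ms frame, flagging frames that dip below the median. The function names and the 1 dB threshold are illustrative choices, not part of any Asterisk or Wireshark tooling.

```python
import math
import struct
import wave


def levels_db(samples, frame_len):
    """Return the RMS level in dBFS of each complete frame of 16-bit samples."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        # Clamp to 1.0 so a silent frame does not produce log10(0).
        levels.append(20 * math.log10(max(rms, 1.0) / 32768.0))
    return levels


def frame_levels_db(path, frame_ms=20):
    """Read a 16-bit mono WAV and return per-frame RMS levels in dBFS."""
    with wave.open(path, "rb") as w:
        if w.getnchannels() != 1 or w.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono WAV")
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return levels_db(samples, rate * frame_ms // 1000)


def find_dips(levels, threshold_db=1.0):
    """Indices of frames whose level is more than threshold_db below the median."""
    median = sorted(levels)[len(levels) // 2]
    return [i for i, lvl in enumerate(levels) if lvl < median - threshold_db]
```

For a constant tone the per-frame level should be flat, so any indices returned by `find_dips(frame_levels_db("capture.wav"))` (the filename is hypothetical) mark where the pumping happens. Running the same check on the source file shows whether the dips are introduced before or after Asterisk.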

I would suggest providing more detailed information and showing the actual console output. If transcoding is occurring, then avoiding it would mean the audio content is sent untouched.

I am trying to play alarm sounds from Asterisk to IP speakers. The alarms must be played at a relatively high volume so that they are clearly audible in the environment.

Here is the Asterisk CLI output:

I play the GENERAL.ulaw file. There isn’t any error in the file itself: when I play it on my computer, I hear no distortion. But when I play it from Asterisk, I hear distortion from my IP speaker and also in the Wireshark RTP analyzer.

Even though the 3 errors above are visible, I can hear the sound.

Okay, so you’re actually making a mixing bridge because you have multiple channels. It wouldn’t be forwarding the audio through as-is but instead mixing it. There are no open issues with this, but it’s not something that I bet many people would be doing (high audio volume). Your version is also almost 3 years old, so maybe something has changed since then - I don’t know, and that’s not something I’m going to go through history looking at.

I think using Asterisk isn’t the best way to stream high-quality sound to speakers, right? I will have dozens of speakers, and my alarm sounds are loud, narrow-band tonal signals consisting of constant or periodic fixed frequencies (e.g., single or dual tones), rather than typical speech audio. What do you suggest? Is there a way to solve this with Asterisk?

I can’t answer any of that.

Send them direct to the speaker, from a recording made in the codec used by the speaker. In that case, the speaker will receive exactly what was in the file.

I’m surprised that the volume is varying.

Could I check that you aren’t wearing hearing aids, as hearing aids will treat pure tones as feedback whistles, and will mute them?

The core problem jcolp identified is correct — when you Page() to multiple channels, Asterisk creates a mixing bridge rather than forwarding audio directly. The mixing bridge runs a softmix engine that resamples, mixes, and re-encodes every audio frame, even if there’s only one source. That process is optimized for speech (broadband, varying amplitude, short correlation windows) and does poorly with constant-frequency tones.

Here’s why pure tones specifically break: the softmix engine uses a fixed-size mixing buffer with automatic gain normalization. A sustained full-amplitude sine wave at, say, 1000 Hz has zero amplitude variation across frames, so the gain adjustment logic hunts — it briefly reduces gain because the signal looks “hot,” then the level drops, so it compensates upward, and you get that pumping/breathing artifact. Speech never triggers this because its amplitude envelope is naturally dynamic. The effect gets worse at higher volume because you’re closer to the digital ceiling where the normalization has less headroom.

**Two approaches that actually fix this:**

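**1. Bypass the mixing bridge entirely.** Instead of using Page(), run Playback() directly on a single MulticastRTP channel. Playback() takes a filename, not a channel, so the channel has to be created first, for example with Originate():

```
exten => s,1,Answer()
same => n,Originate(MulticastRTP/basic/239.0.0.1:1234,app,Playback,alarm-tone)
```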

With a single channel and no bridge, Asterisk reads the ulaw file and writes it straight to the RTP stream with no mixing. Your speakers subscribe to the multicast group and play what they receive. No softmix, no gain adjustment, no artifacts.

**2. Pre-encode and use the native format.** As david551 mentioned, if you must use a bridge (e.g., you need to page SIP endpoints alongside multicast), make sure the source file is already in the exact codec every endpoint negotiates. For ulaw endpoints, generate your tones directly in ulaw:

```
sox -n -r 8000 -c 1 -t ul alarm-tone.ulaw synth 5 sine 1000 vol 0.8
```

The `vol 0.8` is important: don’t go full scale. It leaves roughly 2 dB of headroom, which makes it less likely that the mixing math clips. Full-amplitude tones through an 8-bit companded codec with mixing on top are prone to distortion.
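To relate SoX's `vol` factor to headroom in dB: a plain number passed to `vol` is an amplitude ratio, so the conversion is 20·log10(ratio). A small sketch of the arithmetic (the helper names are made up for illustration):

```python
import math


def ratio_to_db(ratio):
    """Convert a linear amplitude ratio to decibels."""
    return 20 * math.log10(ratio)


def db_to_ratio(db):
    """Convert decibels to a linear amplitude ratio."""
    return 10 ** (db / 20)


print(round(ratio_to_db(0.8), 2))  # -1.94
print(round(db_to_ratio(-3), 3))   # 0.708
```

So `vol 0.8` sits just under 2 dB below full scale, and a factor of about 0.71 would give a full 3 dB.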

Also worth noting: Asterisk 20.2.1 is nearly three years old. The bridge mixing code has had several fixes since then. If upgrading is feasible, 20.11 or the current 22.x series would be worth testing — some of the softmix buffer handling was reworked.

If neither approach works and you just need reliable tone delivery to IP speakers, you might also consider cutting Asterisk out of the tone path entirely and using a lightweight multicast sender (ffmpeg can do this in one line) for the alarm audio, reserving Asterisk for voice announcements where its mixing actually makes sense.
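As an illustration of what such a standalone sender involves, here is a minimal Python sketch that reads a raw μ-law file and multicasts it as RTP in 20 ms packets. It assumes the speakers accept plain G.711 μ-law (static payload type 0 per RFC 3551) on the multicast group used in the example above; the function names and the SSRC value are arbitrary, and there is no RTCP or marker-bit handling, so treat it as a starting point rather than a finished tool.

```python
import socket
import struct
import time

PAYLOAD_TYPE_PCMU = 0     # static RTP payload type for G.711 mu-law (RFC 3551)
SAMPLES_PER_PACKET = 160  # 20 ms at 8 kHz; mu-law uses one byte per sample


def rtp_packet(seq, timestamp, ssrc, payload):
    """Build a minimal RTP packet: version 2, no padding, extension, CSRC or marker."""
    header = struct.pack(
        "!BBHII",
        0x80,                    # V=2, P=0, X=0, CC=0
        PAYLOAD_TYPE_PCMU,       # M=0, PT=0
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc,
    )
    return header + payload


def send_ulaw_file(path, group, port, ssrc=0x12345678, ttl=1):
    """Send a raw mu-law file to a multicast group, paced at one packet per 20 ms."""
    with open(path, "rb") as f:
        audio = f.read()
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    next_send = time.monotonic()
    try:
        for seq, offset in enumerate(range(0, len(audio), SAMPLES_PER_PACKET)):
            chunk = audio[offset:offset + SAMPLES_PER_PACKET]
            # One byte per sample, so the byte offset doubles as the RTP timestamp.
            sock.sendto(rtp_packet(seq, offset, ssrc, chunk), (group, port))
            next_send += 0.020
            time.sleep(max(0.0, next_send - time.monotonic()))
    finally:
        sock.close()


if __name__ == "__main__":
    send_ulaw_file("alarm-tone.ulaw", "239.0.0.1", 1234)
```

Because the file bytes go onto the wire unchanged, whatever the speaker plays is exactly what is in the file, which also makes this a useful control test for deciding whether the pumping comes from Asterisk or from the speaker.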
