One-second break in audio every so often

I have been plagued by a problem on our Asterisk server: periodically we get breaks in audio a little over one second long. Sometimes they come every few minutes, and sometimes a half hour goes by without a single pause.

I was able to capture a Wireshark trace from both phones and from the Asterisk server when it happened. The RTP stream FROM Asterisk for that channel just quit for about 1 second. The phone capture shows the same thing: audio going to Asterisk but none coming from Asterisk to the phone. The same thing happened on the other leg of the call to the other phone (Asterisk stopped transmitting to that IP, but packets continued to be received). Other calls were in progress at the same time; Asterisk kept sending to those IPs for longer, and they only had a 300 ms gap.

Once the packets started flowing again, they resumed at the next SEQ number and flowed at a much faster rate to catch up (normally they are sent at 20 ms per packet; I got 30 packets in 10 ms, so it caught up quickly).
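For anyone who wants to look for the same pattern in their own capture, here is a minimal sketch of the check described above. It assumes the arrival time and RTP sequence number of each packet in one stream have already been exported from the capture into a list of tuples. The function names, the 20 ms interval default, and the 5x threshold are illustrative choices, not anything from Asterisk or Wireshark.

```python
from typing import List, Tuple

# Each packet is (arrival_time_seconds, rtp_sequence_number) for ONE RTP stream.
Packet = Tuple[float, int]


def find_gaps(packets: List[Packet],
              expected_interval: float = 0.020,
              factor: float = 5.0) -> List[Tuple[int, int, float]]:
    """Return (prev_seq, next_seq, gap_seconds) for every inter-arrival
    time longer than factor * expected_interval (both are assumed values)."""
    gaps = []
    for (t_prev, seq_prev), (t_next, seq_next) in zip(packets, packets[1:]):
        delta = t_next - t_prev
        if delta > factor * expected_interval:
            gaps.append((seq_prev, seq_next, delta))
    return gaps


def missing_between(seq_prev: int, seq_next: int) -> int:
    """Number of sequence numbers skipped between two packets.
    RTP sequence numbers are 16 bits wide, so wrap at 65536."""
    return (seq_next - seq_prev - 1) % 65536
```

If a gap is reported and `missing_between` returns 0 for it, the sender paused but did not drop anything, which matches what I saw: the stream resumed at the next SEQ number.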

So what I see is that something caused Asterisk to hiccup for a second before it got back on track. The phones didn’t have enough jitter buffer to handle the gap, so the audio was missed.
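As a rough illustration of why the jitter buffer can't hide this, here is a deliberately simplified model: a buffer holding a fixed amount of audio can cover a sender stall up to that depth, and anything beyond it is heard as silence. The 60 ms depth used in the comments is a hypothetical figure, not a measured value from these phones, and real adaptive jitter buffers behave less neatly.

```python
def audible_gap_ms(stall_ms: int, buffer_ms: int) -> int:
    """Simplified model: the portion of a sender stall that a jitter
    buffer of the given depth cannot absorb (never negative)."""
    return max(0, stall_ms - buffer_ms)


# With a hypothetical 60 ms buffer:
#   a 1000 ms stall leaves audible_gap_ms(1000, 60) ms of silence
#   a 40 ms stall is absorbed entirely
```

Under this model both the one-second stall and the 300 ms stall on the other calls would be audible unless the buffer were far deeper than is practical for live calls.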

There are no other time-intensive things running on this server, mainly just Asterisk. There weren’t any packets requesting anything besides RTP. It doesn’t seem to be a network issue, as the packets are backed up within Asterisk itself rather than delayed crossing the network.

We do have call recording turned on with the audio merge on hangup option, but I would expect that to be something that could be handled independently and I didn’t see a BYE or any SIP packet around that time.

There were only about 6 channels active at the time, so I don’t think it was a load issue.

This was happening on the 1.6.x branch, and we upgraded to the Asterisk 11.0 branch with no change in the symptoms.

Any ideas what could be happening?

I’ve never seen anything like that. I would suspect an OS problem.

As you have a concurrent inbound stream, I don’t think it is a timing source problem, but it might just be worth checking your internal timing sources.

I am assuming that you are running on real hardware, as this sort of thing could happen on VMs.

Yes, real hardware, running CentOS 5.6.

Not sure what you mean by “check internal timing sources”. This is running a Digium quad PRI card; two of the spans have 0 for internal clock and the other two have 1 and 2 (primary and secondary).
The CLI command “timing test” reports “Using the ‘DAHDI’ timing module for this test.” and the results look right.