Asterisk not hanging up channels properly (hung channels)

I want to ask about a behaviour that I am seeing on multiple versions of Asterisk 1.8. The bug has to do with SIP channels not being hung up properly in Asterisk. These are the symptoms:

The “core show channels” command shows the correct number of concurrent calls and active channels on the system:

Channel              Location             State   Application(Data)
0 active channels
0 active call
33158 calls processed

But the “sip show channelstats” returns a totally different output:

EXTRALUX-IP-PBX*CLI> sip show channelstats
Peer                  Call ID         Duration   Recv: Pack  Lost       (     %) Jitter               Send: Pack  Lost       (     %) Jitter
172.16.8.155     88c454870bf           0000004564  0000000000 ( 0.00%) 0.0000 0000004577  0000000000 ( 0.00%) 0.0000
192.168.2.30     4b075265579           0000011854  0000000000 ( 0.00%) 0.0000 0000011846  0000000000 ( 0.00%) 0.0001
172.16.8.145     52f58e9a56f           0000001707  0000000000 ( 0.00%) 0.0000 0000001729  0000000000 ( 0.00%) 0.0000
10.10.10.10       0c2da8a6079           0000003638  0000000000 ( 0.00%) 0.0000 0000003616  0000000000 ( 0.00%) 0.0001
172.16.8.164     7bbda4cce6a           0000011846  0000000000 ( 0.00%) 0.0000 0000011854  0000000000 ( 0.00%) 0.0000
172.16.8.112     4543acfe0a3           0000000581  0000000000 ( 0.00%) 0.0000 0000000594  0000000000 ( 0.00%) 0.0001
10.10.10.10       31792474079           0000001729  0000000000 ( 0.00%) 0.0000 0000001707  0000000000 ( 0.00%) 0.0000
172.16.8.11       3837e7e8335           0000004577  0000000000 ( 0.00%) 0.0000 0000004564  0000000000 ( 0.00%) 0.0000
172.16.8.162     d5b6a4934b7           0000000594  0000000000 ( 0.00%) 0.0000 0000000581  0000000000 ( 0.00%) 0.0001
10.10.10.10       0f75d0d6079           0000008660  0000000000 ( 0.00%) 0.0000 0000008637  0000000000 ( 0.00%) 0.0000
172.16.8.163     22d65b7c6f2           0000008637  0000000000 ( 0.00%) 0.0000 0000008659  0000000000 ( 0.00%) 0.0000
172.16.8.162     363bc836336           0000003616  0000000000 ( 0.00%) 0.0000 0000003638  0000000000 ( 0.00%) 0.0001
12 active SIP channels

The interesting thing is that the number of SIP channels shown by the two commands does not match. Another way to identify these “hung channels” is that they have no value for Duration in the “sip show channelstats” output. For calls that are actually active, the Duration value is visible and increases over time.
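For anyone who wants to watch for this automatically, here is a minimal Python sketch that pulls the totals out of the two summary lines and compares them. The function name `active_count` and the idea of feeding it captured CLI text are my own assumptions, not something Asterisk provides. The comparison is only meaningful while core reports zero channels; with live calls, non-SIP channels and two-leg calls blur the numbers.

```python
import re


def active_count(cli_output: str, label: str) -> int:
    """Return N from a summary line such as '12 active SIP channels'.

    `label` is the singular form, e.g. 'active SIP channel'; a trailing
    's' in the CLI text is accepted.
    """
    pattern = r"^(\d+) " + re.escape(label) + r"s?\b"
    match = re.search(pattern, cli_output, re.MULTILINE)
    if match is None:
        raise ValueError(f"no '{label}' summary line found")
    return int(match.group(1))


# Summary lines copied from the two command outputs in this post.
core_output = "0 active channels\n0 active call\n33158 calls processed\n"
sip_output = "12 active SIP channels\n"

core_channels = active_count(core_output, "active channel")
sip_channels = active_count(sip_output, "active SIP channel")

# With no channels at core level, every remaining SIP entry is suspect.
suspected_hung = sip_channels if core_channels == 0 else None
print(f"core: {core_channels}, SIP: {sip_channels}, suspected hung: {suspected_hung}")
```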

The “sip show channels” command also shows strange output:

172.16.8.155     670          88c454870bf7e6d  0x0 (nothing)    No       Rx: BYE         670
192.168.2.30     790          4b075265579e86f  0x0 (nothing)    No       Rx: BYE         790
10.10.10.10      0031111111   a1388d0c07ab11e  0x8 (alaw)       No       Rx: ACK         RM_T2_Trun
172.16.8.145     681          52f58e9a56f494e  0x0 (nothing)    No       Rx: BYE         681
10.10.10.10      004222222    0c2da8a6079b11e  0x0 (nothing)    No       Rx: BYE         RM_T2_Trun
172.16.8.164     671          7bbda4cce6a9dd9  0x0 (nothing)    No       Rx: BYE         671
172.16.8.112     678          4543acfe0a3332c  0x0 (nothing)    No       Rx: BYE         678
10.10.10.10      005333333    31792474079a11e  0x0 (nothing)    No       Rx: BYE         RM_T2_Trun
172.16.8.11      06444444     3837e7e833554a7  0x0 (nothing)    No       Rx: BYE         GSM_Trunk_
172.16.8.145     (None)       6c33cd8d03dd5dc  0x0 (nothing)    No       Rx: REGISTER    <guest>
172.16.8.162     668          d5b6a4934b772ee  0x0 (nothing)    No       Rx: BYE         668
172.16.8.151     (None)       82ba2d4384daca3  0x0 (nothing)    No       Rx: REGISTER    <guest>
10.10.10.10      0075555555   0f75d0d6079811e  0x0 (nothing)    No       Rx: BYE         RM_T2_Trun
192.168.1.10     (None)       84ccd25614f811b  0x0 (nothing)    No       Rx: REGISTER    <guest>
172.16.8.161     (None)       f909811561ddffd  0x0 (nothing)    No       Rx: REGISTER    <guest>
172.16.8.112     678          2b21d7c073f1b45  0x8 (alaw)       No       Tx: ACK         678
172.16.8.163     669          22d65b7c6f28804  0x0 (nothing)    No       Rx: BYE         669
172.16.8.162     668          363bc836336a37d  0x0 (nothing)    No       Rx: BYE         668
18 active SIP dialogs
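To pick the suspect rows out of a listing like the one above, something along these lines might help. It is only a sketch: the column layout is assumed from the output pasted here (other Asterisk versions may print extra columns), and `stale_dialogs` is a made-up helper name. A dialog with no codec whose last message was a BYE is normal for a short while after hangup, so the sketch only reports Call-IDs that show up in two listings taken some time apart.

```python
import re

# Columns as they appear in the listing above (an assumption, not a spec):
# peer, user, call-id, format, hold, last message, peer name.
ROW = re.compile(
    r"^(?P<peer>\S+)\s+(?P<user>\S+)\s+(?P<call_id>\S+)\s+"
    r"(?P<fmt>0x[0-9a-fA-F]+ \([^)]*\))\s+(?P<hold>\S+)\s+"
    r"(?P<last>[RT]x: \S+)\s+(?P<peer_name>\S+)\s*$"
)


def stale_dialogs(listing: str) -> set:
    """Return (peer, call_id) pairs with no codec and a BYE as last message."""
    stale = set()
    for line in listing.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue  # header, summary line, or unexpected layout
        if m.group("fmt") == "0x0 (nothing)" and m.group("last").endswith("BYE"):
            stale.add((m.group("peer"), m.group("call_id")))
    return stale


first = """\
172.16.8.155     670              88c454870bf7e6d  0x0 (nothing)    No       Rx: BYE                    670
10.10.10.10   0031111111     a1388d0c07ab11e  0x8 (alaw)       No       Rx: ACK                    RM_T2_Trun
172.16.8.145     (None)          6c33cd8d03dd5dc  0x0 (nothing)    No       Rx: REGISTER            <guest>
18 active SIP dialogs
"""
# Pretend the same command was run again a few minutes later.
second = first

persistent = stale_dialogs(first) & stale_dialogs(second)
print(sorted(persistent))
```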

Does anyone know how to explain these strange outputs?

I think the issue must be quite common, since I am seeing it on the majority of my production systems (small PBXs, up to 30 users). The problem is harmless and can be cured by a simple Asterisk restart, but it is still bothering me.

SIP dialogues remain for several seconds after they cease to be usable, because Asterisk needs to recognise late and duplicate responses.

However lots of last TX BYE suggests a problem with the other side recognising the BYE.

These outputs are from a PBX that, at the moment the commands were issued, had had 0 active calls for much longer than just a couple of seconds.

I am especially wondering about the output of “sip show channelstats”. If Asterisk didn’t receive a reply to a SIP BYE (the ACK did not make it to the server), shouldn’t Asterisk clear/end the call internally anyway after some sort of timeout?

It should clear after about 30 seconds.
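As a rough check on that figure: RFC 3261 gives a non-INVITE request such as BYE a transaction timeout of 64*T1, which is 32 seconds with the default T1 of 500 ms, and over UDP the request is retransmitted at intervals that start at T1 and double up to T2 (4 s). The sketch below just does that arithmetic. It describes the RFC schedule, not necessarily what chan_sip does internally, and the function name is made up for illustration.

```python
T1 = 0.5  # seconds, RFC 3261 default round-trip estimate
T2 = 4.0  # seconds, cap on the retransmit interval for non-INVITE requests


def bye_retransmit_schedule(t1: float = T1, t2: float = T2):
    """Return (send_times, timeout) for a non-INVITE request over UDP."""
    timeout = 64 * t1          # Timer F: give up on the transaction
    send_times = [0.0]         # the initial transmission
    interval = t1              # Timer E starts at T1
    while send_times[-1] + interval < timeout:
        send_times.append(send_times[-1] + interval)
        interval = min(2 * interval, t2)  # double, but never beyond T2
    return send_times, timeout


times, timeout = bye_retransmit_schedule()
print(f"transmissions at {times} s, transaction times out at {timeout} s")
```

With the defaults this gives 11 transmissions, the last at 31.5 s, and a timeout at 32 s, which is where the "about 30 seconds" comes from.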

I am not too sure about the “sip show channels” command output, but I am 100% sure that the output of “sip show channelstats” never clears up. Only an Asterisk restart clears it up.

Do you think the issue is valid for posting on the bug tracker?

I think it is OK, but remember that you need to think about why this is not being seen widely. What is special about your system may be a clue to the cause. It will also determine the severity (lower, the more your system differs from that of a typical user).

You will need to confirm it with the latest released version on the branch, and you will need to provide verbose and debug output and sip set debug or wireshark captures. You will get a better response if you can prune those to show just one case where the call gets stuck, as nobody likes trawling through large log files.
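As an illustration of the pruning step, here is a small Python sketch that keeps only the SIP messages mentioning one Call-ID. It assumes each message in the debug output starts on a line beginning with `<--- ` and ends on a line of the form `<------------->`; check that against your own log before relying on it. The sample text, including the 172.16.8.1 address, is made up, and `messages_for_call` is a hypothetical helper.

```python
def messages_for_call(log_text: str, call_id: str) -> list:
    """Return the SIP message blocks in log_text that mention call_id."""
    kept, block, inside = [], [], False
    for line in log_text.splitlines():
        if line.startswith("<--- "):
            # Assumed start-of-message marker: begin a fresh block.
            block, inside = [line], True
        elif inside and line.startswith("<-----"):
            # Assumed end-of-message marker: keep the block if it matches.
            block.append(line)
            if any(call_id in entry for entry in block):
                kept.append("\n".join(block))
            inside = False
        elif inside:
            block.append(line)
    return kept


# Made-up sample in the assumed layout.
sample_log = """\
<--- SIP read from UDP:172.16.8.155:5060 --->
BYE sip:670@172.16.8.1 SIP/2.0
Call-ID: 88c454870bf7e6d@172.16.8.155
<------------->
some unrelated verbose line
<--- SIP read from UDP:192.168.2.30:5060 --->
OPTIONS sip:172.16.8.1 SIP/2.0
Call-ID: unrelated@192.168.2.30
<------------->
"""

for message in messages_for_call(sample_log, "88c454870bf7e6d"):
    print(message)
    print()
```

Matching on the Call-ID taken from “sip show channels” for one stuck dialog keeps the attachment down to the handful of messages that matter.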

Hmmmm, I must admit that I am surprised that this is not seen more widely. I do not administer many PBXs, but I see the problem on the majority of them (all are small systems). Each has an individual setup, so at the moment I cannot put my finger on what might be the problem. The second part of the problem is that the issue happens quite randomly, so it’s hard to catch.

Thanks for your help, David. It seems I owe you one :wink:

I did some further work on this, and I see the problem only on systems that have some VoIP gateways that we use. On the system for which I reported this bug, changing the switch port for the Asterisk server did the trick. After I connected the server and the IP phones to the same switch, the problem stopped happening. All calls are cleared successfully.

From that I suspect that Asterisk does not handle well the case where it sends a SIP BYE to the remote end and the SIP ACK that the remote end sends back does not make it to the server. I think that Asterisk should clear/end the call internally anyway after some sort of timeout, but it does not. To be more specific, the call is cleared as far as telephony channels are concerned (the channel is not active and call minutes are not running), but it is not cleared from the “sip show channels” and “sip show channelstats” output. Only an Asterisk restart clears those calls.
I have no idea how to simulate this scenario. Doing a packet capture and a detailed debug is very hard, because the problem is very sporadic and hard to replicate. But I hope that somebody can pick up on this.

I have a similar issue. A call comes in from the PSTN to my 1.8 Asterisk. Asterisk sends an INVITE to an app server that proxies the INVITE on to a softphone. The call sets up OK, but the final 200 is repeatedly sent by the softphone because Asterisk is sending the ACK to the proxy (which doesn’t forward it) rather than to the softphone as it should.

Any ideas? Is this a known issue?
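For reference, whether the ACK goes to the proxy or straight to the softphone depends on the dialog's route set. Under RFC 3261 section 12.2.1.1, an in-dialog request (the ACK to a 200 included) is addressed to the Contact from the 200 unless the proxy added itself with Record-Route, in which case the request goes through the proxy and the proxy has to forward it. A minimal sketch of that rule follows; the URIs are hypothetical, the `;lr` check is deliberately crude, and `ack_next_hop` is just an illustrative name.

```python
def ack_next_hop(remote_target: str, route_set: list):
    """Return (request_uri, route_headers, next_hop) per RFC 3261 12.2.1.1.

    remote_target: Contact URI from the 200 OK.
    route_set: Record-Route URIs as seen by the caller, nearest hop first.
    """
    if not route_set:
        # No proxy stayed in the path: send straight to the Contact.
        return remote_target, [], remote_target
    first = route_set[0]
    if ";lr" in first.lower():
        # Loose router: Request-URI is the Contact, Route carries the set.
        return remote_target, list(route_set), first
    # Strict router: Request-URI is the first route, Contact goes last.
    return first, list(route_set[1:]) + [remote_target], first


contact = "sip:softphone@192.0.2.20"   # hypothetical softphone Contact
proxy = "sip:192.0.2.10;lr"            # hypothetical Record-Route entry

print(ack_next_hop(contact, []))        # direct to the softphone
print(ack_next_hop(contact, [proxy]))   # via the proxy, which must forward
```

So a capture of the 200 OK showing whether it carries a Record-Route header would tell you which side is misbehaving.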

ha ha ha

I have the same issue: when the remote device responds abnormally, the call hangs. Asterisk tries to end the call leg, either by expiring the call when the device registers again or by sending a BYE, but it does not actually end the call leg with the user device. My Asterisk version is 11.7.