Help with Asterisk deadlock (possible bug)

A premise: I can’t reproduce it on demand, but it happens frequently (once or twice a day) on some installations running Asterisk 11.20 to 11.21.

The symptoms are that Asterisk takes 100% CPU, phones and channels stop working, and the Asterisk console gives no response to commands like "sip show peers".
Restarting Asterisk seems to be the only way to get it working again.
Another clue: all of these hangs happen while using WebRTC, but I’m not sure that is connected to the problem. There’s a similar bug, #25456, but I’m not sure it is the same issue.

Usually there’s no message in my Asterisk full log when this happens, but sometimes I see this:

[2016-02-01 01:44:57] WARNING[19221][C-000001f9] channel.c: Exceptionally long voice queue length queuing to Local/884@from-queue-0000088d;2
[2016-02-01 01:44:58] WARNING[19221][C-000001f9] channel.c: Exceptionally long voice queue length queuing to Local/884@from-queue-0000088d;2
[2016-02-01 01:44:59] WARNING[19221][C-000001f9] channel.c: Exceptionally long voice queue length queuing to Local/884@from-queue-0000088d;2
[2016-02-01 01:45:01] WARNING[19221][C-000001f9] channel.c: Exceptionally long voice queue length queuing to Local/884@from-queue-0000088d;2
[2016-02-01 01:45:02] WARNING[19221][C-000001f9] channel.c: Exceptionally long voice queue length queuing to Local/884@from-queue-0000088d;2
[2016-02-01 01:45:03] WARNING[19221][C-000001f9] channel.c: Exceptionally long voice queue length queuing to Local/884@from-queue-0000088d;2
[2016-02-01 01:45:04] WARNING[19221][C-000001f9] channel.c: Exceptionally long voice queue length queuing to Local/884@from-queue-0000088d;2
[2016-02-01 01:45:06] WARNING[19221][C-000001f9] channel.c: Exceptionally long voice queue length queuing to Local/884@from-queue-0000088d;2
[2016-02-01 01:45:07] WARNING[19221][C-000001f9] channel.c: Exceptionally long voice queue length queuing to Local/884@from-queue-0000088d;2
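
For anyone who wants to see how long and on which channel this warning repeats before a hang, here is a minimal Python sketch that summarizes it from the full log. The regular expression is based only on the sample lines above and may need adjusting for other logger settings; the function name summarize_long_queue_warnings is hypothetical, not part of Asterisk.

```python
import re

# Assumption: log lines look like the samples above, i.e.
# [timestamp] WARNING[thread-id][call-id] channel.c: <message> <channel>
WARNING_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+WARNING\[\d+\](?:\[(?P<call>C-[0-9a-f]+)\])?\s+"
    r"channel\.c:\s+Exceptionally long voice queue length queuing to (?P<chan>\S+)"
)


def summarize_long_queue_warnings(lines):
    """Return {channel: (count, first_timestamp, last_timestamp)}."""
    summary = {}
    for line in lines:
        match = WARNING_RE.match(line.strip())
        if match is None:
            continue  # not the warning we are looking for
        chan, ts = match.group("chan"), match.group("ts")
        count, first, _last = summary.get(chan, (0, ts, ts))
        summary[chan] = (count + 1, first, ts)
    return summary
```

It could be fed an open log file directly, for example `summarize_long_queue_warnings(open("/var/log/asterisk/full"))`, where the path is an assumption about the installation.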

Here http://pastebin.com/xfTxu9Cz is the output of

ps -LlFm -p $(/sbin/pidof asterisk)

and

pstack $(/sbin/pidof asterisk)

Here http://pastebin.com/eyt8SZgh is the output of strace -ff -p $(/sbin/pidof asterisk)

Does anyone have any ideas? Thank you.

The provided information isn’t useful, except to show that something is not servicing a Local channel. To determine if it’s actually a deadlock you can follow the instructions on the wiki[1] to get a backtrace and provide it.

[1] https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace#GettingaBacktrace-GettingInformationForADeadlock

http://pastebin.com/xfTxu9Cz <- different commands, but the same output as gdb -ex "thread apply all bt" --batch /usr/sbin/asterisk $(pidof asterisk) > /tmp/backtrace-threads.txt

for the second one,

$ asterisk -rx "core show locks"
No such command 'core show locks' (type 'core show help core show locks' for other possible commands)

Am I missing something?

Asterisk has to be built with the DONT_OPTIMIZE flag set in order to produce a useful backtrace. Since it wasn’t, this backtrace is not useful. For "core show locks" to be available, you have to build with DEBUG_THREADS enabled.
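
Since the hang can’t be reproduced on demand, it may help to capture the backtrace automatically before anyone restarts Asterisk. Below is a rough Python sketch of that idea, not a tested production watchdog. It assumes a hung Asterisk shows up as the CLI command from the first post ("sip show peers") not returning within a timeout, and it reuses the gdb invocation quoted earlier in the thread. The function names, the default timeout and the output path handling are all hypothetical.

```python
import subprocess


def command_responds(cmd, timeout_s=10.0):
    """Return True if cmd exits within timeout_s seconds (any exit code)."""
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return False


def backtrace_command(pid, binary="/usr/sbin/asterisk"):
    """Build the gdb command from the thread for a given Asterisk PID."""
    return ["gdb", "-ex", "thread apply all bt", "--batch", binary, str(pid)]


def capture_if_hung(pid, out_path="/tmp/backtrace-threads.txt",
                    probe=("asterisk", "-rx", "sip show peers")):
    """Write a backtrace to out_path if the CLI probe times out."""
    if command_responds(list(probe)):
        return False  # CLI answered, nothing to capture
    with open(out_path, "w") as out:
        subprocess.run(backtrace_command(pid), stdout=out,
                       stderr=subprocess.STDOUT)
    return True
```

Run from cron or a loop with the PID from pidof, this would leave a backtrace file behind for each detected hang; the backtrace is only useful if the build flags mentioned above are enabled.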

Ok, I’ll recompile and wait for the next lock
thanks

Here is the backtrace:
http://pastebin.com/ddkQBGnk

This appears to be the same issue as https://issues.asterisk.org/jira/browse/ASTERISK-25275 but manifesting in a different way. I’d suggest adding yourself as a watcher on the issue to see when any progress is made.

OK, thank you. I’m trying the workaround suggested in the bug report…

[update] The workaround described at https://issues.asterisk.org/jira/browse/ASTERISK-25275 seems to solve the issue. Thanks @jcolp


diff -Naur asterisk-11.21.1.ori/res/pjproject/pjnath/src/pjnath/stun_session.c asterisk-11.21.1/res/pjproject/pjnath/src/pjnath/stun_session.c
--- asterisk-11.21.1.ori/res/pjproject/pjnath/src/pjnath/stun_session.c 2016-02-03 21:23:32.000000000 +0000
+++ asterisk-11.21.1/res/pjproject/pjnath/src/pjnath/stun_session.c 2016-02-09 10:04:25.439002972 +0000
@@ -866,7 +866,7 @@
                                               pj_stun_tx_data *tdata)
 {
     pj_status_t status;
-
+    cache_res=PJ_FALSE;
     PJ_ASSERT_RETURN(sess && addr_len && server && tdata, PJ_EINVAL);
 
     pj_log_push_indent();