MarkL
April 19, 2017, 10:26pm
1
I’ve been using Asterisk on CentOS 6.5 for a few years now. Lately, it often “freezes” when doing a “core reload”.
After the “freeze”, the Recv-Q on port 5060/UDP fills up and all registrations and devices that use UDP go offline.
The only way to get it up and running again is to kill the asterisk process and start it again.
We have had this issue for a few months now. In the beginning it was occasional, but last week it occurred in more than 50% of the reloads.
We tried different Asterisk versions, from 11.7 to 11.25.
What could we do to troubleshoot this issue?
Thanks,
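For anyone who wants to watch for the symptom described above, here is a minimal sketch that reads the receive-queue size for a UDP port from the text of `/proc/net/udp` on Linux. The function name `udp_recv_queue` and the sample rows are made up for illustration; the sample is not output from the system in this thread. It only covers IPv4 sockets (IPv6 sockets are listed in `/proc/net/udp6`).

```python
def udp_recv_queue(proc_net_udp_text, port=5060):
    """Return the rx_queue byte counts of all UDP sockets bound to `port`.

    Expects the text of /proc/net/udp: ports and queue sizes are hex,
    and the fifth column has the form tx_queue:rx_queue.
    """
    queues = []
    for line in proc_net_udp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port:
            rx_queue = int(fields[4].split(":")[1], 16)
            queues.append(rx_queue)
    return queues


# Illustrative sample only (hypothetical values, not from a real capture).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops
  45: 00000000:13C4 00000000:0000 07 00000000:00034A00 00:00000000 00000000     0        0 12345 2 0000000000000000 17
  46: 00000000:0035 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 12346 2 0000000000000000 0
"""

print(udp_recv_queue(SAMPLE))

# On a live system, something like:
#   with open("/proc/net/udp") as f:
#       print(udp_recv_queue(f.read()))
```

A value that keeps growing across repeated reads after a reload would match the “Recv-Q is filling up” observation, since it means nothing is reading from the socket.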
Have you tried using the Asterisk 13.X LTS version?
MarkL
April 19, 2017, 10:31pm
3
Asterisk 13.5.0 has the same issue
jcolp
April 19, 2017, 10:32pm
4
Getting a backtrace[1] would show the state of the system and why it is hung up. If you can provide that we can take a look.
[1] https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace#GettingaBacktrace-GettingInformationForADeadlock
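As a rough sketch of collecting the two pieces of information usually requested for a deadlock (the lock list and a backtrace of every thread), the commands could be scripted as below. The binary path `/usr/sbin/asterisk`, the helper names, and the timeout are assumptions, and `core show locks` only produces output when Asterisk was built with `DEBUG_THREADS`. The linked wiki page remains the authoritative procedure.

```python
import subprocess


def deadlock_capture_commands(pid, asterisk_bin="/usr/sbin/asterisk"):
    """Build the commands for a deadlock report (paths are assumptions)."""
    return [
        # Lock list from the Asterisk CLI (needs a DEBUG_THREADS build).
        [asterisk_bin, "-rx", "core show locks"],
        # Attach gdb to the running process, dump all threads, then detach.
        ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"],
    ]


def capture(pid, timeout=120):
    """Run each command and return its stdout.

    The timeout matters: the CLI command may itself block if the
    process is deadlocked, in which case subprocess.TimeoutExpired
    is raised instead of hanging forever.
    """
    outputs = []
    for cmd in deadlock_capture_commands(pid):
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=timeout)
        outputs.append(result.stdout)
    return outputs

# Example (run as root while Asterisk is hung; 4321 is a placeholder PID):
#   locks, backtrace = capture(4321)
```

The backtrace is only readable if Asterisk was compiled with debug symbols and without optimization.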
Not sure, but this sounds like a DNS issue when reading your configuration after the reload. Honestly, I’m not sure about it.
MarkL
April 19, 2017, 10:56pm
6
Backtrace attached.
I noticed it’s only the traffic on port 5060/UDP that stops being processed. Devices on 5060/TCP are not experiencing any problems.
core-show-locks.txt (61.5 KB)
backtrace-threads.txt (102.1 KB)
jcolp
April 19, 2017, 11:00pm
7
Are you using pbx_ael or pbx_realtime?
MarkL
April 19, 2017, 11:01pm
8
Don’t know.
How can I check this?
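One way to check, sketched under assumptions: the Asterisk CLI command `module show like pbx_` lists the loaded modules whose names contain `pbx_`, and its output can be scanned for `pbx_ael` and `pbx_realtime`. The helper name `loaded_modules` is made up, and the sample output below is illustrative rather than taken from the system in this thread; the exact columns differ between Asterisk versions, so the parser only relies on the first column ending in `.so`.

```python
def loaded_modules(module_show_output):
    """Return module names (without '.so') found in 'module show' output."""
    names = set()
    for line in module_show_output.splitlines():
        parts = line.split()
        # Module rows start with a file name such as pbx_config.so;
        # header and summary rows do not.
        if parts and parts[0].endswith(".so"):
            names.add(parts[0][:-3])
    return names


# Illustrative output only (hypothetical, layout varies by version).
SAMPLE = """\
Module                         Description                              Use Count
pbx_config.so                  Text Extension Configuration             0
pbx_spool.so                   Outgoing Spool Support                   0
2 modules loaded
"""

mods = loaded_modules(SAMPLE)
for name in ("pbx_ael", "pbx_realtime"):
    print(name, "loaded" if name in mods else "not loaded")

# On a live system the text would come from:
#   asterisk -rx "module show like pbx_"
```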
jcolp
April 19, 2017, 11:01pm
9
How are you configuring your dialplan?
MarkL
April 19, 2017, 11:02pm
10
It’s a FreePBX installation.
jcolp
April 19, 2017, 11:03pm
11
They don’t use AEL or realtime. There seems to be some sort of deadlock situation. I’d confirm it exists under the latest version of 13 (13.15.0 is the latest) and then file an issue[1] with the backtrace and console log.
[1] https://issues.asterisk.org/jira
MarkL
April 19, 2017, 11:10pm
12
No DNS requests are made after the reload; I just checked.
MarkL
April 19, 2017, 11:45pm
13
I just tried to reproduce it in 13.15.0, and the problem does not seem to be present there.
However, our FreePBX version isn’t compatible with 13.15.0, so I have to go back to 11.25.0.
Any other ideas to debug the problem here?
jcolp
April 19, 2017, 11:46pm
14
You would need to identify the fix that resolved the issue and backport it.
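Identifying the fix is essentially a bisection between a known-broken release (13.5.0 in this thread) and a known-fixed one (13.15.0). A minimal sketch of that search follows. The function `first_fixed`, the release list, and especially the fake test predicate are hypothetical; nothing in this thread says which release actually contains the fix. In practice `is_fixed` would mean building that version and running enough reloads to be confident the hang is gone.

```python
def first_fixed(versions, is_fixed):
    """Return the earliest version for which is_fixed() is True.

    Assumes `versions` is ordered oldest to newest, the first entry is
    known broken, the last is known fixed, and the fix never regresses
    in between. Needs about log2(len(versions)) test runs.
    """
    lo, hi = 0, len(versions) - 1  # lo: known broken, hi: known fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


# Illustrative release list between the two versions named in the thread.
RELEASES = ["13.%d.0" % minor for minor in range(5, 16)]

# FAKE predicate for demonstration only: pretends the fix landed in 13.9.0.
def fake_is_fixed(version):
    return int(version.split(".")[1]) >= 9

print(first_fixed(RELEASES, fake_is_fixed))
```

Once the release is narrowed down, the same search at commit granularity is what `git bisect` automates; the resulting commit is the candidate to backport to the 11 branch.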