I am running the FreePBX distro with Asterisk 13.17.0. Since I upgraded to this version, Asterisk has become unstable and crashes once or twice a week.
I have around 500 extensions, all on PJSIP, with approximately 250 online. Usually no more than 40 calls are active at any given time. I cannot detect any problem with CPU utilization (0.5 out of 8), memory, swapping, or the network.
Backtrace from the last crash core can be found here: backtrace
Before the upgrade, I ran Asterisk 13.15 with no significant issues.
Any suggestions on how to track down the problem and, if possible, fix it are welcome.
Your backtrace is optimized, so it’s not really useful except to say the crash may have been something in SIP. The wiki[1] has details on which build options to enable to ensure that the backtrace is useful.
The backtrace is not only optimised, it is from a build without debug symbols.
FreePBX users tend to use pre-built versions of Asterisk, so generally have problems creating a debug build, especially one that matches the one in use.
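The two problems mentioned above leave different marks in gdb output: frames from code without symbols show up as `?? ()`, while values dropped by the optimizer are printed as `<optimized out>`. As a rough sketch, a saved backtrace could be checked for both markers as follows. The sample text and the name `example_function` are made up for illustration and are not taken from the actual backtrace in this thread.

```python
import re

# Hypothetical backtrace excerpt, invented for illustration only.
SAMPLE = """\
#0  0x00007fafc228472f in ?? () from libasteriskpj.so.2
#1  0x00007fafc2284a10 in ?? () from libasteriskpj.so.2
#2  0x00000000004f1c20 in example_function (req=<optimized out>) at example.c:10
"""

def summarize_backtrace(text):
    """Count gdb markers that indicate missing symbols or optimization."""
    # Every frame line starts with '#<number>'.
    frames = re.findall(r"^#\d+\s", text, flags=re.MULTILINE)
    # gdb prints '?? ()' when it cannot resolve a frame to a function name.
    unresolved = re.findall(
        r"^#\d+\s+(?:0x[0-9a-fA-F]+ in )?\?\? \(\)", text, flags=re.MULTILINE
    )
    # gdb prints '<optimized out>' for values the compiler did not keep.
    optimized = text.count("<optimized out>")
    return {
        "frames": len(frames),
        "unresolved_frames": len(unresolved),
        "optimized_out_values": optimized,
    }

if __name__ == "__main__":
    print(summarize_backtrace(SAMPLE))
```

Many unresolved frames suggest missing debug symbols for that binary or library; many `<optimized out>` values suggest an optimized build. This is only a heuristic and does not replace reading the backtrace.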
Stuff still seems to be optimized and not overly useful. I’m not familiar with the FreePBX environment so I don’t really know how best to get it within their world.
I downgraded to Asterisk 13.16, and it took 14 days before a crash occurred. I created a backtrace that is hopefully of some use now; you can find it at the link below. Could you please tell me whether anything can be learned from the backtrace so that I can prevent crashes from occurring? In syslog, there is a message:
asterisk[18584]: segfault at 7f3632322e6e ip 00007fafc228472f sp 00007fafbcc80710 error 4 in libasteriskpj.so.2[7fafc2204000+180000]
Therefore I assume it is connected to PJSIP somehow, and switching back to chan_sip would probably solve the issue. My preference is to stick with PJSIP, though.
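The syslog line above already narrows things down a little: the instruction pointer (`ip`) minus the start of the mapping shown in square brackets gives the offset of the faulting instruction inside `libasteriskpj.so.2`. A minimal sketch of that arithmetic follows; the function name `module_offset` is made up for this example, and the regular expression assumes the x86-64 kernel message format shown in the log line.

```python
import re

# The segfault line quoted from syslog in the post above.
LINE = (
    "asterisk[18584]: segfault at 7f3632322e6e ip 00007fafc228472f "
    "sp 00007fafbcc80710 error 4 in libasteriskpj.so.2[7fafc2204000+180000]"
)

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<module>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def module_offset(line):
    """Return (module name, offset of the faulting instruction in its mapping)."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        raise ValueError("not a recognised segfault line")
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    size = int(match["size"], 16)
    offset = ip - base
    # Sanity check: the instruction pointer must lie inside the reported mapping.
    if not 0 <= offset < size:
        raise ValueError("instruction pointer is outside the reported mapping")
    return match["module"], offset

if __name__ == "__main__":
    module, offset = module_offset(LINE)
    print(module, hex(offset))  # prints: libasteriskpj.so.2 0x8072f
```

If a copy of the library with matching debug symbols is available, an offset like this can usually be resolved to a function and source line with a tool such as `addr2line -e <path to the library> 0x8072f`, assuming the reported mapping starts at the beginning of the library's code. On x86, `error 4` indicates a user-mode read of an address that is not mapped, which is consistent with an invalid pointer being dereferenced inside the library.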
Someone has to dig in and try to identify the particular case and situation that is causing the crash. I can say it’s not a crash I’ve ever seen before. You can file an issue on the issue tracker[1] but I have no time frame on when it would get looked into.
Well, I downgraded to 13.16 and the problem became much less frequent (with 13.17 I got a crash every 3-4 days; now it is about once a fortnight). 13.15 worked for me with the same configuration for two months, with just higher CPU spikes than I am used to. PJSIP is probably not as production-ready as it is presented everywhere and needs a lot of optimization effort devoted to it.
In syslog, I always get pjproject errors at the time of a crash, so I strongly believe my issue is connected directly or indirectly with PJPROJECT. But I cannot be sure, since my knowledge of the Asterisk code is very limited and my understanding of backtraces is close to zero.
To the second question: yes, I run it on KVM. I have read a lot of sources saying that the old 2.6.32 kernel (which FreePBX uses) can have some trouble on KVM, so I hoped upgrading to SNG7 might mitigate the issue, but there is no production-ready upgrade path as far as I know.
You should contact Sangoma. They made the decision to create their own OS and their own RPMs for their distro so they need to provide support for things like this themselves.
I ran that version of Asterisk on KVM for a while and didn’t have any problems, but I compile it from source.
My experience is that you have to disable BUILD_NATIVE for KVM. I don’t think that is your problem, because as far as I know the RPMs already do that.