Segmentation fault Asterisk 13.19.0

Hello Team!

I hope you can help me with this.
I have an Asterisk instance running on VMware with 4 GB RAM and 1 core (Intel® Xeon® CPU E5-2630L v3 @ 1.80GHz).

We’ve been running Asterisk 13 for about six months already and had no issues. We are using res_pjsip with approx 100 endpoints with static contacts and registrations. First we had version 13.16, then updated to 13.19 as soon as it was released.
Recently it started to segfault with errors like this:

kernel: asterisk[24600]: segfault at 0 ip 00007fe28480e921 sp 00007ffdee61a238 error 4 in libc-2.17.so[7fe2846a9000+1b8000]
kernel: asterisk[25526]: segfault at 0 ip 00007f0b08345921 sp 00007ffde8fe01d8 error 4 in libc-2.17.so[7f0b081e0000+1b8000]
kernel: asterisk[13140]: segfault at 423f0a8 ip 00007ffbd8355036 sp 00007ffbb6689718 error 4 in libc-2.17.so[7ffbd8220000+1b8000]
kernel: asterisk[2443]: segfault at 4158e08 ip 00007f8e9b2f8036 sp 00007f8e815dd718 error 4 in libc-2.17.so[7f8e9b1c3000+1b8000]
kernel: asterisk[13476]: segfault at 78 ip 00007f5f5d71f0f9 sp 00007f5f4116d560 error 4 in libasteriskpj.so.2[7f5f5d689000+152000]
kernel: asterisk[29763]: segfault at 2ad33a0 ip 00007fbbcd81e0c8 sp 00007fbbabb55718 error 4 in libc-2.17.so[7fbbcd6e9000+1b8000]
kernel: asterisk[21986]: segfault at 78 ip 00007f6267a5a0f9 sp 00007f6243483560 error 4 in libasteriskpj.so.2[7f62679c4000+152000]
kernel: asterisk[11503]: segfault at 78 ip 00007fba94a200f9 sp 00007fba6bffe560 error 4 in libasteriskpj.so.2[7fba9498a000+152000]
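When reading lines like these, it can help to subtract the library's load address (the value inside the square brackets) from the instruction pointer, since the absolute addresses change on every run. Below is a minimal Python sketch of that arithmetic, assuming the log format shown above; the function names `parse_segfault` and `summarize` are made up for illustration. In these messages, `error 4` is a user-mode read of an unmapped page, and a fault address of 0 or 78 usually points to a NULL pointer or a field of a NULL structure being read.

```python
import re
from collections import Counter

# Pattern for the kernel "segfault at ..." lines quoted above.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp [0-9a-f]+ "
    r"error \d+ in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+[0-9a-f]+\]"
)


def parse_segfault(line):
    """Return (library, instruction offset inside it, faulting address) or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    # ip minus the mapping base gives an offset that is stable across runs.
    offset = int(m.group("ip"), 16) - int(m.group("base"), 16)
    return m.group("lib"), offset, int(m.group("addr"), 16)


def summarize(log_text):
    """Count crashes per (library, offset) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        parsed = parse_segfault(line)
        if parsed is not None:
            lib, offset, _addr = parsed
            counts[(lib, offset)] += 1
    return counts


# The eight lines from the post, copied verbatim.
LOG = """\
kernel: asterisk[24600]: segfault at 0 ip 00007fe28480e921 sp 00007ffdee61a238 error 4 in libc-2.17.so[7fe2846a9000+1b8000]
kernel: asterisk[25526]: segfault at 0 ip 00007f0b08345921 sp 00007ffde8fe01d8 error 4 in libc-2.17.so[7f0b081e0000+1b8000]
kernel: asterisk[13140]: segfault at 423f0a8 ip 00007ffbd8355036 sp 00007ffbb6689718 error 4 in libc-2.17.so[7ffbd8220000+1b8000]
kernel: asterisk[2443]: segfault at 4158e08 ip 00007f8e9b2f8036 sp 00007f8e815dd718 error 4 in libc-2.17.so[7f8e9b1c3000+1b8000]
kernel: asterisk[13476]: segfault at 78 ip 00007f5f5d71f0f9 sp 00007f5f4116d560 error 4 in libasteriskpj.so.2[7f5f5d689000+152000]
kernel: asterisk[29763]: segfault at 2ad33a0 ip 00007fbbcd81e0c8 sp 00007fbbabb55718 error 4 in libc-2.17.so[7fbbcd6e9000+1b8000]
kernel: asterisk[21986]: segfault at 78 ip 00007f6267a5a0f9 sp 00007f6243483560 error 4 in libasteriskpj.so.2[7f62679c4000+152000]
kernel: asterisk[11503]: segfault at 78 ip 00007fba94a200f9 sp 00007fba6bffe560 error 4 in libasteriskpj.so.2[7fba9498a000+152000]
"""

for (lib, offset), count in sorted(summarize(LOG).items()):
    print(f"{lib} +{offset:#x}: {count} crash(es)")
```

For the lines above, the three libasteriskpj.so.2 crashes all land on the same offset (0x960f9) with fault address 78, and the two libc crashes with fault address 0 share offset 0x165921, so these look like repeat hits on the same code paths rather than random corruption. With matching debug symbols installed, such an offset can usually be resolved to a function with a tool like addr2line, but the backtrace from the core dump remains the more reliable source.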

It produced a core dump (latest backtrace attached). Asterisk crashes and safe_asterisk is unable to restart it, so I had to start it again manually (and later with a pgrep script).
I tried updating to 13.20 and 15.2 and downgrading to 11.21, disabling pjsip and enabling only SIP, but nothing helped: it still crashed almost every hour, even when idle with no calls at all.
Then I did a fresh CentOS install on a different (physical) ESX host and compiled Asterisk 13.19 from scratch again. It worked the whole night without issues but crashed again in the morning with a segfault. Asterisk gives no messages; with debug enabled it just unregisters all modules and stops.

I would really appreciate your help. Let me know if I need to provide any more information or logs.

Thank you.

backtrace.txt (81.1 KB)

There is no team here. It is a peer support forum.

You need to rebuild Asterisk with compiler optimisation disabled for the backtraces to be of much use.

Hello David,
I will recompile it and get backtraces from the next crash.
Thank you.

I recompiled Asterisk and it was working fine until now. We have had several crashes over the past 2 days; I’m attaching backtraces from the last one. Will these be of any help?

Thank you.

core-thread1.txt (11.5 KB)
core-locks.txt (1.2 KB)
core-brief.txt (67.8 KB)
core-full.txt (282.1 KB)

I forgot to mention that along with recompiling we updated the version, so those crashes happened on 13.20.0.

You didn’t compile with optimisation disabled!

It looks like chan_pjsip is passing a null pointer for a reference count variable. I’m not familiar with the internals of chan_pjsip.

Thread 1 (Thread 0x7fe43940b700 (LWP 5049)):
#0  pj_atomic_dec_and_get (atomic_var=0x0) at ../src/pj/os_core_unix.c:962
        new_value = <optimized out>
#1  0x00007fe4d7e87d55 in pjsip_transport_dec_ref (tp=0x7fe47018f398) at ../src/pjsip/sip_transport.c:1046
        tpmgr = 0x2d59c40
        key = {type = 2, rem_addr = {addr = {sa_family = 2}, ipv4 = {sin_family = 2, sin_port = 25310, sin_addr = {s_addr = 1912541612}, sin_zero = "\000\000\000\000\000\000\000"}, ipv6 = {sin6_family = 2, sin6_port = 25310, sin6_flowinfo = 1912541612, sin6_addr = {s6_addr = "\000\000\000\000\000\000\000\000\360\275\356\327\344\177\000", u6_addr32 = {0, 0, 3622747632, 32740}}, sin6_scope_id = 1275870392}}}
        key_len = 24
        tp = 0x7fe47018f398
#2  0x00007fe4d7e8813f in pjsip_transport_send (tr=0x7fe47018f398, tdata=tdata@entry=0x7fe44c0c37d8, addr=addr@entry=0x7fe44c0c39c8, addr_len=addr_len@entry=16, token=token@entry=0x7fe45402eac8, cb=cb@entry=0x7fe4d7e83050 <stateless_send_transport_cb>) at ../src/pjsip/sip_transport.c:859
        status = 70004
#3  0x00007fe4d7e83262 in stateless_send_transport_cb (token=token@entry=0x7fe45402eac8, tdata=tdata@entry=0x7fe44c0c37d8, sent=<optimized out>, sent@entry=-70002) at ../src/pjsip/sip_util.c:1257
        status = <optimized out>
        cont = 1
        cur_addr = 0x7fe44c0c39c8
        cur_addr_len = 16
        via = <optimized out>
        stateless_data = 0x7fe45402eac8
......
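For longer backtraces, a few lines of Python can pick out the frames that received a null pointer and count how many values gdb reported as `<optimized out>`. This is only a rough sketch, assuming frame lines in the format shown above; `null_pointer_frames` and `optimized_out_count` are hypothetical helper names, and the sample is a shortened copy of the excerpt.

```python
import re

# Matches gdb frame lines such as:
#   #1  0x00007fe4d7e87d55 in func (arg=0x1) at ../src/file.c:123
FRAME_RE = re.compile(
    r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\w+) \((.*)\) at (\S+)$",
    re.MULTILINE,
)


def null_pointer_frames(backtrace):
    """Return (frame number, function, argument, location) for each 0x0 argument."""
    hits = []
    for m in FRAME_RE.finditer(backtrace):
        frame_no, func, args, location = m.groups()
        for arg in args.split(", "):
            name, _, value = arg.partition("=")
            # "tdata=tdata@entry=0x..." keeps only the part after the last "=".
            if value.rsplit("=", 1)[-1] == "0x0":
                hits.append((int(frame_no), func, name, location))
    return hits


def optimized_out_count(backtrace):
    """Count values that gdb could not show because of compiler optimisation."""
    return backtrace.count("<optimized out>")


# Shortened copy of the excerpt above.
SAMPLE = """\
#0  pj_atomic_dec_and_get (atomic_var=0x0) at ../src/pj/os_core_unix.c:962
        new_value = <optimized out>
#1  0x00007fe4d7e87d55 in pjsip_transport_dec_ref (tp=0x7fe47018f398) at ../src/pjsip/sip_transport.c:1046
#3  0x00007fe4d7e83262 in stateless_send_transport_cb (token=token@entry=0x7fe45402eac8, tdata=tdata@entry=0x7fe44c0c37d8, sent=<optimized out>, sent@entry=-70002) at ../src/pjsip/sip_util.c:1257
"""

for frame_no, func, name, location in null_pointer_frames(SAMPLE):
    print(f"frame #{frame_no}: {func}() got {name}=0x0 at {location}")
print("optimized-out values:", optimized_out_count(SAMPLE))
```

A non-zero count of `<optimized out>` values is a quick hint that at least part of the code in the backtrace was still built with optimisation.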

I used the Asterisk wiki for the compiler options: https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace

The attached picture shows the options that were enabled during the recompile, and the core show settings output from Asterisk itself, which indicates that the flags are enabled. Unfortunately I don’t know why the core dump is still optimized out. I will recompile again and wait for a crash to see if this time I can get a proper backtrace.


Your problem appears to be within PJSIP itself, somehow in its transport layer in relation to how you are specifically using Asterisk. Are you using TCP? TLS? Websockets?

Yes, I’m using TCP for SIP for local endpoints and UDP for external trunks; no TLS, no Websockets.

Here are my transport section settings:

[global]
type=global
user_agent=PBX
regcontext=registerstate
keep_alive_interval=0
disable_multi_domain=yes
mwi_tps_queue_high=5000
mwi_tps_queue_low=-1

[transport-udp]
type=transport
protocol=udp
bind=LOCAL_IP:5060
local_net=NETWORK

[transport-tcp]
type=transport
protocol=tcp
bind=LOCAL_IP:5060
local_net=NETWORK

I use transport-tcp for endpoints (phones) and transport-udp for trunks.
Could this be the issue, or is using TCP in general an issue?

The issue is somewhere in the PJSIP TCP transport or the usage of it. There have been recent tweaks to it, so I’d suggest using the absolute latest version and, if that does not resolve the problem, filing an issue[1] with details.

[1] https://issues.asterisk.org/jira

I will update to 13.21.0 to see if that helps.

Thanks a lot for your help.