I have Asterisk 16.20.0 installed on a CentOS 7 server running chan_pjsip.
Users connect to PJSIP extensions over TLS, and I also have SIP connectivity configured on PJSIP over UDP. I noticed that Asterisk restarted, and I found the error below in the Linux messages log.
Oct 29 12:45:14 ip-10-10-5-23 kernel: asterisk[1872]: segfault at 2 ip 0000000000000002 sp 00007f62e4f0cc88 error 14 in asterisk[400000+2d1000]
Oct 29 12:45:14 ip-10-10-5-23 abrt-hook-ccpp: Process 1701 (asterisk) of user 0 killed by SIGSEGV - dumping core
Also, many crashes are delayed from the underlying fault, so look for anything unusual being logged leading up to the crash.
Also, most crashes, in the field, happen when people are doing something unusual, so please say what was happening at the time, and any way in which your usage of Asterisk is unusual.
Finally note that 16.22.1 has two fixes for crashes associated with AEL reloads and one fix for crashes associated with Read().
@david551 - we did not do any AEL reload during that time.
Regarding "Getting a Backtrace - Asterisk Project - Asterisk Project Wiki": can this be run now, or did it need to be run at the moment of the crash? The crash happened 3 hours ago. Will we still be able to get the required data from the core dump and find the root cause?
During the crash, we did not observe anything unusual in Asterisk. We didn’t have this issue with chan_sip. Is there anything specific to chan_pjsip?
Thread 1 (Thread 0x7f62e4f0d700 (LWP 1872)):
#0 0x0000000000000002 in ()
#1 0x000000000059347e in ast_taskprocessor_execute (tps=tps@entry=0x1fa5260) at taskprocessor.c:1235
local = {local_data = 0x1fa29c0, data = 0x7f634c01c568}
t = 0x7f634c04a270
__PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#2 0x0000000000593520 in default_tps_processing_function (data=data@entry=0x1fa21d0) at taskprocessor.c:209
listener = 0x1fa21d0
tps = 0x1fa5260
pvt = 0x1f98030
sem_value = 0
__PRETTY_FUNCTION__ = "default_tps_processing_function"
#3 0x00000000005a2638 in dummy_start (data=<optimized out>) at utils.c:1428
__cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {33157920, -8714024614472384669, 0, 512000, 0, 140062724511488, 8778739609105137507, -8714024939614568605}, __mask_was_saved = 0}}, __pad = {0x7f62e4f0cdb0, 0x0, 0x0, 0x0}}
__cancel_arg = 0x7f62e4f0d700
__not_first_call = <optimized out>
ret = <optimized out>
a = {start_routine = 0x5934e0 <default_tps_processing_function>, data = 0x1fa21d0, name = 0x1f9f320 "default_tps_processing_function started at [ 226] taskprocessor.c default_listener_start()"}
__PRETTY_FUNCTION__ = "dummy_start"
#4 0x00007f636a6d6ea5 in start_thread () at /usr/lib64/libpthread.so.0
#5 0x00007f6369a769fd in clone () at /usr/lib64/libc.so.6
Asterisk wasn’t built for debugging, and it looks like a task processor request has been overwritten, which will make it very difficult to debug if you don’t know what was likely to be happening at the time.
It’s trying to do a callback, but the callback subroutine address is invalid: frame #0 shows 0x0000000000000002 rather than a real function address.
In principle, you might get some more information by running:
frame 1
print *t
in gdb, but I suspect that the whole of *t is zeroes, in which case you won’t be able to find out what task it was trying to do. Actually, I’m not sure that anything except the routine with the corrupted address knows what is being done.
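As a cross-check on the kernel log line from the first post, here is a minimal Python sketch that decodes it. It rests on two assumptions: that the kernel prints the `error` field in hexadecimal, and that the bits follow the x86 page-fault error code layout. The function name `decode_segfault` and the returned dictionary keys are made up for this illustration and are not part of any Asterisk or kernel tool.

```python
import re

# x86 page-fault error code bits: (meaning when clear, meaning when set).
# Assumption: the "error" field in the kernel message is this code, in hex.
PF_BITS = {
    0: ("page not present", "protection violation"),
    1: ("read", "write"),
    2: ("kernel mode", "user mode"),
    3: (None, "reserved bit set in page table"),
    4: (None, "instruction fetch"),
}

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Parse a kernel 'segfault at ...' line into numbers and readable flags."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    addr, ip, sp, err = (int(m.group(k), 16) for k in ("addr", "ip", "sp", "err"))
    flags = []
    for bit, (when_clear, when_set) in PF_BITS.items():
        text = when_set if err & (1 << bit) else when_clear
        if text:
            flags.append(text)
    return {
        "addr": addr,
        "ip": ip,
        "sp": sp,
        "error": err,
        "flags": flags,
        # Fault address equal to the instruction pointer on an instruction
        # fetch suggests a jump or call to a bad address.
        "jumped_to_bad_address": addr == ip and bool(err & 0x10),
    }


if __name__ == "__main__":
    line = (
        "Oct 29 12:45:14 ip-10-10-5-23 kernel: asterisk[1872]: segfault at 2 "
        "ip 0000000000000002 sp 00007f62e4f0cc88 error 14 in asterisk[400000+2d1000]"
    )
    for key, value in decode_segfault(line).items():
        print(key, value)
```

Under those assumptions, the line from the first post decodes as a user-mode instruction fetch from a page that is not present, with the fault address equal to the instruction pointer. That agrees with frame #0 of the backtrace: the thread called through a pointer holding 0x2.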