Segfault in asterisk 1.6.1.0

Several times a day, we are getting a segfault on our Asterisk/FreePBX server, resulting in a restart. safe_asterisk does restart Asterisk, but we would like to find the cause of the issue and resolve it.

We are running CentOS 5.3 x86_64 (kernel 2.6.18-128.1.10.el5) with all current updates, Asterisk 1.6.1.0, and FreePBX 2.5.1. Asterisk and FreePBX were built by hand following the outline for CentOS 5.1 from the FreePBX website, with minor modifications to handle differences in 5.3. The server is a Dell 1850 with two dual-core Xeon processors at 3.20 GHz, 4 GB of RAM, and two 146 GB SCSI disks on the integrated 3i RAID controller in a RAID 1 configuration.

Each time it segfaults, it writes one line like the following to syslog:

[quote]asterisk[31794]: segfault at 00000000400d9fc8 rip 00000000004e1f11 rsp 00000000400d9fd0 error 6
asterisk[31865]: segfault at 0000000041c48fc8 rip 00000000004e1f11 rsp 0000000041c48fd0 error 6
asterisk[32109]: segfault at 0000000041e85fc8 rip 00000000004e1f11 rsp 0000000041e85fd0 error 6
asterisk[32263]: segfault at 0000000041617fc8 rip 00000000004e1f11 rsp 0000000041617fd0 error 6[/quote]
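The syslog lines above can be unpacked a little further. The following is a minimal sketch, assuming the usual x86 page-fault error-code bit layout (bit 0 = protection fault, bit 1 = write access, bit 2 = user mode) and that the kernel prints the error value in hex. The function name `decode_segfault` is made up for this illustration and is not part of any Asterisk or kernel tool.

```python
import re

# Matches the kernel's x86_64 segfault line in the form shown in the syslog excerpt.
SEGV_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) rip (?P<rip>[0-9a-f]+) "
    r"rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Return a dict describing one segfault line, or None if it does not match."""
    m = SEGV_RE.search(line)
    if m is None:
        return None
    addr = int(m.group("addr"), 16)
    rsp = int(m.group("rsp"), 16)
    err = int(m.group("err"), 16)  # assumption: the kernel prints this value in hex
    return {
        "rip": int(m.group("rip"), 16),
        # Page-fault error code bits: 0 = protection fault (otherwise page not
        # present), 1 = write access (otherwise read), 2 = raised in user mode.
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
        # Positive value: the faulting address is this many bytes below rsp.
        "bytes_below_rsp": rsp - addr,
    }


if __name__ == "__main__":
    sample = ("asterisk[31794]: segfault at 00000000400d9fc8 "
              "rip 00000000004e1f11 rsp 00000000400d9fd0 error 6")
    print(decode_segfault(sample))
```

For all four lines quoted above, this decodes to a user-mode write to a page that is not mapped, 8 bytes below the stack pointer, at the same instruction address. A write just under rsp is the pattern one would typically see when a thread runs out of stack, although on its own it does not prove that.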
If I load the core file into gdb, I get the following (trimmed for your convenience):

[quote]Core was generated by `/usr/sbin/asterisk -f -U asterisk -G asterisk -vvvg -c'.
Program terminated with signal 11, Segmentation fault.
[New process 32712]
[New process 32571]
** Several other new processes listed by number here**
[New process 3508]
[New process 3504]
#0 0x00000000004e1f11 in tzload (name=0x5457c0 "posixrules", sp=0x4145c180, doextend=0) at stdtime/localtime.c:292
292 if ((strlen(p) + strlen(name) + 1) >= sizeof fullname)[/quote]
If I then issue a bt to get a backtrace, I get the following output:

[quote](gdb) bt
#0 0x00000000004e1f11 in tzload (name=0x5457c0 "posixrules", sp=0x41632170, doextend=0) at stdtime/localtime.c:292
#1 0x00000000004e16a9 in tzparse (name=0x41627d75 "", sp=0x41632170, lastditch=) at stdtime/localtime.c:811
#2 0x00000000004e2883 in tzload (name=, sp=0x2aaacc010190, doextend=1) at stdtime/localtime.c:450
#3 0x00000000004e2b1b in ast_tzset (zone=0x2aaac0449577 "UTC") at stdtime/localtime.c:1029
#4 0x00000000004e3d3c in ast_localtime (timep=0x41642d20, tmp=0x41632170, zone=0x0) at stdtime/localtime.c:1142
#5 0x00002aaac0441923 in leave_voicemail (chan=0x2aaac8005ab0, ext=, options=0x41642f30)
at app_voicemail.c:4354
#6 0x00002aaac04431d6 in vm_exec (chan=0x2aaac8005ab0, data=0x41645050) at app_voicemail.c:9513
#7 0x00000000004a729c in pbx_exec (c=0x2aaac8005ab0, app=0x2aaaac048360, data=0x41645050) at pbx.c:957
#8 0x00000000004b2300 in pbx_extension_helper (c=0x2aaac8005ab0, con=, context=0x2aaac8005e68 "macro-vm",
exten=0x2aaac8005eb8 "s-NOANSWER", priority=2, label=0x0, callerid=0x2aaac8004280 "intellyssiptrunk", action=E_SPAWN,
found=0x416483cc, combined_find_spawn=1) at pbx.c:3198
#9 0x00000000004b2820 in ast_spawn_extension (c=0x5457c0, context=0x0, exten=0x0, priority=1, callerid=,
found=, combined_find_spawn=1) at pbx.c:3648
#10 0x00002aaab8004a21 in _macro_exec (chan=0x2aaac8005ab0, data=0x2aaaac12bbc0, exclusive=0) at app_macro.c:335
#11 0x00000000004a729c in pbx_exec (c=0x2aaac8005ab0, app=0x1cbe5a60, data=0x4164a4e0) at pbx.c:957
#12 0x00000000004b2300 in pbx_extension_helper (c=0x2aaac8005ab0, con=, context=0x2aaac8005e68 "macro-vm",
exten=0x2aaac8005eb8 "s-NOANSWER", priority=18, label=0x0, callerid=0x2aaac8004280 "intellyssiptrunk", action=E_SPAWN,
found=0x4164d85c, combined_find_spawn=1) at pbx.c:3198
#13 0x00000000004b2820 in ast_spawn_extension (c=0x5457c0, context=0x0, exten=0x0, priority=1, callerid=,
found=, combined_find_spawn=1) at pbx.c:3648
#14 0x00002aaab8004a21 in _macro_exec (chan=0x2aaac8005ab0, data=0x2aaaac12fe60, exclusive=0) at app_macro.c:335
#15 0x00000000004a729c in pbx_exec (c=0x2aaac8005ab0, app=0x1cbe5a60, data=0x4164f970) at pbx.c:957
#16 0x00000000004b2300 in pbx_extension_helper (c=0x2aaac8005ab0, con=, context=0x2aaac8005e68 "macro-vm",
exten=0x2aaac8005eb8 "s-NOANSWER", priority=1, label=0x0, callerid=0x2aaac8004280 "intellyssiptrunk", action=E_SPAWN,
found=0x4165200c, combined_find_spawn=1) at pbx.c:3198
#17 0x00000000004b5453 in __ast_pbx_run (c=0x2aaac8005ab0, args=0x0) at pbx.c:3648
#18 0x00000000004b6a9b in pbx_thread (data=0x5457c0) at pbx.c:4024
#19 0x00000000004ef2fc in dummy_start (data=) at utils.c:968
#20 0x00000033f5206367 in start_thread () from /lib64/libpthread.so.0
#21 0x00000033f46d2f7d in clone () from /lib64/libc.so.6
(gdb)[/quote]
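For anyone comparing several core files, a backtrace like the one above can be reduced to a list of function names and source locations. This is only a rough sketch: `summarize_backtrace` is a hypothetical helper written for this post, and the regular expressions assume the plain `#N 0xADDR in func (...) at file:line` layout shown here, including frames whose `at file:line` part wraps onto the next line.

```python
import re
from collections import Counter

# Frame header as printed by gdb: "#N 0xADDR in function (args...)".
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+(?:0x[0-9a-f]+\s+in\s+)?(?P<func>[\w.]+)\s*\("
)
# Source location at the end of a line: "at file.c:123".
LOCATION_RE = re.compile(r"\bat (?P<file>[\w./+-]+):(?P<line>\d+)\s*$")


def summarize_backtrace(text):
    """Return one dict per frame with its number, function and source location."""
    frames = []
    for raw in text.splitlines():
        line = raw.strip()
        m = FRAME_RE.match(line)
        if m:
            frames.append(
                {"num": int(m.group("num")), "func": m.group("func"), "location": None}
            )
        # The location may sit on the frame line itself or on a wrapped line after it.
        loc = LOCATION_RE.search(line)
        if loc and frames and frames[-1]["location"] is None:
            frames[-1]["location"] = f'{loc.group("file")}:{loc.group("line")}'
    return frames


def count_functions(frames):
    """Count how often each function appears, to spot nested or repeated calls."""
    return Counter(frame["func"] for frame in frames)
```

Feeding each core file's `bt` output through `summarize_backtrace` and comparing the resulting lists makes it easy to confirm that the crashes share the same call path, and `count_functions` shows which functions appear more than once in the stack.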

Unfortunately the output of the debugger means exactly nothing to me. Any help would be greatly appreciated.

Oh, I have run the debugger against 4 or 5 random core files, and the output is always the same except for the value of 'sp=0x########'.

Thank you,
Tim

You need to rebuild with optimisation disabled (see the bug reporting guidelines) and provide the full required details on issues.asterisk.org.

However, the stack trace indicates that Asterisk appears not to be doing anything wrong, and it is libc that is crashing. What may well be the case is that you have serious memory corruption, in which case, when you submit the bug report, you will be asked to follow the procedures in valgrind.txt (hope that is the right name).

In the meantime, you may want to check whether there is anything weird about your time zone configuration.
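One quick way to look at the time zone files is to check that the ones named in the backtrace ("posixrules" and "UTC") exist and start with the standard `TZif` magic bytes. This is only a sketch under assumptions: the directory `/usr/share/zoneinfo` is the usual location on CentOS but may not be the one this Asterisk build was compiled to use, and both function names are invented for this example.

```python
import os


def is_tzif_file(path):
    """True if path is a readable file that starts with the 4-byte TZif magic."""
    try:
        with open(path, "rb") as fh:
            return fh.read(4) == b"TZif"
    except OSError:
        # Missing file, directory, or permission problem.
        return False


def check_zone_files(tzdir="/usr/share/zoneinfo", names=("posixrules", "UTC")):
    """Map each zone file name to whether it looks like a compiled zone file.

    tzdir is an assumed default; adjust it if the build uses another directory.
    """
    return {name: is_tzif_file(os.path.join(tzdir, name)) for name in names}


if __name__ == "__main__":
    print("/etc/localtime:", is_tzif_file("/etc/localtime"))
    for name, ok in check_zone_files().items():
        print(name + ":", ok)
```

A `False` for any of these would be worth following up, for example by reinstalling the distribution's tzdata package. A `True` only means the file header looks sane and does not rule out other causes.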

A long time ago, when we had just started to play with Asterisk, we ran into very serious problems. In the end we started checking the memory, and replacing the memory solved our problems.