Asterisk crashing every 2 hours

Hello,

We are using Asterisk 1.4.44.
For some reason, since 12.10 this month (no changes have been made),
Asterisk has been crashing every 2-3 hours.

I have a core file, analysed with gdb, which shows this error:

"
Core was generated by `/usr/sbin/asterisk -f -U asterisk -G asterisk -vvvg -c'.
Program terminated with signal 11, Segmentation fault.
#0 0x0807cd8b in ?? ()

"

Can someone tell me where the issue is? I know the answer lies in this line,
"#0 0x0807cd8b in ?? ()", but I don't have a clue how to read it.

We have no other errors in messages, full, or dmesg, and there is no overload on the CPU.
What more can I do?

Help, please.

Asterisk 1.4 has not been supported for quite some time. As for your backtrace it doesn’t really show any information I’m afraid.

Whilst it is true that 1.4.44 is too old for anyone to be interested, he didn’t actually obtain a backtrace. A backtrace may give a clue as to what to avoid. All he’s given is the message that is printed when gdb is started.
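For reference, a rough sketch of how a backtrace could be pulled out of a core file non-interactively. It only builds and prints a gdb command line using gdb's `-batch` and `-ex` options with the `bt full` and `thread apply all bt` commands. The helper name `gdb_backtrace_command` is made up for illustration and the core file path is a placeholder; the binary path is taken from the gdb banner above. Without debug symbols the frames may still show up as `??`.

```python
import shlex


def gdb_backtrace_command(binary, core):
    """Build a gdb batch command that prints backtraces from a core file.

    Hypothetical helper for illustration; it does not run gdb itself.
    """
    return [
        "gdb", "-batch",
        "-ex", "bt full",              # backtrace of the crashing thread, with locals
        "-ex", "thread apply all bt",  # backtrace of every thread
        binary, core,
    ]


# Placeholder paths: substitute your own binary and core file.
cmd = gdb_backtrace_command("/usr/sbin/asterisk", "/tmp/core.24810")
print(shlex.join(cmd))
```

Running the printed command in a shell and redirecting the output to a file should give something worth posting, rather than just the startup message.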

Thanks for the fast reply.
I understand it is not supported, but do you still have some leads I could check,
based on your experience?

If so, please give me some clue or lead that I
can check to see why it is happening.

These are the core files in /tmp/ for today:

-rw------- 1 asterisk asterisk 34013184 Oct 18 19:55 core.24810
-rw------- 1 asterisk asterisk 31006720 Oct 19 07:13 core.19685
-rw------- 1 asterisk asterisk 138383360 Oct 19 13:40 core.4270
-rw------- 1 asterisk asterisk 117116928 Oct 19 15:14 core.25909
-rw------- 1 asterisk asterisk 124432384 Oct 19 15:36 core.15748
-rw------- 1 asterisk asterisk 105873408 Oct 19 15:44 core.1189
-rw------- 1 asterisk asterisk 139321344 Oct 19 16:11 core.7251
-rw------- 1 asterisk asterisk 162299904 Oct 19 17:31 core.27889
-rw------- 1 asterisk asterisk 136986624 Oct 19 23:25 core.22940

I hope someone can help :(

Got some more info:

#0 0x0807cd8b in ast_channel_defer_dtmf (chan=0x0) at channel.c:1050
1050 ast_set_flag(chan, AST_FLAG_DEFER_DTMF);

Core was generated by `/usr/sbin/asterisk -f -U asterisk -G asterisk -vvvg -c'.
Program terminated with signal 11, Segmentation fault.
#0 0x009c8dc3 in strcasecmp () from /lib/libc.so.6
(gdb) where
#0 0x009c8dc3 in strcasecmp () from /lib/libc.so.6
#1 0x001ceec1 in ast_bridge_call (chan=0x8a68708, peer=, config=0x158acd4) at res_features.c:2659
#2 0x008d3a53 in dial_exec_full (chan=0x8a68708, data=, peerflags=0x158ae64, continue_exec=0x0) at app_dial.c:1894
#3 0x008d4b92 in dial_exec (chan=0x8a68708, data=0x158ced8) at app_dial.c:1942
#4 0x080cf42b in pbx_exec (c=0x8a68708, con=0x0, context=0x8a68888 "macro-dialout-trunk", exten=0x8a688d8 "s", priority=28, label=0x0,
callerid=0xb3a01528 "442036080253", action=E_SPAWN) at pbx.c:550
#5 pbx_extension_helper (c=0x8a68708, con=0x0, context=0x8a68888 "macro-dialout-trunk", exten=0x8a688d8 "s", priority=28, label=0x0,
callerid=0xb3a01528 "442036080253", action=E_SPAWN) at pbx.c:1893
#6 0x004f22d9 in _macro_exec (chan=0x8a68708, data=0x1591f38, exclusive=0) at app_macro.c:352
#7 0x080cf42b in pbx_exec (c=0x8a68708, con=0x0, context=0x8a68888 "macro-dialout-trunk", exten=0x8a688d8 "s", priority=5, label=0x0,
callerid=0xb33650a0 "\220O6\263", action=E_SPAWN) at pbx.c:550
#8 pbx_extension_helper (c=0x8a68708, con=0x0, context=0x8a68888 "macro-dialout-trunk", exten=0x8a688d8 "s", priority=5, label=0x0, callerid=0xb33650a0 "\220O6\263",
action=E_SPAWN) at pbx.c:1893
#9 0x080d1ceb in ast_spawn_extension (c=0x8a68708) at pbx.c:2367
#10 __ast_pbx_run (c=0x8a68708) at pbx.c:2461
#11 0x080d2dde in pbx_thread (data=0x8a68708) at pbx.c:2688
#12 0x08103bfb in dummy_start (data=0x83cf840) at utils.c:856
#13 0x00ac1912 in start_thread () from /lib/libpthread.so.0
#14 0x00a2c4ae in clone () from /lib/libc.so.6

The channel is NULL when it is not expected to be. Why - I have no idea. Things have changed quite a lot since then, and it’s likely we’ve fixed it in later versions.

I don’t think you can say that the channel is NULL from the information provided.

On the other hand where this has failed suggests that the primary fault isn’t on the thread that crashed. As such it doesn’t give much clue as to what was the trigger, although it is likely to be something like a channel redirection or forwarding.

I’d probably approach it by trying to work out what happens every 2 hours. There is nothing in Asterisk itself that naturally happens at that interval.
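One way to start on that is to look at the actual gaps between crashes. Below is a small sketch that takes the timestamps from the core file listing posted earlier in the thread and prints the minutes between consecutive core dumps. The year is an arbitrary placeholder, since the `ls` output doesn't show one, and the helper names are made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the core file listing earlier in the thread.
STAMPS = [
    "Oct 18 19:55", "Oct 19 07:13", "Oct 19 13:40", "Oct 19 15:14",
    "Oct 19 15:36", "Oct 19 15:44", "Oct 19 16:11", "Oct 19 17:31",
    "Oct 19 23:25",
]

ASSUMED_YEAR = 2000  # placeholder: the listing does not include the year
MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()


def parse(stamp):
    """Turn an 'Oct 19 07:13' style stamp into a datetime."""
    mon, day, hm = stamp.split()
    hour, minute = map(int, hm.split(":"))
    return datetime(ASSUMED_YEAR, MONTHS.index(mon) + 1, int(day), hour, minute)


def gaps_in_minutes(stamps):
    """Minutes between each pair of consecutive timestamps."""
    times = sorted(parse(s) for s in stamps)
    return [int((b - a).total_seconds() // 60) for a, b in zip(times, times[1:])]


for stamp, gap in zip(STAMPS[1:], gaps_in_minutes(STAMPS)):
    print(f"{stamp}: {gap} min since previous core")
```

On that listing the gaps come out anywhere between 8 minutes and a bit over 11 hours, so it may be worth comparing the crash times against call activity or scheduled jobs rather than looking for a fixed timer.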

The initial tiny backtrace showed a NULL channel when it was not expected to be and calling ast_set_flag with it will cause a crash. How it got there, dunno.

Well, thank you all.
What solved the crashes was recompiling Asterisk
with DONT_OPTIMIZE enabled in make menuselect.

For some reason this stopped the crashes.

Thanks!

That sounds like sloppy locking and synchronisation. Disabling optimisation forces the compiler to load late and store early, so, if there is an inadequate memory barrier, it may make operation more reliable. This may be a compiler memory barrier, but the other question is what hardware you are using. Intel hardware is very forgiving of cross-thread accesses without locks, but ARM is a completely different matter.

In any case, that Asterisk is so old that you are unlikely to be able to find and avoid the problematic part of the code.