Asterisk crashes due to Memory Allocation Failure on 2 boxes

NOTE: I’ve already searched multiple times through the forums and been unsuccessful in identifying any other threads that address this specific issue. Therefore, I have created a new thread.

NOTE: I am not a frequent user of forums so I’m still learning the do’s and don’ts of posting.

We have two different Asterisk servers running v1.6 which are randomly crashing. Both are running on very nice hardware – one, a production server running on an HP DL380 G2 (dual processor, 3GB RAM, 18GB RAID1) and the other a lab system running on a top-notch workstation tower (Asus mobo w/Intel Core 2 Duo, 2GB RAM and 80GB drive). Both servers are SIP only (no Digium or other hardware for TDM traffic).

When these events happen, it appears that Asterisk just crashes and the process terminates. The fix has simply been to go in and restart Asterisk (/usr/sbin/asterisk) which seems to work fine until the next time it “randomly” crashes.

I say random because these two systems are both under ZERO user load – one of the systems is a test/development system and the crashes seem to happen randomly in the middle of the day or night. I cannot find/see any rhyme or reason to them.

I also thought it might be related to v1.6.0 so we upgraded one of the systems to v1.6.1 and we continue to see the problem. I’ve copied several of the events from /var/log/asterisk/messages below for everyone to see. Primarily, the Asterisk process seems to crash (die) in function ast_log more than any other function (even though other functions are also listed below).

[May 3 14:23:09] ERROR[24980] /usr/src/asterisk-1.6.0/include/asterisk/utils.h: Memory Allocation Failure in function sip_alloc at line 5807 of chan_sip.c
[May 3 14:23:09] ERROR[24980] /usr/src/asterisk-1.6.0/include/asterisk/utils.h: Memory Allocation Failure in function ast_threadstorage_get at line 186 of /usr/src/asterisk-1.6.0/include/asterisk/threadstorage.h
[May 3 15:09:42] ERROR[24980] /usr/src/asterisk-1.6.0/include/asterisk/utils.h: Memory Allocation Failure in function sip_alloc at line 5807 of chan_sip.c
[May 3 15:09:42] ERROR[24980] /usr/src/asterisk-1.6.0/include/asterisk/utils.h: Memory Allocation Failure in function ast_threadstorage_get at line 186 of /usr/src/asterisk-1.6.0/include/asterisk/threadstorage.h
[May 3 15:09:43] ERROR[24980] /usr/src/asterisk-1.6.0/include/asterisk/utils.h: Memory Allocation Failure in function sip_alloc at line 5807 of chan_sip.c
[May 3 15:09:43] ERROR[24980] /usr/src/asterisk-1.6.0/include/asterisk/utils.h: Memory Allocation Failure in function ast_threadstorage_get at line 186 of /usr/src/asterisk-1.6.0/include/asterisk/threadstorage.h
[May 28 06:52:22] ERROR[3004] /usr/src/asterisk-1.6.0/include/asterisk/utils.h: Memory Allocation Failure in function ast_log at line 1049 of logger.c
[May 28 06:52:33] ERROR[3004] /usr/src/asterisk-1.6.0/include/asterisk/utils.h: Memory Allocation Failure in function ast_log at line 1049 of logger.c
[Jun 6 21:09:57] ERROR[21234] /usr/src/asterisk-1.6.0/include/asterisk/utils.h: Memory Allocation Failure in function ast_log at line 1049 of logger.c
[Jun 26 21:57:19] ERROR[3650] /usr/src/asterisk-1.6.1.0/include/asterisk/utils.h: Memory Allocation Failure in function ast_log at line 1145 of logger.c
[Jul 6 00:42:02] ERROR[16342] /usr/src/asterisk-1.6.1.0/include/asterisk/utils.h: Memory Allocation Failure in function __ao2_alloc at line 309 of astobj2.c
[Jul 6 00:42:14] ERROR[16342] /usr/src/asterisk-1.6.1.0/include/asterisk/utils.h: Memory Allocation Failure in function ast_log at line 1145 of logger.c
[Jul 6 00:42:23] ERROR[16342] /usr/src/asterisk-1.6.1.0/include/asterisk/utils.h: Memory Allocation Failure in function ast_log at line 1145 of logger.c
[Aug 2 21:28:43] ERROR[15799] /usr/src/asterisk-1.6.1.0/include/asterisk/utils.h: Memory Allocation Failure in function ast_event_new at line 936 of event.c
[Aug 2 21:33:44] ERROR[15819] /usr/src/asterisk-1.6.1.0/include/asterisk/utils.h: Memory Allocation Failure in function __ao2_alloc at line 309 of astobj2.c
[Aug 2 22:38:40] ERROR[15819] /usr/src/asterisk-1.6.1.0/include/asterisk/utils.h: Memory Allocation Failure in function __ao2_alloc at line 309 of astobj2.c
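To check which functions report the failure most often, the log excerpt above can be tallied by function name. This is only a sketch: the helper name `count_failures_by_function` is made up for illustration, and it assumes each log entry sits on a single line in the format shown above.

```shell
# Hypothetical helper: read Asterisk log lines on stdin and count
# "Memory Allocation Failure" entries per reporting function.
count_failures_by_function() {
    # Keep only the function name from matching lines, then tally.
    sed -n 's/.*Memory Allocation Failure in function \([^ ]*\) at line .*/\1/p' |
        sort | uniq -c | sort -rn
}
```

It could be run as `count_failures_by_function < /var/log/asterisk/messages`; the most frequent function appears first.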

I would assume it is a hardware problem except that it only seems to crash in function ast_log of utils.h on BOTH hardware boxes and it happens at times when there is NO traffic at all.

Any help that can be offered would be greatly appreciated. We are currently still evaluating Asterisk as an alternative to our current Mitel PBX and this issue is the only thing standing in the way of going live.

Thanks,
CARTER

The standard approach in cases like this is to follow the procedure in doc/valgrind.txt and submit a bug report to issues.asterisk.org (after checking for existing ones). You probably also want to follow the procedure in doc/backtrace.txt.

Have you excluded the trivial case that you are simply out of memory (e.g. hit a ulimit limit)?

1144 : /* Create a new logging message */
1145 : if (!(logmsg = ast_calloc(1, sizeof(*logmsg) + res + 1)))
1146 :     return;

Actually, if you have corruption, you will normally get a SIGABRT termination, so this does look like a simple out-of-memory case.
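One way to test the out-of-memory theory is to record the Asterisk process's memory use over time and see whether it grows before a failure. The following is a minimal sketch, assuming a Linux system where /proc/PID/status is readable; the function name `sample_mem` is hypothetical.

```shell
# Hypothetical helper: print a timestamped snapshot of a process's
# virtual size and resident size (both in kB) from /proc.
sample_mem() {
    pid=$1
    awk -v ts="$(date '+%Y-%m-%d %H:%M:%S')" '
        /^VmSize:/ { vsz = $2 }
        /^VmRSS:/  { rss = $2 }
        END { print ts, "VmSize_kB=" vsz, "VmRSS_kB=" rss }
    ' "/proc/$pid/status"
}
```

Calling it from cron every few minutes with the Asterisk PID and appending the output to a file would give a simple growth record to compare against the timestamps in the messages log.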

Hi

Make sure that you rotate your log files; we have seen crashes when the messages or full log file gets large.

Ian
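To see whether any log file has actually grown large, the log directory can be scanned for files above a size threshold. This is a sketch only: `large_logs` is a made-up helper name, and the directory /var/log/asterisk is taken from the path mentioned earlier in the thread.

```shell
# Hypothetical helper: list files under a directory that are larger
# than a given size in kB, with their sizes.
large_logs() {
    dir=$1
    min_kb=$2
    find "$dir" -type f -size +"${min_kb}k" -exec ls -l {} \;
}
```

For example, `large_logs /var/log/asterisk 102400` would list anything over roughly 100 MB. If something turns up, `asterisk -rx 'logger rotate'` should rotate the logs from the shell, assuming the running version provides that CLI command.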

First, thanks for the quick responses; I really appreciate everyone’s help.

So is it the case that I am running into a memory limit (RAM) or just a “memory” limit for the size of some actual log file somewhere? I wouldn’t expect a log file rotation to fix a RAM problem.

Unfortunately, I am a solid user of Linux, but not an expert or “sysadmin” (although I have root access and am the one who did the Asterisk installs). As such, I am not very familiar with ulimit (except to find out what OS version you are running?) or log rotation (although I understand the concept and use other programs that do/use log rotation).

I ran ulimit -a and got the following:

-bash-3.1$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
max nice                        (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 32762
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
max rt priority                 (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 32762
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
-bash-3.1$ 

Since several posts have indicated this is likely a memory issue (or lack thereof), I am going to hold off on the doc/valgrind.txt and doc/backtrace.txt procedures.

On the particular system in question above, I ran the free command and got the result below:

-bash-3.1$ free
             total       used       free     shared    buffers     cached
Mem:       1944316     933512    1010804          0     202380     441284
-/+ buffers/cache:     289848    1654468
Swap:      2031608          0    2031608
-bash-3.1$ 
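For reading the output above: in this older `free` layout, the memory that is really available to applications is free + buffers + cached, which is what the "-/+ buffers/cache" row reports. A small sketch of that arithmetic follows; the function name `mem_available_kb` is hypothetical, and the column positions assume the layout shown above (newer versions of `free` arrange the columns differently).

```shell
# Hypothetical helper: read `free` output (old procps layout) on stdin
# and print free + buffers + cached from the "Mem:" row, in kB.
mem_available_kb() {
    awk '/^Mem:/ { print $4 + $6 + $7 }'
}

# Using the Mem: row from the output above:
printf 'Mem:       1944316     933512    1010804          0     202380     441284\n' |
    mem_available_kb
# prints 1654468, matching the free column of the -/+ buffers/cache row
```

On the live system it would be run as `free | mem_available_kb`.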

I will welcome any suggestions or recommendations about how to rotate [which?] log files and how to use ulimit to correctly configure memory limits so that I can cure this problem.

I figure it may help/make a difference if you know what OS I am running, so here it is: I am running RHEL5 on the lab server (all excerpts in this thread are from this server) and CentOS 5 on the production server.

From the lab server:

-bash-3.1$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5 (Tikanga)
-bash-3.1$ cat /proc/version
Linux version 2.6.18-8.el5 (brewbuilder@ls20-bc2-14.build.redhat.com) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) #1 SMP Fri Jan 26 14:15:21 EST 2007
-bash-3.1$

From the production server:

-bash-3.2$ cat /etc/redhat-release
CentOS release 5.2 (Final)
-bash-3.2$ cat /proc/version
Linux version 2.6.18-92.el5 (mockbuild@builder16.centos.org) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-42)) #1 SMP Tue Jun 10 18:49:47 EDT 2008
-bash-3.2$

Is there anything else I could provide that would help you guys steer me in the right direction? I’m very familiar with the Linux man pages and am good with Google – so if I can understand the problem and what I need to do to fix it, I am optimistic I can find a way to make it happen. At the moment, I just don’t know what I am supposed to do with ulimit or which log file(s) need rotating.

CARTER

You don’t have a shortage of memory and your ulimits are unlimited for memory, so it could be some sort of corruption problem.

You have core dumps disabled, which will make debugging difficult if you actually have a crash.
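The "core file size 0" line in the ulimit output is what disables core dumps. A minimal sketch for checking and changing that follows; the function name `core_dumps_enabled` is made up, and raising the limit assumes the hard limit for the account allows it.

```shell
# Hypothetical helper: succeed if the current shell allows core files.
core_dumps_enabled() {
    [ "$(ulimit -c)" != "0" ]
}

if core_dumps_enabled; then
    echo "core dumps enabled (limit: $(ulimit -c))"
else
    echo "core dumps disabled; try: ulimit -c unlimited"
fi

# To apply it, raise the limit in the same shell that starts Asterisk,
# then start Asterisk with -g (the same flag used in the valgrind
# command line from doc/valgrind.txt):
#   ulimit -c unlimited
#   /usr/sbin/asterisk -g
```

The limit only applies to processes started from that shell afterwards, so an init script would need the `ulimit` line added before it launches Asterisk.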

Bump. I am reviewing doc/valgrind.txt and waiting for the crash to happen again so that I can provide additional information.

CARTER

Guys,

I followed the directions in doc/valgrind.txt to do a make menuselect (step #1) and add MALLOC_DEBUG and DONT_OPTIMIZE. After choosing save and exit, I did a make, then a make install (step #2).

Then I ran step #3 (valgrind --log-file-exactly=valgrind.txt asterisk -vvvvcg 2>malloc_debug.txt) and nothing happened: no errors or any other output, both as my non-root user and as superuser. I tried to connect to the remote console (/usr/sbin/asterisk -r) and it failed. Then I went back and updated the command to use the --log-file option (instead of the --log-file-exactly option) and I got the same behavior. I don’t understand what is happening or why and I don’t know where to look to find the answer. Here is a copy of the latest output:

[code]ostlap02*CLI> stop now
ostlap02*CLI>
Disconnected from Asterisk server
Executing last minute cleanups
-bash-3.1$ pwd
/usr/local/mti/scripts
-bash-3.1$ cd
-bash-3.1$ pwd
/var/lib/asterisk
-bash-3.1$ valgrind --log-file-exactly=valgrind.txt asterisk -vvvvcg 2>malloc_debug.txt
-bash-3.1$ /usr/sbin/asterisk -r
Asterisk 1.6.1.0, Copyright © 1999 - 2008 Digium, Inc. and others.
Created by Mark Spencer markster@digium.com
Asterisk comes with ABSOLUTELY NO WARRANTY; type ‘core show warranty’ for details.
This is free software, with components licensed under the GNU General Public
License version 2 and other licenses; you are welcome to redistribute it under
certain conditions. Type ‘core show license’ for details.

== Parsing ‘/etc/asterisk/extconfig.conf’: == Found
Unable to connect to remote asterisk (does /var/run/asterisk/asterisk.ctl exist?)
-bash-3.1$ valgrind --log-file asterisk -vvvvcg 2>malloc_debug.txt
-bash-3.1$ /usr/sbin/asterisk
-bash-3.1$ /usr/sbin/asterisk -r
Asterisk 1.6.1.0, Copyright © 1999 - 2008 Digium, Inc. and others.
Created by Mark Spencer markster@digium.com
Asterisk comes with ABSOLUTELY NO WARRANTY; type ‘core show warranty’ for details.
This is free software, with components licensed under the GNU General Public
License version 2 and other licenses; you are welcome to redistribute it under
certain conditions. Type ‘core show license’ for details.

== Parsing ‘/etc/asterisk/extconfig.conf’: == Found
Connected to Asterisk 1.6.1.0 currently running on ostlap02 (pid = 2426)
Verbosity is at least 4
ostlap02*CLI> stop now
ostlap02*CLI>
Disconnected from Asterisk server
Executing last minute cleanups
-bash-3.1$ su
Password:
[root@ostlap02 asterisk]# valgrind --log-file asterisk -vvvvcg 2>malloc_debug.txt
[root@ostlap02 asterisk]# /usr/sbin/asterisk -r
Asterisk 1.6.1.0, Copyright © 1999 - 2008 Digium, Inc. and others.
Created by Mark Spencer markster@digium.com
Asterisk comes with ABSOLUTELY NO WARRANTY; type ‘core show warranty’ for details.
This is free software, with components licensed under the GNU General Public
License version 2 and other licenses; you are welcome to redistribute it under
certain conditions. Type ‘core show license’ for details.

== Parsing ‘/etc/asterisk/extconfig.conf’: == Found
Unable to connect to remote asterisk (does /var/run/asterisk/asterisk.ctl exist?)
[root@ostlap02 asterisk]# cd /usr/sbin/
[root@ostlap02 sbin]# valgrind --log-file asterisk -vvvvcg 2>malloc_debug.txt
[root@ostlap02 sbin]# /usr/sbin/asterisk -r
Asterisk 1.6.1.0, Copyright © 1999 - 2008 Digium, Inc. and others.
Created by Mark Spencer markster@digium.com
Asterisk comes with ABSOLUTELY NO WARRANTY; type ‘core show warranty’ for details.
This is free software, with components licensed under the GNU General Public
License version 2 and other licenses; you are welcome to redistribute it under
certain conditions. Type ‘core show license’ for details.

== Parsing ‘/etc/asterisk/extconfig.conf’: == Found
Unable to connect to remote asterisk (does /var/run/asterisk/asterisk.ctl exist?)
[root@ostlap02 sbin]# exit
exit
-bash-3.1$ /usr/sbin/asterisk
-bash-3.1$ /usr/sbin/asterisk -r
Asterisk 1.6.1.0, Copyright © 1999 - 2008 Digium, Inc. and others.
Created by Mark Spencer markster@digium.com
Asterisk comes with ABSOLUTELY NO WARRANTY; type ‘core show warranty’ for details.
This is free software, with components licensed under the GNU General Public
License version 2 and other licenses; you are welcome to redistribute it under
certain conditions. Type ‘core show license’ for details.

== Parsing ‘/etc/asterisk/extconfig.conf’: == Found
Connected to Asterisk 1.6.1.0 currently running on ostlap02 (pid = 2534)
Verbosity is at least 4
ostlap02*CLI> exit
Executing last minute cleanups
-bash-3.1$
[/code]

So as you can see, I cannot figure out how to get asterisk to start using the valgrind command, but if I just run /usr/sbin/asterisk, it starts up fine (even as a regular user because I have asterisk configured to run as non-root).

Any additional help would be greatly appreciated.

Thanks,
CARTER

You don’t seem to have looked at the two log files.

David55 - Thanks! I didn’t understand what log files you were referring to at first, but I ended up figuring out you were referring to valgrind.txt and malloc_debug.txt. The malloc_debug.txt existed and said it couldn’t find asterisk. So I used the full path and it worked! :smiley:
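The failure described here (valgrind could not find `asterisk` because /usr/sbin was not on the PATH) can be caught before launching. A small sketch, with `resolve_binary` as a made-up helper name:

```shell
# Hypothetical helper: print the full path of a command if the shell
# can find it, otherwise report the PATH that was searched.
resolve_binary() {
    if path=$(command -v "$1" 2>/dev/null); then
        echo "$path"
    else
        echo "$1: not found in PATH ($PATH)" >&2
        return 1
    fi
}
```

If `resolve_binary asterisk` fails, passing the full path (/usr/sbin/asterisk) to valgrind, as was done here, avoids the problem. Because stderr is redirected with `2>malloc_debug.txt`, that file is also the first place to look when the command appears to do nothing.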

That said, it is currently running in console mode and I really want/need it to run as a service if possible. Is it a problem if I run the command as:

valgrind --log-file=valgrind.txt /usr/sbin/asterisk -vvvvg 2>malloc_debug.txt

instead of

valgrind --log-file=valgrind.txt /usr/sbin/asterisk -vvvvcg 2>malloc_debug.txt

which is how it is running right now? Essentially, I am asking whether valgrind requires asterisk to be running in console mode (instead of as a service).

Please advise.

Thanks!!
CARTER

David55 - After thinking about it a bit, the problem is less about being in console mode (I tried it without console mode, and it looks like it still worked) and more about being able to start/run asterisk (preferably from a remote/SSH session) and not be required to keep the remote/SSH session open.

Normally, we run asterisk as a daemon (/etc/init.d/asterisk start) via “/sbin/chkconfig asterisk on” at boot up. We also have a script scheduled in crontab that checks the output of “/etc/init.d/asterisk status” and reports back to us, via email, if it sees that asterisk is not “running” (could be stopped or crashed).

By starting asterisk through valgrind, the /etc/init.d script does not recognize it is running and we get notified every minute that asterisk is not “running”.

Of course, for now, I’ve simply commented out the cron job, but if there was a simple way to use valgrind with asterisk and still allow the cron job to work, that would be nice.
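One possible approach, sketched under assumptions: start valgrind detached from the terminal with `nohup`, and have the cron check look at the PID file instead of the init script's status. The helper name `pid_alive` is made up, and the PID file path /var/run/asterisk/asterisk.pid is an assumption based on the /var/run/asterisk/asterisk.ctl path shown earlier in the thread.

```shell
# Hypothetical helper: succeed if the PID recorded in a PID file
# belongs to a live process. kill -0 sends no signal; it only checks
# that the process exists and can be signalled by this user, so run
# the check as the same user as Asterisk (or as root).
pid_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Starting valgrind so that it survives the SSH session closing
# (same command line as above, wrapped in nohup and backgrounded):
#   nohup valgrind --log-file=valgrind.txt /usr/sbin/asterisk -vvvvg \
#       2>malloc_debug.txt >/dev/null &
#
# Possible cron check (PID file path is an assumption):
#   pid_alive /var/run/asterisk/asterisk.pid || echo "asterisk not running"
```

The check does not depend on the process name, which is likely why the init script's status test fails when Asterisk runs under valgrind.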

If you have any suggestions, please share. For now, asterisk is running through valgrind and I’m just waiting for a crash to occur. :smiley:

Thanks,
CARTER