Asterisk multiple processes

Hi,

Since my upgrade to Asterisk 1.2.12.1, each time I start Asterisk I see multiple processes:

32526 ? SLsl 0:00 /usr/sbin/asterisk -U asterisk
32536 ? Ss 0:00 /usr/sbin/asterisk -U asterisk
32538 ? Ss 0:00 /usr/sbin/asterisk -U asterisk

I don’t have this on another PC with the same:

  • OS: Gentoo Linux
  • zaptel version 1.2.8 (there is no ebuild for 1.2.9.1 and I was not able to get my own working)
  • asterisk version 1.2.12.1
  • /etc/init.d/asterisk
  • /etc/conf.d/zaptel
  • /etc/conf.d/asterisk
  • TDM411P card

but not the same configuration (/etc/asterisk/*).

I didn’t have this problem with Asterisk 1.2.11, and I suppose the problems I sometimes have now (Asterisk freezing…) are due to those multiple processes.

Any ideas, please?

PS : sorry for my bad english

I installed Asterisk on Debian: multiple processes too.
I installed Asterisk on FC5: single process.

Both work fine. I’m just curious about the number of processes.

Thanks flash.

Nothing more from the others?

Let’s wait for the experts. I’m a n00b too.

I have this issue:

run: ps -o user,pid,ppid,command -ax | grep "/usr/sbin/asterisk"

asterisk 3186 3159 /usr/sbin/asterisk -U asterisk -G asterisk -vvvg
asterisk 22219 3186 /usr/sbin/asterisk -U asterisk -G asterisk -vvvg

This shows that the parent process is spawning the child asterisk process.
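
The parent/child relationship visible in that listing can be picked out mechanically. This is only a sketch: the function name asterisk_children is made up for illustration, and it assumes the four-column layout (user, pid, ppid, command) produced by the ps command above.

```bash
#!/bin/bash
# Print the PIDs of asterisk processes whose parent is itself an asterisk
# process. Reads "user pid ppid command ..." lines on stdin, as produced by
# `ps -o user,pid,ppid,command -ax`. (asterisk_children is a hypothetical name.)
asterisk_children() {
    awk '
        # Remember pid -> ppid for every row whose command is .../asterisk
        $4 ~ /\/asterisk$/ { ppid[$2] = $3 }
        END {
            # A child is any asterisk pid whose parent is also in the table
            for (p in ppid)
                if (ppid[p] in ppid) print p
        }
    ' | sort -n
}
```

Fed the two lines shown above, `ps -o user,pid,ppid,command -ax | asterisk_children` would print 22219 and leave 3186 alone, since the parent of 3186 is not an asterisk process.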

I discovered that without a DID number for my main number, Asterisk spawned multiple copies of itself at start-up and at certain other junctures, for which I have not been able to find a reason in the Asterisk logs or /messages.

After I ensured that all incoming calls use a DID, this stopped happening at start-up. It is still a problem at other times, but with no apparent cause.

I have one T110p Digium card, Asterisk 1.2.12.1 installed, and run safe_asterisk on an rpath flavor box.

Best I can do for now is create a script that kills the errant process.

The one thing the errant process has is that it shows “0:00” in the TIME column of ps aux (strictly its accumulated CPU time rather than its up-time):

asterisk 3186 1.1 1.4 25464 14004 ? Sl Oct15 46:32 /usr/sbin/asterisk -U asterisk -G asterisk -vvvg
asterisk 22219 0.0 1.4 25344 13952 ? S 13:27 0:00 /usr/sbin/asterisk -U asterisk -G asterisk -vvvg

Ok, this happened again this afternoon, and I happened to be watching the cpu usage.

The cpu hit 99-100 % for perhaps 20 seconds, and two asterisk processes were spawned from the parent.

The script I wrote terminated them safely. I run it every five minutes, and it only kills one errant process at a time.

Here it is:

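#!/bin/bash

# Count ps lines showing 0:00 CPU time for /usr/sbin/asterisk.
# Note: the grep process itself can appear in the ps output and match,
# which is presumably why the threshold below is 1 rather than 0.
COUNT=$(ps aux | grep -c "0:00 /usr/sbin/asterisk")
#echo "asterisk process watch found $COUNT processes"
if [ "$COUNT" -gt 1 ]
then
    echo "asterisk process watch found $COUNT processes"
    DATA=$(ps aux | grep "0:00 /usr/sbin/asterisk")
    # Do nothing on this run if any matching line mentions php.
    PHPCOUNT=$(echo "$DATA" | grep -c php)
    if [ "$PHPCOUNT" -eq 0 ]
    then
        # Take the PID (second column) of the first matching line only.
        pid=$(echo "$DATA" | awk '/asterisk/{print $2; exit}')
        echo "Killing the new unwanted process: $pid"
        kill -9 "$pid"
        echo "Killed, here is the data"
        echo "$DATA"
    fi
fi
exit 0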

I could not view the processes that were peaking, as I was using vmstat. A tail of the asterisk and message logs does not tell me much.

So the cause is high processor usage, the error is * re-spawning itself, but the underlying cause of the spiking processor usage is still a mystery.

Jonathan Galpin

I am experiencing the exact same problems on a SLES 9.0 box. I first hit this when I performed a fresh load of 1.2.9. I then rolled back to 1.2.7 which has been running stable at other clients. It seemed to do much better but eventually it caused the same problem. I have recently upgraded to Asterisk 1.2.12.1 and it also does the same.

I’m using safe_asterisk from FreePBX and I will always start off with just one process and eventually I will end up with multiples.

I attached to the processes using gdb and it looks like a call recording check which has hung. I can only end these processes with a kill -9.

I’m also using FreePBX 2.1.2, which I know replaces some of the AGI scripts (which is where the problem could also be). Are you using FreePBX at all?

I have dug through bugs.digium.com but can’t find anything similar, so I am considering opening a bug.

my gdb trace:

(gdb) info thread
1 Thread 1111317424 (LWP 9581) 0xffffe410 in ?? ()
(gdb) thread apply all bt full

Thread 1 (Thread 1111317424 (LWP 9581)):
#0 0xffffe410 in ?? ()
No symbol table info available.
#1 0x423bfe8c in ?? ()
No symbol table info available.
#2 0x0000003f in ?? ()
No symbol table info available.
#3 0x40018000 in ?? ()
No symbol table info available.
#4 0x40185f9b in __write_nocancel () from /lib/tls/libc.so.6
No symbol table info available.
#5 0x40137576 in _IO_new_file_write () from /lib/tls/libc.so.6
No symbol table info available.
#6 0x40137275 in new_do_write () from /lib/tls/libc.so.6
No symbol table info available.
#7 0x4013752f in _IO_new_do_write () from /lib/tls/libc.so.6
No symbol table info available.
#8 0x40138158 in _IO_new_file_overflow () from /lib/tls/libc.so.6
No symbol table info available.
#9 0x4013748d in _IO_new_file_xsputn () from /lib/tls/libc.so.6
No symbol table info available.
#10 0x4012da92 in fputs () from /lib/tls/libc.so.6
No symbol table info available.
#11 0x080c17ed in console_verboser (
s=0x812fb80 " – Registered indication country ‘nl’\n", pos=0,
replace=0, complete=1) at asterisk.c:974
tmp = “\033[1;30;40m – \033[0;37;40m\0005\000\000\000\f\017\037@ \030\037@\200?\022\b??;B\024?\023@ \030\037@+\000\000\000\f\017\037@+\000\000\000\200?\022\b??;B\226\032\024@”
#12 0x080581db in ast_verbose (
fmt=0x8118788 " – Registered indication country ‘%s’\n")
at logger.c:904
stuff = " – Registered indication country ‘nl’\n\000f’: Found\n\000ons Configuration)\n\000l-52e4,2\033[0;37;40m", “\033[1;35;40mrecordingcheck|20061019-203720|1161283040.10050\033[0;37;40m”) in new stack\n\0000;37;40m") in new st"…
len = 42
replacelast = 0
complete = 1
olen = 0
m = Variable “m” is not available.
#0 0xffffe410 in ?? ()
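
A trace like the one above can also be captured non-interactively, which makes it easier to save to a file from a script. This is a sketch only: dump_backtrace and the GDB override variable are names invented here, and it assumes a gdb that accepts the -batch, -p and -ex options.

```bash
#!/bin/bash
# Dump full backtraces of every thread in the process with PID $1, without an
# interactive gdb session. Batch mode exits once the commands have run.
# GDB is a hypothetical override for the gdb binary to use.
dump_backtrace() {
    local pid=$1
    "${GDB:-gdb}" -batch -p "$pid" -ex "thread apply all bt full"
}
```

For example, `dump_backtrace 9581 > /tmp/asterisk-9581.bt 2>&1` (the output path is just an example) would write the trace for the process attached to above. Symbols will still be missing unless Asterisk and libc were built with debug information.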

Apologies for the multiple posts. I was getting a forum error (SQL Error 1271, “Illegal mix of collations for operation ‘IN’”, raised at line 251 of functions_search.php while the post was being indexed into phpbb_search_wordmatch), which led me to believe my post was not going through.

Ok, I still have the problem, but have a little more info to share in the hopes there is a fix.

I noticed that processes were spawned after soxmix ran and the processor peaked…so I stopped any recording. Soxmix merges two sound files after recording.

I believe I noticed the processor peaking and the processes spawning when tifftops was converting a TIFF to PostScript for printing…I run HylaFAX as well.

I am using FreePBX 2.1.1. I stopped the FOP to help isolate the issue, so I know it is not that.

Lately I have noticed the processes being spawned when the processor does not appear to peak and usage is very light such as after hours…

My script has occasionally killed the master asterisk, fortunately after-hours only, so no harm done, and safe_asterisk gets it up instantly…so the bash testing of the ps string is not always perfect.
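
One possible guard against killing the master is to compare the candidate PID with the one Asterisk records in its pid file before sending any signal. This is a sketch under assumptions: safe_to_kill is a made-up helper, and /var/run/asterisk.pid is assumed as the default location (the real path depends on the run directory set in asterisk.conf).

```bash
#!/bin/bash
# Succeed (exit status 0) only when PID $1 is NOT the PID stored in the pid
# file $2. The default pid file path is an assumption; adjust to your setup.
safe_to_kill() {
    local pid=$1 pidfile=${2:-/var/run/asterisk.pid}
    local master
    # No readable pid file: refuse rather than guess.
    [ -r "$pidfile" ] || return 1
    master=$(tr -d '[:space:]' < "$pidfile")
    [ -n "$master" ] && [ "$pid" != "$master" ]
}
```

In a watch script the kill would then become something like `safe_to_kill "$pid" && kill -9 "$pid"`, so a mis-parsed ps line pointing at the master is skipped.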

I created another script which checks every minute for the issue and emails me a tail of the message log and the asterisk log, but so far no useful info.
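
For anyone wanting to do the same, the log-collecting part might look roughly like this. It is not the script described above, only a sketch: log_tails is a hypothetical helper, and the log paths and mail address in the usage note are placeholders.

```bash
#!/bin/bash
# Print the last N lines of each log file given, with a header per file.
# Usage: log_tails N file...
log_tails() {
    local n=$1
    shift
    local f
    for f in "$@"; do
        echo "==== $f (last $n lines) ===="
        if [ -r "$f" ]; then
            tail -n "$n" "$f"
        else
            echo "(not readable)"
        fi
    done
}
```

Run from cron when an extra process is detected, the output could be piped to mail, for example `log_tails 50 /var/log/messages /var/log/asterisk/full | mail -s "asterisk watch" admin@example.com`.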

I have one T1 line, about 1600 faxes a week, and Linux-HA failover in place on a 1.3 GHz machine.

Jonathan

My suspicion is that there is a problem with the recordingcheck agi script that comes with FreePBX. I have bypassed the recording check in extensions.conf and will see how it goes after a week or so.

I did come across this link, freepbx.org/trac/changeset/2143#file0, which sounds very much like the problem we are seeing. I tried changing AGI to DeadAGI, but it did not help.

Bypassing recordingcheck did not solve my problem. dialparties.agi is now also producing additional asterisk processes. I have opened a ticket with FreePBX because at the moment it’s pointing at their scripts. There is, however, a bug (with a patch) on bugs.digium.com detailing similar problems.

freepbx.org/trac/ticket/1245
bugs.digium.com/view.php?id=8083

After some studying, I have concluded that the processes can also be spawned when the processor usage is between 65 and 75%.

That is with vmstat reporting every second. Yesterday, at least nine additional processes were spawned.
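
To line such readings up against the times the extra processes appear, the vmstat output can be timestamped whenever the CPU is busy. A rough sketch: cpu_spikes is an invented name, the 65% default simply mirrors the figure above, and it assumes the usual procps vmstat layout in which idle time ("id") is the 15th column.

```bash
#!/bin/bash
# Read `vmstat 1` output on stdin and print a timestamped line whenever CPU
# busy time (100 - idle) reaches the threshold in $1 (default 65).
# Assumes idle ("id") is the 15th column of each data row.
cpu_spikes() {
    local threshold=${1:-65} busy
    local -a f
    while read -r -a f; do
        # Skip the two header lines and anything too short to be a data row.
        [[ ${f[0]} =~ ^[0-9]+$ && ${#f[@]} -ge 15 ]] || continue
        busy=$(( 100 - f[14] ))
        if [ "$busy" -ge "$threshold" ]; then
            echo "$(date '+%Y-%m-%d %H:%M:%S') cpu busy ${busy}%"
        fi
    done
}
```

Something like `vmstat 1 | cpu_spikes 65 >> /tmp/cpu_spikes.log` (the log path is an example) leaves a record that can be compared with the times the watch script fires.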

I have tuned my script a little since posting it to better identify the additional processes. Here it is:

#!/bin/bash

# Count ps lines showing 0:00 CPU time for the exact asterisk command line.
# Note: the grep process itself can appear in the ps output and match.
COUNT=$(ps aux | grep -c "0:00 /usr/sbin/asterisk -U asterisk -G asterisk -vvvg")
#echo "asterisk process watch found $COUNT processes"
if [ "$COUNT" -gt 1 ]
then
    DATA=$(ps aux | grep "0:00 /usr/sbin/asterisk -U asterisk -G asterisk -vvvg")
    PHPCOUNT=$(echo "$DATA" | grep -c php)
    if [ "$PHPCOUNT" -eq 0 ]
    then
        # The checks below all look at the first matching line only.
        user=$(echo "$DATA" | awk '/asterisk/{print $1; exit}')
        if [ "$user" == "root" ]; then
            exit 0
        fi
        if [ -z "$DATA" ]; then
            exit 0
        fi
        # Column 10 of ps aux is TIME (accumulated CPU time).
        uptime=$(echo "$DATA" | awk '/asterisk/{print $10; exit}')
        if [ "$uptime" != "0:00" ]; then
            exit 0
        fi
        echo "asterisk process watch found $COUNT processes"
        echo "uptime is $uptime"
        pid=$(echo "$DATA" | awk '/asterisk/{print $2; exit}')
        echo "Killing the new unwanted process: $pid"
        kill -9 "$pid"
        echo "Killed, here is the data"
        echo "$DATA"
    fi
fi
exit 0

Jonathan Galpin

I have been following this ticket and it seems the person who reported it claims the patch has fixed their problem. I have applied the patch to my Asterisk setup and so far it’s working for me too.

bugs.digium.com/view.php?id=8083

Hi emilec

Is there any reason why the patch cannot be applied to the * 1.2.13 version currently out?

Thanks for the pointer to the bug patch.

Jonathan

If I understand the bug post correctly, it was committed to the 1.2.12 tree, so it should be in 1.2.13. But it should be easy enough to check.

I have already had a process spawned this am, and over the weekend, so it is either not in the tree, or is not preventing the issue.

That patch isn’t in 1.2.13, but it’s easy enough to apply manually and recompile.

Thanks,

I applied the patch last night, and can say that this am, the additional processes have been increasing in their vigour.

So much so, that I had to start checking for them every 2 minutes with my cron job.

I did notice a dialplan ring group that might have contributed. Basically, for a ring group of fax modems, I had the group fail over to itself, the same ring group, meaning that the memory hunt over the three fax lines would loop continually.

Failing it over to one of the fax modems instead stopped that issue, and process creation has slowed to perhaps its normal rate.

So the patch exacerbated things…I’ll remove it tonight.

Jonathan

I am still digging into this. I have just found another bug post with a possible fix, which I will try today. We have now also been able to replicate this quite easily using a queue with a ring-all strategy and 6 static agents.

bugs.digium.com/view.php?id=8086

I have also found that when recording a system message the processes are reliably created.

They seem to come in pairs.

I believe the recording AGI might be the culprit?