Seeming random lockup of Asterisk

I have been seeing the pbx latch up at what appears to be random times. I’m not an expert by any means, but looking at the “full” log files I’m seeing a couple of things that seem indicative of an issue that I don’t know the source of. First, here’s the pbx info…

PBX
Asterisk: 1.2.18
Trixbox: V2.2
Kernel Version: 2.6.9-34.0.2.ELsmp (SMP)
Distro Name: CentOS release 4.4 (Final)

Hardware
Processors 2
Model Intel® Pentium® D CPU 2.66GHz
CPU Speed 2.67 GHz
Cache Size 1024 KB
System Bogomips 10674.09

Implementation
32 extensions with DID’s, VoIP supplied by Cbeyond

Here are excerps from the log files @ /var/log/asterisk/full that seem odd. First there are literally thousands of the following entries that occur throughout the day, even in the early morning hours. This seems independent of whether the PBX has active traffic or not.

Jun 25 04:04:19 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘2499b8fa078d121042ef11fa35a4a0dd@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:20 DEBUG[3340] chan_sip.c: Auto destroying call '7dc58f4147618d8e@10.0.0.122’
Jun 25 04:04:26 DEBUG[3340] chan_sip.c: Auto destroying call '23deed48753a709f620ebe9b23fbd700@127.0.0.1’
Jun 25 04:04:34 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘55543bbc7879455f39e48240630ddfb5@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:38 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘61f54b6b2b2a184719e70a9279239629@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:42 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘2adfe32b3bf185c055a92d803ad2d1fd@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:45 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘61ecdaeb107fc9d34d90082a4a2dc78b@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:45 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘33e30a510a2f8fb13347fc3a3a5f60b3@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:45 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘635c619409c1c00f08555b8874fa747a@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:45 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘4f0f44ba4d539dbe46ec94726c046750@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:46 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘4e687cf14f5c06446485ff84619da836@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:46 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘45a6feff060531111161b16f24e1005f@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:52 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘56fe87254872bfc2185bbec9280113eb@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:56 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘30c4021247f087fe29ffa55d30d95c14@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:56 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘55a53d8c1e3765fa6b894cf0638dbccc@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:57 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘6ef2e1e23f7ad7205585288c7ce2682c@10.0.0.10’ of Request 102: Match Found
Jun 25 04:04:58 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘49a48d5242de07147809069602dd3398@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘19cd964552baabb915d750df0c3bf23e@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘4ab2711019461e8b719b8c1452465526@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘31d47f042082c7655c84ad855f732816@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘6d9f76521c6063ee5ba11b5e6871daf8@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘5b874e4d1af7b06a22dad2494319b2f1@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘08e6ae4b45e6a29c4ad71eeb79c74098@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘00a582714929b50917b318b1779e2dc9@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘634f144231b51e3a3f95ff5322fc08f7@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘0967df8c23c7c4180d83aee95bc5993d@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘5226abdb7a123e8e4035373b71e17cd8@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘2955282334fa53eb0559270e034ae42e@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘1c10ee49187b31893575e6d1221f00e7@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:09 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘11a4c1e31cfe69671ea63480042dd28d@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:20 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘097b92d5630c512e0be1cd0e56b2e2e9@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:34 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘027289d7098e09196dc0a93b308b261a@10.0.0.10’ of Request 102: Match Found
Jun 25 04:05:38 DEBUG[3340] chan_sip.c: Scheduled a registration timeout for sipconnect.chi0.cbeyond.net id #154115
Jun 25 04:05:38 DEBUG[3340] chan_sip.c: Stopping retransmission on ‘23deed48753a709f620ebe9b23fbd700@127.0.0.1’ of Request 1039: Match Found
Jun 25 04:05:38 DEBUG[3340] chan_sip.c: Registration successful
Jun 25 04:05:38 DEBUG[3340] chan_sip.c: Cancelling timeout 154115

Like I said there are thousands of these debug statements, and I’m not sure why, but this doesn’t seem like it would be expected behavior especially given that they’re happening when no one would be using the system.

Second, prior to the latch up and reboot of the pbx (usually done by the customer) I’m seeing the following…

Jul 2 10:46:44 DEBUG[21617] channel.c: Dropping voice to exceptionally long queue on Local/424@from-internal-849e,2
Jul 2 10:46:44 DEBUG[3335] channel.c: Avoiding initial deadlock for 'SIP/424-0a1b6e50’
Jul 2 10:46:44 DEBUG[3335] channel.c: Avoiding initial deadlock for 'SIP/424-0a1b6e50’
Jul 2 10:46:44 DEBUG[3335] channel.c: Avoiding initial deadlock for 'SIP/424-0a1b6e50’
Jul 2 10:46:44 DEBUG[3335] channel.c: Avoiding initial deadlock for 'SIP/424-0a1b6e50’
Jul 2 10:46:44 DEBUG[3335] channel.c: Avoiding initial deadlock for 'SIP/424-0a1b6e50’
Jul 2 10:46:44 DEBUG[3335] channel.c: Avoiding initial deadlock for 'SIP/424-0a1b6e50’
Jul 2 10:46:44 DEBUG[3335] channel.c: Avoiding initial deadlock for 'SIP/424-0a1b6e50’
Jul 2 10:46:44 DEBUG[3335] channel.c: Avoiding initial deadlock for 'SIP/424-0a1b6e50’
Jul 2 10:46:44 WARNING[3335] channel.c: Avoided initial deadlock for ‘0xa2de2e8’, 10 retries!

The following lines occur over 3400 times after that…

Jul 2 10:47:53 DEBUG[21617] channel.c: Dropping voice to exceptionally long queue on Local/424@from-internal-849e,2
Jul 2 10:48:14 WARNING[21917] interface.c: Junk at the beginning of frame 54414730
Jul 2 10:49:11 WARNING[21808] interface.c: Junk at the beginning of frame 54414730
Jul 2 10:49:57 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:49:57 DEBUG[21954] channel.c: Scheduling timer at 160 sample intervals
Jul 2 10:49:57 DEBUG[21954] channel.c: Scheduling timer at 128 sample intervals
Jul 2 10:49:57 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:49:57 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:49:57 DEBUG[21954] channel.c: Scheduling timer at 160 sample intervals
Jul 2 10:50:04 DEBUG[21954] channel.c: Scheduling timer at 14 sample intervals
Jul 2 10:50:04 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:04 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:04 DEBUG[21954] channel.c: Scheduling timer at 160 sample intervals
Jul 2 10:50:06 DEBUG[21954] channel.c: Scheduling timer at 136 sample intervals
Jul 2 10:50:06 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:06 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:12 DEBUG[21954] channel.c: Scheduling timer at 160 sample intervals
Jul 2 10:50:18 DEBUG[21954] channel.c: Scheduling timer at 14 sample intervals
Jul 2 10:50:18 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:18 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:18 DEBUG[21954] channel.c: Scheduling timer at 160 sample intervals
Jul 2 10:50:20 DEBUG[21954] channel.c: Scheduling timer at 136 sample intervals
Jul 2 10:50:20 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:20 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:27 DEBUG[21954] channel.c: Scheduling timer at 160 sample intervals
Jul 2 10:50:33 DEBUG[21954] channel.c: Scheduling timer at 14 sample intervals
Jul 2 10:50:33 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:33 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:33 DEBUG[21954] channel.c: Scheduling timer at 160 sample intervals
Jul 2 10:50:35 DEBUG[21954] channel.c: Scheduling timer at 136 sample intervals
Jul 2 10:50:35 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:35 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:42 DEBUG[21954] channel.c: Scheduling timer at 160 sample intervals
Jul 2 10:50:48 DEBUG[21954] channel.c: Scheduling timer at 14 sample intervals
Jul 2 10:50:48 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:48 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:48 DEBUG[21954] channel.c: Scheduling timer at 160 sample intervals
Jul 2 10:50:50 DEBUG[21954] channel.c: Scheduling timer at 136 sample intervals
Jul 2 10:50:50 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:50 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:56 DEBUG[21954] channel.c: Scheduling timer at 160 sample intervals
Jul 2 10:50:57 DEBUG[21954] channel.c: Scheduling timer at 154 sample intervals
Jul 2 10:50:57 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:57 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:57 DEBUG[21954] app.c: Locked path ‘/var/spool/asterisk/voicemail/default/424/INBOX’
Jul 2 10:50:57 DEBUG[21954] app.c: Unlocked path ‘/var/spool/asterisk/voicemail/default/424/INBOX’
Jul 2 10:50:57 DEBUG[21954] app_macro.c: Executed application: Voicemail
Jul 2 10:50:57 DEBUG[21954] app_macro.c: Executed application: Goto
Jul 2 10:50:57 DEBUG[21954] channel.c: Scheduling timer at 160 sample intervals
Jul 2 10:50:58 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:58 DEBUG[21954] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:50:58 DEBUG[21954] app_macro.c: Executed application: Playback
Jul 2 10:50:58 DEBUG[21954] cdr_addon_mysql.c: cdr_mysql: inserting a CDR record.
Jul 2 10:50:58 DEBUG[21954] cdr_addon_mysql.c: cdr_mysql: SQL command as follows: INSERT INTO cdr (calldate,clid,src,dst,dcontext,channel,dstchannel,lastapp,lastdata,duration,billsec,disposition,amaflags,accountcode,uniqueid) VALUES (‘2008-07-02 10:46:24’,’“Tania Zawadzki” <710>’,‘710’,‘424’,‘from-internal’, ‘SIP/710-b7a162b0’,‘SIP/424-b6d1d4d8’,‘Hangup’,’’,274,254,‘ANSWERED’,3,’’,‘1215013584.7980’)
Jul 2 10:50:58 DEBUG[21954] chan_sip.c: update_call_counter(710) - decrement call limit counter
Jul 2 10:51:25 DEBUG[21950] chan_sip.c: update_call_counter(13122960127) - decrement call limit counter
Jul 2 10:51:25 DEBUG[21950] chan_sip.c: Acked pending invite 103
Jul 2 10:51:25 DEBUG[21950] chan_sip.c: Stopping retransmission on '60b307d11e2e7df035250dec3ef24b84@sipco … beyond.net’ of Request 103: Match Found
Jul 2 10:51:25 DEBUG[21950] app_dial.c: Exiting with DIALSTATUS=NOANSWER.
Jul 2 10:51:25 DEBUG[21950] app_macro.c: Executed application: Dial
Jul 2 10:51:25 DEBUG[21950] app_macro.c: Executed application: Goto
Jul 2 10:51:25 DEBUG[21950] app_macro.c: Executed application: NoOp
Jul 2 10:51:25 DEBUG[21950] channel.c: Prodding channel ‘SIP/478-b7a02c58’
Jul 2 10:51:25 DEBUG[21950] channel.c: Scheduling timer at 160 sample intervals
Jul 2 10:51:25 DEBUG[21950] app_macro.c: Executed application: Playtones
Jul 2 10:51:25 DEBUG[21950] channel.c: Scheduling timer at 0 sample intervals
Jul 2 10:51:25 DEBUG[21950] cdr_addon_mysql.c: cdr_mysql: inserting a CDR record.
Jul 2 10:51:25 DEBUG[21950] cdr_addon_mysql.c: cdr_mysql: SQL command as follows: INSERT INTO cdr (calldate,clid,src,dst,dcontext,channel,dstchannel,lastapp,lastdata,duration,billsec,disposition,amaflags,accountcode,uniqueid) VALUES (‘2008-07-02 10:46:22’,‘7732901478’,‘7732901478’,‘913122960127’,‘from-internal’, ‘SIP/478-b7a02c58’,‘SIP/7739898100-0a1ecc00’,‘ResetCDR’,‘w’,303,0,‘NO ANSWER’,3,’’,‘1215013582.7978’)
Jul 2 10:51:25 DEBUG[21950] app_macro.c: Executed application: ResetCDR
Jul 2 10:51:25 NOTICE[21950] cdr.c: CDR on channel ‘SIP/478-b7a02c58’ not posted
Jul 2 10:51:25 NOTICE[21950] cdr.c: CDR on channel ‘SIP/478-b7a02c58’ lacks end
Jul 2 10:51:25 DEBUG[21950] app_macro.c: Executed application: NoCDR
Jul 2 10:51:25 DEBUG[21950] pbx.c: Expression result is '1’
Jul 2 10:51:25 DEBUG[21950] app_macro.c: Executed application: GotoIf
Jul 2 10:51:25 DEBUG[21950] pbx.c: Expression result is '1’
Jul 2 10:51:25 DEBUG[21950] app_macro.c: Executed application: GotoIf
Jul 2 10:51:25 DEBUG[21950] chan_sip.c: update_call_counter(478) - decrement call limit counter
Jul 2 10:52:30 WARNING[21808] interface.c: Junk at the beginning of frame 54414730
Jul 2 10:57:57 DEBUG[21617] channel.c: Nobody there, continuing…
Jul 2 10:57:57 DEBUG[21808] channel.c: Nobody there, continuing…
Jul 2 10:57:57 DEBUG[21866] channel.c: Nobody there, continuing…
Jul 2 10:57:57 DEBUG[21917] channel.c: Nobody there, continuing…
Jul 2 11:00:40 VERBOSE[3327] logger.c: Asterisk Event Logger Started /var/log/asterisk/event_log
Jul 2 11:00:40 VERBOSE[3327] logger.c: Asterisk Dynamic Loader loading preload modules:
Jul 2 11:00:40 VERBOSE[3327] logger.c: Asterisk Management interface listening on port 5038

The entries that follow seem to be due to the reboot of the system.

I sure would appreciate it if someone could provide some guidance on this, like I said I’m somewhat new to Asterisk, and I apologize if I’ve missed something obvious, and to the length of this post but I wanted to provide as much info as I could… Please feel free to point my nose in the right direction. Also if any additional information would be helpful just ask and I’ll do my best to provide it.

Thanks in advance…

Hi

The only time I have seen the long queue message is when there are network issues.

heres the relevent code

/* Allow up to 96 voice frames outstanding, and up to 128 total frames */ 00929 if (((fin->frametype == AST_FRAME_VOICE) && (qlen > 96)) || (qlen > 128)) { 00930 if (fin->frametype != AST_FRAME_VOICE) { 00931 ast_log(LOG_WARNING, "Exceptionally long queue length queuing to %s\n", chan->name); 00932 ast_assert(fin->frametype == AST_FRAME_VOICE); 00933 } else { 00934 if (option_debug) 00935 ast_log(LOG_DEBUG, "Dropping voice to exceptionally long queue on %s\n", chan->name); 00936 ast_frfree(f); 00937 ast_channel_unlock(chan); 00938 return 0;

and the same goes for the retransmissions.

I think you need to check the network

Ian

[~trixbox] trixbox is a full linux distro that includes FreePBX and other 3rd party add-ons. It is all this extra stuff which makes trixbox VERY difficult to support, and is not supported in #asterisk. Try asking in #trixbox or on their forums & wiki at trixbox.org

that is straight from IRC. I am not trying to be mean but it is hard for everyone to deal with trixbox because of the extra software put into it

I’ve tried posting to both forums with no replies on either. It seems there’s some brew-haha happening between FreePBX and Trixbox, apparently as organizations they don’t work or play well together. I’m holding out hope that someone will respond, but so far not so good.

Thanks for the suggestions on checking the network, this PBX did have NIC issues about a month ago, but I thought that was corrected when the NIC was replaced… Well the major problem of the PBX going down was corrected, perhaps that wasn’t the only ghost in the machine. Any quick suggestions on the manner of testing I should employ to determine if the network is at fault?

I’ve “inherited” this pbx from another support person, and have found numerous things incorrectly configured, no regular backups, a ‘server’ class machine thats more of a high powered workstation that a real server (no RAID, no redundant power supplies or NICS), along with a rather upset customer. Needless to say it’s a mess. On top of that I’m just cutting my teeth on VoIP and not as of yet anywhere near an expert.

Oh well nuff whining, as it’s said “No problems no job.”

Thanks for taking the time to respond,
Ah

No its not, you just have to logical and look at each bit in on its own.

its still Asterisk and FreePBX in the end.

Ian

This really does sound like network connectivity\latency issues.

Most networking issues are found at the physical layer. You said you already swapped NIC’s but with what type was it a quality card or was it a bargin basement $10 card?

Have you tested and or replaced your network cables?

What are you running for hubs/switches?

Have you run any type of network sniffers to see whats up? Wireshark is a decent free one but personally I use and love Network Intruments Observer - definately not free but well worth it for the anaylsis help and excellent tech support.

Whats your connection to the internet?

Actually it’s an older 3Com card, and I’m thinking that may be suspect #1. I’ve read about the issues with NIC’s using the Realtek chipset, and as a general rule I avoid them in PBX implementations, thought the last support person did not. Now I’m finding out that the customer isn’t sure whether they want to change support vendors or not. Funny how well “spin” works on some people if you know what I mean. So really THAT’S the first issue I’d say… I’ve played the tech support “tug-o-war” before, and I’m not going there again, it’s us or them, your decision please Mr./Ms. customer…

But to answer your other questions…

The network wiring is newly installed and was tested at that time, at least from the standpoint of the PBX/switch/router physical layer. The switch is a 48 port 10/100/1000 (4 ports at 1G) Trendnet switch. The provider of both the connection to the cloud and the SIP accounts is Cbeyond.

Thanks for the recommendation on the Wireshark packet sniffer I’ve dl’ed it and am going to check it out.

Thanks for taking the time to respond with the helpful suggestions.
Ah.