I am having a problem with Asterisk 188.8.131.52 where one of my servers is having difficulty cleaning up after one SIP channel executes a Dial() to another SIP channel within the extension logic, and then one of the two channels hangs up.
I’m guessing it has something to do with my extension logic itself since other Asterisk servers I’m running (and with much simpler dialplans) at the same version level do not have this problem, and their sip.conf settings are relatively similar. But I’ll be darned if I can figure out what it is.
The problem is that the extensions.conf for the particular server in question is a huge, sprawling thing that hooks into SQL databases and HTTP datastores using func_curl, so it wouldn’t be prudent to post it here, nor do I suspect it would be easy for anyone to debug without remote access to the server. What I’m hoping is that someone can point me in the right direction given the pertinent clues.
Here’s what I can tell you:
- I’m only dealing with SIP channels and the SIP channel driver.
- I’m not even using chan_local anywhere.
- I have some macros but I’m basically using them as poor-man’s subroutines.
- I don’t Dial() from within Macros.
- I have no ‘h’ extensions defined in any contexts.
…and here are the symptoms:
When a call (either inbound or outbound…doesn’t matter which leg of the call) is in progress, and one of the legs hangs up, Asterisk just sits there like a lump without sending a BYE message to the remaining active channel.
If you wait and watch, and the remaining channel keeps its RTP stream to Asterisk open (phone off-hook still), the remaining leg will FINALLY get sent a BYE from Asterisk exactly 30 seconds after the first leg hangs up. It doesn’t matter how long the call lasted (seconds or hours), you can count on needing to wait 30 seconds from the instant the first channel is dead. When this happens, the channel is cleaned up as expected.
If you hang up the remaining channel (put phone back on-hook) BEFORE 30 seconds are up, Asterisk really doesn’t like this. A few seconds after you do this, you will see a “WARNING: chan_sip.c: __sip_autodestruct: Autodestruct on dialog ‘[…]’ with owner in place (Method: BYE)” show on the console, and the BYE dialog will remain “stuck” in ‘sip show channels’ until you restart Asterisk. After a while, these really begin to accumulate.
It’s not a hangup detection issue…if I add a simple one-liner ‘h’ extension in the appropriate context that simply calls Wait(0), I see that it gets executed right when the first leg calls it quits. (Besides, I’m not bridging to a Zaptel/DAHDI interface or anything else that can’t always detect hangup events properly or where they are ambiguous.) But it is exactly 30 seconds after the ‘h’ extension has finished executing that I finally see my “Spawn extension ([context], [exten], [#]) exited non-zero on '[channel]'” (assuming the remaining channel hasn’t also called it quits before the 30 seconds are up).
The question is, of course, what the heck could be causing Asterisk to wait 30 seconds between when it gets the BYE from the first channel and when it decides to send a BYE to the remaining open channel. I’ve scoured my extensions.conf file and have no explanation for this. My understanding from my research so far is that execution of the dialplan is blocked by Dial() until the Dial application has exited (when one of the parties hangs up/says BYE), at which point it jumps to the local ‘h’ extension before it hangs up on the other party, and that you can sometimes gum things up by taking your sweet time within ‘h’. Well, I have no ‘h’ extension except when I put in a dummy one for debugging (the aforementioned “exten => h,1,Wait(0)”).
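A minimal sketch of the debugging setup described above, for anyone who wants to picture it. The context name, the extension pattern, and the NoOp() logging lines are hypothetical stand-ins, not taken from the real dialplan; only the Dial() to a SIP channel and the dummy ‘h’ extension with Wait(0) come from the description.

```
; Hypothetical context and pattern; the real dialplan is far larger.
[sip-debug-example]
exten => _X.,1,NoOp(Dialing ${EXTEN} from ${CHANNEL})
exten => _X.,n,Dial(SIP/${EXTEN})

; Dummy 'h' extension used only for debugging: it logs the channel
; and hangup cause, then calls Wait(0) as described above.
exten => h,1,NoOp(h reached on ${CHANNEL} with HANGUPCAUSE=${HANGUPCAUSE})
exten => h,n,Wait(0)
```

With something like this in place, the console timestamp of the NoOp() line in ‘h’ can be compared against the later “Spawn extension … exited non-zero” line to confirm the 30-second gap.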
Any ideas what could cause something like this or what I can be looking for? Any help is much appreciated!