Lost calls with Asterisk upgrade

I have a weird problem at one of our customers. They have had problems with lost calls, i.e. calls get dropped at random: sometimes after a minute, other times after half an hour. Most calls are not dropped, but it happens often enough to be a real problem.

We have looked through Asterisk's messages and can more or less only see that the calls get hung up: there is no actual error, warning or other indication that something is wrong.

We tried capturing the traffic with tcpdump and Wireshark, but while the capture was running the sound quality was so bad that we could not get anything out of the test.

So yesterday we set up a new Asterisk on a new computer (in case something was wrong with the software and/or hardware).
We went from OpenSuse 11.1 with Asterisk 1.6.0.15 to OpenSuse 11.2 (new kernel) with Asterisk 1.6.0.18. All files were copied from /etc/asterisk to the new computer. The old computer was removed from the network and the new one was plugged in with the same IP.

The customer began by making some test calls to check the functionality, and it seemed to work. A few employees sat calling and it worked as it should. But when most of the staff arrived, chaos followed…

With ONE exception, all agents' calls got cut off after connecting and a few seconds of talk, 50-60 seconds at most.

Obviously, something is terribly wrong with the setup. I'm thinking of some possible causes - please let me know what you think:

  1. Perhaps 1.6.0.18 interprets my AEL code differently than 1.6.0.15.
  2. The new kernel does not get along with the new Asterisk version.
  3. Some change made in the Asterisk code in 1.6.0.18 alters the behaviour of my dialplan. (I noticed 4 changes in app_Dial, for example.)
  4. Perhaps my code is bad and the changes in (3.) just made it fail every time.

They use X-Lite phones (latest version 3).

In my AEL code we have three possible call types. I think all of them had calls cut off; here is one example of a call type where I am certain it happened. They all use a Dial command with the macro option. The CURL commands are calls to our database.

[code]context macro-answer_power {
    s => {
        reply=${CURL(http://${IP}/fmi/xml/FMPXMLRESULT.xml?-db=cm_data&-lay=data_User&-script.prefind=RT_Relogin&-recid=${ARG1}&user_AsteriskNumStatus_t=ANSWER&user_AsteriskChannel_t=${CHAN}&user_AstUser=${EXTID}&-edit)};
    }
}

context out_cmpower_web {
    1 => {
        reply=${CURL(http://${IP}/fmi/xml/FMPXMLRESULT.xml?-db=cm_data&-lay=data_User&-script.prefind=RT_Relogin&-recid=${USERID}&user_AsteriskNumStatus_t=DIALING&-edit)};
        Dial(SIP/${EXTID});
    }
}

    s => {
        Set(__CHAN=${CHANNEL});
        NoOp(${BRIDGEPEER});
        NoOp(PowerDialling OUT - s-branch);
        Dial(SIP/${PHONENR}@${EXTID}extout,${TIMEOUT},mgM(answer_power^${USERID}));
        if(${DIALSTATUS}!=ANSWER){
            Set(reply=${CURL(http://${IP}/fmi/xml/FMPXMLRESULT.xml?-db=cm_data&-lay=data_User&-script.prefind=RT_Relogin&-recid=${USERID}&user_AsteriskNumStatus_t=${DIALSTATUS}&-edit)});
        }
        Hangup;
    }
    failed => {
        reply=${CURL(http://${IP}/fmi/xml/FMPXMLRESULT.xml?-db=cm_data&-lay=data_User&-script.prefind=RT_Relogin&-recid=${USERID}&user_AsteriskNumStatus_t=NOANSWER&-edit)};
    }

    h => {
        NoOp(PowerDialling OUT - h-branch);
        reply=${CURL(http://${IP}/fmi/xml/FMPXMLRESULT.xml?-db=cm_data&-lay=data_User&-script.prefind=RT_Relogin&-recid=${USERID}&user_AsteriskNumStatus_t=NOANSWER&user_AsteriskChannel_t=&-edit)};
    }

}[/code]
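
To rule the database calls in or out while narrowing the problem down, it may help to reproduce the status-update URL outside Asterisk. The following is only a sketch in Python of how the DIALING request from the dialplan above could be rebuilt; the function name `status_url` and the example IP and record id are made up for illustration, and only the query parameters are taken from the posted code.

```python
from urllib.parse import urlencode


def status_url(ip: str, record_id: str, status: str) -> str:
    """Build the same kind of status-update URL the dialplan passes to CURL().

    Hypothetical helper: parameter names and order mirror the DIALING call
    in the posted AEL code; everything else is an assumption.
    """
    params = [
        ("-db", "cm_data"),
        ("-lay", "data_User"),
        ("-script.prefind", "RT_Relogin"),
        ("-recid", record_id),
        ("user_AsteriskNumStatus_t", status),
    ]
    # "-edit" is a bare flag without a value, so it is appended by hand.
    return f"http://{ip}/fmi/xml/FMPXMLRESULT.xml?{urlencode(params)}&-edit"


# Example values (192.0.2.10 is a documentation address, 42 is a made-up id).
print(status_url("192.0.2.10", "42", "DIALING"))
```

Fetching the printed URL with any HTTP client and timing the response would show whether the database side is slow or failing, independently of the dialplan.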

Please help! I am now going to test their setup at my office to see if I can recreate the problem, then downgrade Asterisk to see if the problem persists, and then downgrade Linux to the earlier kernel.

Ok - I now have the answer.

After setting up 1.6.0.18 at my office with the same files as my customer, I started dialling…
Every single call was hung up after X seconds when I had these two X-Lite Advanced Network Options enabled:
"Preserve bandwidth during silence periods"
and
"In times of network disruption, automatically hang up calls after" - "X seconds"

Those settings did not cause this behaviour with 1.6.0.15.
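
One way to picture why these two options could combine badly is a simple inactivity watchdog. The sketch below is a toy model in Python, not X-Lite's or Asterisk's actual implementation: it assumes a "network disruption" is detected as "no media packet for X seconds", and that silence suppression leaves gaps in the packet stream. Which endpoint ends up starved of packets is not established in this thread, and the function name, packet timings and timeout are all invented for illustration.

```python
from typing import Iterable, Optional


def first_timeout(arrivals_ms: Iterable[int], timeout_ms: int, call_end_ms: int) -> Optional[int]:
    """Return when a 'no media for timeout_ms' watchdog would hang up, or None.

    arrivals_ms: packet arrival times in ms since call start (hypothetical data).
    timeout_ms:  assumed 'hang up after X seconds' setting, in ms.
    call_end_ms: when the call would otherwise end normally.
    """
    last_seen = 0  # treat call start as the last time media was seen
    for t in sorted(arrivals_ms):
        if t > call_end_ms:
            break
        if t - last_seen > timeout_ms:
            return last_seen + timeout_ms  # watchdog fired inside the gap
        last_seen = t
    if call_end_ms - last_seen > timeout_ms:
        return last_seen + timeout_ms
    return None


# Continuous stream: one packet every 20 ms for 60 s.
continuous = range(0, 60_000, 20)

# Assumed silence suppression: nothing is sent between 12 s and 40 s.
suppressed = [t for t in continuous if not 12_000 < t < 40_000]

print(first_timeout(continuous, timeout_ms=10_000, call_end_ms=60_000))  # no hangup
print(first_timeout(suppressed, timeout_ms=10_000, call_end_ms=60_000))  # hangup during the gap
```

Under these assumptions the call with gaps is cut off X seconds into the first long silence, which would look like the "hung up after X seconds" behaviour described above. Why 1.6.0.15 did not trigger it is not explained by this model.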

/Tomas