SIP XFERs via reINVITE/REFER and order of NOTIFYs

tron · April 13, 2020, 6:25pm

I’m trying to troubleshoot some MS Teams integration hiccups.
I’m using an Asterisk box to interface between an IP PBX and MS Teams, and it sort of works. But sometimes MS Teams loses track of a call put on hold
(i.e. you cannot resume it).

HOLD is done by a reINVITE, resume is done with a REFER.
But it seems Teams does not like some NOTIFYs, and responds with 500 to some of them.
Oddly, the (console) debug output is not in the order I would expect: a Ringing shows up before a Trying, and then an OK, although the CSeq numbers are in the expected order. And the 500 response is to the “out of order” NOTIFY, which is curious…

So, the question: is the console output synchronous with the actual messages?
If it is, what would make an out-of-order TX possible?

TIA,
-Carlos

jcolp · April 13, 2020, 6:35pm

Console output is in the order that things occur. As to being out of order, without a complete SIP trace I can’t really say. A REFER also isn’t used for resuming exactly, it’s used for transferring.

tron · April 13, 2020, 6:58pm

The out-of-order impression comes from two things:
- I expect Trying to happen before Ringing
- The Trying event has a CSeq: 32434 NOTIFY and the Ringing event has a CSeq: 32435 NOTIFY
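
To make the mismatch concrete, here is a minimal Python sketch that takes NOTIFYs in the order they appear on the wire and flags any whose CSeq is lower than one already sent. Only the two CSeq values come from the trace described above; the function name and the tuple layout are made up for illustration.

```python
def find_out_of_order(notifies):
    """Return the NOTIFYs whose CSeq is lower than an earlier one on the wire.

    `notifies` is a list of (cseq, sipfrag_status) tuples in transmit order.
    """
    highest = None
    late = []
    for cseq, status in notifies:
        if highest is not None and cseq < highest:
            late.append((cseq, status))
        else:
            highest = cseq
    return late


# Wire order as described: Ringing (CSeq 32435) shows up before Trying (CSeq 32434).
wire_order = [(32435, "180 Ringing"), (32434, "100 Trying")]
print(find_out_of_order(wire_order))
```

With that input the Trying NOTIFY is the one flagged, which matches the message that draws the 500.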

The order in which “things occur” might not be a good definition here.
I don’t really want to post the trace without sanitizing it, though I don’t mind sending it to you directly. But that is what happened, more than once.

Fully · April 13, 2020, 7:03pm

Are you having problems with blind and consult transfers AND with just putting a call on hold? Hold uses inactive, and this can lead to the call being put on hold forever and MS Teams hanging up the call. Does the call disappear in Teams but stay on hold on the PBX handset? Or just to the PSTN? Where does the PSTN terminate?

Fully · April 13, 2020, 7:17pm

some more info : MS Teams Proxy limitations

RFC and sections	Description	Deviation
RFC 6337, section 5.3 Hold and Resume of Media	RFC allows using “a=inactive”, “a=sendonly”, “a=recvonly” to place a call on hold.	The SIP proxy only supports “a=inactive” and does not understand if the SBC sends “a=sendonly” or “a=recvonly”
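
As a rough illustration of the deviation in the table above, the Python sketch below pulls the direction attribute out of an SDP body and checks whether it is the one form of hold the Teams proxy is said to understand. The function names and the sample SDP are invented for the example, and the parsing is deliberately simplified: it does not distinguish session-level from per-media attributes.

```python
DIRECTIONS = ("sendrecv", "sendonly", "recvonly", "inactive")


def media_direction(sdp):
    """Return the last direction attribute found in an SDP body."""
    direction = "sendrecv"  # SDP default when no direction attribute is present
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a=") and line[2:] in DIRECTIONS:
            direction = line[2:]
    return direction


def teams_recognizes_hold(sdp):
    """True only for a=inactive, per the limitation quoted in the table."""
    return media_direction(sdp) == "inactive"


# Made-up offer fragment, for illustration only.
sample_hold_offer = "\r\n".join([
    "v=0",
    "m=audio 4000 RTP/AVP 0",
    "a=rtpmap:0 PCMU/8000",
    "a=inactive",
])
print(media_direction(sample_hold_offer), teams_recognizes_hold(sample_hold_offer))
```

An offer carrying a=sendonly would be a valid hold under the RFC but would return False here, which is the case the table warns about.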

tron · April 13, 2020, 7:42pm

I was just testing; transfers (attendant initiated) seem to work.
I called myself on a Teams extension and put the call on hold. Teams sent a reINVITE with a=inactive, and Asterisk gave me music. Fine.

Then I tried to resume and… it works sometimes. When it does not work, Teams goes into an inconsistent state with the call stuck in a non-working mode, and that seems to be correlated with a “500 Internal Server Error” sent in answer to an out-of-order NOTIFY.

PSTN is on another SIP trunk. PSTN -> Asterisk -> Teams.
(Actually PSTN -> CUBE -> Asterisk -> Teams)

Fully · April 13, 2020, 7:49pm

What version of Asterisk? Latest? I upgraded mine to the latest due to some inconsistencies in call handling with MS Teams.

tron · April 13, 2020, 8:03pm

Yep, 17.3.
Trace does not fit forum limits, pasted here:
https://pastebin.com/QaCtsWFE

Fully · April 13, 2020, 8:24pm

I’d be interested to see if you get the same issue with a SIP phone registered to Asterisk. What endpoint are you testing with, and where is it registered? Or is it just PSTN to Cisco CUBE to Asterisk? I’ll test on my setup as well when I get a chance.

tron · April 13, 2020, 8:42pm

There are no phones registered to Asterisk in this setup…

david551 · April 14, 2020, 9:41am

The Trying NOTIFY does seem to have been sent out of sequence (and with its CSeq out of sequence).

Is this chan_sip or chan_pjsip? I can’t think how the former could do this, as I think everything relevant is on one thread.

It should be harmless, though.

tron · April 14, 2020, 10:07am

Yup, you are right, this is pjsip; I should have noted that.
It SHOULD be harmless, we agree, but if you have a strict state machine, it could be that you do not expect a Trying after you already got a Ringing, right?
And that 500 seems (to me, at this stage) to point at exactly that.

I do not know the workings of the channel, but this “slippage” or reordering needs some kind of buffer somewhere. And it would seem to call for either locking, ownership, or some sorting mechanism?

jcolp · April 14, 2020, 10:10am

It doesn’t need a buffer, the messages themselves aren’t out of order. The code which produces the messages may be generating them out of order in the first place. A change went in[1] which touched this area of code, so it’s possible your specific scenario exposed an issue with it. If you undo the change and rebuild Asterisk and it is fixed, then that is the problem and you’d need to file an issue[2] with all the information including packet trace and console output.

[1] https://gerrit.asterisk.org/c/asterisk/+/13852
[2] https://issues.asterisk.org/jira

tron · April 14, 2020, 10:14am

OK, I will try that, but could you please tell us how the code could generate messages with inverted sequence numbers, as in 2 1 3?
I do not understand what you mean by “the messages aren’t out of order” for that sequence.

jcolp · April 14, 2020, 10:18am

I don’t know, but since that code was recently touched I’d rather investigate that first to see if it is indeed the problem and is having some kind of weird interaction. If the problem still occurs, then an issue would still need to be filed with the information I mentioned and there would be no time frame on when it would be resolved.

david551 · April 14, 2020, 10:39am

A correct implementation of the UAS protocol will send 500 and ignore the out of sequence message, so won’t be confused. Trying is completely optional.
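
The rule behind that 500 is in RFC 3261, section 12.2.2: inside a dialog, a request whose CSeq is lower than the remote sequence number already recorded is out of order and must be rejected with 500. Below is a minimal Python sketch of that check. The class and method names are hypothetical, and the CSeq 32436 used for the third NOTIFY is an assumed value, since only 32434 and 32435 are quoted in this thread.

```python
class DialogUAS:
    """Toy model of the in-dialog CSeq check a UAS performs (RFC 3261, 12.2.2)."""

    def __init__(self):
        self.remote_cseq = None  # empty until the first in-dialog request arrives

    def on_request(self, cseq):
        """Return the status code the UAS would answer with for this CSeq."""
        if self.remote_cseq is not None and cseq < self.remote_cseq:
            # Out of order: reject and leave the dialog state untouched.
            return 500
        self.remote_cseq = cseq
        return 200


uas = DialogUAS()
# Ringing, then the late Trying, then the final OK (32436 is assumed).
print([uas.on_request(cseq) for cseq in (32435, 32434, 32436)])
```

In this model the late Trying gets the 500 and the following NOTIFY is still accepted, which is the "send 500 and ignore" behaviour described above.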

The only thing that might change state is Asterisk itself, in response to the 500.

tron · April 14, 2020, 11:15am

David,
you are assuming a correct implementation; that would be an ideal world, right?
As someone used to say, better be strict on TX and tolerant on RX.
The NOTIFYs are being pushed into a task pool, so there is a buffer in the sense of my previous message. The system seems to track state-related threads to avoid reordering, but as far as I can tell the evidence says it does not always manage to. Given that I have access to this side’s source and not the other’s… I would rather omit the Trying altogether (the other side already has the REFER accepted, so there is no new information in the Trying, is there?)
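
To show how a task pool can reorder messages of this kind, here is a toy Python model. It is not Asterisk's code; the function names, the two-worker pool, and the artificial delay are all assumptions made for the demonstration. Two tasks are queued in the right order, but with two workers the delayed Trying task finishes after the Ringing task, while a single worker drains the queue strictly in submission order.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run(workers):
    """Queue Trying then Ringing on a pool; return the order they were 'sent'."""
    sent = []
    ringing_sent = threading.Event()

    def send_trying():
        # Stand-in for a task that gets delayed: wait (at most 0.5 s) for
        # the Ringing task to finish, if another worker is running it.
        ringing_sent.wait(timeout=0.5)
        sent.append("100 Trying")

    def send_ringing():
        sent.append("180 Ringing")
        ringing_sent.set()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        pool.submit(send_trying)   # queued first
        pool.submit(send_ringing)  # queued second
    return sent


print("two workers:", run(2))
print("one worker: ", run(1))
```

With two workers the Ringing should come out first; with one worker the queued order is preserved, which is the property a per-dialog serialized queue is meant to guarantee.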

tron · April 14, 2020, 12:46pm

OK, there is definitely more to it.
There is specific code to send a 100 Trying (as a NOTIFY, that is) if one was not already sent. And this code is the one somehow sending, in rapid succession, the Trying/Ringing pair that gets swapped.
I more or less disabled that (pretended a Trying was already sent when generating the Ringing) and the out-of-order Trying/500 went away. The bad news is that the held call still gets stuck.
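
Roughly, the behaviour and the workaround described here look like the Python sketch below. It is a guess at the logic from the outside, not the actual chan_pjsip code; the class name, the flag, and the `pretend_trying_sent` parameter are hypothetical.

```python
class ReferProgress:
    """Toy model: queue a 100 Trying NOTIFY before the first other progress NOTIFY."""

    def __init__(self, pretend_trying_sent=False):
        # The workaround amounts to starting with this flag already set.
        self.trying_sent = pretend_trying_sent
        self.queued = []

    def notify(self, status):
        if not self.trying_sent:
            self.trying_sent = True
            if status != "100 Trying":
                # Insert the Trying that was never sent.
                self.queued.append("100 Trying")
        self.queued.append(status)


normal = ReferProgress()
normal.notify("180 Ringing")
print(normal.queued)

patched = ReferProgress(pretend_trying_sent=True)
patched.notify("180 Ringing")
print(patched.queued)
```

In the unpatched case two NOTIFYs are queued back to back, which is the pair that can get swapped on the way out; with the flag preset only the Ringing is queued.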

Someone putting effort into sending the Trying means it is not that “completely” optional, it seems, at least in that someone’s view.

tron · April 14, 2020, 1:17pm

Hats off to Joshua. Reverting 13852 fixes the problem.
The out of order Trying is still inserted, but now it does not generate a 500 response.
(nor does the call stay in limbo at Teams after resume is tried).

Thanks.

system · May 14, 2020, 1:17pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
