Asterisk 13 uses closed TCP connection

I have a setup with 2 Android clients, based on PJSIP 2.6 using the PJSUA2 API, calling each other through an Asterisk 13 gateway (FreePBX distro) on the same subnet. The calling party, D1, sends the INVITE over TCP because the request size is >= 1300 bytes. The call is established properly and everything works fine. During call initiation, all communication between asterisk and D1 goes through the TCP connection.

The problem occurs when the called party, D2, hangs up after more than 33 seconds. D2 sends the BYE request to asterisk. Then asterisk attempts to send the BYE request to D1, but fails with a transport error. It turns out it’s trying to use the initial TCP connection to do so. But D1 has closed the TCP connection after 33 seconds, as this is the default idle time hard-coded into PJSIP. This results in D1 never being notified that the other party has hung up.

I’m trying to understand who’s at fault here. So any suggestion or hint would be appreciated. Thanks.

Has D1 sent a valid Contact header?

AFAIK, yes. Here’s the SIP trace:

[2019-05-22 12:43:05] VERBOSE[14943] res_pjsip_logger.c: <--- Received SIP request (1336 bytes) from TCP:10.254.200.96:35001 --->
INVITE sip:5092@10.254.65.200 SIP/2.0
Via: SIP/2.0/TCP 10.254.200.96:35001;rport;branch=z9hG4bKPj64011047-1b02-479b-a6b0-55dbf68bb2e6;alias
Max-Forwards: 70
From: sip:5096@10.254.65.200;tag=064ccd9f-061f-4053-957f-2ebbdef54963
To: sip:5092@10.254.65.200
Contact: <sip:5096@10.254.200.96:6000;ob>
Call-ID: 404d4338-5448-461e-a134-a1b9bbe2694b
CSeq: 1862 INVITE
Allow: PRACK, INVITE, ACK, BYE, CANCEL, UPDATE, INFO, SUBSCRIBE, NOTIFY, REFER, MESSAGE, OPTIONS
Supported: replaces, 100rel, norefersub
Authorization: [snip]
Content-Type: application/sdp
Content-Length:   476
[snip]

In SIP you could rely on timers to terminate this call, specifically Timer F, which is the maximum amount of time a sender will wait for a response to a non-INVITE request (in your case the BYE request), but there is no setting in pjsip to control that option.
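
For reference, RFC 3261 defines Timer F as 64*T1, with a default T1 of 500 ms. A minimal sketch of that arithmetic follows; the function name is made up for illustration and is not part of Asterisk or PJSIP:

```python
# Default T1 (round-trip time estimate) from RFC 3261, in seconds.
T1_SECONDS = 0.5


def timer_f(t1: float = T1_SECONDS) -> float:
    """Non-INVITE transaction timeout as defined in RFC 3261: 64 * T1."""
    return 64 * t1


print(timer_f())  # 32.0
```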

Thanks for the input. What I’m trying to understand first is how this is supposed to work in the first place.

Is the client right to close its idle TCP connection after 33s? Or is it supposed to keep it alive?

Is the asterisk server right to continue using the TCP connection, even though it has been signalled that the connection was closed? E.g. I can see the following in the asterisk logs:

[2019-05-22 12:43:40] DEBUG[14943] res_pjsip/pjsip_transport_events.c: Reliable transport 'tcps0x7f4e04058368' state:DISCONNECTED
[2019-05-22 12:43:40] DEBUG[14943] res_pjsip/pjsip_transport_events.c: Reliable transport 'tcps0x7f4e04058368' state:SHUTDOWN
[2019-05-22 12:43:40] DEBUG[14943] res_pjsip/pjsip_transport_events.c: Reliable transport 'tcps0x7f4e04058368' state:DESTROY

So basically, it does know the TCP connection has been closed. Nevertheless, later on it continues using it, as it seems from the following:

[2019-05-22 12:44:04] VERBOSE[17085] res_pjsip_logger.c: <--- Transmitting SIP request (468 bytes) to TCP:10.254.200.96:35001 --->
BYE sip:5096@10.254.200.96:35001;transport=TCP;ob SIP/2.0
Via: SIP/2.0/TCP 10.254.65.200:5060;rport;branch=z9hG4bKPjfbe0a793-e562-47f1-a3ad-c026a5b57ecf;alias
From: <sip:5092@10.254.65.200>;tag=02681a87-b013-47b7-8ea1-cb3caef54484
To: <sip:5096@10.254.65.200>;tag=064ccd9f-061f-4053-957f-2ebbdef54963
Call-ID: 404d4338-5448-461e-a134-a1b9bbe2694b
CSeq: 26826 BYE
Reason: Q.850;cause=16
Max-Forwards: 70
User-Agent: FPBX-14.0.11(13.22.0)
Content-Length:  0

Interesting questions, but I don’t have the answers, because there is no public document that describes how it should work, or at least I don’t know of one. Looking on the wiki, I found a hint related to what you ask:

Transport Selection

Connection-oriented protocols (such as TCP or TLS)

If the connection the request was received on is still open it is used to send the response.

If no connection exists or the connection is no longer open the first configured transport in pjsip.conf matching the transport type and address family is selected. It is instructed to establish a new connection to the destination IP address and port.
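
The two wiki rules quoted above can be sketched roughly as follows. This is only a simplified model of the described behaviour, not Asterisk's actual code; the class names, the function name, and the transport names are assumptions made for the example:

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Transport:
    name: str       # hypothetical transport section name
    protocol: str   # "tcp", "tls", "udp", ...
    family: str     # "ipv4" or "ipv6"


@dataclass
class Connection:
    transport: Transport
    remote: Tuple[str, int]
    is_open: bool


def select_transport(received_on: Optional[Connection],
                     configured: Sequence[Transport],
                     protocol: str, family: str) -> Tuple[str, Transport]:
    """Reuse the open connection, else pick the first matching transport."""
    if received_on is not None and received_on.is_open:
        return ("reuse", received_on.transport)
    for transport in configured:
        if transport.protocol == protocol and transport.family == family:
            return ("connect", transport)
    raise LookupError("no configured transport matches")


tcp4 = Transport("transport-tcp", "tcp", "ipv4")
udp4 = Transport("transport-udp", "udp", "ipv4")
closed = Connection(tcp4, ("10.254.200.96", 35001), is_open=False)

action, chosen = select_transport(closed, [udp4, tcp4], "tcp", "ipv4")
print(action, chosen.name)  # connect transport-tcp
```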

It depends on the client. Many require connection reuse, which the rewrite_contact option provides. It reuses the connection if available; if the connection is dropped, Asterisk can’t talk back to the client. Not setting rewrite_contact follows what the client says to use to talk back to it.

So rewrite_contact, besides helping with NAT issues, also deals with this TCP connection socket. That makes sense, because it allows the Contact header to be rewritten with the source IP address and port.

The pjsip.conf already contains rewrite_contact=yes for all extensions. Does that mean it should be set to “no”? But then you’d lose NAT support?

If “rewrite_contact” is set to “no” then Asterisk will establish a TCP connection back to the device on the IP address and port provided in the Contact header. Whether this is what the phone expects/permits/allows, is an implementation detail on it.
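
To make the difference concrete, here is a small sketch of how the target of an in-dialog request such as the BYE changes with the option. It is a simplified model built from the traces in this thread, not Asterisk's implementation, and the function names are hypothetical:

```python
def rewritten_contact(contact_uri: str, source_ip: str, source_port: int,
                      transport: str) -> str:
    """Swap the Contact host:port for the packet source and add a transport."""
    userinfo, _, rest = contact_uri.partition("@")
    _hostport, sep, params = rest.partition(";")
    uri = f"{userinfo}@{source_ip}:{source_port};transport={transport}"
    return f"{uri};{params}" if sep else uri


def request_target(contact_uri: str, source_ip: str, source_port: int,
                   transport: str, rewrite_contact: bool) -> str:
    """Return the URI later requests are sent to (simplified)."""
    if rewrite_contact:
        return rewritten_contact(contact_uri, source_ip, source_port, transport)
    return contact_uri


# Contact URI and TCP source address taken from the INVITE trace above.
contact = "sip:5096@10.254.200.96:6000;ob"

print(request_target(contact, "10.254.200.96", 35001, "TCP", True))
# sip:5096@10.254.200.96:35001;transport=TCP;ob
print(request_target(contact, "10.254.200.96", 35001, "TCP", False))
# sip:5096@10.254.200.96:6000;ob
```

With the option enabled, the result matches the request URI of the BYE shown earlier, which points at the ephemeral TCP port 35001 rather than the port 6000 that D1 advertised.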

If it doesn’t allow it then you have to set “rewrite_contact” to “yes” and keep the TCP connection up. We provide a keep alive interval option[1] to send a keep alive at an interval. Setting a qualify_frequency so OPTIONS goes out would also send SIP traffic out regularly and help to keep it alive.

[1] https://github.com/asterisk/asterisk/blob/master/configs/samples/pjsip.conf.sample#L1067
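
As a rough sketch of where the options mentioned above live in pjsip.conf, the fragment below is embedded in Python and parsed with configparser only so that it can be checked. The section names 5096 and 5096-aor and all values are assumptions for illustration; on FreePBX the configuration files are generated, so the actual layout may differ:

```python
import configparser

# Hypothetical pjsip.conf fragment; names and values are illustrative only.
PJSIP_CONF = """
[global]
type=global
; send a keep alive on connection-oriented transports every 20 seconds
keep_alive_interval=20

[5096]
type=endpoint
rewrite_contact=yes
aors=5096-aor

[5096-aor]
type=aor
; send an OPTIONS request to each contact every 60 seconds
qualify_frequency=60
"""

parser = configparser.ConfigParser()
parser.read_string(PJSIP_CONF)

# Collect the options per section for a quick look.
settings = {name: dict(parser[name]) for name in parser.sections()}
for name, options in settings.items():
    print(name, options)
```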

I tried with rewrite_contact=yes (which was already the case) and keep_alive_interval=20. I can see in the tcpdump captures that asterisk is sending \r\n sequences at the expected interval, but that doesn’t prevent the PJSIP client from closing the connection. I believe PJSIP expects relevant SIP traffic.

The setting qualify_frequency=60 is also already set. But correct me if I’m wrong, the OPTIONS requests are sent over UDP, not over the TCP connection that was opened for the INVITE.

It is sent to the Contact on the AOR. If that is TCP, then it’ll use TCP.

The Contact does not specify the protocol. I posted the INVITE trace earlier in this thread. I’m happy to send full logs for inspection:

caller = sip:5096@10.254.65.200
callee = sip:5092@10.254.65.200
first invite sent via UDP at 2019-05-22 12:43:05, challenged by asterisk
second invite sent via TCP just after
caller closes TCP connection at 2019-05-22 12:43:40 (33 secs after the last ACK for the INVITE response)
callee hangs up at 2019-05-22 12:44:04
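
The gaps in this timeline can be worked out from the log timestamps. A minimal sketch follows; note that the 33 s idle timer is counted from the last ACK, which is not timestamped in the excerpts, so only the INVITE, close, and BYE times are used here:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Timestamps copied from the log excerpts in this thread.
invite_received = datetime.strptime("2019-05-22 12:43:05", FMT)
tcp_closed = datetime.strptime("2019-05-22 12:43:40", FMT)
bye_sent = datetime.strptime("2019-05-22 12:44:04", FMT)

invite_to_close = (tcp_closed - invite_received).total_seconds()
close_to_bye = (bye_sent - tcp_closed).total_seconds()

print(invite_to_close)  # 35.0
print(close_to_bye)     # 24.0
```

So the BYE goes out 24 seconds after asterisk logged the transport as destroyed.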

Then qualify wouldn’t work, since it’s using a different transport for each. You could also try session timers[1] to keep it alive, that may work. It does seem as though you’re just fighting the client though.

[1] https://github.com/asterisk/asterisk/blob/master/configs/samples/pjsip.conf.sample#L684

What I really don’t understand at the moment is how this is supposed to work. The PJSIP client sends the INVITE over TCP, and then what? Is it supposed to keep the TCP connection open? For how long? Because at the moment, PJSIP has a hard-coded timer of 33s for idle client TCP connections, which doesn’t seem to be accessible from the application layer (PJSUA2). Do you consider that PJSIP is faulty and shouldn’t close the TCP connection?

It… depends? If it’s behind NAT, then it can not drop the TCP connection and expect things to work. If it’s the same network and the Contact provided to us is valid and rewrite_contact isn’t set to yes, then we’d connect back and it would be fine.

OK, using rewrite_contact=no seems to avoid the problem: the BYE request is sent back to D1 over UDP instead of over the stale/closed TCP connection. Thanks for the input.

For the record, I’ve tested the same scenario using a Linphone-based client, and interestingly enough it doesn’t switch to TCP for the second INVITE, even though it exceeds the 1300-byte limit. Hence the problem does not occur, regardless of the rewrite_contact setting.
