TLS SIP trunk keeps closing socket every 60 seconds

kb7qdi · March 21, 2025, 2:34pm

Just looking for a little guidance on where to proceed next in my investigation. I appreciate any feedback.

I am running Asterisk 20.9.1 at three locations, all of them running Ubuntu 24.04 LTS. Each Asterisk location connects to our ITSP vendor via two TLS SIP trunks. We recently migrated them from UDP over an MPLS to TLS/SRTP direct over the internet.

SBC A is very stable overall, while SBC B closes its side of the TCP socket on 5061 every 60 seconds. New sockets get created soon after, but after another 60 seconds it sends us a FIN, ACK.

Whether a call is affected depends on timing. If we attempt an outbound call to SBC B, we can see our INVITE go out and the 100 Trying and 183 Session Progress come back; if the socket closes right after that, the Asterisk server reports Congestion. In that scenario the Asterisk server sends a 503 Service Unavailable to the originating server, and we wait a few seconds and try again.

There is nothing in pjsip_wizard.conf that differs between SBC A and SBC B except the remote host IP address. I originally had aor/qualify_frequency set to 60 but changed it to 10. I also see that both SBCs send us OPTIONS packets every 7-15 seconds. In both cases, the 200 OK replies are immediate.

I sent this up to our ITSP, who forwarded it to their SBC vendor (MetaSwitch). They are claiming it’s because the Asterisk servers are not sending [TCP Keep-Alive] packets to them, and because of that they are closing the socket.

It’s been a while, but from what I’ve read the only reason a [TCP Keep-Alive] packet would be sent is inactivity, and since both sides are sending multiple OPTIONS messages, there shouldn’t be a need. That being said, I did set the global keep_alive_interval to 15 seconds in pjsip.conf, and I do see that setting in “pjsip show settings”.
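For anyone checking the same thing: on Linux, kernel TCP keepalive probes are opt-in per socket through SO_KEEPALIVE, so a socket that never enables the option sends no probes regardless of how idle it is. The Python sketch below is only an illustration of that mechanism on a throwaway socket. The helper names `keepalive_state` and `enable_keepalive` are made up for this example, and it does not show whether Asterisk/PJSIP enables the option on its own trunk sockets.

```python
import socket

# Linux names for the per-socket keepalive tunables; other platforms may
# lack some of them, so each one is looked up defensively.
_TUNABLES = (("idle", "TCP_KEEPIDLE"),
             ("interval", "TCP_KEEPINTVL"),
             ("probes", "TCP_KEEPCNT"))


def keepalive_state(sock):
    """Return the keepalive options currently in effect on a TCP socket."""
    state = {"enabled": bool(sock.getsockopt(socket.SOL_SOCKET,
                                             socket.SO_KEEPALIVE))}
    for label, name in _TUNABLES:
        opt = getattr(socket, name, None)
        if opt is not None:
            state[label] = sock.getsockopt(socket.IPPROTO_TCP, opt)
    return state


def enable_keepalive(sock, idle=45, interval=15, probes=9):
    """Turn on kernel keepalive probes and override the timers for this socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    values = {"idle": idle, "interval": interval, "probes": probes}
    for label, name in _TUNABLES:
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, values[label])


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("before:", keepalive_state(s))  # keepalive is off by default
        enable_keepalive(s)
        print("after: ", keepalive_state(s))
```

On a live system, `ss -tno` should list a keepalive timer next to any established socket that has the option enabled, which is one way to check the port 5061 connections without a capture.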

Not seeing any difference, I did some more research and found that I can change some OS values in Ubuntu/Linux relating to TCP keepalives:

net.ipv4.tcp_keepalive_time = 45 (was default 7200)
net.ipv4.tcp_keepalive_intvl = 15 (was default 75)

Even with all these changes I don’t see any different behavior with SBC B.
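As a rough worked example of what those two sysctl values mean, the sketch below computes when the kernel would send its first probe and when it would give up on a silent peer. It assumes net.ipv4.tcp_keepalive_probes is still at the Linux default of 9 (that value is not stated above), and the function name `keepalive_timeline` is hypothetical.

```python
def keepalive_timeline(time_s, intvl_s, probes):
    """Return (seconds of idle before the first probe,
    seconds of idle before the connection is declared dead)."""
    first_probe = time_s
    give_up = time_s + intvl_s * probes
    return first_probe, give_up


# Tuned values from above, with tcp_keepalive_probes assumed to be 9.
print(keepalive_timeline(45, 15, 9))    # (45, 180)

# Stock Linux defaults for comparison.
print(keepalive_timeline(7200, 75, 9))  # (7200, 7875)
```

With OPTIONS flowing in both directions every 7-15 seconds, the connection should never sit idle for 45 seconds, so no kernel probe would be expected in a capture even on a socket that has SO_KEEPALIVE enabled.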

So, I guess I’m asking the following:

Do you believe I’m on the right path?
Is there something in a PCAP or log that would confirm that my global keep_alive_interval setting, or the OS settings, are actually taking effect?
Any other suggestions you might have in order to resolve this?

I can send snips of any logs you need. Thank you in advance.
Dean

kb7qdi · March 24, 2025, 12:12pm

I guess never mind. MetaSwitch finally admitted the fix was on their end, and since they made a change, so far so good.

Sorry to have bothered anyone.

