Asterisk 16 HTTP WSS connections remain established

Happy New Year!

In Asterisk 16 we’re seeing TCP connections kept open when a WSS client registers over and over, each time from a different source port (presumably some kind of NAT issue on the client’s end).

When this happens, Asterisk doesn’t release the associated connections, and the http.conf sessionlimit (100 is the default) is reached. Web sockets then become unresponsive, with an associated log entry “HTTP session count exceeded 100 sessions”.

Asterisk does eventually release the socket exactly 15 minutes later, printing this to the CLI:

[Dec 29 15:08:35] ERROR[23690]: res_http_websocket.c:531 ws_safe_read: Error reading from web socket: Connection timed out
[Dec 29 15:08:35] ERROR[29866]: iostream.c:552 ast_iostream_close: SSL_shutdown() failed: error:00000005:lib(0):func(0):DH lib, Underlying BIO error: Bad file descriptor

The only setting I can find with a default involving “15” is http.conf’s session_keep_alive=15000, but that value is in milliseconds (15 seconds, not 15 minutes).

Since we can’t control the client, I’ve tried the various PJSIP aor settings (minimum_expiration, default_expiration, maximum_expiration) in an attempt to get this under control, without success.

Any suggestions or help appreciated.

Do you have qualify_frequency set on the AOR? I’d expect that would cause the connection drop to be detected sooner, since it would be trying to send a packet at an interval.

Yep!

 aor/qualify_frequency=60
 aor/qualify_timeout=3.0

Here’s the relevant bits from pjsip_wizard.conf:

[extension](!)
type=wizard
accepts_auth=yes
accepts_registrations=yes
endpoint/rewrite_contact=yes
endpoint/direct_media=no
endpoint/rtp_symmetric=yes
endpoint/force_rport=yes
endpoint/send_pai=yes
endpoint/send_rpid=yes
endpoint/rpid_immediate=yes
endpoint/connected_line_method=update
endpoint/max_audio_streams=16
endpoint/max_video_streams=16
endpoint/inband_progress=yes
aor/max_contacts=4
aor/remove_existing=yes
aor/qualify_frequency=60
aor/qualify_timeout=3.0

[someendpoint](extension)
endpoint/context=somecontext
endpoint/webrtc=yes
inbound_auth/username=someusername
inbound_auth/password=somepassword
aor/minimum_expiration=180
aor/default_expiration=180
aor/maximum_expiration=180

Have you done a “pjsip set logger on” to see if it’s going out? Does the Contact get removed or replaced?

The contact does get re-written and the last 4 contacts remain (per aor/max_contacts=4) but the actual TCP connection remains established.

For example, all of the following connections pile up within just a few seconds. Asterisk eventually clears them exactly 15 minutes later, with the CLI messages above:

tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57757                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57734                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57675                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57733                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57666                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57656                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57660                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57747                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57686                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57728                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57693                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57714                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57667                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57707                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57665                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57699                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57711                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57702                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57721                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57705                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57684                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57717                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57703                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57652                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57672                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57741                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57679                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57722                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57740                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57696                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57724                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57687                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57701                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57695                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57680                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57737                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57751                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57739                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57657                
tcp    ESTAB      0      137    10.9.8.7:https                1.2.3.4:57759                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57732                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57712                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57700                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57691                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57655                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57756                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57654                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57715                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57746                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57726                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57662                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57718                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57673                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57719                
tcp    ESTAB      0      1272   10.9.8.7:https                1.2.3.4:57752                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57749                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57750                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57678                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57698                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57753                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57710                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57694                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57720                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57736                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57713                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57670                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57677                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57706                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57745                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57716                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57754                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57704                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57669                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57758                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57689                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57729                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57682                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57743                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57697                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57663                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57735                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57748                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57676                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57742                
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57709
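
To see how close a single client is getting to the HTTP session limit, a listing like the one above can be tallied per remote address. This is only a rough sketch: the function name is made up for illustration, and it assumes the column layout shown here (protocol, state, Recv-Q, Send-Q, local address, peer address).

```python
from collections import Counter


def count_established_by_peer(listing: str) -> Counter:
    """Count ESTAB TCP connections per remote IP in a socket listing."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Assumed columns: proto, state, Recv-Q, Send-Q, local addr:port, peer addr:port
        if len(fields) < 6 or fields[1] != "ESTAB":
            continue
        # Strip the port; rsplit keeps any colons inside the address itself.
        peer_ip = fields[5].rsplit(":", 1)[0]
        counts[peer_ip] += 1
    return counts


sample = """\
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57757
tcp    ESTAB      0      0      10.9.8.7:https                1.2.3.4:57734
tcp    ESTAB      0      137    10.9.8.7:https                1.2.3.4:57759
"""

for peer, total in count_established_by_peer(sample).most_common():
    print(peer, total)
```

Any peer whose count approaches the configured limit is a likely candidate for the behavior described in this thread.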

I’m curious what calls res_http_websocket to check the connection at that 15-minute mark if the contact no longer exists.

I don’t think anything calls it, and I’m not sure why it would stop at 15 minutes unless it was happening at the TCP level.

Do you have keep_alive_interval set in the global section, or is it the default of 90 seconds?

I think you would need to packet capture and look at the actual TCP connections and what is going on with them to double check things.
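
One possible TCP-level explanation, offered as an assumption rather than something confirmed in this thread: on Linux, unacknowledged data is retransmitted with exponential backoff until the net.ipv4.tcp_retries2 limit (default 15) is exhausted, at which point a read fails with “Connection timed out”. The kernel documentation gives a hypothetical timeout of 924.6 seconds for the default, which is roughly 15 minutes. The sketch below reproduces that figure, assuming a 200 ms minimum and 120 s maximum retransmission timeout; the real value depends on the measured round-trip time.

```python
def retransmission_timeout(retries: int = 15,
                           rto_min: float = 0.2,
                           rto_max: float = 120.0) -> float:
    """Sum the backoff intervals before the kernel gives up on a connection.

    Assumes the retransmission timeout starts at rto_min, doubles after
    every attempt, and is capped at rto_max.
    """
    total = 0.0
    rto = rto_min
    # One interval per retransmission, plus the wait after the final attempt.
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, rto_max)
    return total


seconds = retransmission_timeout()
print(f"{seconds:.1f} s = {seconds / 60:.1f} min")
```

If a packet capture shows retransmissions on the stale connections with growing gaps between them, that would support this explanation.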

The PJSIP keep_alive_interval is not explicitly set, so should default to 90. I don’t think this is PJSIP related.

I think the built-in HTTP server is holding the TCP ports open; there’s no further traffic on them. If I restart Asterisk, the ports get closed. This is well after I’ve blocked the offending IP address.

When not blocked, only the last 4 registered contacts’ IPs and ports have traffic (as expected), but the http.conf sessionlimit is what makes it unresponsive. So http or res_http_websocket is counting those connections and eventually, 15 minutes on the dot, cleaning them up.

So the issue really appears to be http holding the ports open.

Obviously, preventing the client behavior resolves it. I’m assuming Asterisk won’t hold unauthenticated sessions open like this; otherwise it’s a DoS opportunity.

However, with roaming end-users moving from network to network, this is bound to happen. Somebody probably has some crazy double NAT setup where they aren’t getting the responses and the client just rapid-fires additional attempts.

The HTTP server doesn’t hold the connection at all once the websocket is established. It is passed to res_pjsip_transport_websocket which becomes the owner and waits for any data to come in on it[1]. I would expect that if we were to send a packet and the connection is closed, then that thread should wake up with a failure and it would close. I could be wrong though, TCP is not my specialty.

[1] https://github.com/asterisk/asterisk/blob/master/res/res_pjsip_transport_websocket.c#L395
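
For anyone wondering why a thread blocked in a read may never notice a dead peer: on an idle TCP connection nothing is sent, so nothing fails. A generic way to force detection is TCP keepalive probes. The sketch below is plain Python socket code, not what Asterisk or res_pjsip_transport_websocket actually does; the helper name and the timing values are assumptions chosen for illustration, and the TCP_KEEP* options are Linux-specific.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 60,
                     interval: int = 10, probes: int = 3) -> None:
    """Ask the kernel to probe an idle connection and fail it if the peer is gone."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These per-socket tunables exist on Linux; skip them elsewhere.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
        # Seconds between unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
        # Unanswered probes before the connection is declared dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```

With settings like these, a blocked read on a dead but idle connection would fail after roughly idle + interval * probes seconds instead of waiting indefinitely.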

Okay, so in summary: the client can’t receive the SIP messages via the web socket, in this case apparently because of a double NAT scenario on their end where the 200 OK never reaches the device that initiated the connection. The client therefore tries again in rapid-fire succession and runs into http.conf’s sessionlimit, which denies further connections until the dead sockets are cleaned up some minutes later.

Not sure where to go from here as far as Asterisk is concerned. Should I open a bug report?

In this case, we control the client, so we can just adjust its settings to prevent the rapid-fire packets that inevitably lead to what is essentially a DoS situation.

You can open a bug report, but you’ll need to provide a packet capture to show the TCP connections and what is going on with them.