SRV advertising redundant endpoints => Asterisk failing when one is down

Hi Gang

Ancient Asterisk 16.28 (sorry about that).

We tested the following situation in our lab to find out how clients handle it.

;; QUESTION SECTION:
;_sip._udp.dev-cpereg.devtel.imp.ch. IN SRV

;; ANSWER SECTION:
_sip._udp.dev-cpereg.devtel.imp.ch. 299 IN SRV 0 0 5060 dev-sbc01.devtel.imp.ch.
_sip._udp.dev-cpereg.devtel.imp.ch. 299 IN SRV 0 0 5060 dev-sbc02.devtel.imp.ch.

So when using the hostname “dev-cpereg.devtel.imp.ch” as endpoint and registrar, Asterisk performs SRV lookups and uses either of those two SBCs for communication.

For the test, one SBC was made unreachable.

After adding qualify_frequency=60, I would have expected Asterisk to detect the one that is down and only use the reachable SBC.

Unfortunately I observe the contrary. Asterisk seems to send OPTIONS randomly to either of the two and ends up flapping between reachable and unreachable, for both registrations AND outbound calls.

Have I missed some config options?

Even if the host that is down sends back ICMP port unreachable, this does not cause Asterisk to immediately use the other IP address when using UDP.

Using TCP/TLS seems to solve that issue (due to the working connection staying open I guess).

-Benoît-

Which channel driver?

On Thursday 02 January 2025 at 17:29:49, david551 via Asterisk Community
wrote:

Which channel driver?

I didn’t think chan_sip could even use SRV DNS records, so I had assumed this
was chan_pjsip.

Antony.



My recollection is that it can use them, but only uses the first one, which could explain the problem.

Sorry, forgot to mention: pjsip.

You have the same priority and weight for both records. You’re basically balancing them equally, so each time an SRV lookup is performed either record may be picked, which is the result you’re getting. You need to change this so the priority is 0 for one and perhaps 5 for the other. Priority 0 will then be used first, and if it’s down or not responding, the next priority will be used.

So you need something like this (note that the lowest number, 0, is the highest priority for SRV):

_sip._udp.dev-cpereg.devtel.imp.ch. 299 IN SRV 0 0 5060 dev-sbc01.devtel.imp.ch. 
_sip._udp.dev-cpereg.devtel.imp.ch. 299 IN SRV 5 0 5060 dev-sbc02.devtel.imp.ch.
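
To illustrate the intended failover with those two records, here is a minimal Python sketch. It is not how Asterisk or PJSIP implement it; `SrvRecord`, `first_reachable` and the `is_up` callback are hypothetical names made up for this example, and the "down" set simply pretends that dev-sbc01 is unreachable.

```python
from typing import Callable, Iterable, NamedTuple, Optional


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def first_reachable(
    records: Iterable[SrvRecord], is_up: Callable[[str], bool]
) -> Optional[SrvRecord]:
    """Return the reachable record with the lowest priority value, or None."""
    for rec in sorted(records, key=lambda r: r.priority):
        if is_up(rec.target):
            return rec
    return None


records = [
    SrvRecord(0, 0, 5060, "dev-sbc01.devtel.imp.ch."),
    SrvRecord(5, 0, 5060, "dev-sbc02.devtel.imp.ch."),
]

# Hypothetical reachability check: pretend dev-sbc01 is down.
down = {"dev-sbc01.devtel.imp.ch."}
chosen = first_reachable(records, lambda host: host not in down)
print(chosen.target if chosen else "no target reachable")
```

With both hosts up, the same call always returns dev-sbc01, which is exactly why this layout gives failover but no load balancing.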

Hi Blaze

Thank you, but in ‘normal’ operation, I would like to equally balance the traffic when both are up. If I prioritize one of them, it will get all traffic, right?

You can try modifying the weights. That will favor one more than the other, but it may still try srv2 if srv1 is down, since they aren’t weighted the same.
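
For reference, here is a rough Python sketch of the ordering that RFC 2782 describes for SRV records: lowest priority value first, and weighted random selection among records that share a priority. Whether Asterisk/PJSIP follows this exactly, and whether it skips a target it has found unreachable, is not confirmed here. The function name `rfc2782_order` and the 60/40 weights are assumptions chosen only for illustration.

```python
import random
from collections import Counter
from typing import List, Tuple

Record = Tuple[int, int, str]  # (priority, weight, target)


def rfc2782_order(records: List[Record], rng: random.Random) -> List[Record]:
    """Order records by priority, then by weighted random pick within a priority."""
    ordered: List[Record] = []
    for prio in sorted({r[0] for r in records}):
        group = [r for r in records if r[0] == prio]
        rng.shuffle(group)                   # arbitrary starting order
        group.sort(key=lambda r: r[1] != 0)  # weight-0 records go first (stable sort)
        while group:
            total = sum(r[1] for r in group)
            pick = rng.randint(0, total)     # uniform in 0..total inclusive
            running = 0
            for i, rec in enumerate(group):
                running += rec[1]
                if running >= pick:          # first running sum reaching the pick
                    ordered.append(group.pop(i))
                    break
    return ordered


# Hypothetical weights: same priority, 60/40 split between the two SBCs.
records = [
    (0, 60, "dev-sbc01.devtel.imp.ch."),
    (0, 40, "dev-sbc02.devtel.imp.ch."),
]

rng = random.Random(1)
trials = 10_000
first_choice = Counter(rfc2782_order(records, rng)[0][2] for _ in range(trials))
for target, count in sorted(first_choice.items()):
    print(f"{target} tried first in {count / trials:.1%} of lookups")
```

Every ordering still contains both targets, so a client that walks the list can fall back to the second SBC when the first does not answer. The weights only change how often each one is tried first.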