And it seems to work! Asterisk switches between 192.168.0.100 and 101 pretty evenly.
However, when I do a failover test by taking one of those servers offline, Asterisk still tries it (for something like 20-30 seconds) before trying the other one, and it doesn’t appear to “remember” that this particular server was offline.
I’m wondering if there are some obvious settings that control this behavior?
In particular, I’d like to reduce the number of seconds Asterisk spends trying to connect. If Asterisk could also be configured to remember that a particular server was offline for the next, say, 5 minutes, that would be awesome as well.
Fair enough - reducing Timer B is okay for me. These Asterisk boxes only ever communicate with peers on their own LAN.
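For context on where the 20-30 second wait comes from: RFC 3261 sets Timer B to 64*T1, and T1 defaults to 500 ms, so an unanswered INVITE over UDP is retransmitted at doubling intervals until the transaction times out after 32 seconds. A minimal Python sketch of that arithmetic (the function name is made up for illustration and is not part of Asterisk or PJSIP):

```python
def invite_retransmit_schedule(t1_ms=500):
    """Return (retransmit times in ms, Timer B in ms) for an INVITE over UDP.

    Follows the RFC 3261 rule: the first retransmit fires after T1, each
    following interval doubles, and Timer B (64 * T1) ends the transaction.
    """
    timer_b = 64 * t1_ms
    times = []
    elapsed = 0
    interval = t1_ms
    # Keep scheduling retransmits while they would still fire before Timer B.
    while elapsed + interval < timer_b:
        elapsed += interval
        times.append(elapsed)
        interval *= 2
    return times, timer_b


times, timer_b = invite_retransmit_schedule(500)
print(times, timer_b)  # [500, 1500, 3500, 7500, 15500, 31500] 32000
```

Lowering T1 shrinks the whole schedule proportionally, e.g. T1 = 100 ms gives a 6.4 second Timer B. If I recall correctly, the corresponding pjsip.conf options are `timer_t1` and `timer_b` in the `system` section; treat that as an assumption and check the sample config for your version.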
Are there any alternative ways of doing failover in this fashion? The only thing I’ve been able to find is using DNS NAPTR/SRV records, like I am here.
You could have individually configured endpoints with individually monitored AORs that each have an IP address. That’s the first thing that comes to mind. Failover works best with connection-oriented protocols, as it can be determined faster whether the remote side is reachable.
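To illustrate the idea of individually monitored targets, here is a rough Python sketch. It is not Asterisk code: each server has its own reachability state, a failed server is skipped for a hold-down period (300 seconds here, matching the 5 minutes asked about earlier), and selection walks the list in order. The names `MonitoredTargets`, `mark_failed`, `mark_ok` and `pick` are hypothetical. In Asterisk itself the monitoring would presumably come from qualifying each AOR, with the dialplan choosing among the endpoints.

```python
import time


class MonitoredTargets:
    """Tracks which servers were recently observed as unreachable."""

    def __init__(self, servers, hold_down_s=300.0, clock=time.monotonic):
        self.servers = list(servers)
        self.hold_down_s = hold_down_s
        self.clock = clock  # injectable so the hold-down can be tested
        self._failed_at = {}  # server -> time of last observed failure

    def mark_failed(self, server):
        self._failed_at[server] = self.clock()

    def mark_ok(self, server):
        self._failed_at.pop(server, None)

    def is_available(self, server):
        failed_at = self._failed_at.get(server)
        # Available if never failed, or if the hold-down period has passed.
        return failed_at is None or self.clock() - failed_at >= self.hold_down_s

    def pick(self):
        """Return the first server not in hold-down, or None if all are down."""
        for server in self.servers:
            if self.is_available(server):
                return server
        return None


targets = MonitoredTargets(["192.168.0.100", "192.168.0.101"])
targets.mark_failed("192.168.0.100")
print(targets.pick())  # 192.168.0.101
```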
No worries. I should look into whether using TCP or TLS is possible anyway.
Just to confirm: if using TCP, Asterisk will use one connection per established SIP call, right? It’s not possible to reuse the same TCP connection for several SIP dialogs?
Oh really? Excellent! One of the reasons we wanted to use UDP was to avoid potentially thousands of TCP connections between these Kamailio boxes and the Asterisk boxes.
PJSIP looks for existing connections based on transport type + IP address + port. So if it needs to send a packet to “TLS 172.16.1.10 port 5061” then it’ll search for an active connection to there and if present use it.
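The lookup described above can be pictured as a table keyed by transport type, IP address and port. A simplified Python sketch of that idea follows. It is an illustration only, not PJSIP's actual implementation, and `ConnectionPool` and `get_or_connect` are made-up names:

```python
class ConnectionPool:
    """Reuses one connection per (transport, ip, port) destination."""

    def __init__(self):
        self._active = {}  # (transport, ip, port) -> connection object
        self.opened = 0    # how many connections were actually created

    def get_or_connect(self, transport, ip, port):
        key = (transport.upper(), ip, port)
        conn = self._active.get(key)
        if conn is None:
            # Stand-in for opening a real TCP/TLS socket.
            conn = {"key": key, "id": self.opened}
            self._active[key] = conn
            self.opened += 1
        return conn


pool = ConnectionPool()
first = pool.get_or_connect("TLS", "172.16.1.10", 5061)
second = pool.get_or_connect("TLS", "172.16.1.10", 5061)
print(first is second, pool.opened)  # True 1
```

Under this model, many dialogs toward the same Kamailio box share a single connection, and a new one is only opened when the transport, address or port differs.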