Most likely cause: Network issues. Most likely network issue: NAT
Explanation:
If most or all of your Asterisk servers are behind NAT, seeing them go offline is not uncommon. This usually happens when the firewall has seen no traffic on the UDP port for a certain time and drops the entry from its NAT table. After that, incoming packets from the other server no longer reach the Asterisk box until it sends something out again.
Possible solutions, in no particular order:
Decrease the registration time if the devices register with each other. As long as re-registrations happen more often than the firewall's UDP timeout, the NAT entry stays active.
Increase the UDP NAT timeout on the device performing NAT on each network.
Change from SIP over UDP to SIP over TCP. Most NAT devices keep an established TCP connection in the table far longer than a UDP entry, usually until they see a FIN or RST packet.
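For the first option, here is a rough pjsip.conf sketch, assuming the servers register to each other with PJSIP outbound registrations. The section names, the 203.0.113.10 address and the 120 second value are placeholders I made up, not taken from your setup. Pick a value that is below your firewall's UDP timeout, and add your own auth sections.

```ini
; pjsip.conf on the server that registers (hypothetical names and address)
[reg-to-server-b]
type=registration
server_uri=sip:203.0.113.10
client_uri=sip:server-a@203.0.113.10
; re-register every 2 minutes instead of the default 3600 seconds
expiration=120

; pjsip.conf on the server that accepts the registration
[server-a]
type=aor
max_contacts=1
minimum_expiration=60
; cap the expiry the registering side is allowed to ask for
maximum_expiration=120
```

Setting qualify_frequency on the aor should have a similar side effect, since the periodic OPTIONS requests also generate traffic through the NAT.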
In that case, bad connectivity between the devices could be the issue.
Also, I noticed in your screenshot that many of the IP addresses seem to be the same one…
Do you have each server register all its extensions to every other server? Normally you’d just set up each server as one endpoint, and specify rules for which calls go to which endpoint.
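As an illustration of "one endpoint per server", something along these lines, assuming pjsip and assuming the other server owns the extensions 2000 to 2999. All names, the address, the codec and the number range are made up for the example, and the static contact only works if that address is directly reachable.

```ini
; pjsip.conf: one endpoint/aor/identify set for the whole remote server
[server-b]
type=endpoint
context=from-server-b
disallow=all
allow=ulaw
aors=server-b

[server-b]
type=aor
contact=sip:203.0.113.10:5060

[server-b]
type=identify
endpoint=server-b
match=203.0.113.10

; extensions.conf: send every 2XXX number to that one endpoint
[internal]
exten => _2XXX,1,Dial(PJSIP/${EXTEN}@server-b)
 same => n,Hangup()
```

That way you have one entry per remote server to watch and troubleshoot, rather than one per extension.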
Well, to bypass a lot of troubleshooting steps, you COULD try changing from using UDP to using TCP for the SIP traffic. TCP is more resilient to network instability, and might just solve the problem well enough for you.
How to change the transport with chan_sip, I have no idea. I’ve used pjsip for years. Perhaps it is time to migrate your config as well. As far as I remember, chan_sip is about to be removed from Asterisk entirely.
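With pjsip, the switch to TCP could look roughly like this. Treat it as a sketch: the section names and the 203.0.113.10 address are placeholders, and your firewall needs to allow TCP on the SIP port as well.

```ini
; pjsip.conf (hypothetical names and address)
[transport-tcp]
type=transport
protocol=tcp
bind=0.0.0.0:5060

[reg-to-server-b]
type=registration
transport=transport-tcp
; the semicolon must be escaped, otherwise the rest of the line is a comment
server_uri=sip:203.0.113.10\;transport=tcp
client_uri=sip:server-a@203.0.113.10

[server-b]
type=aor
contact=sip:203.0.113.10:5060\;transport=tcp
```

After changing transports, restart Asterisk rather than just reloading, since transport changes are generally not picked up by a plain reload.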