Hello,
I have an Asterisk system (version 13.7.2) with this problem:
Several times a day, many peers become unreachable at the same time (a few phones, but not all, at each of many clients). A few seconds later they all become reachable again.
qualify is set to yes and qualifyfreq is set to 25.
Here is one such incident, shown for a single peer; quite a few others failed and recovered at the same time:
Asterisk log:
[Apr 3 10:27:18] NOTICE[1996]: chan_sip.c:29381 sip_poke_noanswer: Peer '123-xxx' is now UNREACHABLE! Last qualify: 102
[Apr 3 10:27:29] NOTICE[1996]: chan_sip.c:23945 handle_response_peerpoke: Peer '123-xxx' is now Reachable. (91ms / 2000ms)
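To see which peers drop together, the log could be grouped by timestamp. This is only a rough sketch: the regular expression is my guess at the notice format based on the two lines above, and `unreachable_by_second` is a name I made up for illustration.

```python
import re
from collections import defaultdict

# Assumed format, inferred from the two notices above (straight or curly quotes).
EVENT = re.compile(
    r"^\[(?P<when>[^\]]+)\] NOTICE\[\d+\]: chan_sip\.c:\d+ \w+: "
    r"Peer ['‘](?P<peer>[^'’]+)['’] is now (?P<state>UNREACHABLE|Reachable)"
)


def unreachable_by_second(lines):
    """Map each log timestamp to the peers declared UNREACHABLE at that second."""
    groups = defaultdict(list)
    for line in lines:
        match = EVENT.match(line.strip())
        if match and match.group("state") == "UNREACHABLE":
            groups[match.group("when")].append(match.group("peer"))
    return dict(groups)


# The two notices from this post, used as sample input.
sample = [
    "[Apr 3 10:27:18] NOTICE[1996]: chan_sip.c:29381 sip_poke_noanswer: "
    "Peer '123-xxx' is now UNREACHABLE! Last qualify: 102",
    "[Apr 3 10:27:29] NOTICE[1996]: chan_sip.c:23945 handle_response_peerpoke: "
    "Peer '123-xxx' is now Reachable. (91ms / 2000ms)",
]

for when, peers in unreachable_by_second(sample).items():
    print(when, len(peers), peers)
```

Fed the full log file instead of the sample, this should list every second in which a batch of peers was marked unreachable.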
tcpdump of SIP packets:
10:26:43.550334 Server->peer OPTIONS sip:123-xxx@1.2.3.4:5063 SIP/2.0
10:26:43.644969 peer -> server SIP/2.0 200 OK
10:27:12.916477 Server->peer OPTIONS sip:123-xxx@1.2.3.4:5063 SIP/2.0
10:27:13.004472 peer -> server SIP/2.0 200 OK
10:27:18.718232 Server->peer OPTIONS sip:123-xxx@1.2.3.4:5063 SIP/2.0
10:27:18.750281 peer -> server SIP/2.0 200 OK
10:27:28.920816 Server->peer OPTIONS sip:123-xxx@1.2.3.4:5063 SIP/2.0
10:27:29.012460 peer -> server SIP/2.0 200 OK
As you can see, Asterisk probed the peer again at 10:27:18 for some reason, only about 6 seconds after the previous probe at 10:27:12 instead of the configured 25. The peer answered within the same second, yet Asterisk still declared it unreachable. After the next probe, answered at 10:27:29, the peer was declared reachable again, and the probes then continued normally every 25 seconds.
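To double-check the timing, here is a small sketch that computes the gap between consecutive OPTIONS probes and the delay of each 200 OK from the capture above. It assumes every 200 OK answers the most recent OPTIONS (the capture does not show Call-IDs), and `analyze` is just an illustrative helper name.

```python
from datetime import datetime

# Capture lines copied from the tcpdump output in this post.
CAPTURE = """\
10:26:43.550334 Server->peer OPTIONS sip:123-xxx@1.2.3.4:5063 SIP/2.0
10:26:43.644969 peer -> server SIP/2.0 200 OK
10:27:12.916477 Server->peer OPTIONS sip:123-xxx@1.2.3.4:5063 SIP/2.0
10:27:13.004472 peer -> server SIP/2.0 200 OK
10:27:18.718232 Server->peer OPTIONS sip:123-xxx@1.2.3.4:5063 SIP/2.0
10:27:18.750281 peer -> server SIP/2.0 200 OK
10:27:28.920816 Server->peer OPTIONS sip:123-xxx@1.2.3.4:5063 SIP/2.0
10:27:29.012460 peer -> server SIP/2.0 200 OK
"""


def analyze(capture):
    """Return (probe_stamp, gap_since_previous_probe_s, reply_delay_ms) per probe."""
    rows = []
    last_probe = None   # time of the previous OPTIONS
    pending = None      # OPTIONS still waiting for its 200 OK
    for line in capture.splitlines():
        if not line.strip():
            continue
        stamp, rest = line.split(" ", 1)
        when = datetime.strptime(stamp, "%H:%M:%S.%f")
        if "OPTIONS" in rest:
            gap = (when - last_probe).total_seconds() if last_probe else None
            pending = (stamp, when, gap)
            last_probe = when
        elif "200 OK" in rest and pending:
            probe_stamp, probe_time, gap = pending
            delay_ms = (when - probe_time).total_seconds() * 1000
            rows.append((probe_stamp, gap, delay_ms))
            pending = None
    return rows


for probe_stamp, gap, delay_ms in analyze(CAPTURE):
    gap_text = "n/a" if gap is None else f"{gap:.1f}s"
    print(f"{probe_stamp}  gap={gap_text}  reply={delay_ms:.0f}ms")
```

For this capture the gaps come out at roughly 29.4 s, 5.8 s and 10.2 s, and every reply arrives in under 100 ms, well inside the 2000 ms limit shown in the log.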
Has anyone encountered this problem? What could be the reason for this weird behaviour?