Lost registrations from lost STUN server

I’m running the newest Git-master version.

I need to turn off this new feature because I’m getting lots of STUN errors and resulting lost outbound registrations:
https://gerrit.asterisk.org/c/asterisk/+/16145

It worked fine before this feature was added.

Which branch? If it is the master branch, does the problem go away if you check out 18.6?

I was using the latest Git master (Wed morning, US).
I used 18.3.0 for the rest of today and had no problems.
I suspect 18.4 will be okay also.
I’ll continue updating after a full day on each version.

That change impacts rtp.conf configuration for media, not for SIP registration or anything along those lines. Do you have a STUN server configured? Is it in use on calls? What messages were you seeing?

stun.c: Attempt 1 to send STUN request to ‘173…’ timed out.
stun.c: … request 2 timed out
stun.c: …request 3 timed out
func_curl.c: Resolving timed out after 10519 ms
res_pjsip_outbound_registration.c: Obtaining token failed


on and on

“Obtaining token failed” doesn’t exist in the res_pjsip_outbound_registration source code. Do you have a patched Asterisk?

As well, you should check that the configured STUN server is actually reachable since it sounds as if it’s not responding.

Patched version = No
STUN server configured in rtp.conf works in 18.3.0

I also found that in the Git master, if I did a ‘core restart now’, all registrations would fail. If I did ‘core stop now’, gave it 10 seconds, and then ran ‘sudo asterisk’, it worked normally until the errors started a few minutes (or a couple of hours) later.

I’d suggest filing an issue[1] then, with actual logging and configuration. It may be that the STUN hostname you’re using resolves to multiple IP addresses and one or more of them is not responding, which would cause failures only in some cases.
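
One way to test the multiple-address theory outside of Asterisk is to resolve the STUN hostname to every address it returns and send a plain STUN Binding Request (RFC 5389) to each one. The following is only a minimal sketch, not anything Asterisk ships: the function names are made up for illustration, the hostname in the usage comment is a placeholder, and port 3478 is assumed because it is the standard STUN port.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442   # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001    # STUN message type: Binding Request
BINDING_SUCCESS = 0x0101    # STUN message type: Binding Success Response


def build_binding_request():
    """Return (packet, transaction_id) for a Binding Request with no attributes."""
    txid = os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + txid, txid


def is_binding_success(data, txid):
    """True if data is a Binding Success Response carrying our transaction ID."""
    if len(data) < 20:
        return False
    msg_type, _length, cookie = struct.unpack("!HHI", data[:8])
    return (msg_type == BINDING_SUCCESS
            and cookie == MAGIC_COOKIE
            and data[8:20] == txid)


def resolve_all(host, port):
    """Return every distinct (family, sockaddr) the hostname resolves to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_DGRAM)
    seen = []
    for family, _type, _proto, _canon, sockaddr in infos:
        if (family, sockaddr) not in seen:
            seen.append((family, sockaddr))
    return seen


def probe(family, sockaddr, timeout=3.0):
    """Send one Binding Request; return True if a matching reply arrives in time."""
    packet, txid = build_binding_request()
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, sockaddr)
            data, _peer = sock.recvfrom(2048)
        except OSError:  # timeouts and unreachable-network errors
            return False
    return is_binding_success(data, txid)


def check_stun_host(host, port=3478):
    """Probe each resolved address and return {address: responded}."""
    return {sockaddr[0]: probe(family, sockaddr)
            for family, sockaddr in resolve_all(host, port)}


# Example use (hypothetical hostname; substitute the STUN host from rtp.conf):
#   for address, ok in check_stun_host("stun.example.org").items():
#       print(address, "ok" if ok else "NO RESPONSE")
```

If some addresses answer and others never do, that would fit the intermittent timeouts in the log above. A single lost UDP packet also shows up as a failure here, so it is worth running the check a few times before drawing conclusions.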

[1] System Dashboard - Digium/Asterisk JIRA

I’ll do more testing first.
I just loaded 18.5.0 and it’s up so I’ll see if it has errors when I wake in the morning.

The change you reference is not in those versions, if that change really is the cause. It was only merged on the 1st and would be in the next releases.

Yes, I know, I’m not sure if it is that feature that’s causing the failure.
At that moment I was wishing that feature had an off switch for an easy test without it.

I’ve never downloaded previous (non-current) git-master versions or I would have done that. I have all the hex version numbers recorded. Is that easy to do?

Yes. Give it to “git checkout” and it’ll get it.

I’m not sure what that means.
I used e8cda4b32c around the end of July and I think it worked well.

You do “git checkout e8cda4b32c” and you’re there.

Neat, works like a charm.
Update on testing:
I ran 18.5.0 for over 10 hours and it was stable.
I jumped into git-master 466eb4a52b from late August. Worked fine for 4 hours. ‘Core restart now’ works.

Then I compiled git-master 6cc004dc5a from Sept 1.
Early in the testing I tried a ‘core restart now’ and it worked.
I tried a ‘core restart now’ after 3.5 hours and it worked.

I compiled the newest Git-master version and it seemed fine, but then I noticed much higher than normal packet loss (> 50). It was sometimes so bad that conversations were difficult. I rebooted the server and the packet loss went back to normal (2 and under). After 3 hours, no failures. Maybe it just needed the reboot. It has probably been three months (maybe six) since the last one. I didn’t think to do that because other versions did not have excessive packet loss.