SIP Registration DNS errors

Hi!

I’ve had an Asterisk 13 setup working perfectly for the past two or three months, but for the past couple of days my SIP trunk has started failing to register after the server has been running for about an hour.

Disabling and re-enabling the trunk does nothing, but rebooting brings it back up for another hour before it goes down again.

I haven’t changed any of the configuration settings. It might be a provider issue, but their support is fairly terrible, so I’d rather troubleshoot everything I can before calling them.

Thanks for any help!

Here’s the error log:

[2016-07-05 02:11:14] NOTICE[2162] chan_sip.c: -- Registration for '115555555@11555555' timed out, trying again (Attempt #1036)
[2016-07-05 02:11:34] WARNING[2162] chan_sip.c: Probably a DNS error for registration to 115555555@11555555, trying REGISTER again (after 20 seconds)

You have a bogus domain name. You probably had a working DNS server before that promptly responded that the name is bogus, but that DNS server is no longer responding.

If that’s the case then why would it start working again after a reboot?

Do you mean reloading Asterisk or rebooting the entire server?

How should this name be resolved? Could you show the following outputs:
cat /etc/hosts
cat /etc/resolv.conf
dig 11555555
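
If it helps, the same check can be scripted so you can run it repeatedly while the trunk is failing. This is only a rough sketch: it goes through the system resolver (the same /etc/hosts and /etc/resolv.conf path as the commands above), not through Asterisk's own lookup code, and the function name check_resolution and the default port 5060 are my own choices for illustration.

```python
import socket
import time


def check_resolution(host, port=5060):
    """Resolve host via the system resolver and time the lookup.

    Returns (ok, details, seconds). On success, details is a sorted list
    of addresses; on failure, it holds the resolver's error message.
    """
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_UDP)
    except socket.gaierror as exc:
        return False, [str(exc)], time.monotonic() - start
    addresses = sorted({info[4][0] for info in infos})
    return True, addresses, time.monotonic() - start


if __name__ == "__main__":
    # "localhost" is a placeholder; substitute the registrar host name
    # from your registration string.
    ok, details, seconds = check_resolution("localhost")
    status = "resolved" if ok else "FAILED"
    print(f"{status} in {seconds:.3f}s: {', '.join(details)}")
```

A lookup that fails quickly points at a wrong or missing name; one that takes many seconds before failing suggests a nameserver that is not answering.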

I meant rebooting the whole VM. Restarting Asterisk (core restart gracefully) didn’t help.

I ended up doing a fresh install of Asterisk and with almost exactly the same settings (manually rebuilt, I didn’t use a config backup just in case) it seems to be working perfectly again.

My hunch is that FreePBX/Asterisk’s DNS cache got messed up somehow but I didn’t know how to rebuild it without reinstalling.

11555555 is just my way of obscuring my phone number/SIP username. The registration string is 11555555:SECRET@11555555/115555555

Out of curiosity, I spun up the old VM and ran the commands you mentioned (replacing 11555555 with the actual number):

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

nameserver 127.0.0.1
nameserver 8.8.4.4
nameserver 8.8.8.8
namesever 192.168.1.1

When the problem started I only had “nameserver 192.168.1.1” for some reason and I added the rest just in case, but it didn’t help.
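
A side note on a file like the one above: according to resolv.conf(5), glibc only uses the first three nameserver entries (MAXNS), and a line with a misspelled keyword is not treated as a nameserver at all. The sketch below audits a resolv.conf for both conditions. The function name audit_resolv_conf is hypothetical, and whether the "namesever" spelling is in the real file or only in the forum post is something I can't tell from here.

```python
KNOWN_KEYWORDS = {"nameserver", "domain", "search", "sortlist", "options"}
MAXNS = 3  # glibc limit on nameserver entries, per resolv.conf(5)


def audit_resolv_conf(text):
    """Return (nameservers_actually_used, warnings) for resolv.conf text."""
    nameservers = []
    warnings = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        fields = line.split()
        keyword = fields[0]
        if keyword not in KNOWN_KEYWORDS:
            warnings.append(f"line {lineno}: unrecognized keyword '{keyword}'")
        elif keyword == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
    if len(nameservers) > MAXNS:
        ignored = ", ".join(nameservers[MAXNS:])
        warnings.append(f"more than {MAXNS} nameservers; ignored: {ignored}")
    return nameservers[:MAXNS], warnings


if __name__ == "__main__":
    # Sample text copied from the post above, typo included.
    sample = (
        "nameserver 127.0.0.1\n"
        "nameserver 8.8.4.4\n"
        "nameserver 8.8.8.8\n"
        "namesever 192.168.1.1\n"
    )
    used, problems = audit_resolv_conf(sample)
    print("used:", used)
    for problem in problems:
        print("warning:", problem)
```

Note that the servers are tried in order, so with 127.0.0.1 listed first, every lookup depends on whatever is listening on the loopback interface before the other servers are consulted.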

(I ran this command with the VM disconnected from the network)

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.23.rc1.e16_5.1 <<>> 115555555
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16621
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;115555555. IN A

;; ANSWER SECTION:
115555555. 0 IN NS 68.178.17.244

;; Query time: 13 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu July 7 13:32:05 2016
;; MSG SIZE rcvd: 44

And here’s the same command when run in the new VM, which works fine and which is connected to the network:

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.23.rc1.e16_5.1 <<>> 115555555
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64288
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;115555555. IN A

;; ANSWER SECTION:
115555555. 1798 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2016070700 1800 900 604800 86400

;; Query time: 43 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Thu July 7 13:33:49 2016
;; MSG SIZE rcvd: 103

In any case, the new VM is working fine so this is all academic, though it’d be nice to know what to do if it happens again.

Thanks for all the help!

So, you have a real domain name, not a bogus ‘11555555’. This sequence of digits was confusing.

Anyway, your DNS configuration is rather strange: you seem to be using a nameserver on the local machine (at least, on its loopback interface), another one elsewhere in the internal network (192.168.1.1), and Google’s public name servers.

If the issue arises again, the whole DNS configuration (and underlying network / iptables configuration) should be reviewed - this problem is not directly related to Asterisk.
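
As a starting point for that review, it can help to ask each configured nameserver the same question separately and see which one times out or returns an error. Below is a minimal sketch using only the Python standard library. It builds a plain A-record query by hand and sends it over UDP to port 53; the names build_query and query_server and the 2-second timeout are my own assumptions, and a tool like dig @server does the same job more thoroughly.

```python
import random
import socket
import struct

RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, qtype=1):
    """Build a minimal DNS query packet (qtype 1 = A record, class IN)."""
    txid = random.randint(0, 0xFFFF)
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return txid, header + qname + struct.pack("!HH", qtype, 1)


def query_server(server, name, timeout=2.0):
    """Send one query to one nameserver and summarize the reply."""
    txid, packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (server, 53))
            data, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"error: {exc}"
    if len(data) < 12:
        return "short reply"
    reply_id, flags, _qd, answers, _ns, _ar = struct.unpack("!HHHHHH", data[:12])
    if reply_id != txid:
        return "mismatched reply id"
    rcode = flags & 0x000F
    return f"rcode={RCODES.get(rcode, rcode)} answers={answers}"
```

Calling query_server once per address listed in /etc/resolv.conf, with the provider's host name as the second argument, shows at a glance whether one of the servers is silently dropping queries.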