Internet going down causes PBX to go down

We had to reboot our firewall 3 times yesterday and noticed that our Asterisk server goes down whenever the internet connection is lost.
I have a PRI and I expect it to keep working when the internet doesn’t. I expect my 2 IAX connections (remote offices) to fail, but that should not bring down the entire system.

Phones are not reachable and inbound calls are going to an IVR instead of ringing.

It’s hard to troubleshoot since I can’t bring the internet down to test, so I’m looking for things I might try.

Thanks

Possibly a DNS resolution problem in chan_sip is causing that. Hard to pin down without debugging.

I checked the server and noticed the PBX is assigning a gateway to the phones. I removed that setting, restarted a phone, and unplugged the internet connection, and calls are working as they should.

Thanks for the tip.

This is an Asterisk issue
Description:

If Asterisk loses internet connectivity or DNS, it stops responding to all SIP devices and trunks, and all extensions lose connectivity. This bug has apparently been around since Asterisk 1.4, persisted through 1.6, and remains in 1.8.

This is Matt Jordan’s response:

Asterisk uses synchronous hostname resolution when it performs a DNS lookup for a peer’s hostname. As such, if chan_sip has to resolve a hostname and the DNS server is not available, it can block the SIP do_monitor thread until the request times out. How long the call blocks depends on the system Asterisk is running on, but it can be more than several seconds. Obviously, if a large number of peers have to be resolved and Asterisk enters this state, it will become unresponsive to SIP traffic on a local intranet.

At this time, we are not planning to implement a DNS cache or asynchronous DNS lookups in Asterisk. The best solution is to instead install and configure a local DNS cache on the system that Asterisk runs on - there are many very good ones available in all major Linux distributions (and I would imagine the same to be true for other operating systems). In the case where internet connectivity is lost, this should prevent long hostname resolution times, as Asterisk will still hit the local DNS cache as opposed to timing out. In Asterisk versions 1.8 and greater, you may also find that using the dnsmgr feature (which periodically refreshes DNS information on a separate background thread) will keep chan_sip from becoming unresponsive. Without a local DNS cache, however, you may simply be trading which thread is blocked for a long period of time - so this is not a solution in and of itself.

If you can, please implement a local DNS cache on the system experiencing this behavior and retest, and let us know if this prevents the complete loss of SIP functionality when the DNS provider is no longer available (you’re obviously going to lose some SIP functionality
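Not part of the response above, just a rough way to see the blocking behaviour it describes. This is a minimal Python sketch that times one synchronous lookup through the system resolver, the same kind of blocking call that stalls a thread while DNS is unreachable. The function name `time_lookup` and the injectable `resolver` argument are made up for illustration; nothing here comes from Asterisk itself.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Time one blocking hostname lookup.

    Returns (seconds_blocked, addresses), where addresses is a sorted list
    of IP strings, or None if the lookup failed.
    `resolver` is a hypothetical hook so the function can be exercised
    without touching the network; by default it is the system resolver.
    """
    start = time.monotonic()
    try:
        # 5060 is the usual SIP port; it does not affect the DNS query.
        infos = resolver(hostname, 5060)
        addresses = sorted({info[4][0] for info in infos})
    except OSError:
        # socket.gaierror (resolution failure) is a subclass of OSError.
        addresses = None
    return time.monotonic() - start, addresses
```

Running `time_lookup("your.provider.hostname")` once with the internet up and once with it down (or with a bogus nameserver in a test VM) should show roughly how long each lookup hangs on that particular system, and whether a local DNS cache changes the picture.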

A quick, practical fix for this problem: in the SIP trunk configuration, replace any domain name with its current IP address.
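If you have a lot of trunks and peers, something like the following Python sketch could help find which entries still use a hostname. It assumes the usual sip.conf layout of `[section]` headers, `key=value` lines and `;` comments, and only looks at `host=` lines (so `register =>` lines, `outboundproxy` and the like would still need checking by hand). The function name `hostname_peers` is just for illustration.

```python
import ipaddress


def hostname_peers(conf_text):
    """Return (section, host) pairs whose host= value is a name, not an IP.

    Assumes sip.conf-style text; host=dynamic entries are skipped because
    they do not trigger a DNS lookup.
    """
    found = []
    section = None
    for raw in conf_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ; comments
        if line.startswith("[") and "]" in line:
            section = line[1:line.index("]")]
            continue
        key, sep, value = line.partition("=")
        if not sep or key.strip().lower() != "host":
            continue
        value = value.lstrip(">").strip()  # tolerate "host => value"
        if value.lower() == "dynamic":
            continue
        try:
            ipaddress.ip_address(value)
        except ValueError:
            # Not a literal IPv4/IPv6 address, so it needs DNS.
            found.append((section, value))
    return found
```

You would feed it the contents of your own config, e.g. `hostname_peers(open("/etc/asterisk/sip.conf").read())`, adjusting the path for your install. Keep in mind that hard-coding IPs means you have to update them by hand if the provider ever changes addresses.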


issues.asterisk.org/jira/browse/ASTERISK-18930

That issue has been suspended because the reporter stopped responding.

The synchronous nature of DNS comes from the original, Berkeley, implementation.

I’m not sure that an asynchronous lookup will always help, as it is a temporary failure case, and the requestor cannot assume that the name would not resolve, so should wait until it gets a definitive answer.

What if you add the Asterisk server’s IP and the provider’s domain to /etc/hosts? That way “localhost” and the provider domain are resolved before the request reaches the phase where it has to ask the DNS server which IP is behind a domain.
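A small Python sketch of how one might check that idea, assuming the standard hosts-file layout (an IP followed by one or more names, with `#` comments). It reports which of the names you care about are not yet pinned in the file. The function names and the example hostnames in the usage note are hypothetical.

```python
def parse_hosts(hosts_text):
    """Map each lowercase name in hosts-file text to its first listed IP."""
    mapping = {}
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # strip comments, split on whitespace
        if len(fields) < 2:
            continue  # blank line or an IP with no names
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name.lower(), ip)
    return mapping


def missing_from_hosts(hosts_text, names):
    """Return the names (in the given order) that have no hosts-file entry."""
    mapping = parse_hosts(hosts_text)
    return [name for name in names if name.lower() not in mapping]
```

For example, `missing_from_hosts(open("/etc/hosts").read(), ["sip.example.com"])` returning an empty list would mean the provider name is already covered. Whether the resolver actually consults /etc/hosts before DNS depends on the system's name-service configuration, so that is worth confirming too.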

Depends on whether Asterisk does reverse DNS. If it does, a reverse lookup that would simply fail while DNS is reachable will instead be delayed when it isn’t.