Network Issue, brainstorm with me please!

Hello

I have a rather tricky network issue, and I'd welcome any suggestions that help get to the bottom of it.

I run Asterisk 1.6.1.20. Clients connect to it from the internet using an FQDN (sip.x.tld), and Asterisk itself registers to several VoIP providers.

The Asterisk box has two NICs, both behind NAT: eth0=192.168.1.1 and eth1=192.168.0.2. It normally uses eth0, and everything runs smoothly.

However, occasionally the eth0 provider goes down, and I manually switch to eth1:

  • change the default gateway with ip route,
  • edit sip.conf to change localnet and externip,
  • change the DNS record so that it points to the public IP of eth1 instead of the public IP of eth0,
  • restart Asterisk.
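
To make the manual steps concrete, here is a rough Python sketch of the sip.conf edit and the route change. Everything specific in it is an assumption for illustration only: the public IPs are documentation placeholders, the 192.168.0.1 gateway is a guess at what sits in front of eth1, and the helper names are made up. It also assumes sip.conf has a single localnet line.

```python
import re

def rewrite_sip_conf(text, externip, localnet):
    """Return sip.conf text with externip and localnet set to the given values."""
    # Only uncommented option lines are replaced; everything else is kept as is.
    # Assumption: there is a single localnet line in the file.
    text = re.sub(r"(?m)^externip\s*=.*$", f"externip={externip}", text)
    text = re.sub(r"(?m)^localnet\s*=.*$", f"localnet={localnet}", text)
    return text

def default_route_command(gateway, dev):
    """Build the ip route command that moves the default route."""
    return ["ip", "route", "replace", "default", "via", gateway, "dev", dev]

# Placeholder values, not the real configuration.
sample = (
    "[general]\n"
    "externip=198.51.100.10\n"
    "localnet=192.168.1.0/255.255.255.0\n"
)
switched = rewrite_sip_conf(sample, "203.0.113.20", "192.168.0.0/255.255.255.0")
route_cmd = default_route_command("192.168.0.1", "eth1")

print(switched)
print(" ".join(route_cmd))
```

The route command is only built and printed here, not executed. The DNS change and the Asterisk restart are left out because they depend on the DNS host and the init system.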

Very quickly (the DNS record has a low TTL), everybody starts using my backup link.

So far, so good.

But a few clients, and most of the VoIP providers I register to, are then either unreachable or unstable (they lag or do not answer).

Running ngrep on any interface shows that the packets are sent correctly. They look OK to me (no wrong external IP in them) and they leave on the right interface, yet no reply comes back (Asterisk keeps resending the packets with the same CSeq number).
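
For anyone who wants to check a capture for the same symptom, here is a minimal Python sketch that counts requests sent more than once with an identical Call-ID and CSeq. A request retried after an authentication challenge normally carries a higher CSeq, so identical pairs point to plain retransmission of an unanswered request. The sample messages and the helper name are invented for illustration, and splitting the ngrep output into individual messages is assumed to be done already.

```python
import re
from collections import Counter

def retransmitted(messages):
    """Return {(call_id, cseq): count} for messages seen more than once."""
    seen = Counter()
    for msg in messages:
        # "i" is the compact form of the Call-ID header.
        call_id = re.search(r"(?im)^(?:Call-ID|i)\s*:\s*(\S+)", msg)
        cseq = re.search(r"(?im)^CSeq\s*:\s*(\d+\s+\w+)", msg)
        if call_id and cseq:
            seen[(call_id.group(1), cseq.group(1))] += 1
    return {key: n for key, n in seen.items() if n > 1}

# Invented sample: one REGISTER sent three times, one OPTIONS sent once.
register = (
    "REGISTER sip:provider.example SIP/2.0\r\n"
    "Call-ID: abc123@192.168.0.2\r\n"
    "CSeq: 102 REGISTER\r\n\r\n"
)
options = (
    "OPTIONS sip:provider.example SIP/2.0\r\n"
    "Call-ID: def456@192.168.0.2\r\n"
    "CSeq: 7 OPTIONS\r\n\r\n"
)
repeats = retransmitted([register, options, register, register])
print(repeats)
```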

I can ping / traceroute the IPs of these providers, no issues.

So why would most of my clients get routed correctly, while large VoIP providers that do not filter on IP (siptraffic, etc.) get lost in space?

Any thoughts are welcome!

J