Timeout doing dnsmgr_lookup causing lockup

I am using FreePBX 2.10.0-99 with Asterisk 1.8.20.1-34_centos6.

At times the phones do not work at all: they wait for some time before initiating a call. I put SIP into debug mode and found a number of entries similar to:

[2013-03-11 15:59:38] VERBOSE[8967] chan_sip.c: --- (8 headers 0 lines) ---
[2013-03-11 15:59:38] VERBOSE[8967] chan_sip.c: Responding to challenge, registration to domain/host name myprovider.org
[2013-03-11 15:59:38] VERBOSE[8967] dnsmgr.c: > doing dnsmgr_lookup for 'myprovider.org'
[2013-03-11 15:59:58] VERBOSE[8967] chan_sip.c: REGISTER 11 headers, 0 lines
[2013-03-11 15:59:58] VERBOSE[8967] chan_sip.c: Reliably Transmitting (NAT) to XXX.XXX.XXX.XXX:5060: REGISTER sip:myprovider.org SIP/2.0

What I noticed is that between 15:59:38 and 15:59:58 nothing happens on the system while the dnsmgr_lookup is in progress, and this is the period when the phones do not work at all. It seems that the lookup times out after 20 seconds and the system then goes back to working normally.
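To find every stall of this kind rather than spotting them by eye, a short script can scan the log for long silences between consecutive timestamped lines. This is only a rough sketch: the function name find_gaps and the 5-second threshold are my own choices, and it assumes every log line of interest starts with a timestamp in the format shown above.

```python
import re
from datetime import datetime

# Matches the leading "[YYYY-MM-DD HH:MM:SS]" timestamp of a log line.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")


def find_gaps(lines, threshold_seconds=5):
    """Return (gap_seconds, line_before, line_after) for each silent period.

    find_gaps and threshold_seconds are hypothetical names for this sketch.
    """
    gaps = []
    prev_time = None
    prev_line = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines that do not start with a timestamp
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if prev_time is not None:
            delta = (current - prev_time).total_seconds()
            if delta >= threshold_seconds:
                gaps.append((delta, prev_line.rstrip(), line.rstrip()))
        prev_time, prev_line = current, line
    return gaps


# Three of the log lines quoted above, used as sample input.
sample = [
    "[2013-03-11 15:59:38] VERBOSE[8967] chan_sip.c: Responding to challenge, registration to domain/host name myprovider.org",
    "[2013-03-11 15:59:38] VERBOSE[8967] dnsmgr.c: > doing dnsmgr_lookup for 'myprovider.org'",
    "[2013-03-11 15:59:58] VERBOSE[8967] chan_sip.c: REGISTER 11 headers, 0 lines",
]

for seconds, before, after in find_gaps(sample):
    print(f"{seconds:.0f}s gap between:\n  {before}\n  {after}")
```

On the sample lines this reports a single 20-second gap, starting at the dnsmgr_lookup entry. To run it against a real log, pass an open file object to find_gaps instead of the sample list.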

I have circumvented this problem by adding “srvlookup=no” to sip.conf. I am still slightly concerned that I may need srvlookup some time and so would like to resolve the underlying problem. I am planning to do some network tracing to see what is happening, but before I embark on that I would like to check that I am working in the correct direction.

One thing that I have noticed is that ping to my service provider never completes:

ping myprovider.org
PING myprovider.org (XXX.XXX.XXX.XXX) 56(84) bytes of data.
^C
--- myprovider.org ping statistics ---
19 packets transmitted, 0 received, 100% packet loss, time 18652ms

and I was wondering whether dnsmgr_lookup was trying a ping and then timing out. Is that likely to be related?

Charles.

I would recommend monitoring the network traffic, so that you can see exactly what happens on the network when the message appears.

If the problem is a DNS server that is slow to respond, you can manually add an entry for myprovider.org to the /etc/hosts file.
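A quick way to check whether ordinary name resolution is slow, before and after editing /etc/hosts, is to time a lookup through the system resolver. The sketch below is only an illustration: time_lookup is a made-up helper name, and port 5060 is taken from the log above. Note that it performs a normal address lookup, so it says nothing about SRV records.

```python
import socket
import time


def time_lookup(hostname, port=5060):
    """Resolve hostname via the system resolver; return (seconds, addresses).

    time_lookup is a hypothetical helper for this sketch.
    """
    start = time.monotonic()
    try:
        results = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_UDP)
        # Each result is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the address as a string.
        addresses = sorted({info[4][0] for info in results})
    except socket.gaierror as error:
        print(f"lookup failed: {error}")
        addresses = []
    elapsed = time.monotonic() - start
    return elapsed, addresses


elapsed, addresses = time_lookup("localhost")
print(f"localhost resolved in {elapsed:.3f}s: {addresses}")
```

Replace "localhost" with myprovider.org on the Asterisk box. If this returns quickly while Asterisk still stalls, the delay is probably not in the plain hostname lookup.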

Thanks dejanst,

I am going to have to trace the network traffic.

The DNS server responds quickly, and I have added an entry for the provider to /etc/hosts just in case. There is definitely something timing out here; I just need to find out what it is. The fact that a ping gets no response still makes me a bit suspicious, so I’ll investigate in more detail.

Thanks,

Charles.

More information on this in case anybody else finds it useful.

I think my problem is in ast_get_srv. If I run exactly the same system behind a different firewall everything works OK. I think the firewall on our router is blocking some traffic in ast_get_srv. I’ll investigate more.
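For anyone wanting to test the same theory, one option is to send a DNS SRV query by hand from behind each firewall and time the reply. The sketch below builds a minimal query with the standard library only. It is not a reproduction of what ast_get_srv does internally. The function names are my own, and the record name _sip._udp.myprovider.org is an assumption based on the usual SRV naming scheme for SIP over UDP.

```python
import socket
import struct
import time

SRV_TYPE = 33  # DNS record type for SRV
IN_CLASS = 1   # Internet class


def build_srv_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for the SRV records of name."""
    # Header: id, flags (recursion desired), 1 question, no other sections.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Each label is encoded as a length byte followed by its characters.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", SRV_TYPE, IN_CLASS)
    return header + question


def time_srv_query(name, server, timeout=25.0):
    """Send the query to server over UDP; return (seconds, reply or None)."""
    packet = build_srv_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (server, 53))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # includes the timeout case
            reply = None
        return time.monotonic() - start, reply
```

A call would look like time_srv_query("_sip._udp.myprovider.org", "192.0.2.53"), where 192.0.2.53 is a placeholder for the nameserver listed in /etc/resolv.conf. If the reply is None behind one firewall but arrives promptly behind the other, that would point to the SRV query or its response being dropped.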

Charles.