Today two customers stopped working because the SIP trunk became Unreachable.
However the OPTIONS request was not sent - I got this error instead:
ERROR[63] res_pjsip.c: Error 320047 ‘No answer record in the DNS response (PJLIB_UTIL_EDNSNOANSWERREC)’ sending OPTIONS request to endpoint 51_vk102136zylXXX
I tried disabling qualify, but then the INVITEs were not sent either.
Inbound traffic worked fine, but outbound requests never worked.
Restarting Asterisk fixed the issue, but this naturally also dropped all calls, which is not a solution that our customers like.
Do you have any clue as to what the root cause could be?
The root cause would be either an upstream DNS server failing for the name Asterisk was trying to resolve, or that server returning no records. Every time I’ve seen this come up, it’s been something upstream in some way. Turning on debug logging will also tell you what the DNS resolver is doing and what it got back.
Is debug level 1 enough to get this information, or should I set it higher?
“Unfortunately” everything works fine after a restart, so I can’t see how this can be a problem with the external DNS server, since restarting Asterisk fixes the issue.
Is there a way that I can restart or reload only the DNS part without dropping ongoing calls?
Debug level 2 would be needed I believe. And restarting Asterisk will clear any state including DNS cache. There is no functionality to “restart” the DNS portion.
What is the output of “module show like resolver”?
If a resolver module isn’t loaded then it falls back to the underlying system primitives. It will still work regardless, but it just nudges my analysis even further into something outside of Asterisk in some way.
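One way to test the "outside of Asterisk" theory is to ask the system resolver for the same name from the affected host while the trunk is still Unreachable. The following is only a minimal sketch: the function name resolve_target and the injectable resolver argument are made up for illustration, and it covers plain address (A/AAAA) lookups through the operating system, not whatever NAPTR/SRV handling the PJSIP resolver may do.

```python
import socket


def resolve_target(host, port, resolver=socket.getaddrinfo):
    """Return the sorted, unique IP addresses the system resolver gives for host.

    `resolver` is injectable only so the function can be exercised without
    network access; by default it is the standard socket.getaddrinfo.
    """
    try:
        infos = resolver(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # The OS-level lookup failed (no such name, no records, server failure).
        print(f"{host}: lookup failed: {exc}")
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling resolve_target("vk179970.zylinccloud.supertrunk.net", 5061) on the failing machine and getting addresses back would suggest that the system-level DNS path is healthy and that the problem sits closer to Asterisk's own resolution state. An empty list would point back at the upstream DNS.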
I just found a third customer with the same problem. This customer isn’t as busy as the others, so it is a lot easier to debug.
The debug log shows this (with debug set to 5):
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip/pjsip_options.c:927 sip_options_qualify_aor: Qualifying all contacts on AOR ‘4_TDCNuudaySIPTrunk’
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip/pjsip_options.c:856 sip_options_qualify_contact: Qualifying contact ‘4_TDCNuudaySIPTrunk@@4ddd0a01390170f2f735d1e9d2740f2b’ on AOR ‘4_TDCNuudaySIPTrunk’
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip.c:1566 endpt_send_request: 0x7facb4018ce0: Wrapper created
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip.c:1581 endpt_send_request: 0x7facb4018ce0: Set timer to 3000 msec
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip/pjsip_resolver.c:475 sip_resolve: Performing SIP DNS resolution of target ‘vk179970.zylinccloud.supertrunk.net’
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip/pjsip_resolver.c:502 sip_resolve: Transport type for target ‘vk179970.zylinccloud.supertrunk.net’ is ‘TLS transport’
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip/pjsip_resolver.c:545 sip_resolve: [0x7facb4007d88] Created resolution tracking for target ‘vk179970.zylinccloud.supertrunk.net’
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip/pjsip_resolver.c:608 sip_resolve: [0x7facb4007d88] No resolution queries for target ‘vk179970.zylinccloud.supertrunk.net’
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip.c:1444 endpt_send_request_cb: 0x7facb4018ce0: PJSIP tsx response received
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip.c:1457 endpt_send_request_cb: 0x7facb4018ce0: Cancelling timer
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip.c:1466 endpt_send_request_cb: 0x7facb4018ce0: Timer cancelled
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip.c:1487 endpt_send_request_cb: 0x7facb4018ce0: Callbacks executed
[Apr 29 09:46:49] ERROR[1168]: res_pjsip.c:1619 endpt_send_request: Error 320047 ‘No answer record in the DNS response (PJLIB_UTIL_EDNSNOANSWERREC)’ sending OPTIONS request to endpoint 4_TDCNuudaySIPTrunk
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip.c:1543 send_request_wrapper_destructor: 0x7facb4018ce0: wrapper destroyed
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip/pjsip_options.c:755 sip_options_contact_status_notify_task: Contact 4_TDCNuudaySIPTrunk/sip:vk179970.zylinccloud.supertrunk.net:5061 status didn’t change: Unreachable, RTT: 0.000 msec
[Apr 29 09:46:49] DEBUG[1168]: res_pjsip/pjsip_options.c:775 sip_options_contact_status_notify_task: AOR ‘4_TDCNuudaySIPTrunk’ now has 0 available contacts
I haven’t restarted Asterisk yet, so if you have any other suggestions, then I would love to hear them.
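In the debug output above, the line "No resolution queries for target" comes straight after "Created resolution tracking", with no query shown in between. To see how often that happens and for which targets, a debug log can be scanned with something like the sketch below. The function name targets_without_queries is hypothetical, and the pattern is based only on the log lines pasted in this thread, so it may need adjusting for other Asterisk versions.

```python
import re
from collections import Counter

# Accept straight or typographic quotes, because logs pasted into a forum
# often have their quotes converted.
NO_QUERIES = re.compile(
    r"No resolution queries for target [‘'](?P<target>[^’']+)[’']"
)


def targets_without_queries(lines):
    """Count, per target, the lines where the resolver logged no queries."""
    counts = Counter()
    for line in lines:
        match = NO_QUERIES.search(line)
        if match:
            counts[match.group("target")] += 1
    return counts
```

Feeding it the lines of the debug log file (the file location depends on logger.conf, so it is not assumed here) gives a per-hostname count. Comparing that count between the working and the failing tenant may show whether the failing one ever gets as far as issuing a DNS query.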
What is the configuration of the transports in pjsip.conf? And what is the AOR configuration? And did you make changes to transports (such as adding a new one) without restarting?
No, I haven’t changed anything in the transport section.
It has allow_reload set to true, which in my experience works fine as long as you don’t change the certificate or the port when multiple trunks use the same port (which I don’t have for TLS).
It works fine, until it doesn’t. The only time it is needed anymore is if you change the bind option basically. It causes more problems than it helps generally.
Do you have a log of this failing case on this system that I can compare against? Do you have a log of a working case? Previously it reported a transport type of TLS, which surprises me, since the URI in the configuration you provided does not specify TLS.
Sorry about the bits and pieces.
I sadly do not have a log with debug enabled from before the error started to occur.
Instead I have attached configuration and logs for 2 tenants: one that is working and one that is failing. Both tenants are using TLS.
I hope that this is good enough to work with.
Do you have logging from the startup of Asterisk on the failure case? Specifically these:
[Apr 29 10:54:27] VERBOSE[1694] res_pjsip/pjsip_resolver.c: 'UDP+IPv4' is an available SIP transport
[Apr 29 10:54:27] VERBOSE[1694] res_pjsip/pjsip_resolver.c: 'TCP+IPv4' is an available SIP transport
[Apr 29 10:54:27] VERBOSE[1694] res_pjsip/pjsip_resolver.c: 'TLS+IPv4' is an available SIP transport
[Apr 29 10:54:27] VERBOSE[1694] res_pjsip/pjsip_resolver.c: 'UDP+IPv6' is not an available SIP transport, disabling resolver support for it
[Apr 29 10:54:27] VERBOSE[1694] res_pjsip/pjsip_resolver.c: 'TCP+IPv6' is not an available SIP transport, disabling resolver support for it
[Apr 29 10:54:27] VERBOSE[1694] res_pjsip/pjsip_resolver.c: 'TLS+IPv6' is not an available SIP transport, disabling resolver support for it
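If a startup log does become available, the availability lines quoted above can be pulled out and compared between the working and the failing tenant. This is a rough sketch that matches only the wording shown in those six lines; the function name transport_availability is an assumption for illustration.

```python
import re

# Matches e.g. "'TLS+IPv4' is an available SIP transport" and
# "'UDP+IPv6' is not an available SIP transport, disabling resolver support for it".
TRANSPORT_LINE = re.compile(
    r"[‘'](?P<name>(?:UDP|TCP|TLS)\+IPv[46])[’'] is (?P<neg>not )?an available SIP transport"
)


def transport_availability(lines):
    """Map each transport name found in the log to True (available) or False."""
    status = {}
    for line in lines:
        match = TRANSPORT_LINE.search(line)
        if match:
            # The word "not" is present only when the transport is unavailable.
            status[match.group("name")] = match.group("neg") is None
    return status
```

On the failing tenant, the entry of interest would be 'TLS+IPv4', since the resolver reported the target's transport type as TLS.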
No, sadly I have only enabled debug from the console.
I have now enabled it in the config for the 2 tenants that I sent you logs for, but if I restart the failing one then the problem goes away, and I can’t get debug logs for a failing scenario.
Is there any useful information that I should gather before restarting?