OpenWRT 23.05.3 NBG6716 Asterisk 20.5.2

Greetings all!

I’ve been having severe issues with Asterisk on my Zyxel NBG6716 routers running a custom OpenWRT image.
I thought it was about time to phase out my incredibly old Asterisk 1.8 installations, so I opted for the last version that still supports chan_sip (I know it’s deprecated, but I don’t have the capacity to switch right now).

I’ve tried multiple installations in different variations, with different OpenWRT versions and different Asterisk versions, but I keep having one problem in particular:

While everything generally works, the “app_queue.so” module doesn’t really.

I set up my extensions.conf to forward to a queue “zentrale” in queues.conf, which has only one member, “SIP/12”.

When I dial a test number (in my case “425”) from a SIP client, the queue at first works as expected, but as soon as the extension “SIP/12” picks up, Asterisk crashes with a SIGSEGV.

Has anyone else experienced a similar problem, or does anyone know a fix?

Thanks!
Michel

This is going to be difficult as you really need to eliminate chan_sip, as crashes due to that will not get investigated or fixed.

Normally, for crashes, you should make sure you have built with optimisation disabled, then obtain backtraces from the core dump and report them on the GitHub issue tracker.

If you think chan_sip may compromise that, we’d need the same backtraces, here, to have a chance of understanding the problem.

The only true fix for a segmentation violation is to fix the source code. Anything else is just a question of avoiding the situation that caused it.

Hmm…
That’s unfortunate.
But does it even matter if it’s chan_sip or pjsip?
Doesn’t pjsip also use app_queue…? Maybe I need to fall back to different options instead of app_queue…
I don’t know if there are any alternatives(?)

The app_queue module uses whatever channels you configure, chan_sip or chan_pjsip based. There is insufficient information to know where the issue really is.

Good to know.

Is there any specific information I could provide that would help identify the issue?
Here’s what’s in the console before the crash:

Added interface 'SIP/12' to queue 'zentrale'
    -- Registered SIP '11' at 192.168.1.124:42350
[Apr 22 13:34:37] NOTICE[3473]: chan_sip.c:25011 handle_response_peerpoke: Peer '11' is now Reachable. (489ms / 2000ms)
    -- Registered SIP '12' at 192.168.1.169:57589
[Apr 22 13:34:48] NOTICE[3473]: chan_sip.c:25011 handle_response_peerpoke: Peer '12' is now Reachable. (148ms / 2000ms)
  == Using SIP RTP CoS mark 5
       > 0xd91700 -- Strict RTP learning after remote address set to: 192.168.1.124:38597
    -- Executing [425@amt:1] Answer("SIP/11-00000000", "") in new stack
    -- Executing [425@amt:2] Wait("SIP/11-00000000", "1") in new stack
       > 0xd91700 -- Strict RTP switching to RTP target address 192.168.1.124:38597 as source
    -- Executing [425@amt:3] NoOp("SIP/11-00000000", "Zentrale an Warteschlange") in new stack
    -- Executing [425@amt:4] Goto("SIP/11-00000000", "queue,zentrale,1") in new stack
    -- Goto (queue,zentrale,1)
    -- Executing [zentrale@queue:1] Answer("SIP/11-00000000", "") in new stack
    -- Executing [zentrale@queue:2] Wait("SIP/11-00000000", "2") in new stack
    -- Executing [zentrale@queue:3] Playback("SIP/11-00000000", "/mnt/usb/ansagen/ansage11") in new stack
    -- Executing [zentrale@queue:4] Queue("SIP/11-00000000", "zentrale,tT,,120") in new stack
  == Using SIP RTP CoS mark 5
    -- Called SIP/12
    -- SIP/12-00000001 connected line has changed. Saving it until answer for SIP/11-00000000
    -- SIP/12-00000001 is ringing
       > 0xd91700 -- Strict RTP learning complete - Locking on source address 192.168.1.124:38597
       > 0x76e43630 -- Strict RTP learning after remote address set to: 192.168.1.169:7276
    -- SIP/12-00000001 connected line has changed. Saving it until answer for SIP/11-00000000
    -- SIP/12-00000001 answered SIP/11-00000000
OpenWrt*CLI> 
Disconnected from Asterisk server
Asterisk cleanly ending (0).
Executing last minute cleanups

Thanks!!

A backtrace would be required. We provide instructions:

https://docs.asterisk.org/Development/Debugging/Getting-a-Backtrace/?h=backtrace

But I have no idea whether they are applicable or work in such an environment.

The problem with a segmentation violation is that the initial corruption may not have occurred close to the failure. Sometimes it even happens on another thread.

Okay, so, update:

I took on the small learning curve to set up the same rough structure with pjsip.
Same problem: as soon as I pick up a call from a queue, Asterisk crashes:

[ 4168.760322] do_page_fault(): sending SIGSEGV to asterisk for invalid read access from 00000001
[ 4168.769157] epc = 77356b3c in libc.so.6[772b0000+1b3000]
[ 4168.774603] ra = 77873ced in libjansson.so.4.14.0[77870000+9000]

So chan_sip doesn’t seem to be the problem then, or at least not the main one.

Unfortunately, OpenWRT doesn’t seem to do backtraces or useful coredumps…
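
Short of a real backtrace, the two kernel lines above can at least be reduced to library-relative offsets, since each one gives the faulting address together with the base and size of the mapping it fell into. The following is only a rough sketch of that arithmetic in Python; the helper name `library_offsets` and the regular expression are made up for this illustration, and it assumes the `[base+size]` values are hexadecimal, as they appear to be in the log.

```python
import re

# Kernel page-fault lines copied verbatim from the log above.
LOG = """\
[ 4168.769157] epc = 77356b3c in libc.so.6[772b0000+1b3000]
[ 4168.774603] ra = 77873ced in libjansson.so.4.14.0[77870000+9000]
"""

# Assumed line shape: "<reg> = <addr> in <library>[<base>+<size>]", all hex.
PATTERN = re.compile(
    r"(?P<reg>epc|ra) = (?P<addr>[0-9a-f]+) in "
    r"(?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def library_offsets(log):
    """Return {register: (library, offset)} for each address in the log."""
    result = {}
    for m in PATTERN.finditer(log):
        addr = int(m["addr"], 16)
        base = int(m["base"], 16)
        size = int(m["size"], 16)
        offset = addr - base
        # Sanity check: the address has to lie inside the reported mapping.
        if not 0 <= offset < size:
            raise ValueError(f"{m['reg']} is outside {m['lib']}")
        result[m["reg"]] = (m["lib"], offset)
    return result


if __name__ == "__main__":
    for reg, (lib, off) in library_offsets(LOG).items():
        print(f"{reg}: {lib} + {off:#x}")
```

With an unstripped copy of each library from the same build, offsets like these could be looked up with a symbol tool such as addr2line, assuming the mapping starts at the beginning of the file. At best that names the libc function that faulted and its caller inside libjansson; it does not show the parameters and variables that a full backtrace would.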

Seems like the queues just don’t want to work on ath79/mips_24kc anymore.

I’ll try a bit more and if it doesn’t work I’ll have to abandon this project for a while.

Will update if I do find out anything else!

You need the full backtrace; it is the contents of the parameters and variables that will give a clue as to what is going wrong.