PJSIP Stuck Channels, Asterisk Stops Working

I am having a problem with Asterisk 21.4.3: some channels stay active for hours and are never hung up. They seem to get stuck.

    Channel: PJSIP/axxonbot1-0000f6d7/SpeechBackground           Up            01:41:48
        Exten: 550                       CLCID: "" <>
    Channel: PJSIP/axxonbot1-000141fa/SpeechBackground           Up            00:30:43
        Exten: 550                       CLCID: "" <>
    Channel: PJSIP/axxonbot1-0000765e/SpeechBackground           Up            04:24:34
    Channel: PJSIP/axxonbot3-00002c78/SpeechBackground           Up            17:02:58
        Exten: 550                       CLCID: "" <>
    Channel: PJSIP/axxonbot3-00002c64/SpeechBackground           Up            17:03:02

This is what the channels are usually doing:

PJSIP/axxonbot5-00002c72                                         550@default:20                   Up      SpeechBackground(silence,15)
PJSIP/axxonbot1-00012371                                         550@default:20                   Up      SpeechBackground(silence,15)
PJSIP/axxonbot5-00002c5b                                         550@default:20                   Up      SpeechBackground(silence,15)
PJSIP/axxonbot1-00012372                                         550@default:20                   Up      SpeechBackground(silence,15)
PJSIP/axxonbot1-00012375                                         550@default:20                   Up      SpeechBackground(silence,15)
PJSIP/axxonbot2-00012373                                         550@default:20                   Up      SpeechBackground(silence,15)
PJSIP/axxonbot1-00012341                                         550@default:20                   Up      SpeechBackground(silence,15)
PJSIP/axxonbot1-00012340                                         550@default:29                   Up      Playback(playback_files/pitch/

Also, after a few hours my Asterisk becomes unreachable on port 5060 from the other PBX system, and I get this on the CLI:

[Feb  6 22:55:32] VERBOSE[1619070][C-0001443b] app_mixmonitor.c: End MixMonitor Recording PJSIP/axxonbot5-0001443a
[Feb  6 23:12:25] VERBOSE[2114883] asterisk.c: Remote UNIX connection
[Feb  6 23:12:27] VERBOSE[1622674] asterisk.c: Remote UNIX connection disconnected
[Feb  6 23:14:36] VERBOSE[2114883] asterisk.c: Remote UNIX connection
[Feb  6 23:14:38] VERBOSE[1622737] asterisk.c: Remote UNIX connection disconnected
[Feb  6 23:16:45] VERBOSE[2114883] asterisk.c: Remote UNIX connection
[Feb  6 23:16:46] VERBOSE[1622776] asterisk.c: Remote UNIX connection disconnected
--------

It starts working again after I core restart Asterisk, and it then performs as it should, but this keeps happening two or three times a day, which is disrupting production. How can I fix this?

Does manually hanging up the channels using “channel request hangup” work? What are you using for a speech engine?

I am using AEAP for the speech engine, and yes, when I used “channel request hangup” the channels did get hung up. The real problem is that Asterisk stops working: it doesn’t accept any more socket connections or call INVITEs, and I am not getting any verbose output on the CLI.

systemctl status asterisk.service
● asterisk.service - LSB: Asterisk PBX
     Loaded: loaded (/etc/init.d/asterisk; generated)
     Active: active (running) since Thu 2025-02-06 16:09:14 CET; 7h ago
       Docs: man:systemd-sysv-generator(8)
    Process: 2114862 ExecStart=/etc/init.d/asterisk start (code=exited, status=0/SUCCESS)
      Tasks: 260
     Memory: 20.6G
        CPU: 12h 54min 2.930s
     CGroup: /system.slice/asterisk.service
             └─2114874 /usr/sbin/asterisk

When I check the status of Asterisk it seems okay and I am able to access the Asterisk CLI, but nothing is printed on the CLI and Asterisk is unreachable from the other PBX system. Only after I restart Asterisk does it start working as it should.

You could collect a backtrace[1] to see if it is deadlocked in PJSIP for some reason.

[1] Getting a Backtrace - Asterisk Documentation

Do I have to run this on a running Asterisk instance, or after it has crashed?

According to your past comments, it didn’t crash. It just stopped working. It would need to be done while in a non-working state.

Okay, I will run this and collect the backtrace.

As I stated though, the system has to be in the non-working state. If you do it when things are working it is useless.

Yes, I will run it in the non-working state, the next time it stops working.
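Because the backtrace is only useful while Asterisk is in the non-working state, it can help to detect that state promptly. The following is a minimal sketch, not a tested monitoring tool: it sends a SIP OPTIONS request to port 5060 and treats any SIP response as a sign of life. It assumes the transport is UDP (the thread only says "port 5060"); the function names, the `probe` user in the From header, and the 3-second timeout are all made up for illustration.

```python
import socket
import uuid


def build_options_request(host: str, port: int,
                          local_ip: str = "127.0.0.1",
                          local_port: int = 5070) -> bytes:
    """Build a minimal SIP OPTIONS request addressed to host:port.

    local_ip/local_port only fill in the Via header; "rport" asks the
    server to reply to the packet's real source address and port.
    """
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]
    lines = [
        f"OPTIONS sip:{host}:{port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{local_port};branch={branch};rport",
        "Max-Forwards: 70",
        f"From: <sip:probe@{local_ip}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{host}:{port}>",
        f"Call-ID: {uuid.uuid4().hex}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def sip_is_responding(host: str, port: int = 5060,
                      timeout: float = 3.0) -> bool:
    """Return True if any SIP response arrives within `timeout` seconds.

    Assumption: UDP transport. Even an error response such as 401 or 404
    counts as "responding", since it shows the SIP stack is alive.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind(("", 0))
        local_port = sock.getsockname()[1]
        request = build_options_request(host, port, local_port=local_port)
        try:
            sock.sendto(request, (host, port))
            data, _ = sock.recvfrom(4096)
        except OSError:  # covers timeouts and "connection refused"
            return False
    return data.startswith(b"SIP/2.0 ")
```

Run periodically (for example from cron), a `False` result from `sip_is_responding(...)` would be the cue to collect the backtrace before restarting anything.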

root@ams-bot-1 ~ # sudo /var/lib/asterisk/scripts/ast_coredumper --running --no-default-search
No running asterisk instances detected.
If you know the pid of the process you want to dump,
supply it on the command line with --pid=<pid>.
root@ams-bot-1 ~ # asterisk -r
Asterisk 21.4.3, Copyright (C) 1999 - 2022, Sangoma Technologies Corporation and others.
Created by Mark Spencer <markster@digium.com>
Asterisk comes with ABSOLUTELY NO WARRANTY; type 'core show warranty' for details.
This is free software, with components licensed under the GNU General Public
License version 2 and other licenses; you are welcome to redistribute it under
certain conditions. Type 'core show license' for details.
=========================================================================
Connected to Asterisk 21.4.3 currently running on ams-bot-1 (pid = 4033290)
ams-bot-1*CLI>
Disconnected from Asterisk server
Asterisk cleanly ending (0).
Executing last minute cleanups

I am trying to collect the backtrace, but the script reports that no Asterisk instance was detected, even though I am able to access the Asterisk CLI while it is in the stopped state. Looking at top, there is no asterisk process running at the moment.
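The script's own hint is to pass `--pid=<pid>`, and the `asterisk -r` banner above prints the pid of the instance it connected to. As a small sketch, the pid can be pulled out of that banner line and turned into the coredumper command line. The helper names are hypothetical; the script path and flags are simply the ones used in this thread.

```python
import re
from typing import List, Optional

# Matches the "(pid = 4033290)" part of the "Connected to Asterisk ..." line.
BANNER_PID = re.compile(r"\(pid = (\d+)\)")


def pid_from_banner(banner: str) -> Optional[int]:
    """Return the pid printed in the `asterisk -r` banner, or None."""
    match = BANNER_PID.search(banner)
    return int(match.group(1)) if match else None


def coredumper_command(
    pid: int,
    script: str = "/var/lib/asterisk/scripts/ast_coredumper",
) -> List[str]:
    """Build the ast_coredumper argument list for an explicit pid."""
    return [script, "--running", "--no-default-search", f"--pid={pid}"]
```

For example, feeding it the line `Connected to Asterisk 21.4.3 currently running on ams-bot-1 (pid = 4033290)` gives the pid to use with `--pid=`.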

root@ams-bot-1 ~ # sudo /var/lib/asterisk/scripts/ast_coredumper --running --no-default-search --pid=4033290
Found a single asterisk instance running as process 4033290
WARNING:  Taking a core dump of the running asterisk instance will suspend call processing while the dump is saved.  Do you wish to continue? (y/N) y
Dumping running asterisk process to /tmp/core-asterisk-running-2025-02-10T18-28-00Z
Dump is complete.
Processing /tmp/core-asterisk-running-2025-02-10T18-28-00Z
    ASTBIN: /usr/sbin/asterisk
    MODDIR: /usr/lib/asterisk/modules
    ETCDIR: /etc/asterisk
    LIBDIR: /usr/lib
    Creating /tmp/core-asterisk-running-2025-02-10T18-28-00Z-thread1.txt
    Creating /tmp/core-asterisk-running-2025-02-10T18-28-00Z-brief.txt
    Creating /tmp/core-asterisk-running-2025-02-10T18-28-00Z-full.txt
    Creating /tmp/core-asterisk-running-2025-02-10T18-28-00Z-locks.txt
    Creating /tmp/core-asterisk-running-2025-02-10T18-28-00Z-info.txt
root@ams-bot-1 ~ # cd /tmp/
root@ams-bot-1 /tmp # ls
core-asterisk-running-2025-02-10T18-28-00Z            core-asterisk-running-2025-02-10T18-28-00Z-info.txt     systemd-private-c7aac41655b9472bb9f0698e03d40891-redis-server.service-MmHeGL
core-asterisk-running-2025-02-10T18-28-00Z-brief.txt  core-asterisk-running-2025-02-10T18-28-00Z-locks.txt    systemd-private-c7aac41655b9472bb9f0698e03d40891-systemd-logind.service-QP44a1
core-asterisk-running-2025-02-10T18-28-00Z-full.txt   core-asterisk-running-2025-02-10T18-28-00Z-thread1.txt  systemd-private-c7aac41655b9472bb9f0698e03d40891-systemd-timesyncd.service-nIonqB
root@ams-bot-1 /tmp # cat core-asterisk-running-2025-02-10T18-28-00Z-locks.txt
!@!@!@! locks.txt !@!@!@!

$4 = {si_signo = 19, si_errno = 0, si_code = 128, _sifields = {_pad = {0 <repeats 28 times>}, _kill = {si_pid = 0, si_uid = 0}, _timer = {si_tid = 0, si_overrun = 0, si_sigval = {sival_int = 0, sival_ptr = 0x0}}, _rt = {si_pid = 0, si_uid = 0, si_sigval = {sival_int = 0, sival_ptr = 0x0}}, _sigchld = {si_pid = 0, si_uid = 0, si_status = 0, si_utime = 0, si_stime = 0}, _sigfault = {si_addr = 0x0, _addr_lsb = 0, _addr_bnd = {_lower = 0x0, _upper = 0x0}}, _sigpoll = {si_band = 0, si_fd = 0}, _sigsys = {_call_addr = 0x0, _syscall = 0, _arch = 0}}}
Signal        Stop	Print	Pass to program	Description
root@ams-bot-1 /tmp # nano core-asterisk-running-2025-02-10T18-28-00Z-locks.txt

This was the result, but after running this Asterisk started running again and I am getting verbose output on the CLI.

Is there anything that can be done to make it work? The issue is getting worse: it now stops working 3 to 4 times a day, every day, and I want to mitigate that. If the stuck PJSIP channels are causing it to stop, is there a way I can automate hanging up the stuck PJSIP channels?

You never provided the actual files from the backtrace when the issue was occurring, so I can’t say anything further.
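On the question of automating the hangup: a rough sketch is below. It parses a channel listing in the format pasted at the top of the thread and requests a hangup for channels that have been in `SpeechBackground` longer than a threshold. Several things here are assumptions: that the listing comes from `pjsip show channels`, that the channel names in it are not truncated, and that one hour is a sensible cutoff (`max_age_s` is a made-up default). Whether stuck channels actually cause the stall was not established in this thread, so this treats the symptom at best.

```python
import re
import subprocess
from typing import List, Tuple

# Matches lines such as:
#   Channel: PJSIP/axxonbot1-0000f6d7/SpeechBackground   Up   01:41:48
CHANNEL_LINE = re.compile(
    r"^\s*Channel:\s+(?P<name>\S+)/(?P<app>[^/\s]+)\s+(?P<state>\S+)\s+"
    r"(?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2})\s*$"
)


def stuck_channels(output: str, app: str = "SpeechBackground",
                   max_age_s: int = 3600) -> List[Tuple[str, int]]:
    """Return (channel name, age in seconds) for channels older than max_age_s."""
    found = []
    for line in output.splitlines():
        match = CHANNEL_LINE.match(line)
        if not match or match.group("app") != app:
            continue
        age = (int(match.group("h")) * 3600
               + int(match.group("m")) * 60
               + int(match.group("s")))
        if age > max_age_s:
            found.append((match.group("name"), age))
    return found


def hangup(name: str) -> None:
    """Ask Asterisk to hang up one channel via the CLI."""
    subprocess.run(
        ["asterisk", "-rx", f"channel request hangup {name}"], check=True
    )


def main() -> None:
    # Assumption: this is the command that produced the listing in the thread.
    listing = subprocess.run(
        ["asterisk", "-rx", "pjsip show channels"],
        capture_output=True, text=True, check=True,
    ).stdout
    for name, age in stuck_channels(listing):
        print(f"hanging up {name} (up for {age}s)")
        hangup(name)
```

Calling `main()` from a cron job or systemd timer would run the check periodically. The parsing is kept separate from the CLI calls so it can be tried against saved output first.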
