Asterisk out of resources

This is something that hasn’t happened before, and I’m wondering if this has something to do with Asterisk, as it’s happened twice or thrice in the past 24 hours now. Trying to enter the Asterisk CLI by any means possible fails:

root@na01:~# asterisk -Rvvvvvvvvvvvvc
Asterisk 18.3.0, Copyright (C) 1999 - 2021, Sangoma Technologies Corporation and others.
Created by Mark Spencer <markster@digium.com>
Asterisk comes with ABSOLUTELY NO WARRANTY; type 'core show warranty' for details.
This is free software, with components licensed under the GNU General Public
License version 2 and other licenses; you are welcome to redistribute it under
certain conditions. Type 'core show license' for details.
=========================================================================
root@na01

Nothing in the debug log at all.

However, I can execute commands using `asterisk -rx 'some command'`, it would seem, but if I try restarting or killing all channels, the commands don’t seem to get processed. Asterisk is in some kind of quasi-state where I can’t interact with it, but there are a ton of errors in the error log, so it’s obviously running:

root@na01:~# tail -40 /var/log/asterisk/messages
[2021-04-19 18:49:08] NOTICE[1135] chan_sip.c:    -- Registration for 'something123' timed out, trying again (Attempt #1788)
[2021-04-19 18:49:14] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:49:20] WARNING[1135][C-000004d5] pbx.c: Failed to create new channel thread
[2021-04-19 18:49:20] WARNING[1135][C-000004d5] chan_sip.c: Failed to start PBX :(
[2021-04-19 18:49:20] WARNING[1064] cdr_adaptive_odbc.c: CDR column 'ctx' was not set and does not match filter of !''.  Cancelling this CDR.
[2021-04-19 18:49:20] WARNING[1064] cdr_adaptive_odbc.c: CDR column 'imts' was not set and does not match filter of !''.  Cancelling this CDR.
[2021-04-19 18:49:24] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:49:25] WARNING[19091][C-000004d6] func_shell.c: Failed to execute shell command '/etc/asterisk/scripts/sipcontext.sh "2"'
[2021-04-19 18:49:25] WARNING[19091][C-000004d6] func_shell.c: Failed to execute shell command 'ls -1 /var/spool/asterisk/monitor/george-2-* | wc -l'
[2021-04-19 18:49:25] WARNING[19091][C-000004d6] ast_expr2.fl: ast_yyerror():  syntax error: syntax error, unexpected '>', expecting $end; Input:
>0
^
[2021-04-19 18:49:25] WARNING[19091][C-000004d6] ast_expr2.fl: If you have questions, please refer to https://wiki.asterisk.org/wiki/display/AST/Channel+Variables
[2021-04-19 18:49:28] NOTICE[1135] chan_sip.c:    -- Registration for 'something123' timed out, trying again (Attempt #1789)
[2021-04-19 18:49:29] WARNING[19091][C-000004d6] pbx.c: Failed to create new channel thread
[2021-04-19 18:49:29] WARNING[1064] cdr_adaptive_odbc.c: CDR column 'ctx' was not set and does not match filter of !''.  Cancelling this CDR.
[2021-04-19 18:49:29] WARNING[1064] cdr_adaptive_odbc.c: CDR column 'imts' was not set and does not match filter of !''.  Cancelling this CDR.
[2021-04-19 18:49:46] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:49:48] NOTICE[1135] chan_sip.c:    -- Registration for 'something123' timed out, trying again (Attempt #1790)
[2021-04-19 18:49:59] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:50:06] NOTICE[1135] chan_sip.c: Peer 'ATAxMarkD1' is now UNREACHABLE!  Last qualify: 105
[2021-04-19 18:50:08] NOTICE[1135] chan_sip.c:    -- Registration for 'something123' timed out, trying again (Attempt #1791)
[2021-04-19 18:50:13] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:50:27] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:50:28] NOTICE[1135] chan_sip.c:    -- Registration for 'something123' timed out, trying again (Attempt #1792)
[2021-04-19 18:50:47] NOTICE[1135] chan_sip.c: Peer 'ATAxMarkD1' is now Reachable. (103ms / 2000ms)
[2021-04-19 18:50:48] NOTICE[1135] chan_sip.c:    -- Registration for 'something123' timed out, trying again (Attempt #1793)
[2021-04-19 18:50:48] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:50:58] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:51:08] NOTICE[1135] chan_sip.c:    -- Registration for 'something123' timed out, trying again (Attempt #1794)
[2021-04-19 18:51:08] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:51:20] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:51:25] ERROR[985] asterisk.c: Unable to spawn thread to handle connection: Resource temporarily unavailable
[2021-04-19 18:51:28] NOTICE[1135] chan_sip.c:    -- Registration for 'something123' timed out, trying again (Attempt #1795)
[2021-04-19 18:51:34] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:51:48] NOTICE[1135] chan_sip.c:    -- Registration for 'something123' timed out, trying again (Attempt #1796)
[2021-04-19 18:51:48] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:51:51] NOTICE[1135] chan_sip.c: Peer 'ATAxMarkD1' is now UNREACHABLE!  Last qualify: 103
[2021-04-19 18:52:00] WARNING[1271][C-00000002] func_shell.c: Failed to execute shell command 'curl "https://example.com"'
[2021-04-19 18:52:08] NOTICE[1135] chan_sip.c:    -- Registration for 'something123' timed out, trying again (Attempt #1797)

Yet, when I run top, Asterisk is using only 10.3% memory and 4.6% CPU. Overall CPU usage is basically that, with memory usage at 713M out of 990M. It seems like Asterisk is out of resource handles to do anything, but I don’t see any evidence of that in the system.

Any thoughts on what this could be? The fact that this is happening repeatedly now is interesting.

Slight correction: htop shows something different and more concerning than top. It seems like there are about 20 different instances of Asterisk, with new ones continuously spawning, probably because they keep failing.

Tracing some of these errors through the source code, looks like AST_PBX_FAILED is involved.

Any good way to get to the bottom of this and figure out the root cause? Like intentionally cause Asterisk to crash and do a core dump or something?

Looks like a hacked server. Disconnect it immediately. Back up only the config files after careful inspection, and reinstall everything from scratch.

What makes you say so?

Registration for ‘something123’ timed out

func_shell.c: Failed to execute shell command ‘curl “https://example.com”’

Asterisk would not do this by itself

The htop result looks normal. pthreads uses Linux processes, which all share the same memory, for its threads, and in some modes of ps, the subordinate thread processes aren’t suppressed.

You seem to be using “resources” in the Microsoft Windows sense, where it refers to a particular kind of resource that is specific to Windows.

My best guess, in this case, is that you have run out of file descriptors, but any of the parameters controlled by ulimit could have been violated.
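One way to check this rather than guess is to compare what the process is actually using against its limits. The following is only a sketch, assuming a Linux system with /proc mounted; `proc_usage` is a made-up helper name, not something Asterisk ships. Note that a failure to create a thread with "Resource temporarily unavailable" is commonly tied to the process/thread limit rather than the open-file limit, so both are printed.

```shell
#!/usr/bin/env bash
# Report thread and file-descriptor usage for one process, next to its limits.
# Linux-only: relies on the /proc filesystem. proc_usage is a hypothetical helper.
proc_usage() {
    local pid="$1"
    [ -d "/proc/$pid" ] || { echo "no such process: $pid" >&2; return 1; }

    local threads fds
    # Each entry under task/ is one thread (what htop lists as separate rows).
    threads=$(ls "/proc/$pid/task" | wc -l | tr -d ' ')
    # Each entry under fd/ is one open descriptor (needs the same user or root).
    fds=$(ls "/proc/$pid/fd" | wc -l | tr -d ' ')

    echo "threads=$threads"
    echo "open_fds=$fds"
    # Soft and hard limits as the kernel applies them to this process.
    grep -E '^Max (processes|open files)' "/proc/$pid/limits"
}
```

It could be called as `proc_usage "$(pidof asterisk)"` while the problem is occurring. If the thread count is close to "Max processes", or the descriptor count is close to "Max open files", that would point at the limit being hit.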

These were legitimate dialplan calls; I just redacted the exact URL and SIP endpoint. Everything was failing because of the AST_PBX_FAILED condition, so every time anything tried to execute in the dialplan, or Asterisk tried to do something, it failed because, it seems, it couldn’t allocate anything.

Not sure what this is but unlikely to be a hack.

The only thing that I can remember changing is that I have a bash script that translates certain numbers to the SIP device name for chan_sip. About two days ago, it mysteriously stopped working on #included files. It was very strange, because doing what the script does interactively worked, but the script itself was failing. There were no other changes that could have prompted this; it was out of the blue.

I made a simple change to that to override the behavior to force it to work, and so it seems maybe that screwed something up. Not sure how that would lead to this behavior unless it was infinite recursion or an infinite loop though. Just thought of this as you mentioned “run out of file descriptors”…

Modified the behavior 12 hours ago and it hasn’t done this weird thing since, but still crossing my fingers. I’d like to be able to at least determine if it was Asterisk or something else.

Check the RAM on this server too. When RAM goes bad, software starts to misbehave.

Okay, I think I have found the problem.

It appears that if you use System() to call a bash script that has an infinite loop and never returns / finishes execution, that will cause all hell to break loose.

Obviously, this is a situation best avoided, but very bizarre things begin happening and there doesn’t seem to be any graceful failure. As soon as this happens once on a single call in a single thread, that will screw up the entire Asterisk process. Killing Asterisk doesn’t work and a reboot becomes required.
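One defensive option is to put a hard time limit on anything the dialplan shells out to, so that a script that never returns gets killed instead of holding its thread forever. This is a minimal sketch, assuming GNU coreutils `timeout` is installed; `run_bounded` is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
# Run a command, but give up after a fixed number of seconds.
# `timeout` (GNU coreutils) exits with status 124 when the limit is hit.
# run_bounded is a hypothetical helper name.
run_bounded() {
    local limit="$1"; shift
    # -k 2: if the command ignores SIGTERM, follow up with SIGKILL 2 seconds later.
    timeout -k 2 "$limit" "$@"
}
```

For example, `run_bounded 5 /etc/asterisk/scripts/sipcontext.sh 2` would let the script run for at most five seconds. The same idea should work by prefixing the command passed to System() with `timeout` directly. It does not fix the loop in the script, it only caps the damage.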

It might help if asterisk behaved properly as a daemon and used setsid() to create a new session. However, only consoles can actually trigger session kills, and asterisk does use process groups in music on hold handling, so a process group kill on asterisk wouldn’t kill all its descendants unless it took steps to recurse the kill into subsidiary process groups.
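To illustrate what a group-wide kill looks like from the shell, here is a rough sketch, assuming `setsid` from util-linux is available. Both function names are made up for the example, and this is not how Asterisk itself launches helpers. A helper started this way becomes the leader of a new session and process group, so it and any children that stay in that group can be signalled in one call.

```shell
#!/usr/bin/env bash
# Start a command as the leader of a new session (and process group),
# recording its PID, which is also its process group ID, in a file.
# start_session and kill_group are hypothetical helper names.
start_session() {
    local pidfile="$1"; shift
    # The inner sh writes its own PID, then exec replaces it with the command,
    # so the PID in the file stays valid for the command itself.
    setsid sh -c 'echo "$$" > "$0"; exec "$@"' "$pidfile" "$@" &
}

# Signal every process in the given process group.
kill_group() {
    local pgid="$1" sig="${2:-TERM}"
    # A negative PID tells kill to target the whole process group.
    kill -s "$sig" -- "-$pgid"
}
```

A descendant that later moves itself into yet another process group would not be reached by `kill_group`, which is the limitation described above.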

Generally, though, running an OS out of process table space will make it unusable.