Recently an Asterisk server got stuck in a “deadlock” and I had to restart the service.
The logs included the following lines:
taskprocessor.c: The ‘stasis/m:cdr:aggregator-00000005’ task processor queue reached 5000 scheduled tasks.
Autodestruct on dialog ‘J7cTqLdSTGJ5F0-qSL6aoCo’ with owner SIP/CHANNELX-000ca8df in place (Method: BYE). Rescheduling destruction for 10000 ms
My guess is that the system collapsed after reaching Asterisk’s default limit on simultaneous scheduled tasks, and that rescheduling those tasks created a “snowball” effect.
If I am on the right track:
Is there a way to check Asterisk’s default limits, and to increase them at boot?
If not, what do you think could have caused the deadlock?
Any ideas or links to external references will be greatly appreciated.
The message didn’t say ERROR, so I think it is just a warning that something is getting overloaded.
I assume these limits are compiled in, so you would need to modify the source code and recompile.
They are not ulimit values, as they do not refer to kernel resources.
Having a large number of scheduled tasks suggests a system that is exceeding the available processing power, or that has a deadlocking problem. It should be treated as a sign of possible trouble, not the cause of it. Depending on the real cause, increasing the limit might have no effect other than to change the value in the warning message. If there is a deadlock, it is the cause of the long queue, not the result of it.
Thanks a lot for your thoughts.
I’ve seen that some Asterisk users modify the limits with ulimit, but I don’t know much about it and, worse, I don’t know how to check the system default limits.
For example, the logs said “Task processor queue reached 5000 scheduled tasks”.
I have never seen that value (5000 scheduled tasks) in the Asterisk files under /etc, and I don’t know what caused the tasks to queue up, especially since the system had resources available.
You typically have to do this for the number of open files, but that has nothing to do with the warning you are seeing.
root@dhcppc4:~# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 30732
max locked memory (kbytes, -l) 65536
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 30732
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
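If you would rather read the same kernel limits from a script than from the shell, something like the following should work. It is only a sketch: it uses Python’s standard `resource` module (Unix only), the helper name `describe_limit` is my own, and it reports the limits of the Python process itself, not of the running Asterisk process. None of this touches the task processor warning.

```python
import resource


def describe_limit(value):
    """Render a limit the way `ulimit -a` does (hypothetical helper)."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


# RLIMIT_NOFILE corresponds to the "open files (-n)" line of `ulimit -a`.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open files: soft={describe_limit(soft)} hard={describe_limit(hard)}")
```

On Linux, the limits of an already running process can be read from `/proc/<pid>/limits`.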
For several task queues the default warning level is multiplied by 10. However, it is a soft limit. As I’ve already said, if there is something really wrong, changing it only changes how soon you get the warning.
This part of Asterisk was added after I learned the internals, but nowadays a lot of processing is serialised by sending it to queues which are processed by a small number of threads. This is likely to result in fewer race conditions.
In general terms, queues will get excessively long if:
tasks are being scheduled faster than the machine can process them;
the processing for tasks involves long running steps, like database accesses;
a deadlock occurs whilst processing a task.
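To make the cases above concrete, here is a toy simulation of a single queue. It is not Asterisk code: the function `simulate_queue` and all the rates are made up for illustration, and only the 5000 figure comes from the log line. It shows that a slow or stalled consumer makes the backlog grow without bound, and that raising the warning level only delays the warning.

```python
def simulate_queue(arrivals_per_tick, served_per_tick, ticks, high_water=5000):
    """Return (final queue depth, tick at which high_water was first reached or None)."""
    depth = 0
    warned_at = None
    for tick in range(1, ticks + 1):
        depth += arrivals_per_tick            # new tasks are scheduled
        depth -= min(depth, served_per_tick)  # the worker drains what it can
        if warned_at is None and depth >= high_water:
            warned_at = tick
    return depth, warned_at


keeps_up = simulate_queue(100, 120, 1000)      # worker is faster than the producer
too_slow = simulate_queue(100, 90, 1000)       # backlog grows by 10 tasks per tick
stalled = simulate_queue(100, 0, 1000)         # worker is stuck, e.g. deadlocked
raised = simulate_queue(100, 90, 1000, 8000)   # same overload, higher warning level

for name, result in [("keeps up", keeps_up), ("too slow", too_slow),
                     ("stalled", stalled), ("raised limit", raised)]:
    print(name, result)
```

In the "raised limit" run the final backlog is the same as in the "too slow" run; only the tick at which the warning would fire moves.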
A deadlock is when a thread is waiting for something to happen, but it cannot happen because another thread is, in turn, waiting for the first thread to do something. Neither can proceed, so both wait forever.
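A minimal sketch of that situation, using two Python threads and two locks taken in opposite order. This is a generic illustration, not how Asterisk is written; the names `demonstrate_deadlock`, `lock_a` and `lock_b` are hypothetical. A timeout is used on the second lock so the example terminates instead of hanging. With a plain `acquire()` both threads would block forever.

```python
import threading


def demonstrate_deadlock(timeout=0.5):
    """Return {thread name: whether it managed to take its second lock}."""
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    both_hold_first = threading.Barrier(2)  # both threads hold their first lock
    both_gave_up = threading.Barrier(2)     # both threads finished their attempt
    results = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()
            # The other thread holds `second` and is waiting for `first`,
            # so this attempt cannot succeed.
            got = second.acquire(timeout=timeout)
            results[name] = got
            if got:
                second.release()
            # Keep holding `first` until the other thread has also given up.
            both_gave_up.wait()

    t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
    t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results


outcome = demonstrate_deadlock()
print(outcome)
```

The usual way to avoid this pattern is to make every thread take the locks in the same order.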