I am facing this error on my production setup for a 95-endpoint “callcenter”:
“taskprocessor.c:888 taskprocessor_push: The ‘subm:rtp_topic-000000aa’ task processor queue reached 500 scheduled tasks”
After this error, no registrations are possible, effectively dropping all endpoints. New incoming calls from the SBC (IP auth) still reach Asterisk but then fail because the endpoint is not registered.
Environment:
Debian Stretch
Asterisk version 13.18.2
mostly extensions.ael
PJSIP only
HDD is not full
8G RAM, 2G Swap
4 vCores on “Intel Xeon E312xx (Sandy Bridge, IBRS update)”
Is this an overload problem or a known bug?
This setup had been running fine since January 2018, but I noticed the same problem yesterday and today at approximately the same time. The only change during this time was a BIOS update on the hypervisor to mitigate the Intel flaws. Since then, the VMs have been running on a secured version of QEMU-KVM.
I’d first suggest upgrading to the latest version, as we do fix and tweak things. Secondly, you have to determine what is slowing down processing on the system, and why.
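One way to start narrowing this down is to see which task processors hit their limit and how often. The following is only a minimal sketch, assuming the log lines look like the message quoted above; the function names `parse_warning` and `count_by_processor` are made up for illustration and are not part of Asterisk.

```python
import re
from collections import Counter

# Matches the warning quoted in the original post. Both straight and
# curly quotes are accepted, since forum software often converts them.
WARNING_RE = re.compile(
    r"taskprocessor_push: The ['‘’]([^'‘’]+)['‘’] "
    r"task processor queue reached (\d+) scheduled tasks"
)


def parse_warning(line):
    """Return (task_processor_name, queue_size) or None if no match."""
    match = WARNING_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def count_by_processor(lines):
    """Count how many warnings each task processor produced."""
    counts = Counter()
    for line in lines:
        parsed = parse_warning(line)
        if parsed is not None:
            counts[parsed[0]] += 1
    return counts
```

Feeding the lines of the Asterisk log file through `count_by_processor` shows whether only the RTP topic subscription is backing up or whether other task processors are affected as well.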
OK, I am already preparing 13.21-cert2, but it is not production-ready with my adjustments yet.
I will monitor the VM’s behaviour and add more RAM and CPUs to it, as the hardware is dedicated to this VM (it’s only a VM so that it can be migrated easily between hosts).
This is just a workaround but might lower the occurrence in the meantime.
After some further debugging with journalctl, I noticed that both MySQL servers were backed up (LVM snapshot) 10 minutes before the outage. One holds the endpoints and main tables (replicated), and the other holds the CDRs on local-only storage. This might have introduced lag during prime time.
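To check whether the backups and the warnings really line up, the timestamps can be compared directly. This is a rough sketch only: the function name `warnings_after_backup` and the 30-minute window are assumptions chosen for illustration, and the timestamps would have to be taken from the journal and the Asterisk log.

```python
from datetime import datetime, timedelta


def warnings_after_backup(backup_starts, warning_times,
                          window=timedelta(minutes=30)):
    """Pair each warning with a backup that started at most `window` earlier.

    Returns a list of (backup_start, warning_time) tuples. Warnings with
    no backup inside the window are left out.
    """
    hits = []
    for warning in warning_times:
        for backup in backup_starts:
            if timedelta(0) <= warning - backup <= window:
                hits.append((backup, warning))
                break  # one matching backup per warning is enough
    return hits
```

If most warnings end up paired with a backup on both days, that supports the snapshot theory; if not, the cause is more likely elsewhere.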
I assume that slow ODBC/realtime queries can also produce blocked tasks. Is this assumption correct?