This warning is flooding the logs and the console. When there is a high number of calls (200 to 300 simultaneous), Asterisk sometimes crashes with this message:
segfault at 88 ip 00007fed1c5da73e sp 00007fec41f329e0 error 4 in libasteriskpj.so.2[7fed1c4ff000+1a4000].
On days without a high load (few calls and connected users), the warnings are much less frequent and there is no crash.
When I launch this command
watch -n 0.1 'sudo asterisk -rx "core show taskprocessors like stasis/pool-control"'
I see the value in the ‘Processed’ column increasing by hundreds and thousands.
I’m using AMI heavily, but CDR and CEL are disabled.
There is plenty of RAM and CPU in the machine.
So my questions are:
What is causing these warnings?
What does the ‘stasis/pool-control’ task processor do, and what is it processing?
Is there a way to decrease the number of messages processed by that taskprocessor?
Is the crash related to this?
Any help would be appreciated, I’m pulling my hair out over this.
Thanks
Asterisk v 17.4 (I had same problem with 17.2 & 17.3)
Ubuntu 16.04
AWS EC2 machine, 8G RAM, 4 cores
Approximately 250 users connected to asterisk
Using webrtc
Using chan_sip, not pjsip
It’s caused by the amount of traffic and messages on your system. Different system usage results in different messages and message count which results in different load. The queue is used for management of the threadpool handling things. The stasis.conf configuration file allows you to disable message types, but I don’t know if it would help with this. It’s unlikely the crash itself is caused by this, but both may be caused by the same underlying thing.
You can try building with developer mode (./configure --enable-dev-mode) and then using the “stasis statistics show topics” and “stasis statistics show subscriptions” CLI commands to show what is seeing a lot of Stasis messages. That may provide insight as to what is going on, but in the end it’s likely your specific usage patterns.
What counts as a normal number is relative: it depends on how your system is used and what is configured. You also have multiple topics for the same endpoint because they’re in multiple queues.
The only thing that comes to mind is RTP, but it still seems to be fine. Without digging in deeply, understanding your specific usage patterns, and reproducing them, I can’t really offer anything else.
That’s the thing about performance and such - there is no silver bullet because the usage by everyone differs. Someone with a similar system may be perfectly fine, but then you use one specific thing and that alters the characteristics and causes problems. It’s not an easy thing to investigate.
OK, I understand. I’ll dig a little deeper, or just create another machine and move some of the users there.
Does the number of users created in “manager.conf” have any impact? For example, if there are 3 users listening to call events and there are 100 call events, would Asterisk send 100 messages in total, or 100*3?
A final question: in the output of the other command, “stasis statistics show subscriptions”, there is a column “Dropped”. Are those lost messages? In the case of a peer/trunk, does that translate to disconnected calls? I saw a lot of those.
For outgoing AMI messages it is the number of events multiplied by the number of sessions. If you have 100 events and 3 sessions, then 100 messages are generated internally and passed to manager, which then sends each of them out over those 3 sessions, resulting in 300 AMI messages being sent.
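As a rough illustration of that arithmetic, here is a minimal Python sketch. The function name `ami_fanout` is made up for this example and is not part of Asterisk; it also assumes every session receives every event, so any per-session event filtering would lower the outgoing count.

```python
from typing import Tuple


def ami_fanout(events: int, sessions: int) -> Tuple[int, int]:
    """Return (internal_messages, outgoing_ami_messages).

    Hypothetical helper: assumes each event is generated once internally
    and then written once to every connected AMI session.
    """
    internal = events              # one internal message per event
    outgoing = events * sessions   # one copy per session for each event
    return internal, outgoing


internal, outgoing = ami_fanout(events=100, sessions=3)
print(internal, outgoing)  # prints: 100 300
```

So adding listening sessions does not multiply the internal message count, only the number of messages manager has to write out.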
Dropped means that they were filtered before getting to the subscriber, which reduces the amount of work Asterisk does for processing. Absolutely no relation to disconnected calls.
Could you tell me what the Read and Write columns mean in the output of the command “manager show connected”? They don’t make sense to me. Normally Read should be higher, but Write is much too high and doesn’t change. The “amiwriter” user sends manager actions; the others just listen to events.