How/where to set low/high water values that impact the pjsip/distributor?

Hello gurus,

I have been searching for days… documents, conf files, forums, … and could find nothing.
Please, how / where do you set the low/high water values that impact the pjsip/distributor?

We run a busy system, but above about 130 calls we often start to queue on pjsip/distributor tasks [only], which results in some call quality issues.
We really need to set the high water value to higher than 500.

You don’t. Those values only trigger alerts, and the only place that actually listens for the alerts and acts on them is PJSIP, where the behavior is configurable[1]. They don’t control the size of queues or directly have any impact on call quality.
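To illustrate the point that water marks raise and clear an alert without limiting the queue, here is a minimal Python sketch. It is a toy model, not Asterisk's implementation: the class name `WatermarkQueue` is hypothetical, the high water value of 500 is the one mentioned in the question, and the low water value of 450 is an assumption.

```python
from collections import deque


class WatermarkQueue:
    """Toy queue with high/low water alerting (not Asterisk's code)."""

    def __init__(self, low_water, high_water):
        self.low_water = low_water
        self.high_water = high_water
        self.tasks = deque()
        self.alert = False

    def push(self, task):
        # The queue is unbounded: reaching high water only raises a flag.
        self.tasks.append(task)
        if len(self.tasks) >= self.high_water:
            self.alert = True

    def pop(self):
        task = self.tasks.popleft()
        # The alert clears only once the backlog drains to low water.
        if self.alert and len(self.tasks) <= self.low_water:
            self.alert = False
        return task


q = WatermarkQueue(low_water=450, high_water=500)  # 450 is an assumed value
for i in range(600):
    q.push(i)
print(len(q.tasks), q.alert)  # the queue kept growing past high water

while len(q.tasks) > q.low_water:
    q.pop()
print(len(q.tasks), q.alert)  # alert cleared once low water was reached
```

In this model, raising the high water value would only delay the alert; the backlog itself would be unchanged.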

The number of threads IS configurable in pjsip.conf as well[2]. Whether that does anything for your situation, I don’t know. If the distributor threads are getting backed up it’s usually because of realtime, or DNS.

[1] asterisk/configs/samples/pjsip.conf.sample at master · asterisk/asterisk · GitHub
[2] asterisk/configs/samples/pjsip.conf.sample at master · asterisk/asterisk · GitHub

Ah, I did forget that the pool of taskprocessors for PJSIP is fixed[1]. Changing that is a bit of a sledgehammer though without understanding why the distributors are so busy and piling up.

[1] asterisk/res/res_pjsip/pjsip_distributor.c at master · asterisk/asterisk · GitHub

Thanks for the quick reply.

We do not use realtime.
We will look at DNS*

My understanding was that if pjsip queues, it will stop processing, resulting in issues.

*The call quality issues (audio drops, warbles, clicks) occur only when we see queuing. Once the queues are cleared, audio is back to normal. Queuing seems to relate to a CPU spike (to a load of 15-25, running on a 12 core bare metal server). The spike also seems to relate to receiving approx. 8-10 calls within a few seconds. If we get another 8-10 calls within a few seconds, before the load from the prior spike can come down, it causes a jump to a load of 25-35.

If we do not see the 8-10 calls within a few seconds, our load remains around 1-2.

We have
taskprocessor_overload_trigger=global
threadpool_max_size=300

If a taskprocessor overload occurs and taskprocessor_overload_trigger is set, then new SIP requests will be ignored until the queue returns to normal.
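For reference, a sketch of where the two settings quoted above normally live in pjsip.conf, going by the sample file. The section names follow the sample and the `type=` line is what matters; the comments summarize the three trigger values as described there.

```ini
; pjsip.conf (sketch; only the two values quoted in this thread are shown)
[system]
type=system
threadpool_max_size=300      ; upper bound on PJSIP threadpool threads

[global]
type=global
; global     = any taskprocessor overload stops new SIP requests being accepted
; pjsip_only = only PJSIP taskprocessor overloads do
; none       = no overload detection is performed
taskprocessor_overload_trigger=global
```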

Thanks. That is what I understood would occur.

pjsip/distributor is the only place I have managed to see queuing occur.
The above taskprocessor report was based on a week running, so you can see we do not go crazy above the limits, but enough to experience call issues when we get 8+ calls within a few seconds.
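A report like this can be checked mechanically. The following Python sketch parses text captured from `core show taskprocessors` and lists the distributor taskprocessors whose maximum depth reached high water. The column layout (Processor, Processed, In Queue, Max Depth, Low water, High water) is an assumption that may differ between Asterisk versions, the function names are hypothetical, and the sample numbers are illustrative only, not taken from this system.

```python
def parse_taskprocessors(report):
    """Parse rows that end in five integer columns; skip everything else."""
    rows = []
    for line in report.splitlines():
        # Split from the right so the processor name may contain spaces.
        parts = line.rsplit(None, 5)
        if len(parts) != 6 or not all(p.isdigit() for p in parts[1:]):
            continue  # header, blank line, or summary line
        processed, in_queue, max_depth, low, high = map(int, parts[1:])
        rows.append({
            "name": parts[0].strip(),
            "processed": processed,
            "in_queue": in_queue,
            "max_depth": max_depth,
            "low_water": low,
            "high_water": high,
        })
    return rows


def over_high_water(rows, prefix="pjsip/distributor"):
    """Return rows for `prefix` whose max depth reached high water."""
    return [r for r in rows
            if r["name"].startswith(prefix)
            and r["max_depth"] >= r["high_water"]]


# Illustrative sample only; the numbers are made up for the example.
SAMPLE = """\
Processor                      Processed   In Queue  Max Depth  Low water High water
pjsip/distributor-00000001         12000          0        512        450        500
pjsip/distributor-00000002         11800          3        120        450        500
app_voicemail                          5          0          1        450        500
"""

for row in over_high_water(parse_taskprocessors(SAMPLE)):
    print(row["name"], row["max_depth"], row["high_water"])
```

Running this periodically and logging the result with a timestamp would make it easier to line up queue growth with call arrivals.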

If the distributor pool cannot be increased and the high water value cannot be increased…
is it not a risk to set taskprocessor_overload_trigger=none?

We are running a 6 core / 12 thread bare metal server with 32G memory.
Memory usage has never exceeded 15G and CPU 15 min avg varies from 2 to 15 (with 1 min avg from 2 to 35).

There is inherent risk in setting it to none, as the queues then have no limit. The only other option is to determine why the queues are growing so large in the first place.

Agreed.
Is there a way to determine why the queues max out when there are about 8-10 calls within a few seconds (noting a total of approx. 140 calls)?

There is no magical CLI option or anything else that will outright tell you, no. It requires someone doing the work to orchestrate things and understand what is going on.

It seems that the 8 or so calls within a few seconds trigger it. If there are only a few calls at a time, everything runs smoothly.

We will try to find what Asterisk is doing differently when it gets 8+ calls.
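One way to start is to pin down exactly when the bursts happen, so they can be lined up against logs and load graphs. A minimal Python sketch, assuming call arrival times are available as seconds (for example from CDR start times, which is an assumption about the data source); the function name `find_bursts` and the 5 second window are hypothetical choices for "a few seconds".

```python
def find_bursts(timestamps, min_calls=8, window=5.0):
    """Return (start, end) spans holding >= min_calls arrivals per window."""
    ts = sorted(timestamps)
    bursts = []
    left = 0
    for right, t in enumerate(ts):
        # Shrink the window until it spans at most `window` seconds.
        while t - ts[left] > window:
            left += 1
        if right - left + 1 >= min_calls:
            start = ts[left]
            if bursts and start <= bursts[-1][1]:
                # Overlaps the previous burst, so extend it.
                bursts[-1] = (bursts[-1][0], t)
            else:
                bursts.append((start, t))
    return bursts


# Hypothetical arrival times in seconds: eight calls between 30.0 and 33.5.
arrivals = [0, 10, 20, 30, 30.5, 31, 31.5, 32, 32.5, 33, 33.5, 60]
print(find_bursts(arrivals))
```

Each span returned is a period worth comparing against the taskprocessor queue depth and the CPU load at the same moment.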

Thanks for your help and time jcolp.
Cheers!
