Header: Disconnection Between Asterisk Servers During High Call Volume

Hello Asterisk Community,

I’m facing an issue with my setup involving two Asterisk servers:

  1. Asterisk 1: Serves as the primary PBX and connects to Asterisk 2 for outbound calls.
  2. Asterisk 2: Connected directly to a PRI line and acts as an endpoint for Asterisk 1.

When the call volume increases, Asterisk 1 gets disconnected from Asterisk 2. During this disconnection, the following error logs are generated on Asterisk 1:

[Nov 30 06:37:07] ERROR[4424]: res_pjsip.c:4053 ast_sip_create_dialog_uac: Endpoint ‘trunk’: Could not create dialog to invalid URI ‘mytrunk’. Is endpoint registered and reachable?
[Nov 30 06:37:07] ERROR[4424]: chan_pjsip.c:2656 request: Failed to create outgoing session to endpoint ‘trunk’
[Nov 30 06:37:07] WARNING[4480][C-00000001]: app_dial.c:2598 dial_exec_full: Unable to create channel of type ‘PJSIP’ (cause 3 - No route to destination)

What could cause the disconnection between the two servers under high call volume?

What is the answer to the question in that log message: is the endpoint registered and reachable? If it is not reachable, Asterisk won’t send calls to it. It becomes unreachable when the remote side fails to respond to SIP OPTIONS requests, which a packet capture would confirm.
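
For reference, the OPTIONS-based reachability check described here is driven by the qualify settings on the AOR in pjsip.conf. A minimal sketch, assuming a statically configured contact; the section name and the 192.0.2.10 address are placeholders, not values taken from this thread:

```
; pjsip.conf on Asterisk 1 (illustrative values only)
[asterisk2]
type=aor
contact=sip:192.0.2.10:5060   ; placeholder address for Asterisk 2
qualify_frequency=60          ; send a SIP OPTIONS request every 60 seconds
qualify_timeout=3.0           ; treat the contact as unreachable if no reply arrives within 3 seconds
```

With settings along these lines, a missed OPTIONS reply is enough to mark the contact unreachable, and running "pjsip show contacts" at the Asterisk CLI shows the current status.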

Thank you for your response!

I understand that Asterisk marks an endpoint as unreachable if it fails to respond to SIP OPTIONS requests. However, what I’m trying to understand is why this disconnection occurs only during high call volume.

  • Under normal call volume, everything works fine, and the endpoint remains reachable.
  • The issue happens only when call volume increases, at which point Asterisk 1 disconnects from Asterisk 2.

As I stated, you need to look at the SIP traffic to see what is happening at that level, because right now you don’t actually know which side is the cause. Is the remote side not responding? Is the SIP response getting lost? These are the questions to answer to isolate the problem.

Thank you for your guidance.

I performed a packet capture and analyzed the SIP traffic between the two servers. It turns out that Asterisk 2 is not responding to the SIP OPTIONS packets sent by Asterisk 1 during high call volume.

I’d also appreciate advice on addressing the lack of replies to OPTIONS packets from Asterisk 2.
Any additional suggestions or best practices for handling high call volumes effectively would be greatly appreciated.

Examine Asterisk 2. Look at a packet capture there to see if it is receiving the SIP OPTIONS requests. Look at the Asterisk log to see if there is anything of note.

Thank you for your suggestions so far.

I’ve confirmed through packet capture that Asterisk 2 is receiving the SIP OPTIONS packets from Asterisk 1 but is not responding to them during high call volume.

Additionally, I found the following log on Asterisk 2 which seems relevant to the issue:
taskprocessor.c: The ‘stasis/m:cel:aggregator-00000006’ task processor queue reached 3000 scheduled tasks again.

That would mean that CEL is overloaded. If CEL is writing to a database, for example, the database may not be able to keep up. By default, PJSIP stops accepting certain traffic when a task processor is overloaded, to reduce the load. This is configurable[1].

[1] asterisk/configs/samples/pjsip.conf.sample at master · asterisk/asterisk · GitHub
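
If the CEL records are not actually needed, one way to take load off the aggregator might be to disable CEL or trim the events it tracks. The fragment below is only a sketch of what that could look like in cel.conf; whether it is acceptable depends on what consumes the CEL data in this setup, which the thread does not say:

```
; cel.conf on Asterisk 2 (illustrative only)
[general]
enable=no    ; turn CEL off entirely if nothing consumes the records

; Alternative (assumption: only basic call events are needed):
; enable=yes
; events=CHAN_START,CHAN_END,ANSWER,HANGUP
```

Queue depths can be watched with "core show taskprocessors" at the Asterisk CLI to see whether the change has any effect.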

I’ve adjusted the configuration by setting taskprocessor_overload_trigger = none to address the task processor overload issue. This seems to have alleviated the immediate problem of OPTIONS responses not being sent.
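
For anyone following along, the change described here would look roughly like this in pjsip.conf. This is a sketch based on the option descriptions in the sample file referenced as [1]:

```
; pjsip.conf on Asterisk 2
[global]
type=global
; Possible values, per the sample file:
;   global      (default) any task processor overload triggers the check
;   pjsip_only  only PJSIP task processor overloads trigger it
;   none        no overload detection is performed
taskprocessor_overload_trigger=none
```

Since the overloaded queue in the log belongs to CEL rather than PJSIP, pjsip_only may be worth testing as a middle ground, though that is an assumption and not something confirmed in this thread.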

What are the potential consequences of setting taskprocessor_overload_trigger = none?

Traffic will continue to be processed and handled, meaning that the queue could keep growing if the system cannot handle things fast enough.

Thank you for the clarification and your guidance so far.

I’d also like to ask about best practices for configuring Asterisk to handle high call volumes effectively.
Our current setup is as follows:

Asterisk 2: Connected to the PRI line and acts as an endpoint.
Asterisk 1: Sends calls to Asterisk 2 for processing.

Does this approach align with best practices for handling high call volumes, or would you suggest an alternative architecture?

I don’t think anyone can really answer or say anything to that, because it’s an extremely generic ask and description. Asterisk is a toolkit and can be used in tons of different ways, which inherently result in different performance requirements and tweaking.

OK, thank you for all the guidance.