Asterisk stops responding to INVITEs

I am using Asterisk 20.2.0 on Debian 12. Every so often Asterisk stops responding to INVITEs and no longer seems to work at all. If I run `lsof | egrep 'asterisk.*STREAM \(CONNECTED' | wc -l` I get back 44872, and `lsof | egrep ' asterisk' | wc -l` gives me back 79466. When I look at `cat /proc/$(pidof asterisk)/limits` I get back:

Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            unlimited            unlimited            bytes     
Max core file size        unlimited            unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             65535                65535                processes 
Max open files            1048576              1048576              files     
Max locked memory         8388608              8388608              bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       31566                31566                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us 

It seems as if “something” is not releasing the fd’s for the RTP streams and we are then hitting one of the limits. What would be the reason for this and how would I go about troubleshooting such an issue?
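One way to test the limit theory is to count descriptors straight from /proc instead of from lsof, whose output also lists entries that are not descriptors (memory-mapped files, the working directory) and so can overstate the number. The sketch below is only an illustration: `fd_usage` is a made-up helper name, and it assumes a Linux /proc filesystem and enough privileges to read the Asterisk process's fd directory.

```shell
#!/bin/sh
# fd_usage is a hypothetical helper, not part of Asterisk.
# Prints "<open descriptors> <soft limit>" for the given PID.
fd_usage() {
    pid="$1"
    # /proc/<pid>/fd holds exactly one entry per open descriptor.
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
    # The soft limit is the 4th field of the "Max open files" row.
    limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "$open $limit"
}
```

Running `fd_usage "$(pidof asterisk)"` as root prints the two numbers side by side; if the first is nowhere near the second, the descriptor limit is probably not what stops the INVITEs.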

TIA.

Dovid

The reason is generally that the underlying channel is still around. Not releasing RTP file descriptors would also not cause it to stop responding to INVITEs, they’d be responded to but just rejected. It sounds more like a deadlock, which would need a running backtrace.

@jcolp I assume I should follow Getting a Proper Asterisk Backtrace - Support Services - Documentation the next time it happens?

The Asterisk documentation for it is here[1].

[1] Getting a Backtrace - Asterisk Documentation

That seems to be based on a crash. In my case Asterisk is still running. Should I kill the PID so that it generates a dump?

The specific link I posted is for a deadlock. The underlying ast_coredumper script that collects the information works on crashes or deadlocks depending on the given arguments. The “--running” argument causes it to locate a running Asterisk instance and get a backtrace.
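As a rough sketch of what that invocation could look like: the path below is an assumption (a common default install location), and `collect_running_backtrace` is a hypothetical wrapper name. It needs root and gdb, and dumping a live process can interrupt call processing while the dump is taken.

```shell
#!/bin/sh
# Assumed install location; adjust if Asterisk was installed elsewhere.
COREDUMPER="${COREDUMPER:-/var/lib/asterisk/scripts/ast_coredumper}"

# Hypothetical wrapper: refuse to run if the script is missing.
collect_running_backtrace() {
    if [ ! -x "$COREDUMPER" ]; then
        echo "ast_coredumper not found at $COREDUMPER" >&2
        return 1
    fi
    # --running: locate the live Asterisk instance and get a backtrace from it.
    "$COREDUMPER" --running
}
```

Run it while Asterisk is in the stuck state, before any restart, since the thread and lock information is lost once the process exits.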

@jcolp I just had it happen again. I have never put in an issue since the move to GitHub. Does it go here Issues · asterisk/asterisk · GitHub ? How do I attach the dump data while keeping it secure?

That is the place to file issues, yes. You cannot attach things securely. It is advised to scrub the backtrace of information you consider sensitive before attaching. If you REALLY can’t do so, then you COULD send it to asteriskteam@sangoma.com; however, this limits any investigation or resolution to Sangoma. There is no timeframe for a resolution, or any guarantee that it would be resolved at all.
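A minimal sketch of one scrubbing pass, assuming IPv4 addresses are among the things considered sensitive. `scrub_backtrace` is a hypothetical helper; anything else (phone numbers, credentials, hostnames) would need its own patterns, and the result should still be read through by hand.

```shell
#!/bin/sh
# Hypothetical filter: reads a backtrace on stdin, writes a redacted copy to stdout.
scrub_backtrace() {
    # Replace anything shaped like a dotted-quad IPv4 address with a placeholder.
    # Note: a four-part version string would be redacted too.
    sed -E 's/([0-9]{1,3}\.){3}[0-9]{1,3}/x.x.x.x/g'
}
```

Usage would be along the lines of `scrub_backtrace < backtrace.txt > backtrace-scrubbed.txt`, where both file names are placeholders.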

I am responding here after working this on GH ([bug]: Asterisk stops responding to SIP INVITES · Issue #373 · asterisk/asterisk · GitHub). I currently have two machines running Debian 12. To keep track of things: the logs on GH were from the box a14. I have another box, a15, that also started having this problem this morning. In the logs, the last thing that I saw before things went sideways was:

[2023-10-26 00:13:28] ERROR[3647858][C-000009ae] res_config_mysql.c: MySQL RealTime: Ping failed (2006).  Trying an explicit reconnect.
[2023-10-26 00:13:28] VERBOSE[3647858][C-000009ae] res_musiconhold.c: Started music on hold, class '60720elmv0', on channel 'PJSIP/endpoint-external-000009ad'

After that Asterisk stopped responding to SIP INVITEs. A debug of pjsip showed the same Taskprocessor overload alert error. When looking at a14 I do not see any MySQL errors. I have restarted a15 as I needed to process calls (which is now OK). Running `core show taskprocessors` I get back:

Processor                                                               Processed   In Queue  Max Depth  Low water High water
app_voicemail                                                                   0          0          0        450        500
ast_msg_queue                                                                   0          0          0        450        500
CCSS_core                                                                       0          0          0        450        500
dns_system_resolver_tp                                                          0          0          0        450        500
hep_queue_tp                                                                    0          0          0        450        500
pjsip/default-0000000a                                                         31          0          1        450        500
pjsip/default-0000000b                                                          0          0          0        450        500
pjsip/default-0000000c                                                          0          0          0        450        500
pjsip/default-0000000d                                                          0          0          0        450        500
pjsip/default-0000000e                                                          0          0          0        450        500
pjsip/default-0000000f                                                          0          0          0        450        500
pjsip/default-00000010                                                          0          0          0        450        500
pjsip/default-00000011                                                          0          0          0        450        500
pjsip/distributor-00000020                                                  14830          0          3        450        500
pjsip/distributor-00000021                                                  14572          0          2        450        500
pjsip/distributor-00000022                                                  13335          0          2        450        500
pjsip/distributor-00000023                                                  11457          0          3        450        500
pjsip/distributor-00000024                                                  14585          0          3        450        500
pjsip/distributor-00000025                                                  13215          0          2        450        500
pjsip/distributor-00000026                                                  13627          0          3        450        500
pjsip/distributor-00000027                                                  13328          0          2        450        500
pjsip/distributor-00000028                                                  13887          0          4        450        500
pjsip/distributor-00000029                                                  14279          0          3        450        500
pjsip/distributor-0000002a                                                  14104          0          3        450        500
pjsip/distributor-0000002b                                                  14371          0          3        450        500
pjsip/distributor-0000002c                                                  14471          0          3        450        500
pjsip/distributor-0000002d                                                  13529          0          2        450        500
pjsip/distributor-0000002e                                                  12699          0          3        450        500
pjsip/distributor-0000002f                                                  13941          0          3        450        500
pjsip/distributor-00000030                                                  12905          0          3        450        500
pjsip/distributor-00000031                                                  11731          0          2        450        500
pjsip/distributor-00000032                                                  12977          0          3        450        500
pjsip/distributor-00000033                                                  14556          0          3        450        500
pjsip/distributor-00000034                                                  13855          0          3        450        500
pjsip/distributor-00000035                                                  15185          0          3        450        500
pjsip/distributor-00000036                                                  15766          0          3        450        500
pjsip/distributor-00000037                                                  14675          0          3        450        500
pjsip/distributor-00000038                                                  16516          0          3        450        500
pjsip/distributor-00000039                                                  17990          0          3        450        500
pjsip/distributor-0000003a                                                  14999          0          3        450        500
pjsip/distributor-0000003b                                                  14020          0          3        450        500
pjsip/distributor-0000003c                                                  16403          0          3        450        500
pjsip/distributor-0000003d                                                  14295          0          3        450        500
pjsip/distributor-0000003e                                                  16185          0          2        450        500
pjsip/exten_state                                                               0          0          0        450        500
pjsip/messaging                                                                 0          0          0        450        500
pjsip/mwi-00000050                                                              1          0          1        450        500
pjsip/mwi-00000051                                                              0          0          0        450        500
pjsip/mwi-00000052                                                              0          0          0        450        500
pjsip/mwi-00000053                                                              0          0          0        450        500
pjsip/mwi-00000054                                                              0          0          0        450        500
pjsip/mwi-00000055                                                              0          0          0        450        500
pjsip/mwi-00000056                                                              0          0          0        450        500
pjsip/mwi-00000057                                                              0          0          0        450        500
pjsip/options/CdO-00000042                                                      7          0          1        450        500
pjsip/options/CtO-00000041                                                      7          0          1        450        500
pjsip/options/generic-aor-0000003f                                              7          0          1        450        500
pjsip/options/LO-00000043                                                       7          0          1        450        500
pjsip/options/manage                                                           29          0          6       4500       5000
pjsip/options/TC-00000040                                                  332067          0          5        450        500
pjsip/pool                                                                 617229          0          3        450        500
pjsip/pool-control                                                        1235179          0          5        450        500
sorcery/acl-00000059                                                            0          0          0        450        500
sorcery/aor-00000017                                                            4          0          1        450        500
sorcery/asterisk-publication-0000004f                                           0          0          0        450        500
sorcery/auth-00000012                                                           3          0          1        450        500
sorcery/bucket-00000000                                                         0          0          0        450        500
sorcery/certificate-00000047                                                    0          0          0        450        500
sorcery/client-00000049                                                         3          0          1        450        500
sorcery/contact-00000016                                                        3          0          1       1350       1500
sorcery/domain_alias-00000018                                                   0          0          0        450        500
sorcery/endpoint-00000013                                                       4          0          1        450        500
sorcery/file-00000001                                                           0          0          0        450        500
sorcery/general-00000045                                                        0          0          0        450        500
sorcery/global-00000019                                                         8          0          1        450        500
sorcery/identify-00000044                                                       0          0          0        450        500
sorcery/inbound-publication-0000004d                                            0          0          0        450        500
sorcery/log_mappings-00000008                                                   0          0          0        450        500
sorcery/nat_hook-00000014                                                       0          0          0        450        500
sorcery/outbound-publish-0000004a                                               0          0          0        450        500
sorcery/pool                                                                   32          0          2        450        500
sorcery/pool-control                                                           72          0          2        450        500
sorcery/profile-00000048                                                        0          0          0        450        500
sorcery/registration-0000005b                                                   7          0          1        450        500
sorcery/resource_list-0000004c                                                  0          0          0        450        500
sorcery/store-00000046                                                          0          0          0        450        500
sorcery/subscription_persistence-0000004b                                       0          0          0        450        500
sorcery/system-00000009                                                         0          0          0        450        500
sorcery/transport-00000015                                                      0          0          0        450        500
stasis/m:bridge:all-0000005e                                                    1          0          1        450        500
stasis/m:cache_pattern:0/endpoint:all-00000007                                 35          0          2        450        500
stasis/m:cdr:aggregator-00000005                                          1042337          0         24       4500       5000
stasis/m:channel:all-0000005f                                                   1          0          1        450        500
stasis/m:devicestate:all-00000002                                           25478          0          7        450        500
stasis/m:devicestate:all-00000003                                           25478          0          7        450        500
stasis/m:manager:core-00000006                                            7860928          0        356       2700       3000
stasis/m:mwi:all-0000005d                                                      29          0          7        450        500
stasis/m:presence_state:all-00000004                                            1          0          1        450        500
stasis/m:security:all-0000001f                                                  1          0          1        450        500
stasis/m:security:all-0000005a                                                  1          0          1        450        500
stasis/m:security:all-00000060                                                  1          0          1        450        500
stasis/m:system:all-0000005c                                                    1          0          1        450        500
stasis/p:endpoint:PJSIP/CdO-0000001d                                            1          0          1        450        500
stasis/p:endpoint:PJSIP/CtO-0000001c                                            1          0          1        450        500
stasis/p:endpoint:PJSIP/endpoint-external-0000001a                        1041097       1238          9        450        500
stasis/p:endpoint:PJSIP/LO-0000001e                                             1          0          1        450        500
stasis/p:endpoint:PJSIP/TC-0000001b                                             1          0          1        450        500
stasis/pool                                                                933345          1          2        450        500
stasis/pool-control                                                       1872113          0          4        450        500

From what I gather, stasis/p:endpoint:PJSIP/endpoint-external-0000001a has 1238 tasks in the queue but a high water mark of 500? There are currently 0 calls on the box, so why would PJSIP be taking up any more resources? Why are they not being freed? a14 is still up in its “sad” state if it would help to look at it directly.

The taskprocessor list is showing that stasis/p:endpoint:PJSIP/endpoint-external-0000001a has 1238 tasks in the queue but has a max queue depth of 9. This contradiction indicates that the thread handling that taskprocessor queue is deadlocked for some reason because it has not been able to update the max queue depth statistic. By default when a taskprocessor queue reaches the high water level, Asterisk stops processing any further new PJSIP calls until the queue backlog goes below the low water level.

https://docs.asterisk.org/Asterisk_20_Documentation/API_Documentation/Module_Configuration/res_pjsip/#taskprocessor_overload_trigger
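The check described above can be automated with a small filter over the CLI output. This is only a sketch: `flag_backed_up_taskprocessors` is a hypothetical name, and it assumes the six-column layout shown earlier in this thread (name, Processed, In Queue, Max Depth, Low water, High water).

```shell
#!/bin/sh
# Hypothetical filter: reads "core show taskprocessors" output on stdin and
# prints rows whose queue exceeds the high water mark or the recorded max depth.
flag_backed_up_taskprocessors() {
    # Columns are counted from the right so stray whitespace after the name
    # does not matter; the header and summary lines fail the numeric test.
    awk 'NF >= 6 && $(NF-3) ~ /^[0-9]+$/ {
        queued    = $(NF-3) + 0
        max_depth = $(NF-2) + 0
        high      = $NF + 0
        if (queued > high || queued > max_depth)
            print $1, "in_queue=" queued, "max_depth=" max_depth, "high_water=" high
    }'
}
```

It could be fed with `asterisk -rx 'core show taskprocessors' | flag_backed_up_taskprocessors`; against the listing in this thread it would flag only the endpoint-external taskprocessor.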

Are there any docs that explain what the different numbers in the queue are and what they mean? I have since restarted Asterisk. When this happens again and the number of queued tasks is higher than the max depth, should I get a backtrace or should I be looking elsewhere? Another interesting thing I noticed: when I ran `asterisk -rx 'module show'` it showed a use count of 65 even though there were no AGIs running. I assume something is “stuck”, which may be a bug?

Anyone have any ideas on what could be wrong or how to debug further?

Curious if you can try restarting the database instance when it locks up? That might cause whatever AGIs are hanging to come unstuck.

I am using AWS RDS so I can’t really restart it. Strangely, this has not been an issue as of late; not sure why.
