Weird PJSIP Issues and possible memory lieak?


#1

Potentiallly related to closed https://issues.asterisk.org/jira/browse/ASTERISK-25653

While I do have sip clients on cellular right now this is all occuring in-house over wifi or wired.
My Asterisk 14/Freepbx14 is located on vultr and now is 2048GB ram up from 1024 ram last night

Is anyone having weird issues with PJSIP? Since I migrated from ASterisk 13/Freepbx 12 I’ve had nothing but extensions dropping off. I was running on a 1024 RAM on Vultr and upped it to 2048 ram last night and TOP is showing me free memory keeps dropping:

top - 10:21:35 up 12:16, 1 user, load average: 0.02, 0.13, 0.15
Tasks: 107 total, 1 running, 106 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.3 us, 0.3 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1883708 total, 212440 free, 778244 used, 893024 buff/cache
KiB Swap: 2097148 total, 2094288 free, 2860 used. 790064 avail Mem

This started off as about 650000 free last night. looking at a asterisk debug I see pjsip running out of memory. Before I go over to Asterisk on this has anyone been seeing anything weird?

Thanks leon

some of the asterisk debug

[2017-09-06 09:45:48] ERROR[7284] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 09:45:48] ERROR[3640] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 09:45:48] ERROR[3640] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 09:45:48] ERROR[3640] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 09:45:48] ERROR[3640] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 09:45:48] ERROR[7285] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 09:45:48] ERROR[7285] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 09:45:48] ERROR[7285] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 09:45:48] ERROR[7285] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 09:45:48] ERROR[7285] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 09:45:48] ERROR[2803] pjproject: ssl0x7f0b5c0ccd40 Renegotiation failed: Not enough memory (PJ_ENOMEM)
[2017-09-06 09:45:48] VERBOSE[2809] res_pjsip/pjsip_configuration.c: Contact 702/sips:702@70.44.10.180:45384;transport=TLS is now Unreachable. RTT: 0.000 msec
[2017-09-06 09:54:06] VERBOSE[8236] res_pjsip/pjsip_configuration.c: Contact 702/sips:702@70.44.10.180:38964;transport=TLS has been deleted
[2017-09-06 09:56:06] VERBOSE[8236] res_pjsip/pjsip_configuration.c: Contact 702/sips:702@70.44.10.180:45367;transport=TLS has been deleted
[2017-09-06 09:59:30] VERBOSE[7602] pbx_variables.c: Setting global variable ‘SIPDOMAIN’ to ‘pbx.backwoodswireless.net

more interesting debugs:

[2017-09-06 06:23:45] ERROR[32309] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 06:23:45] ERROR[32309] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 06:23:45] ERROR[7602] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 06:23:45] ERROR[2803] pjproject: ssl0x7f0b641d3e60 Renegotiation failed: Not enough memory (PJ_ENOMEM)
[2017-09-06 06:23:45] VERBOSE[2809] res_pjsip/pjsip_configuration.c: Contact 702/sips:702@70.44.10.180:38938;transport=TLS is now Unreachable. RTT: 0.000 msec
[2017-09-06 06:25:44] VERBOSE[7163] res_pjsip_registrar.c: Added contact ‘sips:702@70.44.10.180:45328;transport=TLS’ to AOR ‘702’ with expiration of 3600 seconds
[2017-09-06 06:25:44] VERBOSE[8236] res_pjsip/pjsip_configuration.c: Contact 702/sips:702@70.44.10.180:45328;transport=TLS has been created
[2017-09-06 06:25:44] ERROR[31134] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 06:25:44] ERROR[31134] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 06:25:44] ERROR[31134] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702

[2017-09-06 06:25:44] ERROR[8841] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 06:25:44] ERROR[7163] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 06:25:44] ERROR[2803] pjproject: ssl0x7f0b4c626b10 Renegotiation failed: Not enough memory (PJ_ENOMEM)
[2017-09-06 06:25:44] VERBOSE[2809] res_pjsip/pjsip_configuration.c: Contact 702/sips:702@70.44.10.180:45328;transport=TLS is now Unreachable. RTT: 0.000 msec
[2017-09-06 06:33:36] ERROR[2803] pjproject: sip_endpoint.c Error processing packet from 70.44.10.180:5060: Missing required header(s) (PJSIP_EMISSINGHDR) Via [code 171050]:
SIP/2.0 400 Bad Request
From: sip:701@209.222.10.59;tag=bcd2606e-5c15-492c-b6d3-0c934bb00eb4
To: sip:701@70.44.10.180;tag=9ffa1a1b62bb6a06
Call-ID: 62929999-00d7-44ed-bb3f-a43ec24ff7a9
CSeq: 6193 NOTIFY
User-Agent: Grandstream GXP2020 1.2.5.3
Warning: 398 "You are not allowed to CANCEL an established dialog"
Content-Length: 0

– end of packet.
[2017-09-06 06:34:03] VERBOSE[8236] res_pjsip/pjsip_configuration.c: Contact 702/sips:702@70.44.10.180:38926;transport=TLS has been deleted
[2017-09-06 06:34:04] ERROR[2803] pjproject: sip_endpoint.c Error processing packet from 70.44.10.180:5066: Missing required header(s) (PJSIP_EMISSINGHDR) Via [code 171050]:
SIP/2.0 400 Bad Request
From: sip:702@209.222.10.59;tag=2c13f4fb-d808-43ef-ba65-b6455e6f978e
To: sip:702@70.44.10.180;tag=86962d2d9f6b5e8b
Call-ID: b6f21b3c-85c5-44bc-a7b7-6b8400528244
CSeq: 4265 NOTIFY
User-Agent: Grandstream GXP2020 1.2.5.3
Warning: 398 "You are not allowed to CANCEL an established dialog"
Content-Length: 0

– end of packet.
[2017-09-06 06:35:52] VERBOSE[2809] res_pjsip/pjsip_configuration.c: Contact 702/sips:702@70.44.10.180:45312;transport=TLS has been deleted
[2017-09-06 06:36:02] VERBOSE[2767] asterisk.c: Remote UNIX connection
[2017-09-06 06:36:02] VERBOSE[2871] asterisk.c: Remote UNIX connection disconnected
[2017-09-06 06:36:02] VERBOSE[2767] asterisk.c: Remote UNIX connection
[2017-09-06 06:36:02] VERBOSE[2873] asterisk.c: Remote UNIX connection disconnected
[2017-09-06 06:36:02] VERBOSE[2767] asterisk.c: Remote UNIX connection
[2017-09-06 06:36:02] VERBOSE[2875] asterisk.c: Remote UNIX connection disconnected
[2017-09-06 06:38:40] VERBOSE[2809] res_pjsip/pjsip_configuration.c: Contact 701/sip:701@70.44.10.180:5060;transport=TCP is now Unreachable. RTT: 0.000 msec
[2017-09-06 06:41:34] ERROR[2803] pjproject: sip_transport.c Error processing 161 bytes packet from TCP 70.44.10.180:5066 : PJSIP syntax error exception when parsing ‘’ header on line 2 col 7:
SIP/2.0 400 Bad Request
CSeq: User-Agent: Grandstream GXP2020 1.2.5.3
Warning: 398 "You are not allowed to CANCEL an established dialog"
Content-Length: 0

– end of packet.
[2017-09-06 06:41:34] ERROR[2803] pjproject: sip_endpoint.c Error processing packet from 70.44.10.180:5066: Missing required header(s) (PJSIP_EMISSINGHDR) Via [code 171050]:
SIP/2.0 400 Bad Request
From: sip:702@209.222.10.59;tag=c69e843a-2268-4b15-8253-752439dec7ec
To: sip:702@70.44.10.180;tag=750369d3-f4fa-406f-a596-7b68de87290e
Call-ID: cce7be3d-4c88-498f-82dd-bb3b58bac473
CSeq: 15435 NOTIFY
User-Agent: Grandstream GXP2020 1.2.5.3
Warning: 398 "You are not allowed to CANCEL an established dialog"
Content-Length: 0

– end of packet.
[2017-09-06 06:41:36] VERBOSE[2809] res_pjsip/pjsip_configuration.c: Contact 702/sip:702@70.44.10.180:5066;transport=TCP has been deleted
[2017-09-06 07:13:45] VERBOSE[8841] res_pjsip_registrar.c: Added contact ‘sips:702@70.44.10.180:38939;transport=TLS’ to AOR ‘702’ with expiration of 3600 seconds
[2017-09-06 07:13:45] VERBOSE[2809] res_pjsip/pjsip_configuration.c: Contact 702/sips:702@70.44.10.180:38939;transport=TLS has been created
[2017-09-06 07:13:45] ERROR[32309] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702
[2017-09-06 07:13:45] ERROR[32309] res_pjsip.c: Error 70007 ‘Not enough memory (PJ_ENOMEM)’ sending NOTIFY request to endpoint 702

Thanks leon


#2

I am running ASterisk 14.6.1 Freepbx 14.0.1.4


#3

I haven’t seen any issues with such things and I’m aware of a few different major installs. It may be unique to your environment or what you are doing. More information would be useful. How many endpoints, subscriptions, type of transport, etc.


#4

AM using PJSip for endpoints. THis all worked under Asterisk 13/fpbx 12. very weird
basically I have three physical extensions 701-703. 703 is not connected now so just two 701 and 702. I have multiple devices talking to those extensions around the house (all grandstream and a zoiper android client which I disabled last night to debug this) On my Watchguard firewall I noticed in my traffic management window I was noticed weird sip behavior with lots of sip traffic. see this display which is no where near as bad as it was last night going over 120kbps!

Usually the SIP traffic is a flat line with very low bandwidth. It was hoping all around last night.
Also PJSIP extensions keep dropping out and I have to wait for the timers to expire. Going the debug log from last night you can see errors from the GXP2020 phone up above near the end of my original post.

On 701 I have PJSIP set for 8 and on 702 I have it set to 6. I usually add an extra 2 count in case things get out of sync

I have MWI subscriptions set on.

WOuld you like to get access to the server? If so, please post me privately wa4zlw@arrl.net

WHat else do you need? I’ve been running top all night and the free memory keeps going down pops up a bit which is what I would expect but something doesnt seem right to me

THanks leon


#5

I don’t provide one on one help like that. You’d need to look at the log before errors occur to see if anything looks out of the ordinary, for example is the Asterisk server under attack?


#6

I also have BLF setup as well for the extensions

no there is a firewall that freepbx runs so things are locked down.
the log entries I entered into the ticket is whta I thought unusual and why I added them to the ticket.

I looked at the previous ticket and it said to open a new one if you were not the original person which is what I did.

One thing I did do is cut down from 3600 --> 900 seconds the SIP max time

other than that I’m stumped.

thanks leon


#7

here’s another unregistration/disconnection:

[2017-09-06 11:53:57] VERBOSE[21467] asterisk.c: Remote UNIX connection
[2017-09-06 11:53:57] VERBOSE[15247] asterisk.c: Remote UNIX connection disconnected
[2017-09-06 11:53:57] VERBOSE[21467] asterisk.c: Remote UNIX connection
[2017-09-06 11:53:57] VERBOSE[15249] asterisk.c: Remote UNIX connection disconnected
[2017-09-06 11:53:57] VERBOSE[21467] asterisk.c: Remote UNIX connection
[2017-09-06 11:53:57] VERBOSE[15251] asterisk.c: Remote UNIX connection disconnected
[2017-09-06 12:02:41] VERBOSE[369] res_pjsip/pjsip_configuration.c: Contact 701/sip:701@70.44.10.180:44141;transport=TLS is now Unreachable. RTT: 0.000 msec
[2017-09-06 12:36:03] VERBOSE[21467] asterisk.c: Remote UNIX connection
[2017-09-06 12:36:03] VERBOSE[22779] asterisk.c: Remote UNIX connection disconnected
[2017-09-06 12:36:03] VERBOSE[21467] asterisk.c: Remote UNIX connection
[2017-09-06 12:36:03] VERBOSE[22781] asterisk.c: Remote UNIX connection disconnected
[2017-09-06 12:36:03] VERBOSE[21467] asterisk.c: Remote UNIX connection
[2017-09-06 12:36:03] VERBOSE[22783] asterisk.c: Remote UNIX connection disconnected


#8

more keep dropping like flies

[2017-09-06 12:36:03] VERBOSE[21467] asterisk.c: Remote UNIX connection
[2017-09-06 12:36:03] VERBOSE[22783] asterisk.c: Remote UNIX connection disconnected
[2017-09-06 12:49:43] VERBOSE[369] res_pjsip/pjsip_configuration.c: Contact 701/sip:701@70.44.10.180:44142;transport=TLS is now Unreachable. RTT: 0.000 msec
[2017-09-06 12:49:44] VERBOSE[369] res_pjsip/pjsip_configuration.c: Contact 701/sip:701@70.44.10.180:44143;transport=TLS is now Unreachable. RTT: 0.000 msec


#9

That would mean that the TLS connection has dropped, or the endpoint did not respond to our OPTIONS request.


#10

my latency to the pbx is like 40ms

and this just happened

[2017-09-06 12:59:13] VERBOSE[369] res_pjsip/pjsip_configuration.c: Contact 701/sip:701@70.44.10.180:44142;transport=TLS has been deleted
[2017-09-06 12:59:13] VERBOSE[369] res_pjsip/pjsip_configuration.c: Contact 701/sip:701@70.44.10.180:44141;transport=TLS has been deleted
[2017-09-06 12:59:13] VERBOSE[369] res_pjsip/pjsip_configuration.c: Contact 701/sip:701@70.44.10.180:44143;transport=TLS has been deleted

this is going on all day up and down…what about the other errors above?

Thanks leon


#11

I don’t really have anything to add. You can file an issue[1] with all the information you can. There’s no timeline on when it would get looked into, and as it only seems to be impacting you it could prove difficult to figure out.

[1] https://issues.asterisk.org/jira


#12

thanks just did that

Leon


#13

there’s definitely a memory leak. I just rebooted the PBX and this is the TOP I get. Before I shut it down it was like 200000 free and was getting those memory errors again.

Rebooted 9/6/17 @ 706PM

top - 19:09:13 up 0 min, 1 user, load average: 1.72, 0.49, 0.17
Tasks: 112 total, 2 running, 110 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.3 us, 0.7 sy, 0.0 ni, 98.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1883708 total, 1014856 free, 396464 used, 472388 buff/cache
KiB Swap: 2097148 total, 2097148 free, 0 used. 1270676 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
998 asterisk 20 0 646616 20836 10252 S 1.0 1.1 0:00.03 httpd
811 mongodb 20 0 477776 47644 20452 R 0.3 2.5 0:00.10 mongod
1044 mysql 20 0 1171728 106336 9192 S 0.3 5.6 0:00.62 mysqld
2699 asterisk 20 0 1614280 71772 17004 S 0.3 3.8 0:04.14 asterisk
3544 asterisk 20 0 1081264 49708 14392 S 0.3 2.6 0:00.84 node /var/www/h
1 root 20 0 43292 3724 2420 S 0.0 0.2 0:00.86 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.01 ksoftirqd/0
4 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
6 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/u2:0
7 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root 20 0 0 0 0 S 0.0 0.0 0:00.14 rcu_sched
10 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kdevtmpfs
13 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 netns
14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khungtaskd
15 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 writeback
16 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kintegrityd
17 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 bioset
18 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kblockd
19 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 md
20 root 20 0 0 0 0 S 0.0 0.0 0:00.03 kworker/0:1
25 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kswapd0
26 root 25 5 0 0 0 S 0.0 0.0 0:00.00 ksmd
27 root 39 19 0 0 0 S 0.0 0.0 0:00.00 khugepaged
28 root 20 0 0 0 0 S 0.0 0.0 0:00.00 fsnotify_mark
29 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 crypto
37 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kthrotld
38 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/u2:1
39 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kmpath_rdacd
40 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kpsmoused
41 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 ipv6_addrconf
60 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 deferwq
92 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kauditd
93 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/0:2
246 root 20 0 0 0 0 S 0.0 0.0 0:00.15 kworker/0:3
262 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 ata_sff


#14

another item is the phones still show registered yet the pbx doesnt :frowning:


#15

Hi folks…I’m running out of memory multiple times a day requiring manual reboot. This is not good for a live system. Is there anything you can please do to look at the stuff I already sent and/or see what the differences in from asterisk 13 to asterisk 14 with pjsip?

Thanks in advance leon


#16

You don’t need to post this in multiple places. As I stated on the issue, we’ll look into it as soon as we are able. As this is an open source project there is no timeframe on when things will get looked into or resolved.