NOTIFYs not generated on Hint State Changes

Hello,

Asterisk 18.17.1
PJPROJECT version 2.13

It seems like Asterisk simply stops sending a SIP Notify after a while. There are active hint watchers, but when looking at sngrep I don’t see any NOTIFY being sent.

Changing the state to InUse:

root@pbx2:~# asterisk -x "core show hints" | grep DND_105@T3
DND_105@T3_extension: Custom:T3_DND_105     State:InUse           Presence:not_set         Watchers  4

Then we changed it back:

root@pbx2:~# asterisk -x "core show hints" | grep DND_105@T3
DND_105@T3_extension: Custom:T3_DND_105     State:Unavailable     Presence:not_set         Watchers  4

But no NOTIFY was sent.

If we add a new watcher, Asterisk will send a NOTIFY only to the 5th new watcher, and not to the previous 4 watchers.

Appreciate any pointers.

Thanks

There is core debug logging around device state and other things, so the first step would be to get a log[1] and see if anything stands out. There is also the “pjsip show subscriptions inbound” CLI command to verify that the subscriptions are actually still active.

[1] Collecting Debug Information - Asterisk Documentation

Thanks. I’ll try to get a log.

Looking at the inbound subs I see the below:

Endpoint: GT105/"John Doe" <105>
Resource: DND_105/dialog
  Expiry:        0  106712756351905-581141140314699@REMOTE_WAN_IP

Endpoint: GT105/"John Doe" <105>
Resource: DND_105/dialog
  Expiry:        0  371454462044969-360622824216565@REMOTE_WAN_IP

Endpoint: GT105/"John Doe" <105>
Resource: DND_105/dialog
  Expiry:        0  611255120157165-245104492615373@REMOTE_WAN_IP

Endpoint: GT105/"John Doe" <105>
Resource: DND_105/dialog
  Expiry:      769  469964006827744-641356236947188@REMOTE_WAN_IP

Does Expiry 0 indicate that it is expired? And 769 is the amount of seconds until it expires?

Yes, and yes. I’m also not sure why the subscription is still active there - but I recall changes being done in that area and multiple fixes being done.

I see. It seems to me like the issues were introduced after updating PJSIP to 2.13, but I may be wrong.

I see some work was done to correct this under: [ASTERISK-30469] res_pjsip_pubsub: Regression for subscription shutdowns - Digium/Asterisk JIRA

If it can be reproduced under the current release then please file an issue on Github.

1 Like

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.

Coming back to this to report that this issue is still happening on a single box that uses Asterisk 18.20.0
Other Asterisk instances (same version) does not have this issue.

We see that for some reason, after the client sends a SUBSCRIBE with Expire 0, asterisk marks that subscription as inactive instead of removing it. If Asterisk has an inactive subscription, it eventually will stop sending NOTIFYs to the active subscriptions.

root@hdc2:~# asterisk -x'pjsip show subscriptions inbound like "Expiry:        0"' | grep matched
2490 active subscriptions matched "Expiry:        0"

Is there a way to enable debug for a specific endpoint only?

No, there is no granularity like that.

Understood, is there a way to flush all inactive subscriptions?

What you see in CLI commands is what there is.

It took us some time to reproduce. Here is what we see…

When we unregister the client and Asterisk removes the subscription we see the following:

[2024-07-02 11:49:48] DEBUG[2482351] res_pjsip_pubsub.c: Removing subscription 'GT251->DND_251' from list of subscriptions
[2024-07-02 11:49:48] DEBUG[2482351] res_pjsip_pubsub.c: Removing subscription 0x7f941511c530 'GT251->DND_251' reference to subscription tree 0x7f9415b56e10
[2024-07-02 11:49:48] DEBUG[2482351] res_pjsip_pubsub.c: Destroying subscription tree 0x7f9415b56e10 'GT251->DND_251'
[2024-07-02 11:49:48] DEBUG[2482351] res_pjsip_pubsub.c: Destroying SIP subscription from 'GT251->DND_251'

When we unresgiter and asterisk fails to remove the subscription, we see:

[2024-07-02 11:38:56] DEBUG[169457] res_pjsip_pubsub.c: evsub 0x7f9449c6a7e8 state TERMINATED event TSX_STATE sub_tree 0x7f948082aff0 sub_tree state Normal
[2024-07-02 11:38:56] DEBUG[169457] res_pjsip_pubsub.c: Subscription ending, do nothing.
[2024-07-02 11:38:56] DEBUG[169457] res_pjsip_pubsub.c: evsub 0x7f9449c6a7e8 sub_tree 0x7f948082aff0 sub_tree state Normal
[2024-07-02 11:38:56] DEBUG[169457] res_pjsip_pubsub.c: Updating persistence for 'GT251->DND_251'  prune on boot: no
[2024-07-02 11:38:56] DEBUG[169457] res_pjsip_pubsub.c: sub_tree 0x7f948082aff0 sub_tree state TerminatePending
[2024-07-02 11:38:56] DEBUG[169457] res_pjsip_pubsub.c: Updating persistence for 'GT251->DND_251'  prune on boot: no

Here’s is our theory.

We looked in github for the Subscription ending, do nothing and it is referenced here:

Looking at that function, if we understand it correctly, Asterisk will not remove the subscription if there is a pending NOTIFY.

Looking at the SIP capture we actually see that this is the case.
(We use Kamailio as an edge)

Asterisk removed the subscription, you can see the NOTIFY after the SUBSCRIBE with Expire of 0:

Asterisk failed to remove the subscription, no NOTIFY sent after the Exp 0:

So, it seems, that for whatever reason, Asterisk caches that NOTIFY but can’t sent it because the endpoint is no longer active, but since there is an active (cached) NOTIFY waiting to be sent, Asterisk is not removing the subscription even it is expired.

Please let me know if you need any further information

I would suggest filing an issue with the logs and details, it may be an off-nominal/race condition.

1 Like

Update: This was caused by allowing transports to reload.

Once we disabled the transport reloads, there are no stuck NOTIFYs