[Solved] Endpoint removed from ps_contacts before expiration

kanarie2007 · September 17, 2015, 12:22pm

Hello all,

I’m struggeling for a few days now with something strange in my setup.

Running Asterisk 13.3.0 with pjsip, phones are sending a REGISTER with the parameter Expires: 3600
After a lookup in ps_endpoints, the contact is sucessfully added:
– Added contact ‘sip:+123456789@10.20.30.40:5060’ to AOR ‘+123456789’ with expiration of 3600 seconds

All goes well, but after some 12 minutes, the contact is gone, all I can see is a database delete action when debug logging is enabled:

SQL: DELETE FROM ps_contacts WHERE id=

Sadly, I have no control about the register frequency on the phones side.

Is there a parameter I’m missing?

Thanks!

jcolp · September 17, 2015, 2:03pm

Have you confirmed that the phone is not removing the contact? What is the complete console output around when it is deleted?

kanarie2007 · September 17, 2015, 2:15pm

Have debug logging on 1, verbose logging on 3, all I can see is the database SQL query removing the record, no other messages. Also pjsip logger on, no sip packet coming from the phone before the delete.

Just saw that the interval is rather random. Mostly between 5 and 45 minutes i guess.
Contacts just dissapear and of course coming back every hour due to the phone registration.

Any suggestions?

kanarie2007 · September 17, 2015, 3:13pm

Ok, additional output debuglevel increased.
This is the relevant logging part:

[17-Sep 16:56:51] DEBUG[1223]: pjsip0 <?>     tdta0x7f540c00 .Destroying txdata Response msg 200/OPTIONS/cseq=13723 (tdta0x7f540c008160)
[17-Sep 16:56:54] DEBUG[32410]: res_config_odbc.c113 custom_prepare Skip: 0; SQL: DELETE FROM ps_contacts WHERE id=?
[17-Sep 16:56:54] DEBUG[32410]: res_odbc.c1057 odbc_release_obj2 odbc_release_obj2(0x1c17f08) called (obj->txf = (nil))
[17-Sep 16:56:54] DEBUG[32372]: threadpool.c508 grow Increasing threadpool Sorcery's size by 1
[17-Sep 16:56:54] DEBUG[1722]: res_config_odbc.c113 custom_prepare Skip: 0; SQL: SELECT * FROM ps_aors WHERE id = ?
[17-Sep 16:56:54] DEBUG[1722]: res_config_odbc.c129 custom_prepare Parameter 1 ('id') = '+12345678901'
[17-Sep 16:56:54] DEBUG[1722]: res_odbc.c1057 odbc_release_obj2 odbc_release_obj2(0x1c17f08) called (obj->txf = (nil))
[17-Sep 16:56:54] DEBUG[1722]: config.c3530 ast_parse_arg extract int from [60] in [-2147483648, 2147483647] gives [60](0)
[17-Sep 16:56:54] DEBUG[1722]: config.c3530 ast_parse_arg extract int from [3600] in [-2147483648, 2147483647] gives [3600](0)
[17-Sep 16:56:54] DEBUG[1722]: config.c3530 ast_parse_arg extract int from [0] in [-2147483648, 2147483647] gives [0](0)
[17-Sep 16:56:54] DEBUG[1722]: config.c3530 ast_parse_arg extract int from [7200] in [-2147483648, 2147483647] gives [7200](0)
[17-Sep 16:56:54] DEBUG[1722]: config.c3530 ast_parse_arg extract int from [3600] in [-2147483648, 2147483647] gives [3600](0)
[17-Sep 16:56:54] DEBUG[1722]: config.c3530 ast_parse_arg extract int from [10] in [-2147483648, 2147483647] gives [10](0)
[17-Sep 16:56:54] DEBUG[1722]: config.c3530 ast_parse_arg extract int from [3600] in [-2147483648, 2147483647] gives [3600](0)
[17-Sep 16:56:54] DEBUG[1722]: config.c3530 ast_parse_arg extract int from [60] in [0, 86400] gives [60](0)
[17-Sep 16:56:54] DEBUG[1722]: res_config_odbc.c113 custom_prepare Skip: 0; SQL: SELECT * FROM ps_contacts WHERE id LIKE ? ORDER BY id
[17-Sep 16:56:54] DEBUG[1722]: res_config_odbc.c129 custom_prepare Parameter 1 ('id LIKE') = '+12345678901;@%'
[17-Sep 16:56:54] DEBUG[1722]: res_odbc.c1057 odbc_release_obj2 odbc_release_obj2(0x1c17f08) called (obj->txf = (nil))
[17-Sep 16:56:54] DEBUG[32373]: threadpool.c508 grow Increasing threadpool stasis-core's size by 1
[17-Sep 16:56:54] DEBUG[32380]: devicestate.c368 _ast_device_state No provider found, checking channel drivers for PJSIP - +12345678901
[17-Sep 16:56:54] DEBUG[32380]: res_config_odbc.c113 custom_prepare Skip: 0; SQL: SELECT * FROM ps_endpoints WHERE id = ?
[17-Sep 16:56:54] DEBUG[32380]: res_config_odbc.c129 custom_prepare Parameter 1 ('id') = '+12345678901'
[17-Sep 16:56:54] DEBUG[32380]: res_odbc.c1057 odbc_release_obj2 odbc_release_obj2(0x1c17f08) called (obj->txf = (nil))
[17-Sep 16:56:54] DEBUG[32380]: res_sorcery_realtime.c119 sorcery_realtime_filter_objectset Filtering out realtime field 'disallow' from retrieval
[17-Sep 16:56:54] DEBUG[32380]: config.c3530 ast_parse_arg extract int from [0] in [-2147483648, 2147483647] gives [0](0)
[17-Sep 16:56:54] DEBUG[32380]: config.c3530 ast_parse_arg extract int from [0] in [-2147483648, 2147483647] gives [0](0)
[17-Sep 16:56:54] DEBUG[32380]: config.c3530 ast_parse_arg extract int from [1800] in [-2147483648, 2147483647] gives [1800](0)
[17-Sep 16:56:54] DEBUG[32380]: config.c3530 ast_parse_arg extract int from [0] in [-2147483648, 2147483647] gives [0](0)
[17-Sep 16:56:54] DEBUG[32380]: config.c3530 ast_parse_arg extract int from [0] in [-2147483648, 2147483647] gives [0](0)
[17-Sep 16:56:54] DEBUG[32380]: config.c3530 ast_parse_arg extract int from [90] in [-2147483648, 2147483647] gives [90](0)
[17-Sep 16:56:54] DEBUG[32380]: config.c3530 ast_parse_arg extract int from [0] in [-2147483648, 2147483647] gives [0](0)
[17-Sep 16:56:54] DEBUG[32380]: devicestate.c464 do_state_change Changing state for PJSIP/+12345678901 - state 5 (Unavailable)

Any ideas?

kanarie2007 · September 18, 2015, 10:37am

I’m sorry for answering my own post, but I think I have found the problem.

I did forget to mention that we are using two Asterisk machines, who share the same database.
Both machines are getting SIP traffic from the provider and handsets in a roundrobin way.

So what happens:

Registration comes in on Asterisk1, is set in contacts with a expire of 3600 sec.
Before the end of the expire, phone sends another register, but it arrives on Asterisk2 due to the roundrobin strategy.
Asterisk2 does a nice update of the shared record, but it seems Asterisk1 isn’t aware of this.

The timer of Asterisk1 for this phone expires and does a remove from the shared contacts table.

Some time later, a new registration comes in on Asterisk1.

It seemed random to me, but the registration attempts are within the 3600 sec on a 70-90% timespan base or so. After analyzing the debuglog, I saw a delete exactly after 3600 sec of the initial register attempt.

For testing purposes, I have set all expires on 7200 sec, just to see if this is really the case.

Is there maybe something in the settings I’m missing, or is this just normal behaviour that both Asterisk machines are unaware of the update record, even if they share the same database?

Thanks a lot!

jcolp · September 18, 2015, 11:48am

It’s normal, you can’t have the res_pjsip_registrar_expire module loaded in such an environment. It reacts based on actions from the registrar on the instance itself. If another instance gets a REGISTER then it won’t know. The only way to make it work otherwise would be either having it query the database constantly to update its state or some communication between instances to push the information around. Neither of these exist today.

kanarie2007 · September 18, 2015, 12:38pm

Ok, consider it solved now, with increased expiration (double) due to roundrobin registars on two instances.

Thanks for your help!

kanarie2007 · September 21, 2015, 9:47am

Unfortunately, it’s not resolved yet.

When a registrations is send to instance 1, the internal timer will countdown to purge the record.
When I increased the expiration time, the problem only occurs less.
At some point, the re-registration will be send to only instance 2 for a couple of rounds, due to the roundrobin method and will be purged by instance 1. When we deploy more instances in the future, this will happen even ofter.

Are you sure, there is no way two (or more) instances can keep the counter in the external database?

Otherwise, is there a way to completely disable the purge routine like set the expire to 0?
Or just set the module res_pjsip_registrar_expire to noload in modules.conf?
I can have a custom external process keep track of this and purge only when it’s needed.

Thanks for your help.

jcolp · September 21, 2015, 8:15pm

The res_pjsip_registrar_expire module is what implements that logic. Disabling it is noloading it. Contacts will revert to being purged on retrieval.

kanarie2007 · September 22, 2015, 7:27am

Ok, thanks, not using the res_pjsip_registrar_expire module is the solution for me.

Changed subject to [Solved].

Topic		Replies	Views
Max_contacts, remove_existing, and unavailable contacts Asterisk SIP	2	1743	August 25, 2021
Unregistered device with PJSIP realtime Asterisk SIP	7	1071	November 22, 2019
Problems with pjsip contact Asterisk SIP	11	3075	October 9, 2019
PJSIP endpoints with multiple contacts Asterisk SIP	4	832	December 1, 2021
[Solved] PJSIP Contact was deleted while dialing extension Asterisk SIP	3	2400	January 12, 2020

[Solved] Endpoint removed from ps_contacts before expiration

Related topics