Asterisk hangs on Reload (for up to 20 seconds)

We have been running a large Asterisk cluster (4000+ peers) for 9+ years. Our previous cluster ran 1.4 and was, for the most part, pretty stable. Over the weekend we deployed a new set of servers running 11.23.0. During all my testing the reload command was flawless whenever we needed to reload changes we had made. Now when I issue a reload command the server hangs for up to 20 seconds, and during this time callers get nothing but dead air if they try to make or receive a call.

Does anyone have any idea what could cause this? This is a large, critical system, and being able to reload during the day is important.

Rick

Please see http://www.catb.org/~esr/faqs/smart-questions.html#urgent

This sounds like a DNS timeout.

No DNS involved; all phones and devices use static IPs. The config is exactly the same as on my Asterisk 1.4 server, except for modified IPs and a few SIP (Directrtp) settings.

Asterisk will reverse resolve its own address.

Has that changed since 1.4? I never had that issue before. I’ll make sure its reverse DNS is working. Do I need reverse DNS for all my interfaces (management, backup network, etc.) or just for the primary address that Asterisk is bound to?
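For anyone wanting to rule the reverse lookup in or out, here is a minimal sketch that times a PTR lookup from the Asterisk host itself. It only uses Python's standard `socket.gethostbyaddr`; the function name `time_reverse_lookup` and the address in the usage note are made up for illustration, and it says nothing about which lookups Asterisk actually performs during a reload.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, seconds taken) for a reverse lookup of ip.

    The resolver argument is a hypothetical hook so the function can be
    exercised without touching real DNS; by default the system resolver
    is used.
    """
    start = time.monotonic()
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses
        hostname = None
    elapsed = time.monotonic() - start
    return hostname, elapsed
```

Calling `time_reverse_lookup("192.0.2.10")` with the address Asterisk is bound to (192.0.2.10 is only a placeholder) returns the resolved name and how long the lookup took. A result that takes several seconds, or comes back as `None` after a long wait, would be consistent with the DNS timeout theory.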

I did individual reloads, and the longest hang seems to come from reloading voicemail. Would the voicemail module require DNS?

dialplan reload
sip reload
voicemail reload

FYI Asterisk 11.23 / 4500 phones / 3200 mailboxes.

Rick

Reverse DNS is set up for the IP that Asterisk is bound to. dialplan reload and sip reload seem OK; voicemail reload hangs for up to 20 seconds, then loads without issue the next time.
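To put numbers on the per-module behaviour, something like the following sketch could time each of the three reloads mentioned in this thread by running them through `asterisk -rx`. The helper names (`run_asterisk_cli`, `time_commands`) are hypothetical. The measurement covers only how long the CLI command takes to return, which may not include work Asterisk carries on doing in the background.

```python
import subprocess
import time

# The three reload commands mentioned in this thread
RELOAD_COMMANDS = ["dialplan reload", "sip reload", "voicemail reload"]


def run_asterisk_cli(command):
    """Run one CLI command against the local Asterisk via `asterisk -rx`."""
    subprocess.run(["asterisk", "-rx", command], check=True, capture_output=True)


def time_commands(commands, runner=run_asterisk_cli):
    """Run each command in order and return {command: seconds taken}.

    The runner argument is a hypothetical hook so the timing loop can be
    tried out without a running Asterisk instance.
    """
    timings = {}
    for command in commands:
        start = time.monotonic()
        runner(command)
        timings[command] = time.monotonic() - start
    return timings
```

Running `time_commands(RELOAD_COMMANDS)` twice in a row and comparing the two results would show whether only the first voicemail reload is slow, as described above.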

Are you using a realtime driver for voicemail?

What’s in your extconfig.conf file?

No, everything is static flat files; VM messages are stored on a glusterfs mount.

;
; Static and realtime external configuration
; engine configuration
;
; Please read doc/README.extconfig for basic table
; formatting information.
;
[settings]
;
; Static configuration files:
;
; file.conf => driver,database[,table]
;
; maps a particular configuration file to the given
; database driver, database and table (or uses the
; name of the file as the table if not specified)
;
;uncomment to load queues.conf via the odbc engine.
;
;queues.conf => odbc,asterisk,ast_config

;
; Realtime configuration engine
;
; maps a particular family of realtime
; configuration to a given database driver,
; database and table (or uses the name of
; the family if the table is not specified
;
;example => odbc,asterisk,alttable
;iaxusers => odbc,asterisk
;iaxpeers => odbc,asterisk
;sipusers => odbc,asterisk
;sippeers => odbc,asterisk
;voicemail => odbc,asterisk
;extensions => odbc,asterisk

I believe the voicemail reload rescans mailboxes so it can send notifies of new messages; maybe it’s an I/O timeout.

I’m not familiar with glusterfs; from a quick Google it sounds like network storage. Could it be timing out?
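If the mailbox rescan really is the slow part, a rough way to check is to time a plain directory walk over the voicemail spool on the glusterfs mount. This is only a sketch under that assumption: it does not reproduce what the voicemail module does, and the function name `time_spool_scan` and the path in the usage note are illustrative, not taken from the actual setup.

```python
import os
import time


def time_spool_scan(spool_dir):
    """Walk spool_dir and report directory count, file count, and seconds taken."""
    # os.walk silently yields nothing for a missing path, so check first
    if not os.path.isdir(spool_dir):
        raise NotADirectoryError(spool_dir)

    start = time.monotonic()
    dirs = 0
    files = 0
    for _root, dirnames, filenames in os.walk(spool_dir):
        dirs += len(dirnames)
        files += len(filenames)
    elapsed = time.monotonic() - start
    return {"dirs": dirs, "files": files, "seconds": elapsed}
```

For example, `time_spool_scan("/var/spool/asterisk/voicemail")` (the usual default spool location; the real mount point may differ) run twice in a row would show whether the first scan is much slower than the second, which would match the "hangs once, then loads without issue" pattern.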

Possibly. The I/O is a bit slow; I’m working with Red Hat to see if this can be tuned for speed. They assured me it would be fast enough, but this could be the issue.

Rick