Asterisk Redundancy

Lots of questions on this. No answers.

I’d like to set up an Asterisk farm with multiple boxes that can load balance or handle failure of nodes. We’re going to be using Polycom phones and these support DNS NAPTR/SRV lookups to allow the phones to select their registrar/proxy to use based on a round robin or failover approach. This part seems easy. I hope I’m not missing something.
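To make the round-robin/failover idea concrete, here is a minimal sketch of how a client could order SRV records along the lines of RFC 2782: lower priority values are tried first (failover), and records with the same priority are shuffled by weight (load balancing). I haven't checked that the Polycom firmware does exactly this. The hostnames, the port and the `order_srv_records` helper are made up for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    priority: int  # lower value is tried first (failover order)
    weight: int    # relative share among records of equal priority
    port: int
    target: str    # hypothetical hostname of an Asterisk node


def order_srv_records(records, rng=None):
    """Return records in the order a client would try them."""
    rng = rng or random.Random()
    ordered = []
    for priority in sorted({r.priority for r in records}):
        # Zero-weight records go first so they still have a small chance.
        group = sorted(
            (r for r in records if r.priority == priority),
            key=lambda r: r.weight != 0,
        )
        while group:
            total = sum(r.weight for r in group)
            threshold = rng.randint(0, total)
            running = 0
            for record in group:
                running += record.weight
                if running >= threshold:
                    ordered.append(record)
                    group.remove(record)
                    break
    return ordered


# Two equal-weight primaries and one backup (all names are assumptions).
EXAMPLE_RECORDS = [
    SrvRecord(20, 0, 5060, "pbx3.example.com"),
    SrvRecord(10, 50, 5060, "pbx1.example.com"),
    SrvRecord(10, 50, 5060, "pbx2.example.com"),
]
```

With records like these, a phone would spread registrations across pbx1 and pbx2 and only fall back to pbx3 when both primaries fail to answer.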

The tricky part is maintaining a consistent data set between all the Asterisk boxes. One approach, the poor man’s approach, would be to simply use the conf files directly, and write some custom scripts that replicate (say with rsync) files from a master edit machine to all nodes in the farm at regular intervals. We could use CVS on the master machine to allow us to roll back to any previous version of the files.
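A rough sketch of what such a replication script might look like, assuming passwordless SSH from the master to each node and that running `asterisk -rx "reload"` on a node is enough to pick up the new files. The node names and the `push_config` helper are hypothetical; a cron job on the master would call it at whatever interval suits.

```python
import subprocess

# Hypothetical node list; replace with the real farm members.
NODES = ["pbx1.example.com", "pbx2.example.com"]
CONFIG_DIR = "/etc/asterisk"


def build_rsync_command(source_dir, host, dest_dir):
    """Build an rsync command that mirrors source_dir onto host:dest_dir."""
    # A trailing slash makes rsync copy the directory contents, not the
    # directory itself.
    source = source_dir.rstrip("/") + "/"
    dest = f"{host}:{dest_dir.rstrip('/')}/"
    # -a preserves permissions and timestamps, -z compresses,
    # --delete removes files on the node that no longer exist on the master.
    return ["rsync", "-az", "--delete", "-e", "ssh", source, dest]


def build_reload_command(host):
    """Build an ssh command asking the remote Asterisk to reload its config."""
    return ["ssh", host, "asterisk -rx 'reload'"]


def push_config(nodes, config_dir=CONFIG_DIR):
    """Copy the config to every node; return the hosts that failed."""
    failed = []
    for host in nodes:
        for command in (
            build_rsync_command(config_dir, host, config_dir),
            build_reload_command(host),
        ):
            if subprocess.run(command).returncode != 0:
                failed.append(host)
                break  # skip the reload if the copy did not succeed
    return failed


# Example use from a cron-driven script:
#     failed_hosts = push_config(NODES)
```

Returning the list of failed hosts leaves it to the caller to decide whether to retry, alert someone, or take the node out of DNS.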

I’ve noticed that when a phone isn’t registered with an Asterisk box and tries to make a call, Asterisk requires proxy authentication, and as part of that process, the phone immediately re-registers with Asterisk. Pretty nifty. For incoming calls, I think I’ve seen this work too. Needs to be tested.

Of course, we could use an AGI script with a database to do all our dialing logic. Then you have to make your database redundant, which is extra complexity. We also noticed, using the SIPp test tool, that when all calls were processed by an AGI script, there was a noticeable delay in call processing. We went from being able to handle 120 almost simultaneous INVITE requests to some of the INVITE requests timing out and being resent by SIPp. Then again, is being hit with 120 simultaneous calls likely in the real world (as opposed to a gradual ramp-up)? Also, there’s extra development time required to write the custom dial plan completely in a script. I also read today that the MeetMe application can’t be controlled from an AGI script.
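For reference, the AGI approach boils down to something like the sketch below. Asterisk starts the script for each call, sends a block of `agi_*: value` lines on stdin terminated by a blank line, and then accepts commands on stdout. This is only an illustration, not what we tested: the `ROUTES` dictionary stands in for the database lookup, and the extensions and peer names are invented.

```python
import sys

# Stand-in for a database table mapping extensions to SIP peers (assumed data).
ROUTES = {"100": "SIP/alice", "101": "SIP/bob"}


def parse_agi_env(lines):
    """Parse the 'agi_key: value' header block Asterisk sends at startup."""
    env = {}
    for line in lines:
        line = line.strip()
        if not line:
            break  # a blank line ends the header block
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def build_command(extension, routes):
    """Choose the AGI command to send for the dialed extension."""
    target = routes.get(extension)
    if target is None:
        # 'invalid' is assumed to be an installed sound file.
        return "EXEC Playback invalid"
    return f"EXEC Dial {target}"


def run(stdin=sys.stdin, stdout=sys.stdout):
    """Handle one call: read the headers, send one command, read the reply."""
    env = parse_agi_env(iter(stdin.readline, ""))
    stdout.write(build_command(env.get("agi_extension", ""), ROUTES) + "\n")
    stdout.flush()
    return stdin.readline().strip()  # typically something like "200 result=0"


# When launched by Asterisk, the script would simply call run().
```

Every call pays for starting the interpreter and making the lookup before any dialing happens, which may account for part of the delay we saw under load.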

I’m wondering if Red Hat GFS (Global File System) only works with applications that are GFS-aware, or if you can just stick /etc/asterisk on your global file system and share it between nodes.

Another idea is to whack all your nodes on a Fibre Channel SAN (using RAID-5), but then you’d have to expose the same LUN to all nodes, which is dangerous. You could maybe expose the LUN as read/write to ONE node and read-only to the rest, and this would work for most situations except when someone wanted to change their voicemail PIN (Asterisk writes to voicemail.conf when you do this). Maybe you could stick voicemail on a separate farm. Can anyone tell me what other applications actually write to their config files directly?

Oh boy… so many variables and unknowns.