Failsafe SIP server using Asterisk

I’m VERY new to the world of SIP/VoIP and am researching using Asterisk in a failsafe configuration. I will have all SIP channels on an internal voice system (i.e. no external land lines or other channels, just SIP clients talking internally to each other).

Maybe SIP already does this, but I’d like to have two physical servers that fail over to each other should one die. I’m guessing that Asterisk has no notion of shared state, so if Server A died for some reason, a failover to Server B wouldn’t maintain any call state (i.e. the second server wouldn’t be able to re-establish the calls that were in progress)?

Anyone done anything like this?

My first instinct is to put in two servers, set up Linux-HA, and do an active/standby IP configuration where the second server takes over the first server’s virtual IP should Server A fail. All SIP clients would point to the virtual IP. They would then, of course, have to re-register with the server and re-establish any calls that were active (since they’d drop upon failure).
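To make the active/standby idea concrete, here is a minimal Python sketch of the decision the standby node has to make. It is not how Linux-HA is implemented or configured; the class name, the `dead_after` timeout, and the idea of feeding in timestamps by hand are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StandbyMonitor:
    """Tracks heartbeats from the active node and decides when to take over.

    Hypothetical helper: `dead_after` is the number of seconds of silence
    after which the standby treats the active node as dead.
    """
    dead_after: float
    last_seen: float = 0.0
    holds_vip: bool = False

    def heartbeat(self, now: float) -> None:
        # Record the time of the most recent heartbeat from the active node.
        self.last_seen = now

    def tick(self, now: float) -> bool:
        # Return True exactly once: at the moment the standby should claim
        # the virtual IP. Failback to the original node is not modelled.
        if not self.holds_vip and now - self.last_seen > self.dead_after:
            self.holds_vip = True
            return True
        return False


monitor = StandbyMonitor(dead_after=3.0)
monitor.heartbeat(10.0)
print(monitor.tick(12.0))  # still within the timeout
print(monitor.tick(13.5))  # silence exceeded the timeout, take over
```

Once `tick` returns True the standby would bring up the virtual IP, and from that point the clients’ re-registrations and new calls land on Server B.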

I’d love to talk about this and get some ideas…

Thanks,

Colin Stefani

Cisco CallManager has this capability, but it does not use SIP. Avaya has a feature called “server duplication”, but I do not know much about it. There are some problems to overcome when it comes to a redundant Asterisk setup. If you have the $ and the time, I am sure it could be done.

Ok, I have the resources (dev staff) and the skills around to get something done, probably won’t seek outside help in terms of $.

That said, does anyone know if Asterisk state can be kept externally? Is there any way to hook into Asterisk and, on a close to real-time basis, capture state and push updates to a cache engine?

Further, what about restoring state? Does anyone know if that can be done through the API interfaces at runtime (i.e. start up the server on the secondary node and have it load its state on startup)? Or can the “state” of the system be altered live, meaning could an external program somehow push the state of callers, clients, agents, etc. into Asterisk and maybe even attempt to re-establish calls that were in progress at the time of failure?
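For the capture half of the question, a rough Python sketch of what I have in mind: take manager-style events (blocks of `Key: Value` lines separated by a blank line) and fold them into an external cache of live channels. The function names and the cache layout are hypothetical, and the sample uses only the `Event`, `Channel` and `Uniqueid` headers; treat the exact set of events and headers a given Asterisk version emits as something to verify. It does not address pushing state back into a running Asterisk, which is still the open question.

```python
def parse_events(raw: str) -> list:
    """Split a manager-style text stream into one dict per event block."""
    events = []
    for block in raw.replace("\r\n", "\n").split("\n\n"):
        fields = {}
        for line in block.split("\n"):
            # Split on the first colon only, so values may contain colons.
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if fields:
            events.append(fields)
    return events


def apply_event(cache: dict, event: dict) -> None:
    """Update the external cache (hypothetical layout: unique id -> info)."""
    uid = event.get("Uniqueid")
    if uid is None:
        return
    if event.get("Event") == "Newchannel":
        cache[uid] = {"channel": event.get("Channel")}
    elif event.get("Event") == "Hangup":
        cache.pop(uid, None)


sample = (
    "Event: Newchannel\r\nChannel: SIP/1001-0001\r\nUniqueid: 1.1\r\n\r\n"
    "Event: Newchannel\r\nChannel: SIP/1002-0002\r\nUniqueid: 1.2\r\n\r\n"
    "Event: Hangup\r\nChannel: SIP/1001-0001\r\nUniqueid: 1.1\r\n\r\n"
)
cache = {}
for ev in parse_events(sample):
    apply_event(cache, ev)
print(cache)  # only the channel that has not hung up remains
```

In a real setup the `cache` dict would be replaced by whatever cache engine is shared between the two nodes.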

-colin

Copied from voip-info.org/wiki-Asterisk+at+large

by Michael Shuler
You can do what we did and set up 2 SER servers which are load balanced by a Foundry ServerIron XL (you could use UltraMonkey for free if you prefer). The 2 SER machines handle the REGISTER messages, NAT, and final delivery to the VoIP devices and to the media gateways. The SER machines don’t know what to do with a call; they only know to hand it over to Asterisk for routing/CLASS features or whatever you want the call to do. You then have a separate set of Asterisk boxes that are nothing but media gateways/transcoders with the PRI cards in them. We don’t actually use Asterisk as media gateways in our setup; we use Lucent TNTs, which accomplish the same thing on a much larger scale. As far as sharing the common configs goes, we use svn.asteriskdocs.org/res_data/ to read all of our configs (sip.conf, extensions.conf, etc.) live from a MySQL database. Just apply the patch and go.

So far we have seen no limit to the scalability of this setup. If the 2 SER servers get overloaded, we just add another. If the Asterisk routing servers get overloaded, we just add another. If we run out of PRIs, we add another TNT and more PRIs from our Plexus 9000 switch. So the system in theory has no real-world limits.

Here is the sample configuration for SER.

Excellent, thank you! That’s along the lines of what I’m after, at least as a starting point.

Thanks,

Colin

Colin,

We are using the same configuration in our call centres.

Our design is as follows:

2x OpenSER/MySQL front-end servers
2x Asterisk back-end call processing servers

This is scalable - we can add as many Asterisk servers to the back end as we need.

Heartbeat (using iptakeover), together with MySQL replication in both directions (A to B and B to A), takes care of the front-end failover.

We have customised our OpenSER configuration so that we can ‘pause’ a back-end server (stop sending new INVITEs to it) while OpenSER still continues to loose-route in-dialog requests to the correct servers. This is working brilliantly; we can run full maintenance even during working hours. OpenSER is also configured to load balance evenly across the back-end servers.
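The ‘pause’ behaviour can be sketched in a few lines of Python. This is not our OpenSER configuration, only an illustration of the logic under simplifying assumptions: the `Dispatcher` class, its method names, and the per-call table are hypothetical (real loose routing follows the Route headers in the request rather than a lookup table).

```python
class Dispatcher:
    """Round-robin over back ends, skipping paused ones for new calls."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.paused = set()
        self.dialogs = {}   # call id -> back end that owns the call
        self._next = 0      # round-robin cursor

    def pause(self, backend):
        self.paused.add(backend)

    def resume(self, backend):
        self.paused.discard(backend)

    def route_invite(self, call_id):
        # A re-INVITE for an existing call stays on its original back end.
        if call_id in self.dialogs:
            return self.dialogs[call_id]
        # New calls go to the next back end that is not paused.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next % len(self.backends)]
            self._next += 1
            if backend not in self.paused:
                self.dialogs[call_id] = backend
                return backend
        raise RuntimeError("no active back end available")

    def route_in_dialog(self, call_id):
        # Requests inside an established call ignore the paused flag.
        return self.dialogs[call_id]


d = Dispatcher(["ast1", "ast2"])
d.route_invite("call-a")            # lands on ast1
d.pause("ast1")                     # maintenance starts on ast1
print(d.route_invite("call-b"))     # new call avoids ast1
print(d.route_in_dialog("call-a"))  # existing call still reaches ast1
```

The point of the design is that pausing only affects where new calls start, so a server drains naturally as its existing calls end.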

Failover of the front-end servers isn’t so straightforward though, as the Asterisk servers die on signal 11 when they lose connection to the MySQL database. It would be great if they remembered state, so that when they restarted the calls began flowing again. The end users would only have a second of dead air - this wouldn’t be perfect but would be acceptable.

This is taken from http://www.voip-info.org/wiki/view/Asterisk+hardware

IBM NEBS compliant Blade Server for Telco applications.

* Have successfully run multiple Asterisks with HA failover (without dropping connected SIP calls).
* .......

We would like to improve our HA configuration. If Asterisk had stateful recovery that would be a great start.

Regards,
Andrew