Every once in a while, I get calls from clients regarding one of their phone services being down. Even if a phone system has been fine for months, a SIP device will mysteriously stop responding or a Zap channel won’t pick up incoming calls anymore. A restart or power cycle usually fixes it.
I’d like to have some sort of health monitoring in place that I can run from cron once an hour during business hours to trigger red alarms in my own office. This way, even if I can’t always determine the exact cause, I can quickly bring whatever service is down back up before anyone at the site knows there was an issue.
If you’ve done anything like this, give me a hint or two as to how you went about it. Did you use the manager API? Is cron the only way?
the best thing i can think of would be to set up an extension that returns a valid response to some input, so that you could call it from your remote location to monitor the system…
something like this: (???)
exten => 99871,1,Answer()
exten => 99871,n,Read(TEST,,4)
; your system sends 3456 at this point
; GotoIf takes condition?label; quoting both sides keeps an empty TEST from breaking the expression
exten => 99871,n,GotoIf($["${TEST}" = "3456"]?ok)
exten => 99871,n,Hangup()
; your system would be waiting for audio on the line; if none is played, the system is down.
exten => 99871,n(ok),Playback(success)
exten => 99871,n,Hangup()
this is just a rough idea i had when we were having DTMF issues…don’t know if this would help or not, but it’s there…
EDIT: and as far as monitoring asterisk, there are quite a few utilities out there that would work. not sure on the telco lines, but THAT would be VERY useful for me, so if anyone does have a resource for that, please let me know.
how about running zap show status from the CLI and parsing the output to find alarms?
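A rough sketch of that parsing idea in Python, in case it helps. It assumes the output has a header row containing the word "Alarms" and that each span's alarm state (OK, RED, and so on) starts in the same column as that header word; check this against what your own Asterisk version prints. The function names and the --check flag are made up for this example. The only real command involved is asterisk -rx "zap show status", which runs one CLI command against the local Asterisk and exits.

```python
import subprocess
import sys


def find_alarms(status_text):
    """Return (description, alarm) pairs for every span whose alarm state is not OK.

    Assumption: a header row contains "Alarms", and each data row carries
    its alarm state starting at the same column as that header word.
    """
    lines = [ln for ln in status_text.splitlines() if ln.strip()]
    header_idx = next((i for i, ln in enumerate(lines) if "Alarms" in ln), None)
    if header_idx is None:
        # Also hit when Asterisk is not running and prints an error instead.
        raise ValueError("no 'Alarms' header found in status output")
    col = lines[header_idx].index("Alarms")

    problems = []
    for ln in lines[header_idx + 1:]:
        description = ln[:col].strip()
        fields = ln[col:].split()
        alarm = fields[0] if fields else "UNKNOWN"
        if alarm != "OK":
            problems.append((description, alarm))
    return problems


def check_local_asterisk():
    """Run the CLI command on the local box and return the spans in alarm."""
    result = subprocess.run(
        ["asterisk", "-rx", "zap show status"],
        capture_output=True, text=True, timeout=15,
    )
    return find_alarms(result.stdout)


# Only touch the real system when asked to, e.g. "python zapcheck.py --check"
# (the file name and flag are hypothetical).
if __name__ == "__main__" and sys.argv[1:2] == ["--check"]:
    found = check_local_asterisk()
    for desc, alarm in found:
        print(f"ALARM {alarm}: {desc}")
    sys.exit(1 if found else 0)
```

Run from cron, a schedule such as 0 8-17 * * 1-5 would fire at the top of each hour from 8:00 to 17:00, Monday to Friday. A non-zero exit code or any printed ALARM line is what your alerting would key on.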
that’s a good idea - i already have a manager socket application built, i will have to play with that…
I’ll start on something using the Manager API. We’ll see if it’s wiki-worthy…
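For anyone following along, here is a minimal sketch of what a Manager API health check could look like in Python. It is not the application mentioned above, just an illustration: the helper names are invented, port 5038 is the usual manager default but depends on your manager.conf, and the exact Ping reply differs between Asterisk versions (some answer Response: Pong, others Response: Success with a Ping: Pong header), so both are accepted here.

```python
import socket


def build_action(action, **headers):
    """Serialize one manager action: 'Key: value' lines, CRLF line ends, blank line last."""
    lines = [f"Action: {action}"] + [f"{k}: {v}" for k, v in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_message(raw):
    """Turn one raw manager message into a dict (last value wins on repeated keys)."""
    msg = {}
    for line in raw.decode("ascii", "replace").splitlines():
        key, sep, value = line.partition(":")
        if sep:
            msg[key.strip()] = value.strip()
    return msg


def read_message(stream):
    """Read lines from a binary file-like object up to the blank line ending a message."""
    lines = []
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("manager closed the connection")
        if line in (b"\r\n", b"\n"):
            if lines:
                return b"".join(lines)
            continue  # skip stray blank lines between messages
        lines.append(line)


def manager_ping(host, username, secret, port=5038, timeout=10):
    """Log in, send Ping, log off. True only if login worked and a pong came back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rwb")
        stream.readline()  # one-line greeting banner sent on connect
        stream.write(build_action("Login", Username=username,
                                  Secret=secret, Events="off"))
        stream.flush()
        if parse_message(read_message(stream)).get("Response") != "Success":
            return False
        stream.write(build_action("Ping"))
        stream.flush()
        reply = parse_message(read_message(stream))
        stream.write(build_action("Logoff"))
        stream.flush()
        return reply.get("Response") == "Pong" or reply.get("Ping") == "Pong"
```

A connection error or timeout from manager_ping is itself a useful signal that the box or the Asterisk process is down, so the caller should treat an exception the same as a False result. Events are switched off at login so the reply to each action is the next message on the socket.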