How to make a device consider a call not answered?

Hi all,

I have the following problem (simplified a lot): I have Asterisk 13 running successfully, one end connected to a VoIP / SIP provider, the other end connected to three devices, namely two hardware SIP phones and one server-based SIP answering machine.

Inbound calls are signaled simultaneously to the phones and the answering machine. If nobody answers the call at one of the phones within 30 seconds, the answering machine answers the call. This works very well with the following exception:

There is no way to make the answering machine signal actively that it has got a call; I really can’t do anything about that (as said above, the answering machine is server-based, and you have to open the respective client software on a client PC to see its status).

The problem now is that the two hardware SIP phones do NOT signal calls which have been answered by the answering machine in any way, so I don’t know about such calls unless I open the answering machine’s client software. This is not what I want. I would like to make the hardware SIP phones ignore the information that a call has been answered by the answering machine and instead treat that call as if it had not been answered elsewhere at all. The hardware SIP phones then would signal such a call by blinking LEDs, and the caller’s number would be in the list of unanswered calls so that I could call back comfortably.

I tried to find a configuration option in the hardware SIP phones which would enable such behavior, but without success. Then I was told that the PBX is responsible for telling the phones which calls have been answered elsewhere.

So my question actually is: How could I modify my dialplan so that both hardware SIP phones signal an unanswered call if that call has been answered by the answering machine, but none of the hardware SIP phones signals an unanswered call if the call has been answered at the other one of the hardware SIP phones?

The relevant bit of my dialplan currently looks something like this:

exten => 08203xxxxx,   1, Dial(PJSIP/CuiTNitzpqXxzOYt&PJSIP/1rpiQqobew8CkaC7&PJSIP/20@bCo9m7OfHWK2Y2sb)
  same =>              n, Hangup()

08203xxxxx is one of my phone numbers; if an inbound call for that number comes in, PJSIP dials CuiTNitzpqXxzOYt, 1rpiQqobew8CkaC7 and 20@bCo9m7OfHWK2Y2sb simultaneously (please don’t wonder about those extensions … these are just hardware SIP phone 1, hardware SIP phone 2 and the SIP answering machine).

Any hints?

Regards,

Binarus

Can you alter the number of rings or timeout on the answering machine? If so, set it to 0; then you can just dial the 2 SIP phones with a timeout of 30 seconds. If Dial falls through, you can then Dial the answering machine and have it pick up immediately. You can even check the DIALSTATUS variable after the first Dial to see if you really want to send to voicemail.
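A minimal sketch of that two-stage approach, reusing the endpoint names from the original post. The 30-second timeout comes from the description above; the choice of which DIALSTATUS values should fall through to the answering machine (NOANSWER and CHANUNAVAIL here) is only an assumption about what you would want:

```
; Stage 1: ring only the two hardware phones, for at most 30 seconds
exten => 08203xxxxx,1,Dial(PJSIP/CuiTNitzpqXxzOYt&PJSIP/1rpiQqobew8CkaC7,30)
; Dial fell through: decide from DIALSTATUS whether to involve the answering machine
 same => n,GotoIf($["${DIALSTATUS}" = "NOANSWER"]?machine)
 same => n,GotoIf($["${DIALSTATUS}" = "CHANUNAVAIL"]?machine)
 same => n,Hangup()
; Stage 2: hand the call to the answering machine, which should pick up at once
 same => n(machine),Dial(PJSIP/20@bCo9m7OfHWK2Y2sb)
 same => n,Hangup()
```

Because the answering machine is never dialed in parallel with the phones, the phones simply see a call that timed out and should list it as missed.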

Thank you very much! This is a really good idea. I will implement it that way - problem solved! I will report whether it works. Coming from an ISDN setup, I obviously have tried to transfer that setup into the VoIP setup 1:1 without looking to the left or to the right …

Having said this: I am still curious. From what I have understood, the hardware SIP phones won’t signal an unanswered call if they get the information from the PBX that the call has been answered elsewhere. So the straightforward solution to the problem would be to prevent Asterisk from sending the “call has been answered elsewhere” message.

I have read the documentation of the Dial() application, but have not fully understood it yet. There seem to be some sorts of hooks for various stages of a call. Could we use that to influence the messages which Asterisk sends to the hardware SIP phones when the answering machine gets the call?

Thanks again,

Binarus

Unfortunately, it has turned out that I can’t implement that solution. The reason is that the answering machine executes some rather complex scripts when a call comes in, and the result of these scripts influences the waiting time until the answering machine actually accepts the call.

A simple example: When a call gets to the answering machine, one of the scripts determines whether it is working time (dependent on various variables like day of week, holiday calendar, time of day and so on). If it is working time, the answering machine waits 30 seconds before actually accepting the call so that one of the phones can be picked up. If it is not working time, the answering machine accepts the call immediately so that the other phones don’t even ring a single time.

Thus, I can’t implement a solution where the answering machine must accept every call immediately.

Theoretically, I could try to rebuild the scripts of the answering machine in the Asterisk dial plan and thus move them from the answering machine into Asterisk. But that would be a lot of work; furthermore, I doubt if it would be possible at all.

Given that, I would still prefer a solution which prevents Asterisk from telling the SIP phones “the call has been answered elsewhere”.

Any ideas?

Thank you very much,

Binarus

Ah well.

2 options then…

Asterisk can execute contexts based on time of day if you want to migrate functionality. Look at the “include” statement in extensions.conf.
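A rough sketch of a time-based include, assuming working hours of 08:00-17:00 on weekdays. The context names and hours are made up for illustration, and holidays are not covered:

```
[inbound]
; include => context,times,weekdays,days-of-month,months
; Used only while the time specification matches
include => working-hours,08:00-17:00,mon-fri,*,*
; Searched next, so it catches everything outside working hours
include => after-hours

[working-hours]
; Ring the phones first, then fall through to the answering machine
exten => 08203xxxxx,1,Dial(PJSIP/phone1&PJSIP/phone2,30)
 same => n,Dial(PJSIP/voicemail)
 same => n,Hangup()

[after-hours]
; Outside working hours the phones do not ring at all
exten => 08203xxxxx,1,Dial(PJSIP/voicemail)
 same => n,Hangup()
```

Includes are searched in the order listed, so the untimed after-hours context only matches when the timed one does not apply.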

You can use a hangup handler to alter the hangup cause:

[hanguphandler]
exten = s,1,verbose(In hangup handler)
;  You'll probably want to test the existing cause here and only alter it if it's 
;  answered elsewhere.
exten = s,n,Hangup(16) ; set the cause code to any in include/asterisk/causes.h

[predial]
; You can make a decision here whether to add the hangup handler or not
;  Probably yes for the phone channels, no for the voicemail
exten = s,1,Set(CHANNEL(hangup_handler_push)=hanguphandler,s,1)
exten = s,n,Return(0)

[default]
; Add a pre-dial handler to each channel
exten = 1111,1,Dial(PJSIP/phone1&PJSIP/phone2&PJSIP/voicemail,,b(predial^s^1))

Altering the cause works but it does cause an “Abnormal Gosub exit” NOTICE message because Hangup doesn’t return but the handlers expect a Return(0). Altering the cause in a hangup handler is a gray area though. It works now but the behavior is “undefined” meaning it may be an accident that it works. :-)
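The test mentioned in the handler comment could look roughly like this. It is an untested sketch that assumes ${HANGUPCAUSE} already holds the "answered elsewhere" code (26 in causes.h) when the handler runs on the called channel; the "alter" label is just a name picked for illustration:

```
[hanguphandler]
exten = s,1,Verbose(2,In hangup handler, cause is ${HANGUPCAUSE})
; 26 = AST_CAUSE_ANSWERED_ELSEWHERE; leave every other cause untouched
exten = s,n,GotoIf($["${HANGUPCAUSE}" = "26"]?alter)
exten = s,n,Return()
; 16 = normal clearing; same caveat as above, Hangup() does not return
exten = s,n(alter),Hangup(16)
```

The same gray-area warning applies: this still changes the cause from inside a hangup handler, so it may stop working in a later version.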

Thank you very much again. This is a very interesting insight into Asterisk. I did not test that solution yet because you have stated that it might be an accident that it works, so I am afraid that it won’t work in future Asterisk versions. But I will get back to it if there is no other idea. At least, I think I have understood how it works.

In the meantime, I have tried to improve your solution, but without success. First, I tried to understand the difference between the Hangup() and the SoftHangup() applications. The former seems to be meant for hanging up the calling channel, while the latter is for hanging up one of the called channels.

Then, supported by your code, I used a prebridge handler (a gosub) in which I did a SoftHangup() for the two phones. The idea was: The prebridge handler (according to the documentation) gets executed immediately after the answering machine has answered the call. This would be exactly the right moment to hang up the phones, possibly making the phones signal an unanswered call afterwards. But unfortunately, the prebridge handler seems to execute AFTER Asterisk has already sent the “call answered elsewhere” message to the phones.

So I am still interested in solving this problem in a clean way. Could you please tell me what you think about the following ideas?

  1. I have seen that there is a HANGUPCAUSE function which currently is readonly. Could you imagine making this read-write? We could then use that in the hangup handler shown in your code instead of actually hanging up and thus avoid at least the problems which arise from Hangup() not returning. It would still be a gray area, though …

  2. Alternatively, could you imagine creating a new hook like prebridge, possibly called afteranswer, which gets executed immediately after a call is answered, but before any other message is sent to the other channels, where we could set the hangup cause which will be sent to the other devices?

What do you think about these suggestions?

Thank you very much again,

Binarus

I’ve got a better idea…

How about an option to Dial() that sets the hangup cause on unanswered channels?

exten = 1111,1,Dial(PJSIP/phone1&PJSIP/phone2&PJSIP/voicemail,,Q(NO_ANSWER))

If any of the extensions answer, NO_ANSWER will be sent to the others instead of ANSWERED_ELSEWHERE. You can use any cause listed in causes.h or NONE to set no code.

I just posted a patch for that if you want to try it…
https://gerrit.asterisk.org/4033

THANK YOU very much! This is absolutely great. I definitely will dedicate the day tomorrow to find out how to get / apply the patch (never used the Asterisk VCS before …) and to test it. I will report if it worked (although I am already convinced it will). This provides an absolutely clean solution to the problem.

By the way, in the meantime, after thinking thoroughly about my situation, I have decided that none of my devices needs the information that the call has been answered elsewhere. Therefore, I have taken apps/app_dial.c and have modified hanguptree(…) so that it always returns code 16 instead of AST_CAUSE_ANSWERED_ELSEWHERE. Thanks to the clear structure and the comments in your code, I didn’t have any difficulties in doing so, and it worked like a charm. I did not see any negative side effects yet.

Of course, your patch is by far better …

One problem remains with your patch as well as with my quick hack, though: If the call is answered at one of the phones (i.e. not at the answering machine), the other phone will signal an unanswered call.

The ideal situation would be: If the answering machine answers the call, both of the phones (or, even better, only one of them) should signal an unanswered call, but if the call is answered at one of the phones, none of the devices (especially not the other phone) should signal an unanswered call.

I think this is impossible to implement currently, because for that you would need to give different Q options to every channel being dialed, and this currently isn’t possible with the dial application.

Having said this, I’d like to stress that this is not really a problem in any way for me. All companies I know can live well with phones signaling an unanswered call even if the call in fact has been answered at another phone. But most of them couldn’t live with the problems with the answering machine (many of them use the same software as I do), and these are solved now thanks to your patch.

So I am very glad that you took the time and effort, and I am looking forward to tomorrow. I am eager to test …

Thank you very much,

Binarus

I confirm that your patch works (no clue how to confirm this directly on gerrit …), and I am still excited about it :-).

Will the patch make it in the next release of branch 13?

Thank you very much again,

Binarus

I’m glad it’s working! Yes, it should make it into 13.12.

An additional report which may or may not be an indication that something is wrong with the patch:

After having installed Asterisk 13.11.2 with the patch applied, all phones are losing their registrations every two days or so. According to the phones’ logs, Asterisk suddenly stops answering re-registration attempts; when a re-registration attempt comes in, it just doesn’t answer, and the phones go into timeout (after having tried the re-registration several dozen times).

Well, what’s the relationship to the patch? Of course, the first thing I did was scanning the logs. Unfortunately, Asterisk had not logged anything suspicious, with the following exception:

[Oct 12 13:46:37] WARNING[27217][C-00000002] app_dial.c: Unable to write frametype: 2
[Oct 12 13:48:00] WARNING[27254][C-00000005] app_dial.c: Invalid cause given to Dial(...Q(<cause>)): ""
[Oct 12 13:48:44] WARNING[27276][C-00000006] app_dial.c: Invalid cause given to Dial(...Q(<cause>)): "HcE"
[Oct 12 13:50:37] WARNING[27311][C-00000007] app_dial.c: Invalid cause given to Dial(...Q(<cause>)): "HcE"
[Oct 12 15:27:02] WARNING[29488][C-00000009] app_dial.c: Invalid cause given to Dial(...Q(<cause>)): "HcE"
[Oct 13 10:58:08] WARNING[25856][C-00000011] app_dial.c: Invalid cause given to Dial(...Q(<cause>)): "HcE"
[Oct 13 15:25:09] WARNING[32311][C-00000019] app_dial.c: Invalid cause given to Dial(...Q(<cause>)): ""

There are more messages like these. I have no clue what they are trying to tell me, and I have no clue if there is any relationship with the re-registration attempt timeout. But it is the only hint I could find so far, so I hope somebody will shed some light onto the situation …

Further hints:

  • As George proposed, my dial strings end with ,,Q(NO_ANSWER). This shouldn’t be an “invalid cause”, should it?

  • Before installing 13.11.2 with the patch, I had used 13.7.2 without the patch, which had run stably and reliably, notably without any problems with the re-registration.

  • I did not change any bit of pjsip.conf, extensions.conf or another configuration file when upgrading from 13.7.2 (without patch) to 13.11.2 (with patch), with the following exception: I have appended the “,,Q(...)” portion to the dial strings which are responsible for dialing my phones when an inbound call comes in from one of the trunks (SIP / VoIP providers).

  • I did not change any bit of the phones’ configuration when upgrading Asterisk.

  • If re-registration fails as described above (with 13.11.2 with patch), I have to kill and restart Asterisk to make it work again, i.e. after the timeout has happened the first time with a certain phone, Asterisk won’t answer any further re-registration attempts for all other phones as well. The only “solution” I have found is killing and restarting Asterisk.

I apologize for first confirming that the patch was working as intended and then reporting a possible problem some days later, but the patched Asterisk 13.11.2 initially ran for more than two days without any problem. When the re-registration problems showed up the first time, I thought it would be a network problem and put some time into respective analysis. That’s the reason why it took some days to report.

In the meantime, it has turned out that undoubtedly the Asterisk upgrade in combination with the patch is causing the misbehavior. Eventually, I will apply the patch to 13.7.2 and test if that runs reliably (as 13.7.2 without the patch does). If it does, the patch can’t be the reason for the problem (some other change from 13.7.2 to 13.11.2 must be the reason then).

Thank you very much,

Binarus

The patch was updated after tests uncovered a potential issue where it could access uninitialized memory. Are you using that version of the patch? Also, it has now been included in the source code.

Thank you very much. As far as I can remember, I downloaded patch set 3 for testing. Probably the bug has been corrected some days later.

I am currently testing 13.11.2 without the patch applied (I think that test should run at least for three days). If the test is successful, I will either download the newest version of the patch and apply it to 13.11.2 or download 13.12.x with the patch included if it is already out, and test again.

Regards,

Binarus

Yeah, sorry about that. There was a bug that I didn’t catch. As @jcolp said, grab the latest version of the 4033 review or grab the current 13 branch from git.

Having said that, I don’t think the registration issue is related to the patch. If it continues, let’s start another thread to figure that one out.

George, never mind. I’m a developer myself and thus can very well understand that not every subtle bug will be caught at once.

Regarding the registration issues: I have now been running 13.11.2 without the patch applied (instead, I have applied my own mini-patch as described above) for more than 3 days, and I did not see any issue during that time. So I believe that the early version of the patch I used was responsible for my problems. Issues with uninitialized memory could lead to errors at places you would never have thought about, couldn’t they?

My next step: I will wait until official 13.12.x is out. According to your statement, this will include the right version of the patch. I will test 13.12.x with the patch being activated (by using ,,Q(...) in many places in my extensions.conf) at least for 4 days and then report the results.

If the issues with the re-registration reappear, I will open another thread as you have proposed.

Thank you very much,

Binarus