During call: Originate -> Local -> Dial -> BridgeAdd == 200% CPU load (forced channel swap breaking channel/call?)

I’m trying to build a simple dynamic/ad-hoc three-party call feature into my growing Asterisk dialplan. To get started I added the following to features.conf:

initbridge  => #2,peer,Originate(Local/add@multi,app,dial,PJSIP/pj3)

and in extensions.lua I have:

multi = {
      ["add"] = function(c, e)
         peerch = channel.GLOBAL("peerch"):get()

where “peerch” is one of the channels already bridged in the call. It gets set earlier, during initial call setup. Before inviting a third party, CPU load is around 40% (this is with direct_media=no, jitterbuffer, autogain, denoising and, for fun, toggleable pitch-shifting on the rx side of both channels, on a cheap quad-core SBC). As soon as the third channel is bridged into the call, CPU usage jumps to 200%.

Here’s part of the log:

    -- Executing [add@multi:1] log("Local/add@multi-00000026;2", "NOTICE,peerch: PJSIP/pj-0000008a")
[Nov  1 02:57:43] NOTICE[31065][C-0000003e]: Ext. add:1 @ multi: peerch: PJSIP/pj-0000008a
    -- Executing [add@multi:1] bridgeadd("Local/add@multi-00000026;2", "PJSIP/pj-0000008a")
    -- Local/add@multi-00000026;1 answered
    -- Called PJSIP/pj3
    -- PJSIP/pj3-0000008c is ringing
    -- Channel Local/add@multi-00000026;2 joined 'simple_bridge' basic-bridge <7a993de6-23d5-43db-9679-8b2a73d370c6>
[Nov  1 02:57:44] WARNING[31060][C-0000003e]: dsp.c:1456 ast_dsp_silence_noise_with_energy: Can only calculate silence on signed-linear, alaw or ulaw frames :(
    -- PJSIP/pj3-0000008c answered Local/add@multi-00000026;1
    -- Channel PJSIP/pj3-0000008c joined 'simple_bridge' basic-bridge <39675d89-d5f3-437a-aa77-9dcc4d55a1ee>
    -- Channel Local/add@multi-00000026;1 joined 'simple_bridge' basic-bridge <39675d89-d5f3-437a-aa77-9dcc4d55a1ee>
    -- Channel PJSIP/pj3-0000008c left 'simple_bridge' basic-bridge <39675d89-d5f3-437a-aa77-9dcc4d55a1ee>
    -- Channel Local/add@multi-00000026;2 left 'softmix' basic-bridge <7a993de6-23d5-43db-9679-8b2a73d370c6>
    -- Channel PJSIP/pj3-0000008c swapped with Local/add@multi-00000026;2 into 'softmix' basic-bridge <7a993de6-23d5-43db-9679-8b2a73d370c6>
    -- Channel Local/add@multi-00000026;1 left 'simple_bridge' basic-bridge <39675d89-d5f3-437a-aa77-9dcc4d55a1ee>
    -- Added contact 'sip:pj3@' to AOR 'pj3' with expiration of 3600 seconds
    -- Channel PJSIP/pj2-0000008b left 'softmix' basic-bridge <7a993de6-23d5-43db-9679-8b2a73d370c6>

These two lines might be especially juicy:

Channel PJSIP/pj3-0000008c swapped with Local/add@multi-00000026
Added contact 'sip:pj3@' to AOR 'pj3' with expiration of 3600 seconds

I’m not 100% sure I understand it correctly, but I have a hunch that the actual channel behind the virtual Local/add@multi channel gets ripped in half when it is dropped from the bridge, and since it’s in an active call, that somehow leaves it or the bridge in an inconsistent state. core show channels also says there are 2 active calls with 3 channels. It still works fine, but at 3-4 times the CPU usage it should have. If I change the entry in features.conf to this:

initbridge2  => #2,peer,Originate("PJSIP/pj3,exten,conf,add,1")

there is no Local channel middleman and only a single active call throughout. CPU usage is at ~35%, so it’s actually LOWER with a 3-channel softmix bridge than with a 2-channel simple bridge. Maybe the softmix bridge should also be used when only two channels are bridged?

Anyway, the problem is that this also means no ringing is heard during call setup, and there is no indication of whether the call was rejected, timed out, or what else happened. Perhaps someone more experienced with Asterisk has some ideas on how to route the ringing back to the bridge (all bridged channels should hear it). Maybe there’s even some way to make my original approach work, since that’s the most natural way to express it, at least to me. I’m not sure if that behaviour is expected or a bug, though.

I think I figured it out; the problem seems to have been caused by a mix of things. Currently I only have two phones, so I can’t 100% replicate the setup from yesterday, but from my testing I could see that in the high-load situation it’s not so much the channel getting ripped apart as the call. The current call setup is like this:

  • Call 1: pj (local) → o2_sip (remote)
  • Call X: (o2_sip (remote) → sipgate (remote); in some far away place outside asterisk)
  • Call 2: sipgate (remote) → pj(sipgate) (local)

I’m calling a regular phone number from my sipgate account here, through my SIP account from o2, so there are two calls just for pj → pj(sipgate) (local endpoints, the same phone actually, but that doesn’t matter; the call is established). Now when I invoke the Originate feature from endpoint/channel pj (not pj(sipgate)), what happens is that the actual sipgate channel is stored in peerch and used for the BridgeAdd call. Due to how I set it up originally, the variable gets overwritten when sipgate calls into Asterisk to reach pj(sipgate).

So the Originate, Dial and BridgeAdd all happen through the pj channel in call 1 (or rather the local channel created through originate), but the pj2 channel ends up getting bridged into call 2. As soon as I change the line in features.conf to this:

initbridge  => #2,self,Originate(Local/add@multi,app,dial,PJSIP/pj2,,,aB(multi^setch^1))

(also peer -> self, but that really doesn't seem to matter)

and add this to extensions.lua:

      ["setch"] = function(c, e)
         ch = channel.CHANNEL("name"):get()

it starts working properly without the high CPU usage. I guess I got bitten by my use of global variables while prototyping this. I’m not 100% sure how this maps to yesterday’s case with only 3 local endpoints, but the result was the same, so I’m sure it should work now, too. What I noticed was that when I initiate a call to pj2 directly, without the Local channel stuff, the call count remains at 3, even after one Local channel leaves. With the Local channel voodoo there are only 2 active calls, in both the working and the broken configuration, although the way the channels are bridged is the same in both cases. CPU load is still a bit higher than with the direct call version, but not by much (10-20%), which is weird, but fine.
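To picture the overwrite, here is a minimal plain-Lua simulation that runs outside Asterisk. It is only a sketch: the tables, function names and channel strings (`chan-call1`, `chan-call2`) are made up for illustration, and the per-call table in the second half is not what my dialplan does. It just stands in for "remember the channel that belongs to the call invoking the feature".

```lua
-- Plain Lua, no Asterisk needed. Every name below is hypothetical.

-- Stand-in for the dialplan-wide global store (GLOBAL(peerch) in my setup).
local globals = {}

-- Broken variant: every call setup writes into the same global slot.
local function setup_call_global(channel_name)
   globals.peerch = channel_name
end

-- The "add" extension bridges into whatever the global holds right now.
local function bridge_target_global()
   return globals.peerch
end

-- Scoped variant: remember one channel per call instead.
local per_call = {}

local function setup_call_scoped(call_id, channel_name)
   per_call[call_id] = channel_name
end

local function bridge_target_scoped(call_id)
   return per_call[call_id]
end

-- Call 1 is set up first, then call 2 comes in and overwrites the global.
setup_call_global("chan-call1")
setup_call_global("chan-call2")
setup_call_scoped("call1", "chan-call1")
setup_call_scoped("call2", "chan-call2")

-- The feature is invoked from call 1:
print(bridge_target_global())         -- prints chan-call2 (wrong call)
print(bridge_target_scoped("call1"))  -- prints chan-call1 (intended call)
```

With the single global, the third party ends up bridged into whichever call was set up last, which matches what I saw with pj2 landing in call 2.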

I’m not sure if this is even supposed to work (bridging into another call that way with Originate + Local channel + BridgeAdd), but it does smell like a bug to me. I’m on Asterisk 18.5.1 from OpenWrt, albeit with some frankensteined .so’s I cross-compiled for performance (mostly pitchshift; it happens with and without them loaded). If it sounds like a bug I might dig a bit more and provide a proper bug report with a minimal example to replicate the issue, but otherwise I’ll leave it at that because I got it to work now.
