ROSE invoke operation not handled! ROSE_Unknown?


#1

Hello folks, I have a client with a perplexing situation on their hands. They have a CentOS 4.9 based server running the following software (installed via RPM):

asterisk-1.4.42-D2.C4.SC
asterisk-addons-1.4.13-1.C4.LSE
asterisk-devel-1.4.42-D2.C4.SC
asterisk-extra-sounds-en-gsm-1.4.11-1.C4.LSE
asterisk-libpri-1.4.11.5.1-1.C4.LSE
asterisk-perl-1.01-1.C4.LSE
dahdi-linux-2.6.9-100.EL.plus.c4.1.LSEsmp-2.4.1.2-1.C4.LSE
dahdi-tools-2.4.1-1.C4.LSE
iaxmodem-1.2.0-2.C4.SC
iaxmodem-libiax2-1.2.0-2.C4.SC
iaxmodem-spandsp-1.2.0-2.C4.SC
wanpipe-modules-2.6.9-100.EL.plus.c4.1.LSEsmp-3.5.20-0
wanpipe-util-3.5.20-0
kernel-smp-2.6.9-100.EL.plus.c4.1.LSE

They are connected via a single PRI to a CenturyLink pipe and have been up and running for a year and a half now. Over the past two months they began experiencing dropped calls both in and out bound with ISDN cause codes ranging across the board - 16,17,21,28,31,34,44,127,etc. The only change on the server is minor version increments in libpri and asterisk. The telco has tested the line, replaced the smart jack we have talked to facility engineers for hours trying to figure out the problem. Telco says there is nothing wrong with the circuit. It is a standard build out. Their switch is a DMS100. The line is build out as B8ZS/ESF. The switchtype is National2 (they had us change to dms100 for debugging - no help). The line hunt is set to ascending sequential on their side, we are descending. They are master we are slave (pri_cpe). The only special change in chan_dahdi.conf is pridialplan=unknown and prilocaldialplan=local which were necessary for long distance to work.

The symptoms have gotten worse in the last week to the point that we can only get 10-12 calls going at the same time. Strangely, if we reset everything we can fill the PRI with calls for an hour or two before we begin getting the drops. Calling users get a telco message “you have reached a number that is not in service” or “all circuits are busy now”. Outbound callers get The Voice, “all circuits are busy now…”

No one I have spoken with has a clue as to what’s going on. The only message I have seen in the debug log has to do with Q.SIG / ROSE invoke not handled by National ISDN (as reported by libpri 1.4.12). I contacted Sangoma regarding the error and they suggested we revert from libpri 1.4.12 to 1.4.11.5 with a patch that is already applied to 1.4.12 which I did. It had no positive effect.

Right now, there are 9 calls on the circuit. The 10th call fails with all circuits busy and here is the PRI debug log:

==========================================
< Protocol Discriminator: Q.931 (8) len=69
< TEI=0 Call Ref: len= 2 (reference 3914/0xF4A) (Sent from originator)
< Message Type: SETUP (5)
< [04 03 80 90 a2]
< Bearer Capability (len= 5) [ Ext: 1 Q.931 Std: 0 Info transfer capability: Speech (0)
< Ext: 1 Trans mode/rate: 64kbps, circuit-mode (16)
< User information layer 1: u-Law (34)
< [18 03 a9 83 85]
< Channel ID (len= 5) [ Ext: 1 IntID: Implicit Other(PRI) Spare: 0 Exclusive Dchan: 0
< ChanSel: As indicated in following octets
< Ext: 1 Coding: 0 Number Specified Channel Type: 3
< Ext: 1 Channel: 5 Type: CPE]
< [1c 15 9f 8b 01 00 a1 0f 02 01 52 06 07 2a 86 48 ce 15 00 04 0a 01 00]
< Facility (len=23, codeset=0) [ 0x9F, 0x8B, 0x01, 0x00, 0xA1, 0x0F, 0x02, 0x01, ‘R’, 0x06, 0x07, '’, 0x86, ‘H’, 0xCE, 0x15, 0x00, 0x04, 0x0A, 0x01, 0x00 ]
< [1e 02 82 83]
< Progress Indicator (len= 4) [ Ext: 1 Coding: CCITT (ITU) standard (0) 0: 0 Location: Public network serving the local user (2)
< Ext: 1 Progress Description: Calling equipment is non-ISDN. (3) ]
< [6c 0c 21 83 38 35 30 32 32 34 35 37 33 37]
< Calling Number (len=14) [ Ext: 0 TON: National Number (2) NPI: ISDN/Telephony Numbering Plan (E.164/E.163) (1)
< Presentation: Presentation allowed of network provided number (3) ‘NXXXXX5737’ ]
< [70 0b a1 38 35 30 35 35 38 31 32 35 35]
< Called Number (len=13) [ Ext: 1 TON: National Number (2) NPI: ISDN/Telephony Numbering Plan (E.164/E.163) (1) ‘NXXXXX1255’ ]
– Making new call for cref 3914
Received message for call 0x85f58a0 on 0xb7845f68 TEI/SAPI 0/0, call->pri is 0xb7845f68 TEI/SAPI 0/0
– Processing Q.931 Call Setup
– Processing IE 4 (cs0, Bearer Capability)
– Processing IE 24 (cs0, Channel Identification)
– Processing IE 28 (cs0, Facility)
– Processing IE 30 (cs0, Progress Indicator)
– Processing IE 108 (cs0, Calling Party Number)
– Processing IE 112 (cs0, Called Party Number)
– Delayed processing IE 28 (cs0, Facility)
ASN.1 dump
Context Specific [11 0x0B] <8B> Len:1 <01>
<00> - "~"
Context Specific/C [1 0x01] Len:15 <0F>
Integer(2 0x02) <02> Len:1 <01>
<52> - "R"
OID(6 0x06) <06> Len:7 <07>
<2A 86 48 CE 15 00 04> - "
~H~~~~"
Enumerated(10 0x0A) <0A> Len:1 <01>
<00> - "~"
ASN.1 end
interpretation Context Specific [11 0x0B] = 0 0x0000
INVOKE Component Context Specific/C [1 0x01]
invokeId Integer(2 0x02) = 82 0x0052
operationValue OID(6 0x06) = 42.840.10005.0.4
operationValue = ROSE_Unknown
Skipping unused constructed component octets!
!! ROSE invoke operation not handled! ROSE_Unknown
q931.c:6966 post_handle_q931_message: Call 3914 enters state 6 (Call Present). Hold state: Idle
q931_hangup: other hangup
NEW_HANGUP DEBUG: Calling q931_hangup, ourstate Call Present, peerstate Call Initiated, hold-state Idle
q931.c:5052 q931_release_complete: Call 3914 enters state 0 (Null). Hold state: Idle

DL-DATA request
Protocol Discriminator: Q.931 (8) len=9
TEI=0 Call Ref: len= 2 (reference 3914/0xF4A) (Sent to originator)
Message Type: RELEASE COMPLETE (90)
TEI=0 Transmitting N(S)=38, window is open V(A)=38 K=7

Protocol Discriminator: Q.931 (8) len=9
TEI=0 Call Ref: len= 2 (reference 3914/0xF4A) (Sent to originator)
Message Type: RELEASE COMPLETE (90)
[08 02 81 ac]
Cause (len= 4) [ Ext: 1 Coding: CCITT (ITU) standard (0) Spare: 0 Location: Private network serving the local user (1)
Ext: 1 Cause: Requested channel not available (44), class = Network Congestion (resource unavailable) (2) ]
q931_hangup: other hangup
NEW_HANGUP DEBUG: Calling q931_hangup, ourstate Null, peerstate Null, hold-state Idle
NEW_HANGUP DEBUG: Destroying the call, ourstate Null, peerstate Null, hold-state Idle

< Protocol Discriminator: Q.931 (8) len=36
< TEI=0 Call Ref: len= 2 (reference 3914/0xF4A) (Sent from originator)
< Message Type: FACILITY (98)
< [1c 1d 9f 8b 01 00 a1 17 02 01 53 02 01 00 80 0f 46 4c 4f 52 49 44 41 20 43 41 4c 4c 20 20 20]
< Facility (len=31, codeset=0) [ 0x9F, 0x8B, 0x01, 0x00, 0xA1, 0x17, 0x02, 0x01, ‘S’, 0x02, 0x01, 0x00, 0x80, 0x0F, 'FLORIDA CALL ’ ]
– Making new call for cref 3914
Received message for call 0x85f58a0 on 0xb7845f68 TEI/SAPI 0/0, call->pri is 0xb7845f68 TEI/SAPI 0/0
– Processing IE 28 (cs0, Facility)
– Delayed processing IE 28 (cs0, Facility)
ASN.1 dump
Context Specific [11 0x0B] <8B> Len:1 <01>
<00> - "~"
Context Specific/C [1 0x01] Len:23 <17>
Integer(2 0x02) <02> Len:1 <01>
<53> - "S"
Integer(2 0x02) <02> Len:1 <01>
<00> - "~"
Context Specific [0 0x00] <80> Len:15 <0F>
<46 4C 4F 52 49 44 41 20-43 41 4C 4C 20 20 20> - "FLORIDA CALL "
ASN.1 end
interpretation Context Specific [11 0x0B] = 0 0x0000
INVOKE Component Context Specific/C [1 0x01]
invokeId Integer(2 0x02) = 83 0x0053
operationValue Integer(2 0x02) = 0 0x0000
operationValue = ROSE_QSIG_CallingName
callingName Name
namePresentationAllowedSimple Context Specific [0 0x00] =
<46 4C 4F 52 49 44 41 20-43 41 4C 4C 20 20 20> - "FLORIDA CALL "

DL-DATA request
Protocol Discriminator: Q.931 (8) len=9
TEI=0 Call Ref: len= 2 (reference 3914/0xF4A) (Sent to originator)
Message Type: RELEASE COMPLETE (90)
TEI=0 Transmitting N(S)=39, window is open V(A)=39 K=7

Protocol Discriminator: Q.931 (8) len=9
TEI=0 Call Ref: len= 2 (reference 3914/0xF4A) (Sent to originator)
Message Type: RELEASE COMPLETE (90)
[08 02 81 d1]
Cause (len= 4) [ Ext: 1 Coding: CCITT (ITU) standard (0) Spare: 0 Location: Private network serving the local user (1)
Ext: 1 Cause: Invalid call reference value (81), class = Invalid message (e.g. parameter out of range) (5) ]
q931_hangup: other hangup
NEW_HANGUP DEBUG: Calling q931_hangup, ourstate Null, peerstate Null, hold-state Idle
NEW_HANGUP DEBUG: Destroying the call, ourstate Null, peerstate Null, hold-state Idle
===============================================

This problem is a show stopper and is causing a major disruption in my client’s business. I hope there is someone here who recognizes the problem and can suggest a fix!!

Thanks, Andrew


#2

It looks like the ROSE invoke is not the problem. I have plenty of good calls with that same message.

I’ve made a couple changes since posting. Kernel version is now kernel-smp-2.6.9-101.plus.c4.1.LSE and libpri is now asterisk-libpri-1.4.12-1.C4.LSE .

No errors since restarting everything. It may be a few hours before that begins. We have 19 calls up right now with no errors. trying to get INTENSE debug logged with error.


#3

solar flares


#4

We tested twice over the weekend, filled the PRI dialing in and out with no error. Thought the software update did the job (however unlikely). today, monday, total failure. Error logged this time:

chan_dahdi.c: Span 1: SETUP requested unavailable channel 0/2. Attempting to renegotiate.

calls were unable to send out or come in at random. it seemed to try to take the next open channel but caught a cause code 34.


#5

ok so after a bunch of debugging, we’ve determined via ISDN PCAP that we are not transmitting when an attempted call fails…

================

[b]edit --> posted the wrong call log… hang on…

[/b]

ISDN pcap trace showed no packets sent out over the pipe. I was told by Sangoma support that the communication is breaking somewhere between DAHDI and libpri. To make this case more perplexing, we are up and running using the same software configuration on a different server (DELL 2850). I feel like I’m troubleshooting a moving target!

Would upgrading to asterisk 1.8 be a worthwhile effort? At my level of understanding of this problem, that will be my next step.

Can anyone suggest further steps can I take to determine the exact cause?

Thanks, Andrew