Traffic shaping causing jb resyncing


#1

While trying to improve quality between two offices over not very ideal internet connections (One is on a “business” comcast cable modem, the other is on a 1 meg SDSL line), I setup some traffic shaping which has seriously improved call quality during file uploads/downloads. However, ever since I added the traffic shaping we’ve started to occasionally get the following warnings:

May 2 16:56:19 WARNING[13411]: chan_iax2.c:692 jb_warning_output: Resyncing the
jb. last_delay -2920, this delay -400, threshold 1008, new offset -2478
May 2 17:01:43 WARNING[13411]: chan_iax2.c:692 jb_warning_output: Resyncing the
jb. last_delay -2521, this delay 380, threshold 1008, new offset -2858

Whenever that warning appears, the call goes silent for a few seconds to a minute and starts to become very delayed and/or choppy for a few minutes before it clears up again. Usually only happens after a long call (can take 10-30 minutes before it starts happening).

I tried playing with various jitter buffer settings, but haven’t made much progress. I was hoping someone could help me understand what is going on and how I might be able to solve it?


#2

If it started after your ‘traffic shaping’ then you may want to take a closr look at what you did. (And maybe post some info).

Try doing an iax2 show netstats as well as just turning on iax2 debug to see if there are any obvious issues being reported during the call. (I’m assuming an iax channel from the messages).

p


#3

Yes, these are iax2 channels between servers. Using the g729 codec.

Turning on iax2 debug, I’m not sure what to make of the output. It seems to be doing a lot of retries every few seconds even when there’s no calls going on, I don’t know if thats regular “chatter” between servers or not. It’s hard to isolate things because we have 6 offices/servers connected to each other via iax2. Bechtelsville is always the side that reports the “Resyncing the jb” warning.

Here’s some of the output around when the resyncing happens:

Tx-Frame Retry[000] -- OSeqno: 069 ISeqno: 069 Type: IAX     Subclass: PING   
   Timestamp: 220013ms  SCall: 00020  DCall: 00008 [71.126.92.43:4569]
Tx-Frame Retry[000] -- OSeqno: 070 ISeqno: 069 Type: IAX     Subclass: LAGRQ  
   Timestamp: 220016ms  SCall: 00020  DCall: 00008 [71.126.92.43:4569]
Rx-Frame Retry[ No] -- OSeqno: 069 ISeqno: 070 Type: IAX     Subclass: PONG   
   Timestamp: 220013ms  SCall: 00008  DCall: 00020 [71.126.92.43:4569]
   RR_JITTER       : 8
   RR_LOSS         : 0
   RR_PKTS         : 10923
   RR_DELAY        : 70
   RR_DROPPED      : 0
   RR_OUTOFORDER   : 1
asterisk*CLI> 
Tx-Frame Retry[-01] -- OSeqno: 070 ISeqno: 070 Type: IAX     Subclass: ACK    
   Timestamp: 220013ms  SCall: 00020  DCall: 00008 [71.126.92.43:4569]
Rx-Frame Retry[ No] -- OSeqno: 070 ISeqno: 070 Type: IAX     Subclass: PING   
   Timestamp: 220036ms  SCall: 00008  DCall: 00020 [71.126.92.43:4569]
Tx-Frame Retry[000] -- OSeqno: 071 ISeqno: 071 Type: IAX     Subclass: PONG   
   Timestamp: 220036ms  SCall: 00020  DCall: 00008 [71.126.92.43:4569]
   RR_JITTER       : 3
   RR_LOSS         : 20
   RR_PKTS         : 10631
   RR_DELAY        : 78
   RR_DROPPED      : 48
   RR_OUTOFORDER   : 0
asterisk*CLI> 
Rx-Frame Retry[ No] -- OSeqno: 071 ISeqno: 070 Type: IAX     Subclass: LAGRQ  
   Timestamp: 220039ms  SCall: 00008  DCall: 00020 [71.126.92.43:4569]
Tx-Frame Retry[000] -- OSeqno: 072 ISeqno: 072 Type: IAX     Subclass: LAGRP  
   Timestamp: 220039ms  SCall: 00020  DCall: 00008 [71.126.92.43:4569]
Rx-Frame Retry[ No] -- OSeqno: 072 ISeqno: 071 Type: IAX     Subclass: LAGRP  
   Timestamp: 220016ms  SCall: 00008  DCall: 00020 [71.126.92.43:4569]
Tx-Frame Retry[-01] -- OSeqno: 071 ISeqno: 073 Type: IAX     Subclass: ACK    
   Timestamp: 220016ms  SCall: 00020  DCall: 00008 [71.126.92.43:4569]
Rx-Frame Retry[ No] -- OSeqno: 071 ISeqno: 072 Type: IAX     Subclass: ACK    
   Timestamp: 220036ms  SCall: 00008  DCall: 00020 [71.126.92.43:4569]
Rx-Frame Retry[ No] -- OSeqno: 072 ISeqno: 073 Type: IAX     Subclass: ACK    
   Timestamp: 220039ms  SCall: 00008  DCall: 00020 [71.126.92.43:4569]
May  3 12:21:41 WARNING[13411]: chan_iax2.c:692 jb_warning_output: Resyncing the jb. last_delay -2975, this delay -448, threshold 1006, new offset -2506
Tx-Frame Retry[000] -- OSeqno: 073 ISeqno: 073 Type: IAX     Subclass: LAGRQ  
   Timestamp: 230013ms  SCall: 00020  DCall: 00008 [71.126.92.43:4569]
Rx-Frame Retry[ No] -- OSeqno: 073 ISeqno: 074 Type: IAX     Subclass: LAGRP  
   Timestamp: 230013ms  SCall: 00008  DCall: 00020 [71.126.92.43:4569]
Tx-Frame Retry[-01] -- OSeqno: 074 ISeqno: 074 Type: IAX     Subclass: ACK    
   Timestamp: 230013ms  SCall: 00020  DCall: 00008 [71.126.92.43:4569]
Rx-Frame Retry[ No] -- OSeqno: 074 ISeqno: 074 Type: IAX     Subclass: LAGRQ  
   Timestamp: 230036ms  SCall: 00008  DCall: 00020 [71.126.92.43:4569]
Tx-Frame Retry[000] -- OSeqno: 074 ISeqno: 075 Type: IAX     Subclass: LAGRP  
   Timestamp: 230036ms  SCall: 00020  DCall: 00008 [71.126.92.43:4569]
Rx-Frame Retry[ No] -- OSeqno: 075 ISeqno: 075 Type: IAX     Subclass: ACK    
   Timestamp: 230036ms  SCall: 00008  DCall: 00020 [71.126.92.43:4569]
   
asterisk*CLI> iax2 show netstats
                               -------- LOCAL ---------------------  -------- REMOTE --------------------
Channel                    RTT  Jit  Del  Lost   %  Drop  OOO  Kpkts  Jit  Del  Lost   %  Drop  OOO  Kpkts
IAX2/reading-20            56    3  890     0   0    54    0     11    8   70     0   0     0    1     11

The traffic shaping is something I honestly know very little about. Each location has a different QoS setup and connection…

Bechtelsville is behind a Linksys WRT54G running OpenWrt WHITE RUSSIAN (RC5) with the qos-scripts package:
wiki.openwrt.org/Faq#head-a1dc72 … dfb4b761ff

Bechtelsville’s /etc/config/qos-wan:

class:Priority
pktsize:300
pktdelay:10
oversize:400:Normal
avgrate:30
share:75
end

class:Express
pktsize:400
pktdelay:10
oversize:400:Normal
avgrate:35
share:80
end

class:Normal
pktsize:1500
pktdelay:70
avgrate:20
share:30
end

class:Bulk
share:10
dlavgrate:15
dlimit:75
end

option:enabled
option:upload:950
#option:download:3000

option:priority:Priority
option:bulk:Bulk

classify:Bulk:ipp2p=all
classify:Bulk:layer7=edonkey

# voip
classify:Priority:proto=udp dport=5060
classify:Priority:proto=udp sport=5060
classify:Priority:proto=udp dport=4569
classify:Priority:proto=udp sport=4569

classify:Priority:proto=udp sport=53
classify:Priority:proto=udp dport=53
classify:Priority:proto=tcp dport=22,53
classify:Priority:proto=tcp sport=22,53

classify:Normal:proto=tcp dport=20,21,25,80,110,143,443,993,995

default:Express:proto=udp maxsize=300
default:Normal:dport=1-1024
default:Normal:dport=1-1024 mode=dl
default:Bulk
default:Bulk:mode=dl

This is the output of iptables -L -v -t mangle in Bechtelsville:

Chain DEV_ct_vlan1 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
  347 14950 MARK       all  --  any    any     anywhere             anywhere            MARK match 0x0 ipp2p v0.8.1_rc1 --ipp2p MARK set 0x4 
 2215  174K MARK       all  --  any    any     anywhere             anywhere            MARK match 0x0 LAYER7 l7proto edonkey MARK set 0x4 
    4  1728 MARK       udp  --  any    any     anywhere             anywhere            MARK match 0x0 multiport dports 5060 MARK set 0x1 
    0     0 MARK       udp  --  any    any     anywhere             anywhere            MARK match 0x0 multiport sports 5060 MARK set 0x1 
    2   118 MARK       udp  --  any    any     anywhere             anywhere            MARK match 0x0 multiport dports 4569 MARK set 0x1 
    0     0 MARK       udp  --  any    any     anywhere             anywhere            MARK match 0x0 multiport sports 4569 MARK set 0x1 
    2   300 MARK       udp  --  any    any     anywhere             anywhere            MARK match 0x0 multiport sports 53 MARK set 0x1 
40798 2871K MARK       udp  --  any    any     anywhere             anywhere            MARK match 0x0 multiport dports 53 MARK set 0x1 
    0     0 MARK       tcp  --  any    any     anywhere             anywhere            MARK match 0x0 multiport dports 22,53 MARK set 0x1 
 2894  150K MARK       tcp  --  any    any     anywhere             anywhere            MARK match 0x0 multiport sports 22,53 MARK set 0x1 
42207 2287K MARK       tcp  --  any    any     anywhere             anywhere            MARK match 0x0 multiport dports 20,21,25,80,110,143,443,993,995 MARK set 0x3 

Chain DEV_vlan1 (2 references)
 pkts bytes target     prot opt in     out     source               destination         
  25M   10G CONNMARK   all  --  any    any     anywhere             anywhere            CONNMARK restore 
  21M 9872M DEV_ct_vlan1  all  --  any    any     anywhere             anywhere            MARK match 0x0 
  25M   10G CONNMARK   all  --  any    any     anywhere             anywhere            CONNMARK save  
    0     0 MARK       all  --  any    any     anywhere             anywhere            TOS match Minimize-Cost MARK set 0x4 
  13M 2091M MARK       udp  --  any    any     anywhere             anywhere            MARK match 0x0 length 0:300 MARK set 0x2 
 191K   15M MARK       tcp  --  any    any     anywhere             anywhere            MARK match 0x0 multiport dports 1:1024 MARK set 0x3 
49873   39M MARK       udp  --  any    any     anywhere             anywhere            MARK match 0x0 multiport dports 1:1024 MARK set 0x3 
7364K 7720M MARK       all  --  any    any     anywhere             anywhere            MARK match 0x0 MARK set 0x4 
42425 2306K MARK       tcp  --  any    any     anywhere             anywhere            MARK match !0x4 length 0:128 tcp flags:FIN,SYN,RST,PSH,ACK,URG/SYN MARK set 0x1 
 802K   40M MARK       tcp  --  any    any     anywhere             anywhere            MARK match !0x4 length 0:128 tcp flags:FIN,SYN,RST,PSH,ACK,URG/ACK MARK set 0x1 
 124K   27M MARK       all  --  any    any     anywhere             anywhere            TOS match Minimize-Delay MARK set 0x1 
19915 5624K MARK       icmp --  any    any     anywhere             anywhere            MARK set 0x1 
    0     0 MARK       icmp --  any    any     anywhere             anywhere            icmp echo-reply MARK set 0x4 
25963   17M MARK       all  --  any    any     anywhere             anywhere            MARK match 0x1 length 400:65535 MARK set 0x3 
    0     0 MARK       all  --  any    any     anywhere             anywhere            MARK match 0x2 length 400:65535 MARK set 0x3 

Chain PREROUTING (policy ACCEPT 45M packets, 17G bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain INPUT (policy ACCEPT 102K packets, 14M bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 45M packets, 17G bytes)
 pkts bytes target     prot opt in     out     source               destination         
  25M   10G DEV_vlan1  all  --  any    vlan1   anywhere             anywhere            

Chain OUTPUT (policy ACCEPT 107K packets, 17M bytes)
 pkts bytes target     prot opt in     out     source               destination         
56115 9660K DEV_vlan1  all  --  any    vlan1   anywhere             anywhere            

Chain POSTROUTING (policy ACCEPT 45M packets, 17G bytes)
 pkts bytes target     prot opt in     out     source               destination

Reading is behind a regular linux firewall box using a modified WonderShaper (HTB) called AstShape from:
krisk.org/astlinux/misc/astshape

astshape status says:

qdisc ingress ffff: ---------------- 
 Sent 7744391683 bytes 9400288 pkts (dropped 550057, overlimits 0) 

 qdisc sfq 40: quantum 1514b perturb 10sec 
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0) 

 qdisc sfq 30: quantum 1514b perturb 10sec 
 Sent 1427240406 bytes 7536677 pkts (dropped 0, overlimits 0) 

 qdisc sfq 20: quantum 1514b perturb 10sec 
 Sent 68304100 bytes 770367 pkts (dropped 0, overlimits 0) 

 qdisc sfq 10: quantum 1514b perturb 10sec 
 Sent 61383074 bytes 948312 pkts (dropped 0, overlimits 0) 

 qdisc htb 1: r2q 10 default 30 direct_packets_stat 5
 Sent 1556928183 bytes 9255362 pkts (dropped 0, overlimits 397699) 

 class htb 1:1 root rate 500Kbit ceil 500Kbit burst 6Kb cburst 2239b 
 Sent 1556927646 bytes 9255357 pkts (dropped 0, overlimits 0) 
 rate 10912bps 22pps 
 lended: 0 borrowed: 0 giants: 0
 tokens: 51584 ctokens: -10880

class htb 1:10 parent 1:1 leaf 10: prio 1 rate 500Kbit ceil 500Kbit burst 6Kb cburst 2239b 
 Sent 61383074 bytes 948312 pkts (dropped 0, overlimits 0) 
 rate 341bps 5pps 
 lended: 948312 borrowed: 0 giants: 0
 tokens: 97536 ctokens: 35072

class htb 1:20 parent 1:1 leaf 20: prio 2 rate 500Kbit ceil 500Kbit burst 6Kb cburst 2239b 
 Sent 68304100 bytes 770367 pkts (dropped 0, overlimits 0) 
 rate 69bps 
 lended: 770367 borrowed: 0 giants: 0
 tokens: 96640 ctokens: 34176

class htb 1:30 parent 1:1 leaf 30: prio 3 rate 450Kbit ceil 450Kbit burst 6Kb cburst 2175b 
 Sent 1427240472 bytes 7536678 pkts (dropped 0, overlimits 0) 
 rate 11187bps 17pps 
 lended: 7536678 borrowed: 0 giants: 0
 tokens: 49355 ctokens: -21187

class htb 1:40 parent 1:1 leaf 40: prio 4 rate 400Kbit ceil 400Kbit burst 6Kb cburst 2111b 
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0) 
 lended: 0 borrowed: 0 giants: 0
 tokens: 122880 ctokens: 42240

#4

unfortunately I can’t invest the time to dig through the details. It’s clear from what you reported that something you did with your QoS setup broke things. If it is only happening on one path, there is a good hint to where to start looking.

My appologies in advance if the following is obvious to you, but I will point out that doing any type of CoS tagging on packets that are going out over the internet will almost certainly have no effects given your porviders. (I didn’t even look close enought to see if you were or weren’t doing this, but just thought I’d mention it just in case.)

p


#5

I believe I’m not just tagging the packets, I’m shaping the bandwidth. Limiting the overall bandwidth to lower priority traffic and giving full bandwidth to the voip ports at the firewalls.

It has definitely made a huge improvement. If I turn off the traffic shaping rules and try to upload/download a lot, calls will immediately begin to break up. Unfortuntely we’re now running into this new problem on longer duration calls and that resyncing warning. I thought perhaps the traffic shaping was causing a lot of jitter as it reprioritizes packets and eventually overloading the jitter buffer?


#6

I just had a very bad call while watching the iax2 debug output a little different then before. The RR_LOSS, etc numbers look really odd:

May  3 13:39:48 WARNING[13411]: chan_iax2.c:692 jb_warning_output: Resyncing the jb. last_delay -3, this delay 11119, threshold 1042, new offset -11119
May  3 13:39:48 WARNING[13411]: chan_iax2.c:692 jb_warning_output: Resyncing the jb. last_delay -42, this delay 11082, threshold 1044, new offset -11082
May  3 13:39:48 WARNING[13411]: chan_iax2.c:692 jb_warning_output: Resyncing the jb. last_delay -2398, this delay -11124, threshold 5556, new offset 42
May  3 13:39:49 WARNING[13411]: chan_iax2.c:692 jb_warning_output: Resyncing the jb. last_delay -2398, this delay -11122, threshold 5556, new offset 3

Tx-Frame Retry[000] -- OSeqno: 090 ISeqno: 090 Type: IAX     Subclass: VNAK   
   Timestamp: 282829ms  SCall: 00014  DCall: 00008 [66.93.176.44:4569]
Rx-Frame Retry[ No] -- OSeqno: 093 ISeqno: 089 Type: IAX     Subclass: LAGRP  
   Timestamp: 282744ms  SCall: 00008  DCall: 00014 [66.93.176.44:4569]
Tx-Frame Retry[000] -- OSeqno: 090 ISeqno: 090 Type: IAX     Subclass: VNAK   
   Timestamp: 282832ms  SCall: 00014  DCall: 00008 [66.93.176.44:4569]
Rx-Frame Retry[Yes] -- OSeqno: 109 ISeqno: 106 Type: IAX     Subclass: PING   
   Timestamp: 340040ms  SCall: 00001  DCall: 00006 [71.126.92.43:4569]
Tx-Frame Retry[000] -- OSeqno: 108 ISeqno: 110 Type: IAX     Subclass: PONG   
   Timestamp: 340040ms  SCall: 00006  DCall: 00001 [71.126.92.43:4569]
   RR_JITTER       : 1
   RR_LOSS         : 33554440
   RR_PKTS         : 16701
   RR_DELAY        : 64238
   RR_DROPPED      : 1
   RR_OUTOFORDER   : 2

Rx-Frame Retry[Yes] -- OSeqno: 110 ISeqno: 106 Type: IAX     Subclass: LAGRQ  
   Timestamp: 340043ms  SCall: 00001  DCall: 00006 [71.126.92.43:4569]
Tx-Frame Retry[000] -- OSeqno: 109 ISeqno: 111 Type: IAX     Subclass: LAGRP  
   Timestamp: 340043ms  SCall: 00006  DCall: 00001 [71.126.92.43:4569]
Rx-Frame Retry[ No] -- OSeqno: 111 ISeqno: 107 Type: IAX     Subclass: PONG   
   Timestamp: 347795ms  SCall: 00001  DCall: 00006 [71.126.92.43:4569]
   RR_JITTER       : 0
   RR_LOSS         : 0
   RR_PKTS         : 1
   RR_DELAY        : 40
   RR_DROPPED      : 0
   RR_OUTOFORDER   : 0

Tx-Frame Retry[000] -- OSeqno: 099 ISeqno: 102 Type: IAX     Subclass: PING   
   Timestamp: 320013ms  SCall: 00014  DCall: 00008 [66.93.176.44:4569]
Tx-Frame Retry[000] -- OSeqno: 100 ISeqno: 102 Type: IAX     Subclass: LAGRQ  
   Timestamp: 320016ms  SCall: 00014  DCall: 00008 [66.93.176.44:4569]
Rx-Frame Retry[ No] -- OSeqno: 102 ISeqno: 100 Type: IAX     Subclass: PONG   
   Timestamp: 320013ms  SCall: 00008  DCall: 00014 [66.93.176.44:4569]
   RR_JITTER       : 0w 
   RR_LOSS         : 0w 
   RR_PKTS         : 1w 
   RR_DELAY        : 40 
   RR_DROPPED      : 0w 
   RR_OUTOFORDER   : 0w 
asterisk*CLI> iax2 show 
Tx-Frame Retry[-01] -- OSeqno: 100 ISeqno: 103 Type: IAX     Subclass: ACK    
   Timestamp: 320013ms  SCall: 00014  DCall: 00008 [66.93.176.44:4569]
Rx-Frame Retry[ No] -- OSeqno: 103 ISeqno: 100 Type: IAX     Subclass: PING   
   Timestamp: 320000ms  SCall: 00008  DCall: 00014 [66.93.176.44:4569]
Tx-Frame Retry[000] -- OSeqno: 101 ISeqno: 104 Type: IAX     Subclass: PONG   
   Timestamp: 320000ms  SCall: 00014  DCall: 00008 [66.93.176.44:4569]
   RR_JITTER       : 3w 
   RR_LOSS         : 16777158
   RR_PKTS         : 15195
   RR_DELAY        : 62 
   RR_DROPPED      : 0w 
   RR_OUTOFORDER   : 117

#7

I didn’t think you were trying to do CoS but just thought I’d mention it in case. That last snapshot looks pretty ugly - question is, is it the networking that is screwing with you or is it the the bw management work you are doing on one one side or another.

If the network between the two sites is really a big issue, you may want to try one of two things. First - find out why and see if you can get it fixed. (E.g. how is the network quality, jitter and delays between the problematic site and it’s providers primary gateway? I’ve seen horendous problems that fit this category that got fixed after a few hours on the phone with the provider.) The other thing you may want to try is to setup a vpn between the two problematic sites. Although common wisdom usually says ‘don’t do this’ because of the added delay that is created, there was a study that showed on a really bad network, the encapsulation and resulting reordering and retransmission of lost packets that occurs in a tcp created vpn can significantly help call quality issues in a very bad network environment).

just some thoughts,

p