Asterisk on VMware ESXi: RTP problems with one-way audio

Hello everyone.

This is my first post in the community since I started using Asterisk. Until now there was no need for me to ask anything because I always found the answer here. I’ve been working with Asterisk for 10 years and have deployed numerous systems for businesses and for the industry I work in, which is telecommunications.

So, on to my problem.

A few weeks ago I deployed a central Asterisk system that connects different telecom providers for a big call center, so my Asterisk acts as a gateway. There are no local extensions. I have connected 3 GSM gateways on the same local subnet, telecom providers (over dedicated links and over the Internet) and a cloud call center provider over the Internet.

This call center currently peaks at 80 concurrent calls and will grow. What has started to happen is that on some calls there is no RTP in one direction. These are mostly outbound calls from the cloud call center through my Asterisk to the telecom providers. My client (the call center) reported that out of 260 calls they made, 230 were bad: the B side (the called customer) was unable to hear the A side (the agent who called them).

I’ve been researching this for weeks and I can’t figure it out. Whenever I had audio problems before (no audio at all, or no audio in one direction), they affected ALL calls, not just some.

Asterisk 16.6.0 is installed on Debian 10. The VM has 16 CPUs, 12GB of RAM and 500GB of disk space on fast NetApp storage. The storage and the ESXi host where the Asterisk VM is located are connected over four 10Gbps fiber interfaces.

At peak (70-80 simultaneous calls) only 1GB of RAM and about 1 CPU in total are in use.

I’ve tried playing with directmedia, canreinvite and directrtpsetup, and nothing worked.
Some providers do not support directmedia, so I had to turn it off.

Here is my sip.conf:

[general]
context=public                 
allowoverlap=no                
udpbindaddr=0.0.0.0             
tcpenable=no                    
tcpbindaddr=0.0.0.0            
transport=udp                  

srvlookup=yes
jbenable=no


[mypbx]
usrname=user
secret=secret
host=10.xx.xx.xx
qualify=yes
type=friend
port=5060
nat=never
dtmfmode=inband
insecure=invite,port
canreinvite=no

[sipprovider]
disallow=all
allow=alaw:20
host=176.xx.xx.xx
nat=never
type=peer
dtmfmode=inband
qualify=yes
insecure=invite,port
canreinvite=yes
directmedia=yes

[cc1]
host=10.xx.xx.xx
qualify=yes
type=friend
port=5060
nat=no
dtmfmode=rfc2833
insecure=invite,port
canreinvite=no

[cc2]
host=10.xx.xx.xx
qualify=yes
type=friend
port=5060
nat=no
dtmfmode=rfc2833
insecure=invite,port
canreinvite=no

[cc3]
host=10.xx.xx.xx
qualify=yes
type=friend
port=5060
nat=no
dtmfmode=rfc2833
insecure=invite,port
canreinvite=no

[cc4]
host=10.xx.xx.xx
qualify=yes
type=friend
port=5060
nat=no
dtmfmode=rfc2833
insecure=invite,port
canreinvite=no

[cloudcc]
disallow=all
allow=alaw:20
allow=ulaw:20
allow=gsm
nat=no
host=194.xx.xx.xx
username=user
secret=secret
qualify=yes
type=peer
port=5186
dtmfmode=rfc2833
insecure=invite,port
canreinvite=yes
directmedia=yes
limit=20000

[cloudcc2]
disallow=all
allow=alaw:20
allow=ulaw:20
allow=gsm
nat=no
host=194.xx.xx.xx
username=user
secret=secret
qualify=yes
type=peer
port=5187
dtmfmode=rfc2833
insecure=invite,port
canreinvite=yes
directmedia=yes
limit=20000

[gsmAgw]
secret=secret
dtmfmode=rfc2833
canreinvite=no
host=dynamic
type=friend
port=5060
nat=no
qualify=yes

[gsmgw]
secret=secret
dtmfmode=rfc2833
canreinvite=no
host=dynamic
type=friend
nat=no
port=5060
qualify=yes

[gsmgw2]
secret=secret
dtmfmode=rfc2833
canreinvite=no
host=dynamic
type=friend
nat=no
port=5060
qualify=yes

[a1-sip1]
disallow=all
allow=alaw
remotesecret=secret
username=user
dtmfmode=inband
host=10.xx.xx.xx
type=peer
qualify=yes
canreinvite=no

[a1-sip2]
disallow=all
allow=alaw
remotesecret=secret
username=user
dtmfmode=inband
host=10.xx.xx.xx
type=peer
qualify=yes
canreinvite=no

[iptel]
secret=secret
fromuser=user
username=user
dtmfmode=rfc2833
host=public.ip
type=friend
port=5060
qualify=yes
canreinvite=no
fromdomain=some.domain
insecure=invite

[local-pbx-helper]
host=10.xx.xx.xx
qualify=yes
type=friend
port=5060
nat=no
dtmfmode=inband
insecure=invite,port
canreinvite=no

[ht]
disallow=all
allow=alaw:20
port=5060
insecure=invite,port
dtmfmode=rfc2833
qualify=yes
type=friend
host=192.xx.xx.xx
canreinvite=no

[nth]
type=friend
username=user
secret=secret
host=62.xx.xx.xx
port=5068
nat=no
insecure=port,invite
dtmfmode=rfc2833
canreinvite=no
qualify=yes
disallow=all
allow=alaw

[uni]
type=peer
disallow=all
allow=alaw
host=10.xx.xx.xx
insecure=invite,port
port=5060
dtmfmode=rfc2833
qualify=yes
canreinvite=no

There are no problems with GSM at all.

Here is rtp.conf:

;
; RTP Configuration
;
[general]
;
; RTP start and RTP end configure start and end addresses
;
; Defaults are rtpstart=5000 and rtpend=31000
;
rtpstart=10000
rtpend=31000
;
; Whether to enable or disable UDP checksums on RTP traffic
;
;rtpchecksums=no
;
; The amount of time a DTMF digit with no 'end' marker should be
; allowed to continue (in 'samples', 1/8000 of a second)
;
;dtmftimeout=3000
; rtcpinterval = 5000 	; Milliseconds between rtcp reports
			;(min 500, max 60000, default 5000)
;
; Enable strict RTP protection. This will drop RTP packets that
; do not come from the source of the RTP stream. This option is
; enabled by default.
strictrtp=no
;
; Number of packets containing consecutive sequence values needed
; to change the RTP source socket address. This option only comes
; into play while using strictrtp=yes. Consider changing this value
; if rtp packets are dropped from one or both ends after a call is
; connected. This option is set to 4 by default.
; probation=8
;
; Whether to enable or disable ICE support. This option is disabled by default.
; icesupport=true
;
; Hostname or address for the STUN server used when determining the external
; IP address and port an RTP session can be reached at. The port number is
; optional. If omitted the default value of 3478 will be used. This option is
; disabled by default.
;
; e.g. stundaddr=mystun.server.com:3478
;
; stunaddr=
;
; Hostname or address for the TURN server to be used as a relay. The port
; number is optional. If omitted the default value of 3478 will be used.
; This option is disabled by default.
;
; e.g. turnaddr=myturn.server.com:34780
;
; turnaddr=
;
; Username used to authenticate with TURN relay server.
; turnusername=
;
; Password used to authenticate with TURN relay server.
; turnpassword=

When I was testing this setup I had no audio problems at all, but that was with a maximum of 5 calls over each trunk.

Can Asterisk handle such traffic? Am I missing something?
Any help is much appreciated.

I think the question is not whether Asterisk can handle the calls, but whether your system can.

If it is a hardware issue, then you should observe no problems with a few calls, distorted audio with more calls, and loss of audio when it gets messy.

The hardware seems to be adequate, but what about the internet connectivity? Since Asterisk is a B2BUA that typically stays in all paths, there could be a problem. A first hint would be to check what “sip show channelstats” says.

That said, Asterisk may not be the best solution in your case. Not because Asterisk is bad, but SIP servers like Kamailio seem to be more appropriate to me.

Hello.

Thank you for the reply.

There are a few calls now:

Peer             Call ID      Duration Recv: Pack  Lost       (     %) Jitter Send: Pack  Lost       (     %) Jitter
10.xx.xx.xx      41ac862d08d  00:02:30 0000003286  0000000000 ( 0.00%) 0.0000 0000007530  0000000000 ( 0.00%) 0.0024
194.xx.xx.xx     98545af4-78  00:00:25 0000000936  0000000000 ( 0.00%) 0.0000 0000001148  0000000000 ( 0.00%) 0.0011
192.xx.xx.xx     37929f226f6  00:00:25 0000001148  0000000000 ( 0.00%) 0.0000 0000000936  0000000000 ( 0.00%) 0.0002
176.xx.xx.xx     4dc8201f-78  00:02:30 0000007530  0000000000 ( 0.00%) 0.0000 0000007243  0000000000 ( 0.00%) 0.0015
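
With many concurrent calls, reading this table by eye gets tedious. A minimal Python sketch for parsing it could look like the following. It assumes the column layout shown above; the `one_way_suspects` helper and its 50% threshold are hypothetical and not part of Asterisk. A large gap between received and sent packets on a leg is only a hint (hold or silence suppression can cause it too), not proof of one-way audio.

```python
import re
from typing import NamedTuple

class ChannelStat(NamedTuple):
    peer: str
    call_id: str
    duration: str
    recv_packets: int
    recv_lost: int
    sent_packets: int
    sent_lost: int

# One data row of "sip show channelstats" as pasted above (layout assumed).
ROW = re.compile(
    r"^(?P<peer>\S+)\s+(?P<call_id>\S+)\s+(?P<duration>\d\d:\d\d:\d\d)\s+"
    r"(?P<rp>\d+)\s+(?P<rl>\d+)\s+\(\s*[\d.]+%\)\s+[\d.]+\s+"
    r"(?P<sp>\d+)\s+(?P<sl>\d+)\s+\(\s*[\d.]+%\)\s+[\d.]+\s*$"
)

def parse_channelstats(text):
    """Return one ChannelStat per data row; the header and other lines are skipped."""
    stats = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            stats.append(ChannelStat(
                m["peer"], m["call_id"], m["duration"],
                int(m["rp"]), int(m["rl"]), int(m["sp"]), int(m["sl"]),
            ))
    return stats

def one_way_suspects(stats, ratio=0.5):
    """Hypothetical heuristic: legs that received fewer than ratio * sent packets."""
    return [s for s in stats
            if s.sent_packets > 0 and s.recv_packets < ratio * s.sent_packets]

SAMPLE = """\
Peer             Call ID      Duration Recv: Pack  Lost       (     %) Jitter Send: Pack  Lost       (     %) Jitter
10.xx.xx.xx      41ac862d08d  00:02:30 0000003286  0000000000 ( 0.00%) 0.0000 0000007530  0000000000 ( 0.00%) 0.0024
194.xx.xx.xx     98545af4-78  00:00:25 0000000936  0000000000 ( 0.00%) 0.0000 0000001148  0000000000 ( 0.00%) 0.0011
192.xx.xx.xx     37929f226f6  00:00:25 0000001148  0000000000 ( 0.00%) 0.0000 0000000936  0000000000 ( 0.00%) 0.0002
176.xx.xx.xx     4dc8201f-78  00:02:30 0000007530  0000000000 ( 0.00%) 0.0000 0000007243  0000000000 ( 0.00%) 0.0015
"""

if __name__ == "__main__":
    for s in one_way_suspects(parse_channelstats(SAMPLE)):
        print(s.peer, s.call_id, s.recv_packets, s.sent_packets)
```

On the four rows above, this heuristic flags only the 10.xx.xx.xx leg, which received 3286 packets while 7530 were sent to it.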

I’m not sure what you mean by internet connectivity. This Asterisk is deployed in the core of the ISP network where I work. It has several interfaces. The interface with the public IP is directly connected (via VLAN) to the core router that holds the BGP session to my uplink provider, and it has no QoS or other restrictions. The public IP is used to connect, over the Internet, the voice providers that I couldn’t connect point-to-point. The second interface is in the local network; its gateway is a MikroTik CHR, and it carries the GSM gateway trunks on the same subnet. The other interfaces are point-to-point via VLAN and are directly connected over fiber. The only restriction is the transceivers on both sides, which are 1.25Gbps. All other interfaces are 10GE, except the uplink to the provider, which is 2x 1GE in LACP.

I basically meant bandwidth problems. 1.25 Gbps is a lot and should not be a problem. The question is whether you can identify a specific bottleneck when the traffic increases.
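
To put the bandwidth question in numbers, here is a rough back-of-the-envelope sketch in Python. It assumes G.711 (alaw) with 20 ms packetization, as in the `allow=alaw:20` lines of the posted sip.conf, counts only RTP + UDP + IPv4 headers (no Ethernet framing, no RTCP), and assumes Asterisk stays in the media path so that every call has two legs. The function names are made up for illustration.

```python
def rtp_stream_bps(payload_bytes=160, ptime_ms=20, overhead_bytes=40):
    """Bit rate of one RTP stream in one direction at the IP layer.

    160 bytes = 20 ms of G.711 at 8000 samples/s, 1 byte per sample.
    40 bytes  = RTP (12) + UDP (8) + IPv4 (20) headers.
    """
    packets_per_second = 1000 / ptime_ms
    return (payload_bytes + overhead_bytes) * 8 * packets_per_second

def gateway_load_mbps(calls, legs_per_call=2):
    """Approximate RTP load per direction on a B2BUA that relays all media."""
    return calls * legs_per_call * rtp_stream_bps() / 1e6

if __name__ == "__main__":
    print(rtp_stream_bps())        # bits per second for one stream
    print(gateway_load_mbps(80))   # Mbps per direction at 80 concurrent calls
```

Under these assumptions, 80 concurrent calls come to roughly 12.8 Mbps in each direction, far below a 1.25 Gbps link, so raw bandwidth is unlikely to be the limit here.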

The canreinvite option is deprecated and has been replaced by the directmedia option. Also, the directrtpsetup option never worked. NAT settings need to be adapted to the network environment. Have you tried nat=force_rport,comedia and also adding the localnet option?

Hello. Thanks for the reply.

Funny thing about canreinvite and directmedia: when I set directmedia=yes on a peer, there was no audio at all until I also set canreinvite=yes. So I really don’t know what is deprecated here. I tried the directrtpsetup option “just to try everything”, but I knew it would not work.

I don’t see how the NAT settings would help because my setup is not behind NAT, but for the sake of testing I tried them and it didn’t make any difference. I did not use localnet and externaddr because those options are for NAT.

@EkFudrek I need to gather and analyze data to see if there is some kind of bottleneck. Maybe the vSwitch is messing with my setup.

Your list of addresses indicates you must be operating in either a NAT environment or a broken multi-homing one (multi homing without border gateway protocol and autonomous system numbers).

If it still recognizes canreinvite at all, chan_sip treats it as an alias of directmedia, and uses the same code to handle the parameters, so it shouldn’t be possible for one to work but not the other unless canreinvite has been removed completely.

The nat= parameter is confusing, because it isn’t actually intended for inside NAT cases; it is intended for where Asterisk is on a public address and the peer is inside NAT, but not fully NAT aware.

If it’s not behind NAT why do you have NAT addresses connecting to you? How are they routing to you and visible in your peers?

The server might have a static IP, but there is a NAT issue at play here.

Are you maybe using 1to1 static NAT? That is after all a form of NAT and would drastically change your config.

Hello.

Thank you for the reply.

It is not NAT. NAT is Network Address Translation, and I do not use that local subnet to go out through a public IP. It is just a local network without NAT. So we can say with 100% certainty that this is not a NAT problem.

But I have some new information:

The problem is narrowed down to a few peers.
Calls originate from:
[cloudcc]
disallow=all
allow=alaw:20
allow=ulaw:20
allow=gsm
nat=no
host=194.xx.xx.xx
username=user
secret=secret
qualify=yes
type=peer
port=5186
dtmfmode=rfc2833
insecure=invite,port
canreinvite=yes
directmedia=yes
limit=20000

And go out over this:
[sipprovider]
disallow=all
allow=alaw:20
host=176.xx.xx.xx
nat=never
type=peer
dtmfmode=inband
qualify=yes
insecure=invite,port
canreinvite=yes
directmedia=yes

Or these four:
[gsmgw]
secret=secret
dtmfmode=rfc2833
canreinvite=no
host=dynamic
type=friend
nat=no
port=5060
qualify=yes

[gsmgw2]
secret=secret
dtmfmode=rfc2833
canreinvite=no
host=dynamic
type=friend
nat=no
port=5060
qualify=yes

[a1-sip1]
disallow=all
allow=alaw
remotesecret=secret
username=user
dtmfmode=inband
host=10.xx.xx.xx
type=peer
qualify=yes
canreinvite=no

[a1-sip2]
disallow=all
allow=alaw
remotesecret=secret
username=user
dtmfmode=inband
host=10.xx.xx.xx
type=peer
qualify=yes
canreinvite=no

Everything else seems to be fine, as the client confirmed.

So every bad call originates from cloudcc and goes out through one of the 5 peers mentioned above, depending on which number is called.

@EkFudrek I’ve checked everything and there is no bottleneck. I have 500/500Mbps throughput, and most of the time traffic does not exceed 50Mbps.

P.S. There are no directmedia=yes options anymore, since that did not fix the problem.

You will need to provide the actual SDP exchanges from the SIP protocol logging, to see what addresses and ports are actually being used.
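
Once the SIP messages are logged (in chan_sip, for example with `sip set debug on`), the media address and port each side advertises can be read from the `c=` and `m=` lines of the SDP body. The following Python sketch shows one way to pull them out. The sample SDP is invented for illustration (192.0.2.10 is a documentation address), and `media_endpoints` is a hypothetical helper, not an Asterisk tool.

```python
def media_endpoints(sdp):
    """Return (media, address, port) for every m= line in an SDP body.

    A c= line before the first m= line is the session-level address; a c=
    line after an m= line overrides it for that media section only.
    """
    session_addr = None
    endpoints = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("c="):
            addr = line.split()[-1]           # e.g. "c=IN IP4 192.0.2.10"
            if endpoints:
                endpoints[-1]["addr"] = addr  # media-level connection line
            else:
                session_addr = addr           # session-level connection line
        elif line.startswith("m="):
            media, port = line[2:].split()[:2]
            endpoints.append({
                "media": media,
                "port": int(port.split("/")[0]),  # "49170/2" means a port count
                "addr": session_addr,
            })
    return [(e["media"], e["addr"], e["port"]) for e in endpoints]

# Invented example body, not taken from the thread.
SAMPLE_SDP = """\
v=0
o=- 1 1 IN IP4 192.0.2.10
s=call
c=IN IP4 192.0.2.10
t=0 0
m=audio 41234 RTP/AVP 8 101
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
"""

if __name__ == "__main__":
    print(media_endpoints(SAMPLE_SDP))
```

Comparing the address and port from the SDP of each side with where the RTP packets in a capture really come from and go to shows quickly whether media is being sent somewhere the other end does not expect.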

Either you can rule out one more possible cause, or you have something else to fix, though it might have nothing to do with the problem: you are using different DTMF modes.

Hello.

After analyzing hundreds of calls with Wireshark I came to a conclusion.
All “bad” calls without audio in one direction were using an RTP port higher than 20000.

My port range in rtp.conf is 10000-20000 and there is no firewall rule for those ports. I mean, I do not block anything.

Why am I receiving RTP from the SIP provider on ports higher than my range? Should I open more ports? Or is there something else to do?

You could try it and check whether it helps…

On the other hand, initially you wrote that the RTP ports are between 10000 and 31000.

I’m sorry. Yes, the range is 10000-31000, but I receive RTP on ports higher than that, on some calls higher than 40000.

Are you talking about the remote source port, which can be up to 65535, or the local destination port, which should be what is offered by Asterisk in the SDP, and in your configured range?

I’m talking about the remote source port. Calls where RTP arrives from a port higher than 31000 have no audio. When the port is within my RTP range, the call is fine. My Asterisk always offers a port from the range in rtp.conf. Maybe the cloud provider, where the calls terminate and originate, blocks higher ports. It is on me to find that out.
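
To check whether the correlation between remote source port and missing audio holds over a whole capture, the calls can be tabulated. The Python sketch below assumes the 10000-31000 range from the posted rtp.conf; the call list is invented sample data, and `summarize` is a hypothetical helper. Note that rtpstart/rtpend only govern the local ports Asterisk offers; the remote side may legitimately send from any source port up to 65535.

```python
from collections import Counter

RTP_START, RTP_END = 10000, 31000  # rtpstart / rtpend from the posted rtp.conf

def in_local_range(port):
    """True if the port lies inside the locally configured RTP range."""
    return RTP_START <= port <= RTP_END

def summarize(calls):
    """Count calls by (port bucket, audio_ok).

    calls: iterable of (remote_src_port, audio_ok) pairs, for example
    collected by hand from a packet capture.
    """
    counts = Counter()
    for port, audio_ok in calls:
        bucket = "inside" if in_local_range(port) else "outside"
        counts[(bucket, audio_ok)] += 1
    return counts

# Invented sample data for illustration only.
SAMPLE_CALLS = [(12040, True), (30998, True), (31002, False), (40512, False)]

if __name__ == "__main__":
    for (bucket, audio_ok), n in sorted(summarize(SAMPLE_CALLS).items()):
        print(bucket, "audio ok" if audio_ok else "no audio", n)
```

If every bad call falls in the "outside" bucket and every good call in the "inside" one, that points at something between the two ends filtering on port numbers rather than at Asterisk itself.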