Greetings.
I’ve been crawling both the wiki and the forum for days in the hope that this particular problem has already been solved. If it has, please direct me to it. Otherwise, I thought I’d share what I went through trying to set this up.
I’ve been trying to set up a home VoIP network using only free software and hardware I have lying around the house (in the form of a sturdy but unused desktop in my room). All phones/devices are generic, free softphones running on smartphones (over 3G/EDGE), over which I have no control.
Using Asterisk-r288993 on Ubuntu 10.04
I had set up an IPsec/L2TP VPN server (Openswan, xl2tpd and pppd, all free, and each its own labyrinth to solve for mobile users) to encrypt and tunnel traffic through the internet into my home network, where the Asterisk server is set up. The VPN worked fine; the problem I had wasn’t with that (or this wouldn’t be relevant to this forum).
The problem seemed to be that there was so much NAT’ing and un-NAT’ing going on that Asterisk had no idea where to send the RTP packets, or that it didn’t understand that the RTP packets arriving at its NIC were intended for it.
A break-down of the IP insanity:
Mobile phone has an internal IP on the carrier network in 10.0.0.0/8
Connects to the VPN from a carrier IP in 166.0.0.0/8 (AT&T in this case)
VPN tunnel assigns the mobile phone an IP from 192.168.1.128/25
Asterisk listens on 192.168.1.1
I’ve been using Wireshark on the Asterisk box to watch network traffic, and also on a Mac when using the Blink softphone.
Now, the SIP traffic itself worked fine. Using Wireshark, I could see all the SIP REGISTERs, INVITEs, BYEs, OKs, etc. pass between 192.168.1.128/25 and 192.168.1.1. However, the REGISTER contact binding (in the 200 OK) pointed to a 10.0.0.0/8 address, so once a call was established (say, when calling the Echo() application) I got RTP from the mobile to Asterisk as 192.168.1.128/25 -> 192.168.1.1, but RTP from Asterisk to the mobile went 192.168.1.1 -> 10.0.0.0/8.
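To make the mismatch concrete, here is a small Python sketch (nothing Asterisk itself runs) that checks whether a contact address falls inside the VPN pool. The two host addresses are made-up examples picked from the ranges above, and `contact_reachable_via_vpn` is a hypothetical helper name.

```python
import ipaddress

# VPN pool handed out to road warriors (from the breakdown above).
VPN_POOL = ipaddress.ip_network("192.168.1.128/25")

def contact_reachable_via_vpn(contact_ip: str) -> bool:
    """Return True if the contact address lies inside the VPN pool."""
    return ipaddress.ip_address(contact_ip) in VPN_POOL

# Hypothetical example addresses, chosen only to fall in the ranges above.
carrier_internal = "10.20.30.40"   # what the phone put in its Contact header
tunnel_address = "192.168.1.130"   # what the VPN assigned to the phone

print(contact_reachable_via_vpn(carrier_internal))  # False
print(contact_reachable_via_vpn(tunnel_address))    # True
```

Anything Asterisk sends to the first kind of address never enters the tunnel, which matches what Wireshark showed.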
This seemed like a textbook case for nat=yes in sip.conf, except that setting nat=yes (without defining any localnet) did nothing, even though the wiki says that contact bindings are ignored when nat=yes in favor of comedia / symmetric RTP. I still got the same symptoms. So I set localnet=192.168.1.128/25 (the IP pool for road warriors) and externalip=192.168.1.1, and I got the same behavior. (Setting localnet/externalip intuitively seems to undo the benefits of nat=yes and doesn’t make much sense, but I tried it anyway.) In fact, I tried every value for nat= and got the same behavior: RTP came in from the client over the VPN tunnel (so symmetric RTP should have worked), and RTP went out from the server to 10.0.0.0/8.
To muddy things further, some SIP clients (such as Blink on OS X) were smart enough to know when they were talking SIP through a VPN connection, and modified their own contact info in the REGISTER to match their assigned IP (in 192.168.1.128/25), so that the 200 OK contact binding matched the VPN tunnel, and RTP from the server reached the client just fine.
However, Echo() still didn’t work in this case. I saw the RTP going from client to server, and I heard the Playback(demo-echotest) prompt on the client side, so the server-to-client direction was OK. But once the server reached the Echo() portion, I saw RTP arriving at the server on the network, yet it wasn’t received by Asterisk and retransmitted back to the client. Codec agreement wasn’t an issue.
“rtp set debug on” was showing packets sent over the VPN tunnel from the Asterisk server to the client (during demo-echotest.gsm), but zero RTP packets received. That would explain why Echo() wasn’t working: it wasn’t receiving anything to retransmit.
The question was, why? Perhaps the RTP traffic for my 10.0.0.0/8 clients was being sent into the nether for the same reason: the inbound RTP stream was never received, so Asterisk couldn’t switch over to symmetric RTP.
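Here is a rough Python sketch of that hunch. It is a simplified model of the symmetric RTP (comedia) idea as I understand it, not Asterisk’s actual code: until some RTP has been received from the peer, the only address available is the one from signalling. The function name and the addresses are hypothetical.

```python
from typing import Optional, Tuple

Address = Tuple[str, int]  # (ip, udp_port)

def rtp_destination(signalled: Address,
                    last_rtp_source: Optional[Address]) -> Address:
    """Pick the outbound RTP target under a simplified symmetric-RTP rule.

    signalled: address learned from SIP/SDP signalling.
    last_rtp_source: source of the latest inbound RTP packet that reached
    the application, or None if nothing has been received yet.
    """
    if last_rtp_source is not None:
        return last_rtp_source   # reply to wherever the media came from
    return signalled             # nothing received, fall back to signalling

# Hypothetical addresses from the ranges in this post.
signalled = ("10.20.30.40", 50000)
seen_on_wire = ("192.168.1.130", 50000)

# Inbound RTP never reaches the application: stuck on the signalled address.
print(rtp_destination(signalled, None))
# Inbound RTP is received: replies follow the tunnel address instead.
print(rtp_destination(signalled, seen_on_wire))
```

In this model, anything that stops inbound RTP from reaching Asterisk also defeats nat=yes, whatever value it is set to.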
The only thing I could think of was that the client’s RTP port range differs from Asterisk’s rtp.conf range (50000-60000 on the client and 10000-20000 on Asterisk). That didn’t make sense as a cause, however, because the RTP packets from the client were (src=50000ish, dst=10000ish). Furthermore, I was hearing demo-echotest.gsm at the client from the Asterisk server, which travels over the same ports in the opposite direction. So what on Earth was the problem?
I’ll tell you: the default iptables policy on the INPUT chain was DROP (sudo iptables -P INPUT DROP) on the Asterisk server. Now, I’d already opened up the SIP port (5060 by default) in the firewall. But Wireshark showed the client RTP reaching the server while nothing showed up in the RTP debugging. That is possible because a packet capture sees incoming packets before the firewall filters them. The packets were being dropped right at the front door! The reason I didn’t think of this early on is that I assumed that, since the server was sending out from 10000-ish ports, the firewall had been pinholed to allow return traffic on the same port/protocol. This was not the case. A quick (sudo iptables -A INPUT -s 192.168.1.128/25 -d 192.168.1.1/32 -p udp --dport 10000:20000 -j ACCEPT) and I had two-way comms, and it actually sounds really clear over 3G even though I’m tunneling all the way home and back out to the phone (just with a longer delay).
I hope some of the debugging tools and steps I tried will help jog the memory of others who will undoubtedly encounter similar problems. Good luck hunting down your own bugs.
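For anyone who wants to see why opening 5060 alone wasn’t enough, here is a toy Python model of first-match rule evaluation with a default policy. It is only a sketch, not how iptables is implemented: it ignores the destination address and connection tracking, it assumes the SIP rule accepted UDP 5060 from any source (the exact rule isn’t shown above), and the client address is a made-up example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List

@dataclass
class Rule:
    src: str       # source network in CIDR form
    proto: str     # "udp" or "tcp"
    dport_lo: int  # lowest destination port matched
    dport_hi: int  # highest destination port matched
    target: str    # "ACCEPT" or "DROP"

def verdict(rules: List[Rule], policy: str,
            src: str, proto: str, dport: int) -> str:
    """Return the target of the first matching rule, else the chain policy."""
    for r in rules:
        if (ip_address(src) in ip_network(r.src)
                and proto == r.proto
                and r.dport_lo <= dport <= r.dport_hi):
            return r.target
    return policy

# INPUT chain before the fix: policy DROP, only SIP opened (assumed rule).
input_chain = [Rule("0.0.0.0/0", "udp", 5060, 5060, "ACCEPT")]
client = "192.168.1.130"  # hypothetical VPN client address

print(verdict(input_chain, "DROP", client, "udp", 5060))   # ACCEPT
print(verdict(input_chain, "DROP", client, "udp", 12000))  # DROP

# After appending the RTP rule from this post.
input_chain.append(Rule("192.168.1.128/25", "udp", 10000, 20000, "ACCEPT"))
print(verdict(input_chain, "DROP", client, "udp", 12000))  # ACCEPT
```

The pinhole behaviour I had expected normally comes from a stateful rule (for example one matching `-m conntrack --ctstate ESTABLISHED,RELATED`). Whether that alone would have been enough in this setup is untested.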