NAT is evil. SIP and NAT do not mix.

But we have been testing - we put half a dozen different makes of phone behind a simple NAT which just does dynamic port mapping, nothing else, and which was running a 5 minute timeout on UDP.

The first issue is detecting NAT. That is easier than it sounds, as SIP control messages all have a Via header saying what IP they are supposed to be from. If that does not agree with the IP the message actually comes from, then some mapping has happened. The reply also carries details of the IP and port from which we received the request (the received parameter on the Via, and rport where the device asked for it), so the requesting device can also tell if NAT was used if it wants.
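As a rough illustration of that check, here is a minimal Python sketch. It is not our actual code: the function names are made up for the example, it only looks at the top Via, and it assumes the Via holds an IP literal rather than a host name.

```python
import re

def parse_via_sent_by(via_value):
    """Return (host, port) from a Via header value such as
    'SIP/2.0/UDP 192.168.1.10:5060;branch=z9hG4bK776'."""
    # Layout: SIP/2.0/<transport> <host>[:<port>][;params]
    match = re.match(r"\s*SIP/2\.0/(\w+)\s+([^;\s]+)", via_value)
    if not match:
        raise ValueError("unparseable Via: %r" % via_value)
    sent_by = match.group(2)
    if sent_by.startswith("["):
        # IPv6 literal, e.g. [2001:db8::1]:5060
        host, _, rest = sent_by[1:].partition("]")
        port = int(rest[1:]) if rest.startswith(":") else 5060
    else:
        host, _, port_text = sent_by.partition(":")
        port = int(port_text) if port_text else 5060
    return host, port

def is_natted(via_value, source_addr):
    """True if the IP claimed in the Via differs from the IP the
    datagram actually arrived from. Only the IP is compared here;
    a differing port alone is not treated as NAT in this sketch."""
    via_host, _via_port = parse_via_sent_by(via_value)
    return via_host != source_addr[0]
```

So is_natted("SIP/2.0/UDP 192.168.1.10:5060;branch=z9hG4bK776", ("203.0.113.7", 40000)) says yes, there is NAT, because the phone thinks it is 192.168.1.10 but the packet turned up from 203.0.113.7.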

So, we detect NAT on a REGISTER, and record that. We log not only the stated Contact to which incoming calls would normally be sent, but also the IP and port from which the REGISTER came, which we will use instead. That way incoming calls work.
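The record we keep for a registration could look something like the sketch below. Again this is an illustration, not our real data structure: Binding, registrations and on_register are names invented for the example, and the Via host and Contact are assumed to have been parsed already.

```python
from dataclasses import dataclass

@dataclass
class Binding:
    contact_addr: tuple   # (host, port) stated in the Contact header
    source_addr: tuple    # (ip, port) the REGISTER actually came from
    natted: bool          # True if the Via IP did not match the source IP

    def target(self):
        """Where to send an incoming call for this registration."""
        return self.source_addr if self.natted else self.contact_addr

# Hypothetical in-memory registration table, keyed by address-of-record.
registrations = {}

def on_register(aor, via_host, contact_addr, source_addr):
    """Record a REGISTER, noting whether it came through NAT."""
    natted = via_host != source_addr[0]
    binding = Binding(contact_addr, source_addr, natted)
    registrations[aor] = binding
    return binding
```

The point is that the Contact is still logged, but when the registration was via NAT the INVITE for an incoming call goes to source_addr, which is the mapping the NAT has open.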

We can do the same when we get an outgoing call, an INVITE. This means we can tell that the call leg is behind NAT whichever direction the call is going.

The next trick is the RTP - the actual audio stream. If we know the call is using NAT, we wait for incoming packets and send our packets back to the IP and port from which they came. This assumes, as seems to normally be the case, that devices send from the port/IP to which they want audio to be sent. It is an assumption, but for the range of phones we tested it works.
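In code terms the idea is roughly as follows. This is a sketch with made-up names (RtpLeg and its methods), and it ignores everything else an RTP stack has to do; it just shows where we decide to send audio.

```python
class RtpLeg:
    """Tracks where to send audio for one leg of a call."""

    def __init__(self, sdp_addr, natted):
        self.sdp_addr = sdp_addr  # (ip, port) the device advertised in its SDP
        self.natted = natted      # worked out from the REGISTER or INVITE
        self.latched = None       # (ip, port) audio was last seen coming from

    def on_packet(self, source_addr):
        """Call for every RTP packet received on this leg."""
        if self.natted:
            # Assumes the device sends from the port it wants to receive on.
            self.latched = source_addr

    def send_target(self):
        """Address to send audio to, or None if we have to wait."""
        if not self.natted:
            return self.sdp_addr
        return self.latched
```

For a NAT leg send_target() is None until the first packet arrives, so we send nothing until the phone has sent us some audio and opened the mapping.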

The final trick, knowing a REGISTER was NAT, is to send a dummy packet every minute to keep the NAT mapping alive. The RFC recommendation for UDP timeouts is a minimum of 2 minutes for a high numbered port like 5060, so once a minute should be comfortably inside that. Some phones, on detecting NAT, also do this, or send an OPTIONS message.
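A keepalive loop along those lines might look like this sketch. The names are invented for the example, and the payload of a bare double CRLF is my assumption here, not something specified above; any small packet the phone will ignore should do. The important detail is that it is sent from the same socket as the SIP traffic, so it goes through the same NAT mapping.

```python
import time

KEEPALIVE_INTERVAL = 60  # seconds, i.e. the "every minute" above

def due_for_keepalive(last_sent, now, interval=KEEPALIVE_INTERVAL):
    """last_sent maps (ip, port) -> time the last keepalive was sent.
    Returns the addresses that are due another one."""
    return [addr for addr, sent in last_sent.items()
            if now - sent >= interval]

def send_keepalives(sock, last_sent, now=None, payload=b"\r\n\r\n"):
    """Send a dummy packet to every NAT registration that is due.
    sock should be the UDP socket the SIP traffic itself uses."""
    if now is None:
        now = time.monotonic()
    for addr in due_for_keepalive(last_sent, now):
        sock.sendto(payload, addr)
        last_sent[addr] = now
```

You would call send_keepalives() from whatever timer or main loop the server already has, with last_sent holding only the registrations that were detected as NAT.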

Of course, there is a lot that can go wrong. NAT is always a pain in the arse.
  • The phone could detect it is behind NAT and decide to try and work around it in some way, messing things up
  • The phone could send audio from an IP/port other than the one on which it expects to receive audio, messing up our assumptions
  • The NAT gateway could try and do some ALG rewrite on the SIP, and not get it right, which seems to happen on some devices
  • The NAT gateway could have a low timeout of less than a minute, dropping the registration session
  • The REGISTER Contact could be some other IP and port than those from which the REGISTER is sent, which would be odd, but is valid.
There is another case, which seems to be what the Technicolor router does: a proper ALG that manages to handle the SIP and RTP well enough that neither side actually knows there is any NAT.

So, a Home::1 customer with a Technicolor can use a VoIP phone without any real problems, or so it seems. We have one set up and it seems to work rather well! Of course, if you can find an IPv6 SIP phone (and we have a couple) then that can work without NAT.

It is, as ever, a huge load of workarounds that would not be needed if things followed the basic design principles of IP, something that is easy for IPv6 but not so easy with a lack of IPv4 addresses.

And no, I have no idea how well any of this would work via CGN.
