Wednesday, 11 November 2015

Why the Internet is not the same as the telephone - a lesson for MPs

We all have some grasp of how a telephone works, I think.

The telephone can be relatively simple - remember the old ones with mechanical switches and a dial - this is not complex technology.

When you use the telephone and dial a number, the telephone company has equipment that works out where the call has to go.
  • It works out routing for the call across the network
  • It connects all of the way end to end
  • It ensures the data (voice) gets from one end to the other intact, in order, and reliably
  • It keeps that connection in place
  • At the end it dismantles that connection
  • It makes a record of the connection for billing purposes
I hope that all makes sense.
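
To make that concrete, here is a rough sketch of the idea in Python. It is a toy model and not how any real exchange is built: the `Exchange` class, the `CallRecord` fields and the "alice" and "bob" names are all made up for illustration. The point is only that the network itself holds the state of every call, so producing a record at the end is trivial.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    """Hypothetical billing record written when a call ends."""
    caller: str
    callee: str
    started: float
    ended: float


class Exchange:
    """Toy telephone exchange: the network holds all of the call state."""

    def __init__(self):
        self.active = {}   # (caller, callee) -> time the call was set up
        self.records = []  # billing log, one entry per completed call

    def dial(self, caller, callee, now):
        # The network sets up the connection and remembers it.
        self.active[(caller, callee)] = now

    def hang_up(self, caller, callee, now):
        # The network dismantles the connection and logs it for billing.
        started = self.active.pop((caller, callee))
        record = CallRecord(caller, callee, started, now)
        self.records.append(record)
        return record


exchange = Exchange()
exchange.dial("alice", "bob", now=0.0)
record = exchange.hang_up("alice", "bob", now=95.0)
print(record)
```

The record falls out of the design for free, because the exchange already had to know when the call started and ended in order to do its job.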

Now for the Internet and why it is different.

First off, yes, things using the Internet typically still make a connection end to end to another device over the network and have the data (even voice) go reliably from one end to the other. The connection starts, continues, and ends in much the same way as a telephone call.

But there is a HUGE DIFFERENCE in the way this is done. In the telephone network all of the clever stuff is in the telephone network. The phone company does all the work to establish and maintain and then dismantle the connection. The Internet works very differently: the end devices turn that connection into tiny packets of data and send those via the network.

The network operator does not see any connections at all - they see packets. Indeed, making a connection is just one of many ways for end devices to communicate over the Internet. It is possible to send packets with no reply, send packets that get one packet reply, or send packets that even go to more than one place at once. All of the clever stuff is done in the end devices, and only there is the concept of a connection seen or possibly logged in some way.
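
As a small illustration of "packets with no connection", here is a sketch using Python's standard `socket` module to send a single UDP datagram over the loopback interface. The payload and the use of 127.0.0.1 are just assumptions to keep the example self-contained. Nothing is set up beforehand and nothing is torn down afterwards: one packet leaves, one packet arrives.

```python
import socket

# Receiver: bind to any free port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # do not wait forever if the packet is lost
address = receiver.getsockname()

# Sender: no handshake and no connection, just one packet sent to an address.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", address)

# The receiver gets the packet and learns where it came from.
data, source = receiver.recvfrom(1024)
print(data, "from", source)

sender.close()
receiver.close()
```

Any notion of a conversation between the two ends would have to be built by the programs themselves on top of packets like this one.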

This is why the Internet does not have any sort of Internet Connection Record in the same way as telephone systems have a Call Data Record, because the network operator does not see connections! There don't even have to be connections in any logical sense and many protocols work without a logical connection being made. The network operator looks at one thing only - the destination of each packet, so it can get the packet one step closer to that destination. Individual routers and network operating companies do not even need to know the route the packet will take, just the next step to which they have to send a packet to get it one step closer.
  • Telephone: The intelligence and "connection" logic exists in the network, so it is easy to log
  • Internet: The intelligence and "connection" logic exists in end devices and not the network
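
The "one step closer" idea can be sketched in a few lines of Python with the standard `ipaddress` module. This is a simplified model, not a description of any particular router: the forwarding table, the next-hop names and the addresses (taken from the ranges reserved for documentation) are all made up. The function looks at the destination of one packet, picks the most specific matching prefix, and returns the next hop. It keeps no memory of earlier packets.

```python
import ipaddress

# Hypothetical forwarding table: destination prefix -> next hop.
FORWARDING_TABLE = {
    ipaddress.ip_network("0.0.0.0/0"): "upstream-transit",  # default route
    ipaddress.ip_network("192.0.2.0/24"): "customer-a",
    ipaddress.ip_network("198.51.100.0/24"): "peer-b",
}


def next_hop(destination):
    """Pick the next hop for one packet using longest-prefix match."""
    address = ipaddress.ip_address(destination)
    matches = [net for net in FORWARDING_TABLE if address in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return FORWARDING_TABLE[best]


for destination in ("192.0.2.7", "198.51.100.20", "203.0.113.5"):
    print(destination, "->", next_hop(destination))
# 192.0.2.7 -> customer-a
# 198.51.100.20 -> peer-b
# 203.0.113.5 -> upstream-transit
```

Nothing in there knows or cares whether a packet is part of a connection, which is the whole point.
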
So the idea of getting Internet communications operators to log and retain Internet Connection Records is total nonsense.


  1. So that's why you've been eating pot noodles ;)

  2. The other big thing is that with HTTPS sites in particular even if you do connection tracking (which is still not conclusive, but for a typical TCP session does tend to work, albeit with a cost in terms of router overhead etc), all you know is the destination IP address.

    To work out the domain name visited (which is the equivalent of the telephone number dialled, as you very rarely directly enter an IP) you would have to track DNS requests as well, and correlating those is never going to be 100% accurate.

    HTTP by default is the same, but by doing DPI / intercepting proxies it can be worked around so you know what site the user is trying to reach (not that I'm saying it should be done routinely though!)

    1. With SNI, you can see the domain name visited as this is not encrypted. I think browsers now include this information whether the server uses SNI or not. Even if the server didn't use SNI, it would therefore only be hosting one SSL-enabled domain so there would be no question of which it was. The domain name alone often doesn't tell you all that much though.

    2. With hindsight in particular, I think it's a great shame SNI leaves that loophole: better if it sent a hash - better still a salted hash, perhaps with some server-supplied salt to allow efficient caching on that end - of the desired hostname.

      Like the recent post about leverage, there's blackmail potential if someone - an evil ISP, a rogue employee, a nosy colleague with firewall logs - can just skim logs and notice that you've been visiting a gay dating website, a recruitment site or a support site for some embarrassing medical condition. Hit a CDN looking for "the website with hash 0x34134f39ef9303bc3939" and they've pretty much hit a dead end.

      Let's hope that post-Snowden, privacy will prove a much higher priority in protocol design generally!

    3. TLS 1.3 aims to encrypt SNI and ALPN information in the TLS setup phase.

      Not a panacea, but it makes passive snooping harder.

  3. ISTM that the ICR is an artifact defined by the bill to do what 'is intended to be done' and to get around the problems with things like CG-NAT - whether it makes technical sense is entirely secondary.