Monday, 27 December 2010

Everything you wanted to know about PPPoE but were afraid to ask

PPPoE is a simple concept allowing PPP (point to point protocol) packets to be carried over Ethernet (normal local area networks).

The RFC (RFC 2516) is refreshingly small, and is largely concerned with how a device (client) discovers and connects to an access controller on the network. Once you have a connection to an access controller, the rest is PPP, which has its own protocols to negotiate IP addresses and carry packets.
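
To give a feel for how small the discovery side is, here is a rough Python sketch of the 6 byte PPPoE header and the discovery packet codes from RFC 2516. The helper names (pack_header, unpack_header) are made up for illustration and are not from any FireBrick code; a real client would also need a raw Ethernet socket and the PADI/PADO/PADR/PADS exchange logic.

```python
import struct

ETH_P_PPPOE_DISCOVERY = 0x8863  # EtherType used while finding an access controller
ETH_P_PPPOE_SESSION = 0x8864    # EtherType used once the PPP session is up

# Discovery packet codes defined in RFC 2516
CODES = {0x09: "PADI", 0x07: "PADO", 0x19: "PADR", 0x65: "PADS", 0xA7: "PADT"}


def pack_header(code, session_id, payload):
    """Prefix payload with the PPPoE header (hypothetical helper)."""
    # Version 1 and type 1 share the first byte (0x11),
    # followed by CODE, SESSION_ID and LENGTH in network byte order.
    return struct.pack("!BBHH", 0x11, code, session_id, len(payload)) + payload


def unpack_header(frame):
    """Return (code name, session id, payload) for a PPPoE packet."""
    ver_type, code, session_id, length = struct.unpack("!BBHH", frame[:6])
    if ver_type != 0x11:
        raise ValueError("not a PPPoE version 1, type 1 packet")
    return CODES.get(code, hex(code)), session_id, frame[6:6 + length]


# A PADI carrying one empty Service-Name tag (type 0x0101, length 0),
# meaning "any service". Session ID is zero during discovery.
padi = pack_header(0x09, 0, b"\x01\x01\x00\x00")
```

The client broadcasts the PADI, picks one of the PADO offers, sends a PADR to that access controller, and gets a session ID back in the PADS. Everything after that goes out with the session EtherType and is just PPP.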

PPP itself dates back to the good old days of dialup modems, but is still used today for broadband lines and even high speed fibre to the cabinet and fibre to the premises lines.

The key thing PPPoE does is separate the modem (which converts signals on the line itself) from the router (which decides what to do with IP packets). There are a couple of good reasons to do this. (a) It makes for a good demarcation point for a telco allowing generic termination equipment (the modem) to be part of the service whilst providing choice of actual router, and (b) modem/router manufacturers are notoriously bad at making routers that are any good at routing (note lack of IPv6 support as a good example) and you usually want to have a decent router/firewall from someone that can make routers (like the FireBrick, of course :-) ).

As you probably saw, I wrote the FireBrick PPPoE client on Friday morning, and was well pleased with myself having tested on a Vigor V120 PPPoE/A modem on a BT line. I then spent most of this morning trying to get it working with BT lines using a Zyxel in bridge mode. It is working now with the Zyxel in bridge mode to BT and Be, as well as with the Vigor. Next to test is FTTC and FTTP BT lines.

Whilst the RFC for PPPoE is not bad, there are a few issues:-
  1. PPPoE limits the MTU to 1492 as 8 bytes are used for PPPoE and PPP headers (6 for the PPPoE header and 2 for the PPP protocol ID). Fortunately there is a later RFC (RFC 4638) allowing negotiation of baby jumbo frames (dumbo frames?) to handle a full 1500 byte MTU. Unfortunately I have yet to find a router that supports it, even though their Ethernet chipsets can probably do the larger frame. Fortunately BT FTTC and FTTP do support it, apparently.
  2. PPPoE has a range of extensible tagged parameters, but they missed a trick by failing to define a few simple ones such as telephone number for dialup or VPI/VCI/encap mode for DSL. Having these would mean modems need no configuration at all and so would not need DHCP, IP and web interfaces - having all parameters using the PPPoE tags would be perfect and should have been encouraged in the original PPPoE spec. Other obvious status parameters, like tx speed and rx speed and so on, in the response from the modem would have been a simple addition. These could have been defined as optional tagged values in the original spec and saved everyone a lot of time.
  3. PPPoE allows for a relay device. This makes perfect sense for a DSL router to relay PPPoE either to PPPoA as raw PPP, or to a remote PPPoE device on the wire whilst appearing as only one device on the local network. This is how it should be done. Sadly it seems almost all routers that do PPPoE work in a bridge mode - bridging the LAN to the far end of the DSL line. This causes serious problems. For a start you have no way to direct traffic to a specific line via a specific router/bridge, if you have more than one, as you only see the far end bridged Ethernet MAC addresses. You also have no way to tell this has happened. You also have to run a separate LAN segment even if you only have one router/bridge as the broadcast traffic on your LAN is bridged and can trip MAC address limits on the DSL service. In short, each PPPoE router/bridge has to be on its own LAN segment which is a pain, and a shame as the spec allowed them to act as relays!
  4. Finally, bugs... It seems our favourite telco do not follow the RFC. There is an "end" tag (End-Of-List, type 0x0000), which you can put at the end of the list of tags. It is not required but remains for backwards compatibility. So I dutifully included it, and all was well. Vigor happy. Be happy. Could not get it working with our favourite telco. Turns out if you include this completely valid tag then our favourite telco just totally ignore your PADI packets. WTF! RTFRFC guys!

So, the new FireBricks now do PPPoE, including negotiating IPv6, including baby jumbo frames, and including multiple links on separate ports with bonding. They even provide loss/latency graphs for each line from the client end.
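
For anyone wanting to see points 1 and 4 in bytes, here is a rough Python sketch of the MTU arithmetic and the tag list for a PADI. It is not the FireBrick code. The function names and the end_of_list switch are invented for illustration; the tag type values are the ones from RFC 2516, plus PPP-Max-Payload from RFC 4638.

```python
import struct

# Tag types: RFC 2516, plus PPP-Max-Payload from RFC 4638
TAG_END_OF_LIST = 0x0000
TAG_SERVICE_NAME = 0x0101
TAG_HOST_UNIQ = 0x0103
TAG_PPP_MAX_PAYLOAD = 0x0120

PPPOE_HEADER = 6      # ver/type, code, session ID, length
PPP_PROTOCOL_ID = 2   # e.g. 0x0021 for IPv4


def ppp_mtu(ethernet_payload=1500):
    """Largest PPP payload that fits in one Ethernet frame."""
    return ethernet_payload - PPPOE_HEADER - PPP_PROTOCOL_ID


def encode_tag(tag_type, value=b""):
    """One tag: 2 byte type, 2 byte length, then the value."""
    return struct.pack("!HH", tag_type, len(value)) + value


def padi_tags(host_uniq, max_payload=None, end_of_list=False):
    """Tag list for a PADI (hypothetical helper)."""
    tags = encode_tag(TAG_SERVICE_NAME)           # empty means any service
    tags += encode_tag(TAG_HOST_UNIQ, host_uniq)  # echoed back in the PADO
    if max_payload is not None and max_payload > 1492:
        # Ask for baby jumbo frames; only worth sending above 1492
        tags += encode_tag(TAG_PPP_MAX_PAYLOAD, struct.pack("!H", max_payload))
    if end_of_list:
        # Valid but optional, and off by default here given point 4 above
        tags += encode_tag(TAG_END_OF_LIST)
    return tags


print(ppp_mtu())       # normal 1500 byte Ethernet payload
print(ppp_mtu(1508))   # baby jumbo frame
print(padi_tags(b"fb", max_payload=1500).hex())
```

With a normal 1500 byte Ethernet payload that leaves 1492 for PPP, and with a 1508 byte baby jumbo frame you get the full 1500 back. Leaving the end tag off costs nothing, as the length field in the PPPoE header already says where the tags stop.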

There is much more code still to do though...

7 comments:

  1. Will this allow you to change the MTU between the Firebrick and Vigor 120? http://www.draytek.co.uk/support/kb_vigor_mtu.html. I was looking the other day when I was working out whether I can replace my dismal router with a Firebrick, working with my Vigor.

  2. There was some IRC discussion on #A&A-Asterisk about multiline bonding with a FB2700, for cases where you only have access to slow ADSL. As the FB2700 only has 4 or so ports, we were wondering if it would work with PPPoE inside 802.1q tagged VLANs?

    If so, we could solve the (theoretical only) multiline bonding issue with a cheap web-managed 10/100 switch - put each ADSL router on a switch port, give each port a tag, put the FB2700 on one port, and use 802.1q VLANs to let us only use one physical port for 4 or more ADSL modems.

  3. Second question from the same source; what QoS support is the FB2700 going to have? Will it support queue selection based on DSCP values? Will it be able to tag identified traffic with new DSCP values?

  4. Yes, of course it does - at present it allows up to 10 PPPoE links but could do more. Obviously packet reordering starts to be an issue if you have too many.

  5. And second question... Not yet, but we may add queue selection. At present it has the FB6000 style "small packets get priority" logic.

  6. You mention reordering. We currently use a bonding system which delays releasing packets so that it can release them in order. In itself it does not resend packets; if it detects that a packet has been lost, it simply releases all of those currently in the queue, but even then the packets are released in order.

    Of course the same problems that cause out of order packets when bonding also cause jitter in this system, but it does solve the major problems that made >2 line bonding useless for us.

    The software profiles each line in turn when they come up (and when sent SIGUSR) so that it can calculate the optimum loading ratio for the lines. We then set up a TBF at each end at 98% of the calculated total bandwidth, which tends to keep packet loss to a minimum, and the entire system works extremely well.

    Our 6 lines give us 16.4 Mbit/s total and a single FTP download is currently running at 1.98 MB/s.

    If you were to add such a feature I might be interested in a pair of FB2700s, or just one if you can terminate the tunnel on your FB7000.

    doc

  7. Yes, many ways to do this when using tunneling. The FB105 tunnels have some means to do this too (not yet in the current FB2700 build). Of course the protocols using IP should not be assuming they will come in order anyway, and usually it is not a problem unless you are doing more than 4 lines.
