Tuesday, 5 November 2013

BT Huawei FTTC modem bug breaking VPNs

We have confirmed that the latest code in the BT FTTC modems appears to have a serious bug that is affecting almost anyone running any sort of VPN over FTTC.

Existing modems seem to be upgrading, presumably due to a rollout of new code within BT. An older modem that has not been on-line for a while is fine. A re-flashed modem with non-BT firmware is fine. A working modem that had been on the line for a while suddenly stopped working, presumably because it was upgraded.

The bug appears to be that the modem manages to "blacklist" some UDP packets after a PPP restart.

If we send a number of UDP packets, using various UDP ports, then cause PPP to drop and reconnect, we then find that around 254 combinations of UDP IP/ports are now blacklisted. I.e. they no longer get sent on the line. Other packets are fine.

When we send 500 different packets, around 254 of them never work again after the PPP restart. It is not simply the first or last 254 packets sent, as some are in the middle, but it does seem to be 254 combinations. They work as much as you like before the PPP restart, and then never work after it.

We can send a batch of packets, wait 5 minutes, PPP restart, and still find that packets are now blacklisted. We have tried a wide range of ports, high and low, different src and dst ports, and so on - they are all affected.
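A minimal sketch of how a test like this could be scripted in Python. It is an illustration only, not the scripts actually used: the port ranges, the function names, and the idea of a far-end listener that records which (source port, destination port) pairs arrive are all assumptions.

```python
import socket


def make_flows(count, first_src=40000, first_dst=50000):
    """Return `count` distinct (src_port, dst_port) pairs to probe.

    The starting port numbers are arbitrary illustrative choices.
    """
    return [(first_src + i, first_dst + i) for i in range(count)]


def send_probes(dst_ip, flows, payload=b"probe"):
    """Send one UDP datagram per flow, each from its own source port."""
    for src_port, dst_port in flows:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.bind(("", src_port))          # fix the source port for this flow
            sock.sendto(payload, (dst_ip, dst_port))


def newly_blocked(flows, seen_before, seen_after):
    """Flows the far end saw before the PPP restart but not after it.

    `seen_before` and `seen_after` are sets of (src_port, dst_port) pairs
    reported by the (assumed) listener at the far end.
    """
    return [f for f in flows if f in seen_before and f not in seen_after]
```

The idea would be to call `send_probes` with the same flows before and after the PPP restart, collect what the far end received each time, and pass both sets to `newly_blocked`. If anything on the path rewrites source ports (NAT, for example), the flow would need to be identified in the payload instead.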

The only way to "fix" it is to disconnect the Ethernet port on the modem and reconnect it. This does not even have to be long enough to drop PPP. Then it is fine until the next PPP restart. And yes, we have been running a load of scripts to systematically test this and reproduce the fault.
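For illustration, one reproduction cycle could be structured roughly as below. This is a hedged sketch rather than the real test scripts: `probe`, `restart_ppp` and `bounce_ethernet` are hypothetical callables that the tester would have to supply for their own equipment (for example, a function that sends UDP packets and reports which ones reached the far end).

```python
def run_cycle(flows, probe, restart_ppp, bounce_ethernet):
    """Run one test cycle and report which flows stop working.

    flows           -- list of hashable flow identifiers, e.g. (src_port, dst_port)
    probe           -- hypothetical callable: takes the flows, returns those delivered
    restart_ppp     -- hypothetical callable: drops and re-establishes PPP
    bounce_ethernet -- hypothetical callable: unplugs/replugs the modem's Ethernet port
    """
    baseline = set(probe(flows))         # before the restart, all flows should pass
    restart_ppp()
    after_restart = set(probe(flows))    # some flows are expected to vanish here
    bounce_ethernet()
    after_bounce = set(probe(flows))     # the bounce should clear the blacklist
    return {
        "blocked_after_restart": sorted(baseline - after_restart),
        "still_blocked_after_bounce": sorted(baseline - after_bounce),
    }
```

On an affected modem one would expect `blocked_after_restart` to be non-empty and `still_blocked_after_bounce` to be empty. On an unaffected modem both lists should be empty.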

The problem is that a lot of VPNs use UDP and use the same set of ports for all of the packets, so if that combination is blacklisted by the modem the VPN stops after a PPP restart. The only way to fix it is manual intervention.

The modem is meant to be an Ethernet bridge. It should not know anything about PPP restarting or UDP packets and ports. It makes no sense that it would do this. We have tested swapping working and broken modems back and forth. We have tested with a variety of different equipment doing PPPoE and IP behind the modem.

BT are working on this, but it is a serious concern that this is being rolled out.

24 comments:

  1. Modems being loaded with government mandated code to spy on VPNs, perhaps? ;-)

    Replies
    1. I am not sure who wins on that; yours came 23 minutes before the NSA or some such was mentioned. Well done.

    2. You're welcome. At least my comment was clearly tongue-in-cheek!

    3. Which government I wonder? I would have scoffed at this idea in the past. Now, hmm not so sure.

    4. I know, and I favour incompetence over conspiracy every time.

    5. Don't be so quick to rule out the middle option, incompetent conspiracy: the firmware developer was told "send VPN packets to GCHQ", but misheard it as "send VPN packets to Timbuktu"...

      On the bright side, at least it's not IPv6 specific like that core router bug! Have you tried sending various other packets through to see if they're affected, or it's specific to UDP?

      254 (IP, port) tuples seems a very small fraction to be affected - less than 2^-40, hardly likely to be found by chance, and very different from the c. 50% drop rate mentioned. What is it, 254 out of 512?

    6. Paul has been doing the testing and I expect we will do more today. Testing UDP over IPv6 was one thing I wanted to try, as well as trying again for TCP as we had reports of TCP being affected. Our initial tests suggested not.

    7. Some sidechannel attack perhaps? Would it be possible to analyze the BT firmware?

  2. Query: Do you know which VPNs are having this problem?

    I'm specifically concerned about OpenVPN, but all info is useful...

    Replies
    1. Anything using UDP, and we have had reports from customers of this specifically affecting OpenVPN.

  3. Is there a way to tell if your modem has been upgraded?

    And I presume just Huawei, and not the newer LTE (?) boxes?

    Replies
    1. Well, it stops working properly. I don't know how to tell versions. It is a bridge, so normally you can't talk to it. We have not been able to test the other make yet, but I doubt it is affected.

    2. ECI are the other brand they use.
      http://beusergroup.co.uk/technotes/index.php?title=Diary_of_an_FTTC_Install#Interesting_Notes

  4. Does this affect all FTTC modem types Adrian, or just certain models?

    Replies
    1. We are assuming just the Huawei ones, which is like half of them or some such.

  5. I hope this is not going to affect me when I next work...

  6. Just had a PPP session restart followed by no internet access on devices in the house.
    A quick look on the Firebrick (which could ping the internet) showed a whole load of UDP port 53 DNS sessions were going unanswered.
    I tried some netcats to servers to confirm no UDP/53 traffic could be sent.
    Fortunately I'd only just read your blog post an hour earlier and bouncing the Ethernet port in and out almost instantly resolved the problem.

    Thanks for the info, saved me a good hour of troubleshooting.

  7. This will I guess be another of those 'modems' that's actually a router pretending to be a modem by running in 'bridge mode' and regardless of mode runs packets through layer 3 / 4 processing of some sort.

    Replies
    1. Indeed, it does have routing to allow for it to work with a management LAN in to BT and a TR069 LAN.

  8. Will this also affect GEA Ethernet provided circuits? Or does it just affect PPPoE connections through the modem?

    Replies
    1. We assume so, as it is at the modem level. It is going to be difficult to test, and I was going to ask you about this. We do not normally do any PPPoE over the GEA access, so it would mean setting that up for testing on such a circuit.

  9. Think I'll be going back to my Unlocked/Hacked one shortly. I only put the unmodified one back on when having some work done so that I didn't risk upsetting BT.

  10. I am on TalkTalk using their provided DLINK DSL-3680 (Firmware Version: v1.06t Hardware Version: A1), which seems to suffer the same problem. Using SSH over an OpenVPN UDP connection is fine until I attempt an scp/sftp or git pull. On my end I can see packets being retransmitted and can confirm they are not hitting the VPN endpoint. So I am not sure yet if it's the volume of packets or the size of packets; I need to perform further testing.

    Either it is a very strange coincidence or this modem is also suffering from the same issue. It would be good if someone else can reproduce the issue on a DSL-3680.

    (Switching to another ADSL modem with different firmware does work fine)

  11. Probably just as cheap to buy a tplink tl3040 and put purevpn openwrt on it
