Friday, 7 February 2014

BT official: 3% packet loss is not a fault!

Yes, you heard right. BT plc have confirmed, and I quote: "3% packet loss is not considered as a fault" [Krishanu Sanyal, Broadband Customer Service Team Manager - BT Wholesale].

Please do bear in mind we are talking about a random 3% packet loss on an otherwise idle line, not loss due to full queues on a router.

This is BT's flagship super fast broadband product, FTTC (known as Infinity by BT Retail).

3% packet loss is sufficient to severely break TCP connections, making them unusably slow, and to affect many other protocols such as VoIP.

Do not be fooled: we are not talking about a line working at 97% of its speed. This is packet loss. Whilst IP does not guarantee no loss, any loss on an idle line is an indication of some sort of fault. The occasional lost packet once in a while is not usually an issue, but levels like 3% are serious.

In the case of this specific customer, the LCP loss is averaging under 1% but is clearly visible on our monitoring. Testing 1k pings shows an ongoing loss between 2% and 4%, causing significant problems for the customer. BT have now sent four engineers, having suggested that a lift and shift (a move to a new port on the DSLAM) would solve the issue. Each time, the engineer has refused to do the lift and shift, refused to fix the fault by any other means, and simply left.

So, that's the story - BT's FTTC super fast broadband would appear to be officially crap!

As a BT Group plc shareholder, I am very disappointed.

Update: the fifth engineer actually did a port change (lift and shift) and the problem is the same, so we are now trying to get BT to look at the fibre backhaul, etc. But with BT insisting that 3% loss is not a fault, it is an uphill struggle to get this fixed! I may have to take more drastic action.

11 comments:

  1. This really is nuts! 1% on an idle line is enough to be troublesome... I've heard providers try to excuse 10% loss as "not a fault" before - even more ridiculous.

    You're paying for better than this - it needs to be fixed. How can 3% packet loss on an idle line not be a fault? Can they show any justification beyond the "can't be bothered" factor?

  2. Only pay 97% of all your bills - tell them the failure to pay the remaining 3% is not a fault.

  3. Take him a FireBrick or an ASA configured as a transparent layer 2 device that drops a percentage of packets that you set. Let him try to surf the web, make VoIP calls, etc. while your box drops random frames.

  4. On a serious note - I was just going to add a simulate-packet-loss option to the FireBrick and do some tests - maybe make a video with speed tests, Skype calls, etc.

  5. Using the formulae from http://www.slac.stanford.edu/comp/net/wan-mon/thru-vs-loss.html I conclude that with zero packet loss other than due to congestion, BT Infinity is only limited by my end host (as it should be).

    With 1% packet loss, I calculate a throughput cap of (1460 bytes / 10 milliseconds) * (1/sqrt(0.01)) = 11 Mbit/s. Increase packet loss to 3%, and it's down to just 7 Mbit/s. At the 10% loss mentioned by another commenter, TCP caps out at 3.6 Mbit/s.

    Are BT really claiming that you shouldn't expect Infinity to permit you to run at more than 7Mbit/s?

    Replies
    1. As ever, farnz, nice reference and actual figures. Thanks for that.

    2. That would fit nicely with my actual 5 Mbps upstream measurement (sync at 20 Mbps, but never come close to transferring at that speed). Downloads are much less badly affected: the tiny little ACK packets are less likely to get dropped.

      Openreach engineer #4 told me "ah, that'll be your ISP throttling uploads". I don't think RevK liked that suggestion - or the "maybe you should be on a business line", or something like "that's what comes of not being with BT", as if BT Retail customers get better diagnostics from Openreach.

      I'm wondering if spending 3 months on a building site - the concrete bridge that cabinet is on was partly demolished and replaced recently, with the cabinet inside the dustsheets - might have affected it. Probably not good for electronics, and the cabinet's well ventilated, seeming to rely on gravity to stop rain getting in - no use against conductive, corrosive dust. Maybe just coincidence, but hopefully BT will check that out soon.

      To see BT put in writing "our packet transport system failing to transport packets is not considered a fault" is quite stunning really. Would BT be happy with Royal Mail if 3% of their outgoing invoices went missing each month?!

    3. > as if BT Retail customers get better diagnostics from Openreach.

      Well, don't forget that BT Retail still shares many of the same backend systems as Openreach. They CLAIM they don't have access, but it's funny how BT Retail has been able to look up details of our orders and request specific engineers for faults where we have no ability to.

  6. Keep us posted on this. Am looking at options to migrate to FTTC, and my current ADSL provider is pulling their hair out over what BT are up to with the back end systems. :(

    Replies
    1. We are good at this, honest - you only see some of the worst examples on my blog when I need to escalate to the public to get BT to listen. A lot of what we do every day helps lots of ISPs.

    2. I have been impressed with A&A's handling, certainly; indeed, on IRC another user had a similar line problem at the same time as mine, but on ADSL with occasional loss of sync as well as packet loss - because of the sync loss aspect, BT changed his port without weeks of argument and repeat visits, and that fixed his problem straight away.

      BT's handling, though, could perhaps generously be described as "braindead". In 2008, they dispatched engineers to a customer's home on several occasions to "investigate" a core network fault (a known packet corruption bug in Cisco router software), claiming each time to have fixed it without even testing for the fault, even lying about work done (claiming to have replaced a section of the line - without the connection being interrupted at all). Almost six years later, they haven't improved: five visits now, one with no notice, with no effort on their part to investigate the actual fault. Even now, the last fault-related communication showing from BT Wholesale is urging A&A to book yet another "engineer visit", as if that might achieve something the previous five didn't.

      Most absurd of all, Openreach said several times that BT Wholesale were still failing even to order the procedure properly: they have a job-ordering option for "do a lift and shift at the cabinet", but on all five occasions had actually selected "fault investigation at user's home" on the form - fair enough the first time, but the other four, when they knew lift+shift was the next step and were being told by both sides to order one?

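The throughput figures quoted in comment 5 can be checked with a short Python sketch. It uses the same approximation given there, (MSS / RTT) * (1 / sqrt(loss)), which is commonly attributed to Mathis et al., with the 1460 byte segment size and 10 millisecond round trip time taken from that comment. The function name `mathis_throughput_bps` is made up for this illustration, and the result is only a rough ceiling on a single TCP connection, not a measurement of any real line.

```python
import math


def mathis_throughput_bps(mss_bytes: float, rtt_seconds: float, loss_rate: float) -> float:
    """Rough TCP throughput ceiling in bits per second.

    Implements (MSS / RTT) * (1 / sqrt(p)), the approximation quoted in
    comment 5. The function name is hypothetical, chosen for this sketch.
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be greater than 0 and at most 1")
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    # Convert bytes to bits, divide by the round trip time, then scale by loss.
    return (mss_bytes * 8 / rtt_seconds) / math.sqrt(loss_rate)


# Values taken from comment 5: 1460 byte segments and a 10 ms round trip.
MSS_BYTES = 1460
RTT_SECONDS = 0.010

for loss in (0.01, 0.03, 0.10):
    mbit = mathis_throughput_bps(MSS_BYTES, RTT_SECONDS, loss) / 1e6
    print(f"{loss:.0%} loss -> about {mbit:.1f} Mbit/s")
```

With these inputs the sketch gives roughly 11.7, 6.7 and 3.7 Mbit/s for 1%, 3% and 10% loss, in line with the 11, 7 and 3.6 Mbit/s quoted in the comment once rounding is allowed for. A longer round trip time would lower all three figures in proportion.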