Monday, 24 February 2014

BT 21CN not fit for purpose?

Update: The BT Wholesale Incident Helpdesk manager has stated:
Please rest assured that BT Wholesale are fully committed to ensuring that all capacity issues are addressed.

I posted about this some years ago when BT had recently started their 21CN network. They had congestion in parts of their network and were refusing to believe it, let alone fix it.

Eventually, after the press got hold of my blog post, BT took action, and our monitoring graphs were used internally within BT where BT Operate staff were able to track down a series of errors and mis-configurations that were causing the congestion.

Following that (as we understand it) BT set up a new department to pro-actively manage the various links and perform necessary upgrades. Occasionally there are still issues, usually a mis-configuration or a fault and hence something BT do not pick up themselves. These usually get fixed once we report them.

Unfortunately there must have been a change of staff or attitudes in BT, and once again we see core links within BT's network showing congestion which BT are basically refusing to address.

Importantly these congestion issues are present even when paying extra for "business grade" premium services (elevated weighting) on the broadband tails.

We recently had a statement from BT that "3% packet loss is not considered a fault", which is outrageous. Despite this being published on ISPreview, they have not refuted that statement. This was 3% loss on an idle line, all the time (which we believe we have tracked down to a fault on a core link within BT), but the suggestion that 3% idle line loss is not a fault is totally crazy.

Now it seems that serious back-haul congestion is also acceptable to BT. They insist on SFI engineers (who cannot test packet loss anyway) being sent during the day (when the packet loss is not present).

The good news is that we don't see this on TalkTalk back-haul, and we, like you, have a choice. If we have to move all customers in a congested area to TalkTalk back-haul to fix this, we will. I am sure other ISPs can do the same.

What is even odder is the comment from a senior member of BT plc staff, Neil McRae, that it was "rubbish" and "there are no BT exchanges at all that have any congestion". I pointed out that we have clear evidence of exchanges where all 21CN lines show packet loss, i.e. "Every line we have on Hampton exchange shows signs of congestion". At the suggestion it was the SVLAN not the exchange I said "If you say it not the exchange it is the SVLAN, then that is the same thing - if the pipe to the exchange is getting full, then that is congestion at the exchange". I was told "no" and "but if you want to be stupid then I won't argue with you". These are the comments from someone who describes himself as Chief Network Architect at BT.

We know there are exchanges showing serious signs of congestion, such as Coventry, Hainault, Southwark, Canonbury, Loughton, where 21CN lines are showing loss in the evenings.
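
The distinction drawn above, between loss that is present all the time on an idle line (which points to a fault) and loss that only appears in the evenings (which points to congestion), can be sketched in code. This is only an illustration: the sample format of `(hour, loss_percent)` pairs, the 18:00 to 23:59 evening window and the 1% threshold are all assumptions made for the example, not how our monitoring actually works.

```python
from statistics import mean

# Assumed evening peak window (18:00 to 23:59); purely illustrative.
EVENING_HOURS = range(18, 24)


def classify_loss(samples, threshold=1.0):
    """Classify a line from (hour, loss_percent) samples.

    `threshold` is a hypothetical cut-off in percent, not a BT or ISP figure.
    """
    evening = [loss for hour, loss in samples if hour in EVENING_HOURS]
    daytime = [loss for hour, loss in samples if hour not in EVENING_HOURS]
    evening_loss = mean(evening) if evening else 0.0
    daytime_loss = mean(daytime) if daytime else 0.0

    if evening_loss >= threshold and daytime_loss >= threshold:
        # Loss regardless of time of day looks like a fault, not load.
        return "constant loss"
    if evening_loss >= threshold:
        # Loss only at peak time is the signature of congestion.
        return "evening loss"
    if daytime_loss >= threshold:
        return "daytime loss"
    return "no significant loss"
```

A line with 3% loss in every hour would come out as "constant loss", while a line that is clean during the day and lossy after 18:00 would come out as "evening loss", which is also why an engineer visit during the day would see nothing.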

We have, of course, confirmed that this is a real and general problem by checking with other ISPs, who are also chasing BT.

Update: For one area BT have stated: "The issue that is currently causing packet loss and slow speeds to end users is with the backhaul links being over utilised. In relation to over utilisation, we are talking about ports or lag groups trying to send more than 100% of their capacity. When the buffer on these ports or lags fills up, it will cause packets to be dropped." confirming BT have congestion in their core network.
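
To make BT's description concrete, here is a minimal sketch of a port with a finite buffer being offered more traffic than it can send. It is a toy model under stated assumptions: time is divided into fixed ticks, the link drains a fixed number of packets per tick, and anything that does not fit in the buffer at the end of a tick is dropped. The function name and all the numbers are made up for illustration and say nothing about BT's real equipment.

```python
def simulate_tail_drop(offered_per_tick, capacity_per_tick, buffer_size):
    """Toy model of an over-utilised port.

    Returns (sent, dropped, still_queued) packet counts.
    All parameters are hypothetical illustration values.
    """
    queued = 0
    sent = 0
    dropped = 0
    for arrivals in offered_per_tick:
        # New packets join whatever is already waiting.
        queued += arrivals
        # The link can only send its capacity in one tick.
        out = min(queued, capacity_per_tick)
        queued -= out
        sent += out
        # Whatever does not fit in the buffer is dropped.
        overflow = max(0, queued - buffer_size)
        dropped += overflow
        queued -= overflow
    return sent, dropped, queued


# 110% offered load for 10 ticks on a link that sends 100 per tick
# with room to buffer 50 packets.
print(simulate_tail_drop([110] * 10, 100, 50))  # (1000, 50, 50)
```

For the first five ticks the buffer absorbs the excess and nothing is lost; once it is full, every extra packet is dropped. At 90% load the same port drops nothing, which matches lines that look fine during the day and show loss in the evenings.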

The question from me:
This is a really simple question for BT plc: Are BT going to formally commit to addressing congestion faults, or do we (and other ISPs) need to start moving people to an uncongested network like TalkTalk wholesale instead?
[Graph: example of a line on Canonbury exchange over the last week]

18 comments:

  1. I presume that "the senior member of BT plc staff, Neil McRae" is not an engineer or tech.

    If someone (particularly another engineer type) comes to you and says "There is a problem, and I have evidence of the problem. Here is the evidence".

    The *right* response is to look at the evidence and either refute their assertion with a logical reason why it is not the case (perhaps your own measurements do not take into account something that they know) and also back it up with your own evidence when possible (Well, here are *our* metrics, which take into account X & Y).

    OR

    You listen to what you are being told and investigate!

    You do not say "Don't be silly". That's pretty much an abusive ad hominem response (You don't know what you are talking about, shut up!)

    1. This was at LINX on the IRC, i.e. in front of 150 people who are UK ISPs.

    2. https://uk.linkedin.com/in/neilmcrae

    3. Sadly this sort of response is endemic within BT plc. "We are BT! We can do no wrong! Bow down before us and worship our mighty telecoms skillz!"

      I remember years ago when broadband was first launched, attempting to use the XML back end for ordering and faults and finding serious issues with it (like at times being told orders had failed when they hadn't and such). Eventually I got to speak to one of the engineers behind the software who simply told me I was wrong and the system was perfect. So perfect that he even admitted that we were one of the few attempting to use it, and most other ISPs used screen-scraping software on the shitty Siebel-based front end. He literally could not understand the contradiction in what he had just said and continued to insist the system was perfect and would not even investigate the issues.

    4. Indeed - it was even more fun when you have an & in your company name.

    5. I'm guessing that broke things? I've only seen 1 broken file on any of my employers' systems (banks etc., where I was an office worker with no technical involvement), which was caused by a web form allowing numbers in names. Only 1 or so customers ever made that typo, but when they did, I doubt anyone knew how to fix it. :P

  2. I wonder if "not congested" has the same kind of reverse logic ISP-specific definition that "unlimited" seems to have.

  3. To be fair to Neil, what he said was that there were no exchanges where BT did not have sufficient 'capacity', and that what you're seeing is configured PVCs being full, and that BT would need to expand those PVCs (or add additional ones) to make use of spare capacity into the exchange.

    We are, however, arguing semantics. As far as you, the service provider, or your customers, the end users, are concerned, there is congestion into the exchange.

    1. If you have a log of the IRC, can you email me? He was pretty clear, even when I said it was full SVLANs (which I think is the term they use for 21CN rather than PVCs).

      But yes, I did try and make clear - whatever he called it - the effect is congested exchanges :-)

    2. The statement we have from BT now states that actual physical links and legs of LACs are at capacity and that new fibres and cards are needed to fix it. That is pretty conclusively congestion at the exchange and insufficient capacity. When challenged on this, Neil would not confirm if that was true (i.e. he had lied) or if we were now being lied to.

    3. "Neil would not confirm if that was true (i.e. he had lied) or if we were now being lied to"

      Perhaps 'clueless' would be a suitable word?

      In many organisations it would be plausible that "chiefs of" were not fully aware of the real situation outside HQ. Whose fault that is, is a different discussion.

    4. Well, yes, and if Neil had said he did not know and he'd look into it then that would be fine. He was, instead, adamant that there is no congestion and that I was stupid for saying there was.

  4. Is there likely to be an alternative wholesaler providing backhaul for FTTC in the future?

    (An alternative shared backhaul, that is, rather than the Ethernet-over-FTTC service which I drool over, but definitely can't afford...)

    1. TT will have FTTC real soon now. We have the EoFTTC service now, but I think it is really soon to do normal FTTC tails.

  5. Is it safe to assume there are no SLAs in this picture?

    Meanwhile, is this "BT chief network architect" the same one as posts as neil123 in the comments on ADSLguide's news pages? See e.g.
    http://www.thinkbroadband.com/news/6260-bt-infinity-users-having-problems-with-ubisoft-uplay-servers.html
    and search for "chief network architect".

    Plenty of other posts there from neil123 to find via your favourite search engine. Have a read, see what you think.

    1. The same who also posts news comments on ISPReview from time to time as Neil McRae. The most recent implying it's fine Openreach cut right back on their FTTP deployment and the UK barely shows on said statistics as they've vectored VDSL 2, G.Fast and G.Fast 2 in the labs / under extremely limited trials.

    2. The UK also barely shows on worldwide IPv6 statistics for the same reason: lack of investment by BT and Virgin (who account for the bulk of the UK market). They're trying to spend as little money as possible while claiming to be fast and other such nonsense.
