Comments on RevK®'s ramblings: Helping BT
Feed updated: 2024-03-18T12:28:29.902+00:00

jas88 (2015-03-07 16:05):
What surprised me in that tale was how reactive BT's fault-ignoring process was - to the extent that when you eventually managed to prod them into investigating my fault properly, they found and fixed multiple unrelated BRAS faults in the process (which were presumably affecting other lines) before eventually stumbling across and bypassing the faulty 10G card/link that was to blame.

I'd always assumed they had monitoring in place that would alert the NOC when a backbone segment was showing errors like that. Perhaps they only monitor the aggregate as a whole, so data loss on one component gets diluted below the alarm threshold?

Things seem to have gone awfully quiet since the initial flurry of network diagrams and presentations about 21CN; a Google search still shows some people panicking about BT shutting 20CN down in 2014, and a distinct lack of information from recent years! About time somebody got an update out of BT on that front.

RevK (2015-03-07 11:11):
The alternative IP was all manually adjusted before. This sort of thing happens rarely, but it is clear we needed to make it slicker. It is, however, not always obvious that it is an LAG issue. When we saw this particular fault it was all lines, but by fluke all lines were on the same LNS at the time. Had there been more lines the LNS specific effect would have been more obvious as we would have tried the alternative IPs anyway. The big concern is that BT seem to lack any monitoring or alarms for links within an LAG.

jas88 (2015-03-07 10:12):
"We have improved our system of managing alternative IP addresses now so that we are able to quickly switch addresses more easily to allow customers to get on line when we have this sort of issue."

I'm glad to see this in place, having been a customer affected (the first?) and suggested exactly that last month.

Is the BT fault-finding entirely ad-hoc, or is there a more sensible fault-handling process in place now for "that line can't communicate reliably with our LNS - this is not a line fault so do not attempt SFI"? (My best guess last time was to try raising it as a fault against the A&A end of the link rather than the end-user end.)

JohnnyD (2015-03-06 22:05):
Maybe tweet this to BT also RevK :)
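
On jas88's question about aggregate-only monitoring: the following is a minimal sketch of how a fault on one member link could be diluted below an alarm threshold. It is not a description of BT's actual monitoring. The link count, loss figures, threshold, and function names are all hypothetical, and it assumes traffic is spread evenly across the links of the LAG.

```python
def aggregate_loss(per_link_loss):
    """Loss rate across the whole LAG, assuming traffic is spread
    evenly over the member links (an assumption for illustration)."""
    return sum(per_link_loss) / len(per_link_loss)


def alarms(per_link_loss, threshold):
    """Return (aggregate_alarm, indices_of_links_over_threshold)."""
    aggregate_alarm = aggregate_loss(per_link_loss) > threshold
    per_link_alarms = [i for i, loss in enumerate(per_link_loss)
                       if loss > threshold]
    return aggregate_alarm, per_link_alarms


# Hypothetical numbers: 8 links, one of them dropping 5% of packets,
# and an alarm that fires above 1% loss.
links = [0.0] * 7 + [0.05]
threshold = 0.01

print(aggregate_loss(links))     # loss as seen by aggregate-only monitoring
print(alarms(links, threshold))  # aggregate alarm vs per-link alarms
```

With these made-up figures the aggregate loss is 0.625%, under the 1% threshold, so aggregate-only monitoring stays silent while a per-link check flags the faulty member. Traffic that happens to be hashed onto that one link still sees the full 5% loss, which would fit with an effect that is specific to one LNS address and goes away on an alternative IP.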