Well, we had one of the first real tests of the dual redundant set-up in Telehouse today - where our main broadband links, data SIMs, and half our Ethernet customers connect.
We are upgrading core network switches so we have more ports for additional links (adding TalkTalk lines, and extra peers).
As you can imagine, physically replacing a core switch could be somewhat disruptive!
The design uses an "A" and a "B" side. We configured L2TP links to all be on one side a couple of days ago. We were then able to carefully shut down BGP sessions to move all traffic off one side at a BGP level. This allowed the switch to be removed and replaced.
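As a rough sketch of the kind of check this step implies (not the actual A&A tooling; the `Session` records, peer names, and `safe_to_remove` helper are all invented for illustration), you want to confirm that nothing is still established on the side being pulled and that every peer has a working session on the other side:

```python
from dataclasses import dataclass


@dataclass
class Session:
    peer: str          # hypothetical peer or customer name
    side: str          # "A" or "B"
    established: bool  # True if the BGP session is currently up


def safe_to_remove(side, sessions):
    """Return (ok, problems) for pulling the switch on `side`."""
    other = "B" if side == "A" else "A"
    problems = []
    for peer in sorted({s.peer for s in sessions}):
        mine = [s for s in sessions if s.peer == peer]
        # Traffic must already be drained from the side being replaced.
        if any(s.side == side and s.established for s in mine):
            problems.append(f"{peer}: session still up on side {side}")
        # Every peer needs somewhere else to send its traffic.
        if not any(s.side == other and s.established for s in mine):
            problems.append(f"{peer}: no working session on side {other}")
    return (not problems, problems)
```

In real life the session state would come from the routers rather than a hand-built list, but the logic is the same: one step at a time, and check before you pull anything.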
It is always tricky picking when to do stuff like this. Middle of the night is all very well, but then it is much harder to have people available to monitor and fix things. In this case we were quite rightly confident that this would work with little or no disruption. The main trick here is taking one step at a time, carefully, and checking everything.
Well done to Andrew, Paul and Alex on this. Even though it pretty much went to plan, this is always stressful work.
The result is that we managed to replace the switch with no impact on our broadband lines, data SIMs or Ethernet customers. In theory there would not have been one dropped packet because of this, and it looks like things did indeed work as per theory. Pings running over the system showed none dropped, as expected.
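To illustrate what "none dropped" means in practice, here is a minimal sketch, assuming you have logged the sequence numbers of the pings sent and the replies received. The function names are made up for this example, and it says nothing about how our own monitoring is built:

```python
def dropped(sent_seqs, reply_seqs):
    """Return the sequence numbers that were sent but never answered."""
    replied = set(reply_seqs)
    return [seq for seq in sent_seqs if seq not in replied]


def loss_percent(sent_seqs, reply_seqs):
    """Percentage of sent pings that got no reply (0.0 if none were sent)."""
    sent = list(sent_seqs)
    if not sent:
        return 0.0
    return 100.0 * len(dropped(sent, reply_seqs)) / len(sent)
```

An empty list from `dropped` over the whole maintenance window is the result you are hoping for.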
Of course, just to throw a slight spanner in the works, BT managed to disconnect most of Scotland shortly before we started. We can only guess that they are not quite as careful as us, and we have seen that they have single points of failure in their network.
We plan to do the other switch, probably in a few weeks' time. The only change being that we are now monitoring one of our wholesale customers who managed to have their backup link down during the work, which had the result that you might expect, albeit only for a few minutes. We may show them how to set up Nagios.
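For anyone in the same boat, the point is to alert when redundancy is lost, not just when everything is down. A minimal sketch of that logic, using the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL); the `check_links` function and its two inputs are hypothetical, and how you actually test each link is up to you:

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_links(primary_up, backup_up):
    """Return (exit_code, status_line) for a pair of redundant links."""
    if primary_up and backup_up:
        return OK, "OK - primary and backup links both up"
    if primary_up or backup_up:
        # Still in service, but one more failure means an outage.
        down = "backup" if primary_up else "primary"
        return WARNING, f"WARNING - {down} link down, no redundancy"
    return CRITICAL, "CRITICAL - primary and backup links both down"
```

A real plugin would print the status line and exit with the code, so that Nagios can pick both up.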
As I say, I love it when a plan comes together - you can rely on the A&A team :-)
That was fun
Out of interest, which switches do you entrust your network to? Are they running any routing protocols, or do you leave that to the FB6000s?
We only do switching, not routing on the switches, but they are managed, with VLANs and SNMP and traps. They do sFlow as well, which is nice. But to be honest I am not sure which make it is - I left that to the tech guys to decide after lots of research! Eeek, I am turning into a manager.
The lights are still on in TeleCity Hex 8/9 at the moment so they haven't broken anything here yet... even though I did bump into them in Reception.
Accused me of being a terrorist in front of TeleCity staff too...
I'll start drafting up the ADR complaint.
> adding talk talk lines
Ooh, is that the hint of an announcement that I've missed somewhere?
Making the (possibly wild) assumption that this is something like Be but for CPW / Talk Talk, that would be amazing, as they're the only LLU provider on my exchange (EACTM), which BT haven't deigned to upgrade to 21CN yet, so currently stuck at 8Mbps on a 500m long line...
Is the future TalkTalk product going to be ADSL only, or will it be FTTC/P/GEA too?
Oh the irony, I got more updates on the EDINBURGH BBLA1 & BBLPA2 outage from the A&AISP status page than I did from BT. I even had a contact in Operate who didn't give me as much info. Maybe it's time to change job role.