Well, today has been interesting.
First thing was that we had a report of a /16 not routing to the Internet... The result was baffling and led to finding a rather obscure bug in BGP when using route reflectors (yes, Jon, OSPF OSPF OSPF, I know).
Basically, there are reasons to ignore a route - the route reflection RFC (RFC 4456) specifies these (cluster list containing our cluster ID, originator ID being us, etc). We do this. Good.
Sadly, though, we actually ignore the whole update, including any withdrawn prefixes that happen to be in the same update. Bugger...
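For anyone wondering what the fix looks like in principle, here is a minimal Python sketch. It is not the actual code from our boxes: the `Update` class, the `rib` dictionary and the function names are all made up for illustration. The point is simply that withdrawals get applied regardless, and the route reflector loop check only decides what happens to the announced prefixes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Update:
    """Hypothetical, heavily simplified model of one BGP UPDATE message."""
    withdrawn: List[str] = field(default_factory=list)     # prefixes being withdrawn
    announced: List[str] = field(default_factory=list)     # prefixes sharing the attributes below
    originator_id: Optional[str] = None                    # ORIGINATOR_ID attribute, if present
    cluster_list: List[str] = field(default_factory=list)  # CLUSTER_LIST attribute
    next_hop: str = ""


def reflected_back_to_us(update: Update, router_id: str, cluster_id: str) -> bool:
    """Route reflector loop check: we originated it, or it already passed through our cluster."""
    return update.originator_id == router_id or cluster_id in update.cluster_list


def apply_update(rib: Dict[str, str], update: Update, router_id: str, cluster_id: str) -> None:
    # Withdrawals have nothing to do with the path attributes,
    # so they are applied first and unconditionally.
    for prefix in update.withdrawn:
        rib.pop(prefix, None)

    # The loop check only governs the announced prefixes.
    if reflected_back_to_us(update, router_id, cluster_id):
        return

    for prefix in update.announced:
        rib[prefix] = update.next_hop
```

The buggy behaviour amounts to doing the `reflected_back_to_us` check at the very top and returning before the withdrawals are touched, which leaves a stale route in the table.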
So upgrading around 15 boxes during the day, and I am pretty sure without losing a packet - win! - we have that fixed, and all seems fine.
Now to start seriously moving stuff over. Seems a visit to site is needed: one cable showing unplugged?!?, a DSL router to install (backup management LAN), and some nice environmental sensors to install. That will be tomorrow.
DNS resolvers all working - linked in to route reflectors as local versions of our published resolvers. In fact everything now linked to two core route reflectors. Yay!
Tonight I started allowing lines to new LNSs as a test - i.e. any lines that reconnect were sent to new LNSs. We had tested a lot. We got Be, BT 20CN and BT 21CN on line and working... Good!
Then a snag - at least one wholesale L2TP customer did not route back to us on the new LNSs. Some worked, some did not. So job for tomorrow is chase them all to ensure routing all in place and allowing new LNS IP addresses through firewalls, etc. Fun!
So lines back to existing LNSs for now. If we can sort that tomorrow we can move everyone at the weekend.
We will probably set up at least one transit and one peering link on new kit tomorrow as well. Should be pretty simple and low risk (we always say that).
Still - progress...
Two steps forward, one step back
Can you explain what "withdraw prefixes" are? Your concern was about a /16 not being advertised, so how are withdraw prefixes related?
A BGP update includes announced and withdrawn prefixes, though they are unrelated. In this case a /16 was withdrawn but that was not seen because of the bug (the withdraw was sent in the same update as an announce that was to be ignored under route reflector rules), so the route stayed in the table. The route then came back, but as a longer prefix. The result was a bogus route in the route reflectors that was the shortest path, causing traffic to loop rather than go where it should.
+1 for Firebrick OSPF please!
I know, I know... We have several really subtle things to change on the code first, including stuff to handle non standard BGP.
But I can take a hint re OSPF, honest.