However, there have been a small number of consequences, which we have been working on. Obviously they are not show stoppers, otherwise the planned work would have been reversed, but they are oddities.
One of them was that we were having difficulty getting SNMP from some of our LNSs, which meant some of our monitoring was unavailable. This had left us scratching our heads somewhat as the LNSs were not rebooted or reloaded or anything.
Then, another snag was that today one of our servers that does syslog started to run out of disk. Again, a puzzle. But this was easier to understand just by looking at the logs.
It turns out these are related. We have some debug logs from the LNSs related to setting up PPP sessions and allocation of IP addresses. These are kept for a couple of days to help resolve any connection problems.
One of the things logged is the IPv6 allocation, which we capture by logging the DHCPv6 request/reply exchange with the customer router. Usually these happen either once after connection or maybe once an hour.
The problem, it seems, is rather odd. Some customers still use the Technicolor ADSL broadband routers that we used to sell years ago. It seems many of these got upset in a rather odd way after the work on Thursday. We can see no logical reason for this, but they are now in a state where they are on-line and working, but generating approximately 1GB of uplink traffic a day, each, sending DHCPv6 requests! We were logging all of these. It seems the logging may actually have been so much load that it was impacting the SNMP responses.
The fix is rebooting the Technicolor routers, which, thankfully, we can do remotely.
But this gives me a slight insight into the difficulty of collecting Internet Connection Records. Each of these DHCPv6 exchanges would be something that might well be logged as an ICR.
In practice, we could not keep up just trying to log this one type of packet. The log file had reached only 16GB (158 million entries) since 4am today and, looking at the traffic levels, that is a tiny fraction of the number of requests being sent by these routers. Our LNS logging system has built-in limiting to try to avoid overloading things, and it was being pushed to the limit.
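To put those numbers in perspective, here is a rough back-of-envelope sketch in Python. It only uses figures quoted in this post (16GB, 158 million entries, and roughly 1GB a day of uplink per router). It assumes GB means 10^9 bytes, and the function names are made up purely for illustration; none of this is taken from our actual logging system.

```python
# Back-of-envelope sums using only figures quoted in the post.
# Assumption: "GB" means 10**9 bytes (not 2**30).
LOG_BYTES = 16 * 10**9            # size of the debug log file
LOG_ENTRIES = 158 * 10**6         # entries in that file
ROUTER_BYTES_PER_DAY = 1 * 10**9  # approximate uplink per misbehaving router
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_entry(log_bytes, entries):
    """Average size of one log entry, in bytes."""
    return log_bytes / entries


def average_bits_per_second(bytes_per_day):
    """Average sustained rate, in bits per second, for a daily byte count."""
    return bytes_per_day * 8 / SECONDS_PER_DAY


avg_entry = bytes_per_entry(LOG_BYTES, LOG_ENTRIES)
router_bps = average_bits_per_second(ROUTER_BYTES_PER_DAY)

print(f"average log entry: about {avg_entry:.0f} bytes")
print(f"per-router uplink: about {router_bps / 1000:.0f} kbit/s")
```

That works out at roughly 101 bytes per log entry, and an average of about 93 kbit/s of DHCPv6 requests from each affected router, around the clock.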
If we had to log every session (TCP, UDP, SCTP, IPsec, ICMP, etc.) there is just no way any of our existing kit could keep up. Of course it wasn't designed to! It was designed to shift packets quickly and provide Internet access to our customers, not snoop on anybody.
This also highlights the issue with any deliberate generation of ICRs by software on customer networks. Even relatively low levels of traffic can easily cause a lot of ICRs to be created, if the #IPBill passes.