Friday, 5 May 2017

Contention ratios

In light of the broadband speed post, I am (again) going to try to explain why "contention ratio" is also a bad metric. Someone suggested it as a "quality" measure for an ISP, which seems sensible at first, but sadly it is not.

What is a contention ratio?

It is a measure of how much a link is shared (aka "contended"). The idea is you take the total possible demand on the link and divide it by the link's capacity.

For example, if you have a 1Gb/s fibre going to a RAS which serves 100 separate 80Mb/s connections to customers: The demand possible is that all 100 of those customers may want 80Mb/s of traffic, so 8Gb/s total, but you only have a 1Gb/s connection which they are sharing, so 8:1 contention ratio.
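As a rough sketch of that arithmetic in Python (the function name and the use of Mb/s throughout are just my illustration, not any standard definition):

```python
def contention_ratio(customer_speeds_mbps, link_capacity_mbps):
    """Return N for an N:1 ratio: total possible demand divided by link capacity."""
    total_demand = sum(customer_speeds_mbps)
    return total_demand / link_capacity_mbps

# The example above: 100 customers at 80Mb/s sharing a 1Gb/s (1000Mb/s) fibre.
n = contention_ratio([80] * 100, 1000)
print(f"{n:g}:1")  # 8:1
```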

The ratio is normally expressed as N:1, and a lower "N" is "good" because it means the link is not shared as much - simple... Basically, a link that is shared less is less likely to slow down because of that sharing.

Sadly not so simple...

Contention ratio measured where exactly?

The first issue is you cannot really state the contention ratio "of the ISP" in a meaningful way. For a start, the end user is buying access to the internet; they do not really care that some arbitrary point has been picked to quote contention - they want to know how likely it is that their communications slow down. But let's try to explain a little here using FTTC from AAISP.

For most end users the first contention is that there is likely to be more than one person in their house using the internet. That is contention. Their share of the link will slow down if the other people are also using it. However, we are trying to measure the ISP here, so one side of the ratio is normally the expected speed, which is the sync speed of the modem the end user has. Sharing beyond that into the home is not the ISP's problem. So that gives us one side of the ratio - add up the sync speed of all of the people sharing the "link" in question.

The first link is the customer modem linking to the modem in a cabinet - that is not shared at all - the phone line carrying your 80Mb/s (if you are close to the cabinet) is not shared. So 1:1 contention, excellent - let's quote that shall we?

The next link is a fibre from the cabinet - that is shared with the hundreds of people on that cabinet - your neighbours. Openreach may publish metrics for that, I am not sure. We (AAISP) don't know what they are. A contention ratio measured there is not AAISP specific - but it is a perfectly valid concern that this could be the point your traffic slows down due to sharing. In practice Openreach operate VLANs and you can buy "lower contention" on a cabinet for more money.

The next link will be the cable link in the exchange. This is where a group of cabinets are linked to a switch and that links to a back-haul carrier in the exchange. We use TT and BTW. The sharing here is all of the people using that same back-haul carrier in that group of cabinets. I don't know if BTW or TT publish contention ratios, I doubt it. Again that is not an AAISP specific contention ratio, and will be different depending on which back-haul we have attached your service to. Again, it is a perfectly valid concern that this could be the link that slows down traffic.

There is then an exchange backhaul where one or more cable links go to a metro node and other links before they reach a node that hands over to AAISP. This is, again, down to the back-haul carrier.

We then have links to carriers (BTW and TT). But these are multiple links at multiple sites; you will be sharing with all of the people on the link you happen to use, and that will change dynamically. So even if we worked out exactly what the total capacity of all of our customers is on each link, it changes if you (or they) reconnect. This is something we probably could work out if we tried.

We have links between switches, and we have links to LNSs, again, dynamic.

Then we have links to edge routers, and how much those are shared will depend on where on the internet you (and other people) are trying to connect. These then connect to transit and peering, and again there is contention.

Then out on the internet via transit, around the world there are shared links.

Ultimately you could say "what is the contention ratio" of a web site, e.g. aa.net.uk. That has a 1Gb/s port - so what is the total bandwidth of every internet connection in the world that may wish to access that site? Well the contention ratio there is millions to one.

Every single one of these shared points could be a cause of congestion, of slow down, so which contention ratio would you publish, and how would you handle the fact that they can change dynamically? Even contention on your FTTC cabinet changes as other people take service in your neighbourhood.

There is no sensible way to come up with one meaningful point to publish as "the ISP's contention ratio". And even if you did, an ISP may have to publish several values for different services and back-haul carriers or even parts of the country. It would not be a simple number with which to compare two ISPs.

10:1 is not the same as 100:10 or 1000:100, really!

Even if you wanted to look at a specific link, and say what contention is that, it is not so simple. If you had 10 people with 10Mb/s links using one 10Mb/s shared link, then that is 10:1. But the chance of slowdown is high as you only get 10Mb/s if none of the other 9 people are using their service at all!

But what if you have 1000 people with 10Mb/s links sharing a 1Gb/s link? That is still 10:1. Now the chance of slow down is low: even if 99 other people are running their connection flat out at 10Mb/s you can still do the same. In practice, the larger the pipes, the more these things average out and you don't see congestion.

But both of these are a 10:1 ratio, even though they carry a very different risk of slow down due to sharing. The ratio alone is not a good indicator of likely congestion - you need to know the size of the links as well.
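To put rough numbers on that, here is a toy model in Python. The big assumption (mine, purely for illustration) is that each customer is independently either idle or running flat out, and is flat out 5% of the time. Real usage is nothing like that tidy, but it shows the effect of link size on the same 10:1 ratio.

```python
import math

def chance_of_congestion(customers, line_mbps, link_mbps, p_active):
    """Probability that total demand exceeds the shared link.

    Toy model (an assumption): each customer is independently either idle
    or running flat out at line_mbps, with probability p_active.
    """
    # Most flat-out customers the link can carry before anyone slows down.
    max_ok = int(link_mbps // line_mbps)
    log_p, log_q = math.log(p_active), math.log(1 - p_active)
    total = 0.0
    for k in range(max_ok + 1, customers + 1):
        # Binomial probability of exactly k active customers, in log space
        # so that large factorials do not overflow.
        log_pmf = (math.lgamma(customers + 1) - math.lgamma(k + 1)
                   - math.lgamma(customers - k + 1)
                   + k * log_p + (customers - k) * log_q)
        total += math.exp(log_pmf)
    return total

small = chance_of_congestion(10, 10, 10, 0.05)      # 10 x 10Mb/s on 10Mb/s
large = chance_of_congestion(1000, 10, 1000, 0.05)  # 1000 x 10Mb/s on 1Gb/s
print(f"10 people on 10Mb/s:  {small:.1%}")
print(f"1000 people on 1Gb/s: {large:.1e}")
```

With those made-up numbers the small link is over capacity a little under 9% of the time, and the large link well under one time in a million, for exactly the same ratio.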

Is the link congested?

The issue is actually congestion. Is a link slowing down because of sharing? Basically, is a link getting full?

The contention ratio does not actually tell you if you will get congestion. That depends on the usage of the shared link - how much the others sharing that link actually try to use. An ISP with a lot of "telemetry" type customers (bus stop signage, etc) may have a massive contention ratio (if you worked out where to measure that) because their customers all send a few bytes a second, yet they may see no congestion. An ISP that targets customers doing 24/7 UHD streaming would need more capacity per customer (a lower contention ratio), but even a low contention ratio may leave people buffering because of congestion.

The ratio is only meaningful if you know the usage level / demand as well. So basically, it is not meaningful.
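A quick sketch of the two extremes in Python. Every figure here is invented for illustration (customer counts, line speeds and average usage are not real data from any ISP):

```python
def peak_utilisation(customers, avg_peak_use_mbps, link_mbps):
    """Fraction of the link in use at the busy time, given average usage per customer."""
    return customers * avg_peak_use_mbps / link_mbps

# Hypothetical telemetry ISP: 50,000 lines at 80Mb/s on a 1Gb/s link,
# each averaging 0.001Mb/s at the busy time.
telemetry_ratio = 50_000 * 80 / 1000
telemetry_load = peak_utilisation(50_000, 0.001, 1000)

# Hypothetical streaming ISP: 100 lines at 80Mb/s on a 1Gb/s link,
# each averaging 12Mb/s at the busy time.
streaming_ratio = 100 * 80 / 1000
streaming_load = peak_utilisation(100, 12, 1000)

print(f"telemetry: {telemetry_ratio:g}:1, link {telemetry_load:.0%} full")
print(f"streaming: {streaming_ratio:g}:1, link {streaming_load:.0%} full")
```

The telemetry case has a ratio in the thousands and a nearly empty link, while the streaming case has a single-digit ratio and is asking for more than the link can carry.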

Indeed, as one half of the equation is the expected speed, an ISP only selling 2Mb/s links would have a really good contention ratio because the backhaul links in the networks are all much larger now (to allow for all these 80Mb/s links). That would make such an ISP look good, when all it actually means is they sell slow services. Of course, if you have to advertise a contention ratio, it makes sense to have an FTTC service capped at 2Mb/s, and quote that contention ratio as your headline package.

A better metric would be the average throughput per customer on a link. This is not a ratio, but Mb/s, and means the speed of the end links is not a factor any more on the assumption that people will use what they use (on average) regardless of their line speed. You still have the problem of which point in the network you are measuring though, and the problem of it mattering what actual usage/demand is.
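To illustrate the difference, a small Python sketch with hypothetical figures (100 customers on a 1Gb/s link that peaks at 150Mb/s of real traffic; the function names are mine). The ratio swings wildly with the speed you sell, while the per-customer throughput does not move:

```python
def contention_ratio(customers, line_mbps, link_mbps):
    """N for an N:1 ratio: total sold speed divided by link capacity."""
    return customers * line_mbps / link_mbps

def avg_throughput_per_customer(link_peak_mbps, customers):
    """Measured peak traffic on the link divided by customers sharing it, in Mb/s."""
    return link_peak_mbps / customers

# Same link, same customers, same real usage; only the sold line speed differs.
for line_mbps in (80, 2):
    ratio = contention_ratio(100, line_mbps, 1000)
    per_customer = avg_throughput_per_customer(150, 100)
    print(f"{line_mbps}Mb/s lines: {ratio:g}:1, {per_customer:g}Mb/s per customer")
```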

What do AAISP do exactly?

At AAISP we aim not to be the bottleneck (i.e. not have the links we control getting full). All links are shared, so it is always possible, but we aim to invest as necessary to allow for changing trends in usage. We also challenge the back-haul carriers if we see congestion in their network (which means monitoring every line all the time). This does happen, and is normally either temporary (waiting for someone to upgrade a link) or a fault which we get fixed.

12 comments:

  1. Assuming you can get metrics from BT/TT, and collect them from your network, you should be able to plot the bandwidth headroom for every link between the CPE and the internet at large. Since you know which links may have been needed to transfer traffic from each customer to the rest of the internet, you can figure out the proportion of time that each user was relying on a link with no headroom (and therefore the proportion of time that user's connection speed was limited by congestion either within BT/TT or within the ISP).

    Aggregate that across all the customers, and surely you can get some meaningful statistics - the average percentage of time customers see their internet connection congested, standard deviation of that, etc.

    Of course, it relies on BT and TT to provide metrics, but it seems to be a reasonable way of quantifying what kind of performance a customer should expect: an ISP with a rubbish internal network is going to produce a higher "average percent of time spent congested" figure than an ISP that tries to avoid being the bottleneck.

    That said, it incentivises ISPs to get rid of customers who are on highly contended BT links for whom a less contended TT link isn't available (or similar).

    Replies
    1. All of the stuff in BT and TT involves sharing links with other people that are not A&A customers, so we could not work out total usage of such links to see if they are congested. Simpler is if TT/BT tell us a link is congested. But we can see when that happens from the LCP echoes we do on every line every second, which is simpler than trying to do the maths when we don't have the data. In general the BT and TT back-haul is not congested.

  2. I think you're throwing in a lot of irrelevant material here. Clearly when someone asks about "the contention ratio of an ISP", what they want to know is the contention ratio of the most contended link that is part of the ISP's responsibility to the customer (eg, between exiting the user's house and exiting the ISP's network to the rest of the internet). That doesn't seem difficult to define unambiguously, and while it may be hard to measure for the reasons you outlined, it's a concrete figure.

    That figure may not be a useful indication of ISP quality for the reasons you outline, but I don't think it's fair to claim that it's confusing as to what you should actually measure to get the figure.

    Replies
    1. No, sorry. Just the fact that an ISP can purchase different "contention at the cabinet" from BT Openreach means that contention at that point as a service differentiator may be a factor in quoting a service quality metric. At present we are not seeing cabinet congestion, but if that ever does start, it will matter to the end user. Also, "exiting the ISP" is not simple either: if we have only 1Gb/s to a peering point, do we quote contention for that just because that is the link that may be used for a customer to access something, or do we quote based on the 10Gb/s link to a transit point? Both could be a bottleneck and could be used by all customers at once. Also, why the arbitrary "exiting the ISP network to rest of internet" - why does that point matter to the end user rather than any other point? If Level 3 started a broadband offering, where would that be for them as they also run a transit network? What would be nice with your proposal though is that an ISP with a BT wholesale link that is heavily limited because they do not buy much bandwidth from BT will be able to quote low contention, as they can afford a huge connection on their "exit to the rest of the internet" as such links are much much cheaper by comparison to BT wholesale interconnects. It really is not simple to define, and even harder to avoid someone "engineering" a good result, which you also have to consider.

    2. Shorter answer: The very fact you picked "exiting the ISP's network to the rest of the internet" as the point to measure is part of the problem - that is rarely the bottleneck even on really slow ISPs.

    3. Isn't the (theoretical) "contention from the cab" figure already differentiated into three tiers in terms of priority? ie 80/55/40

      I know 80 gets higher priority between cab and head-end than 40; logically 55 does too.

      I don't suppose you have enough customers but given your monitoring systems have you ever seen any cab->head-end congestion? It surely must exist in places?

  3. There's a term I haven't heard in a long time.

    Last time I heard that term I was doing first line training for o2 broadband (I was there from official launch day to the day o2 decided customer service wasn't core to the function of being a Telco) and we were told that we didn't have one. If we needed more capacity then regardless of what capacity was already there we simply got more.

    How true that was I really can't say though.. o2 did have a habit of lying to its staff... lying about the outsourcing and sale to sky days before they were announced, lying about the value of TUPE, lying about the newspaper report about the plan to shut the centre 2 years after outsourcing...

  4. It used to be so much simpler. My first ISP (dial up) had 16 incoming lines and 200+ customers.. Contention was most evident at the engaged tone..

    1996 I was on the local cable company (NYNEX) cable modem trial. 200 of us. Mix of customer types. All sharing a 64kbps, later 2mbps, outbound link. You could tell immediately when others were using their connection!

    Replies
    1. I was on the same cable modem trial too, Rob. (Manchester, right? Around 1998 if memory serves me correctly.)

      It was of course pretty magical to have an 'always-on' connection, but I think we all learnt a lot about the practical effects of contention through this!

      I do remember staying up until 3am to download news from usenet, as that was pretty much the only time I'd ever see the full speed.

      Hard to imagine an ISP trial, even in those days, thinking that was a good idea to put 200 customers with 10mbps cable modems behind a single 64kbps link :) It was certainly more usable when they upgraded to 2mbps, albeit after some months if not years. I bet Nynex learnt a lot from our usage, as well..

      JMH

  5. A very interesting point and discussion.

  6. Not sure if you were referring to my comment to your earlier post. In any case, I completely agree that contention ratio is not a useful thing to quote (or even to try to define). However, I believe that the effects of it are a very important quality differentiator between ISPs and something that an ISP like AAISP needs to be able to promote.

    I think the problem is that the discussion has got into a lot of discussion of technical internal analysis and metrics. What should be measured here (and used in advertising) is a standardised quantification of customer experience. Telcos learnt this a long time ago for voice when they stopped quoting bandwidth, latency and jitter and created MOS. We need a MOS for broadband.

    My suggestion is that CAP should be defining a "UK Broadband MOS" based around something meaningful to UK consumers. It could be as simple as "90-th percentile effective streaming speed for BBC iPlayer in the busy hour". That would be a metric useful and meaningful to almost everyone in the UK.

    Sure, it would encourage some specific ISP behaviours to optimise iPlayer access (special peering arrangements, hosting of BBC CDNs, etc) but those would probably help most consumers and would include things that are beneficial for accessing many other sites as well.

    If it got to the stage where that particular metric was distorting ISP behaviour unreasonably then it would be time for CAP to step in and define a new metric. Meanwhile we would have had a period of several years where quality-focused ISPs such as AAISP could effectively advertise their differentiated value.

    Maybe this suggestion is too simplistic, but I think the need is for a MOS, defined in a way which promotes the behaviour consumers want to see. Of course, business connections would need to compete in a different way.

    Replies
    1. Graham Cobb: That might help me choose between two ISPs who are both selling BTW FTTC connections, but how does it help me choose between ADSL, FTTC, FTTP, Coax, etc? Fundamentally, anything that you aggregate across multiple addresses isn't terribly useful because what can be offered varies so widely depending on your location.
