Friday, 24 February 2012

Below average

Well, I know doing stats at school was rare, but surely everyone knows that "half the population are below average". It is kind of (one of) the definitions of "average"!

Apparently not. Julia Stent states on the BBC that :-

"Britain might be riding the wave of a super-fast broadband revolution, but for 49% who get less than the national average broadband speed, the wave isn't causing so much a splash as a ripple," said Julia Stent, director of telecoms at uSwitch.

Really, how thick can you be?

8 comments:

  1. What about the other 51% of people who get less than the national average....?

  2. I did wonder the same when I read the article myself. Even if the slowest speed was 100Mb, around 49% would still get less than the national average.

  3. Hmm, well.
    Or maybe RevK gets 1GB/s and everyone else 10MB/s meaning that 99.999999% of people get less than the average?

  4. Or maybe RevK gets 1Gb/s and everyone else 10Mb/s meaning that 99.9999% of people get less than the national average?

  5. Depending on what they mean by "average", the 50% thing might be less obvious.

    E.g. if there are 5 people in a country who have speeds of 1, 10, 10, 10 and 469 respectively, the average could be 100 (if by "average" you mean the mean) but 80% of people are well below that.

    I agree though in this case it sounds like a failure of arse/elbow differentiation rather than a commentary on the large standard deviation observed in their sample.

  6. @Ian, it depends on what figure you are using as the average. There are three averages: the Mean, the Median and the Mode.

    The Mean is the statistical average, all the speeds in the table added up, and then divided by the number of results.

    The Median is the speed slap bang in the middle of the table.

    The Mode is the most common speed of the table.

    So, while RevK might be assuming she means the Median, she is actually talking about the Mean.

  7. Either I'm missing something or it's perfectly possible for 49% to be below the mean speed. It only requires that the speeds don't follow a normal distribution and are slightly skewed.

    It *isn't* possible for 49% to be below the median speed because, by definition, 50% are below and 50% are above the median - but the median isn't what most people think of as "the average".

  8. OK, the stupidity of the comment is that it is essentially meaningless. "Half the population are below average" sounds like a shocker to some people but is exactly what you would expect for any normal distribution.

    With a large sample of reasonably spread values you will have a mean and median around the same, giving around 50% below average and 50% above average. Saying 49% get below average is entirely reasonable and not a bad thing. If everyone got significantly higher speeds, you would still expect that around half got below the national average.

    Of course it is even possible for 49% to be below the median: if the samples were 1, 2, and 3, then the mean and median are both 2 and only one third are below that. But we are not talking about a contrived small value set, we are talking about a wide range of speeds. If looking at "measured download speeds" you are talking about a near continuous range rather than distinct steps. Distinct steps such as 20CN BRAS rates would skew matters a lot.

    The whole idea of thinking that it is somehow shocking that "half the population are below average" is what gets me. It is the same thinking behind OFCOM's code of practice on speeds. They specifically state that a line below a specific percentile shall be considered faulty, thereby making it so that that percentage of lines are considered faulty regardless of what any ISP does to try and "fix" that. They went further and said that "at or below" that percentile is a fault, and that it applies within specific types of line. That makes all FTTP lines faulty (as all are 100M exactly) and makes all "short 20CN lines" faulty, as some huge percentage of that class of line get the full rate of 8.128M, making that the fault threshold.

    Why can't people learn simple stats?

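The arithmetic in the comments above can be checked with a short Python sketch. It uses the five made-up speeds from comment 5 and the 1, 2, 3 sample from comment 8. The helper names (`share_below`, `share_at_or_below`, `nearest_rank_percentile`) and the 1,000-line all-100M sample are assumptions for illustration only, and nearest-rank is just one of several ways to define a percentile, not necessarily the one any regulator uses.

```python
import math
from statistics import mean, median, mode


def share_below(values, threshold):
    """Fraction of values strictly below the threshold."""
    return sum(v < threshold for v in values) / len(values)


def share_at_or_below(values, threshold):
    """Fraction of values at or below the threshold."""
    return sum(v <= threshold for v in values) / len(values)


def nearest_rank_percentile(values, p):
    """p-th percentile by the nearest-rank method (hypothetical helper)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Comment 5: five made-up speeds with one very fast outlier.
speeds = [1, 10, 10, 10, 469]
print(mean(speeds), median(speeds), mode(speeds))  # 100 10 10
print(share_below(speeds, mean(speeds)))    # 0.8 -> 80% are below the mean
print(share_below(speeds, median(speeds)))  # 0.2 -> only 20% are below the median

# Comment 8: with samples 1, 2, 3 only one third sit below the median.
small = [1, 2, 3]
print(share_below(small, median(small)))

# Assumed sample: 1,000 lines that all sync at exactly 100M.
fttp = [100] * 1000
threshold = nearest_rank_percentile(fttp, 10)
print(share_at_or_below(fttp, threshold))   # 1.0 -> every line is "at or below"
```

Once many values are tied, "below" and "at or below" give very different answers, which is the point about the full-rate lines.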