2011-12-05

Major outage, I tell you! MAJOR!

We have a text reporting system for major service outages (MSO). We have lots of monitoring, but there is nothing like customers telling you something subtle has broken, even when it is 2am. The texts go to real people, including me!

So you are sat at home, Monday evening, thinking "My broadband has been out since Friday lunchtime, and I don't have a dial tone. This must be a major outage that affects lots of people and somehow my ISP does not know even though it has been over 3 days and I have not mentioned it - I know, I'll text their MSO number to let them know. Who knows, I may even get someone to pause Top Gear to go look into it."

Grrrr...

Please! MSO texts are for major outages.

A phone line fault, or a physical fault on a broadband line, needs an engineer, and they will not rush out at 8pm - they need booking, which can be done just as efficiently at 9am as now, so there is no need to hassle someone now! This is why we don't do 24 hour support: faults like this would not be fixed any more quickly if we did. You can text (not as an MSO) or email and we will get in touch to arrange an engineer.

Now, back to Top Gear.

/me refrains from waiting until 2am to ring end user to arrange the engineer visit - after all, a major outage justifies working at such hours doesn't it?

OK, OK, let's be a bit fairer, shall I - and be a tad constructive. It is fair to say that most people with one broadband line cannot tell, when it just stops working, whether it is just them or a major outage, and so do not know if they should alert us or not. That is true, so here are a few tips:
  • Most major outages, whether in our network or, far more commonly, somewhere within the carrier network, we already know about, and things are being done to fix them. It is worth waiting a few minutes before panicking.
  • If you can get to the internet (e.g. mobile), check the status pages. If there is nothing on there, then go on to the irc channel (via the support pages). If there is a major outage it will be clear on the irc channel and people will say if they have texted already - staff may even be on-line answering questions.
  • If you have multiple lines and they go off together, you have a clue that something more major may have happened. This is especially true if you have many different sites (as some of our dealers and IT consultant customers have).
  • If you are more technical and can see there are, for example, some routing issues on the internet or something in our network, then that is the ideal time to send an MSO text as it may not have been picked up by our automation. Please still check the status pages and irc first.
  • If the problem has been going on for days, it is not an MSO - it is a normal fault report, so email or text normally, or call support during the day.
  • If the phone line is not working then it is not at all likely to be a major outage (and if it was, it would be a local cabling or exchange issue handled by BT and not us). So just email or text normally, or call support during the day.
Remember, an MSO hassles staff, day or night, and if it is abused it will be removed. It is there to help us and you when there is a major problem. We cannot answer all MSO texts when there is a major outage but will update status pages and irc.
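
For the more technically minded, the tips above can be written as a rough decision check. This is only a sketch of the reasoning, not anything we actually run: the `Fault` fields, the `should_send_mso` name and the one-day cut-off for "going on for days" are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fault:
    """What the customer can observe. All field names are hypothetical."""
    phone_line_dead: bool      # no dial tone on the line
    days_ongoing: float        # how long the problem has lasted, in days
    lines_down_together: int   # separate lines or sites that failed at once
    routing_issue_seen: bool   # a technical user can see a wider network problem
    already_reported: bool     # status pages or irc already show the outage


def should_send_mso(f: Fault) -> bool:
    """Return True only when an MSO text looks justified by the tips above."""
    if f.phone_line_dead:
        # A dead phone line is a line fault or a telco issue, not a broadband MSO.
        return False
    if f.days_ongoing >= 1:
        # Assumed cut-off: anything going on for days is a normal fault report.
        return False
    if f.already_reported:
        # It is already on the status pages or irc, so another text adds nothing.
        return False
    # Several lines going off together, or visible routing trouble, is the real clue.
    return f.lines_down_together > 1 or f.routing_issue_seen
```

A single line with no dial tone since Friday fails the very first check, which is rather the point of this post.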

Sorry if I sound a bit harsh on this particular one. I am just gobsmacked that anyone can think a three day old non-dial-tone issue can be an MSO!

12 comments:

  1. To the person who doesn't have Internet access, it is an MSO. To you it isn't.

    Putting myself into the shoes of Joe User and addressing your points:

    * Most major outages ... we know about ...

    Then why have you given me a MSO number?

    * If you can get to the internet (e.g. mobile) check the status pages.

    I can't get to the Internet - that's why I texted you.

    * If you have multiple lines ...

    I don't.

    * If you are more technical

    I'm not.

    * If the problem has been going on days

    I don't know whether it has or it hasn't - I've been out for the weekend and only just got back in the house.

    * If the phone line is not working then it is not at all likely to be a major outage (and if it was, it would be a local cabling or exchange issue handled by BT and not us).

    So, if my phone isn't working, it's not a MSO and if my phone is working it's not a MSO.

    Doesn't make any sense to me - I'd better text the MSO number just in case.



    I'd venture to suggest that the problem is not with the user, but with your systems. If you really don't want people using the MSO number, then it's up to you to put processes in place to stop this happening.

    How's about something simple - make the MSO a two stage process - have the user call a number and script a "have you tried this", "checking our servers" type system before saying "If you still think this is a MSO affecting a large number of customers, please text 07xxxxxx starting your message with the PIN 12345.". Mind you, if you're at the point where they're pressing '1' to confirm that something is or is not the case, you can issue the MSO notification directly from your phone system...

  2. Well, yes, except (1) we define MSO as affecting lots of lines, so someone with one line down is not an MSO, and they know that by how we defined it where we told them of the MSO number. (2) I said "most" not "all". (3) I said "if". (4) If you don't have multiple lines you don't know it is an MSO. (5) If you are not more technical you won't pick up on subtle routing issues, but others will, so that is OK. (6) In this instance the text said the line had been down since Friday, so they did know.

    The phone line point is simple - you need a working phone line to have ADSL. Phone lines do not stop working as a major outage unless it is the local exchange or a major cable break, neither of which is an issue we handle as a broadband incident. It is simple. However, if it made no sense then this blog has been useful in educating you. A phone line fault is either a fault on your own line (not an MSO) or a major telco issue, so not a broadband MSO.

    As for setting up a process - we have - we tell people what is an MSO and provide some suggestions. A blog post like this is part of that process of education so that the feature is used when it is of most use to everyone.

    We could make it more complex. It is not really a big enough issue to make more complex, as the MSO number only occasionally gets messages at all. Most people can read what it says where we publish the MSO number and remember it and don't need a patronising IVR system on the end of a phone line.

    So, key points.
    1. An MSO is many lines. If you can't tell that many lines are affected, then don't text
    2. If your phone line is down it is not a broadband MSO, so don't text
    3. If the issue has been going on for days it is not a broadband MSO (or if it is, we already know), so don't text

    I'll add these to the web site.

  3. Nicholas - thanks for the feedback anyway - what I have done is remove the "MSO" from the main support page so you have to go to the MSO page to see it (which was linked from there anyway). I have updated the MSO page with more constructive tips on sending MSO texts rather than the previous message in big bold letters that one line down is not an MSO...

    Hopefully that helps matters.

  4. May be worth addressing the first comment there too. It is not "you don't count" if you are only one person, far from it. Yes, it is major for you. The reason we do not want MSO texts for a single line fault is simply that we will not be able to do anything with a single line fault until the next working day. If we could do something, then we would have 24 hour support, or at least evening support.

  5. Because of the abuse, many organizations have a call center company to take the calls and filter the issues.

  6. Yes, but I am sure such call centres would just pass on the message anyway if someone is prepared to say it is a major outage (as they did by putting MSO in the text in this case) and anyway we really really do not want faceless non-technical call centres taking messages. It would not add anything over emails and texts and irc and so on that we have now, and would just give the wrong impression of the company and would create the Chinese whispers issue which I hate when I am forced to call such people.

  7. Even your updated page seems to be coming at the problem from the wrong angle: you've decided to provide people with the ability to send you MSO alerts (great, shows trust and a desire to both be proactive and encourage proactive behaviour in customers) and yet rather than first setting up an explanation of what exactly that means and then 'here's how you do it', you tell people the facility exists, talk about its abuse and how it doesn't apply to them in most cases, and *then* give some tips on how to decide how to use it.

    You've assumed a whole pile of foreknowledge on the part of your readers. Why not keep the MSO details on that page but begin by explaining what an MSO would be, why it's useful for you to know about them and why there are certain categories of problems that you may not pick up automatically, then outline the set of users who are likely to be able to spot such things ahead of yourselves, then provide the contact details and some examples, and finally provide a disclaimer/what not to do at the end?

    At the minute the page just sounds a little peevish. I imagine that's because you're a little peeved. Which is fine for your blog but not the main A&A site, especially if there has historically been a lack of context around what categorised an MSO.

  8. Interesting feedback, yes I was peeved, obviously. But the previous version of that page had little more than bold capital letters saying a single line down is not an MSO. This version is way better, and provides much more detail on what might be an MSO so that people know. It also says in which cases the texts are particularly useful. I'll look at it again and perhaps re-order some of it.

  9. By the way - competent doesn't have an A in it - (...We have some very technically competant customers and we...) on http://aa.net.uk/support-mso.html

  10. Just one thought: Having all this stuff on a web page is completely useless when the outage occurs - the victim won't be able to read it. They may have noted the phone numbers and text information, so they are unlikely to change what they would have done based on the web page.
    So... why not put it in a document that they will have already downloaded (one or more of the ones you already provide), so will be able to read it and act accordingly when they can't get on line?
    Cheers, Howard

  11. Well, having the contact number, which I am pretty sure we put on the router, covers most normal line outages like that. The major outage issue is much more relevant for people that see it is a major outage, and they will only really be able to see that if they have other communications such as mobile, or it is a more subtle issue such as routing.

