Monday, 31 July 2017

Very careful blog post...

I blog on many things, some I am closely involved in, some I have views on, and some I am puzzled by or curious about. This is a blog post in which I find I have to be very careful.

This is meant to be a discussion point, honest!

Pedophiles!

The actual definition of the word is simply someone that "likes" children. Well, we all like children, I have 5 and I have 2 grandchildren. I like children... If you are a francophile you like the French. It does not mean you want to have sex with all French people! Pedophile, as a word, has become more refined, not only someone sexually attracted to children but someone that actually commits illegal acts and abuses children. The word has become quite specific.

I have children (grown up) and grandchildren - I would be horrified if any of them, at any time in their life, were at any risk from someone even thinking of abusing them in any way, sexual or otherwise.

I personally have no sexual desire for children, and I feel that I have to make that clear when making such a blog post. I find the whole idea repulsive and wrong. The issue is that even someone wanting to debate such issues is at risk of being branded a pervert!

However, I also have no sexual desire for sex with men, and this is where the whole issue gets complicated.

We now, in society, accept that there are homosexuals. I know several, and I have nothing against them. That sounds condescending somehow, sorry - the fact that a person is gay or not really does not matter to me, and why should it? Yes, if I were not married, and I found a woman sexually attractive, it would matter that she was a lesbian, as I would not pursue sexual relations with her, but in general such things do not matter... We do not need to know or care of such things normally.

We accept that homosexuals, even as a statistical minority, are just as they are, it is the "way they are", and not a choice, and the same goes for a whole spectrum of people that "identify" as a different sex to their organs, etc.

These days the whole notion that we just know another person's gender is rather odd. It is locked in to our language and culture, but why? It probably stems from 500,000 years of that mattering and basically identifying if you can fuck someone or not. But in modern society it is not quite the same.

The world is changing and accepting that people's feelings and desires in relation to gender and sex are a fact of life and not a choice, or an illness, is now the way we view people. It is good.

I support this view. I have my, somewhat conventional, feelings and desires, but I completely respect that other people may have very different feelings and desires.

Except, when people are "cursed" with a sexual desire for children - that is different somehow, and for a very good reason. Children cannot consent to such things, and should not be abused. They are inherently vulnerable and absolutely need protection. The "desire" is not the problem, the "act" is the problem.

The issue is where people are "cursed" with that being "the way they are", and sexually attracted to children. Now, they can never legally, or morally, pursue that desire. Children are out of bounds, and rightfully so! But do we accept that is "the way they are" or do we assume they are "ill"?

If we accept that is "the way they are" then things like "dolls", and "cartoons", and even videos with young looking actors and actresses that are actually of legal age, should not be an issue - surely?

What you do in the privacy of your own home should surely never be an issue if nobody is abused. Right now, legally, it is an issue. Right now, cartoons of child abuse are illegal, and even some dolls are illegal to import, even a video if the person "looks" too young is illegal! The idea that a judge is trying to assess the age that a cartoon looks is crazy, in my view. Imagine if you have videos like Avatar - if that is sex with an alien that is depicted as under 18 Earth years old, as that is how they are depicted as being mature on their world - that matters in law now!? How crazy is that?!?

The idea is that such things are a "gateway" to abusing children. I am no psychologist, but is that true? or is it only true because such things are already the wrong side of the law? If they were all allowed, but actual abuse of children was clearly where the line is drawn, would allowing such things, in the privacy of your own home, help such people live out their lives without actually abusing children? I really do not know!

Maybe this is where I do not know enough? Maybe this is simple and such things lead to actual abuse? If so, then maybe the law is right as now.

It only makes sense to stop them if we consider them "ill" rather than "the way they were born". So that is the question... How is it that a man attracted to a man is "the way you are", but a man attracted to a child is "perverted and ill"? What is the actual difference?

I will now be very non PC, and say that not only do I feel the whole idea of sex with a child is inconceivably repulsive to me, but so is sex with a man. That is "the way I am", somewhat "regular heterosexual". I am not meaning to be offensive, that is just the way I was born.

I imagine some gay men I know would find the idea of sex with a woman repulsive too. At least they may find it unappealing. That is the way they were born. That is life...

That does not mean I do not respect the rights of homosexuals to their life and desires and consensual activities, I do. But apparently sexual desire for a child is placed in a different category - why?

I post this actually to spark debate. I say this as someone promoting privacy, the very privacy that people in the whole transgender, homosexual, whatever, communities now, want and need. Privacy we all want, to be honest! Why is what you do alone or with consenting adults without physical harm in your own home not always legal? If you are one of those wanking over a cartoon or a doll of a child, does that actually harm anyone? Can we have some clear lines of actual harm, and harmless private fantasies here?

I always find it strange that we endorse fantasies about killing and carnage in films with no problem - serious issues, but nobody bats an eyelid over watching Die Hard or some cowboy western. Why are films about sex so much more taboo?

Or, is sexual attraction to a child actually "special" and "a mental illness" and not "the way you are", and if so, why? Really, why is that not just the way those people were born?

Comments welcome, and once again, I have to stress so much that I don't know any pedophiles, and I would not want anyone going near any of my children or grandchildren with such thoughts, sorry. I am rather prejudiced myself on this - what I am trying to do is see past any wish to regulate people's private desires and fantasies in their own home. Really, what you do in your mind, your dreams, and your own home with no actual abuse, I really feel is not my concern.

I have a strong view that criminalising one's thoughts is an issue. If we are not careful, one day, we will have "dream police". And whilst I consider myself pretty "normal" in many ways, I would never ever want to be judged on my dreams and desires!

So..., comments?

P.S. Why that picture? Well, Carry-On films were made in a day when the idea of someone dressing up in a school uniform for fun in the bedroom was not seen as being wrong in any way - I am sure that featured in some of them, but could not find the relevant picture. Maybe I am misremembering. These days, a cartoon can be deemed to be depicting someone under age because it shows someone in school uniform, from what I understand of cases I have seen reported.

Maths test

This is an invoice from Viking (Office Depot International (UK) Ltd). I am at a loss as to how it is worked out.



The items are simple, there are £58.25 of items with 20% VAT and £53.98 of items with zero VAT. There is also a phantom £2.26 "Protection Plus", which I have no clue about and no idea if VAT applies or not.

The items are not clear whether they are quoting VAT inclusive or exclusive, as neither way do they add to the stated NET total of £102.23.

VAT of £11.06 is not right on the VAT listed items, whether they are VAT inclusive or exclusive and whether the £2.26 has VAT or not or whether it is VAT inclusive or exclusive.

Basically, I cannot find a way to make it add up at all. Any maths or accounting genius out there that can explain it to me?
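For anyone wanting to check the arithmetic, here is a minimal sketch that tries the obvious combinations. It works in whole pence, assumes a 20% rate and rounding to the nearest penny on each figure, and treats the £2.26 as having no VAT, VAT on top, or VAT included. The function names are made up for this example, and the rounding rule is an assumption, as the invoice does not say how it rounds.

```c
#include <stdio.h>

/* VAT in pence due on a VAT exclusive amount, rounded to the nearest penny */
long vat_on_exclusive(long pence, long rate_percent)
{
    return (pence * rate_percent + 50) / 100;
}

/* VAT in pence contained within a VAT inclusive amount, rounded to the nearest penny */
long vat_in_inclusive(long pence, long rate_percent)
{
    long divisor = 100 + rate_percent;
    return (pence * rate_percent + divisor / 2) / divisor;
}

/* Print all six combinations and return how many match the invoiced VAT */
int count_matching_combinations(long items, long extra, long rate, long invoiced_vat)
{
    /* The VAT-able items are either quoted exclusive or inclusive of VAT */
    long item_vat[2] = { vat_on_exclusive(items, rate), vat_in_inclusive(items, rate) };
    /* The extra charge has no VAT, VAT on top, or VAT included */
    long extra_vat[3] = { 0, vat_on_exclusive(extra, rate), vat_in_inclusive(extra, rate) };
    int matches = 0;

    for (int i = 0; i < 2; i++)
        for (int e = 0; e < 3; e++) {
            long total = item_vat[i] + extra_vat[e];
            printf("items VAT %ld + extra VAT %ld = %ld pence\n",
                   item_vat[i], extra_vat[e], total);
            if (total == invoiced_vat)
                matches++;
        }
    return matches;
}
```

Calling count_matching_combinations(5825, 226, 20, 1106) lists the six candidate VAT totals for the figures above and reports how many of them equal the invoiced £11.06.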

P.S. Their customer service number is an expensive 0844 number which is a problem under current legislation!

Saturday, 29 July 2017

Honeywell RF stuff

Thanks to Mike, I have a selection of Honeywell RF stuff and a new RF RIO. This includes a PIR, Key fob, flood/temp detector, and smoke alarm.

The RF RIO appears on the bus as a device, like any other, and the panel (I am testing with a G2 here) allows up to 8 inputs on the RIO to be set up. This very much mirrors the normal RIO with 8 inputs, and they can be set to intruder, or whatever...

Each input is set with a "serial number" of the attached device, which can be auto detected, to save time.

The device can report a signal strength 0-10 out of ten, or as the panel says, out of 00 (seriously, it says 10/00).

So far so good. It seems to work.

But why only 8 devices? Well, the instructions actually suggest that to avoid RF congestion you do not have more than 24 devices, so I assume other panels allow more than 8.

But looking at the bus, it is simple. I am using the V2GY protocol, and there is a later ALPHA protocol which is somewhat different, but for both the RIO simply reports status updates from the devices, each with the serial number of the device.

Importantly the RIO does not care if the device is configured on the panel or not. Any device in range at RF level will be reported, including, it seems, around four devices that my neighbours must have! So any suggestions of avoiding RF congestion fall down when it is not in your control as neighbours could be using these devices in the same RF "space".

Indeed, any idea of limiting traffic on the bus is irrelevant as all these extra devices get reported to the panel even if it knows nothing about them!

I am working on including this in my alarm code, and I will not limit to 8 per RIO - there is no point - all devices in range of a RIO will use RF and bus bandwidth anyway, so why not allow them?
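As a rough illustration of not limiting to 8, here is a minimal sketch of a table keyed on serial number that simply records every device a RIO reports. This is not the code from the alarm project: the structure, the field names, the 32-bit serial number and the table size are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RF_DEVICES 64 /* arbitrary size for this sketch, not a protocol limit */

/* Hypothetical record of one RF device heard via a RIO */
struct rf_device {
    uint32_t serial;     /* serial number sent with each status update (width assumed) */
    uint8_t  signal;     /* last reported signal strength, 0-10 */
    uint8_t  configured; /* non-zero once this serial is assigned to an input */
};

struct rf_table {
    struct rf_device dev[MAX_RF_DEVICES];
    size_t count;
};

/* Record a status update; returns the entry, or NULL if the table is full */
struct rf_device *rf_seen(struct rf_table *t, uint32_t serial, uint8_t signal)
{
    /* Known device: just update the signal strength */
    for (size_t i = 0; i < t->count; i++)
        if (t->dev[i].serial == serial) {
            t->dev[i].signal = signal;
            return &t->dev[i];
        }
    if (t->count >= MAX_RF_DEVICES)
        return NULL;
    /* New device, possibly a neighbour's: keep it, but unconfigured */
    struct rf_device *d = &t->dev[t->count++];
    d->serial = serial;
    d->signal = signal;
    d->configured = 0;
    return d;
}
```

Keeping unconfigured devices in the same table means the ones belonging to the neighbours show up too, which makes it easy to see how busy the RF "space" really is.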

I am wary of RF connected alarm bits, to be honest. I think wiring in is better. But this is progress for the alarm system to support these.


Playhouse alarm

The alarm project has been pottering on since I installed it in my house. I have it working well here and on a bench setup. I have also been working on decoding more stuff such as the newer RF RIO and RF PIRs that you can get. They are fun. I can even tell when there is movement in one of my neighbours' houses - privacy implications I wonder?

It is all on GitHub now.

However, one thing I also did was wire up the kids' playhouse in the garden.


Yes, that is a kids' playhouse with swings, and slide, and Galaxy Max Reader, and big green exit button, and Galaxy keypad, and small (cabinet/drawer) mag-lock and (large) reed switches. The grandkids love it.

It has been, however, a good test, as I expected it to be. I wanted a platform with a working door that I could test on more accurately than my work bench but not messing up my house alarm and doors to my office here. An alpha test site, if you will. It has raised some fun issues.

First off, I needed a way to configure it so they could not in fact set the alarm and lock the door - yes they can get out by other means, but they were rather puzzled the door would not open. Easy enough.

However, the biggest issue I have is the bus is showing a lot of errors. I have retries, and they work, and the door works and the max reader works, but there are errors, several a second.

One thought is bus termination resistors. This is maybe 20m or 30m of cable, outside. The screen is not (yet) earthed. No termination resistors. The reason is that I found the whole thing fell apart when I added them.

I have worked out on a bench test that this seems to be a relatively simple matter. With the resistors, the bus comes cleanly back to zero after we transmit, and that causes us to catch a BREAK after we transmit, which we were taking as the start of a response message. A simple bit of code to ignore that and it looks like they will no longer be a problem. So that is one mystery solved.
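One way to ignore that BREAK is on timing. This is a minimal sketch only, not the fix used in the alarm code: it assumes the driver can timestamp the end of our transmit and each BREAK in microseconds, and that a genuine response never starts within the guard window. The function name and the guard value are hypothetical.

```c
#include <stdint.h>

/*
 * Returns 1 if a BREAK seen at now_us falls within guard_us of our own
 * transmit ending at tx_end_us, so is taken to be the bus settling back
 * to zero rather than the start of a response. The unsigned subtraction
 * keeps the test correct when the microsecond counter wraps.
 */
int is_spurious_break(uint32_t now_us, uint32_t tx_end_us, uint32_t guard_us)
{
    return (uint32_t)(now_us - tx_end_us) < guard_us;
}
```

The receive code would call this for each BREAK and simply drop the ones it flags, treating anything later as normal.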

Now I have a scope I can also see that we are nowhere near seeing nasty bus reflections or ringing to get in the way of things working without resistors. They are a good idea, yes, and will help in some noisy environments or long lines I am sure, but at 9600 they are probably not strictly necessary. In some ways the flaw in the design is the long (10ms) bus idle periods. A protocol with message lengths and overlapping bus driving turn around could have eliminated the bus ever being idle and made the whole thing a lot cleaner.
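To put the 10ms idle in context, a quick calculation of how many character times that is at 9600. This sketch assumes 10 bits on the wire per character (start bit, 8 data bits, stop bit), which is an assumption as the framing is not stated above.

```c
/* Microseconds to send one character of bits_per_char bits, rounded to nearest */
unsigned long char_time_us(unsigned long baud, unsigned bits_per_char)
{
    return (1000000UL * bits_per_char + baud / 2) / baud;
}

/* How many whole character times fit in an idle period */
unsigned long idle_in_chars(unsigned long idle_us, unsigned long baud, unsigned bits_per_char)
{
    return idle_us / char_time_us(baud, bits_per_char);
}
```

With those assumptions a character takes about 1042 microseconds, so each 10ms idle period is roughly 9 characters of bus time in which nothing is driving the line.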

So lack of resistors does not explain all the errors on the bus.

So, time to play with an oscilloscope on that bus... This looks wrong!


Top (yellow) is the "A" side, bottom (blue) is the "B" side, and middle (red) is the difference.

As you can see, we only have one side. It looks a lot like this bugger is broken... The RS485 driver. This is the second one I have seen like this, the first was when I first started the project and I wasted hours trying to find why I could not transmit. It seems only one side of the bus is being driven!

I'll get another and I bet all of the errors go away. To be honest it is a miracle that the bus works at all with only one leg. Somehow it limps along and the devices cope.

It has, however, been a good test of the logging and fault reporting which clearly shows the bus errors and retries.

In the meantime, it is quite amusing watching a neighbour's RF PIR tripping as they move about!

Thursday, 27 July 2017

Say the words three times, stand on one leg, what else?

After a couple of months I tried the Apple TV again.

Still broken, so I hassled Apple again yesterday, and to my shock they called back today saying they need some more logs. Can I make it go wrong three times and record the times?

This is sort of like saying "Bloody hell Apple!" three times, turn around, stand on one leg, sacrifice a chicken, burn special candles, throw salt over your shoulder, fuck knows. Does that conjure their engineering department? I hope so.

We will see. They say they will call back tomorrow for the details.

P.S. They did call back, as promised, and have the times so they can check logs...

Wednesday, 26 July 2017

Nominet Domain Lock

Nominet have a new service called domain lock, and it is described here.

A little while ago I got an email about this and was puzzled, not only that it was a chargeable service, or the disproportionately high cost, but also that it seems to be targeted at the registrars and not the registrants.

It looks like a lock that stops certain things, like change of DNS or registrant details, unless/until unlocked. It looks like the unlocking is done by the registrar. I would have expected locking to be tied to a 2FA that is known only to the registrant, but reading it, that does not seem to be the case.

The email explained, if I remember correctly, that the idea was to stop any risk of things changing without authorisation. That is odd, as surely any unauthorised change would mean someone was being negligent (possibly Nominet) and that the change can be quickly corrected.

It also does not seem to protect against DNS injection attacks, etc. This is something DNSSEC should do, and is something Nominet do not charge for.

As a registrar, we have the ability (acting on our customer's authority) to make changes to a domain. There is the risk that our security checks are not good and we take instructions to make a change from someone that is not the registrant. We are careful, obviously. We have actually added two factor authentication to our systems (free of charge) to help our customers have the assurance that we would not fall for such scams. But having 2FA from us to Nominet seems like a pointless step: if we lacked good security, we'd take bogus instructions, unlock the domain, make the change, and lock it again.

Indeed, one of the assurances registrants have at present is that, if they fall out with their chosen registrar, they can go to Nominet to change registrar and details on the domain directly for a fee. This means rogue registrars cannot hold people to ransom in any way. The domain locking feature seems to undermine that - as there cannot be any way to bypass the registrar's domain lock by pleading to Nominet, obviously. If that was possible then it would make this service useless.

So I asked Nominet, listing some of the ways a domain could be changed without authority of the registrant or registrar... I really struggle to find many where Nominet would not already be negligent to allow such a change. But I asked about...
  • If the police ask for a domain to be shut down (I say "ask" as I am not sure proper legal authority to do so always exists or that they always present it in such cases)
  • If some copyright related notice requests a domain to be shut down
  • A court orders Nominet to change DNS or other details
  • If someone takes a case to DRS and the registrant loses the case and domain ownership is to be transferred
  • If the registrar does not pay Nominet fees and the domain becomes overdue
Of course, if the registrar stops paying the domain lock fees, does it automatically unlock too?

I have not even had an acknowledgement of my questions, let alone a reply. I assume none of those cases are in fact "protected". Yet, dubious allegations sent to the police against a domain holder, or even hacking one of their pages and then sending allegations, or faking dodgy email from a domain, is one way for someone to "take down" a major domain if they want to, so something to protect against.

Can a company that has the responsibility for the integrity of a database really say "that's a nice domain, it would be a pity if someone was to make an unauthorised change to it, wouldn't it?" and start asking for such a large sum to do its job and protect the integrity of the database?

Have I missed the point of this "service" somehow? Maybe someone can explain the logic here...

Friday, 21 July 2017

warning: comparison between signed and unsigned integer expressions

This is one of the stupidities in the C language and it bugs me because it would be so simple for C to just code it correctly. I'd really like a gcc option to do this!

When you store whole numbers in binary you usually have a choice of signed or unsigned. The signed version allows negative numbers but at the cost of the range of positive values possible.

For example a signed char allows values -128 to +127, but an unsigned char allows values 0 to 255.

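If you compare them, using ==, !=, >, or < for example, the operation converts the signed value to an unsigned value and then compares (at least for int and larger types; smaller types like char are both promoted to int first, so they compare as you would expect).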

Example

#include <stdio.h>

int main(void)
{
   signed int a = -1;
   unsigned int b = 1;
   if (a > b)                /* a is converted to unsigned before comparing */
      printf ("a>b\n");
   if (b > a)
      printf ("b>a\n");
   return 0;
}

This prints a>b even though a is -1 and b is 1!

This is because -1, converted to an unsigned value, is a big number, in fact the biggest an unsigned int can be.

What pisses me off is that, even when C was invented, the code to make the comparison work would have been one check of one bit extra. Basically, whatever the comparison, you just have to check the signed value is negative or not before making the comparison. If it is negative that means it is not equal to the unsigned value, and is smaller than the unsigned value, so whatever comparison you were doing is decided by the signed value being negative before going on to do the comparison as normal.

To me this would have been a far more logical behaviour than changing the value of the signed variable by making it unsigned.
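The extra check described above can be written by hand. This is a minimal sketch for int against unsigned int only, and the function names are made up for the example; it is not a gcc feature.

```c
/* a < b by value: a negative a is always smaller than any unsigned b */
int less_su(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}

/* a > b by value: a negative a can never be greater */
int greater_su(int a, unsigned int b)
{
    return a >= 0 && (unsigned int)a > b;
}

/* a == b by value: a negative a can never be equal */
int equal_su(int a, unsigned int b)
{
    return a >= 0 && (unsigned int)a == b;
}
```

In each case the sign is tested first, and the cast to unsigned only happens once the value is known not to be negative, so the cast cannot change the value. With a as -1 and b as 1, less_su says a is smaller, as you would hope.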

Thursday, 20 July 2017

More on pitfalls of redundancy...

Hindsight is a wonderful thing!

I have been having long discussions this week and today. Many have the benefit of hindsight.

For kit we have in Maidenhead we have two possible ways to connect to the world! One is via the local transit, a single point of failure link. Another is via multiple (well, soon to be) diverse fibre lines to different London data centres where we have multiple transit and peering.

Even before the second leg of our ring, one leg is a single transit and the other is several transit and peering links via multiple (pairs of) routers. And even that allows fallback via the single transit link, just in case.

The problem, as ever, is a partly ill link: one that seems to be valid for traffic but is not. We had that today.

Announcing their routes primarily via local transit could work, but transit back out to the world being local is more complex. We would be offering a less redundant, and somewhat specialised, solution...

So we have the issue of hindsight versus reality. Ongoing, a link with more redundancy is better. The last few days it was not...

So do we offer knee-jerk services that are technically worse looking forward? Or do we say no, this is shit that happened, and was only wrong in hindsight?

It is almost like the good old days, err...

Today we (A&A) had another brief outage impacting broadband, ethernet and hosted customers, and VoIP. It was a bit complicated as it was one side of an LACP and so probably half of things were working and half not, and it looks like pretty much all broadband went down.

It was an error on our part - the ops team have been working hard all week, and working with a consultant, to help us investigate last week's issues with the Cisco switches. They have done a number of changes (adding more logging, etc) and diagnostics during the week. At each stage they have to assess the risk and decide if they can go ahead or wait until evening or even overnight. A change today to bring back one of the links between the London data centres (one shut down on Friday) so we can test it independently of the normal operation resulted in breaking the switch links. Even the consultant thought it would be OK.

I think I can elaborate a tad more on things we know. I am sure the ops team will shout if I have misunderstood. At this stage there are aspects of what happened that are still unclear. This means we are adding some "defensive" config to try and address possible causes for the future.

The main issue, it now seems, was that the BGP links to all of our carriers from all of our switches all failed at the same time. Yeh, so much for redundancy! These are private links on a separate VRF and not connected to other BGP. The BGP is with routers on the end of locally connected single fibre links (of which we have many), not LACP or anything complicated. So the failure has to be entirely within the Cisco switches. We can almost certainly rule out hardware impacting all at once. Also, being on a separate VRF and not seeing Internet traffic at all, some attack from outside seems unlikely. This leaves us with the possibility of some sort of unstable config on the switches, maybe something spanning tree related (I hate spanning tree), or maybe some BGP issue with routes received from carriers, which seems unlikely, but maybe not impossible. So there is a lot of careful review of things like BGP filters from carriers, and spanning tree config, and so on.

The "fix" was rebooting half a dozen Cisco switches. On Thursday this worked, but it took some time to conclude that was a sane thing to do, when other options were exhausted.

As I am sure you can appreciate, just "turning it off and back on again", or rebooting the switches, really is a last resort. We have highly skilled engineers who spent some time trying to diagnose the actual issue before taking such a step, and that is one reason these issues can take some time to fix. Sometimes a reboot can fail to solve anything but lose valuable clues.

On Friday that worked too, again we tried to understand the issue first, and got a lot more information. The reboots seem to have triggered a second issue with one of the switches being stupid (as per my other blog post) and coming up in a half broken state. Rebooting that one switch again sorted it. It is almost unheard of to have two different issues like this, one after the other, and that really threw us as well.

A lot of this week has been understanding the way the Cisco switches are set up in much more detail, and adding more logging, and updating processes so we have a better idea what to do if it ever happens again - both fixing things more quickly, and finding more clues as to the cause. It may be that we have mitigated the risk of it happening by the changes being done. We hope so.

Obviously this sort of thing is pretty devastating - I am really unhappy about this, and really sorry for the hassle it has caused customers.

As I say, it is not really like the "good old days" when BT would have a BRAS crash pretty much every day. These days we expect more, and our customers expect more.

So, please do accept my apologies for the ongoing issues, and my reassurance that they are being taken very seriously.

Adrian
Director, A&A

Wednesday, 19 July 2017

Github

I have now issued a number of projects on GitHub... All under GPL.

Here.

They include the current build of my alarm panel.

Bricking it!

Well, pictures are out on twitter now - the new FireBrick sort of exists - but don't go trying to order them yet. We have many more steps to take before we will have stock, some months (I'll have a better idea tomorrow). There are little details like EMC testing for CE marking, and so on, some of which could cause delays. And still, it is made in UK!

However, seeing as there are pictures, I think I should say a few words about the new FireBrick model, the FB2900. This should avoid speculation, at least. We are still selling the FB2700, and I am not in a position to say anything about FB2900 pricing yet, this is purely some technical comment.


That is not the whole box, with no SFP screen or light pipes, etc. But you can already see some of the changes.



SFP port

One of the most obvious changes is that we have moved back to a 5 port format, as we had on the older FB105 models. But the extra port is SFP. This means it will be able to take a normal copper Ethernet port, but also various types of direct fibre links. Apart from use in data centres, and one each end of fibre links between buildings, this is thinking ahead to the days of true fibre internet services in the future.

Power supply

Anyone that has looked inside the existing FB2700 model will see we have a completely new PSU design. The change in design has allowed us to make a variety of different PSU options.

We have an option for automotive (12V and 24V). This is far more complex than it sounds - really! Automotive supplies allow for something called "alternator load dump", and high voltage spikes, and a range of voltages from the supply. They have a lot of safety aspects to consider as well. However, this allows for FireBricks to run in cars, and trucks, and alarm panels, and all sorts of places where there are DC supplies.

We also have an option for higher DC voltages found in telecoms racks in data centres (-48V), an option we already have on the FB6000 series.

Even the mains voltage option is different, with the main board using 12V, we have a wide choice of suppliers for the PSU components. We have stuck with the "figure eight" power lead though.

Faster, better, stronger, we can rebuild it... etc, etc.

The new design has a faster processor, and removes a key limitation that stops the FB2700 doing much more than 350Mb/s. We have not got to the stage of benchmarking yet but expect it will be a lot faster. We are expecting faster crypto as well - we'll say more on that once we do have it all coded and benchmarked.

Brackets!

We have said this before so many times and not followed through, but this time it is real, honest. We have wall mount brackets and 19" rack mount brackets (for one or two FB2900s in 1U). I know, pics or it did not happen - just watch this space.

It's cool!

The power usage is lower, so the whole FireBrick will be a lot less on fire. The existing models are designed to cope with the heat, but in a confined space can get warm. The new model uses less power in the first place, and so we expect it to be a lot cooler...

Firmware?

Obviously we are always adding more to the firmware and more features will come along for the FB2500, FB2700 and FB2900 models. Software upgrades are still free, as always.

P.S. (and it should not be a P.S.) there are some good people working on making this happen, like Cliff and Kev, and they need some credit for this all coming together.

Porn ID checks set to start in April 2018

As per BBC article. I hate having to repeat myself, so I'll make it quick.
  1. Is there a problem to solve? Show us the evidence please.
  2. Is there a solution already? PCs can be set up with basic parental controls, and ISPs (even A&A) can help with that, so young kids that have no interest in porn can avoid it.
  3. Is this a solution? Older kids that want to see porn will absolutely not be stopped by these measures, so no.
  4. Will this work at all? Foreign web sites that are free are not covered, those paid by advertisers cannot be stopped by blocking card payments. Maybe some that take cards now will comply, but they sort of do as they take cards already.
  5. Kids can get cards! (albeit pre-pay or debit rather than credit cards) which makes one means of age checking somewhat harder
  6. How do you tell my age? It is almost impossible to make an age verification system that works remotely over the internet that cannot be fudged somehow. E.g. use a parent's card details, simple as that. Anything an adult can type in can be typed in by a teenager. Even live video chat cannot be trusted - how long before there is an app to make your live face and voice seem a lot older in a convincing way?
Now for where it gets really bad...
  1. The age verification companies appear under no special obligation to secure the data, in spite of calls for this. If any are outside the UK they may not even have the Data Protection Act to consider.
  2. It is going to be nearly impossible to verify age without verifying identity, and almost impossible to come up with a way that cannot then be correlated to the site, and specific pages and sections of sites that identifiable people are accessing. This data will be hacked and leaked.
  3. Free porn sites are simply not covered by the legislation anyway, so what is the point?
  4. Web sites offering free porn (even those linking to or copying other sites) will now start asking for personal details to prove you are 18, including card details. They will be able to link to UK law and UK government information to justify asking. People will expect such sites to want lots of personal information. People will give details and then when scammed they are not that likely to complain or claw back as they are too embarrassed to go to their bank.
  5. The scams and risks target minorities especially - those that, more than most of us, do not want anyone to know their sexual preferences, even though totally legal.
Also, if you do access a porn site, using incognito or privacy mode so not in your browser history, you won't have cookies, etc, and so will have to do age verification every time?!

Remember, we are talking about legal content from legitimate businesses here...

Tuesday, 18 July 2017

To open source, or not to open source? That is the question...

As I have posted, I have had some fun making a new alarm panel. There are higher levels still to do, and I am sure many features will be added as it is used in anger. I have it at home now, and office will follow later. I am even making a system for the climbing frame/swing thing in the garden as a bit of an alpha test platform.

It was started as both a bit of fun and out of necessity. The necessity is that we have a number of small niggles with the Honeywell Galaxy systems we have here and at the office (see below). One way to address some of these would be to get a new alarm installer and maintenance company - but I am confident that it would (a) not solve all the niggles, (b) cost a lot to solve those that can be, and (c) cause more problems as it would mean calling them if we have to reset the alarm for any reason. So, yes, making my own alarm panel is not as daft as it sounds, especially as I have the skills to do that. Reinventing the wheel is one of my specialties!

Having spent a couple of weeks making some code, and making prototypes, and a working system - I now have a nice solution.

Will it monetise?

As a businessman I have to consider if we can now make money out of this. Right now it is A&A copyright, and A&A has paid for my time and the various bits of kit to develop it all. The company deserves something, either money, or customer acquiring kudos. A&A will get a new and better alarm system, which is a start!

It is tricky. I expect that most can be made selling, installing, and maintaining the actual systems for people. Retrofitting where a Galaxy panel exists for example, selling a system review, and selling maintenance and monitoring. Now, A&A have done telephone system installation in the past, on a small scale, but we are really not geared up for "field engineers" for installation or maintenance, to be honest.

So could I just sell the system - well maybe, but ultimately it is just s/w, and that is not that easy to police and sell. We could make physical systems, a panel box with RIO PSU and Pi and the RS485 leads, all neatly put together - maybe. We could resell the alarm system parts even? It also creates some possible liability issues - what if someone is robbed because of my code having a bug? And then we would need to establish dealers and relationships with those companies that do in fact do installation and maintenance and so on. These are all, for now, well out of our comfort zone as a company.

So, to be honest, I am not convinced I can make a lot out of it, yet...

Should I open source?

Well, this is, in part, unrelated to the above. It is possible to publish source and have restrictions on use. It is possible to have some controls. It is actually quite hard to avoid and police pirate copies even if not publishing source. It is also quite possible to make and sell systems that are built on free open source software. Publishing the source code is a good thing to do, and if we are unlikely to make any money from it, that helps make it a no brainer. Or does it?

Vulnerabilities?

I may have vulnerabilities in my code that I have not spotted! If published, someone could find them and exploit them to rob us, or other people. Of course, they don't know if we have any backup systems to alert us anyway - some part of the old alarm panel still running just for alarm/security and not the problematic door control issues... It would be a risk to try and exploit such issues. I also have slightly more faith in humanity and would expect someone to tell us if they saw an issue in the code, or at least not bother to exploit it, or at least not have the contacts to let someone else exploit it. So maybe safe to publish anyway.

Who controls the "official" version?

It always helps if there is an "official" version, and if we ever get it to meet security specs or British Standards that also helps confirm which versions do that. But who controls it? There are many ways to release code from just putting it out there, to putting on a community repository, to accepting code updates, or suggestions, or changes, and simply controlling the master copy ourselves.

If anyone can submit new code, or make changes, one has to be careful. It would be easy to hide a deliberate vulnerability or introduce an accidental one.

I was pondering, for example, I have key fob codes (numbers) and zero is invalid. If someone just removed the if(e->fob) line from the code that does nothing with a zero, then zero would be valid and may match any user without a key fob defined as they have zero stored - thus allowing a crafted key fob with zero ID to work to unlock and enter. But they could be a lot more subtle - e.g. if the key fob codes have a check digit (I assume not, but I don't know) someone could replace with if(keyfob_valid(e->fob)) and code that function to check the check digit but carefully so that an all zero fob will pass. That would look like a good and valid enhancement, and the loophole for zero would be subtly hidden. Just one idea of ways to hack the code with only a few minutes thought.
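To make the subtle version of that loophole concrete, here is a minimal sketch. It is Python rather than the C of the real code, and the check digit scheme (last decimal digit equals the sum of the other digits, modulo 10) is invented purely for illustration, as are the names check_digit_ok, keyfob_valid_subtle, keyfob_valid_strict and find_user.

```python
def check_digit_ok(fob: int) -> bool:
    """Hypothetical scheme: last decimal digit = sum of the other digits mod 10."""
    body, check = divmod(fob, 10)
    return sum(int(d) for d in str(body)) % 10 == check


def keyfob_valid_subtle(fob: int) -> bool:
    # Looks like an enhancement, but an all zero fob passes:
    # body is 0, check digit is 0, and 0 % 10 == 0.
    return check_digit_ok(fob)


def keyfob_valid_strict(fob: int) -> bool:
    # Keeps the original "zero is never a valid fob" rule explicit.
    return fob != 0 and check_digit_ok(fob)


def find_user(users: dict, fob: int, valid):
    """Return the first user whose stored fob matches, or None."""
    if not valid(fob):
        return None
    for name, stored in users.items():
        if stored == fob:
            return name
    return None


# bob has no key fob defined, so zero is stored for him
USERS = {"alice": 1236, "bob": 0}
```

With keyfob_valid_subtle a crafted fob with ID zero matches bob; with keyfob_valid_strict it matches nobody. A reviewer skimming the change would have to think about zero specifically to notice the difference.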

Go for it!

Let's assume we do want to make it open source, what is the best platform? Ideally where I have control of at least the main parts of the code in the "official" version. I have not done this for a while, so interested to hear.

I released some L2TP code for linux many years ago and apparently now it is some big and sophisticated system that some ISPs actually use! Totally unrecognisable to me, with loads of new features. Still has my name, which is nice.

I issued some SMS code for Asterisk once, and asterisk has changed and so has my code. Asterisk have an interesting community code copyright model. Apparently it has changed to the point it stopped working, and nobody was interested in fixing it (not even me, now). I coded something from scratch for when I needed it recently, outside asterisk framework (SIP, and alaw RTP).

We have published code on the A&A web site, and people use that with rarely a suggestion or bug report, but I know some is used by lots of people just from some of the feedback I have had (the linux drivers for Epilog laser engraver, for example).

To some extent it depends how good, and complete, my code is, which is partly why I have delayed releasing so far (I do not think complete enough, yet). If I have done it right people will use it and have hardly any suggestions or bug reports.

Open source code also benefits from the usual disclaimer of no fitness for any purpose nor liability for bugs. That is a good start. I am happy with our normal "money back guarantee" on stuff we provide for free, and I think it is more than fair :-)

The code is also layered, so a good chance the low level will just work and the higher levels will have lots of ideas and suggestions and changes, so it may even be appropriate to publish different layers using different publication models.

Indeed, one model may be open source the components but make a "system" with the web based config and status that is not open source and looks nice, and we sell that as a solution (with the usual open source disclaimers?).

Do say what platform we should use to open source this...



P.S. Some of the niggles with the galaxy system and how I have addressed them - how many of these would be addressed by getting in a "proper installer/maintainer" I wonder?
  • You hold the fob 3 seconds to arm/set the system, but actually it is implemented such that you can use fob, and then use fob again in 3 seconds, so two separate uses can mean arming by mistake. New system expects actual hold for N seconds.
  • Once set you use the fob to unset, but then have to use again to open the door. Be careful not to be 3 seconds apart doing that. New system unsets and opens door in one go but does beeps to confirm unset.
  • The doors seem to have a lag at the office. New system allows 4 buses per Pi so can minimise lag. It also allows a lot of debug to find underlying issues on buses.
  • If you have a door with a mag lock at the top and reed switch at the side (even at the top by the mag lock in one case at the office), pulling the door whilst locked will trip the door forced alarm. New system can reduce lag, and also has configurable tug timer for this - if below say 0.5s then don't trip door force.
  • If the door almost closes and you grab it before it hits the mag lock but after it hits the reed switch, you get door forced. New system has locking stage and locking timer, or monitored mag lock input, which means opening at this point is not a door force.
  • If you have V locks, they motorise to engage, and so are not instant like a mag lock. This allows a door to bounce and trip the door forced alarm, especially if "slammed". New system allows open whilst "locking" without being a door force.
  • The external reporting via ethernet is crap! We have coded it, but it is messy. New system has lots of more direct reporting including HTTP GET/POST, Email and SMS with multiple reporting, and reporting selecting different groups, users, event types, etc.
  • Even the proper programming system (rather than using keypad!) is some nasty windows software. New system is currently XML files, but small and easy to work with, and will be some nice web UI.
  • Max readers scramble the true number they see so you have to buy special Honeywell key fobs with the scrambled number printed on them in order to use with the Galaxy system. The new system reports the ID it saw from the reader so any compatible key fobs or cards will be usable.
  • We'd love to do clever things like expect alarm set by certain time and report, or disable specific PIRs for Roomba at specific times of night. These will all be possible as we add more to the code, including time profiles.
  • The Galaxy system allows you to add a new Max reader - it will scan from 8 down to 0 and show the highest reader ID it can see on the bus, then allow you to set it to a new ID (0-3, or 0-7 depending on panel). A new Max appears as ID 8. It does not even say which IDs are in use already, so even for a new install you have to keep track of which you have added. It does not allow you to reprogram any that are not the highest number on the bus! The new system lets you change any Max to any other and reports what it finds on the bus.
  • The Galaxy has no way to properly integrate door lock engaged inputs that come from monitored mag locks or V locks. New system does and can detect door ajar events neatly.
  • The Galaxy allows door propped timer limits but no way to override on per case basis, hence we don't use them. New system allows key fob to override (if they are allowed to), logging who allowed door prop.
  • The max reader beeps (loudly) when door lock released, but I have V locks that make a nice clunk and do not need a beep. Indeed, every Max reader I have seen has blu tack on the sounder as annoyingly loud. New system allows silent door open and then uses beeps as an actual error case, so can be loud.
  • User names on the Galaxy are stupidly short making for almost unreadable reports from the system for some users. Also makes it hard to use same names on other systems - we have used Galaxy logs for time recording but needed a mapping of the abbreviated names. New system allows any length user names making it much simpler.
  • Of course the Galaxy simply lacks the new features we have now dreamt up, like air-lock door pairs. This is less of a niggle though as we don't need this in the office, yet!
I bet I have missed a few niggles, but you get the idea...
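As an illustration of the first niggle (an actual hold rather than two uses within 3 seconds), here is a minimal sketch in Python. It assumes the reader reports a stream of "fob present" and "fob gone" events with timestamps, which is my assumption for the example and not a statement of how the Max reader really behaves; held_long_enough is a made-up name.

```python
def held_long_enough(events, hold_seconds=3.0):
    """events: list of (time, present) tuples in time order.

    True only if the fob stayed present continuously for hold_seconds.
    Two short uses a few seconds apart do not count as a hold.
    """
    start = None  # time the current continuous presence began
    for t, present in events:
        if start is not None and t - start >= hold_seconds:
            return True
        if present:
            if start is None:
                start = t
        else:
            start = None  # fob removed, any hold in progress is over
    return False
```

Two quick uses such as [(0, True), (0.5, False), (2.5, True), (3.0, False)] are not a hold, whereas [(0, True), (3.2, False)] is.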

Monday, 17 July 2017

When you know the design is bad

This is probably something only a proper software engineer knows, but there are times when you just know your code is not right.

One of those times is when, during the testing, you find you are having to code for each edge case specially.

Now, don't get me wrong, some code has to have code for special edge cases. Edge cases are the boundary definition of any system. They happen. Error cases are always a problem, but this is about edge cases of "normal" operation.

I don't like them and they are always a clue of bad design. I had an edge case in my door logic that had to be patched three times. That is a massive clue I have the design wrong.

I designed some new logic in my sleep, and had to get up and document it at 3am else I would never get back to sleep. Coding / design in my sleep is not new. Try and convince HMRC of that for R&D tax credit claims though?!

This evening I implemented the design and it was almost perfect - minor change of some ordering, but it avoids those annoying edge cases even cropping up. It makes for much "cleaner" code, which is a clue you have it right. In some ways the idea that beauty is simplicity applies to code as much as to quantum physics!

As ever, the trick is modelling reality. I have locks that take time to engage or disengage, so model that and model the state of the lock based on the outputs, timers, and the feedback for lock engaged input if we have one. That allows me to know the state of the lock(s) and use that as an input to defining the state of the door.

Yes, traditional state machines are entirely driven on state and event, but if you can make state from status, that makes for a no-brainer state machine and is usually nicer.
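A minimal sketch of "state from status", in Python rather than the real code. The state names, the function names lock_state and door_state, and the rule that a door open with the lock fully engaged counts as forced are all simplifications invented for the example.

```python
from enum import Enum


class Lock(Enum):
    LOCKED = "locked"
    LOCKING = "locking"
    UNLOCKING = "unlocking"
    UNLOCKED = "unlocked"


def lock_state(output_on, since_change, engage_time, engaged_input=None):
    """Work out the lock state from current status, not from past events.

    output_on:     True when we are driving the lock to engage
    since_change:  seconds since the output last changed
    engage_time:   how long the lock takes to engage or disengage
    engaged_input: feedback from a monitored lock, or None if not fitted
    """
    if output_on:
        if engaged_input is not None:
            return Lock.LOCKED if engaged_input else Lock.LOCKING
        return Lock.LOCKED if since_change >= engage_time else Lock.LOCKING
    if engaged_input is not None:
        return Lock.UNLOCKING if engaged_input else Lock.UNLOCKED
    return Lock.UNLOCKED if since_change >= engage_time else Lock.UNLOCKING


def door_state(lock, door_open):
    """The lock state is just another input to the door state."""
    if door_open:
        return "forced" if lock is Lock.LOCKED else "open"
    return lock.value
```

Because everything is computed from what is true right now, a door opened while the lock is still "locking" simply comes out as "open", with no special case needed.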

The result looks good, and I have reduced the number of door states as a consequence, which is always a good sign.

For a change it was documented (at 3am) and revised as I coded. My test door / lock was helpful before making live on my home alarm system...


Saturday, 15 July 2017

In theory, theory and practice are the same, but...

Most of this week was spent doing other things, but I did find the time to actually build an "alarm panel", and today I took the plunge to install it at home in place of the Honeywell Galaxy system.

It is made using a RIO PSU box, the size of an alarm panel with a RIO and power supply, and includes battery charger, with lots of space. I added a second RIO as I'll need more than 8 inputs in total. And I added the DC supply and PI. Simples.


It has taken me all day, and I still have more sensors to connect up now, but that should be the easy bit! However there were a few pitfalls, which is why it has taken all day.

No space

First off, as you can see, where I wanted to put it, and sort of had to put it because of where cables would reach, I had no space. In fact there was a second battery box there originally which I don't really need.

Power off

I turned off the extra battery box, and was going to use the lead for the panel. I realised it had lights on, well it would, it has a battery, but I disconnected the battery and it was still lit up. To my surprise the power switch did not work and it was all live. OK time to turn off the circuit, remove that switch, and sort it out - my daughter's partner was here, and is a sparky, so he lent a hand to do it pukka. Thanks Jim.

Resistance is futile!

As you would expect - when I installed the panel in the first place I fitted the proper RS485 termination resistors to the buses.

When fitted they simply do not work with the FTDI USB serial I have. It even had a termination resistor in it to use on two leads, but fitting those breaks it. I had to actually remove them?!?

Are you positive?

The first thing I connected was the existing doors, well, one of them. That was simple, but something was not right. It took me a moment, but the lock was locking when you wanted to open the door and unlocking when it was closed. It seems I had misunderstood the polarity of the bits in my testing somehow. It is good to have a working existing installation as otherwise I'd have assumed it was meant to work like this and wired up accordingly. As it happens the locks I use here need a relay board anyway as they expect a reversed polarity. Apart from fixing my bug I also added a feature to allow the control to be inverted for cases like the locks I am using. This will allow me to remove the relay boards.

Beeping hell

If you have ever used the "Max reader" you will know it beeps. If the installer has not included the door open sensor, it beeps for as long as the door open time is. It is a tad loud. With the sensor it beeps until the doors open. Most of them have blu tack in the beeper to quieten it.

However, here, the locks are motorised so you can hear it is open without the need for a beep. So, as I have full control, I have made the beep on open optional. It still beeps for error cases, but these can be a lot louder now with no blu tack. Well done James for suggesting it.

Too quick for me

We had one door in service, and in use and a lot of coming and going. I was working on the other door, but we quickly realised that the door we had would sometimes come up door forced. Given the work I had put in to avoid this I was rather puzzled.

It only happened if you basically leant on the door and hit the open button, and fell through it, sort of. Opening the instant it was possible. Turns out the window was up to 100ms.

Well, the doors work on a state and if the door is found open in closed state, that is forced. It expects to go from closed to opening and then find it is open in that state so as to move to open state. The problem is the door release was quick enough that the state machine in a different thread was not spotting the state change for unlocking the door and hence seeing the door open in the closed state and so decided it was forced. The fix was simple, if we released the door, then that is allowed.

And still too quick

A bit later we realised the door was not locking. It would be opened and then closed, but not locked. The issue is my above fix actually stopped the door going to forced state but did not in fact move the door to open state, it sort of got stuck in a rather inconsistent state of closed state but the door open. OK, that was easily fixed by actually moving to the open state.
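The first two fixes can be sketched as a single transition function. This is a much simplified Python illustration with invented state names and a made-up released flag, not the real door logic.

```python
def next_state(state, door_open, released):
    """One step of a simplified door state machine.

    state:     "closed", "opening", "open" or "forced"
    door_open: what the door sensor says right now
    released:  True if we released the lock, even if this thread
               has not yet seen the move to "opening"
    """
    if state == "closed" and door_open:
        # First fix: a door we released is not forced.
        # Second fix: actually move to "open" rather than staying "closed".
        return "open" if released else "forced"
    if state == "opening" and door_open:
        return "open"
    if state == "open" and not door_open:
        return "closed"
    return state
```

The trap in the second bug was returning the unchanged "closed" state in the released case, which avoids the false alarm but leaves the state disagreeing with the sensor.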

And still too quick

Later in the day it happened again and I was cursing. I think I finally have it sorted now. The state machine was actually looking for change from opening to any other state as the trigger to stop the lock opening - e.g. engage mag-lock, etc. So it was staying dis-engaged and then the door was closed with the door lock set to open so it promptly decided it was opening state again.

Some days I like state machines, and most days I do not. This is why the play house would have been a good idea for testing.

Why no fireworks?

Anyway, with lots of things working, including the lock engaged inputs on the doors, I decided to try and sort battery errors. I fitted the battery, and I tested. I turned off the mains and it all died?!?!

I was cursing, and it took me a while to spot a red spade clip on a black terminal and a black spade clip on a red terminal. I am impressed that the RIO, which not only runs from a battery, but charges it, managed to safely do nothing. Having swapped, I get battery and mains error reports and voltage information. Yay!

Can't spell, tutt tutt...

I wasted ages with debug trying to work out why outputs were not working, until I realised, eventually, that I spelt it outut instead. The XML config is currently ignoring any extra objects or attributes to allow for higher levels to hold data on things like images and locations on a map and so on, and so it was ignoring my config line completely. I plan to have a check against an xsd or some such at a higher level, later.
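Short of a full xsd, even a simple "warn about names I do not recognise" pass would have caught this. A minimal sketch in Python using the standard xml.etree.ElementTree module; the element and attribute names in KNOWN and the example config are invented for illustration and are not the real config format.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: element name -> attributes the low level code understands
KNOWN = {
    "system": set(),
    "input": {"id", "name"},
    "output": {"id", "name"},
}


def unknown_items(xml_text):
    """Return a list of elements and attributes the low level would ignore."""
    problems = []
    for el in ET.fromstring(xml_text).iter():
        if el.tag not in KNOWN:
            problems.append(f"element <{el.tag}>")
            continue
        for attr in el.attrib:
            if attr not in KNOWN[el.tag]:
                problems.append(f"attribute {attr} on <{el.tag}>")
    return problems


CONFIG = '<system><input id="1" name="PIR"/><outut id="2" name="Bell"/></system>'
```

Here unknown_items(CONFIG) returns ["element <outut>"]. Returning a list of warnings rather than rejecting the file keeps the door open for higher levels to store their own extra data.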

Next step

Connect all of the rest of the PIRs, and a lot of testing. It already texts me for events, and will be added to nagios to tell if it fails. This should provide a pretty good alarm system for home.

Is a PI going to be OK?

This is a good point - they are hardly "industrial". However, it boots very quickly, probably as fast as the Galaxy. It seems to just work. Even so, Andrew at the office had a bright idea I may pursue...

Dual redundant PIs! They could easily see if the bus is in use, and wait, stepping in if the first PI dies. RS485 is very nice for that sort of thing. One to consider if I do run in to any reliability issues.

Open source

Still not released, and even if I do, it won't be until I am really happy with it. I may allow some people to test though.

Friday, 14 July 2017

Pitfalls of system redundancy

Given the events of last night and today it is worth my writing up a bit about redundancy.

The way most things work in somewhat "industrial" IT is that you have two of everything (at least).

There are many ways this can work. There are things like VRRP which allows more than one device to be a router. There are things like LACP which allows more than one Ethernet port to work together as a bundle. Then there are things like BGP which allows more than one route with fallback.

Now, all of these work on a very simple logic - you have more than one bit of equipment or cable or fibre, and if one breaks you can carry on by using the other(s).

This is good, and this works. Mostly.

But there is a problem!

The problem is when something does not quite break!

All of these systems rely on the broken kit being dead, not working, same as turned off. And then it all detects the failure and falls back.

What if a switch is apparently working, all ports work, and some local communications work, and even some of the ports are passing traffic, but somehow some things are not passing some traffic?

That is bad, really bad. Not only do the fallback systems not realise, and they keep sending some of the traffic to the ill switch, but trying to identify the issue is a nightmare.

What happened today?

I have to say that we had some "special shit" today. At one point we had a case that I could not access a router from my laptop but my colleague sat next to me could! We had two routers that were able to ARP each other, well actually one could ARP the other but not the other way around and they could not ping!

The usual tools like ping and traceroute to find the break in a network simply did not work!

We had links that allowed some traffic and not others, and that was mental.

The pain in the arse here is something called LACP. This works using two (or more) links, and we have a lot of it for the very purpose of redundancy. The LACP links pick an interface from a set using a "hash", typically of the IP addresses and maybe even the ports.

This means that some IP to IP traffic uses one link, and some uses another. And if one of those links is "ill" that will simply not work and drop the packet, and the BGP session, and all sorts!
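A toy model shows why this is so confusing to debug. The sketch below is Python and uses CRC32 of the address pair purely as a stand-in; real switches use their own hash over whichever fields they are configured for, and pick_link and reachable are made-up names.

```python
import zlib


def pick_link(src_ip, dst_ip, links):
    """Pick one member link for a flow from a hash of the address pair.

    The same pair always lands on the same link.
    """
    key = f"{src_ip}-{dst_ip}".encode()
    return links[zlib.crc32(key) % len(links)]


def reachable(src_ip, dst_ip, links, ill):
    """A flow works only if the link its hash picked is not an "ill" one."""
    return pick_link(src_ip, dst_ip, links) not in ill


LINKS = ["port1", "port2"]
```

With one of the two links ill but still up, each source and destination pair either works every time or fails every time, which is exactly the "my laptop cannot reach it but my colleague's can" effect. Nothing falls back, because as far as LACP is concerned both links are fine.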

A layer 2 problem...

The issue we ended up with was clearly a layer 2 issue, Ethernet. We had equipment only on that one switch that was not responding. We had weird issues when sending traffic to that switch with another on an LACP link. Basically, we ended up with a "half broken" switch.

A lot of Cisco debug (spell correct said "drug" not "debug" on that) was added and a reboot. It came back fine and is now working...

So do we have a switch with an intermittent partial failure? Is it hardware (self test says no)? Is it some hack and vulnerability? Who knows?

I am not sure we can fully explain the whole of the issues we have with that one switch being iffy, maybe at a stretch.

Next steps?

We need to make the switch less important (we have) and see if it fails again, or if something is affecting other switches. It would be simple if this was a hardware issue: we swap out and/or repair.

The problem is that if this is some hack, then we have the same problem on other switches. We are not quite on latest code, but planned upgrades are in the pipeline already.

We have reports of people on TT retail that also dropped and maybe even BT? I am not convinced.

We have to wait and see.

What did we learn?

We went off on some tangents with this - the whole way normal dual redundancy works was not, well, "working". We had to try shutting things down to see what made it work. We even shut down some links that were a bad idea for hosted customers for a bit. Sorry.

We now know to look for the LACP related anomalies in this set up. We have found these in carrier networks before, but we simply were not used to this in our layer 2 network.

We learned a lot on how to extract info from the Cisco switches.

If it happens again we know where to look, and no, it is not to FireBrick, but to Cisco!

Thursday, 13 July 2017

Definition of income / turnover

I looked up "income" in a dictionary and got two definitions.
  • money that is earned from doing work or received from investments
  • a company's profit in a particular period of time
Neither of these meets the Office for National Statistics definition. They apparently want some stats from us, as they do occasionally, and this time it was our "income for June", or "turnover for June"...

Now, I, and Alex, actually assumed that would be VAT exclusive invoiced total for invoices in June. Or possibly adjusted for the invoice period for services, e.g. if an invoice was for June+July, we would only total the June part. Both of these are figures we have available from our accounts without too much bother.

It is interesting that the dictionary (it was just one I looked at) says for a company it would be profit which is different yet again. But they did say "turnover" as well as "income" pretty interchangeably. Even so, may not be impossible to get from the accounts.

The ONS, however, have a special definition - they want to know the actual money received. I.e. what went in to the bank.

OK, that is again not either of the dictionary definitions, as we "earned" money simply for doing work and invoicing for it even if that money has not yet arrived. Even so, that is not too hard to find. We have the banking data, we can look at the total of incoming payments.

But no, they want the VAT exclusive total of received payments. Now that is harder. Whether a payment has a VAT element depends on which invoice it relates to. If someone sends money we will allocate by default to the earliest invoice. Now if that was a late payment penalty with no VAT then that payment has no VAT. If we then find they did not pay the late payment invoice but a later invoice, when we re-allocate the payment, now it does have VAT as part of it.

So working out the VAT exclusive value of payments is both complicated, and can change after you worked it out.
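A small worked example, in Python, of how the same payment changes value. It assumes a 20% VAT rate and a made-up payment of 120; vat_exclusive is an illustrative helper, not anything from our accounts system.

```python
from fractions import Fraction

VAT_RATE = Fraction(20, 100)  # assumed rate for the example


def vat_exclusive(payment, invoice_has_vat):
    """VAT exclusive value of a payment, given the invoice it is allocated to."""
    if invoice_has_vat:
        return payment / (1 + VAT_RATE)
    return payment


payment = Fraction(120)

# Allocated by default to the earliest invoice, a late payment penalty (no VAT)
as_first_allocated = vat_exclusive(payment, invoice_has_vat=False)

# Later re-allocated to the normal invoice the customer meant to pay
as_reallocated = vat_exclusive(payment, invoice_has_vat=True)
```

The same 120 in the bank is 120 VAT exclusive under the first allocation and 100 under the second, so a figure reported for June can be wrong by July without a penny moving.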

It is also odd as we have people that we trade both ways with and the actual money transferred either way will be the balance, and not reflect what we have invoiced at all. This will probably account for many thousands, and so will impact their statistics.

Why on earth are they picking such an obscure statistic to collect, and why not simply ask for actual total money instead of expecting us to work out a VAT exclusive figure? And finally, why are they not asking the bank! I suspect they have the power to.

Update: We called back with the figure, and now they say "no, we want the VAT exclusive invoice total". Here is the call recording to prove we are not going barmy here (MP3).

Tuesday, 11 July 2017

Busy week

A week ago I started on this fun little alarm project, and I am quite pleased with progress.

The list of things to do is dwindling, and the functionality of an alarm system is all there now. I have to improve the keypad UI to show which inputs were triggered, but I have a log for that. I could start using it now.

I have inputs and outputs and tampers and faults and warnings and fire alarm and latching things and resets by the user and login on the keypad and beeping and exit routes and entry routes and timed set and part set and instant set and delayed bell and bell time limit and rest time and almost all of the door control features I planned plus several more and, well, a complete alarm system really :-)

The next stage really is lots of testing and working with my friends who are making bench tests and the like. I need to look at the SSAIB stuff and see if we can meet their specs too.

It has been fun, and has had lots of back tracking and tidying up and re-thinks.

So now is a good time to actually try this outside I hear about, and go for a meal with some friends this evening. Should be fun. It looks wet out there though :-)

Sunday, 9 July 2017

SolarSystem?

I have been doing this alarm project for a few days now, and it started as a bit of fun, but is turning in to something quite interesting I think, so I'll keep posting updates.

As with many systems there are a lot of layers involved, and I have been working my way up the layers (which is how I usually do things). I usually get bored by the time we are talking about what colour the icon for a PIR is on the web page :-)

The low level layers are all about the RS485 bus, and the reverse engineering of a normal Honeywell Galaxy alarm panel. That has gone well, but even today I was revisiting a few bits like control of the backlight on the keypad. Over the weekend I have slept on the design and as I expected re-done several parts of the low level library to handle the bus and "door state machine" logic.

One of the main reasons for a re-work like this is that things are never quite the same when you come to using a library. I made my initial design, but when using it I found I was having to make messy code, which is always a bad sign, so I went back and made changes. The result is something I am a lot happier with.

I have now started the higher level design of an alarm panel. I had to pick a name. I have started with a name of SolarSystem, as it is a lot smaller than a whole Galaxy. That said, it is likely to be a lot more powerful to be honest. This also meant a lot of documentation and more sleeping on things and a lot of changing my mind as I started to code things.

The nuts and bolts of an alarm panel are damn simple - you configure shit, which is not that hard, and you have a lot of inputs which set an "alarm" state if triggered when the alarm is "set". You make bells ring, and you send text messages and so on. OK, that is simplified a bit, but not rocket science.

This is where we get to another part of coding, and especially for embedded systems, you need a shit load of error checking, and you need ways to report errors without simply aborting. For most code in a typical environment, using errx(...) is fine - aborts the programme and tells the user what went wrong. For an alarm panel that means the doors stop working and you are locked in or locked out of your office. So you need to take a lot more creative approaches to error handling.
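As a rough illustration of "report, do not abort", here is a minimal Python sketch. The faults list, report_fault, poll_all and the two example pollers are all hypothetical names; the real code is C and will look nothing like this.

```python
faults = []  # hypothetical fault list the panel could show or send out


def report_fault(where, exc):
    """Record a problem so it can be reported, instead of exiting."""
    faults.append(f"{where}: {exc}")


def poll_all(devices):
    """Poll every device; one failing device must not stop the others."""
    results = {}
    for name, poll in devices.items():
        try:
            results[name] = poll()
        except Exception as exc:  # deliberately broad: the doors must keep working
            report_fault(name, exc)
            results[name] = None
    return results


def good_poll():
    return "ok"


def bad_poll():
    raise OSError("bus timeout")
```

Calling poll_all({"door1": good_poll, "keypad": bad_poll}) still returns a result for door1 and leaves "keypad: bus timeout" in faults, where the errx(...) style would have stopped everything.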


The other big issue, even at this stage, is "user interface". For the most part I am working on the actual operation. However, even that has the keypads, and they represent a user interface. So I have to have a way to use them sensibly - modes and states and partial inputs of data - all good fun.

So, in summary - I have the low level RS485 handling for RIO, MAX, and Keypad all working. I have a design for the alarm panel operation. I have a start on the code for the alarm panel operation.

What next - well, lots of work at this stage still, and a lot of testing. I decided the alarm panel code runs from a config file, which can be re-loaded, and is XML (or JSON). So the actual config is to be done later - perhaps a nice web interface or mysql database, whatever. I said I get bored at the high levels, but I have people that will code that all if I want. For now I am using vim on an XML file :-)

I also need to start using this in practice, which means setting up better bench tests than I have now. Ideally a board with some doors on it (small ones) with lock release and buttons and reed switches and max readers, and a keypad, and something to simulate PIRs and a bell and so on. Make a dummy system and try out the logic, and also to demo the way it works and use in videos!

One small step to that which I may do is fit door entry and alarm to this :-


It is nearly finished (thanks to all my family who have worked hard on this), but has a door, and so could be set up with a full alarm system - why not. Indeed, I suspect my grandsons will love it if the door needs a key fob and will take great delight showing their friends and locking them out even.

Changing my house is also a key step. I have a Galaxy panel now, but will change to a new system. I need a key switch so I can cut power to the fail safe door on my office here at home for when early versions do manage to crash in some way. Even I am not brave enough to lock myself in, yet.

Then the next big step will be changing the office over. I expect that is weeks away, but when we do it, it will allow a lot of new features.

One of the key things we can do a lot better is door management, which I have already coded. Actually being able to use "lock engaged" inputs, and allowing secondary deadlock on a door, and handling door forced events way better as well as door propped.

It is probably worth explaining slightly. When the certified alarm installers did the office, they did not connect the door open sensors to the Max readers. Why? Because it causes door forced events. Grab a door as it is closing, or bounce a door shut then open (depends on type of lock), or even just pull hard enough on a mag locked door to trip the reed switch, and you have a door forced alarm. So they, as a matter of policy, don't enable the sensor, as that means way fewer support calls. Bear in mind, for a lot of systems, this sort of issue may mean calling the installer to have them reset it.

We were robbed, and there was a complex sequence of errors on our part, but one of them is that we would have seen door forced and come running if those sensors were fitted.

Now they are - but we have a false alarm on this once a day at least and it is annoying. We have the system so we can reset the alarm, but still it is annoying people.

So simple things can be changed. One is a short timer on the input for the "tugged the door hard", though, to be honest, on the outer doors I would rather know someone is doing that. Actually, D'Oh, I need that only during the day when alarm not set. When alarm set I want that to trip instantly. /me makes note. This means we eliminate a lot of the errors.

The other is a door closing timer and integration with lock engaged inputs where available. This would allow door open just after closed without being a door forced.
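The closing timer idea can be sketched in a few lines. This is only an illustration in Python, not the actual panel code: the function name, the arguments, and the 3 second grace period are all made up for the example.

```python
def door_forced(opened_at, unlocked, last_closed_at=None, closing_grace=3.0):
    """Decide whether a door-open event should count as "door forced".

    opened_at / last_closed_at are timestamps in seconds (hypothetical inputs).
    closing_grace is an assumed window after the door closes during which
    re-opening is treated as a bounce or a grab, not a forced door.
    """
    if unlocked:
        # The lock was released on purpose (fob or exit button).
        return False
    if last_closed_at is not None and opened_at - last_closed_at < closing_grace:
        # The door only just closed, so the lock may not have engaged yet.
        return False
    return True
```

Where a lock has a "lock engaged" input, the grace period could end as soon as that input goes active rather than after a fixed time.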

Just those simple little things would stop the staff cursing the Galaxy system every day!

Another issue is door propped events. The Galaxy does this - a door left open too long is considered propped. A useful alert.... Except sometimes you want to prop a door. Hence we have it disabled. My design allows a propped alert to be cancelled (and to record who did it) by someone allowed to cancel propped alerts using a key fob on the Max for that door. That way we can prop doors if we want, and record who did it, but a door that has not quite closed for other unconfirmed reasons raises an alert.

By creating a door object as a concept and not restricting ourselves to what the Max does, we can allow all sorts of new things. One of which is air-lock door sets. I.e. two doors where one must never be open if the other is open. This is a requirement in many places, and indeed, one of the doors does not even need a Max reader at all, just buttons and sensors.

Another is cases of max reader both sides, which doubles as a time recording system for staff. My design allows these with ease.

All good fun. I'll post more on this next week I expect. Still considering the best way to open source the project.

Friday, 7 July 2017

PARMRK

One for the techies...

On linux, serial port processing has a lot of options, and I mean a lot. It is scary. In order to do this RS485 stuff I was setting the most raw, basic, just give me the bytes mode I could.

However, in a fit of stupidity I set the PARMRK setting in termios. It seemed like a useful thing to tell me if I was getting a BREAK or other framing error on the bus.

All was well, pretty much, but I had not realised that PARMRK does not just prefix framing and parity errors with a sequence like FF 00 so you can see them, it also escapes a received FF byte as FF FF.

This makes sense, else how can you tell it is marked or not. I should have thought of that!

The problem is that the Honeywell Galaxy serial stuff uses a 1s complement checksum. Only if the checksum should be 00 would an extra FF byte matter making it FF. In all other cases FF wraps adding 1 more, and causing the checksum not to change.
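To see why the extra FF goes unnoticed, here is a small sketch of a plain 8-bit 1s complement sum with end-around carry. It is not the exact Galaxy checksum (any initial value or other detail of the real thing is not covered here), just the arithmetic property described above.

```python
def ones_complement_sum(data):
    """8-bit 1s complement sum: any carry out of bit 7 is added back in."""
    total = 0
    for byte in data:
        total += byte
        total = (total & 0xFF) + (total >> 8)  # end-around carry
    return total


# An extra FF makes no difference once the running sum is non-zero...
assert ones_complement_sum([0x12, 0x34]) == ones_complement_sum([0x12, 0xFF, 0x34])
# ...but it does show when the sum would otherwise be 00.
assert ones_complement_sum([0x00]) == 0x00
assert ones_complement_sum([0x00, 0xFF]) == 0xFF
```

In 1s complement FF is "negative zero", so adding it is a no-op except in the one case where the sum is still 00.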

So I was getting messages, and some would contain FF, and I would see as FF FF, but the checksum would be fine. I was assuming the messages I was debugging were meant to have FF FF. It did strike me as odd in one case, I have to admit. And most of the messages did not contain FF.

Only when I started on the RIO where I had things like tamper reported FF and resistance reported as FF FF (infinity) did it stick out, with the message length being inconsistent. Have eight lots of FF FF escaped by PARMRK in one message and it very quickly looks silly.

I could de-escape the input, but simpler was to not use that feature. Well done using 1s complement checksums!!!
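For anyone who does want to keep PARMRK, de-escaping would look roughly like this. It is a sketch that assumes a complete buffer (a real reader would have to cope with an escape sequence split across two reads), and the function name is made up.

```python
def unescape_parmrk(data):
    """Undo PARMRK marking in a complete buffer of received bytes.

    FF FF   -> a genuine FF data byte
    FF 00 x -> byte x arrived with a parity/framing error (a BREAK shows as FF 00 00)
    Returns (clean_bytes, positions_of_error_bytes_in_clean_bytes).
    """
    out = bytearray()
    error_positions = []
    i = 0
    while i < len(data):
        if data[i] != 0xFF:
            out.append(data[i])
            i += 1
        elif data[i + 1:i + 2] == b"\xff":
            out.append(0xFF)
            i += 2
        elif data[i + 1:i + 2] == b"\x00" and i + 2 < len(data):
            error_positions.append(len(out))
            out.append(data[i + 2])
            i += 3
        else:
            raise ValueError("truncated or invalid PARMRK sequence")
    return bytes(out), error_positions
```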

So far I quite like the RIO

One of the key pieces of equipment for this alarm panel project is a RIO (Remote I/O, I assume).

It is what handles the inputs and outputs for the alarm system itself, and one or two of these are normally included on the panel itself.

The basic set up has 4 outputs and 8 inputs. There is an RF RIO where the inputs are remote radio things - like battery powered PIRs. It seems to work in the same way, thankfully.

The outputs were noddy, I just send a message to tell it the state of the outputs, even if that has some overkill of 3 bytes per output rather than one bit.

The inputs were a tad more fun. The concept is that you have a pair of wires to a remote device, or more than one maybe, and you have switches - e.g. reed switch, and you have resistors. It seems (sorry Borg) resistance is not futile.

The typical config is that 1kΩ means "closed" and 2kΩ means "open". So you use 2 x 1kΩ resistors with one shorted by the reed switch. Simple.

The reason for this is to prevent tampering. Simply short circuiting or open circuiting the wiring would create a "tamper" condition. You can, of course, bodge it by measuring the voltage and fitting a few zener diodes to match then removing the wiring, but apparently criminals are too dumb to work that out.

What is quite nice is that the RIO works out the resistance, to the Ω, and reports that periodically. It also reports the battery voltage to the mV. It is rather nice when something like this goes to the bother of converting data to nice usable units like this.

There are, however, a couple of config items, which are slightly fiddly. One of which actually needs the message sending twice, FFS. Why?!?! Does that pre-date checksums?

Anyway, one setting per input is response time, and again it converts from nice units and can be set in multiples of 10ms up to 2550ms per input. Nice.

Another is the resistance thresholds, in multiples of 100Ω. These define the range for the 1kΩ "closed" and the 2kΩ "open", but also a band for "low res" and "high res" to report issues. This means in total five different resistance values defined, and so six states, two of which are "tamper" (open and closed circuit).
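Five thresholds giving six bands can be sketched as a simple lookup. The threshold values and the order of the bands below are assumptions picked for illustration only, not the RIO's real defaults; the only things taken from the description above are the 100Ω units, 1kΩ "closed", 2kΩ "open", and tamper at both extremes.

```python
import bisect

# Hypothetical thresholds in units of 100 ohms (i.e. 800, 900, 1200, 1300, 12000 ohms).
THRESHOLDS_100R = [8, 9, 12, 13, 120]

# Assumed band order, from lowest resistance to highest.
STATES = [
    "tamper (short circuit)",
    "low res",
    "closed",
    "high res",
    "open",
    "tamper (open circuit)",
]


def classify(ohms, thresholds_100r=THRESHOLDS_100R):
    """Map a measured resistance in ohms to one of the six states."""
    limits = [t * 100 for t in thresholds_100r]
    return STATES[bisect.bisect_right(limits, ohms)]
```

With these made-up numbers, `classify(1000)` gives "closed" and `classify(2000)` gives "open", while a dead short or a cut wire lands in one of the two tamper bands.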

Anyway, all coded, with events passed up to the application, and inputs that can be used for the door entry system if needed.

This basically means I have the key components now for an alarm panel. Next step is the higher level logic for setting zones, and so on. I'll ponder over the weekend. I suspect I'll do a bit of tidying up of existing code before moving on.

Thursday, 6 July 2017

More coding

I have been having fun, and yes, it is fun.

There have been many set backs. Today I spent two hours assuming the damn Max Reader was being daft before finding a loose connector. Yesterday was a faulty RS485 lead.

I have discovered that the Max reader really does want around 10ms gap between messages on the bus else it gets confused and sulks. Shame, as keypad is happy with way less.

My code design has evolved - it now works as a library which creates a per bus thread to do the polling, and a separate "doorman" thread to handle logical "door" objects. By default these are created from new Max readers it finds.

My idea is that a "door" is a lot more than just what a Max reader can do. They are pretty simple, with two inputs (exit button and door open) and one output (door release relay).

I think a "door" needs more, so the design allows (optionally) a Max reader and then a number of inputs and outputs (which may be on that Max reader, or on a RIO), to make the door complete.

So you have the basics, a door exit button, and a door open sensor, as well as a lock control relay. But the logical door can have more. Indeed, those basic inputs do not need to be on the Max - its inputs and outputs can be for anything, just defaulting to their normal use.

Extra I/O for a "door" includes things like a door bell button input and a door bell output. Simple stuff. But also a door "lock engaged" input (which many locks can do), so you know whether the door is actually locked. This allows a "door ajar" state if the door shows closed, but the lock has not engaged in a sensible time.

I also created the idea of a deadlock, so an output for that and a deadlock engaged input. This allows for doors to have extra locks when alarm set.

The system also allows for the notion that a closing door can take time to close, either for lock to engage or simply a time of a few seconds, and opening during that is NOT a door forced event! The source of so many texts I get from our Galaxy system. I will be really glad to kill the event!

The keypad code is fun - an event for key presses, but a simple memory mapped text area for the display - sending new display text on the next poll if changed. Makes for easy app coding.

At this stage my library has low level code for Max readers and keypads, and will soon have RIOs. It also has logical system management for doors.

Next step is "alarm panel" logic, then deploy on some Raspberry Pis with RS485 USB cables, and sorted!

This is basically a complete alarm system coded in less than a week, I know! My plan is changing my house over this weekend.

The video blog of progress :-

Wednesday, 5 July 2017

Alarm panel code

Now I am over the whole broken RS485 cable, I am making a lot more progress.

I have pondered the design a few times and eventually came up with the idea of a C library to manage the low level polling as threads with real time scheduling priority. The library links in to an application.

The threads have shared data structures that allow the application to set things as needed, and the polling system picks up that something has changed and sends the messages needed. It handles resends and waiting for the reply for the message to confirm it is processed.

This means that the application can see the display for a keypad as no more than a char array to which it can write what it likes, and, as if by magic, that text ends up on the display within a few hundred milliseconds, if that. It makes for really easy application coding.
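The "display is just a char array" idea can be sketched like this. It is single threaded Python for illustration, not the threaded C library described here, and the class name, method names and buffer size are all invented.

```python
class DisplayBuffer:
    """Text area the application writes to; the poller sends it only when it changed."""

    def __init__(self, size=32):  # size is an assumption, not the real keypad's
        self.text = bytearray(b" " * size)
        self._confirmed = None  # last text the keypad acknowledged

    def write(self, s):
        """Application side: overwrite the display text, padded or cut to size."""
        raw = s.encode("ascii", "replace")[:len(self.text)]
        self.text[:] = raw.ljust(len(self.text))

    def pending(self):
        """Poller side: text that still needs sending, or None if up to date."""
        snapshot = bytes(self.text)
        return None if snapshot == self._confirmed else snapshot

    def confirm(self, snapshot):
        """Poller side: call once the keypad has replied to the display message."""
        self._confirmed = snapshot
```

Because `pending()` keeps returning the text until `confirm()` is called, a lost message simply gets sent again on the next poll.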

I do have the logic of "events" from the low level system as a queue to the application - being things like devices discovered or missing, tamper alerts, changes of state of inputs, keypad press events, etc... But for the application, all of the low level timing and sequencing is hidden. It can see a key fob event and set an "open the door" bit, and it just happens.

I have coded the keypad logic so far and should be able to connect a max reader and a RIO soon to code those. Then I can make a proper "alarm panel" application with whatever integration of external systems I wish. I am thinking of a mysql backend for basic config and a web based system to edit and manage that - as that is pretty simple stuff.

Next step - do I open source my library? Maybe when it is finished.

Tuesday, 4 July 2017

Techie: How would I do it?

One of the things with the hacking I am doing on this alarm panel is the fact that the messaging is more sort of state based, which means you keep sending the same message in case it did not arrive. No sequence numbers or acknowledgments or such.

To be fair, it is not that bad - you poll (and send some message) and expect a response - so not getting one is a clue. But it is not ideal.

Now, the max readers. They have a number of outputs, being 7 LEDs, a beeping noisy thing, and the door release relay. They have a number of inputs such as door exit switch and door open sensor and tamper sensor.

If I was making a simple system to work on a state basis I would report a single status that covered all of those inputs and outputs. I would have a simple message that controlled all of those outputs.

Technically there is one extra input which could be added which is 4 bytes BCD code of nearby key fob code - present or not...

Now, that means you could keep sending the command to set the outputs until you see the status showing the outputs are what you expect.

It means that if something else sent a command to impact the output, like me hacking, or the device reset, or any circumstance where the outputs are not what is expected, you simply send the message to set the outputs as wanted.

This is a simple system.

It means tamper is just an input not a different damn message!
It means you cannot get stuck bleeping forever.
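The scheme described above boils down to a small loop: compare the reported status with what you want, and resend until they match. A sketch, with a pretend reader that loses the first couple of messages. Every name here is hypothetical, and this is the "how I would do it" design, not what the Galaxy kit actually does.

```python
def reconcile(desired, read_status, send_outputs, max_polls=10):
    """Keep sending the desired output state until the reported status matches.

    Returns the number of sends it took, or None if it never matched.
    """
    for sends in range(max_polls):
        if read_status() == desired:
            return sends
        send_outputs(desired)
    return None


class FlakyReader:
    """Pretend device that ignores the first few commands it is sent."""

    def __init__(self, drops=2):
        self.outputs = 0
        self.drops = drops

    def set_outputs(self, value):
        if self.drops > 0:
            self.drops -= 1  # message lost on the bus
        else:
            self.outputs = value


reader = FlakyReader(drops=2)
# LEDs, beeper and relay as one bit field (the layout is made up).
sends = reconcile(0b101, lambda: reader.outputs, reader.set_outputs)
```

The same loop also recovers if the device resets or something else changes the outputs, since the next status simply stops matching.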

It is not what Honeywell / Galaxy do. Oh well. "legacy"

Hacking

Obviously, in most cases, proper "hacking", as in breaking in to someone's system without permission, is illegal.

However, it is quite fun to be able to do some "hacking" in the sense of working out how something works, and breaking in to it and doing things it was not designed to do. It is also fun to document how it works, even when it is using a protocol from the last millennium. In fact, in some ways, that is what makes it extra fun as it is nostalgic too...

So, today, I am playing with the Honeywell Galaxy alarm system. You see them everywhere - a very popular system. We use them, and slightly hate them to be honest. Even with full installer access they are impossible to make do some things sensibly, even though they are actually very flexible. They have some really annoying quirks - like you can disarm the alarm using a key fob (good), but that does not also open the door, you have to use the key fob again to open the door. You can also set the alarm using the key fob (holding it to the reader), but that also unlocks the door whilst doing it, which may then not catch (depending on type of lock). Little things like that just make it that extra bit annoying.

However, the actual bits that go with a Galaxy system are not bad - there is the keypad (as per picture) with display, the Max readers which work a door entry system, and RIOs which provide a range of inputs / outputs for sensors like reed switches and PIRs. They all sit on an alarm bus which is typically untwisted screened wire carrying 0V/12V power and A/B of an RS485 bus. Different panels have different numbers of busses and allow different numbers of devices.

My hacking today is understanding the protocol on that bus, and it has taken me most of the day. The main delay was many hours trying to work out how to make the RS485 USB lead switch to transmit and drive the bus. My mistake was assuming that I needed a separate control like using RTS/CTS or some such (as some devices do). What I did not realise is the UART itself had a TXDEN output which does the driving and in fact I had a faulty cable - arrrrg. Change to a different cable and you literally just write bytes to the port (from linux) and it enables the transmission side and sends them on the bus cleanly. Obviously that was the first thing I tried, but, faulty cable. It could not be simpler.

The nostalgia really has started to kick in though - I remember busses like this from, well, the 90s maybe, perhaps even earlier. We are talking standard serial protocol, 8n2 at 9600. The messages have no length indicator, just using timing to find end of message. There is a simple 1's complement sum on the message for checking. No sequence numbers or acknowledgement or retransmission - it seems to use a state based logic, so sending the same message repeatedly until something needs to change. A really noddy and really old fashioned protocol. I suspect the "legacy" is strong with this one.
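For reference, setting a linux serial port to raw 8n2 at 9600 looks roughly like this using Python's termios module. It is a sketch: the helper name is made up, and the device path in the comment is just an example.

```python
import termios


def raw_8n2_9600(attrs):
    """Return a copy of tcgetattr()-style attributes set for raw 9600 8n2."""
    iflag, oflag, cflag, lflag, _ispeed, _ospeed, cc = attrs
    cc = list(cc)
    # No input translation, no flow control, and no PARMRK byte stuffing.
    iflag &= ~(termios.IGNBRK | termios.BRKINT | termios.PARMRK | termios.ISTRIP
               | termios.INLCR | termios.IGNCR | termios.ICRNL | termios.IXON)
    oflag &= ~termios.OPOST  # no output processing
    lflag &= ~(termios.ECHO | termios.ECHONL | termios.ICANON
               | termios.ISIG | termios.IEXTEN)
    cflag &= ~(termios.CSIZE | termios.PARENB)  # clear size bits, no parity
    cflag |= termios.CS8 | termios.CSTOPB | termios.CREAD | termios.CLOCAL  # 8 data, 2 stop
    cc[termios.VMIN] = 1   # read() returns as soon as one byte is available
    cc[termios.VTIME] = 0  # no inter-byte timeout at the tty layer
    return [iflag, oflag, cflag, lflag, termios.B9600, termios.B9600, cc]


# Usage, assuming the RS485 lead shows up as /dev/ttyUSB0:
#   fd = os.open("/dev/ttyUSB0", os.O_RDWR | os.O_NOCTTY)
#   termios.tcsetattr(fd, termios.TCSANOW, raw_8n2_9600(termios.tcgetattr(fd)))
```

With no length byte in the messages, the reader still has to time the gaps between bytes itself to find the end of each message.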

There are a few lovely quirks, like the keypad codes "0" to "9" as 0x40 to 0x49, but codes key "A" as 0x4B and key "B" as 0x4A, LOL.
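As a lookup table, that quirk looks like this. Only the codes mentioned above are included; any other keys on the keypad are left out rather than guessed.

```python
# Key codes as described: digits are 0x40 + digit, but "A" and "B" are swapped.
KEY_CODES = {0x40 + digit: str(digit) for digit in range(10)}
KEY_CODES[0x4B] = "A"
KEY_CODES[0x4A] = "B"


def decode_key(code):
    """Return the key label for a keypad code, or None if it is not in the table."""
    return KEY_CODES.get(code)
```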

Anyway, I have not only been able to work out the protocol but also send messages, even ones I should not be able to - like sneaking a status from a Max reader saying that someone has pressed the door exit button during the 10ms turn around time before the reader actually responds! Hence making a door open. Also sneaking in some messages to the keypad/display for fun, or changing the LEDs on the max reader. Basically, if you get access to a bus on one of these systems you can do all sorts.

OK, so why?

Well, I am thinking of making some open source linux code to work with the Honeywell / Galaxy kit. Allow an open source alarm panel to be made, even. I suspect the protocol documentation and low level tools are what we would release, and maybe we make a high level panel and sell that as a solution, or maybe we make that open source, or like asterisk - a community designed alarm panel. Linux based would allow use of small industrial computer boards that could fit in a box with a PSU, battery and RIO. USB to RS485 allows plenty of busses with no problems. If you wanted to do something on the cheap you could even use a Raspberry Pi, but they tend to be less reliable than you would want for an alarm panel.

Of course this could allow a panel with modern config, using a web based interface.
It could allow integration with IT systems, like allowing SNMP of status feeding in to Nagios.
It could allow alarm state reporting by SMS, and email, even tweet DMs.

I suspect that getting insurers to be happy is a trick. We decided long ago that, at the office, contents insurance was more expensive than making the place secure. This applied even though we did have a burglary with several expensive machines stolen. So we do not have an insurance company breathing down our neck if we want a more custom alarm system. At home, the insurance company said the premium is no different, so they did not care if this was a custom alarm system or not. I suspect, even without an insurance endorsed alarm system, there are plenty of opportunities for a system like this, and if it does get at all good and popular they may well endorse such a system. After all, the components are all proper off-the-shelf parts, standard RIO and PIRs and so on.

Of course, there are many other non alarm systems, such as purely door entry systems or time recording systems, which could use these same components.

What can I say? Watch this space and we'll see what we can do.

First step is probably publishing the protocol.

It is fun.