Learning to ACME
I thought it might be fun to explain, in more technical detail, what I have been doing over the last week. I have been coding ACME (the protocol for getting a certificate issued for a domain) for FireBricks. It is aimed at making HTTPS setup really easy. Right now you have to install keys and a certificate manually, but ACME will make it simple and seamless.
The obvious first step is that the protocol talks to the ACME server using JSON. We send JSON objects and receive JSON objects back. It is all done over HTTPS to the ACME server.
Not the first time I have handled JSON, but I needed a JSON library for the FireBrick, which only took a few hours.
However, the JSON that we send to the ACME server is not just a simple JSON object, oh no. It uses JSON Web Signature (JWS). This means you make some JSON, you then BASE64 encode the JSON, and make that a "payload" field in a new JSON object. You also make some more JSON which has various fields defining the public key you are using. These fields (e.g. "e" and "n" for RSA) are BASE64 encoded. That chunk of JSON is then BASE64 encoded, and included as a "protected" field in the new JSON object. Then a signature is made, BASE64 encoded and added as a "signature" field. So what you post is a JSON object with three BASE64 encoded fields, two of which are BASE64 of other JSON objects. Yes, I know, complicated, but I got all that working. Thankfully the reply is just a JSON object as normal.
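To make the nesting concrete, here is a minimal Python sketch of that wrapping (not the FireBrick code). The names make_jws, b64url and sign are made up for illustration, and the signer is passed in as a function because real RSA signing needs a crypto library. The dummy signer in the usage lines is only a placeholder and produces nothing an ACME server would accept.

```python
import base64
import hashlib
import json
from typing import Callable


def b64url(data: bytes) -> str:
    """URL-safe BASE64 with the '=' padding removed."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_jws(protected: dict, payload: dict, sign: Callable[[bytes], bytes]) -> str:
    """Wrap a payload in the three-field JSON object described above."""
    protected_b64 = b64url(json.dumps(protected).encode("utf-8"))
    payload_b64 = b64url(json.dumps(payload).encode("utf-8"))
    # In JWS the signature covers the two encoded strings joined by a dot.
    signing_input = f"{protected_b64}.{payload_b64}".encode("ascii")
    return json.dumps({
        "protected": protected_b64,
        "payload": payload_b64,
        "signature": b64url(sign(signing_input)),
    })


# Placeholder signer (NOT a real signature), just to show the shape.
def dummy_sign(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


# Illustrative header values; the key fields and URL are invented.
header = {"alg": "RS256", "jwk": {"kty": "RSA", "e": "AQAB", "n": "wwE"}}
print(make_jws(header, {"termsOfServiceAgreed": True}, dummy_sign))
```

The result has exactly three string fields, two of which decode back to JSON.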
BASE64 but not quite normal BASE64
Another fun detail is that none of the BASE64 used is normal BASE64, which is A-Za-z0-9+/, but a URL safe BASE64 which is A-Za-z0-9-_ instead. So even simple debugging using base64 command line tools on Linux often failed. Also, normal BASE64 pads with = at the end, but this is all unpadded. I never really understood why padding is used anyway, so I am quite on board with that one. Fortunately BASE64 is a doddle.
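For anyone debugging the same thing, a small Python sketch of the difference. The helper names b64url_encode and b64url_decode are my own; the base64 module calls are standard library. The decoder puts the padding back, as the standard decoder expects it.

```python
import base64


def b64url_encode(data: bytes) -> str:
    """URL-safe alphabet (- and _ instead of + and /), no '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the padding before decoding."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


raw = bytes([0xFB, 0xFF])
print(base64.b64encode(raw).decode("ascii"))  # normal BASE64: +/8=
print(b64url_encode(raw))                     # URL safe, unpadded: -_8
```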
Not just JSON
A somewhat frustrating part of the API is that it is not just about sending and receiving JSON objects. If only!
No, some of the key data you need is in the HTTP headers. Some of this is as per the HTTP spec, but they did not have to do it that way - they could simply have sent all of the data you need within the JSON objects, or even duplicated it into the JSON objects if they wanted. So you cannot use a simple HTTPS client, like curl, that just gets you a response body; you also have to get selected header values. In some cases more than one such header.
So, my client library was updated to allow selected header extraction.
There is also a special field, a nonce, a code issued by the server which you have to send in the next message you post. This is one of those header fields, but only if you POST something, not if you just GET, so you only grab this header on some of the interactions not all (arg!). You then use this in the JSON (not as a header!!) when you post the next item. This is all good in that it stops someone capturing an interaction and replaying it for their own use, but it is annoyingly inconsistent, header one way, JSON the other, for example. It is, however, included in what is signed in the JSON to avoid tampering.
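A rough Python sketch of that nonce shuffle, under some assumptions. RFC 8555 names the header Replay-Nonce and puts the value in a "nonce" field of the protected header. The NonceStore class, the protected_header function and the example URL are hypothetical names for illustration.

```python
from typing import Mapping, Optional


class NonceStore:
    """Remembers the most recent nonce handed out by the server."""

    def __init__(self) -> None:
        self._nonce: Optional[str] = None

    def remember(self, response_headers: Mapping[str, str]) -> None:
        # HTTP header names are case-insensitive, so compare in lower case.
        for name, value in response_headers.items():
            if name.lower() == "replay-nonce":
                self._nonce = value

    def take(self) -> str:
        # A nonce is single use: hand it out once, then forget it.
        if self._nonce is None:
            raise RuntimeError("no nonce available, fetch a fresh one first")
        nonce, self._nonce = self._nonce, None
        return nonce


def protected_header(nonce: str, url: str, jwk: dict) -> dict:
    """The nonce arrives as a header but goes back inside the signed JSON."""
    return {"alg": "RS256", "nonce": nonce, "url": url, "jwk": jwk}


store = NonceStore()
store.remember({"Replay-Nonce": "abc123", "Content-Type": "application/json"})
print(protected_header(store.take(), "https://acme.example/new-order", {"kty": "RSA"}))
```

Forgetting the nonce after use means a stale value fails locally, with a clear error, instead of being rejected by the server.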
The JWK Thumbprint is special. It is a hash, BASE64 encoded, of a chunk of JSON which holds the public key (JWK). You send this same JSON as part of the ACME messages (in the "protected" part). The thumbprint is also part of the response the web server has to send when challenged. You have to prove you own the domain by making the web server respond to an HTTP request with a specific value.
What puzzles me is why they do not simply send a nice random string as part of the ACME protocol and expect me to respond with that.
But, no, we have to make this Thumbprint. However, this is where it gets a tad special. First off, the JSON has to be exactly right, with the exact fields you need in exactly the right order and no whitespace. If not, then it does not match, and all you know is that it does not match!
Now, this is not a question of using the same JWK you sent in the ACME messages, no. In the ACME messages the fields can be in any order, for example, and still work. No, for the thumbprint it has to be exactly right. However, the ACME server also accepts the JWK in that exact format, so I can use one function to make it.
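As far as I read RFC 8555, for the HTTP challenge the "specific value" is the challenge token from the server joined to the thumbprint with a dot, served from a fixed path. A tiny Python sketch, with function names of my own choosing and made-up token and thumbprint values:

```python
def key_authorization(token: str, thumbprint: str) -> str:
    """Body the web server returns: token, a dot, then the JWK thumbprint."""
    return f"{token}.{thumbprint}"


def challenge_path(token: str) -> str:
    """Path the ACME server fetches over plain HTTP on port 80."""
    return f"/.well-known/acme-challenge/{token}"


# Invented values, purely for illustration.
token = "exampletoken"
thumbprint = "examplethumbprint"
print(challenge_path(token))
print(key_authorization(token, thumbprint))
```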
But it gets worse. The public key includes the modulus (the "n" value), which is a long string of bytes BASE64 encoded. A small note mentions that any leading zero bytes must be stripped. This is not needed for the ACME messages in JWS to work, but if you don't do it, you get a different JWK Thumbprint and so nothing works. It is not even quite what you would do in ASN.1, where a leading zero byte has to stay if the next byte has the top bit set, else you indicate the field is negative. In this case you simply strip leading zero bytes. That took me hours of testing, comparing to examples, and re-reading the spec.
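Putting those rules together, a minimal Python sketch for an RSA key, assuming SHA-256 as the hash (which is what RFC 7638 thumbprints in ACME use). The function names are hypothetical. The JSON is built as a literal string so the field order ("e", "kty", "n") and lack of whitespace cannot drift.

```python
import base64
import hashlib


def b64url(data: bytes) -> str:
    """URL-safe BASE64 with the '=' padding removed."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def canonical_rsa_jwk(e: int, n_bytes: bytes) -> str:
    """Exact JSON for the thumbprint: fixed field order, no whitespace."""
    e_bytes = e.to_bytes(max(1, (e.bit_length() + 7) // 8), "big")
    n_stripped = n_bytes.lstrip(b"\x00")  # strip ALL leading zero bytes
    return '{"e":"%s","kty":"RSA","n":"%s"}' % (b64url(e_bytes), b64url(n_stripped))


def jwk_thumbprint(e: int, n_bytes: bytes) -> str:
    """SHA-256 over the canonical JSON, then URL-safe unpadded BASE64."""
    digest = hashlib.sha256(canonical_rsa_jwk(e, n_bytes).encode("ascii")).digest()
    return b64url(digest)


# Toy modulus (far too short to be a real key), with and without padding.
padded = b"\x00\x00\xc3\x01"
print(canonical_rsa_jwk(65537, padded))  # {"e":"AQAB","kty":"RSA","n":"wwE"}
print(jwk_thumbprint(65537, padded) == jwk_thumbprint(65537, b"\xc3\x01"))  # True
```

Encoding the padded modulus without stripping gives a different "n" string, hence a different hash, which is the failure described above.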
I am still quite surprised it is not simply some random string provided by the ACME message for the challenge.
Certificate Signing Request and ASN.1
Having got through the challenges and got as far as an authorised order I can send a final request with a CSR and get a certificate. yay!
But I have to make a CSR. So far the FireBrick code has had to decode ASN.1 for certificates and so on, but not generate much ASN.1 (SNMP is somewhat simplified in that area).
So, another couple of hours making an ASN.1 construction library, and then working out what goes in to a CSR. Thankfully tools like openssl will parse what I make at an ASN.1 and CSR level to tell me what I have.
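To give a flavour of what an ASN.1 construction library has to do, here is a small Python sketch of DER tag-length-value encoding. It is not the FireBrick library and the function names are my own. It only handles single-byte tags and non-negative integers. Note the INTEGER case keeps a leading zero byte when the top bit would otherwise be set.

```python
def der_length(n: int) -> bytes:
    """Short form below 128, otherwise 0x80 | byte count, then the length."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der(tag: int, content: bytes) -> bytes:
    """One tag-length-value element."""
    return bytes([tag]) + der_length(len(content)) + content


def der_integer(value: int) -> bytes:
    """INTEGER (tag 0x02), non-negative values only in this sketch."""
    if value < 0:
        raise ValueError("negative integers are not handled here")
    body = value.to_bytes(max(1, (value.bit_length() + 7) // 8), "big")
    if body[0] & 0x80:
        body = b"\x00" + body  # extra zero so the value is not read as negative
    return der(0x02, body)


def der_sequence(*items: bytes) -> bytes:
    """SEQUENCE (tag 0x30) of already encoded elements."""
    return der(0x30, b"".join(items))


print(der_integer(128).hex())                # 02020080
print(der_sequence(der_integer(0)).hex())    # 3003020100
```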
ASN.1 is a bit like riding a bike. Every time you work on it, it all sort of comes back to you...
I am also really impressed with the Let's Encrypt staging server in terms of the error messages it returns. They tell me exactly what I have wrong.
It turns out the CSR only needs the common name, which makes sense as Let's Encrypt only sign that, as that is all they have proved, so there is no need for company and locality and all that.
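As an illustration of how little that leaves in the subject, here is a Python sketch that builds the DER for a subject name holding only a common name. The function names are hypothetical, and the choice of UTF8String for the value is my assumption. This is only the subject part; the public key, attributes and signature that make up a full CSR are not shown.

```python
def der(tag: int, content: bytes) -> bytes:
    """One DER tag-length-value element (single-byte tags only)."""
    n = len(content)
    if n < 0x80:
        length = bytes([n])
    else:
        body = n.to_bytes((n.bit_length() + 7) // 8, "big")
        length = bytes([0x80 | len(body)]) + body
    return bytes([tag]) + length + content


# OID 2.5.4.3 (commonName), already DER encoded.
OID_COMMON_NAME = bytes([0x06, 0x03, 0x55, 0x04, 0x03])


def subject_with_common_name(hostname: str) -> bytes:
    """Name ::= SEQUENCE of SET of SEQUENCE { OID, value }."""
    cn_value = der(0x0C, hostname.encode("utf-8"))     # UTF8String
    attribute = der(0x30, OID_COMMON_NAME + cn_value)  # type and value pair
    rdn = der(0x31, attribute)                         # SET with one attribute
    return der(0x30, rdn)                              # the Name itself


print(subject_with_common_name("example.com").hex())
```

Writing the bytes to a file and running `openssl asn1parse -inform DER -in` on it is a quick way to check the structure.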
I was quite chuffed that the first attempt to make a signed CSR just worked, I got the signing right. That is rare in coding.
Two key pairs
So, I finally have a valid and signed CSR, and send that, and get an error telling me the key used for the "account" (all the messages to/from the ACME server, and for the JWK Thumbprint) must be different to the key for the domain (i.e. in the CSR).
So now I have to faff with a second set of keys and make sure they are used in the right place.
Finally we get the certificate and install as normal. Actually, for Let's Encrypt it is two certificates as they have an intermediate one as well.
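Assuming the server returns the chain as PEM text (the default format in RFC 8555, though the post does not say which format is used here), splitting it into the separate certificates might be sketched in Python like this. The function name and the dummy PEM bodies are made up.

```python
import re
from typing import List

PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_chain(pem_text: str) -> List[str]:
    """Return each certificate block in the order the server sent them."""
    return PEM_CERT.findall(pem_text)


# Dummy bodies, not real certificates, just to show the split.
chain = (
    "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"
    "-----BEGIN CERTIFICATE-----\nBBBB\n-----END CERTIFICATE-----\n"
)
certs = split_pem_chain(chain)
print(len(certs))  # 2: the certificate for the hostname, then the intermediate
```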
Testing on a new box, I added a hostname to the config, and 4 seconds later we had working https using that hostname. That is how simple it should be :-)
I have a lot of tidying to do, and we need to make this a bit more polished before a release of FireBrick with this in place.
One idea is handling more than one hostname. I think this will be less common, and originally we thought we would get one certificate with "alt" names on it. However, that does leak all of the other names for a brick if you access one. So the plan is to get a separate certificate for each, and probably have a status page showing progress, expiry and so on.
To be fair, the host names used with Let's Encrypt are published anyway, which may be an issue for some. But ACME should work with other CAs, though we may have to add extra fields if someone wants to do that.
There are also access control issues over HTTP access during the authentication stage, which needs TCP port 80 to be allowed automatically, even if only for a few seconds, and locked down to just the ACME authentication with no other access via that. Not hard, but it needs doing, with an option to turn it off.
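One way that lock-down could look, as a Python sketch only and not the planned FireBrick behaviour: accept a port 80 request only if it is a GET for a challenge token we are currently waiting on. The function name and the idea of a pending token set are my assumptions; the path prefix is the one RFC 8555 defines for HTTP challenges.

```python
from typing import Set

CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


def allow_port80_request(method: str, path: str, pending_tokens: Set[str]) -> bool:
    """Allow only GET requests for a token with a challenge in progress."""
    if method != "GET" or not path.startswith(CHALLENGE_PREFIX):
        return False
    return path[len(CHALLENGE_PREFIX):] in pending_tokens


pending = {"exampletoken"}
print(allow_port80_request("GET", "/.well-known/acme-challenge/exampletoken", pending))  # True
print(allow_port80_request("GET", "/config", pending))                                   # False
```

With an empty pending set every request is refused, so port 80 is effectively closed again once the challenge completes.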
So, maybe next week we will have alpha releases for people to test.
P.S. Some work over the weekend - much more polished, and much better error reporting. Really close to an alpha for customers to test now.