Thursday, 22 June 2017

Badly written RFCs

There are many badly written RFCs, but I encountered a particularly annoying one today.

Update: Thanks for the comments, and I concede that the next page refers to "value" being quoted. Even so, the whole idea of using BNF syntax is to be unambiguous, and this is therefore still a rather badly written RFC!

So, for those less techie, "RFC" stands for "Request for Comments", which was a sort of passive way of pushing proposals on other people when the Internet first started. The "standards" we now follow are all RFCs. There is a slightly more formal process that can promote an RFC to an actual standard.

The idea is that the RFC says how something works, especially when it is a protocol. People then try to make their systems work to the RFC.

Now, there is always a degree of ambiguity, and so there is a really good principle which has worked well for the Internet which is that you should be tolerant of what you receive and strict in what you send. Basically, if the standard says to do something you should aim to be as accurate and correct as possible in what you send. However, when someone sends something to you, and there is some flexibility in what you accept, you should try to be flexible and work out what it means.

This has allowed the slightly flawed and imperfect implementation of many standards.

Today's issue was an RFC over MIME email, specifically RFC 2387 which apparently has status of "proposed standard", and is 19 years old.

We were sending a MIME object in an email that is multipart/related with a "type" field, specifically type=text/html which says that the "root" of that object is of a type text/html. All well and good. Indeed, a type= attribute is mandatory in RFC2387.

The problem is that the email did not work properly when emailing yahoo addresses. But other email systems did work. We experimented and found the "fix" was to send type="text/html" instead.

Now, I am not happy about this! The RFC has this to say on the type attribute :-

This defines that the type attribute is specified (after a ;) as the text type, then the character =, then the type, then the character /, and then the subtype.

So, unless we are saying that the type is "text (with a leading quote mark) and the subtype is html" (with a trailing quote mark), sending type="text/html" is simply wrong.

The problem is the examples in the RFC (and errata), such as :-

(the errata added the missing ; on the end of the first and third line)
Where the hell did those quote marks come from?

So, yahoo are not being tolerant in what they accept, they are actually expecting non standard data, and we were being strict in what we sent, but to make it work we are now being non standard.
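Being tolerant in what you accept is not hard here. As a minimal sketch (not how yahoo or anyone else actually does it), Python's standard email package reads the type parameter the same way whether or not it is quoted. The function name root_type and the two sample header values are made up for illustration.

```python
from email.message import Message


def root_type(content_type_value):
    """Return the 'type' parameter of a Content-Type header value.

    Hypothetical helper for illustration: it accepts both the quoted
    and the unquoted form and returns the bare type/subtype string.
    """
    msg = Message()
    msg["Content-Type"] = content_type_value
    # get_param() strips surrounding quotes from the value by default.
    return msg.get_param("type")


# What we originally sent, following the BNF in RFC 2387.
unquoted = "multipart/related; type=text/html; boundary=b1"
# What the RFC's own examples show, and what yahoo wanted.
quoted = 'multipart/related; type="text/html"; boundary=b1'

print(root_type(unquoted))
print(root_type(quoted))
```

A receiver written this way does not care which of the two forms the sender picked.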

There really is not much worse than an RFC where its own examples do not comply with the RFC!

I have submitted an erratum to the RFC editors for this.


  1. This section spans a page. You missed the bit on the next page, which reads, in part, edited for line length:

    value := token / quoted-string
    ; value cannot begin with "<"

    Note that the parameter values will usually require quoting.

    I note that only cids, i.e. message-ids, should be unquoted, because you have to escape any value that contains a slash, and most content-types contain a slash.

    1. Thanks, interesting.
      Even so, it means yahoo requiring the quoting is wrong to do so.

    2. Also it talks of a parameter value, and there is "value" in that syntax. I do not see what tells us type "/" subtype is in fact a parameter value. Maybe I am missing that too.

    3. Even so, thanks for pointing out my omission.

  2. I don't see how the definition of value is relevant to the type or subtype - it applies to the value in the start-info.

    As type and subtype seem to be undefined (unless they're defined in a reference somewhere), it does appear to allow either type="text"/"html" or type=text/html, but not type="text/html".

  3. P.S. Although I suppose you could argue that the un-specified definition of type and subtype are:

    type := '"' realtype
    subtype := realsubtype '"'

    1. I am glad it is not just me that found this an iffy RFC to be honest.

  4. The [MIME] bit on the end of the line that was missed at first is a reference to RFC2045 - that's where you're supposed to go for an explanation of the 'value' in this case.

    1. But the "value" isn't involved.

    2. Although RFC2045 does define type and subtype as well. They're not quoted, however:

  5. Some spec authors are just far too lax. I reported many errors, ambiguities and inconsistencies in the specs for CSS to the W3C many years ago, concerning the published grammar for CSS and distribution of whitespace shown in it.

  6. Did you put in an evil kludge just for yahoo selectively, or were you forced to do it wrongly for everyone?

    1. I decided that, given the examples quoted the type/subtype and it worked with the mail clients we tried, I would quote them...

  7. The grammar also needs to be rewritten to have an additional rule, something like
    type := mimetype / (doublequote mimetype doublequote)
    mimetype := maintype "/" subtype

    and then get rid of the / subtype in the earlier rule.

  8. The problem here is that RFC 2387 conflicts with the general syntax of the Content-Type field in RFC 2045. RFC 2045 says:

    parameter := attribute "=" value
    value := token / quoted-string

    and states that a token cannot include '/'. So an unquoted type parameter containing a '/' is forbidden by RFC 2045. RFC 2045 also says (by example) that a token and quoted-string with the same value are equivalent, which conflicts with RFC2387's requirement that some values be unquoted.

    I think the upshot of this is that (1) by RFC 2045, everyone should treat a quoted type parameter here the same as an unquoted one and (2) also by RFC 2045, Yahoo! are quite entitled to reject an unquoted parameter containing a '/'. So your approach of always emitting the parameter quoted is the one I'd go for as well.
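
The RFC 2045 rule discussed in the comments above (value := token / quoted-string, where a token may not contain tspecials such as '/') can be sketched as a small sender-side helper. This is only an illustration under my reading of RFC 2045; the names TSPECIALS, is_token and format_param are hypothetical and not from any library.

```python
# tspecials as listed in RFC 2045: these must be inside a quoted-string
# to appear in a parameter value.
TSPECIALS = set('()<>@,;:\\"/[]?=')


def is_token(value):
    """True if value is a valid RFC 2045 token.

    A token is one or more US-ASCII characters, excluding SPACE,
    control characters and tspecials.
    """
    return bool(value) and all(
        33 <= ord(ch) <= 126 and ch not in TSPECIALS for ch in value
    )


def format_param(name, value):
    """Format one Content-Type parameter, quoting only when required."""
    if is_token(value):
        return f"{name}={value}"
    # Escape backslashes and double quotes inside the quoted-string.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{name}="{escaped}"'


print(format_param("type", "text/html"))
print(format_param("boundary", "abc123"))
```

Under this reading, text/html contains a '/' and so always comes out quoted, which matches the form that worked with yahoo.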