Tuesday, 3 November 2009

Arrg punycode

Why are they doing IDN this way?

EVERY app that wants to do domains has to understand it.

WTF, why are they not just making UTF-8 hostnames and domain names valid?

I just checked, and the version of BIND we run as our authoritative server just works with UTF-8 hostnames in its config. It just works FFS, and that is the most popular authoritative server and the most popular resolver FFS.

Looks like, for some things, the resolvers allow UTF-8 and for others not. But that has to be a lot easier to fix than every app getting punycode FFS. There are like two or three caching resolvers ISPs use.

The DNS protocol has always been 8-bit clean, and it seems that many commands and apps just pass the hostname through the resolver library to the resolver as is, so they would need no changes at all.
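To make the "8-bit clean" point concrete, here is a rough sketch of how a name goes on the wire: each label is just a length byte followed by that many bytes, and nothing in that framing cares whether the bytes are ASCII. The function name `to_wire` and the example hostname are made up for illustration, and this only shows the encoding, not whether any given resolver will accept it.

```python
def to_wire(hostname: str) -> bytes:
    """Encode a hostname as DNS wire-format labels, using raw UTF-8 bytes."""
    out = bytearray()
    for label in hostname.rstrip(".").split("."):
        raw = label.encode("utf-8")
        # A label is limited to 63 bytes on the wire and may not be empty.
        if not 0 < len(raw) <= 63:
            raise ValueError(f"bad label length: {label!r}")
        out.append(len(raw))  # one length byte
        out += raw            # then the label bytes, whatever they are
    out.append(0)             # zero-length root label ends the name
    return bytes(out)


if __name__ == "__main__":
    print(to_wire("bücher.example"))
```

The "ü" simply becomes two bytes inside a 7-byte label. No escaping, no special casing.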

That would have been way less work than changing every browser, every email client, ping, dig, nslookup, whois, telnet, ssh, and so on to understand punycode.
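For the record, this is roughly the conversion every one of those apps ends up needing before it can call the resolver. A minimal sketch using Python's built-in `idna` codec (which implements the older IDNA 2003 rules); the helper names `to_ascii` and `to_unicode` and the example hostname are mine, picked just for illustration.

```python
def to_ascii(hostname: str) -> str:
    """Convert a Unicode hostname to its punycode (xn--) form."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Convert a punycode (xn--) hostname back to Unicode for display."""
    return hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    print(to_ascii("bücher.example"))
    print(to_unicode("xn--bcher-kva.example"))
```

Two one-liners here, but only because the language ships a codec. Every tool without one has to grow this itself.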

Of course, now that the browsers have changed, we can't fix this at the DNS level: they convert the name to punycode first, so the UTF-8 request no longer gets to DNS!!

Grrr.

I may make our auth servers put the UTF-8 encoding of punycode hosts into the zone file anyway, so things like ping and telnet will just work. That would be cool. I'll experiment.
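A rough sketch of what that experiment might look like: take each punycode name and emit a second record for the UTF-8 form. Everything here is an assumption for illustration: the helper name `zone_lines`, the example host, and the documentation address 192.0.2.1. The non-ASCII bytes are written with the `\DDD` decimal escapes from the standard master file format, to be safe about how the zone file itself gets read; it does not handle names that contain other characters needing escapes.

```python
def zone_lines(punycode_name: str, address: str) -> list[str]:
    """Return A records for a punycode name and for its UTF-8 equivalent."""
    unicode_name = punycode_name.encode("ascii").decode("idna")
    # Keep ASCII bytes as they are; write every other byte as a \DDD escape.
    escaped = "".join(
        chr(b) if b < 0x80 else f"\\{b:03d}"
        for b in unicode_name.encode("utf-8")
    )
    return [
        f"{punycode_name}. IN A {address}",
        f"{escaped}. IN A {address}",
    ]


if __name__ == "__main__":
    for line in zone_lines("xn--bcher-kva.example", "192.0.2.1"):
        print(line)
```

Whether ping and telnet then resolve the UTF-8 name depends on the resolver library passing the bytes through untouched, which is exactly the thing to test.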

3 comments:

  1. Yeah, I never understood that. So some (broken) resolvers aren't 8-bit clean. It's a choice between fixing them or fixing every single app that does DNS, everywhere.

    *Even* if punycode is deemed necessary, why is it being done in the applications and not the resolver libraries? Why do apps need to know about this stuff?

  2. I'd like to see resolvers handle the UTF-8 as well.

    It would be simple for BIND to always treat the UTF-8 and punycode forms of a name as meaning the same thing.

    Then punycode can gradually die a death!
