I have managed to get the pile of poo to correctly display on my iPhone as an incoming SMS text (i.e. using normal GSM SMS not iMessage or some such).
This is actually quite a milestone. There are various gateways to send texts but they all seem to have limitations or ways in which they translate to/from the GSM SMS protocol.
We can usually manage to handle multi-part (i.e. very long) texts, just about. Most of the time we can even handle something called a User Data Header (UDH) which is extra binary data sent with the message.
Getting UDH right is actually crucial for iMessage registrations to work at all. Otherwise your iPhone would not believe you had the number you have (it sends a text and expects a response that has a UDH).
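To give a flavour of what a UDH looks like, the most common one is the header that ties the parts of a multi-part text together. Here is a rough sketch in Python, as I understand the 8-bit-reference concatenation element in 3GPP TS 23.040. The function name `concat_udh` is made up for illustration and is not part of any gateway's interface, and the UDH used in iMessage registration texts is a different thing that is not shown here.

```python
def concat_udh(ref: int, total: int, seq: int) -> bytes:
    """Build a 6-byte UDH for one part of a multi-part (concatenated) text.

    ref   - reference number shared by every part of the same message (0-255)
    total - how many parts the message has (1-255)
    seq   - which part this is, counting from 1
    """
    if not 0 <= ref <= 255:
        raise ValueError("ref must fit in one byte")
    if not 1 <= seq <= total <= 255:
        raise ValueError("need 1 <= seq <= total <= 255")
    return bytes([
        0x05,   # UDHL: five bytes of header follow
        0x00,   # IEI: concatenated message, 8-bit reference
        0x03,   # IEDL: three bytes of data for this element
        ref,
        total,
        seq,
    ])


if __name__ == "__main__":
    # First part of a three-part message with reference 0x2A
    print(concat_udh(0x2A, 3, 1).hex())
```

Those 6 bytes come out of the 140 bytes available in a text, which is why each part of a long message carries 153 GSM 7 bit characters rather than 160, or 67 16 bit codes rather than 70.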
Getting those key things to work is hard enough, but character set coding is a nightmare. This is because texts can be sent in one of three character sets.
- GSM 7 bit character set. This has 128 characters, which include the normal letters (A-Z, a-z), numbers, punctuation, and a load of accented characters as well as upper case Greek. A text can have 160 characters using this coding. There are then extra characters using ESC (escape) as a prefix to get things like a Euro symbol (which takes two characters). Even just getting the @ character to work can be a challenge, as it is coded as character 00 and not in its usual place, which breaks some things.
- UCS 8 bit characters - the first 256 unicode characters. You can have 140 of these in a text.
- UCS 16 bit characters - the first 65536 unicode characters. You can have 70 of these in a text.
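The three options above can be sketched as a small Python function that picks the most compact coding for a string and says how much of a single text it uses. This is only an illustration: `pick_coding` and the returned labels are names I have made up, the 8 bit case follows the description above (first 256 unicode characters), and the tables are the default GSM alphabet and its basic extension table as I know them from GSM 03.38, without any national language variants.

```python
# Default GSM 7 bit alphabet, in code order, so the index is the septet value.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Characters sent as ESC plus one more septet (two septets each).
GSM7_EXTENDED = "^{}\\[~]|€\f"

# ESC is only a prefix, so it is not treated as a usable character.
_BASIC_SET = set(GSM7_BASIC) - {"\x1b"}
_EXT_SET = set(GSM7_EXTENDED)


def pick_coding(text: str):
    """Return (coding, units_used, units_per_single_text) for text."""
    if all(c in _BASIC_SET or c in _EXT_SET for c in text):
        septets = sum(1 if c in _BASIC_SET else 2 for c in text)
        return ("gsm7", septets, 160)
    if all(ord(c) < 256 for c in text):
        return ("8bit", len(text), 140)
    # Anything else goes as 16 bit codes; count UTF-16 code units so that
    # characters above U+FFFF are counted as two.
    return ("ucs16", len(text.encode("utf-16-be")) // 2, 70)


if __name__ == "__main__":
    for sample in ("Hello @ €5", "ÿ", "\U0001F4A9"):
        print(repr(sample), pick_coding(sample))
```

Note that `GSM7_BASIC.index("@")` is 0, which is exactly the awkward @ coding mentioned above.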
The big issue is that most text gateways are ASCII or some such, and do not map to/from these character sets. Even when XML is used that handles UTF-8, the systems rarely give enough attention to detail to translate characters correctly. We have taken the view that the only right way to do things is to use UTF-8 coding for our interfaces with customers for texts and for us to do the translations right! For this reason we have been nagging the mobile operator, and they have finally come through for us.
The good news today is that the low level raw interface has been opened up, allowing texts to and from our voice SIMs to use any of these character codings, and UDH.
But even with all of that, the pile of poo is extra special. It is U+1F4A9, which is too big even for UCS16 coding. The trick is to use UTF-16, which codes it as a surrogate pair: two of the UCS16 codes (32 bits in total). To my utter surprise this actually works and iPhones handle it!
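The arithmetic behind that trick is short enough to show. This is a minimal Python sketch of the standard UTF-16 surrogate calculation; the function name `utf16_pair` is just for illustration, and the result is checked against Python's own UTF-16 encoder.

```python
def utf16_pair(codepoint: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into its two 16 bit UTF-16 codes."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a pair")
    offset = codepoint - 0x10000          # leaves a 20 bit value
    high = 0xD800 + (offset >> 10)        # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)       # bottom 10 bits
    return high, low


if __name__ == "__main__":
    high, low = utf16_pair(0x1F4A9)
    print(f"{high:04X} {low:04X}")        # D83D DCA9
    # Same answer from the built-in encoder
    print("\U0001F4A9".encode("utf-16-be").hex())
```

So the pile of poo goes out as the two 16 bit codes D83D and DCA9, using up two of the 70 available in a text.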
We are gradually integrating various aspects of our new texting system now. The clean interface to and from our mobile SIMs is a really good start. If we can get other mobiles and even land line numbers all integrated more seamlessly, that will be even better.