New top level domains

ICANN have signed agreements for four new top level domains, all internationalized domains from the 2o12 applications for new TLDs.
They are شبكة (“network” or maybe “web” in arabic), 游戏 (“game” in chinese), онлайн and сайт (“online” and “website” in russian).
It’ll take a while for the registries to ramp up their infrastructure, but you might start seeing domains registered in these TLDs as soon as Q1 of next year. Email in internationalized domains is still not really viable, but web pages with fully internationalized URLs certainly are, and they’re likely to get much more popular with these new TLDs.
Can your message composition and reputation monitoring infrastructure handle non-ascii URLs correctly? If not, you’ve got six months or so to start getting that in place.

Related Posts

Which is better UTF-8 or ISO-?

Someone asked today on a mailing list whether they should be using UTF-8 or “ISO” encoding for sending email. What’s the best choice depends on some of the details of the situation, but here’s the answer I gave:
UTF-8 will work for pretty much anything, as it’s just an 8 bit encoding scheme for Unicode (which is supposed to be the one character encoding to rule them all). It’s well supported in most languages and development environments – Windows has been native UTF-16 under the covers since the mid 90s, for instance – and typical messages that use mainstream glyphs should render well from utf-8 in most western MUAs and browsers.
There are still a very few old or broken clients out there that will not handle UTF-8 well but (outside the asian language market, where there’s still some non-ASCII, non-Unicode legacy usage) they’re typically ones that don’t really handle any character set encoding well and the only thing safe to send to them is either plain ASCII or whichever ASCII superset their OS happens to support natively (which is probably an argument for sending Windows-1252 codepage, but not a terribly strong one).
The various extended ASCIIs (such as ISO-8859-*) will only work for messages that are written solely using characters from that character set. If you have even one character in a message that cannot be expressed in ISO-8859-1, then you can’t use ISO-8859-1 to send that message.
ISO-8859-1 (aka Latin1) is fairly sloppy in some respects – it has no apostrophe, nor single quotes, for instance – but it can handle an awful lot of languages, from Kurdish to Swahili. It can’t handle Dutch, Estonian, Finnish, Hungarian and Welsh particularly well, nor can it show the Euro symbol (ISO-8859-14 or -15 are needed for some characters there).
A common problem is that many people (and the software they write) think that Windows uses Latin1. It doesn’t, it uses Windows-1252. If you accept messages written on Windows, using the Windows-1252 code page, and throw them out on the wire as ISO-8859-1 what you end up with is not quite right. It mostly works, as the two codepages overlap quite a bit, but they have different glyphs in the 0x80-0x9f range. So if you use single or double quotes (“smart quotes”), or the Euro symbol, or ellipses, or bullet, or the trademark symbol in your message they’ll be garbled. This is so common that some mail clients and web browsers will actually treat a document that claims to be ISO-8859-1 as Windows-1252, but that’s a bug workaround and not something it’s really safe to rely on.
If you’re doing personalized messages, and you’re sending one of them to Győző and one of them to Eiður then you may have to use different character sets for the two messages. If you’re talking about Győző and personalizing it for Eiður then you might find things break horribly.
Someone probably has some concrete data on mail client character set support, broken down by region and language, but my understanding is that this is a reasonable approach:

Read More

XXX coming to a mailbox near you… eventually

ICANN voted to advance the .xxx TLD proposal today. As one might expect, this is a sponsored TLD intended for use porn and adult sites. This doesn’t mean that .xxx domains will be available immediately, there are still multiple steps in the process.
The next step is for ICANN to investigate the business model of ICM Registry, the sponsor of this TLD. If that passes, then .xxx domains will be coming to web browsers and mailboxes near you.
Update: For those of you who are shocked there is porn on the internet I only have to say (NSFW video + audio after the cut)

Read More