Perl 5 has a module on CPAN named Text::Unidecode that transliterates Unicode into ASCII. For instance, if you hand it the string “北亰 — it’s the best”, it hands back the string "Bei Jing -- it's the best". A quick search for Java libraries that do the same thing turned up only code that strips non-ASCII characters or replaces accented characters with their unaccented equivalents.
Does anyone know of a Java library that produces similar output to Text::Unidecode?
Answer
A quick Google search turns up http://junidecode.sourceforge.net/, though it looks like it hasn’t been updated for a while.
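For comparison, here is a minimal sketch of the accent-stripping approach the question mentions, using only the standard `java.text.Normalizer`. The class and method names (`AccentStripDemo`, `stripAccents`) are made up for illustration. It is not a substitute for Text::Unidecode: it only removes combining marks, so Han characters such as 北亰 pass through unchanged rather than becoming "Bei Jing".

```java
import java.text.Normalizer;

public class AccentStripDemo {
    // Hypothetical helper: decompose to NFD so accented letters split into
    // base letter + combining mark, then drop the combining marks.
    static String stripAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "café déjà vu" written with escapes to avoid source-encoding issues.
        System.out.println(stripAccents("caf\u00e9 d\u00e9j\u00e0 vu")); // cafe deja vu

        // "北亰" has no decomposition, so it is returned as-is (still non-ASCII).
        String han = "\u5317\u4eb0";
        System.out.println(stripAccents(han).equals(han)); // true
    }
}
```

This is why a table-driven transliterator is needed for output like Text::Unidecode's: normalization alone has no mapping from non-Latin scripts to ASCII.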