Perl 5 has a module on CPAN named Text::Unidecode that transliterates Unicode into ASCII. For instance, if you hand it the string "“北亰 — it’s the best”", it hands back the string ""Bei Jing -- it's the best"". A quick search for Java libraries that do the same thing only turned up code that strips non-ASCII characters or turns accented characters into their unaccented equivalents.

Does anyone know of a Java library that produces output similar to Text::Unidecode?
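For context, the accent-stripping approach mentioned above usually looks something like the sketch below. It relies only on the standard `java.text.Normalizer`; the class and method names (`AccentStripDemo`, `stripAccents`) are made up for illustration. It handles accented Latin letters, but leaves characters such as 北亰 untouched, which is why it is not a substitute for Text::Unidecode.

```java
import java.text.Normalizer;

public class AccentStripDemo {
    // Hypothetical helper: decompose to NFD, then delete combining marks
    // (Unicode general category M). This is NOT transliteration.
    static String stripAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "café": the accent is removed.
        System.out.println(stripAccents("caf\u00e9"));      // prints "cafe"
        // "\u5317\u4eb0" is "北亰": it has no decomposition, so it comes back unchanged.
        System.out.println(stripAccents("\u5317\u4eb0"));
    }
}
```

A real equivalent of Text::Unidecode needs per-character lookup tables that map each code point to an ASCII approximation, which normalization alone does not provide.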
Answer
A quick Google search turns up JUnidecode: http://junidecode.sourceforge.net/, although it looks like it hasn't been updated for a while.