The answers to this question mostly suggest using Apache commons-text StringEscapeUtils. But that library (the latest version of commons-text is 1.9) only supports HTML 4, and Mastodon appears to use HTML 5, which includes &apos;. How can I decode HTML 5 entities, including &apos;?
Answer
unbescape does the job well:
import org.unbescape.html.HtmlEscape;
final String unescapedText = HtmlEscape.unescapeHtml("&apos;");
System.out.println(unescapedText);
Result:
'
Maven:
<!-- https://mvnrepository.com/artifact/org.unbescape/unbescape -->
<dependency>
    <groupId>org.unbescape</groupId>
    <artifactId>unbescape</artifactId>
    <version>1.1.6.RELEASE</version>
</dependency>
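
If pulling in another dependency is not an option, the following is a minimal JDK-only sketch of the same idea. It is not how unbescape works internally and it is nowhere near the full HTML 5 entity list. The class name MiniEntityDecoder and its small entity table are hypothetical and only cover a few named entities (including &apos;) plus decimal and hexadecimal numeric references. It assumes Java 9 or later.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of unbescape or commons-text.
class MiniEntityDecoder {

    // Deliberately tiny table: the real HTML 5 list has over 2,000 named entities.
    private static final Map<String, String> NAMED = Map.of(
            "apos", "'",
            "quot", "\"",
            "amp", "&",
            "lt", "<",
            "gt", ">");

    // Matches &name;, &#39; and &#x27; style references.
    private static final Pattern ENTITY =
            Pattern.compile("&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);");

    static String decode(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String replacement = resolve(m.group(1));
            // Unknown or invalid references are copied through unchanged.
            String text = replacement != null ? replacement : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Returns the decoded text, or null if the reference is not recognised.
    private static String resolve(String body) {
        if (body.charAt(0) != '#') {
            return NAMED.get(body);
        }
        boolean hex = body.charAt(1) == 'x' || body.charAt(1) == 'X';
        try {
            int codePoint = hex
                    ? Integer.parseInt(body.substring(2), 16)
                    : Integer.parseInt(body.substring(1), 10);
            return Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : null;
        } catch (NumberFormatException e) {
            return null; // number too large to be a code point
        }
    }

    public static void main(String[] args) {
        // prints: It's <b> ' '
        System.out.println(decode("It&apos;s &lt;b&gt; &#39; &#x27;"));
    }
}
```

The input is decoded in a single pass, so &amp;apos; becomes the literal text &apos; rather than being decoded twice. For anything beyond a quick fallback, a maintained library such as unbescape is the safer choice.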