How to unescape HTML 5 entities in Java (&apos;)



The answers to this question mostly suggest using StringEscapeUtils from Apache Commons Text. But that library (the latest version of commons-text is 1.9) only supports HTML 4 entities, and Mastodon appears to use HTML 5, which includes &apos;. How can I decode HTML 5 entities, including &apos;?

Answer

unbescape does the job well:

import org.unbescape.html.HtmlEscape;

final String unescapedText = HtmlEscape.unescapeHtml("&apos;");
System.out.println(unescapedText);

Result:

'

Maven:

<!-- https://mvnrepository.com/artifact/org.unbescape/unbescape -->
<dependency>
    <groupId>org.unbescape</groupId>
    <artifactId>unbescape</artifactId>
    <version>1.1.6.RELEASE</version>
</dependency>
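
If adding a dependency is not an option, the general idea behind unescaping can be sketched with the standard library alone. This is only a rough illustration, not how unbescape is implemented: the class name MiniHtmlUnescaper, the method unescape and the five-entry lookup table are all made up for this example, whereas HTML 5 defines over two thousand named references. It assumes Java 9 or later.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of unbescape or commons-text.
final class MiniHtmlUnescaper {

    // Deliberately tiny table (an assumption of this sketch); real HTML 5 has far more names.
    private static final Map<String, String> NAMED = Map.of(
            "amp", "&",
            "lt", "<",
            "gt", ">",
            "quot", "\"",
            "apos", "'");

    // Matches &name;, &#39; and &#x27; style references that end in a semicolon.
    private static final Pattern REFERENCE =
            Pattern.compile("&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);");

    static String unescape(String input) {
        Matcher matcher = REFERENCE.matcher(input);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String decoded = decode(matcher.group(1));
            // References this sketch does not know are copied through unchanged.
            String replacement = decoded != null ? decoded : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    // Returns the decoded text, or null if the reference is unknown or malformed.
    private static String decode(String body) {
        if (body.charAt(0) != '#') {
            return NAMED.get(body);
        }
        boolean hex = body.charAt(1) == 'x' || body.charAt(1) == 'X';
        try {
            int codePoint = hex
                    ? Integer.parseInt(body.substring(2), 16)
                    : Integer.parseInt(body.substring(1), 10);
            return Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : null;
        } catch (NumberFormatException e) {
            return null; // too many digits to fit a code point
        }
    }

    public static void main(String[] args) {
        System.out.println(unescape("It&apos;s &lt;b&gt; &amp; &#x27;ok&#x27;"));
    }
}
```

The input is scanned once, so "&amp;apos;" becomes the literal text "&apos;" rather than being decoded twice. The sketch ignores the HTML 5 rules for references without a trailing semicolon and for special code points, which is one reason a full library such as unbescape is the safer choice.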


Source: stackoverflow