
Replace ASCII codes and HTML tags in Java

How can I achieve the expected results below without using StringEscapeUtils?

JavaScript

Current Results:

JavaScript

Expected Results:

JavaScript

Already checked: How to unescape HTML character entities in Java?


PS: This is just an example; the input may vary.


Answer

Your regexp targets HTML tags: <something> would be matched, but the HTML entities will not be. Their pattern is something like &.*?; which you are not replacing.

This should solve your problem:

JavaScript

If you want to experiment with this in a sandbox, try regxr.com and use (<.*?>)|(&.*?;). The brackets make the two capturing groups easy to identify in the tool and are not needed in your code. Note that the does not need to be escaped in that sandbox playground, but it has to be in your code, since it's in a string.
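As a rough illustration of the combined pattern, here is a minimal sketch that removes both tag-like and entity-like runs in one pass. The class name MarkupStripper, the method strip, and the sample input are made up for this example and are not taken from the question. Note that it deletes entities rather than decoding them.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; the names are not from the original post.
class MarkupStripper {
    // Alternation of the two patterns discussed above:
    //   <.*?>  lazily matches a tag such as <p> or </p>
    //   &.*?;  lazily matches an entity-like run such as &nbsp; or &#39;
    private static final Pattern MARKUP = Pattern.compile("<.*?>|&.*?;");

    // Removes every match; entities are dropped, not converted to characters.
    static String strip(String input) {
        return MARKUP.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Sample input invented for this sketch.
        System.out.println(strip("<p>Tom&nbsp;&amp;&nbsp;Jerry&#39;s</p>"));
    }
}
```

One caveat with this approach: &.*?; is deliberately loose, so a stray ampersand followed later by a semicolon on the same line (for example "A & B; C") would also be removed along with the text between them. A stricter entity pattern may be preferable if the input can contain such text.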

User contributions licensed under: CC BY-SA