
Apache Commons Text StringEscapeUtils vs JSoup for XSS prevention?

I want to clean user input to help prevent XSS attacks, and we don’t necessarily need an HTML whitelist, as our users shouldn’t need to post any HTML / CSS.

Eyeing the alternatives out there, which would be better: [Apache Commons Text’s StringEscapeUtils][1] or [JSoup’s Cleaner][2]?

Thanks!

Update:

I went with JSoup after writing some unit tests for both it and Apache Commons Text.

I like how JSoup won’t mess with single quotation marks (i.e. “Alan’s mom” is left unchanged, whereas Apache Commons Text escapes the apostrophe into an HTML entity).

And the whitelist wasn’t a problem at all. It didn’t require any configuration; JSoup ships with some built-in whitelist options that may come in handy if we later choose to allow a subset of HTML tags.

[1]: https://commons.apache.org/proper/commons-text/apidocs/org/apache/commons/text/StringEscapeUtils.html
[2]: http://jsoup.org/cookbook/cleaning-html/whitelist-sanitizer
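For reference, this is roughly what those unit tests boil down to. A minimal sketch, assuming Commons Text and a jsoup version that still exposes `Whitelist` (newer releases rename it `Safelist`) are on the classpath; the class and variable names are just for illustration:

```java
import org.apache.commons.text.StringEscapeUtils;
import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist; // called Safelist in newer jsoup releases

public class SanitizerComparison {
    public static void main(String[] args) {
        // \u2019 is a typographic (curly) apostrophe, as pasted from a word processor.
        String input = "<script>alert('xss')</script><b>Alan\u2019s mom</b>";

        // Jsoup with an empty whitelist drops every tag (and the script body,
        // which jsoup treats as a data node) but keeps ordinary text as-is,
        // curly apostrophe included.
        String cleaned = Jsoup.clean(input, Whitelist.none());
        System.out.println(cleaned);
        // Alan’s mom

        // Commons Text escapes rather than removes: the markup survives as
        // entities, and the curly apostrophe becomes &rsquo;.
        String escaped = StringEscapeUtils.escapeHtml4(input);
        System.out.println(escaped);
        // &lt;script&gt;alert('xss')&lt;/script&gt;&lt;b&gt;Alan&rsquo;s mom&lt;/b&gt;
    }
}
```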


Answer

“Better”? I don’t think it matters much. Cleaner has Whitelist.none(), and the escape utils will escape everything.

It depends on how you want the “cleaned” input to render: do you want just the text nodes, or do you want the escaped HTML to show up?
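To make that rendering difference concrete, here is a small sketch (same caveat as above: newer jsoup versions call the class Safelist instead of Whitelist). The Cleaner keeps only the text nodes, while escaping produces markup that a page will display literally:

```java
import org.apache.commons.text.StringEscapeUtils;
import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist;

public class RenderDifference {
    public static void main(String[] args) {
        String input = "Hello <b>world</b>";

        // Text nodes only: a page rendering this shows "Hello world".
        String textOnly = Jsoup.clean(input, Whitelist.none());
        System.out.println(textOnly);     // Hello world

        // Escape everything: a page rendering this shows the literal
        // string "Hello <b>world</b>".
        String escapedHtml = StringEscapeUtils.escapeHtml4(input);
        System.out.println(escapedHtml);  // Hello &lt;b&gt;world&lt;/b&gt;
    }
}
```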
