A user enters text as HTML in a form. I want to be able to output only part of the string (for example, the first 20 characters) without breaking the HTML structure of the user's input. Is there a Java library able to do this, or a simple method
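One possible approach, offered as a rough sketch rather than a definitive answer: walk the markup, count only the visible text characters, stop once the limit is reached, and then close whatever tags are still open. The class name `HtmlTruncator`, the method `truncate`, and the short list of void elements are all hypothetical names chosen for illustration. The sketch assumes reasonably well-formed input, uses a simple regular expression instead of a real HTML parser, and counts an entity such as `&amp;` as several characters, so it may cut an entity in half.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HtmlTruncator {
    // Matches a single tag: optional leading '/', tag name, attributes, optional trailing '/'.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>");

    // Assumption: a small, incomplete list of elements that never take a closing tag.
    private static final Set<String> VOID = Set.of("br", "hr", "img", "input", "meta", "link");

    /** Keeps at most maxChars characters of visible text and closes any tags left open. */
    static String truncate(String html, int maxChars) {
        StringBuilder out = new StringBuilder();
        Deque<String> open = new ArrayDeque<>();   // stack of currently open tag names
        Matcher tag = TAG.matcher(html);
        int pos = 0;
        int remaining = maxChars;

        while (pos < html.length() && remaining > 0) {
            boolean found = tag.find(pos);
            int textEnd = found ? tag.start() : html.length();

            // Copy as much of the text before the next tag as the budget allows.
            int take = Math.min(textEnd - pos, remaining);
            out.append(html, pos, pos + take);
            remaining -= take;
            if (!found || remaining == 0) {
                break;
            }

            // Copy the tag itself (tags do not count against the budget) and track nesting.
            String name = tag.group(2).toLowerCase();
            boolean closing = !tag.group(1).isEmpty();
            boolean selfClosing = !tag.group(3).isEmpty() || VOID.contains(name);
            out.append(tag.group());
            if (closing) {
                if (name.equals(open.peek())) {
                    open.pop();
                }
            } else if (!selfClosing) {
                open.push(name);
            }
            pos = tag.end();
        }

        // Close everything still open, innermost first.
        while (!open.isEmpty()) {
            out.append("</").append(open.pop()).append('>');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: <p>Hello <b>bi</b></p>
        System.out.println(truncate("<p>Hello <b>big</b> world</p>", 8));
    }
}
```

For untrusted or messy user input, a real HTML parser would likely be more robust than the regular expression used here. The same idea (count text, stop at the limit, close open elements) should carry over to a parser-based version.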
Tag: html-parsing
How can I efficiently parse HTML with Java?
I do a lot of HTML parsing in my line of work. Until now, I have been using the HtmlUnit headless browser for both parsing and browser automation. Now I want to separate the two tasks. I want to use a lightweight HTML parser because HtmlUnit takes a long time to first load a page, then get the source and