Catch PDFBox warnings when loading erroneous PDFs

When loading a PDF with PDFBox, one gets log-level warnings if the PDF is erroneous.

For example, loading such a file can print parser warnings to the console.

Obviously, the PDF has some errors in the content stream, but it does load into doc. Would it be possible to catch these warnings programmatically with PDFBox? Are there any properties that tell you about the warnings after the document has been loaded?

I’ve tried PDFBox-Preflight, but that checks for PDF/A compliance, which produces many more messages.

Answer

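Try the non-lenient mode of the parser, as the ShowSignature.java example does: in PDFBox 2.x it creates a PDFParser and calls setLenient(false) on it before parsing. With leniency switched off, the parser stops trying to repair a malformed file and is expected to fail with an exception instead, which your code can catch.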
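If you would rather keep lenient loading and still see the warnings, another option may be to listen on the logging side. The sketch below assumes that PDFBox's log output ends up in java.util.logging, which is where Commons Logging usually falls back when no other backend such as Log4j is configured; that is an assumption about your setup, not something confirmed by the question. WarningCollector is a hypothetical helper name, and the demo emits a stand-in warning itself instead of calling PDFBox, so it runs without the library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical helper: remembers every WARNING-or-higher record it receives. */
class WarningCollector extends Handler {
    private final List<String> warnings =
            Collections.synchronizedList(new ArrayList<>());

    @Override
    public void publish(LogRecord record) {
        // Ignore INFO and below; keep logger name and message of the rest.
        if (record.getLevel().intValue() >= Level.WARNING.intValue()) {
            warnings.add(record.getLoggerName() + ": " + record.getMessage());
        }
    }

    @Override
    public void flush() { }

    @Override
    public void close() { }

    /** Returns a snapshot of the warnings collected so far. */
    List<String> getWarnings() {
        return new ArrayList<>(warnings);
    }

    public static void main(String[] args) {
        // A handler on the parent logger also receives records from child
        // loggers, so "org.apache.pdfbox" covers the individual parser classes
        // (assuming they log under their class names).
        Logger pdfboxLogger = Logger.getLogger("org.apache.pdfbox");
        WarningCollector collector = new WarningCollector();
        pdfboxLogger.addHandler(collector);
        try {
            // Stand-in for the real PDFBox load call, which would log here.
            Logger.getLogger("org.apache.pdfbox.pdfparser.COSParser")
                  .warning("stand-in warning");
        } finally {
            pdfboxLogger.removeHandler(collector);
        }
        System.out.println(collector.getWarnings());
    }
}
```

In real use the stand-in line would be replaced by the document load, and a non-empty getWarnings() afterwards would tell you that the parser complained. If your application routes Commons Logging to Log4j or another backend, the same idea would need that backend's appender mechanism instead of a java.util.logging Handler.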