
XML parsers should not be vulnerable to XXE attacks. Best way to solve with ZERO impact?

I have a few SonarQube vulnerabilities and one of them caught my eye.

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {           
    DocumentBuilder db = dbf.newDocumentBuilder();
    dom = db.parse(sIn);
} catch (ParserConfigurationException pce) {
    throw pce;
} catch (SAXException se) {
    throw se;
} catch (IOException ioe) {
    throw ioe;
}

As you can see, I create a new DocumentBuilder and then parse this input:

InputStream sIn = new ByteArrayInputStream(contenidoXml.getBytes(StandardCharsets.UTF_8));

The Sonar “solution” is to do one of the following things:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
// to be compliant, completely disable DOCTYPE declaration:
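factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// or completely disable external entities declarations:
factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);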
// or prohibit the use of all protocols by external entities:
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");

This is legacy code and I am quite lost here. Could someone explain the differences between the three solutions and which one is most likely to have zero impact on the code? (We have to update several classes, but the last time this was deployed SQ didn’t even exist in my company.)



Refer to this for general XXE information.

Disable DOCTYPE declaration – The parser rejects any document that contains a DOCTYPE declaration. The possible impact is that documents can no longer be validated against a DTD; only a well-formedness check is done.

Disable External Entity – External entities declared in the document will not be dereferenced. If such an entity is used in the document, its value will be empty or, depending on the underlying parser, the parser could throw a parse exception.

Prohibit use of protocol – The parser will not use any protocol to access an external DTD or schema. Every external DTD or schema must be available locally through its SYSTEM identifier or be registered with the parser.
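The three options can be sketched side by side as follows. This is a minimal illustration that assumes the JDK's built-in Xerces-based parser. The class name `XxeOptions` and its method names are hypothetical, and the feature URIs are the standard Xerces and SAX2 identifiers rather than something quoted from the post.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical helper: one factory per option described above.
public final class XxeOptions {
    private XxeOptions() {}

    // Option 1: any document containing a DOCTYPE declaration is rejected.
    public static DocumentBuilderFactory noDoctype() throws ParserConfigurationException {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return f;
    }

    // Option 2: DOCTYPE is still allowed, but external general and
    // parameter entities are not loaded.
    public static DocumentBuilderFactory noExternalEntities() throws ParserConfigurationException {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setFeature("http://xml.org/sax/features/external-general-entities", false);
        f.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        return f;
    }

    // Option 3: an empty protocol list denies every protocol for external
    // DTDs and schemas. These attributes need a JAXP 1.5 implementation;
    // an older parser may throw IllegalArgumentException here.
    public static DocumentBuilderFactory noExternalAccess() {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        f.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        return f;
    }
}
```

Options 2 and 3 still accept a document with an internal DTD subset, so they tend to be less disruptive for legacy input that carries a DOCTYPE, while option 1 is the strictest of the three.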

The method you choose depends largely on the documents you are parsing. If they are validated against a schema rather than a DTD, disabling DOCTYPE could be a good solution. If the documents are guaranteed not to use external entities, disabling both DOCTYPE and external entities could be a better approach.
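As a rough sketch of what the DOCTYPE-disabling choice could look like when applied to the code in the question: the wrapper class `SafeXml` is hypothetical, the `contenidoXml` name is borrowed from the question, and the feature URI is the standard Xerces identifier, assumed to be supported by the parser in use.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical wrapper around the parsing code from the question.
public final class SafeXml {
    private SafeXml() {}

    public static Document parse(String contenidoXml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // The only new line compared with the original code: documents that
        // contain a DOCTYPE declaration now fail with a SAXException.
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        try (InputStream sIn =
                new ByteArrayInputStream(contenidoXml.getBytes(StandardCharsets.UTF_8))) {
            return db.parse(sIn);
        }
    }
}
```

The exceptions are declared rather than caught and rethrown, which behaves the same as the catch blocks in the question. Whether the change has zero impact depends on whether any of the XML the application receives contains a DOCTYPE declaration; input without one is parsed exactly as before.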

User contributions licensed under: CC BY-SA