
How to find Self-Closing Tags with org.w3c.dom

Does anybody know how to find the self-closing tags in an XML document?
I am able to get all the elements of a specific type, but I am unable to find the elements that are self-closing. I also need to find the elements that have no attributes.

var dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
var db = dbf.newDocumentBuilder();

var urlToFile = MyClass.class.getClassLoader().getResource("file.xml");
var file = Paths.get(urlToFile.toURI()).toFile();
var doc = db.parse(file);

doc.getDocumentElement().normalize();

var list = doc.getElementsByTagName("myTag");

for (int i = 0; i < list.getLength(); i++) {

     var node = list.item(i);

     if (node.getNodeType() == Node.ELEMENT_NODE) {

          var bits = node.getChildNodes();

          for (int j = 0; j < bits.getLength(); j++) {

               if (bits.item(j).hasAttributes()) {
                    // var parrentAttrName = bits.item(j).getNodeName();
                    // getValueFromAttribute is my private method
                    var nameAttrValue = getValueFromAttribute(bits, j, "name");
                    var stateAttrValue = getValueFromAttribute(bits, j, "state");

                    bits.addElementToList(new MyBit(nameAttrValue, stateAttrValue));
                }

                if (!bits.item(j).hasAttributes()) {
                     // not working 
                     System.out.println(bits.item(j));
                }
          }
     }
}

My XML file has two types of myTag tags:

  1. Paired tags that contain nested child elements: <myTag><someElementHere /></myTag>
  2. Self-closing tags that specify some other behaviour: <myTag/>

Is there a mechanism to find this kind of element? One option would be to match self-closing tags with a regex, but I was hoping for a different solution.


Answer

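Self-closing tags have no children, but neither do empty tags such as <empty></empty>, and the parsed DOM does not distinguish between the two forms. That said, XPath can be used to find elements with no children, or elements with attributes.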
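To illustrate the first point with plain org.w3c.dom, here is a minimal sketch. The class name EmptyElementDemo and the helpers parse and isEmpty are hypothetical, and the inline XML is a made-up variation on the question's myTag example. It only shows that <myTag/> and <myTag></myTag> look the same once parsed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class EmptyElementDemo {

    // Hypothetical helper: parse an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        var dbf = DocumentBuilderFactory.newInstance();
        return dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // True when the element has no child nodes at all (no elements, no text).
    // This holds for <myTag/> and for <myTag></myTag> alike.
    static boolean isEmpty(Element e) {
        return !e.hasChildNodes();
    }

    public static void main(String[] args) throws Exception {
        // Illustrative input: self-closing, empty pair, pair with a child.
        var doc = parse("<root><myTag/><myTag></myTag>"
                + "<myTag><someElementHere/></myTag></root>");

        NodeList list = doc.getElementsByTagName("myTag");
        for (int i = 0; i < list.getLength(); i++) {
            Element e = (Element) list.item(i);
            System.out.println(i + ": empty=" + isEmpty(e)
                    + ", hasAttributes=" + e.hasAttributes());
        }
    }
}
```

The first two elements both report empty=true, so a check like this finds "elements without content" rather than "self-closing tags" specifically.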

Given

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <test/>
    <test a="a"/>
    <empty></empty>
    <test>
        <a>a</a>
    </test>
    <test>text</test>
    <deep>
        <some b="b" />
    </deep>
</root>

Find elements with no children (no descendant elements and no text nodes) with //*[count(./descendant::*) = 0 and count(./text()) = 0]

xmllint --shell test.xml
/ > cat //*[count(./descendant::*) = 0 and count(./text()) = 0]
<test/>
 -------
<test a="a"/>
 -------
<empty/>
 -------
<some b="b"/>

Find elements with attributes with xpath //*[count(./@*)> 0]

/ > cat //*[count(./@*)> 0]
 -------
<test a="a"/>
 -------
<some b="b"/>

Note: XPath is language-agnostic, so the same expressions should work in Java.
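As a rough sketch of how that might look with the standard javax.xml.xpath API: the class name XPathEmptyElements, the helpers parse and names, and the compacted SAMPLE string are assumptions made for illustration. NO_ATTRIBUTES is not part of the answer above; it is simply the negation of the attribute query, added because the question asks for elements without attributes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XPathEmptyElements {

    // The two expressions from the answer.
    static final String NO_CHILDREN =
            "//*[count(./descendant::*) = 0 and count(./text()) = 0]";
    static final String WITH_ATTRIBUTES = "//*[count(./@*) > 0]";
    // Assumption: negated form for "no attributes", not taken from the answer.
    static final String NO_ATTRIBUTES = "//*[count(./@*) = 0]";

    // The sample document from the answer, compacted onto one line.
    static final String SAMPLE =
            "<root><test/><test a='a'/><empty></empty><test><a>a</a></test>"
            + "<test>text</test><deep><some b='b'/></deep></root>";

    // Hypothetical helper: parse an XML string with secure processing enabled.
    static Document parse(String xml) throws Exception {
        var dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        return dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Hypothetical helper: evaluate an XPath expression and collect
    // the names of the matching nodes.
    static List<String> names(Document doc, String expression) throws Exception {
        var xpath = XPathFactory.newInstance().newXPath();
        var nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getNodeName());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        var doc = parse(SAMPLE);
        System.out.println("no children:     " + names(doc, NO_CHILDREN));
        System.out.println("with attributes: " + names(doc, WITH_ATTRIBUTES));
        System.out.println("no attributes:   " + names(doc, NO_ATTRIBUTES));
    }
}
```

In the question's code, the same expressions could be evaluated against the doc returned by db.parse(file), for example with //myTag in place of //* to restrict the search to myTag elements.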

User contributions licensed under: CC BY-SA