Reading XML with namespace using Apache Beam XmlIO

I am trying to read an XML file into an Apache Beam pipeline. Some elements use namespace prefixes, and the namespaces are declared on the root node. I can parse the XML outside of Apache Beam with the standard JAXB parser. However, when I use XmlIO.read() in Beam, I get the following exception:

com.ctc.wstx.exc.WstxParsingException: Undeclared namespace prefix "g".

Beam code:

XML without namespaces works fine. Any pointers are much appreciated. Thanks.

Answer

Looking at the XmlSource code, I unfortunately don’t think it supports XML namespaces by default if you only specify a root element.

As a workaround, though, you can try something like this:

and probably it will work.
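
To illustrate why a prefix declared only on the real root can end up "undeclared", here is a minimal sketch that uses the JDK's built-in StAX parser rather than Beam or Woodstox. It rests on an assumption about XmlSource: that each record is parsed inside a synthetic root start tag built from the configured root element name, so declarations on the file's actual root never reach the parser. The class name, the sample record, and the namespace URI are all hypothetical. If that assumption holds, a string passed to withRootElement that also carries the xmlns declaration would behave like the second call below, but that is untested against Beam here.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class NamespaceWrapDemo {
    // Hypothetical record as a record-by-record reader might see it:
    // the prefix "g" is used, but its declaration lives on the real root.
    static final String RECORD = "<item><g:id>42</g:id></item>";

    /**
     * Wraps RECORD in a synthetic root built from the given start-tag content
     * (element name plus optional attributes) and reports whether it parses.
     */
    static boolean parses(String rootStartTagContent) {
        // The closing tag needs only the element name, not the attributes.
        String rootName = rootStartTagContent.split("\\s+")[0];
        String doc = "<" + rootStartTagContent + ">" + RECORD + "</" + rootName + ">";
        try {
            // StAX factories are namespace-aware by default.
            XMLInputFactory factory = XMLInputFactory.newFactory();
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(doc));
            while (reader.hasNext()) {
                reader.next(); // throws on an unbound prefix
            }
            reader.close();
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Synthetic root without the declaration: the "g" prefix is unbound.
        System.out.println(parses("feed"));
        // Synthetic root that carries the declaration: the "g" prefix resolves.
        System.out.println(parses("feed xmlns:g=\"http://example.com/ns\""));
    }
}
```

The point of the sketch is only that the namespace declaration has to be visible on whatever root the parser actually sees; whether XmlIO accepts such a root element string depends on the Beam version in use.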
