Tag: apache-beam

Apache Beam Resampling of columns based on date

I am using Apache Beam to process data and am trying to achieve the following: (1) read the data from a CSV file (completed); (2) group the records by customer ID (completed); (3) resample the data by month and calculate the sum for each month. Detailed explanation: I have a CSV file as shown below. customerId date amount BS:89481 11/14/2012 124 BS:89480
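
One possible way to approach the third step is to key each record by (customer ID, month) and sum the amounts per key. The sketch below is not taken from the question: the function names are hypothetical, the date format is assumed to be MM/DD/YYYY (inferred from the sample value 11/14/2012), and every sample row except the first is made up for illustration. `monthly_totals` is a plain-Python reference for the logic; `build_monthly_sum` shows how the same keying function might be wired into a Beam pipeline with `beam.Map` and `beam.CombinePerKey`, and needs `apache_beam` installed.

```python
from collections import defaultdict
from datetime import datetime


def to_month_key(row):
    """Turn (customerId, date, amount) into ((customerId, 'YYYY-MM'), amount)."""
    customer_id, date_text, amount = row
    # Assumption: dates are MM/DD/YYYY, as in the sample value 11/14/2012.
    month = datetime.strptime(date_text, "%m/%d/%Y").strftime("%Y-%m")
    return (customer_id, month), float(amount)


def monthly_totals(rows):
    """Plain-Python reference: sum amounts per (customerId, month)."""
    totals = defaultdict(float)
    for row in rows:
        key, amount = to_month_key(row)
        totals[key] += amount
    return dict(totals)


def build_monthly_sum(pipeline, rows):
    """Same logic as a Beam transform chain (requires apache_beam)."""
    import apache_beam as beam  # imported lazily so the helpers above run without Beam

    return (
        pipeline
        | "CreateRows" >> beam.Create(rows)
        | "KeyByCustomerMonth" >> beam.Map(to_month_key)
        | "SumPerMonth" >> beam.CombinePerKey(sum)
    )


SAMPLE_ROWS = [
    ("BS:89481", "11/14/2012", "124"),  # row from the question
    ("BS:89481", "11/20/2012", "76"),   # hypothetical row
    ("BS:89481", "12/03/2012", "50"),   # hypothetical row
]

if __name__ == "__main__":
    for key, total in sorted(monthly_totals(SAMPLE_ROWS).items()):
        print(key, total)
```

Keying by the composite (customer ID, month) tuple avoids a separate group-by-customer step before the monthly sum; if the per-customer grouping is needed for other reasons, the month key could instead be derived inside each group.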

Reading XML with namespace using Apache Beam XmlIO

I am trying to read an XML file into an Apache Beam pipeline. Some elements use namespace prefixes, and the namespaces are declared on the root node. I am able to parse the XML outside of Apache Beam using the standard JAXB parser. However, when I read it with Beam's XmlIO, I get the following exception: com.ctc.wstx.exc.WstxParsingException: Undeclared namespace prefix