Skip to content

Tag: apache-beam

Apache Beam Split to Multiple Pipeline Output

I am subscribing from one topic and contains different event types and they pass in with different attributes. After I read the element, based on their attribute, I need to move them to different places. This is the sample code look like: Basically I read an event and filter them on their attribute and write the file. The job failed

Apache Beam BigqueryIO.Write getSuccessfulInserts not working

We are creating a simple apache beam pipeline which inserts data into a bigquery table and We are trying to get the tableRows which have been successfully Inserted into the table and tableRows which are errored, the code is as shown in the screenshot According to the following documentation: BigQueryIO.writeTableRows() returns a WriteResult object which has getSuccessfulInserts() which will

Apache Beam Resampling of columns based on date

I am using ApacheBeam to process data and trying to achieve the following. read the data from CSV file. (Completed ) Group the records based on Customer ID (Completed) Resample the data based on month and calculate the sum for that particular month. Detailed Explanation: I have a CSV file as shown below. customerId date amount BS:89481 11/14/2012 124 BS:89480

Beam PAssert messes up the Row

I am exploring testing with Beam and encountered a weird problem. My driver program works as expected, but its test is failing with an error like this: And here is my PAssert code: On the last step of my pipeline, I log the element in question. This is the expected result. When I debugged the test, the problem boiled down