We are creating a simple apache beam pipeline which inserts data into a bigquery table and We are trying to get the tableRows which have been successfully Inserted into the table and tableRows which are errored, the code is as shown in the screenshot
According to the following documentation:
https://beam.apache.org/releases/javadoc/2.33.0/org/apache/beam/sdk/io/gcp/bigquery/WriteResult.html
BigQueryIO.writeTableRows()
returns a WriteResult
object which has getSuccessfulInserts()
which will return a PCollection<TableRow>
which contains the TableRows
that have been successfully inserted and getFailedInserts()
will return another PCollection<TableRow>
which contains the TableRows
that have been failed.
But when we are testing the pipeline getFailedInserts()
seems to be working, it is getting the TableRows
which are failed, but getSuccessfulInserts()
is always getting an empty PCollection
we have tried everything but it doesn’t seem to work. Are we missing something here?
Thanks in Advance.
Advertisement
Answer
This is a bug introduced in the Apache Beam 2.36.0 release following some refactoring. The fix is in https://github.com/apache/beam/pull/16768 and will be in the 2.37.0 release.