How to avoid the warning message when reading BigQuery data into a custom data type: Can’t verify serialized elements of type BoundedSource




I defined a custom data type, following the example here: https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java#L127

I then read data from BigQuery using code based on this snippet: https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java#L375

Warning message: Can’t verify serialized elements of type BoundedSource have well defined equals method. This may produce incorrect results on some PipelineRunner.

This message occurs at step TriggerIdCreation/Read(CreateSource)/Read(CreateSource)/Read(BoundedToUnboundedSourceAdapter)/StripIds.out0

I tried adding an equals() method to the custom data type class, like this:

    @Override
    public boolean equals(Object object) {
        if (this == object) return true;
        if (object == null || getClass() != object.getClass()) return false;
        // Only useful if the superclass overrides equals(); Object.equals() compares identity.
        if (!super.equals(object)) return false;
        // Cast to this class's own type before comparing fields.
        WeatherData that = (WeatherData) object;
        return Objects.equals(xxx, that.xxx) &&
               Objects.equals(yyy, that.yyy);
    }

That did not help.
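For reference, a self-contained version of such a class might look like the sketch below. The field names `station` and `temperature` are hypothetical placeholders (the post only shows `xxx` and `yyy`). The sketch also adds a matching hashCode(), which the equals() contract expects, and omits the super.equals() call on the assumption that the class extends Object directly.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical value type; field names are placeholders, not taken from the post.
final class WeatherData implements Serializable {
    private final String station;
    private final double temperature;

    WeatherData(String station, double temperature) {
        this.station = station;
        this.temperature = temperature;
    }

    @Override
    public boolean equals(Object object) {
        if (this == object) return true;
        if (object == null || getClass() != object.getClass()) return false;
        WeatherData that = (WeatherData) object;
        // Compare every field that defines the value of the object.
        return Double.compare(temperature, that.temperature) == 0
                && Objects.equals(station, that.station);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals(): equal objects give equal hashes.
        return Objects.hash(station, temperature);
    }
}
```

Two instances built from the same field values then compare equal and share a hash code, while instances with different values do not compare equal.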

Does anyone have an idea how to avoid this warning?

Answer

The warning does not appear to be caused by anything in your code. I think it comes from something Apache Beam itself is doing. The type it complains about is BoundedSource, an internal Beam type, not your custom type, which is why adding equals() to your class has no effect. From reading the code, it is most likely related to the BoundedToUnboundedSourceAdapter that appears in the step name.
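To illustrate why the class named in the message matters, here is a minimal sketch of how such a check could work. It assumes the warning is triggered by a reflection test for whether the element class overrides equals(); that is an assumption, not code quoted from Beam. `EqualsCheck`, `PlainSource`, and `ValueSource` are hypothetical names used only for this illustration.

```java
import java.lang.reflect.Method;

// Hypothetical helper, not part of Beam.
final class EqualsCheck {
    // Returns true when the concrete class inherits Object#equals unchanged.
    static boolean usesDefaultEquals(Class<?> clazz) {
        try {
            Method equals = clazz.getMethod("equals", Object.class);
            return equals.getDeclaringClass() == Object.class;
        } catch (NoSuchMethodException e) {
            // Every concrete class has equals(Object) somewhere in its hierarchy.
            throw new AssertionError(e);
        }
    }
}

// Stand-in for a type that does not override equals().
class PlainSource {}

// Stand-in for a type with a value-based equals().
class ValueSource {
    private final int id;

    ValueSource(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ValueSource && ((ValueSource) other).id == id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}
```

Under this assumption, the check flags `PlainSource` but not `ValueSource`. A check of this kind looks only at the class it is given, so an equals() defined on a different class, such as your own data type, would not change the result for BoundedSource.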

If your pipeline is producing correct results, you can probably ignore the warning. If you want to report it, you could contact the Beam user or dev mailing lists.



Source: stackoverflow