I want to transform a non-parallel data source to a parallel data source in Apache Flink. In pseudocode, it would be something like this:
int partitions = env.getParallelim(); DataSource<String> input = new CustomDataSource<String>(); DataSource<String> parallel = input.setParallelism(partitions).suffle();
I got it done by implementing a noop map function but I suppose there are more elegant ways.
Thanks
Advertisement
Answer
You can use ParallelSourceFunction
instead of SourceFunction
as interface to be implemented in CustomDataSource
.