Skip to content
Advertisement

javadoc of SingleOutputStreamOperator#returns(TypeHint typeHint) method

I am reading the source code of SingleOutputStreamOperator#returns, its javadoc is:

/**
 * Adds a type information hint about the return type of this operator. This method
 * can be used in cases where Flink cannot determine automatically what the produced
 * type of a function is. That can be the case if the function uses generic type variables
 * in the return type that cannot be inferred from the input type.
 *
 * <p>Use this method the following way:
 * <pre>{@code
 *     DataStream<Tuple2<String, Double>> result =
 *         stream.flatMap(new FunctionWithNonInferrableReturnType())
 *               .returns(new TypeHint<Tuple2<String, Double>>(){});
 * }</pre>
 *
 * @param typeHint The type hint for the returned data type.
 * @return This operator with the type information corresponding to the given type hint.
 */

It mentions FunctionWithNonInferrableReturnType to show case the necessity of returns method, but I am unable to write such a class that is NonInferrableReturnType. Could you please help write a simple one? Thanks!

Advertisement

Answer

When the docs says NonInferrableReturnType it means that we can use the type variable <T>, or any other letter that you prefer. So you can create a MapFunction that return a T. But then you have to use .returns(TypeInformation.of(String.class) for example, if your goal is to return a String.

public class MyMapFunctionNonInferrableReturnType<T> implements MapFunction<AbstractDataModel, T> {
    @Override
    public T map(AbstractDataModel value) throws Exception {
        return (T) value.getValue();
    }
}

Here I am using the classes of your last question Compiling fails when creating MapFunction with super type . The same code without .returns(TypeInformation.of(String.class)) compiles but throw the runtime exception:

could not be determined automatically, due to type erasure. You can give type information hints by using the returns(…) method on the result of the transformation call, or by letting your function implement the ‘ResultTypeQueryable’ interface.

public class NonInferrableReturnTypeStreamJob {

    private final List<AbstractDataModel> abstractDataModelList;
    private final ValenciaSinkFunction sink;

    public NonInferrableReturnTypeStreamJob() {
        this.abstractDataModelList = new ArrayList<AbstractDataModel>();
        this.abstractDataModelList.add(new ConcreteModel("a", "1"));
        this.abstractDataModelList.add(new ConcreteModel("a", "2"));
        this.sink = new ValenciaSinkFunction();
    }

    public NonInferrableReturnTypeStreamJob(List<AbstractDataModel> abstractDataModelList, ValenciaSinkFunction sink) {
        this.abstractDataModelList = abstractDataModelList;
        this.sink = sink;
    }

    public static void main(String[] args) throws Exception {
        NonInferrableReturnTypeStreamJob concreteModelTest = new NonInferrableReturnTypeStreamJob();
        concreteModelTest.execute();
    }

    public void execute() throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromCollection(this.abstractDataModelList)
                .map(new MyMapFunctionNonInferrableReturnType())
                .returns(TypeInformation.of(String.class))
                .addSink(sink);

        env.execute();
    }
}

In case you wish, here is the integration test for this example:

import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
import org.apache.flink.test.util.MiniClusterWithClientResource;
import org.junit.ClassRule;
import org.junit.Test;
import org.sense.flink.examples.stream.valencia.ValenciaSinkFunction;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import static junit.framework.TestCase.assertEquals;
import static org.junit.Assert.assertTrue;

public class NonInferrableReturnTypeStreamJobTest {

    @ClassRule
    public static MiniClusterWithClientResource flinkCluster;
    private final int minAvailableProcessors = 4;
    private final boolean runInParallel;

    public NonInferrableReturnTypeStreamJobTest() {
        int availableProcessors = Runtime.getRuntime().availableProcessors();
        this.runInParallel = availableProcessors >= minAvailableProcessors;
        if (this.runInParallel) {
            flinkCluster = new MiniClusterWithClientResource(
                    new MiniClusterResourceConfiguration.Builder()
                            .setNumberSlotsPerTaskManager(minAvailableProcessors)
                            .setNumberTaskManagers(1)
                            .build());
        }
    }

    @Test
    public void execute() throws Exception {
        List<AbstractDataModel> abstractDataModelList = new ArrayList<AbstractDataModel>();
        abstractDataModelList.add(new ConcreteModel("a", "1"));
        abstractDataModelList.add(new ConcreteModel("a", "2"));
        ValenciaSinkFunction.values.clear();

        NonInferrableReturnTypeStreamJob streamJob = new NonInferrableReturnTypeStreamJob(abstractDataModelList, new ValenciaSinkFunction());
        streamJob.execute();

        List<String> results = ValenciaSinkFunction.values;
        assertEquals(2, results.size());
        assertTrue(results.containsAll(Arrays.asList("1", "2")));
    }
}
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement