How to get the result of a job in another job in Flink? [closed]

Here is the situation.
I have two data sources, a message queue and a MySQL table, which can be regarded as a DataStream and a DataSet respectively. I want to start a DataStream job that pulls data from the message queue and performs some calculation. During that calculation, a DataSet job over the MySQL table is needed, and its OutputFormat should return the result to the DataStream job.
I’m stuck here and need some help.

Answer

You cannot mix the DataStream and DataSet APIs in the same job. But there are ways to access MySQL from a streaming job. You can:

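  1. query MySQL from a FlatMapFunction, one lookup per record
  2. use async I/O to do the same lookups without blocking on each one, which is more efficient
  3. stream in the data from MySQL as a change stream, using something like Debezium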
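
To make the difference between options 1 and 2 concrete, here is a minimal plain-Java sketch of the idea behind async lookups. It is not Flink code: the class `AsyncLookupSketch`, its methods, and the in-memory map standing in for the MySQL table are all hypothetical. In a real job the same pattern would probably live in an `AsyncFunction` applied with `AsyncDataStream`, and the lookup would go through a database client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncLookupSketch {
    // Assumption: an in-memory map stands in for the MySQL table.
    private final Map<String, String> table;

    // Small pool of daemon threads so pending lookups never keep the JVM alive.
    private final ExecutorService pool = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    public AsyncLookupSketch(Map<String, String> table) {
        this.table = table;
    }

    // Returns immediately; the lookup itself runs on a pool thread.
    public CompletableFuture<String> lookup(String key) {
        return CompletableFuture.supplyAsync(
            () -> key + "=" + table.getOrDefault(key, "?"), pool);
    }

    // Issue every lookup first, then collect the results in input order.
    // A blocking flatMap would instead wait for each lookup before starting the next.
    public List<String> enrichAll(List<String> keys) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (String key : keys) {
            pending.add(lookup(key));
        }
        List<String> results = new ArrayList<>();
        for (CompletableFuture<String> future : pending) {
            results.add(future.join());
        }
        return results;
    }
}
```

With several lookups in flight at once, the total wait is closer to the slowest single lookup than to the sum of all of them, which is where the efficiency gain of option 2 comes from.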

Depending on how you want to connect the data from MySQL to your other stream(s), you may want to use a CoFlatMapFunction or a CoProcessFunction on the two connected streams.
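
As a rough illustration of what such a two-input function does, the sketch below models the logic in plain Java, without Flink dependencies. The class `EnrichmentLogic` and its method names are hypothetical; `onRow` and `onEvent` play the roles of the `flatMap1` and `flatMap2` methods of a `CoFlatMapFunction`, and the `HashMap` stands in for what would normally be Flink keyed state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class EnrichmentLogic {
    // Latest MySQL row seen per key (assumption: one String value per key).
    private final Map<String, String> latestRow = new HashMap<>();

    // Counterpart of flatMap1: a row or change event arriving from MySQL.
    public void onRow(String key, String value) {
        latestRow.put(key, value);
    }

    // Counterpart of flatMap2: an event arriving from the message queue.
    public void onEvent(String key, String payload, Consumer<String> out) {
        String row = latestRow.get(key);
        if (row != null) {
            out.accept(payload + " -> " + row);
        }
        // Events that arrive before their row are dropped in this sketch.
    }

    public static void main(String[] args) {
        EnrichmentLogic logic = new EnrichmentLogic();
        List<String> out = new ArrayList<>();
        logic.onEvent("u1", "click", out::add); // no row yet, nothing emitted
        logic.onRow("u1", "Alice");
        logic.onEvent("u1", "view", out::add);  // emitted, enriched with the row
        System.out.println(out);
    }
}
```

Dropping early events is a simplification. If they must not be lost, a CoProcessFunction is likely the better fit, since it can buffer them in state until the matching row shows up.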

User contributions licensed under: CC BY-SA