Developing a job for Flink

I am building a simple data pipeline for learning purposes. I have real-time data coming from Kafka, and I would like to do some transformations using Flink.

Unfortunately, I’m not sure I understand the deployment options correctly. In the Flink docs I found a section about Docker Compose and Application Mode. It says that I can only deploy one job to Flink:

A Flink Application cluster is a dedicated cluster which runs a single job. In this case, you deploy the cluster with the job as one step, thus, there is no extra job submission needed.
The job artifacts are included into the class path of Flink's JVM process within the container and consist of:

  • your job jar, which you would normally submit to a Session cluster and
  • all other necessary dependencies or resources, not included into Flink.

To deploy a cluster for a single job with Docker, you need to

  • make job artifacts available locally in all containers under /opt/flink/usrlib,
  • start a JobManager container in the Application cluster mode
  • start the required number of TaskManager containers.
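The steps above can be sketched as a Docker Compose file. This is a minimal sketch based on the pattern in the Flink docs; the Flink version, the main class name, and the local `./target` path are assumptions you would adapt to your own build:

```yaml
version: "3.8"
services:
  jobmanager:
    image: flink:1.17                # assumed Flink version
    # Application Mode: the JobManager starts the job directly.
    # com.example.MyJob is a hypothetical main class.
    command: standalone-job --job-classname com.example.MyJob
    ports:
      - "8081:8081"                  # Flink web UI
    volumes:
      - ./target:/opt/flink/usrlib   # job jar + extra dependencies
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
  taskmanager:
    image: flink:1.17
    command: taskmanager
    depends_on:
      - jobmanager
    volumes:
      - ./target:/opt/flink/usrlib   # must see the same artifacts
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
```

Scaling out is then just `docker compose up --scale taskmanager=3`.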

On the other hand, I found examples on GitHub using the flink-java artifact, without running any Docker image.

What is the difference, and why is the second option not mentioned in the Flink docs?
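For context, the GitHub examples in question typically look something like the sketch below: a plain Java program that depends on flink-java / flink-streaming-java and, when executed without any cluster, spins up an embedded local MiniCluster inside the same JVM. The class name and data are made up for illustration:

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class WordLengthJob {
    public static void main(String[] args) throws Exception {
        // When no cluster is configured, this returns a local environment
        // backed by an embedded MiniCluster inside this JVM process.
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("kafka", "flink", "pipeline") // stand-in for a Kafka source
           .map(String::length)
           .print();

        env.execute("word-length-job");
    }
}
```

The same main method works unchanged when the jar is submitted to a real cluster; only the environment it runs in differs.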

And, is it possible to deploy Flink job as a separate docker image?



I suggest you take a look at Demystifying Flink Deployments, which gives a good overview.

If you’re interested in setting up a standalone cluster (without Docker, Kubernetes, or YARN), see

And, is it possible to deploy Flink job as a separate docker image?

I’m not sure how to interpret this question. Are you asking whether the Flink client can run in a separate image from the Flink cluster that runs the job? You could dockerize a session cluster and submit a job into that cluster from outside it. You’ll find an example of that in the Flink operations playground. (That playground is a good resource, btw.)
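In concrete terms, submitting into a dockerized session cluster can look like this. The service name `jobmanager` and the jar path are assumptions based on the typical Compose setup; adjust them to your own files:

```shell
# Bring up a session cluster defined in docker-compose.yml
docker compose up -d jobmanager taskmanager

# Submit a job "from outside" the cluster by exec'ing the Flink
# client that ships inside the jobmanager container:
docker compose exec jobmanager flink run /opt/flink/usrlib/my-job.jar
```

Alternatively, a client on the host (or in another container) can target the cluster's exposed REST port with `flink run -m localhost:8081 …`.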

Another approach is to build a single image that can run as either a job manager or a task manager, with the Flink client and all of its dependencies baked into that image. This approach is described in
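In practice that usually means a Dockerfile that layers your job jar onto the official Flink image, so one image can start in either role. A minimal sketch, where the Flink tag and jar name are assumptions:

```dockerfile
FROM flink:1.17

# Bake the job artifact into the image; in Application Mode, Flink
# picks up jars from /opt/flink/usrlib automatically.
COPY target/my-job.jar /opt/flink/usrlib/my-job.jar
```

You then run containers from this one image with `standalone-job --job-classname …` for the job manager and `taskmanager` for the task managers.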

It’s worth noting that a lot of folks aren’t doing any of this directly, and are instead relying on platforms that manage containerized Flink deployments at a higher level.
