We are in the process of designing the migration of our monolithic Java application to microservices to meet various client requirements such as scalability, high availability, etc. The core function of our application is data processing, i.e. retrieve data from a source, put it through 0 or more transformations, and finally push the result to a destination. For this reason, we are looking at Spring Cloud Data Flow running on Kubernetes and Kafka to do the heavy lifting for us, with a few custom built stream applications to handle our business logic.
One thing we have not yet figured out though is how it could handle synchronous responses to requests being sent in via an HTTP source – specifically when some processing is required before responding. For example, let us say that a request is received containing two different amounts in a JSON packet. We then pass this on to a custom “addition” transformer that outputs the sum of these amounts, and needs to return the result back to the calling party. However, because the transformer is a completely separate process that received the data by consuming it from a Kafka topic, it has no access to the original HTTP connection to respond.
Is this something that is possible with Spring Cloud Data Flow, perhaps by combining it with something like Spring Cloud Gateway to manage the HTTP connection? Or are we barking up the wrong tree completely?
It is not easy combine Async Flows (Spring Cloud Data Flow) with a Sync HTTP Flow (HTTP Requests have a timeout and Async Flow processing time is not known upfront). What you can do is return an id as response for your HTTP source and have another HTTP Endpoint to check the status of the initial request and get the result back using that. This is a polling approach.