
How to design my process – loop over a list (cannot use a simple ListItemReader because the list is not at root level) + 1 query for each item – Spring Batch

If some of you could help me design my Spring Batch process, it would be nice 😀 I need to perform ETL by consuming a REST API and then storing some of its data. This process must run daily, and Spring Batch seems perfect for what I want, since we already use the Spring framework for a lot of things at the company I work at. But I am struggling with how to design my job(s?), tasklets, etc.

Could you please help me work out the most appropriate way to do what I want?

Summary of what I need to do:

  1. Consume a summary list of all Items
  2. Loop over those items to retrieve an HREF field
  3. Query each HREF
  4. Insert in DB (only the data I need, 90% of the data are useless for me)

I am wondering how I should translate those steps into the Spring Batch way. Should I create 1 tasklet + 1 chunk job, where the tasklet fetches the main list and writes the HREFs to a local file, and the job then reads from the local file and writes to the DB? (It's only about 10k items, so a local file would be OK.) Or should I create only 1 tasklet where the reader does both: query the summary and then each individual endpoint? Which one would perform best? I don't need maximum performance; I'm quite new to Spring Batch and I'm wondering how to design the processing 🙂
Thanks!!


EDIT: I cannot use a simple list because the list is not at root level but in a “data” property at root level. Also, by “Query each HREF” I meant performing an API call using the HREF value, which is a link to the endpoint for a single item's data. I must query it because I need data that is not present in the first list returned by the API.


EDIT 2 : See comments on accepted answer for solution.


Answer

How to design my process – loop over a list + 1 query for each item – spring batch

You can create a chunk-oriented step as follows:

  • An item reader that returns items from the list (ListItemReader might work)
  • An item processor that queries each item's HREF and enriches the item with the data it returns
  • A JdbcBatchItemWriter to insert items in the DB
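
The flow of these three components can be sketched in plain Java, without any Spring Batch classes, to show what a chunk-oriented step does with a driving list. This is only an illustration under assumptions: the record names (`SummaryResponse`, `SummaryItem`, `Detail`, `Row`), their fields, and the stubbed `detailClient` are all hypothetical, and the real step would use Spring Batch's reader, processor and writer instead of this hand-written loop. The `SummaryResponse` wrapper reflects the question's point that the list sits under a “data” property rather than at root level.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

public class DrivingQuerySketch {

    // Hypothetical root object of the summary response: the list is under "data".
    record SummaryResponse(List<SummaryItem> data) {}

    // Hypothetical summary entry; only the link to the detail endpoint is used.
    record SummaryItem(String href) {}

    // Hypothetical payload of the detail endpoint (mostly fields we do not need).
    record Detail(String id, String name, String unusedField) {}

    // The subset of the detail that is actually stored.
    record Row(String id, String name) {}

    // Reads items one by one, performs one detail lookup per item,
    // and hands the results to the writer in chunks of chunkSize.
    static void runStep(SummaryResponse response,
                        Function<String, Detail> detailClient,
                        int chunkSize,
                        Consumer<List<Row>> writer) {
        // "Reader": unwrap the list from the "data" property and iterate over it.
        Iterator<SummaryItem> reader = response.data().iterator();
        List<Row> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            SummaryItem item = reader.next();
            // "Processor": one call per item, keeping only the needed fields.
            Detail detail = detailClient.apply(item.href());
            chunk.add(new Row(detail.id(), detail.name()));
            if (chunk.size() == chunkSize) {
                // "Writer": one batch insert per full chunk.
                writer.accept(List.copyOf(chunk));
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(List.copyOf(chunk)); // flush the last partial chunk
        }
    }

    public static void main(String[] args) {
        SummaryResponse response = new SummaryResponse(List.of(
                new SummaryItem("/items/1"),
                new SummaryItem("/items/2"),
                new SummaryItem("/items/3")));
        // Stub standing in for the real HTTP call to each HREF.
        Function<String, Detail> stubClient = href -> {
            String id = href.substring(href.lastIndexOf('/') + 1);
            return new Detail(id, "item-" + id, "ignored");
        };
        runStep(response, stubClient, 2, chunk -> System.out.println("writing " + chunk));
    }
}
```

In an actual job, the unwrapped `data` list would be handed to the item reader, the lookup would live in an `ItemProcessor`, and the chunk size would be set on the step, so the framework rather than this loop would handle chunking, transactions and restarts.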

This is a common pattern, documented in the Spring Batch reference as “Driving Query Based ItemReaders”. That said, this pattern works well with small and medium data sets, but not with large ones, because it requires one or more queries for each item. The following threads might be helpful with regard to that matter:
