Skip to content
Advertisement

Spring Batch – My Batch seems executing two steps at the same time?

I can’t really understand what’s going on. I’m studying Spring Batch and I’d like to execute two steps for some reasons, one after the other.

Now please don’t mind what the steps are currently doing, just keep in mind that I would like to perform two steps sequentially.

This is the code:

@Configuration
@EnableBatchProcessing
public class JobConfiguration {
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;
    
    private List<Employee> employeesToSave = new ArrayList<Employee>();


    public JsonItemReader<Employee> jsonReader() {

        
        System.out.println("Try to read JSON");
        
        final ObjectMapper mapper = new ObjectMapper();

        final JacksonJsonObjectReader<Employee> jsonObjectReader = new JacksonJsonObjectReader<>(
                Employee.class);
        jsonObjectReader.setMapper(mapper);

        return new JsonItemReaderBuilder<Employee>().jsonObjectReader(jsonObjectReader)
                .resource(new ClassPathResource("input.json"))
                .name("myReader")
                .build();

    }
    

    public ListItemReader<Employee> listReader() {
        
        System.out.println("Read from list");
        
        return new ListItemReader<Employee>(employeesToSave);

*/
    }
    

    public ItemProcessor<Employee,Employee> filterProcessor() {
        return employee -> {
            
            System.out.println("Processing JSON");
            
            return employee;
        };
    }

    public ItemWriter<Employee> filterWriter() {

        return listEmployee -> {
            
            employeesToSave.addAll(listEmployee);
            System.out.println("Save on list " + listEmployee.toString());
                        
        };

    }

    public ItemWriter<Employee> insertToDBWriter() {

        System.out.println("Try to save on DB");
        return listEmployee -> {

            System.out.println("Save on DB " + listEmployee.toString());
                        
        };

    }

    public Step filterStep() {
        
        StepBuilder stepBuilder = stepBuilderFactory.get("filterStep");
        SimpleStepBuilder<Employee, Employee> simpleStepBuilder = stepBuilder.chunk(5);
        return simpleStepBuilder.reader(jsonReader()).processor(filterProcessor()).writer(filterWriter()).build();
    }

    public Step insertToDBStep() {
        
        StepBuilder stepBuilder = stepBuilderFactory.get("insertToDBStep");
        SimpleStepBuilder<Employee, Employee> simpleStepBuilder = stepBuilder.chunk(5);
        return simpleStepBuilder.reader(listReader()).writer(insertToDBWriter()).build();
    }

    @Bean
    public Job myJob(JobRepository jobRepository, PlatformTransactionManager platformTransactionManager) {

        return jobBuilderFactory.get("myJob").incrementer(new RunIdIncrementer())
                .start(filterStep())
                .next(insertToDBStep())
                .build();
    }
}



Why doesn’t the insertToDBStep starts at the end of the filterStep and it actually looks like the filter is running at the same time? And why it looks like the job starts after the initialization of Root WebApplicationContext?

This is the output.

2022-05-23 15:40:49.418  INFO 14008 --- [  restartedMain] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 1024 ms
Try to read JSON
Read from list
Try to save on DB
2022-05-23 15:40:49.882  INFO 14008 --- [  restartedMain] o.s.b.d.a.OptionalLiveReloadServer       : LiveReload server is running on port 35729
2022-05-23 15:40:49.917  INFO 14008 --- [  restartedMain] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 8080 (http) with context path ''
2022-05-23 15:40:49.926  INFO 14008 --- [  restartedMain] c.marco.firstbatch.TestBatchApplication  : Started TestBatchApplication in 1.985 seconds (JVM running for 2.789)
2022-05-23 15:40:49.927  INFO 14008 --- [  restartedMain] o.s.b.a.b.JobLauncherApplicationRunner   : Running default command line with: []
2022-05-23 15:40:49.928  WARN 14008 --- [  restartedMain] o.s.b.c.c.a.DefaultBatchConfigurer       : No datasource was provided...using a Map based JobRepository
2022-05-23 15:40:49.928  WARN 14008 --- [  restartedMain] o.s.b.c.c.a.DefaultBatchConfigurer       : No transaction manager was provided, using a ResourcelessTransactionManager
2022-05-23 15:40:49.943  INFO 14008 --- [  restartedMain] o.s.b.c.l.support.SimpleJobLauncher      : No TaskExecutor has been set, defaulting to synchronous executor.
2022-05-23 15:40:49.972  INFO 14008 --- [  restartedMain] o.s.b.c.l.support.SimpleJobLauncher      : Job: [SimpleJob: [name=myJob]] launched with the following parameters: [{run.id=1}]
2022-05-23 15:40:50.003  INFO 14008 --- [  restartedMain] o.s.batch.core.job.SimpleStepHandler     : Executing step: [filterStep]
Processing JSON
Processing JSON
Processing JSON
Processing JSON
Processing JSON
Save on list [com.marco.firstbatch.Employee@958d6e7, com.marco.firstbatch.Employee@464d17f8, com.marco.firstbatch.Employee@705520ac, com.marco.firstbatch.Employee@1a9f8e93, com.marco.firstbatch.Employee@55bf8cc9]
Processing JSON
Processing JSON
Save on list [com.marco.firstbatch.Employee@55d706c0, com.marco.firstbatch.Employee@1bc46dd4]
2022-05-23 15:40:50.074  INFO 14008 --- [  restartedMain] o.s.batch.core.step.AbstractStep         : Step: [filterStep] executed in 70ms
2022-05-23 15:40:50.081  INFO 14008 --- [  restartedMain] o.s.batch.core.job.SimpleStepHandler     : Executing step: [insertToDBStep]
2022-05-23 15:40:50.084  INFO 14008 --- [  restartedMain] o.s.batch.core.step.AbstractStep         : Step: [insertToDBStep] executed in 3ms
2022-05-23 15:40:50.088  INFO 14008 --- [  restartedMain] o.s.b.c.l.support.SimpleJobLauncher      : Job: [SimpleJob: [name=myJob]] completed with the following parameters: [{run.id=1}] and the following status: [COMPLETED] in 96ms

Thanks in advance.

Advertisement

Answer

The steps are executed correctly in sequence. You are putting System.out.println statements in two “kind” of places:

  • In the bean definition methods executed by Spring Framework when configuring the application context
  • In the code of batch artefacts (item processor, item writer) which are called by Spring Batch when running your job

In your case, Spring Framework will call the following bean definition methods in order to define the first step, filterStep():

  • jsonReader(): prints Try to read JSON. The file is not read at this time, only the json reader bean is defined. A more accurate log message would be: json reader bean created.
  • listReader(): prints Read from list. Same here, the file reading has not started yet. A more accurate log message would be: list reader bean created.
  • filterProcessor(): prints nothing. The log statement is in the ItemProcessor#process method. This will be called by Spring Batch at runtime, not at this point in time which is a configuration time
  • filterWriter(): same here, the print statement is in the write method called at runtime and not at configuration time

This results in the following output for filterStep():

Try to read JSON
Read from list

Now Spring Framework moves to defining the next step, insertToDBStep(). For this, it will call the following methods in order, according to your step definition:

  • listReader(): this bean has already bean defined, Spring will reuse the same instance (by default, Spring beans are singletons). Hence, there is no output from this method.
  • insertToDBWriter(): prints Try to save on DB. Same here, there is no actual save to DB here. A more accurate log message would be insertToDBWriter bean created (or even more accurate, attempting to create insertToDBWriter bean, in case the code that follows throws an exception).

You now have the following cumulative output:

Try to read JSON
Read from list
Try to save on DB

At this point, Spring Framework has finished its job of configuring the application context, Spring Batch takes over and starts the job. The actual processing of filterStep() begins:

  • The reader (ListItemReader) does not have any output in the read method.
  • The processor prints Processing JSON
  • The writer prints Save on list ...

You seem to have two chunks (the first with 5 items and the second with 2 items), which leads to the following output:

2022-05-23 15:40:50.003  INFO 14008 --- [  restartedMain] o.s.batch.core.job.SimpleStepHandler     : Executing step: [filterStep]
Processing JSON
Processing JSON
Processing JSON
Processing JSON
Processing JSON
Save on list [com.marco.firstbatch.Employee@958d6e7, com.marco.firstbatch.Employee@464d17f8, com.marco.firstbatch.Employee@705520ac, com.marco.firstbatch.Employee@1a9f8e93, com.marco.firstbatch.Employee@55bf8cc9]
Processing JSON
Processing JSON
Save on list [com.marco.firstbatch.Employee@55d706c0, com.marco.firstbatch.Employee@1bc46dd4]
2022-05-23 15:40:50.074  INFO 14008 --- [  restartedMain] o.s.batch.core.step.AbstractStep         : Step: [filterStep] executed in 70ms

Then, the next step starts its execution and you get the following output:

2022-05-23 15:40:50.081  INFO 14008 --- [  restartedMain] o.s.batch.core.job.SimpleStepHandler     : Executing step: [insertToDBStep]
2022-05-23 15:40:50.084  INFO 14008 --- [  restartedMain] o.s.batch.core.step.AbstractStep         : Step: [insertToDBStep] executed in 3ms

Here you might be asking why there are no items written by insertToDBWriter() (ie why there are no Save on DB .. logs). This is because the listReader() is a singleton bean and you are using it in both steps, so when the second step calls its read method, it will still return null, because the same instance is used and which has already exhausted the list of items in step 1. Hence, this step ends immediately since there are no items to process. If you want to re-read the items from the list in the second step, you can annotate the reader method with @StepScope. This will create a distinct instance of the reader for each step.

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement