Reading quoted line breaks in a CSV file with Spring Batch's FlatFileItemReader



I am trying to parse a CSV file with FlatFileItemReader. This CSV contains some quoted newline characters as shown below.

email, name
abc@z.com, "NEW NAME
 ABC"

But parsing fails with an error saying that 2 fields are required but only 1 was found.

What am I missing in my FlatFileItemReader configuration?

<property name="lineMapper">
    <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">

        <!-- The lineTokenizer splits each record into its individual fields -->
        <property name="lineTokenizer">
            <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">

                <!-- Names of the CSV columns -->
                <property name="names" value="email,name" />
            </bean>
        </property>

        <!-- The fieldSetMapper maps the tokenized fields of a record to a domain object -->
        <property name="fieldSetMapper">
            <bean class="com.abc.testme.batchjobs.util.CustomerFieldSetMapper" />
        </property>
    </bean>
</property>

Answer

Out of the box, the FlatFileItemReader uses a SimpleRecordSeparatorPolicy, which treats every line as a complete record. For your use case

  • a quoted part spans 2 or more lines

you need to set the DefaultRecordSeparatorPolicy instead.

Quoted from its Javadoc:

A RecordSeparatorPolicy that treats all lines as record endings, as long as they do not have unterminated quotes, and do not end in a continuation marker.
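
To make the quote-counting idea concrete, here is a minimal plain-Java sketch of the behaviour the Javadoc describes. It is a simplified illustration, not Spring Batch's actual implementation: the class name QuoteAwareRecordJoiner and its methods are made up for this example, the quote character is assumed to be ", and continuation markers are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; NOT part of Spring Batch.
class QuoteAwareRecordJoiner {

    // A record is treated as complete when it contains an even number of
    // quote characters, i.e. every opened quote has been closed.
    static boolean isEndOfRecord(String record) {
        long quotes = record.chars().filter(c -> c == '"').count();
        return quotes % 2 == 0;
    }

    // Joins physical lines into logical records, keeping the line break
    // that sits inside a quoted field.
    static List<String> joinRecords(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : lines) {
            current.append(line);
            if (isEndOfRecord(current.toString())) {
                records.add(current.toString());
                current.setLength(0);
            } else {
                // Quote still open: the record continues on the next line.
                current.append('\n');
            }
        }
        if (current.length() > 0) {
            // Input ended while a quote was still open; emit what we have.
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "email, name",
                "abc@z.com, \"NEW NAME",
                " ABC\"");
        // Print each logical record in brackets to make its boundaries visible.
        joinRecords(lines).forEach(r -> System.out.println("[" + r + "]"));
    }
}
```

For the three physical lines from the question, this yields two logical records: the header, and a single data record whose name field still contains the line break. That is the input the tokenizer needs in order to see 2 fields rather than 1.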

Example XML configuration:

<bean id="reader" 
      class="org.springframework.batch.item.file.FlatFileItemReader">
      ...
    <property name="recordSeparatorPolicy">
        <bean class="org.springframework.batch.item.file.separator.DefaultRecordSeparatorPolicy" />
    </property>
      ...
</bean>


Source: stackoverflow