I’m having some trouble parsing CSV with backslash-escaped quotes (\"). Most lines in the source CSV don’t include escaped quotes, but for the ones that do I can’t seem to find the appropriate settings for correct parsing.
CSV example (each line with 4 columns):
    1,,No quote escape,test
    2,,"One quote escape\"",test
    3,,"Two \"quote escapes\"",test
    4,,"Two \"quote escapes\" 2",test
CSV parser settings:
    CsvFormat:
        Comment character=#
        Field delimiter=,
        Line separator (normalized)=\n
        Line separator sequence=\r\n
        Quote character="
        Quote escape character=\
        Quote escape escape character=null
Code snippet:
    CsvParserSettings settings = new CsvParserSettings();
    settings.setDelimiterDetectionEnabled(true);
    settings.setLineSeparatorDetectionEnabled(true);
    settings.getFormat().setQuote('"');
    settings.getFormat().setQuoteEscape('\\'); // the backslash must itself be escaped in a Java char literal
    CsvParser parser = new CsvParser(settings);
    parser.beginParsing(file, StandardCharsets.UTF_8);
    ...
Lines are parsed correctly until two escaped quotes are present in one line. Expected parsed lines are:
- 1,null,No quote escape,test
- 2,null,One quote escape",test
- 3,null,Two "quote escapes",test
- 4,null,Two "quote escapes" 2,test
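To pin down the expected behaviour independently of any library, here is a minimal hand-rolled sketch in plain Java. It assumes every escaped quote in the sample is written as \" inside a quoted field, and that empty unquoted fields should come back as null, as in the list above. The class name `BackslashQuoteSketch` and the method `parseLine` are hypothetical and are not part of univocity-parsers. This is not how the library works internally, only a reference for the output I expect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of univocity-parsers.
final class BackslashQuoteSketch {

    // Splits one CSV record on commas, treating \" inside a quoted field
    // as a literal quote. Empty unquoted fields are returned as null.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;  // currently inside a quoted field
        boolean wasQuoted = false; // current field was opened with a quote

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '\\' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // escaped quote: keep the quote, drop the backslash
                    i++;
                } else if (c == '"') {
                    inQuotes = false;    // unescaped quote closes the field
                } else {
                    current.append(c);
                }
            } else if (c == '"' && current.length() == 0 && !wasQuoted) {
                inQuotes = true;         // quote at the start of a field opens it
                wasQuoted = true;
            } else if (c == ',') {
                fields.add(finish(current, wasQuoted));
                current.setLength(0);
                wasQuoted = false;
            } else {
                current.append(c);
            }
        }
        fields.add(finish(current, wasQuoted)); // last field has no trailing comma
        return fields;
    }

    // An empty field that was never quoted is reported as null.
    private static String finish(StringBuilder sb, boolean wasQuoted) {
        return (sb.length() == 0 && !wasQuoted) ? null : sb.toString();
    }

    public static void main(String[] args) {
        String[] rows = {
            "1,,No quote escape,test",
            "2,,\"One quote escape\\\"\",test",
            "3,,\"Two \\\"quote escapes\\\"\",test",
            "4,,\"Two \\\"quote escapes\\\" 2\",test"
        };
        for (String row : rows) {
            System.out.println(parseLine(row));
        }
    }
}
```

The sketch deliberately ignores multi-line records, escaped backslashes, and delimiter detection, all of which a real parser has to handle.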
Answer
Upon further inspection I found an existing issue for v2.9.1.