
Java – Split String into sentences with character limitation

I want to split a text into sentences (splitting on . or using BreakIterator), with one constraint: no resulting element may be longer than 100 characters.

Example:

Lorem ipsum dolor sit. Amet consetetur sadipscing elitr,
sed diam nonumy eirmod tempor invidunt ut labore et dolore
magna aliquyam erat, sed diam voluptua. At vero eos et accusam
et justo duo dolores.

Expected result (3 elements; a sentence may be broken across elements, but a word may not):

" Lorem ipsum dolor sit. ",
" Amet consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt
  ut labore et dolore magna",
" aliquyam erat, sed diam voluptua. At vero eos et accusam
  et justo duo dolores. "

How can I do this properly?


Answer

Solved (thank you Macarse for the inspiration):

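import java.util.ArrayList;
import java.util.List;

// Split before every whitespace character or period, so each delimiter stays with the token that follows it.
// In a Java string literal the regex backslash must be doubled; the period needs no escape inside [...].
String[] words = text.split("(?=[\\s.])");
List<String> array = new ArrayList<>();
int i = 0;
while (i < words.length) {
    StringBuilder line = new StringBuilder();
    // Append tokens while the line stays within 100 characters.
    while (i < words.length && line.length() + words[i].length() <= 100) {
        line.append(words[i++]);
    }
    // A single token longer than 100 characters can never fit; emit it alone instead of looping forever.
    if (line.length() == 0) {
        line.append(words[i++]);
    }
    array.add(line.toString());
}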
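
The loop above packs tokens up to the limit, but it does not start a new element at each sentence boundary. Below is a minimal sketch of one way to combine both requirements, using the BreakIterator mentioned in the question: split into sentences first, then wrap only those sentences that exceed the limit. The class name SentenceSplitter and the methods split and wrap are hypothetical names chosen for illustration. The sketch also assumes Locale.ENGLISH sentence rules and trims surrounding whitespace, unlike the question's example output.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class; the name and method signatures are illustrative only.
final class SentenceSplitter {

    /** Splits text into sentences, then wraps any sentence longer than maxLen at whitespace. */
    static List<String> split(String text, int maxLen) {
        List<String> result = new ArrayList<>();
        // Assumption: English sentence rules are adequate for the input.
        BreakIterator sentences = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        sentences.setText(text);
        int start = sentences.first();
        for (int end = sentences.next(); end != BreakIterator.DONE;
                start = end, end = sentences.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                wrap(sentence, maxLen, result);
            }
        }
        return result;
    }

    /** Greedily packs whitespace-separated words into chunks of at most maxLen characters. */
    private static void wrap(String sentence, int maxLen, List<String> out) {
        StringBuilder line = new StringBuilder();
        for (String word : sentence.split("\\s+")) {
            // Start a new chunk when adding a space plus the next word would exceed the limit.
            if (line.length() > 0 && line.length() + 1 + word.length() > maxLen) {
                out.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            // A single word longer than maxLen is kept whole rather than broken.
            line.append(word);
        }
        if (line.length() > 0) {
            out.add(line.toString());
        }
    }
}
```

Calling SentenceSplitter.split(text, 100) on the example text from the question should give four elements rather than three, because this sketch never merges pieces of different sentences: the tail of the second sentence ("aliquyam erat, sed diam voluptua.") stays separate from the sentence that follows it. If merging short neighbours is wanted, the word-packing loop from the answer is the simpler choice.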
User contributions licensed under: CC BY-SA