I want to split a String on punctuation marks and whitespace, but keep the punctuation marks. E.g.
String example = "How are you? I am fine!"
I want to have as a result
["How","are","you","?","I","am","fine","!"]
but instead I get
["how"," ","are"," ","you"," ","?"," ","i"," ","am"," ","fine"," ","!"].
What I used was example.toLowerCase().trim().split("(?<=\\b|[^\\p{L}])");
Answer
Why are you doing toLowerCase()? This already messes up your expected result. And why the trim() on the full string?
Doing this with a single split call is probably not that simple.
An alternative would be to just filter out the unwanted entries:
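import java.util.Arrays;
import java.util.function.Predicate;
import java.util.regex.Pattern;

String example = "How are you? I am fine!";
// Split at every word boundary, then drop the whitespace-only pieces (needs Java 11+).
Pattern pattern = Pattern.compile("\\b");
String[] result = pattern.splitAsStream(example)
        .filter(Predicate.not(String::isBlank))
        .toArray(String[]::new);
System.out.println(Arrays.toString(result));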
Output:
[How, are, you, ? , I, am, fine, !]
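Reacting to your comment about wanting [How,are,you,?,I,am,fine,!] as output: simply don't print with Arrays.toString, but build the string yourself. The array contains no whitespace-only entries. Note, however, that the "? " entry keeps its trailing space, because \b does not match between "?" and the following space; map each piece through String::strip before filtering if you need that removed.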
System.out.println("[" + String.join(",", result) + "]");
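If you want exactly the tokens from the question, another option is to match the tokens instead of splitting around them. The following is only a sketch under some assumptions: the class name Tokenizer and the method tokenize are made up for illustration, a "word" is taken to be a run of Unicode letters (\p{L}+), and punctuation is taken to be a single character from Java's \p{Punct} class, which by default covers ASCII punctuation only. Digits and other symbols are skipped by this pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Tokenizer {
    // Assumption: a token is either a run of letters or one ASCII punctuation character.
    private static final Pattern TOKEN = Pattern.compile("\\p{L}+|\\p{Punct}");

    // Hypothetical helper: collects every match of TOKEN in order of appearance.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("How are you? I am fine!");
        // Build the output manually to avoid the ", " separator of List.toString().
        System.out.println("[" + String.join(",", tokens) + "]");
        // prints [How,are,you,?,I,am,fine,!]
    }
}
```

Since whitespace never matches the pattern, there is nothing to filter or trim afterwards, and consecutive punctuation such as "?!" comes out as two separate tokens.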