I want to split a String on punctuation marks and whitespace, but keep the punctuation marks. For example:
String example = "How are you? I am fine!";
I want to have as a result
["How","are","you","?","I","am","fine","!"]
but instead I get
["how"," ","are"," ","you"," ","?"," ","i"," ","am"," ","fine"," ","!"].
What I used was example.toLowerCase().trim().split("(?<=\\b|[^\\p{L}])");
Answer
Why are you doing toLowerCase()? This already messes up your expected result. And why the trim() on the full string?
Doing this with a single split call is probably not too simple.
An alternative would be to just filter out the unwanted entries:
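import java.util.Arrays;
import java.util.function.Predicate;
import java.util.regex.Pattern;

String example = "How are you? I am fine!";
// Split at every word boundary; the backslash must be doubled inside a Java string literal.
Pattern pattern = Pattern.compile("\\b");
String[] result = pattern.splitAsStream(example)
        .filter(Predicate.not(String::isBlank)) // drop the whitespace-only pieces
        .toArray(String[]::new);
System.out.println(Arrays.toString(result));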
Output:
[How, are, you, ? , I, am, fine, !]
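One detail in that output: the element after "you" is "? " with a trailing space, because \b does not match between the question mark and the following space. A minimal sketch of the same filtering idea that also strips each piece is below. It assumes Java 11 or later (for strip and Predicate.not); the class name SplitKeepPunctuation and the helper tokenize are made up for illustration and are not part of the original answer.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

class SplitKeepPunctuation {
    // Zero-width word boundary: splits between word and non-word characters.
    private static final Pattern WORD_BOUNDARY = Pattern.compile("\\b");

    // Hypothetical helper: splits at word boundaries, trims each piece,
    // and drops the pieces that were whitespace only.
    static String[] tokenize(String text) {
        return WORD_BOUNDARY.splitAsStream(text)
                .map(String::strip)                      // "? " becomes "?", " " becomes ""
                .filter(Predicate.not(String::isEmpty))  // remove the now-empty pieces
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] result = tokenize("How are you? I am fine!");
        System.out.println("[" + String.join(",", result) + "]");
        // prints [How,are,you,?,I,am,fine,!]
    }
}
```

Consecutive punctuation such as "?!" still comes back as a single element with this approach, since there is no word boundary inside it.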
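Reacting to your comment of wanting [How,are,you,?,I,am,fine,!] as output: simply don't print with Arrays.toString but build the string yourself. The spaces after the commas are added by Arrays.toString; they are not part of the array elements. The one exception is the "? " element, which keeps a trailing space because there is no word boundary between the question mark and the space; calling strip() on each element before joining removes it.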
System.out.println("[" + String.join(",", result) + "]");
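If splitting keeps getting in the way, another option is to match the tokens directly instead of splitting around them. The following is only a sketch under a few assumptions: Java 9 or later (for Matcher.results), a token defined as either a run of letters/digits or a single ASCII punctuation character (\p{Punct}), and the illustrative names TokenMatcher and tokenize, which are not from the original answer.

```java
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

class TokenMatcher {
    // A token is a run of letters/digits, or exactly one ASCII punctuation character.
    // Whitespace matches neither alternative, so it is skipped.
    private static final Pattern TOKEN = Pattern.compile("[\\p{L}\\p{N}]+|\\p{Punct}");

    // Hypothetical helper: collects every match of TOKEN in order.
    static String[] tokenize(String text) {
        return TOKEN.matcher(text)
                .results()
                .map(MatchResult::group)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] result = tokenize("How are you? I am fine!");
        System.out.println("[" + String.join(",", result) + "]");
        // prints [How,are,you,?,I,am,fine,!]
    }
}
```

Unlike the word-boundary split, this variant returns each punctuation character as its own element, so "?!" would yield "?" and "!". Whether that is desirable depends on the input.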