In Java (JDK 11), consider the following string:
String hello = "333+444 5qwerty5 006 -7";
I am trying to come up with a RegEx that will split on anything that isn’t a digit, whilst keeping the separators except spaces. So in the above example, I would like to end up with the following array:
["333" , "+" , "444" , "5" , "q" , "w" , "e" , "r" , "t" , "y" , "5" , "006" , "-7"]
Do note the leading zeroes in 006, and -7. The code I am using is the following:
String[] splited = s.split("((?<=[^0-9]+)|(?=[^0-9]+)|(\\s+))");
However, I can see that my array is keeping spaces. I can’t for the life of me figure out my mistake. Any thoughts?
EDIT: Turns out the requirement kept getting more complicated. Eventually I had to obtain the following collection, based on the sample input from above:
["333+444" , "5" , "q" , "w" , "e" , "r" , "t" , "y" , "5" , "006" , "-7"]
So if there is no space between an integer and the operators + - * / % ^, then do not split them. I have issues implementing this rule along with the fact that leading zeroes and negative numbers should not be split.
Based on that, it turns out that it is much simpler to work with The fourth bird’s sample, where matcher() is used instead of split(). The RegEx syntax is simpler to understand, troubleshoot and build upon.
Perhaps I could have asked another question to cater for the additional complexity, but I do not think it is right to use StackOverflow to keep asking very similar questions because one got stuck.
Answer
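A likely reason the original split() keeps the spaces: the lookarounds are zero-width and are tried before \s+, so the regex matches the empty positions just before and just after each space. The space is then cut out as an element of its own instead of being consumed as a delimiter. A minimal sketch of that behaviour, with \D standing in for [^0-9]+ (the class and method names are made up for illustration):

```java
import java.util.Arrays;
import java.util.List;

final class SplitKeepsSpaces {
    // Same idea as the pattern in the question; \D stands in for [^0-9]+.
    // The zero-width alternatives match on both sides of a space before
    // \s+ ever gets a chance to consume it.
    static List<String> split(String input) {
        return Arrays.asList(input.split("(?<=\\D)|(?=\\D)|\\s+"));
    }

    public static void main(String[] args) {
        // The space survives as its own element between "444" and "5".
        System.out.println(split("333+444 5"));
    }
}
```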
Instead of using split, you could also match all the parts:
-?\d+|\S
The pattern matches:
-? Optionally match a hyphen
\d+ Match 1+ digits
| Or
\S Match a single non-whitespace char
See a regex demo and a Java demo.
Example
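import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Backslashes are doubled inside a Java string literal.
String regex = "-?\\d+|\\S";
String string = "333+444 5qwerty5 006 -7";
List<String> allMatches = new ArrayList<>();
Matcher m = Pattern.compile(regex).matcher(string);
// Collect every match; whitespace matches neither alternative and is skipped.
while (m.find()) {
    allMatches.add(m.group());
}
System.out.println(Arrays.toString(allMatches.toArray()));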
Output
[333, +, 444, 5, q, w, e, r, t, y, 5, 006, -7]
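For the extended requirement in the question's edit (keep an integer and the operators + - * / % ^ together when no space separates them), the same matcher() approach might be extended along these lines. This is only a sketch: the pattern and the names ExpressionTokenizer and tokenize are assumptions, not part of the original answer, and it only joins operators that sit directly between two integers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExpressionTokenizer {
    // An optionally negative integer, followed by zero or more
    // "operator + integer" groups with no whitespace in between;
    // otherwise fall back to a single non-whitespace character.
    private static final Pattern TOKEN =
            Pattern.compile("-?\\d+(?:[-+*/%^]-?\\d+)*|\\S");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("333+444 5qwerty5 006 -7"));
        // prints [333+444, 5, q, w, e, r, t, y, 5, 006, -7]
    }
}
```

Because the operator group is only attempted after an integer has matched, a lone operator such as the + in "a+b" still falls through to \S and is returned as a single character.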