Is it possible to create a regular expression with a variable number of groups?
After running this for instance…
Pattern p = Pattern.compile("ab([cd])*ef");
Matcher m = p.matcher("abcddcef");
m.matches();
… I would like to have something like
m.group(1) = "c"
m.group(2) = "d"
m.group(3) = "d"
m.group(4) = "c"
(Background: I’m parsing some lines of data, and one of the “fields” is repeating. I would like to avoid a matcher.find loop for these fields.)
As pointed out by @Tim Pietzcker in the comments, Perl 6 and .NET have this feature.
Answer
According to the documentation for java.util.regex.Pattern, Java regular expressions can’t do this:
The captured input associated with a group is always the subsequence that the group most recently matched. If a group is evaluated a second time because of quantification then its previously-captured value, if any, will be retained if the second evaluation fails. Matching the string “aba” against the expression (a(b)?)+, for example, leaves group two set to “b”. All captured input is discarded at the beginning of each match.
(emphasis added)
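To make the quoted behaviour concrete, here is a minimal sketch (not part of the original answer). It first shows that a quantified group keeps only its most recent match, then illustrates one possible workaround that avoids a find loop: capture the whole repeating run in a single group and split it afterwards. The class name RepeatedGroupDemo and the helper names lastCapture and repeatedFields are made up for illustration, and the splitting step assumes each repeated "field" is exactly one character, as in the question's pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedGroupDemo {

    // Uses the pattern from the question. The quantified group ([cd])* is still
    // just group 1, so after a successful match it holds only the last iteration.
    static String lastCapture(String input) {
        Matcher m = Pattern.compile("ab([cd])*ef").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // Hypothetical workaround: move the quantifier inside the group so that
    // group 1 captures the whole repeating run, then split that run by hand.
    // Assumes every repeated field is a single character.
    static List<String> repeatedFields(String input) {
        Matcher m = Pattern.compile("ab([cd]*)ef").matcher(input);
        List<String> fields = new ArrayList<>();
        if (m.matches()) {
            for (char ch : m.group(1).toCharArray()) {
                fields.add(String.valueOf(ch));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Only the final iteration of the group survives.
        System.out.println(lastCapture("abcddcef"));    // prints "c"
        // The whole run is captured once and split afterwards.
        System.out.println(repeatedFields("abcddcef")); // prints "[c, d, d, c]"
    }
}
```

If the repeated fields were longer than one character, the captured run would need its own splitting step (for example String.split on a delimiter), which depends on the actual data format and is not covered by this sketch.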