Are optional non-capturing groups redundant?
Is the following regex:
(?:wo)?men
semantically equivalent to the following regex?
(wo)?men
Answer
Your (?:wo)?men
and (wo)?men
match exactly the same text, so in that sense they are equivalent, but technically they differ: the first uses a non-capturing group and the second a capturing group. Thus, the real question is why use non-capturing groups when we have capturing ones.
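As a quick check of the claim above, here is a minimal sketch using Python's re module (the question does not name a regex flavor, so the choice of Python and the sample strings are assumptions). It shows that both patterns find the same overall match and differ only in what they capture:

```python
import re

capturing = re.compile(r"(wo)?men")
non_capturing = re.compile(r"(?:wo)?men")

# Sample inputs are illustrative only.
for text in ["men", "women", "two women"]:
    m_cap = capturing.search(text)
    m_non = non_capturing.search(text)
    # The overall match (group 0) is identical for both patterns.
    assert m_cap.group(0) == m_non.group(0)

# The difference is only in the numbered groups each pattern exposes.
print(capturing.groups)                         # 1
print(non_capturing.groups)                     # 0
print(capturing.search("women").groups())       # ('wo',)
print(capturing.search("men").groups())         # (None,)
print(non_capturing.search("women").groups())   # ()
```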
Non-capturing groups are helpful in several cases:
- To avoid an excessive number of backreferences (remember that it is sometimes difficult to use backreferences higher than 9; in some tools \10 can be read as group 1 followed by a literal 0)
- To avoid hitting the limit of 99 numbered backreferences, by reducing the number of numbered capturing groups (source: Regular-expressions.info: "Most regex flavors support up to 99 capturing groups and double-digit backreferences.")
NOTE: this limit does not pertain to the Java regex engine, nor to the PHP or .NET regex engines.
- To lessen the overhead caused by storing the captures in the stack
- We can add more groupings to existing regex without ruining the order of capturing groups.
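The last point in the list can be illustrated with a small sketch, again assuming Python's re module. The patterns and the sample sentence are hypothetical; the point is only that inserting a capturing group renumbers the groups after it, while a non-capturing group leaves the numbering alone:

```python
import re

# Existing pattern: group 1 is the count, group 2 is the fruit.
original = re.compile(r"(\d+) (apples|pears)")

# Adding an optional prefix as a capturing group shifts every later group by one.
with_capture = re.compile(r"(about )?(\d+) (apples|pears)")

# Adding the same prefix as a non-capturing group keeps the numbering intact.
with_non_capture = re.compile(r"(?:about )?(\d+) (apples|pears)")

text = "about 12 pears"

print(original.search(text).group(1))          # 12
print(with_capture.search(text).group(1))      # 'about ' (code expecting the count now breaks)
print(with_non_capture.search(text).group(1))  # 12
```

Any code that reads group(1) as the count keeps working with the non-capturing version, but would need to be updated for the capturing one.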
Also, it just makes our matches cleaner:
You can use a non-capturing group to retain the organisational or grouping benefits but without the overhead of capturing.
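One place where "cleaner matches" becomes visible is Python's re.findall, which returns the contents of the capturing group instead of the whole match when the pattern contains one. This is a Python-specific illustration, not something stated in the original answer, and the sample text is made up:

```python
import re

text = "men and women"

# With a capturing group, findall returns only what the group captured
# (an empty string where the optional group did not participate).
captured = re.findall(r"(wo)?men", text)

# With a non-capturing group, findall returns the full matches.
plain = re.findall(r"(?:wo)?men", text)

print(captured)  # ['', 'wo']
print(plain)     # ['men', 'women']
```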
That said, it does not seem a good idea to refactor existing regular expressions just to convert capturing groups to non-capturing ones, since doing so may break code that relies on the existing group numbers, or require too much effort.