Make RegEx optional groups either BOTH be present and match, or if ONE is missing not match/fail (java)



I have a RegEx pattern I am using and it works (mostly), but there is one bug. I have 3 separate groups to capture the values, serverXXX (group 1), -Site (group 2, optional) and YY (group 3, optional). Below is the RegEx pattern definition:

String REGEX_PATTERN = "server(\\d{1,3})(-[a-zA-Z]*)?(\\d{1,2})?\\.mydomain.com";

As the pattern shows, "server" is followed by up to three digits, 0-9. The site begins with a hyphen (-) and can be any word made up of the letters a-z (case insensitive), and it is followed by up to two digits, which should be 00 to 99. Because group 2 and group 3 are optional, the user does not need to include the -SiteYY portion in the string, and it should still pass.

Some tests:

server255.mydomain.com // passes, expected
server255-Site69.mydomain.com // passes, expected
server255699.mydomain.com // fails, expected
server25569-Site.mydomain.com // fails, expected
server25569.mydomain.com // passes, BUT SHOULD NOT PASS

So basically, if the “XXX” portion of serverXXX runs past 3 digits, the string can still pass, because group 2 and group 3 are both optional and the extra 2 digits are read as group 3. However, if more than 5 digits are used, or if 5 digits are used and the “-Site” comes after them, it fails, since the leftover digits can no longer be absorbed by group 3 and the string violates group 1’s {1,3} quantifier for the “XXX” portion.
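To make the behaviour concrete, here is a minimal sketch that prints what each group captures for the problem input. The class name OptionalGroupBug and the helper describe are made up for illustration; only the pattern itself comes from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalGroupBug {
    // The pattern from the question, written as a Java string literal
    // (every regex backslash is doubled).
    static final Pattern ORIGINAL =
            Pattern.compile("server(\\d{1,3})(-[a-zA-Z]*)?(\\d{1,2})?\\.mydomain.com");

    // Hypothetical helper: returns "group1|group2|group3" on a full match,
    // or null when the input does not match at all.
    static String describe(String host) {
        Matcher m = ORIGINAL.matcher(host);
        if (!m.matches()) {
            return null;
        }
        // A group that did not participate in the match is reported as null.
        return m.group(1) + "|" + m.group(2) + "|" + m.group(3);
    }

    public static void main(String[] args) {
        // Group 1 greedily takes "255", group 2 is skipped, group 3 takes "69".
        System.out.println(describe("server25569.mydomain.com")); // prints 255|null|69
    }
}
```

Group 2 comes back as null while group 3 is filled, which is exactly the "one without the other" case that should be rejected.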

I think I can just combine group 2 and group 3 into a single group, but I would lose the ability to parse out the -Site and YY portions. How can I make the regex FAIL on this case, server25569.mydomain.com, so it doesn’t read the extra digits as the third optional group?

Is there any way to require that if EITHER group 2 (-Site) or group 3 (YY) is present, they must both be present together? Or would there be an easier fix to this?

Answer

Is there any way to require that if EITHER group 2 (-Site) or group 3 (YY) is present, they must both be present together? Or would there be an easier fix to this?

You may use this regex:

server(\d{1,3})(?:(-[a-zA-Z]+)(\d{1,2}))?\.mydomain\.com

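Take note of the optional non-capturing group that contains capture groups #2 and #3. Because the optional quantifier applies to the whole non-capturing group, -Site and YY either both match or are both skipped, and the change from * to + means -Site needs at least one letter.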

For Java use:

String REGEX_PATTERN = "server(\\d{1,3})(?:(-[a-zA-Z]+)(\\d{1,2}))?\\.mydomain\\.com";
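As a quick check, the sketch below runs this pattern against the five test strings from the question and prints the captured groups. The class name ServerNameMatcher and the helper isValid are illustrative names, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ServerNameMatcher {
    // Groups 2 and 3 sit inside one optional non-capturing group,
    // so they are either both present or both absent.
    static final Pattern FIXED =
            Pattern.compile("server(\\d{1,3})(?:(-[a-zA-Z]+)(\\d{1,2}))?\\.mydomain\\.com");

    // Hypothetical helper: true only when the whole string matches.
    static boolean isValid(String host) {
        return FIXED.matcher(host).matches();
    }

    public static void main(String[] args) {
        String[] hosts = {
            "server255.mydomain.com",
            "server255-Site69.mydomain.com",
            "server255699.mydomain.com",
            "server25569-Site.mydomain.com",
            "server25569.mydomain.com"
        };
        for (String host : hosts) {
            Matcher m = FIXED.matcher(host);
            if (m.matches()) {
                // group(2) and group(3) are null when the optional part is absent.
                System.out.println(host + " -> server=" + m.group(1)
                        + ", site=" + m.group(2) + ", number=" + m.group(3));
            } else {
                System.out.println(host + " -> no match");
            }
        }
    }
}
```

Only the first two strings should match. The -Site and YY values are still available separately through group(2) and group(3), so nothing is lost compared with the original three-group layout.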



Source: stackoverflow