Given a domain string like aaaa.bbbb.cccc.dddd, I am trying to iterate over all of its subdomains, i.e.
    aaaa.bbbb.cccc.dddd
    bbbb.cccc.dddd
    cccc.dddd
    dddd
I thought this regex
    ((?:[a-zA-Z0-9]+\.)*)([a-zA-Z0-9]+)$
should do the trick (please ignore the fact that I am only matching the characters [a-zA-Z0-9]); however, it only matches the full string.
How can I modify it to make it work?
Edit 1: The following code
    var pattern = Pattern.compile("((?:[a-zA-Z0-9]+\\.)*)([a-zA-Z0-9]+)$"); // fixed regex here
    var matcher = pattern.matcher("aaaa.bbbb.cccc.dddd");
    matcher.results()
           .forEach(matchResult -> System.out.println(matchResult.group()));
should print (in any order)
    aaaa.bbbb.cccc.dddd
    bbbb.cccc.dddd
    cccc.dddd
    dddd
Answer
The regex you’re looking for is
    (?=(?:^|\.)([.\w]+)*)
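This pattern is based on a lookahead. A lookahead is zero-width: it consumes no characters, so the capture inside it can overlap text that was already captured in previous iterations. The lookahead succeeds at the start of the string and at each dot, and capture group 1 then holds everything from that point to the end of the string.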
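To make the zero-width behaviour visible, here is a small sketch that records where each match starts next to what group 1 captured. The class name LookaheadOffsets and the helper describe are made up for this illustration; only the pattern comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadOffsets {
    // Hypothetical helper: returns one "start:capture" entry per match.
    static List<String> describe(String input) {
        Pattern pattern = Pattern.compile("(?=(?:^|\\.)([.\\w]+)*)");
        Matcher matcher = pattern.matcher(input);
        List<String> entries = new ArrayList<>();
        while (matcher.find()) {
            // The match itself is empty (start() == end()); only group 1 holds text.
            entries.add(matcher.start() + ":" + matcher.group(1));
        }
        return entries;
    }

    public static void main(String[] args) {
        describe("aaaa.bbbb.cccc.dddd").forEach(System.out::println);
    }
}
```

The reported offsets are the positions of the dots (or 0 for the first match), not the first character of each subdomain, because the lookahead begins at the dot and the capture starts right after it.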
Java Example
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Main {
        public static void main(String[] args) {
            // Backslashes are doubled because this is a Java string literal.
            final String regex = "(?=(?:^|\\.)([\\.\\w]+)*)";
            final String domain = "aaaa.bbbb.cccc.dddd";

            final Pattern pattern = Pattern.compile(regex);
            final Matcher matcher = pattern.matcher(domain);

            // Each find() is a zero-width match; the subdomain is in group 1.
            while (matcher.find()) {
                for (int i = 1; i <= matcher.groupCount(); i++) {
                    System.out.println(matcher.group(i));
                }
            }
        }
    }
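If you prefer the matcher.results() style from the question, note that group() is always the empty string with a lookahead pattern, so you have to read group(1) instead. A minimal sketch, assuming Java 9 or later (where Matcher.results() is available); the class name SubdomainStream and the method subdomains are made up for this example:

```java
import java.util.List;
import java.util.Objects;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class SubdomainStream {
    private static final Pattern SUFFIXES = Pattern.compile("(?=(?:^|\\.)([.\\w]+)*)");

    // Hypothetical helper: collects the group 1 capture of every zero-width match.
    static List<String> subdomains(String domain) {
        return SUFFIXES.matcher(domain)
                .results()
                .map(result -> result.group(1))
                // Group 1 is null when nothing follows a dot (e.g. a trailing dot).
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        subdomains("aaaa.bbbb.cccc.dddd").forEach(System.out::println);
    }
}
```

The null filter is a precaution of mine, not part of the original answer: because the capture group is quantified with *, it can take part in zero iterations, in which case group(1) returns null.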