Given a domain string like aaaa.bbbb.cccc.dddd, I am trying to iterate over all of its subdomains, i.e.:
aaaa.bbbb.cccc.dddd
bbbb.cccc.dddd
cccc.dddd
dddd
I thought the regex ((?:[a-zA-Z0-9]+\.)*)([a-zA-Z0-9]+)$ should do the trick (please ignore the fact that I am only matching the characters [a-zA-Z0-9]); however, it only matches the full string.
How can I modify it to make it work?
Edit 1: The following code
var pattern = Pattern.compile("((?:[a-zA-Z0-9]+\\.)*)([a-zA-Z0-9]+)$"); // fixed regex here
var matcher = pattern.matcher("aaaa.bbbb.cccc.dddd");
matcher.results()
        .forEach(matchResult -> System.out.println(matchResult.group()));
should print (in any order)
aaaa.bbbb.cccc.dddd
bbbb.cccc.dddd
cccc.dddd
dddd
Answer
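The regex you’re looking for is
(?=(?:^|\.)([\.\w]+)*)
This pattern is built on a lookahead. A lookahead is zero-width, so each match consumes no characters, and capture group 1 can therefore contain text that overlaps what earlier matches already captured. A match succeeds only at the start of the string or at a dot, and group 1 then captures the rest of the string from that point (excluding the dot itself). Note that \w also accepts underscores.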
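To see why the lookahead matters, here is a minimal sketch that compares the pattern with a consuming variant obtained by dropping the `(?=...)` wrapper. The class name `LookaheadDemo`, the helper `captures`, and the consuming variant itself are made up for illustration and are not part of the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadDemo {
    // Hypothetical helper: collects capture group 1 of every match of regex in input.
    public static List<String> captures(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        while (matcher.find()) {
            out.add(matcher.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String domain = "aaaa.bbbb.cccc.dddd";
        // Consuming variant (illustration only): the first match swallows
        // the whole string, so nothing is left for later matches.
        System.out.println(captures("(?:^|\\.)([\\.\\w]+)", domain));
        // Lookahead variant: every match is zero-width, so the search can
        // resume inside text that group 1 has already captured.
        System.out.println(captures("(?=(?:^|\\.)([\\.\\w]+)*)", domain));
    }
}
```

For the sample domain, the first call returns only the full string, while the second returns all four suffixes.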
Java Example
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        // Backslashes are doubled because the regex lives in a Java string literal.
        final String regex = "(?=(?:^|\\.)([\\.\\w]+)*)";
        final String domain = "aaaa.bbbb.cccc.dddd";
        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(domain);
        // Each find() is a zero-width match; group 1 holds the captured suffix.
        while (matcher.find()) {
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println(matcher.group(i));
            }
        }
    }
}
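Since the question uses `Matcher.results()`, the same pattern can probably be plugged into that stream style as well. The following is a sketch assuming Java 9 or later; the class name `SubdomainSuffixes` and the method `suffixes` are hypothetical. Group 1 is read instead of `group()`, because the overall match is always empty, and null captures are filtered out because the group may not participate (for example on an empty string or after a trailing dot).

```java
import java.util.List;
import java.util.Objects;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class SubdomainSuffixes {
    // Same lookahead pattern as above, compiled once.
    private static final Pattern SUFFIX =
            Pattern.compile("(?=(?:^|\\.)([\\.\\w]+)*)");

    // Hypothetical helper: returns every dot-separated suffix of the domain.
    public static List<String> suffixes(String domain) {
        return SUFFIX.matcher(domain)
                .results()                    // Stream<MatchResult>, Java 9+
                .map(r -> r.group(1))         // the suffix is in group 1
                .filter(Objects::nonNull)     // skip matches where group 1 is unset
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        suffixes("aaaa.bbbb.cccc.dddd").forEach(System.out::println);
    }
}
```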