
Partial Matching of Regular Expressions

In an NFA, it is easy to mark every previously non-final state as accepting, so that the automaton accepts every starting substring (prefix) of the strings in a given language. This assumes each state can still reach a final state.
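To make the automaton idea concrete, here is a minimal sketch for the regex a*b. The class name, the state numbering and the hand-written transition table are all made up for illustration, and the automaton happens to be deterministic to keep the code short. A dead state is included to show that only states which can still reach a final state should become accepting.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class PrefixNfaSketch {
    // Hypothetical hand-built automaton for a*b over the alphabet {a, b, c}:
    // state 0 is the start state, state 1 is final, state 2 is a dead state.
    static final int START = 0;
    static final Set<Integer> FINAL = Set.of(1);
    static final Map<Integer, Map<Character, Integer>> DELTA = Map.of(
        0, Map.of('a', 0, 'b', 1, 'c', 2),
        1, Map.<Character, Integer>of(),
        2, Map.of('c', 2));

    // States from which some final state is reachable (fixed-point iteration).
    static Set<Integer> liveStates() {
        Set<Integer> live = new HashSet<>(FINAL);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<Integer, Map<Character, Integer>> e : DELTA.entrySet()) {
                for (int target : e.getValue().values()) {
                    if (live.contains(target) && live.add(e.getKey())) {
                        changed = true;
                    }
                }
            }
        }
        return live;
    }

    // True if the input is a prefix of some string the automaton accepts.
    static boolean acceptsPrefix(String input) {
        Integer state = START;
        for (char c : input.toCharArray()) {
            state = DELTA.get(state).get(c);
            if (state == null) {
                return false; // no transition: the input cannot be extended to a match
            }
        }
        return liveStates().contains(state);
    }

    public static void main(String[] args) {
        System.out.println(acceptsPrefix("aaa")); // prefix of "aaab"
        System.out.println(acceptsPrefix("ba"));  // nothing may follow the b
    }
}
```

Here "aaa" ends in state 0, which can still reach the final state, so it counts as a prefix. "ba" has no transition after the b, and "c" ends in the dead state, so neither does.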

With the Java regex engine, is there a way to find out whether a string is a starting substring of some string that matches a given regex?

Put another way: let regexA be any ordinary regex, and let regexX be an operator meaning "any start of".

The combined expression "regexXregexA" should then match every starting substring of every string that regexA matches.

Example:

regexA = a*b matches "ab" but not "a".

"regexXa*b" matches "a", because "a" is the start of "ab" (and of "aab").

Edit:

Since the question still seems unclear to some, here is a test program for it:

import java.util.regex.*;

public class Test1 {
    public static void main(String[] args) {
        String regex = "a*b";
        System.out.println(partialMatch(regex, "aaa"));
    }

    // Should return true if some string matches the regex and starts with
    // (but is not equal to) beginning, and false otherwise.
    public static boolean partialMatch(String regex, String beginning) {
        // The body is what this question asks for.
        throw new UnsupportedOperationException("not implemented yet");
    }
}

Once partialMatch is implemented, this program must print true.


Answer

What you’re looking for is called partial matching, and it’s natively supported by the Java regex API (for the record, other engines which offer this feature include PCRE and boost::regex).

You can tell whether an input string matched partially by calling Matcher.hitEnd after a match attempt. It returns true if the matcher reached the end of the input during that attempt, which means that more input could change the result.

Pattern pattern = Pattern.compile("a*b");
Matcher matcher = pattern.matcher("aaa");
System.out.println("Matches: " + matcher.matches());
// hitEnd() describes the most recent match operation, so it is read after matches().
System.out.println("Partial match: " + matcher.hitEnd());

This outputs:

Matches: false
Partial match: true
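Applied to the partialMatch method from the question, this could look like the minimal sketch below. The class name PartialMatchSketch is made up for illustration. Returning hitEnd() directly is an assumption: the Javadoc only promises that more input could possibly change the result, so treat the return value as "might still match" rather than as a proof that a longer match exists.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PartialMatchSketch {
    // Heuristic: reports whether a longer input starting with `beginning`
    // might still match `regex`, based on Matcher.hitEnd().
    static boolean partialMatch(String regex, String beginning) {
        Matcher matcher = Pattern.compile(regex).matcher(beginning);
        // hitEnd() describes the last match operation, so run matches() first.
        matcher.matches();
        return matcher.hitEnd();
    }

    public static void main(String[] args) {
        System.out.println(partialMatch("a*b", "aaa")); // true
        System.out.println(partialMatch("a*b", "ac"));  // false
    }
}
```

For "ac" the matcher fails on the c before it ever reaches the end of the input, so no extension of that string can match and hitEnd() stays false.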