I am referring to the test harness listed here http://docs.oracle.com/javase/tutorial/essential/regex/test_harness.html
The only change I made to the class is that the pattern is created as below:
Pattern pattern = Pattern.compile(console.readLine("%nEnter your regex(Pattern.CANON_EQ set): "),Pattern.CANON_EQ);
As the tutorial at http://docs.oracle.com/javase/tutorial/essential/regex/pattern.html suggests, I entered a\u030A as the regex
and \u00E5 as the string to match,
but the harness reports "No match found." As far as I can see, both strings represent a lowercase 'a' with a ring on top.
Have I not understood the use case correctly?
Answer
The behavior you’re seeing has nothing to do with the Pattern.CANON_EQ flag.
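Input read from the console is not the same as a Java string literal. Unicode escapes are translated by the compiler, and nothing performs that translation on text read at runtime. When the user (presumably you, testing out this flag) types \u00E5 into the console, the string returned by console.readLine is equivalent to the literal "\\u00E5" (six characters: a backslash, 'u', and four hex digits), not “å”. See for yourself: http://ideone.com/lF7D1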
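To make the difference concrete, here is a minimal sketch that compares the two strings without needing a console. The class and method names are made up for illustration; asTyped() simply stands in for what console.readLine would hand back if the six characters were typed in.

```java
public class LiteralVsConsole {
    // The compiler translates the escape in this literal into one character, U+00E5.
    static String fromLiteral() {
        return "\u00E5";
    }

    // Stand-in for console input: a backslash, 'u', '0', '0', 'E', '5' typed as-is.
    // The doubled backslash keeps the compiler from translating it.
    static String asTyped() {
        return "\\u00E5";
    }

    public static void main(String[] args) {
        System.out.println(fromLiteral().length());          // 1
        System.out.println(asTyped().length());              // 6
        System.out.println(fromLiteral().equals(asTyped())); // false
    }
}
```

With the typed form, the regex engine is asked to match a six-character string against a pattern that has no chance of describing it, so CANON_EQ never comes into play.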
As for Pattern.CANON_EQ, it behaves exactly as described:
Pattern withCE = Pattern.compile("^a\u030A$", Pattern.CANON_EQ);
Pattern withoutCE = Pattern.compile("^a\u030A$");
String input = "\u00E5";
System.out.println("Matches with canon eq: " + withCE.matcher(input).matches()); // true
System.out.println("Matches without canon eq: " + withoutCE.matcher(input).matches()); // false
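If you still want to type escapes into the test harness, one option is to decode them yourself before compiling the pattern. The following is only a sketch under that assumption: unescape is a hypothetical helper (it is not part of the tutorial's harness or of the JDK), and it handles only a backslash followed by 'u' and exactly four hex digits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnescapeDemo {
    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern UNICODE_ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Hypothetical helper: replaces each typed escape with the character it names.
    static String unescape(String typed) {
        Matcher m = UNICODE_ESCAPE.matcher(typed);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against '$' and '\' in the replacement text.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // These literals mimic what console.readLine would return for typed escapes.
        String regex = unescape("a\\u030A");
        String input = unescape("\\u00E5");
        Pattern p = Pattern.compile(regex, Pattern.CANON_EQ);
        System.out.println("Matches with canon eq: " + p.matcher(input).matches());
    }
}
```

In the harness you would wrap both console.readLine calls in unescape(...) before passing the regex to Pattern.compile and the input to matcher.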