I’m parsing my scripts for forbidden strings and have an array of strings that shouldn’t occur in my lines of code.
But an occurrence should pass when it is not exactly the listed string, for example when it is quoted or part of a longer identifier.
Example:
The list contains “FOO” and “BAR”.
Text line:
This is foo and this is BAR but not 'foo' and not 'BAR' but also not FOO_BAR but we want FOO%TEXT and also bar.text
Result:
This is foo and this is BAR but we want FOO%TEXT and also bar.text
I’ve tried a few examples I found online and on Stack Overflow, but these didn’t work for me: they don’t filter out the quoted ones.
String pattern = ".*\\b" + strTables[i] + "\\b.*[^(\\w|')]";
Pattern r = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
Matcher m = r.matcher(line);
if (m.find()) {
    System.out.println(strTables[i] + ": " + line);
    break;
}
Answer
You need \b (word boundary) before and after the word to match a whole word.
Then add a look-behind like (?<!') at the beginning of the pattern to denote “not preceded by a single quote” and a look-ahead like (?!') at the end of the pattern to denote “not followed by a single quote”.
Putting it together gives (note that the backslash must be doubled inside a Java string literal):
String pattern = "(?<!')\\b" + strTables[i] + "\\b(?!')";
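To see how this behaves on the sample line from the question, here is a minimal sketch. The class name ForbiddenWordCheck and the helper findForbidden are made up for illustration, and wrapping each word in Pattern.quote is an extra precaution that is not part of the pattern above. It assumes Pattern.CASE_INSENSITIVE is kept, as in the question's code, so that “foo” matches the list entry “FOO”.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ForbiddenWordCheck {

    // Hypothetical helper: returns every unquoted whole-word occurrence
    // of any listed word in the given line.
    static List<String> findForbidden(String line, String[] words) {
        List<String> hits = new ArrayList<>();
        for (String word : words) {
            // Pattern.quote is an addition here: it keeps regex metacharacters
            // in a listed word from being interpreted as regex syntax.
            String regex = "(?<!')\\b" + Pattern.quote(word) + "\\b(?!')";
            Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(line);
            while (m.find()) {
                hits.add(m.group());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] strTables = {"FOO", "BAR"};
        String line = "This is foo and this is BAR but not 'foo' and not 'BAR' "
                + "but also not FOO_BAR but we want FOO%TEXT and also bar.text";
        System.out.println(findForbidden(line, strTables));
    }
}
```

Run on the sample line, this should print [foo, FOO, BAR, bar]: the quoted 'foo' and 'BAR' are rejected by the look-behind, and FOO_BAR is rejected because the underscore counts as a word character, so there is no word boundary next to it.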