If I am looking for a particular word inside a string, for example “are” in the string “how are you”, would a regular indexOf() work faster and better, or a regex matches()?
String testStr = "how are you";
String lookUp = "are";
//METHOD 1
if (testStr.indexOf(lookUp) != -1)
{
    System.out.println("Found!");
}
//OR
//METHOD 2
// String.matches() must match the whole string, hence the leading and trailing ".*"
if (testStr.matches(".*"+lookUp+".*"))
{
    System.out.println("Found!");
}
Which of the two methods above is a better way of looking for a string inside another string? Or is there a much better alternative?
- Ivard
Answer
If you don’t care whether it’s actually the entire word you’re matching, then indexOf() will be a lot faster.
If, on the other hand, you need to be able to differentiate between are, harebrained, aren't etc., then you need a regex: \bare\b will only match are as an entire word (written as "\\bare\\b" in a Java string literal).
\b is a word boundary anchor, and it matches the empty position between an alphanumeric character (letter, digit, or underscore) and a non-alphanumeric character, or the start or end of the string.
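To make the difference concrete, here is a minimal sketch that compares a plain substring check with a whole-word check. The class name WordSearch and the helper names containsSubstring and containsWord are made up for illustration; Pattern.quote is used on the assumption that the search term should be treated literally rather than as a regex.

```java
import java.util.regex.Pattern;

class WordSearch {
    // Substring check: true if 'term' occurs anywhere, even inside a longer word.
    static boolean containsSubstring(String text, String term) {
        return text.indexOf(term) != -1;
    }

    // Whole-word check: a word boundary on each side of the term.
    // Pattern.quote makes the term a literal, so regex metacharacters in it are harmless.
    static boolean containsWord(String text, String term) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b");
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsSubstring("harebrained", "are")); // true
        System.out.println(containsWord("harebrained", "are"));      // false
        System.out.println(containsWord("how are you", "are"));      // true
        System.out.println(containsWord("they aren't here", "are")); // false
    }
}
```

If the same term is searched for repeatedly, it is probably worth compiling the Pattern once and reusing it instead of recompiling on every call.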
Caveat: This also means that if your search term isn’t actually a word (let’s say you’re looking for ###), then these word boundary anchors will only match in a string like aaa###zzz, but not in +++###+++.
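The caveat can be checked with a small sketch. The class name BoundaryCaveat and the helper findsBounded are hypothetical; the pattern is simply ### wrapped in word boundary anchors.

```java
import java.util.regex.Pattern;

class BoundaryCaveat {
    // '#' is not a word character, so each \b needs a word character on its other side.
    private static final Pattern BOUNDED = Pattern.compile("\\b###\\b");

    static boolean findsBounded(String text) {
        return BOUNDED.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(findsBounded("aaa###zzz")); // true: letters on both sides
        System.out.println(findsBounded("+++###+++")); // false: no word character nearby
    }
}
```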
Further caveat: Java has by default a limited worldview on what constitutes an alphanumeric character. Only ASCII letters/digits (plus the underscore) count here, so word boundary anchors will fail on words like élève, relevé or ärgern. Read more about this (and how to solve this problem) here.
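One option I am aware of, which is not necessarily what the linked post recommends, is to compile the pattern with the Pattern.UNICODE_CHARACTER_CLASS flag (available since Java 7), which makes \b and \w follow Unicode definitions. The sketch below assumes that flag is acceptable for your use case; the class name UnicodeWordSearch and the helper containsWord are hypothetical. The default treatment of non-ASCII letters by \b may differ between Java versions, so the sketch relies only on the flagged behaviour.

```java
import java.util.regex.Pattern;

class UnicodeWordSearch {
    // Whole-word check with Unicode-aware word boundaries.
    static boolean containsWord(String text, String term) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                                    Pattern.UNICODE_CHARACTER_CLASS);
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        // The accented letters are written as Unicode escapes so the
        // source file encoding does not matter.
        String word = "\u00e9l\u00e8ve";
        System.out.println(containsWord("un " + word + " sage", word)); // true
        System.out.println(containsWord("les " + word + "s", word));    // false
    }
}
```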