How to extract words in Java using regex

Suppose I have a string

String s = "The | community | office | system | is here to help you with specific coding, algorithm, or language problems.";

How can I extract all the words in string s that sit between pipe delimiters into a list?

So the list should contain community, office, system.

I thought of using the following pattern. Will it work?

Matcher matcher = Pattern.compile("(\\|\\w+)").matcher(s);

Answer

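Your pattern will not match here: every | in the string is followed by a space, not a word character, and the group would also include the | itself. You can use either of these instead: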

\|\s*(\w+)(?=\s*\|)
\|\s*(.*?)(?=\s*\|)

Details:

  • \| – a literal | char
  • \s* – zero or more whitespace chars
  • (\w+) – Group 1 (first pattern): one or more word chars
  • (.*?) – Group 1 (second pattern): zero or more chars other than line break chars, as few as possible
  • (?=\s*\|) – a positive lookahead that matches a location immediately followed by zero or more whitespace chars and a | char, without consuming them.
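
The lookahead matters because neighbouring words share a delimiter: the | that closes one word also opens the next. A minimal sketch of the difference follows. The class name LookaheadDemo and the helper findAll are made up for illustration, and the second pattern (closing | consumed) is only a counter-example, not part of the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class LookaheadDemo {
    // Collects group 1 of every match of regex in s.
    static List<String> findAll(String regex, String s) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(s);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "The | community | office | system | is here";

        // Lookahead: the closing | is checked but not consumed,
        // so it can still open the next match.
        System.out.println(findAll("\\|\\s*(\\w+)(?=\\s*\\|)", s));
        // => [community, office, system]

        // Counter-example: the closing | is consumed, so the next
        // search starts after it and "office" is skipped.
        System.out.println(findAll("\\|\\s*(\\w+)\\s*\\|", s));
        // => [community, system]
    }
}
```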

Java demos for both patterns:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// First pattern: a single word between pipes
String s = "The | community | office | system | is here to help you with specific coding, algorithm, or language problems.";
Pattern pattern = Pattern.compile("\\|\\s*(\\w+)(?=\\s*\\|)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
// => community
//    office
//    system

// Second pattern: any text between pipes (variables renamed so both demos fit in one scope)
String s2 = "The | community section | office section  | system section  | is here to help you with specific coding, algorithm, or language problems.";
Pattern pattern2 = Pattern.compile("\\|\\s*(.*?)(?=\\s*\\|)");
Matcher matcher2 = pattern2.matcher(s2);
while (matcher2.find()) {
    System.out.println(matcher2.group(1));
}
// => community section
//    office section
//    system section
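
The question asks for a list rather than printed output. One way this might look, as a sketch only: it reuses the first pattern, and the class name PipeWords and the method extract are hypothetical names chosen for the example. It assumes Java 9 or later, since it relies on Matcher.results().

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class; the names are illustrative only.
class PipeWords {
    // Same pattern as the first demo, compiled once and reused.
    private static final Pattern BETWEEN_PIPES =
            Pattern.compile("\\|\\s*(\\w+)(?=\\s*\\|)");

    // Returns group 1 of every match, in order of appearance.
    // Matcher.results() requires Java 9 or later.
    static List<String> extract(String s) {
        return BETWEEN_PIPES.matcher(s)
                .results()
                .map(r -> r.group(1))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String s = "The | community | office | system | is here to help you with specific coding, algorithm, or language problems.";
        System.out.println(extract(s)); // => [community, office, system]
    }
}
```

On Java 8, the same result can be had by keeping the while (matcher.find()) loop and adding matcher.group(1) to an ArrayList instead of printing it.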