
Extract digits in between two pipes padded with whitespace

I am struggling to write a regex that matches the numbers between pipes (1, 2, 3, 4, 5, 6 …) in this line:

| 2021-08-18 01:28 | Twitter | [INTL TWITTER AAA BBB CC ] (https://twitter.c.xx-xx-2.aaaa.com/#/groups/123) | Twitter XX (C++, C#) | 1 | 2 | 3 | 4 | [ aaaa ] | 5 | 6 | 7 |

My best attempt is this one:

| 2021-08-18 01:28 | Twitter | [INTL TWITTER AAA BBB CC ] (https://twitter.c.xx-xx-2.aaaa.com/#/groups/123) | Twitter XX (C++, C#) | (\d+) | (\d+) | (\d+) | (\d+) | [ aaaa ] | (\d+) | (\d+) | (\d+) |

It actually works, but it looks very hard-coded … If you can suggest an improvement, I would be thankful. Thanks in advance! 🙂


Answer

You can use

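\|\s*(\d+)(?=\s*\|)

Details:

  • \| – a literal pipe char (escaped, because an unescaped | means alternation)
  • \s* – zero or more whitespace chars
  • (\d+) – Group 1: one or more digits
  • (?=\s*\|) – a positive lookahead that matches a location immediately followed by zero or more whitespace chars and a pipe char.

Because the lookahead does not consume the closing pipe, that same pipe can open the next match, so adjacent cells such as | 1 | 2 | are all found.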
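To see what the lookahead buys you, here is a small sketch that runs the pattern twice on a shortened row: once with the closing pipe consumed as part of the match, and once with it only asserted. The class name LookaheadDemo and the helper findAll are made up for this illustration; they are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadDemo {
    // Hypothetical helper: collects Group 1 of every match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String row = "| 1 | 2 | 3 | 4 |";

        // Closing pipe is consumed, so the next search starts after it
        // and every second cell has lost its opening pipe.
        System.out.println(findAll("\\|\\s*(\\d+)\\s*\\|", row));    // [1, 3]

        // Closing pipe is only asserted, so it can open the next match.
        System.out.println(findAll("\\|\\s*(\\d+)(?=\\s*\\|)", row)); // [1, 2, 3, 4]
    }
}
```

The consuming variant silently drops every other number, which is easy to miss on real data.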

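In Java (note that each backslash must be doubled inside a string literal):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "| 2021-08-18 01:28 | Twitter | [INTL TWITTER AAA BBB CC ] (https://twitter.c.xx-xx-2.aaaa.com/#/groups/123) | Twitter XX (C++, C#) | 1 | 2 | 3 | 4 | [ aaaa ] | 5 | 6 | 7 |";
// The regex \|\s*(\d+)(?=\s*\|) written as a Java string literal
Pattern pattern = Pattern.compile("\\|\\s*(\\d+)(?=\\s*\\|)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(1)); // Group 1 holds the digits only
}
// => 1, 2, 3, 4, 5, 6, 7 (one per line)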
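
If you need the values rather than printed output, the same loop can be wrapped in a small method. This is only a sketch: the names PipeDigits and extract are invented here, and it assumes every matched cell fits in an int (Integer.parseInt throws NumberFormatException on longer digit runs).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PipeDigits {
    // Compiled once and reused; same pattern as in the answer above.
    private static final Pattern CELL = Pattern.compile("\\|\\s*(\\d+)(?=\\s*\\|)");

    // Hypothetical helper: returns every all-digit cell of a pipe-delimited row.
    static List<Integer> extract(String row) {
        List<Integer> numbers = new ArrayList<>();
        Matcher m = CELL.matcher(row);
        while (m.find()) {
            // Assumption: the digits fit in an int.
            numbers.add(Integer.parseInt(m.group(1)));
        }
        return numbers;
    }

    public static void main(String[] args) {
        String s = "| 2021-08-18 01:28 | Twitter | [INTL TWITTER AAA BBB CC ] (https://twitter.c.xx-xx-2.aaaa.com/#/groups/123) | Twitter XX (C++, C#) | 1 | 2 | 3 | 4 | [ aaaa ] | 5 | 6 | 7 |";
        System.out.println(extract(s)); // [1, 2, 3, 4, 5, 6, 7]
    }
}
```

The date cell and the 123 in the URL are skipped because neither is a pure digit run sitting between a pipe and the next pipe.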