non-capturing group still shows?

I am trying to get the part of a URL that starts at the third /.

Here is the URL:

http://192.168.1.253:18888/2021/03/11/896459e4-875f-455a-a2cb-768c879555e7.png

I wish to get /2021/03/11/896459e4-875f-455a-a2cb-768c879555e7.png

So I used the following regex: `(?://.+)/.+`

`?:` marks a non-capturing group, so `//192.168.1.253:18888` shouldn't be matched.

But when I test it on regex101.com, the result is `//192.168.1.253:18888/2021/03/11/896459e4-875f-455a-a2cb-768c879555e7.png`.

Why is that?

Answer

regex101.com reports `//192.168.1.253:18888/2021/03/11/896459e4-875f-455a-a2cb-768c879555e7.png` because a non-capturing group `(?: ... )` still consumes the text it matches. It only skips creating a numbered group; whatever it matched remains part of the overall match, and the overall match is what regex101 displays.
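
To see this outside regex101, here is a minimal Java sketch that runs the regex from the question. The class and method names (`NonCapturingDemo`, `fullMatch`, `capturingGroups`) are made up for illustration. It only shows that the text matched by the non-capturing group is part of the overall match, while the pattern itself defines no capturing groups.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class NonCapturingDemo {
    // The regex from the question, unchanged.
    private static final Pattern QUESTION_REGEX = Pattern.compile("(?://.+)/.+");

    // Returns the overall match (group 0), or null if nothing matches.
    static String fullMatch(String input) {
        Matcher m = QUESTION_REGEX.matcher(input);
        return m.find() ? m.group() : null;
    }

    // Returns how many capturing groups the pattern defines.
    static int capturingGroups() {
        return QUESTION_REGEX.matcher("").groupCount();
    }

    public static void main(String[] args) {
        String url = "http://192.168.1.253:18888/2021/03/11/896459e4-875f-455a-a2cb-768c879555e7.png";
        // The overall match still starts at "//", because (?: ... ) consumed that text.
        System.out.println(fullMatch(url));
        // No numbered groups exist, so there is nothing to pick out separately.
        System.out.println(capturingGroups());
    }
}
```

Because the pattern has no capturing group, the only result available is the overall match, which is why the host part appears in it.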

In languages such as Java, match the double slash and everything after it up to the next forward slash, capture the rest in a group, and keep only that group:

Regex: `//[^/]+(.+)`
Input: `http://192.168.1.253:18888/2021/03/11/896459e4-875f-455a-a2cb-768c879555e7.png`
Ignore Match1: `//192.168.1.253:18888/2021/03/11/896459e4-875f-455a-a2cb-768c879555e7.png`
Keep Group1: `/2021/03/11/896459e4-875f-455a-a2cb-768c879555e7.png`
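
As a rough sketch of how this might look in Java, the snippet below applies the regex above and keeps only group 1. The class name `UrlPathExtractor` and the method `pathOf` are hypothetical names chosen for this example, and returning `null` when nothing matches is an assumption rather than a requirement.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class UrlPathExtractor {
    // "//" followed by the host part (no slashes), then capture the rest.
    private static final Pattern AFTER_HOST = Pattern.compile("//[^/]+(.+)");

    // Returns the text from the third slash onward, or null if the pattern does not match.
    static String pathOf(String url) {
        Matcher m = AFTER_HOST.matcher(url);
        // group(1) is the captured part; group(0) would also include "//host:port".
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "http://192.168.1.253:18888/2021/03/11/896459e4-875f-455a-a2cb-768c879555e7.png";
        System.out.println(pathOf(url));
    }
}
```

With the input from the question, `pathOf` returns `/2021/03/11/896459e4-875f-455a-a2cb-768c879555e7.png`.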
User contributions licensed under: CC BY-SA