
Regex – recognise clusters of numbers and characters but only match a part with numbers

I want to write a regex that extracts numbers that might be related to age. It should recognise both simple forms such as 98, 1, 25 and more complex ones like 90-years-old. However, for the more complex forms, I would like it to match only the number. I will check later whether the number actually concerns age, but that is irrelevant for this question. What I have now matches both the numbers and the character parts:

\b(1[0-4]\d?|\d{0,2}\d{1})(-?years?)?\b

For example, in the string: "18year old 23 year old 99 years old but not 25-year-old and 91year old cousin is 99 now and 90-year-old or 102 year old 505 0 year 1 year 11 year old 11 year 199 102 0-year13 13 14 22 33 45 8 99years", it matches 23, 99 etc. as plain numbers, but 18year or 90-year comes back as a single match with two groups.
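The behaviour described above can be reproduced with a short Python sketch. It assumes Python's `re` module and a shortened sample string (both are choices made for illustration; the question does not name a regex flavour):

```python
import re

# The pattern from the question: the optional "(-?years?)" group is consumed,
# so it becomes part of the overall match.
original = re.compile(r"\b(1[0-4]\d?|\d{0,2}\d{1})(-?years?)?\b")

sample = "18year old 23 year old 90-year-old"  # shortened sample, for illustration

for m in original.finditer(sample):
    # group(0) is the whole match; groups 1 and 2 are the number and the suffix
    print(repr(m.group(0)), m.groups())

whole_matches = [m.group(0) for m in original.finditer(sample)]
```

Here `18year` and `90-year` are returned as whole matches because the suffix is consumed rather than merely checked.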

How can I change it so that only a number in such a cluster is matched (a single group)?

Answer

You can use

\b(\d{1,3})(?=-?year|\b)
\b\d{1,3}(?=-?year|\b)

The first captures the number in group 1. The second one has no capturing groups, so the whole match is the number.

Details:

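  • \b – a word boundary
  • \d{1,3} – one, two or three digits
  • (?=-?year|\b) – a positive lookahead that requires the digits to be immediately followed by an optional - and then year, or by a word boundary. The lookahead checks this text without consuming it, so it never becomes part of the match.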
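As a minimal sketch of how the group-free pattern behaves, assuming Python's `re` module and a shortened sample string (both chosen here for illustration):

```python
import re

# Group-free variant: the lookahead only checks for "-year"/"year" or a
# word boundary, so the match itself contains digits only.
age_number = re.compile(r"\b\d{1,3}(?=-?year|\b)")

sample = "18year old 23 year old and 90-year-old or 102 year old 99years"

numbers = age_number.findall(sample)
print(numbers)  # ['18', '23', '90', '102', '99']

# A run of four digits is not matched at all: no 1-3 digit prefix is
# followed by "year" or by a word boundary.
print(age_number.findall("1999"))  # []
```

Note that this pattern accepts any number of one to three digits (505, for example), so the range check still has to happen afterwards, as the question intends.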