Why does regex extraction return the wrong value in Scala? [closed]



val pattern = "[A-Z]{2,3}[0-9]{4}".r
val extractedData = pattern.findFirstIn("find ABCD1234")

I have the above code to look for valid data.

Input:

find DTD0001

Expected Output:

DTD0001 

Input:

find ABCD1234

Expected Output:

Nothing (no match).

Currently, it returns BCD1234, which is incorrect.

I want it to return a value only when the input has 3 letters followed by 4 digits; otherwise it should not return anything. How can I fix this?

Answer

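findFirstIn() returns the first substring that matches the pattern. In "find ABCD1234", the substring BCD1234 is three letters followed by four digits, so it matches. If you don’t want a substring but only want to test whether the whole string matches, then that’s not the tool you want to use.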

scala> "ABCE1234" matches "[A-Z]{2,3}[0-9]{4}"
res5: Boolean = false

scala> "ABC1234" matches "[A-Z]{2,3}[0-9]{4}"
res6: Boolean = true
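To see where the substring match comes from, here is a minimal sketch that reports the matched text together with its start offset. The object name FirstMatchDemo and the helper firstMatchWithOffset are made up for illustration; findFirstMatchIn, matched and start are standard members of scala.util.matching.Regex.

```scala
import scala.util.matching.Regex

object FirstMatchDemo {
  // Same pattern as in the question: 2 or 3 uppercase letters, then 4 digits.
  val pattern: Regex = "[A-Z]{2,3}[0-9]{4}".r

  // Hypothetical helper: the first match and the offset at which it starts.
  def firstMatchWithOffset(input: String): Option[(String, Int)] =
    pattern.findFirstMatchIn(input).map(m => (m.matched, m.start))

  def main(args: Array[String]): Unit = {
    // Starting at 'A' (offset 5) fails, because at most 3 letters may precede
    // the digits. The engine then retries at 'B' (offset 6) and succeeds.
    println(firstMatchWithOffset("find ABCD1234"))
    println(firstMatchWithOffset("find DTD0001"))
  }
}
```

The unanchored pattern is free to start one character later, which is why the leading A is silently dropped.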

If you want to extract a matching word from a longer string, then you need to add word boundaries (\b) to your pattern, so that a match cannot start or end in the middle of a run of letters and digits.

scala> val pattern = "\\b[A-Z]{2,3}[0-9]{4}\\b".r
pattern: scala.util.matching.Regex = \b[A-Z]{2,3}[0-9]{4}\b

scala> val extractedData = pattern.findFirstIn("find ABCD1234")
extractedData: Option[String] = None

scala> val extractedData = pattern.findFirstIn("find ABC1234")
extractedData: Option[String] = Some(ABC1234)
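The same idea as a self-contained sketch, assuming you want to pull every valid code out of a longer string. The names WordBoundaryDemo, extractCode and extractAll are hypothetical. The pattern is written as a triple-quoted string, which does not process escape sequences, so \b reaches the regex engine as a word boundary instead of being read as a backspace character.

```scala
import scala.util.matching.Regex

object WordBoundaryDemo {
  // Triple quotes keep the backslashes literal, so no double escaping is needed.
  val code: Regex = """\b[A-Z]{2,3}[0-9]{4}\b""".r

  // Hypothetical helper: first whole-word code in the input, if any.
  def extractCode(input: String): Option[String] = code.findFirstIn(input)

  // Hypothetical helper: every whole-word code in the input, in order.
  def extractAll(input: String): List[String] = code.findAllIn(input).toList

  def main(args: Array[String]): Unit = {
    println(extractCode("find ABCD1234"))
    println(extractCode("find ABC1234"))
    println(extractAll("AB1234, ABCD1234, XYZ0001x, DTD0001"))
  }
}
```

Note that \b treats letters, digits and the underscore as word characters, so a code directly followed by an underscore (for example ABC1234_) would not be extracted by this pattern.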


Source: stackoverflow