Recognise an arbitrary date string [closed]

I need to be able to recognise date strings. It doesn’t matter if I cannot distinguish between month and day (e.g. 12/12/10); I just need to classify the string as a date, not convert it to a Date object. So this is really a classification problem rather than a parsing problem.

I will have pieces of text such as:

“bla bla bla bla 12 Jan 09 bla bla bla 01/04/10 bla bla bla”

and I need to be able to recognise the start and end boundary for each date string within.

I was wondering if anyone knows of any Java libraries that can do this. My Google-fu hasn’t come up with anything so far.

UPDATE: I need to be able to recognise the widest possible set of ways of representing dates. Of course the naive solution would be to write an if statement for every conceivable format, but a pattern recognition approach, with a trained model, is ideally what I’m after.

Answer

Use JChronic

You may want to use DateParser2 from the edu.mit.broad.genome.utils package.
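
If a quick baseline helps before trying either library, here is a minimal sketch of the "naive" approach from the question, using only java.util.regex. It is not a trained model and does not use JChronic or DateParser2. The class name DateSpanFinder, the method findDateSpans and the three patterns are my own assumptions for illustration, and they cover only a few common formats.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of any library): finds the start and end
// offsets of substrings that look like dates, without parsing them.
class DateSpanFinder {

    // Month names or abbreviations, optionally followed by a dot.
    // Assumption: English month names only. This is deliberately loose,
    // so a word such as "Marble" followed by numbers would also match.
    private static final String MONTH =
        "(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*\\.?";

    private static final Pattern DATE = Pattern.compile(
        // e.g. "12 Jan 09", "3 March 2011"
        "\\b\\d{1,2}\\s+" + MONTH + "\\s+\\d{2,4}\\b"
        // e.g. "March 3, 2011", "Jan 12 09"
        + "|\\b" + MONTH + "\\s+\\d{1,2},?\\s+\\d{2,4}\\b"
        // e.g. "01/04/10", "2010-04-01", "1.4.2010"
        + "|\\b\\d{1,4}[/.-]\\d{1,2}[/.-]\\d{1,4}\\b",
        Pattern.CASE_INSENSITIVE);

    // Returns one {start, end} pair per candidate date; end is exclusive.
    static List<int[]> findDateSpans(String text) {
        List<int[]> spans = new ArrayList<>();
        Matcher m = DATE.matcher(text);
        while (m.find()) {
            spans.add(new int[] { m.start(), m.end() });
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "bla bla bla bla 12 Jan 09 bla bla bla 01/04/10 bla bla bla";
        for (int[] span : findDateSpans(text)) {
            System.out.println(span[0] + "-" + span[1] + ": "
                + text.substring(span[0], span[1]));
        }
    }
}
```

Because it only classifies spans, ambiguous strings such as 12/12/10 are still reported as dates. It will also flag things like version numbers (1.2.3), so treat the matches as candidates rather than confirmed dates.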
