I need to be able to recognise date strings. It doesn’t matter if I cannot distinguish between month and day (e.g. 12/12/10); I just need to classify the string as being a date, not convert it to a Date object. So this is really a classification problem rather than a parsing problem.
I will have pieces of text such as:
“bla bla bla bla 12 Jan 09 bla bla bla
01/04/10 bla bla bla”
and I need to be able to recognise the start and end boundary for each date string within.
I was wondering if anyone knows of any Java libraries that can do this. My google-fu hasn’t come up with anything so far.
UPDATE: I need to be able to recognise the widest possible set of ways of representing a date. Of course the naive solution might be to write an if statement for every conceivable format, but a pattern recognition approach, with a trained model, is ideally what I’m after.
You may want to use DateParser2 from the edu.mit.broad.genome.utils package.
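If no library fits, a fallback is to locate the boundaries yourself with java.util.regex. The following is only a minimal sketch of the naive pattern approach mentioned in the question, not a trained model. The class name DateSpanFinder, the method findDateSpans and the three patterns are my own assumptions for illustration; they cover numeric dates such as 01/04/10 and month-name dates such as 12 Jan 09, and will miss many other formats.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds substrings that merely look like dates.
// It never builds a Date object, so 12/12/10 stays ambiguous on purpose.
final class DateSpanFinder {

    // Month name or abbreviation, optionally followed by a dot (assumed English only).
    private static final String MONTH =
        "(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*\\.?";

    // Day of month with an optional ordinal suffix, e.g. 3 or 3rd.
    private static final String DAY = "\\d{1,2}(?:st|nd|rd|th)?";

    private static final Pattern DATE_LIKE = Pattern.compile(
        "\\b(?:"
        + "\\d{1,4}[/.-]\\d{1,2}[/.-]\\d{1,4}"            // 01/04/10, 2010-04-01
        + "|" + DAY + "\\s+" + MONTH + ",?\\s+\\d{2,4}"   // 12 Jan 09
        + "|" + MONTH + "\\s+" + DAY + ",?\\s+\\d{2,4}"   // January 12, 2009
        + ")\\b",
        Pattern.CASE_INSENSITIVE);

    // Returns {start, end} index pairs, with end exclusive, for each date-like match.
    static List<int[]> findDateSpans(String text) {
        List<int[]> spans = new ArrayList<>();
        Matcher m = DATE_LIKE.matcher(text);
        while (m.find()) {
            spans.add(new int[] {m.start(), m.end()});
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "bla bla bla bla 12 Jan 09 bla bla bla\n01/04/10 bla bla bla";
        for (int[] span : findDateSpans(text)) {
            System.out.println("[" + span[0] + ", " + span[1] + ") "
                + text.substring(span[0], span[1]));
        }
    }
}
```

Each returned pair gives the start and end boundary of one candidate, so text.substring(start, end) recovers the matched date string. Adding a format means adding another alternative to the pattern, which is exactly the maintenance cost that makes a trained model attractive for the widest possible coverage.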