Given this String containing a decimal point, I would like to remove all non-alphanumeric characters except the decimal point.
String toPharse = "the. book - cost 7.55 dollars.";
String newPharse = toPharse.replaceAll("[^A-Za-z\\d.0-9 ]", " ").replaceAll("\\s+", " ");
Currently I get "the. book cost 7.55 dollars."
However, I would like to get "the book cost 7.55 dollars".
Answer
You can use:
String toPharse = "the. book - cost 7.55 dollars.";
String newPharse = toPharse
    .replaceAll("(?<!\\d)\\.(?!\\d)|[^a-zA-Z\\d. ]+", "")
    .replaceAll("\\h{2,}", " ");
//=> "the book cost 7.55 dollars"
RegEx Details:
(?<!\d): Negative lookbehind; the previous character is not a digit
\.: Match a dot
(?!\d): Negative lookahead; the next character is not a digit
|: OR
[^a-zA-Z\d. ]+: Match 1+ non-alphanumeric characters that are not a space or a dot
.replaceAll("\\h{2,}", " "): Replaces a run of 2+ horizontal whitespace characters with a single space
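To try the pattern on more inputs, the two replacements can be wrapped in a small helper. This is only a sketch: the class name PhraseCleaner and the method clean are made up for illustration, and precompiling the patterns is a convenience that is not part of the answer above.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name and structure are illustrative only.
public class PhraseCleaner {
    // A dot with no digit on either side, OR a run of characters that are
    // not letters, digits, dots, or spaces.
    private static final Pattern UNWANTED =
            Pattern.compile("(?<!\\d)\\.(?!\\d)|[^a-zA-Z\\d. ]+");

    // Two or more horizontal whitespace characters in a row.
    private static final Pattern SPACE_RUN = Pattern.compile("\\h{2,}");

    static String clean(String input) {
        // Step 1: delete the unwanted characters outright.
        String stripped = UNWANTED.matcher(input).replaceAll("");
        // Step 2: collapse the gaps left behind into single spaces.
        return SPACE_RUN.matcher(stripped).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(clean("the. book - cost 7.55 dollars."));
    }
}
```

Note that a dot is kept whenever a digit sits on either side of it, so a sentence ending in a number, such as "it costs 7.", keeps its final dot. If that matters, the lookarounds would need to be tightened.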