I’m looking for a way to find and replace words in a text based on queries, using Apache Lucene. For example, I have the text “Happy New Year!”, the Lucene query “year~2” (fuzzy matching) and a replacement string (“###”). The result I want is “Happy New ###!”. Is there a way to achieve this using Apache Lucene only?
Answer
In case anyone else needs this: I managed to solve the problem using Lucene’s Highlighter (the org.apache.lucene.search.highlight package). See the code sample below.
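// Formatter: mask every token group that scored against the query.
Highlighter highlighter = new Highlighter((originalText, tokenGroup) -> {
    if (tokenGroup.getTotalScore() <= 0) {
        return originalText; // no match, keep the text unchanged
    }
    return "###";
}, new QueryScorer(query));
// ...
// Note: fragments without a match are left out of the result; for longer texts,
// highlighter.setTextFragmenter(new NullFragmenter()) should keep the text whole.
String highlighted = highlighter.getBestFragments(tokenStream, fieldText, 100, "...");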
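To sanity-check what the Highlighter returns, it can help to have a reference for the expected output that does not depend on Lucene at all. The following is only a rough plain-Java sketch of the behaviour asked for in the question, not Lucene's implementation: the class `FuzzyMask` and its methods are hypothetical names, tokens are assumed to be runs of letters and digits compared case-insensitively, and the distance is plain Levenshtein (Lucene's `FuzzyQuery` also counts transpositions by default, so results can differ for swapped letters).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FuzzyMask {

    // Plain Levenshtein distance (insert, delete, substitute) using two DP rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Replaces every word within maxEdits of term; punctuation and spacing are kept.
    static String mask(String text, String term, int maxEdits, String replacement) {
        Matcher m = Pattern.compile("[\\p{L}\\p{N}]+").matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());
            String word = m.group();
            boolean hit = editDistance(word.toLowerCase(), term.toLowerCase()) <= maxEdits;
            out.append(hit ? replacement : word);
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mask("Happy New Year!", "year", 2, "###")); // prints "Happy New ###!"
    }
}
```

Comparing this output with the Highlighter result on a few sample texts is a quick way to spot tokenization or fragmenting surprises.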