Tag: lucene

Apache Lucene to replace found terms

I’m looking for a way to find and replace words in a text based on queries, using Apache Lucene. For example, I have the text “Happy New Year!”, the Lucene query “year~2” with fuzzy matching, and some replacement characters (“###”). As a result I want the following …
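
The idea in this question can be sketched without Lucene itself. The following is a minimal plain-Java illustration, not Lucene's API: it assumes that “year~2” means “any word within edit distance 2 of year”, splits the text into runs of letters, and compares case-insensitively. The class and method names (`FuzzyReplaceSketch`, `replaceFuzzy`, `editDistance`) are hypothetical, and plain Levenshtein distance stands in for whatever matching Lucene performs internally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FuzzyReplaceSketch {

    // Plain Levenshtein distance (insert, delete, substitute) using two DP rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Replaces every word whose lower-cased form is within maxEdits of the term.
    // Assumption: a "word" is a maximal run of letters; everything else is kept as-is.
    static String replaceFuzzy(String text, String term, int maxEdits, String replacement) {
        Matcher m = Pattern.compile("\\p{L}+").matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());   // copy punctuation and spaces unchanged
            String word = m.group();
            boolean hit = editDistance(word.toLowerCase(), term.toLowerCase()) <= maxEdits;
            out.append(hit ? replacement : word);
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "Happy New ###!"
        System.out.println(replaceFuzzy("Happy New Year!", "year", 2, "###"));
    }
}
```

In this sketch “New” is left alone because its distance to “year” is 3. A Lucene-based solution would more likely rely on the analyzer's token offsets or the highlighter to locate matches, which this sketch does not attempt.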

How to sort a Lucene long date range by date

Problem: I want to search books by date range and sort the results by date. Searching by date range works, but the documents are not sorted properly (insertion order, see ID?): To sort them by date, I changed my code to: Add NumericDocValuesField: Add a Sort: Question: What am I doing wrong? What do I need to change …

Apache Solr – Indexing ZIP files

My web app is an e-mail service. It stores email messages in a MySQL database, and email attachments are stored on disk. The database is similar to: I index it with the following data-config.xml: This works well with all files except compressed files such as .zip. For .zip files the attach_content field gets…

Add weights to documents Lucene 8

I am currently working on a small search engine for college using Lucene 8. I already built it before, but without applying any weights to documents. I am now required to add the PageRanks of documents as a weight for each document, and I already computed the PageRank values. How can I add a weight to a Docum…

Lemmatization with Apache Lucene

I’m developing a text analysis project using Apache Lucene. I need to lemmatize some text (transform the words to their canonical forms). I’ve already written the code that performs stemming. Using it, I am able to convert the following sentence The stem is the part of the word that never changes eve…

Lucene: Multi-word phrases as search terms

I’m trying to make a searchable phone/local business directory using Apache Lucene. I have fields for street name, business name, phone number etc. The problem that I’m having is that when I try to search by street where the street name has multiple words (e.g. ‘the crescent’), no resu…