Tag: lucene

Apache Lucene to replace found terms

I’m looking for a way to find and replace words in a text based on queries, using Apache Lucene. For example, I have the text “Happy New Year!”, the fuzzy Lucene query “year~2”, and a replacement string (“###”). The result I want is “Happy New ###!”. Is there a way to achieve this using Apache Lucene only?
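To make the expected behaviour concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: it stands in for the fuzzy match with an ordinary Levenshtein distance over lowercased words, which is an assumption and not necessarily how a Lucene-only solution would work. The class name `FuzzyReplace` and its methods are hypothetical and exist only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: replaces every word whose
// edit distance to a term is within a limit.
public class FuzzyReplace {

    // Levenshtein distance (insertions, deletions, substitutions),
    // computed with two rolling rows of the dynamic-programming table.
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Walks over runs of letters; text between words (spaces, punctuation)
    // is copied through unchanged. Comparison is case-insensitive (assumption).
    public static String replace(String text, String term, int maxEdits, String replacement) {
        Matcher m = Pattern.compile("\\p{L}+").matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());
            String word = m.group();
            boolean hit = levenshtein(word.toLowerCase(), term.toLowerCase()) <= maxEdits;
            out.append(hit ? replacement : word);
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // "year~2" is modelled here as term "year" with at most 2 edits.
        System.out.println(replace("Happy New Year!", "year", 2, "###"));
    }
}
```

With the inputs from the question, “New” is three edits away from “year” and is left alone, while “Year” matches after lowercasing and is replaced.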

How to sort a Lucene long date range by date

Problem: I want to search books by date range and sort the results. Searching by date range works, but the documents are not sorted properly (they appear in insertion order; see the IDs): To sort them by date, I changed my code to: Add NumericDocValuesField: Add a Sort: Question: What am I doing wrong? What do I need to change so the documents get

Apache Solr – Indexing ZIP files

My web app is an e-mail service. It stores email messages in a MySQL database, and email attachments are stored on disk. The database is similar to: I index it with the following data-config.xml: This works well for all files except compressed files such as .zip. For .zip files the attach_content field gets filled only with the file names

Can a Hibernate Search FieldBridge configure facets for dynamic fields?

Using Hibernate Search 5.11.3 with the programmatic API (no annotations), is there a way to facet on dynamic fields added in a class or field bridge? I don’t see any ‘facet’ config available in FieldMetadataBuilder when using MetadataProvidingFieldBridge. I have tried various combinations of luceneOptions.addSortedDocValuesFieldToDocument() and luceneOptions.addFieldToDocument() in the set() method. This successfully updates the index, but I cannot perform facet

Add weights to documents Lucene 8

I am currently working on a small search engine for college using Lucene 8. I already built it before, but without applying any weights to documents. I am now required to add the PageRanks of documents as a weight for each document, and I already computed the PageRank values. How can I add a weight to a Document object (not

Lucene split package: module reads package ‘org.apache.lucene.analysis.standard’ from both ‘lucene.analyzers.common’ and ‘lucene.core’

Given my I get the following error: Module ‘my_module’ reads package ‘org.apache.lucene.analysis.standard’ from both ‘lucene.analyzers.common’ and ‘lucene.core’ In my code I use the following imports: How can I resolve this split package problem? Answer: As you may already know, Lucene doesn’t support the Java Platform Module System properly, so it doesn’t define modules and contains split packages, which don’t work

Lemmatization with apache lucene

I’m developing a text analysis project using Apache Lucene. I need to lemmatize some text (transform the words to their canonical forms). I’ve already written code that performs stemming. Using it, I am able to convert the following sentence The stem is the part of the word that never changes even when morphologically inflected; a lemma is the base

Lucene: Multi-word phrases as search terms

I’m trying to make a searchable phone/local business directory using Apache Lucene. I have fields for street name, business name, phone number, etc. The problem I’m having is that when I search by street and the street name has multiple words (e.g. ‘the crescent’), no results are returned. But if I try to search with just one