Package org.aksw.palmetto.corpus.lucene
Class LuceneCorpusAdapter
- java.lang.Object
-
- org.aksw.palmetto.corpus.lucene.LuceneCorpusAdapter
-
- All Implemented Interfaces:
BooleanDocumentSupportingAdapter, CorpusAdapter
- Direct Known Subclasses:
WindowSupportingLuceneCorpusAdapter
public class LuceneCorpusAdapter extends Object implements BooleanDocumentSupportingAdapter
This class uses a given Lucene index as a corpus.
- Author:
- m.roeder
-
-
Field Summary
Fields (Modifier and Type, Field)
protected org.apache.lucene.index.AtomicReaderContext[] contexts
protected org.apache.lucene.index.DirectoryReader dirReader
protected String fieldName
private static org.slf4j.Logger LOGGER
protected org.apache.lucene.index.AtomicReader[] reader
-
Constructor Summary
Constructors (Modifier, Constructor)
protected LuceneCorpusAdapter(org.apache.lucene.index.DirectoryReader dirReader, org.apache.lucene.index.AtomicReader[] reader, org.apache.lucene.index.AtomicReaderContext[] contexts, String fieldName)
-
Method Summary
Methods (Modifier and Type, Method, Description)
void close()
    Closes the Lucene index.
static LuceneCorpusAdapter create(String indexPath, String fieldName)
    Creates a corpus adapter which uses the Lucene index with the given path and searches on the field with the given field name.
void getDocumentsWithWord(String word, com.carrotsearch.hppc.IntArrayList documents)
    Determines the documents containing the given word.
void getDocumentsWithWordAsSet(String word, com.carrotsearch.hppc.IntOpenHashSet documents)
    Determines the documents containing the given word.
void getDocumentsWithWords(com.carrotsearch.hppc.ObjectObjectOpenHashMap<String,com.carrotsearch.hppc.IntArrayList> wordDocMapping)
    Determines the documents containing the words used as key in the given map.
void getDocumentsWithWordsAsSet(com.carrotsearch.hppc.ObjectObjectOpenHashMap<String,com.carrotsearch.hppc.IntOpenHashSet> wordDocMapping)
    Determines the documents containing the words used as key in the given map.
int getNumberOfDocuments()
    Returns the number of documents the corpus contains.
-
-
-
Field Detail
-
LOGGER
private static final org.slf4j.Logger LOGGER
-
fieldName
protected String fieldName
-
dirReader
protected org.apache.lucene.index.DirectoryReader dirReader
-
reader
protected org.apache.lucene.index.AtomicReader[] reader
-
contexts
protected org.apache.lucene.index.AtomicReaderContext[] contexts
-
-
Constructor Detail
-
LuceneCorpusAdapter
protected LuceneCorpusAdapter(org.apache.lucene.index.DirectoryReader dirReader, org.apache.lucene.index.AtomicReader[] reader, org.apache.lucene.index.AtomicReaderContext[] contexts, String fieldName)
-
-
Method Detail
-
create
public static LuceneCorpusAdapter create(String indexPath, String fieldName) throws org.apache.lucene.index.CorruptIndexException, IOException
Creates a corpus adapter that uses the Lucene index at the given path and searches the field with the given field name.
- Parameters:
indexPath - the path of the Lucene index
fieldName - the name of the field that is searched
- Returns:
- the created corpus adapter
- Throws:
org.apache.lucene.index.CorruptIndexException
IOException
-
getDocumentsWithWordAsSet
public void getDocumentsWithWordAsSet(String word, com.carrotsearch.hppc.IntOpenHashSet documents)
Description copied from interface: BooleanDocumentSupportingAdapter
Determines the documents containing the given word. The ids of the found documents are inserted into the given set.
- Specified by:
getDocumentsWithWordAsSet in interface BooleanDocumentSupportingAdapter
- Parameters:
word - the word which should be searched
documents - the set in which the document ids will be stored
-
close
public void close()
Closes the Lucene index.
- Specified by:
close in interface CorpusAdapter
-
getNumberOfDocuments
public int getNumberOfDocuments()
Description copied from interface: BooleanDocumentSupportingAdapter
Returns the number of documents the corpus contains.
- Specified by:
getNumberOfDocuments in interface BooleanDocumentSupportingAdapter
- Returns:
- the number of documents
-
getDocumentsWithWordsAsSet
public void getDocumentsWithWordsAsSet(com.carrotsearch.hppc.ObjectObjectOpenHashMap<String,com.carrotsearch.hppc.IntOpenHashSet> wordDocMapping)
Description copied from interface: BooleanDocumentSupportingAdapter
Determines the documents containing the words used as key in the given map. The resulting sets contain the ids of the documents and are inserted into the map.
- Specified by:
getDocumentsWithWordsAsSet in interface BooleanDocumentSupportingAdapter
- Parameters:
wordDocMapping - a mapping of words to documents in which the results are stored
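The calling convention described above (the caller registers the words as keys, and the method fills in the document-id sets) can be illustrated with a small stand-in. This is only a sketch under stated assumptions: InMemoryCorpusSketch is a hypothetical class that is not part of Palmetto, it keeps the documents in memory instead of reading a Lucene index, and it uses java.util collections in place of the HPPC types named in the signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical in-memory stand-in that mirrors the out-parameter style of the
 * adapter methods. It is NOT part of Palmetto and does not use Lucene or HPPC.
 */
public class InMemoryCorpusSketch {
    // One set of distinct words per document; the list position is the document id.
    private final List<Set<String>> documents = new ArrayList<>();

    public InMemoryCorpusSketch(String[][] tokenizedDocuments) {
        for (String[] tokens : tokenizedDocuments) {
            documents.add(new HashSet<>(Arrays.asList(tokens)));
        }
    }

    /** Returns the number of documents the corpus contains. */
    public int getNumberOfDocuments() {
        return documents.size();
    }

    /** Inserts the ids of all documents containing the word into the given set. */
    public void getDocumentsWithWordAsSet(String word, Set<Integer> result) {
        for (int id = 0; id < documents.size(); id++) {
            if (documents.get(id).contains(word)) {
                result.add(id);
            }
        }
    }

    /** Replaces the value of every key (word) with the set of matching document ids. */
    public void getDocumentsWithWordsAsSet(Map<String, Set<Integer>> wordDocMapping) {
        for (Map.Entry<String, Set<Integer>> entry : wordDocMapping.entrySet()) {
            Set<Integer> ids = new HashSet<>();
            getDocumentsWithWordAsSet(entry.getKey(), ids);
            entry.setValue(ids); // value update only, so iteration stays valid
        }
    }

    public static void main(String[] args) {
        InMemoryCorpusSketch corpus = new InMemoryCorpusSketch(new String[][] {
            {"lucene", "index"},
            {"topic", "coherence"},
            {"lucene", "topic"}
        });

        // The caller registers the words of interest as keys; the values are filled in.
        Map<String, Set<Integer>> mapping = new HashMap<>();
        mapping.put("lucene", null);
        mapping.put("topic", null);
        corpus.getDocumentsWithWordsAsSet(mapping);

        System.out.println("lucene -> " + mapping.get("lucene"));
        System.out.println("topic  -> " + mapping.get("topic"));
        System.out.println("documents: " + corpus.getNumberOfDocuments());
    }
}
```

With the real adapter, the same pattern would presumably use an ObjectObjectOpenHashMap keyed by the words and read back IntOpenHashSet values after the call, as the signature above indicates.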
-
getDocumentsWithWords
public void getDocumentsWithWords(com.carrotsearch.hppc.ObjectObjectOpenHashMap<String,com.carrotsearch.hppc.IntArrayList> wordDocMapping)
Description copied from interface: BooleanDocumentSupportingAdapter
Determines the documents containing the words used as key in the given map. The resulting int arrays contain the ids of the documents and are inserted into the map.
- Specified by:
getDocumentsWithWords in interface BooleanDocumentSupportingAdapter
- Parameters:
wordDocMapping - a mapping of words to documents in which the results are stored
-
getDocumentsWithWord
public void getDocumentsWithWord(String word, com.carrotsearch.hppc.IntArrayList documents)
Description copied from interface: BooleanDocumentSupportingAdapter
Determines the documents containing the given word. The ids of the found documents are appended to the given list.
- Specified by:
getDocumentsWithWord in interface BooleanDocumentSupportingAdapter
- Parameters:
word - the word which should be searched
documents - the list to which the document ids will be added
-
-