Package org.aksw.palmetto.corpus.lucene
Class WindowSupportingLuceneCorpusAdapter
java.lang.Object
org.aksw.palmetto.corpus.lucene.LuceneCorpusAdapter
org.aksw.palmetto.corpus.lucene.WindowSupportingLuceneCorpusAdapter
- All Implemented Interfaces:
BooleanDocumentSupportingAdapter, CorpusAdapter, WindowSupportingAdapter
public class WindowSupportingLuceneCorpusAdapter
extends LuceneCorpusAdapter
implements WindowSupportingAdapter
Field Summary
Fields (modifier and type, then field):
- protected String docLengthFieldName
- protected int[][] histogram
- static String HISTOGRAM_FILE_SUFFIX
- private static org.slf4j.Logger LOGGER
Fields inherited from class org.aksw.palmetto.corpus.lucene.LuceneCorpusAdapter:
contexts, dirReader, fieldName, reader
Constructor Summary
Constructors (modifier, then constructor):
- protected WindowSupportingLuceneCorpusAdapter(org.apache.lucene.index.DirectoryReader dirReader, org.apache.lucene.index.AtomicReader[] reader, org.apache.lucene.index.AtomicReaderContext[] contexts, String textFieldName, String docLengthFieldName, int[][] histogram)
Method Summary
Methods (modifier and type, then method and description):
- int[][] getDocumentSizeHistogram(): Returns the histogram of the document sizes of the corpus.
- protected void requestDocumentsWithWord(String word, com.carrotsearch.hppc.IntObjectOpenHashMap<com.carrotsearch.hppc.IntArrayList[]> positionsInDocs, com.carrotsearch.hppc.IntIntOpenHashMap docLengths, int wordId, int numberOfWords)
- com.carrotsearch.hppc.IntObjectOpenHashMap<com.carrotsearch.hppc.IntArrayList[]> requestWordPositionsInDocuments(String[] words, com.carrotsearch.hppc.IntIntOpenHashMap docLengths): Returns the positions of the given words inside the corpus.
Methods inherited from class org.aksw.palmetto.corpus.lucene.LuceneCorpusAdapter:
close, create, getDocumentsWithWord, getDocumentsWithWordAsSet, getDocumentsWithWords, getDocumentsWithWordsAsSet, getNumberOfDocuments
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.aksw.palmetto.corpus.CorpusAdapter:
close
Field Details
histogram
protected int[][] histogram
docLengthFieldName
protected String docLengthFieldName
LOGGER
private static final org.slf4j.Logger LOGGER
HISTOGRAM_FILE_SUFFIX
- See Also:
Constant Field Values
Constructor Details
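WindowSupportingLuceneCorpusAdapter
protected WindowSupportingLuceneCorpusAdapter(org.apache.lucene.index.DirectoryReader dirReader, org.apache.lucene.index.AtomicReader[] reader, org.apache.lucene.index.AtomicReaderContext[] contexts, String textFieldName, String docLengthFieldName, int[][] histogram)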
Method Details
-
create
public static WindowSupportingLuceneCorpusAdapter create(String indexPath, String textFieldName, String docLengthFieldName) throws org.apache.lucene.index.CorruptIndexException, IOException
- Throws:
org.apache.lucene.index.CorruptIndexException
IOException
-
getDocumentSizeHistogram
public int[][] getDocumentSizeHistogram()
Description copied from interface: WindowSupportingAdapter
Returns the histogram of the document sizes of the corpus.
- Specified by:
getDocumentSizeHistogram in interface WindowSupportingAdapter
- Returns:
the histogram of the document sizes
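The return type int[][] does not say how the histogram is laid out. The following is a minimal sketch, assuming that each row is a {documentLength, numberOfDocuments} pair (this layout is an assumption and is not confirmed by this page). It shows how such a histogram could be turned into the total number of sliding windows of a given size. The class HistogramWindowSketch and the method countWindows are hypothetical names and are not part of Palmetto.

```java
class HistogramWindowSketch {

    /**
     * Counts the sliding windows of the given size over all documents.
     * ASSUMPTION: histogram[i] = {documentLength, numberOfDocumentsWithThatLength}.
     */
    static long countWindows(int[][] histogram, int windowSize) {
        long total = 0;
        for (int[] row : histogram) {
            int docLength = row[0];
            int docCount = row[1];
            // A document longer than the window yields (length - windowSize + 1)
            // windows; a shorter or equally long document counts as one window.
            long windowsPerDoc = docLength > windowSize ? docLength - windowSize + 1 : 1;
            total += windowsPerDoc * docCount;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical corpus: two documents of length 3, one document of length 10.
        int[][] histogram = { { 3, 2 }, { 10, 1 } };
        System.out.println(countWindows(histogram, 5));
    }
}
```

With a real adapter instance, the array returned by getDocumentSizeHistogram() would take the place of the hand-written histogram, provided the layout assumption holds.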
-
requestWordPositionsInDocuments
public com.carrotsearch.hppc.IntObjectOpenHashMap<com.carrotsearch.hppc.IntArrayList[]> requestWordPositionsInDocuments(String[] words, com.carrotsearch.hppc.IntIntOpenHashMap docLengths)
Description copied from interface: WindowSupportingAdapter
Returns the positions of the given words inside the corpus.
- Specified by:
requestWordPositionsInDocuments in interface WindowSupportingAdapter
- Parameters:
words - the words for which the positions inside the documents should be determined
docLengths - empty int int map in which the document lengths and counts are inserted
- Returns:
the positions of the given words inside the corpus
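The returned map associates each document with an array of position lists. A minimal sketch of how a caller might consume one such entry follows, assuming the array holds one position list per requested word, in the order of the words parameter (an assumption, not stated on this page). To stay self-contained, it uses java.util lists as stand-ins for the HPPC IntArrayList type; the class WindowCooccurrenceSketch and its methods are hypothetical names and are not part of Palmetto.

```java
import java.util.Arrays;
import java.util.List;

class WindowCooccurrenceSketch {

    /** Returns true if at least one position lies inside [start, start + windowSize). */
    static boolean occursInWindow(List<Integer> positions, int start, int windowSize) {
        for (int p : positions) {
            if (p >= start && p < start + windowSize) {
                return true;
            }
        }
        return false;
    }

    /**
     * Counts the sliding windows of one document in which every word occurs.
     * ASSUMPTION: positionsPerWord holds one position list per requested word.
     */
    static int countWindowsWithAllWords(List<List<Integer>> positionsPerWord,
            int docLength, int windowSize) {
        // A document shorter than the window is treated as a single window.
        int lastStart = Math.max(0, docLength - windowSize);
        int count = 0;
        for (int start = 0; start <= lastStart; start++) {
            boolean allPresent = true;
            for (List<Integer> positions : positionsPerWord) {
                if (!occursInWindow(positions, start, windowSize)) {
                    allPresent = false;
                    break;
                }
            }
            if (allPresent) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical document of length 6: word 0 at positions 0 and 4, word 1 at position 2.
        List<List<Integer>> positions = Arrays.asList(Arrays.asList(0, 4), Arrays.asList(2));
        System.out.println(countWindowsWithAllWords(positions, 6, 3));
    }
}
```

The scan is deliberately simple and checks every window start; it is meant to illustrate the data shape, not to be an efficient counting strategy.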
-
requestDocumentsWithWord
protected void requestDocumentsWithWord(String word, com.carrotsearch.hppc.IntObjectOpenHashMap<com.carrotsearch.hppc.IntArrayList[]> positionsInDocs, com.carrotsearch.hppc.IntIntOpenHashMap docLengths, int wordId, int numberOfWords)
-
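This page gives no description for requestDocumentsWithWord. Judging only from the parameter names, it plausibly fills positionsInDocs for a single word, writing into slot wordId of an array of length numberOfWords per document. The sketch below illustrates that reading with java.util collections as stand-ins for the HPPC types. It is a guess from the signature, not the library's implementation; the class PositionSlotSketch and the method addPositions are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PositionSlotSketch {

    /**
     * Stores the positions of one word in slot wordId of the document's array.
     * ASSUMPTION: each document owns an array with one slot per requested word,
     * and slots of words that do not occur in the document stay null.
     */
    @SuppressWarnings({ "unchecked", "rawtypes" })
    static void addPositions(Map<Integer, List<Integer>[]> positionsInDocs, int docId,
            int[] positions, int wordId, int numberOfWords) {
        List<Integer>[] slots = positionsInDocs.get(docId);
        if (slots == null) {
            // First word seen for this document: allocate one slot per word.
            slots = (List<Integer>[]) new List[numberOfWords];
            positionsInDocs.put(docId, slots);
        }
        if (slots[wordId] == null) {
            slots[wordId] = new ArrayList<>();
        }
        for (int p : positions) {
            slots[wordId].add(p);
        }
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>[]> positionsInDocs = new HashMap<>();
        // Hypothetical data: two words, word 0 occurs in document 7, word 1 in documents 7 and 9.
        addPositions(positionsInDocs, 7, new int[] { 1, 5 }, 0, 2);
        addPositions(positionsInDocs, 7, new int[] { 3 }, 1, 2);
        addPositions(positionsInDocs, 9, new int[] { 0 }, 1, 2);
        System.out.println(positionsInDocs.get(7)[0] + " " + positionsInDocs.get(7)[1]);
    }
}
```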