Package org.aksw.palmetto.corpus
Interface WindowSupportingAdapter
- All Superinterfaces:
CorpusAdapter
- All Known Implementing Classes:
WindowSupportingLuceneCorpusAdapter
This adapter supports window-based probability estimation methods.
- Author:
- m.roeder
Method Summary
Modifier and Type | Method | Description
int[][] | getDocumentSizeHistogram() | Returns the histogram of the document sizes of the corpus.
com.carrotsearch.hppc.IntObjectOpenHashMap<com.carrotsearch.hppc.IntArrayList[]> | requestWordPositionsInDocuments(String[] words, com.carrotsearch.hppc.IntIntOpenHashMap docLengths) | Returns the positions of the given words inside the corpus.
Methods inherited from interface org.aksw.palmetto.corpus.CorpusAdapter
close
Method Details
getDocumentSizeHistogram
int[][] getDocumentSizeHistogram()
Returns the histogram of the document sizes of the corpus.
- Returns:
- the histogram of the document sizes
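A histogram of document sizes is typically what a window-based estimator needs to work out how many sliding windows the corpus contains. The following is a minimal sketch of that calculation, not code from Palmetto. It assumes that each row of the returned array is a pair {documentLength, numberOfDocuments}, which this page does not state, and the class and method names (WindowCountSketch, countWindows) are hypothetical. Documents shorter than the window are skipped here; an implementation might instead count each of them as a single window.

```java
final class WindowCountSketch {

    /**
     * Counts the sliding windows of the given size over all documents.
     *
     * Assumption (not confirmed by the API documentation): every row of the
     * histogram is {documentLength, numberOfDocuments}.
     */
    static long countWindows(int[][] histogram, int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be at least 1");
        }
        long total = 0;
        for (int[] row : histogram) {
            int docLength = row[0];
            int docCount = row[1];
            // A document of length n contains n - windowSize + 1 windows.
            // Shorter documents contribute nothing in this sketch.
            if (docLength >= windowSize) {
                total += (long) docCount * (docLength - windowSize + 1);
            }
        }
        return total;
    }
}
```

With a real adapter, the first argument would be the array returned by getDocumentSizeHistogram(), provided the row layout assumed above actually holds.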
requestWordPositionsInDocuments
com.carrotsearch.hppc.IntObjectOpenHashMap<com.carrotsearch.hppc.IntArrayList[]> requestWordPositionsInDocuments(String[] words, com.carrotsearch.hppc.IntIntOpenHashMap docLengths)
Returns the positions of the given words inside the corpus.
- Parameters:
words - the words for which the positions inside the documents should be determined
docLengths - an empty int-to-int map into which the document lengths and counts are inserted
- Returns:
- the positions of the given words inside the corpus
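Word positions are the raw material for window-based co-occurrence checks. The sketch below illustrates one such check for a single document: whether every word occurs at least once inside some window of consecutive token positions. It is an illustration only and not Palmetto's implementation. To stay self-contained it uses plain int arrays in place of the HPPC IntArrayList[] value, and the names CooccurrenceSketch and cooccurInWindow are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class CooccurrenceSketch {

    /**
     * Returns true if some window of windowSize consecutive token positions
     * contains at least one occurrence of every word.
     *
     * positionsPerWord[i] holds the token positions of word i in one document
     * (a plain-array stand-in for the HPPC lists, used for illustration).
     */
    static boolean cooccurInWindow(int[][] positionsPerWord, int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be at least 1");
        }
        // Flatten into (position, wordIndex) events and sort by position.
        List<int[]> events = new ArrayList<>();
        for (int w = 0; w < positionsPerWord.length; w++) {
            for (int pos : positionsPerWord[w]) {
                events.add(new int[] {pos, w});
            }
        }
        events.sort((a, b) -> Integer.compare(a[0], b[0]));

        int[] seen = new int[positionsPerWord.length]; // occurrences per word in the current span
        int covered = 0;                               // number of distinct words in the current span
        int left = 0;
        for (int right = 0; right < events.size(); right++) {
            if (seen[events.get(right)[1]]++ == 0) {
                covered++;
            }
            // Drop events from the left until the span fits into one window.
            while (events.get(right)[0] - events.get(left)[0] >= windowSize) {
                if (--seen[events.get(left)[1]] == 0) {
                    covered--;
                }
                left++;
            }
            if (covered == positionsPerWord.length) {
                return true;
            }
        }
        return false;
    }
}
```

To apply this to the value returned by requestWordPositionsInDocuments, the position lists for one document would first have to be copied into int arrays, one per requested word.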