java.lang.Object
  gate.creole.annic.lucene.LuceneSearchThread
public class LuceneSearchThread
A boolean query is translated into one or more AND-normalized queries. For example, (A|B)C is translated into AC and BC. One instance of LuceneSearchThread is created for each such query. Each query is issued separately, and its results are submitted to the main instance of LuceneSearch.
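The AND normalization described above can be illustrated with a minimal sketch. It assumes the query has already been split into positions, each holding its list of alternatives; the class name `AndNormalizationSketch` and the method `normalize` are hypothetical and are not part of the ANNIC API, and the real query parser may work differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the ANNIC QueryParser.
class AndNormalizationSketch {

    /**
     * Expands a sequence of alternative groups into every AND-only sequence.
     * Each inner list holds the alternatives allowed at one position.
     */
    static List<List<String>> normalize(List<List<String>> groups) {
        List<List<String>> results = new ArrayList<>();
        results.add(new ArrayList<>()); // start with a single empty sequence
        for (List<String> alternatives : groups) {
            List<List<String>> extended = new ArrayList<>();
            for (List<String> prefix : results) {
                for (String term : alternatives) {
                    // copy the prefix and append one alternative
                    List<String> sequence = new ArrayList<>(prefix);
                    sequence.add(term);
                    extended.add(sequence);
                }
            }
            results = extended;
        }
        return results;
    }

    public static void main(String[] args) {
        // (A|B)C: two alternatives at the first position, one term at the second
        List<List<String>> query = List.of(List.of("A", "B"), List.of("C"));
        for (List<String> sequence : normalize(query)) {
            System.out.println(String.join(" ", sequence)); // prints "A C", then "B C"
        }
    }
}
```

Under this reading, each returned sequence would correspond to one query handled by its own search thread.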
| Nested Class Summary | |
|---|---|
| private class | LuceneSearchThread.PatternResult - Inner class to store pattern results. |
| private class | LuceneSearchThread.QueryItem - Inner class to store a query item. |
| Field Summary | |
|---|---|
| private String | baseTokenAnnotationType - BaseTokenAnnotationType. |
| private int | contextWindow - Number of base token annotations to be used in context. |
| private static boolean | DEBUG - Debug flag. |
| boolean | finished - Indicates whether the search process is finished. |
| private List | ftp - First term positions. |
| private int | ftpIndex - First term position index. |
| private boolean | fwdIterationEnded - Indicates whether we have reached the end of the search results. |
| private String | indexLocation - The location of the index. |
| private LuceneSearcher | luceneSearcher - Instance of the LuceneSearcher. |
| private String | query - Query. |
| private int | queryItemIndex - QueryItemIndex. |
| private QueryParser | queryParser - Instance of a QueryParser. |
| private Map<String,List<LuceneSearchThread.QueryItem>> | searchResultInfoMap - A Map that holds information about search results. |
| private int | serializedFileIDIndex - Index of the serializedFileID we are currently searching for. |
| private String | serializedFileIDInUse - The ID of the serialized file that was visited last. |
| private List<String> | serializedFilesIDsList - List of serialized file IDs retrieved from the Lucene index. |
| private boolean | success - Indicates whether the query succeeded. |
| private List<Token> | tokenStreamInUse - The token stream currently in use. |
| Constructor Summary | |
|---|---|
| LuceneSearchThread() | |
| Method Summary | |
|---|---|
| private boolean | areTheyEqual(List ftp, List ftp1, int qType) - Checks if two first term positions are identical. |
| private List<Pattern> | createAnnicPatterns(LuceneQueryResult aResult) - Converts each pattern found in the given LuceneQueryResult into an ANNIC pattern. |
| private String | getCompatibleName(String name) - Given a file name, replaces all invalid characters with '_'. |
| private LuceneSearchThread.PatternResult | getPatternResult(List<Token> subTokens, String annotationSetName, int patLen, int qType, int patWindow, String query, String baseTokenAnnotationType, int numberOfResultsToFetch) - Takes the token stream as text, the first term positions, the pattern length, the query type and the pattern window, and returns the GateAnnotations as an array for each pattern, with left and right context. |
| private LuceneSearchThread.PatternResult | getPatternResult(List<Token> subTokens, String annotationSetName, int patLen, int patWindow, String query, String baseTokenAnnotationType, int noOfResultsToFetch) - Returns the valid patterns and their respective GateAnnotations. |
| String | getQuery() - Gets the query. |
| private List<Token> | getTokenStreamFromDisk(String indexDirectory, String documentFolder, String documentID) - Looks on the disk to find the token stream. |
| private List<Pattern> | locatePatterns(String docID, String annotationSetName, List<List<PatternAnnotation>> gateAnnotations, List firstTermPositions, List<Integer> patternLength, String queryString) - Locates the valid patterns in the token stream and discards the invalid first term positions returned by the Lucene searcher. |
| List<Pattern> | next(int numberOfResults) - Returns a list containing instances of Pattern. |
| private String | removeUnitNumber(String documentID) - Each index unit is first converted into a separate Lucene document. |
| boolean | search(String query, int patternWindow, String indexLocation, String corpusToSearchIn, String annotationSetToSearchIn, LuceneSearcher luceneSearcher) - Collects the necessary information from Lucene and uses it when the next method is called. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
private static boolean DEBUG
private int contextWindow
private String indexLocation
private QueryParser queryParser
private String baseTokenAnnotationType
private LuceneSearcher luceneSearcher
public boolean finished
private int serializedFileIDIndex
private int queryItemIndex
private List<String> serializedFilesIDsList
private Map<String,List<LuceneSearchThread.QueryItem>> searchResultInfoMap
private int ftpIndex
private boolean success
private boolean fwdIterationEnded
private String serializedFileIDInUse
private List<Token> tokenStreamInUse
private String query
private List ftp
| Constructor Detail |
|---|
public LuceneSearchThread()
| Method Detail |
|---|
private String getCompatibleName(String name)
name -
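As an illustration of the kind of sanitising getCompatibleName performs, here is a minimal sketch. The documentation does not say which characters count as invalid, so the rule used here (anything other than ASCII letters, digits, '.' and '-') is an assumption, and the class `CompatibleNameSketch` is hypothetical rather than the GATE implementation.

```java
// Hypothetical illustration only; the actual set of invalid characters is not documented here.
class CompatibleNameSketch {

    /**
     * Replaces every character outside the assumed "valid" set
     * (ASCII letters, digits, '.' and '-') with an underscore.
     */
    static String getCompatibleName(String name) {
        return name.replaceAll("[^A-Za-z0-9.-]", "_");
    }

    public static void main(String[] args) {
        // spaces, '/' and ':' fall outside the assumed valid set
        System.out.println(getCompatibleName("my corpus/doc:1.xml")); // prints my_corpus_doc_1.xml
    }
}
```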
public boolean search(String query,
int patternWindow,
String indexLocation,
String corpusToSearchIn,
String annotationSetToSearchIn,
LuceneSearcher luceneSearcher)
throws SearchException
limit - the number of patterns to retrieve
query - query supplied by the user
patternWindow - number of tokens to include as left and right context
indexLocation - location of the index the searcher should search in
luceneSearcher - the LuceneSearcher instance from which this search thread is invoked
SearchException
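The summary notes that search collects information which is then used when next is called. The sketch below shows that two-phase pattern with a hypothetical stand-in class, `StandInSearchThread`, operating on plain strings. It is not the GATE class: the single-argument `search`, the substring matching, and the way `finished` is updated are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not gate.creole.annic.lucene.LuceneSearchThread.
class PagedSearchSketch {

    static class StandInSearchThread {
        private final List<String> corpus;
        private List<String> hits = new ArrayList<>();
        private int cursor = 0;
        boolean finished = false;

        StandInSearchThread(List<String> corpus) {
            this.corpus = corpus;
        }

        /** Collects all matching entries once; returns true if at least one was found. */
        boolean search(String query) {
            hits = new ArrayList<>();
            cursor = 0;
            for (String doc : corpus) {
                if (doc.contains(query)) {
                    hits.add(doc);
                }
            }
            finished = hits.isEmpty();
            return !hits.isEmpty();
        }

        /** Returns up to numberOfResults hits, continuing where the previous call stopped. */
        List<String> next(int numberOfResults) {
            int end = Math.min(cursor + numberOfResults, hits.size());
            List<String> page = new ArrayList<>(hits.subList(cursor, end));
            cursor = end;
            finished = cursor >= hits.size();
            return page;
        }
    }

    public static void main(String[] args) {
        StandInSearchThread thread = new StandInSearchThread(
                List.of("a cat", "a dog", "cat nap", "the cat sat"));
        if (thread.search("cat")) {
            while (!thread.finished) {
                System.out.println(thread.next(2)); // two hits per page
            }
        }
    }
}
```

Separating the one-off collection step from the paged retrieval lets a caller fetch only as many results as it needs.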
public List<Pattern> next(int numberOfResults)
throws Exception
numberOfResults - the number of results to fetch
Exception
private List<Pattern> createAnnicPatterns(LuceneQueryResult aResult)
aResult -
private List<Pattern> locatePatterns(String docID,
String annotationSetName,
List<List<PatternAnnotation>> gateAnnotations,
List firstTermPositions,
List<Integer> patternLength,
String queryString)
docID -
gateAnnotations -
firstTermPositions -
patternLength -
queryString -
private String removeUnitNumber(String documentID)
documentID -
private List<Token> getTokenStreamFromDisk(String indexDirectory,
String documentFolder,
String documentID)
throws Exception
location - String
Exception
private LuceneSearchThread.PatternResult getPatternResult(List<Token> subTokens,
String annotationSetName,
int patLen,
int qType,
int patWindow,
String query,
String baseTokenAnnotationType,
int numberOfResultsToFetch)
subTokens -
ftp -
patLen -
qType -
patWindow -
query -
baseTokenAnnotationType -
private LuceneSearchThread.PatternResult getPatternResult(List<Token> subTokens,
String annotationSetName,
int patLen,
int patWindow,
String query,
String baseTokenAnnotationType,
int noOfResultsToFetch)
subTokens - ArrayList
ftp - ArrayList
patLen - int
patWindow - int
query - String
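The idea of returning a pattern together with its left and right context can be sketched as follows. This is a simplified illustration over plain strings, assuming the match start index is already known; `ContextWindowSketch` and `withContext` are hypothetical names, and the real getPatternResult works on Token and annotation objects rather than strings.

```java
import java.util.List;

// Hypothetical illustration only; not the GATE getPatternResult implementation.
class ContextWindowSketch {

    /**
     * Returns three sublists: left context, matched pattern, right context.
     * Assumes 0 <= start and start + patLen <= tokens.size().
     * The context is clipped at the boundaries of the token stream.
     */
    static List<List<String>> withContext(List<String> tokens, int start, int patLen, int patWindow) {
        int end = start + patLen;                              // exclusive end of the pattern
        int left = Math.max(0, start - patWindow);             // clip at the stream start
        int right = Math.min(tokens.size(), end + patWindow);  // clip at the stream end
        return List.of(
                tokens.subList(left, start),
                tokens.subList(start, end),
                tokens.subList(end, right));
    }

    public static void main(String[] args) {
        List<String> tokens = List.of(
                "The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog");
        // pattern "fox jumps" starts at index 3, with a window of 2 tokens on each side
        System.out.println(withContext(tokens, 3, 2, 2));
        // prints [[quick, brown], [fox, jumps], [over, the]]
    }
}
```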
private boolean areTheyEqual(List ftp,
List ftp1,
int qType)
ftp -
ftp1 -
qType -
public String getQuery()