gate.creole.annic.lucene
Class LuceneSearcher

java.lang.Object
  extended by gate.creole.annic.lucene.LuceneSearcher
All Implemented Interfaces:
Searcher

public class LuceneSearcher
extends Object
implements Searcher

This class provides the search functionality for ANNIC.

Author:
niraj

Field Summary
private  List<Pattern> annicPatterns
          Found patterns.
private  String annotationSetToSearchIn
          Annotation set to search in.
 Map<String,List<String>> annotationTypesMap
          Found annotation types in the annic patterns.
private  int contextWindow
          The number of base token annotations to show in the left and right context of the pattern.
private  String corpusToSearchIn
          Corpus to search in.
private  LuceneDataStoreImpl datastore
          Used with freq method for statistics.
private  boolean fwdIterationEnded
          Tells whether the end of the found results has been reached.
private  List<String> indexLocations
          A List of index locations.
private  Hits luceneHits
          Hits returned by Lucene.
private  int luceneSearchThreadIndex
          Tells which thread to use to retrieve results from.
private  List<LuceneSearchThread> luceneSearchThreads
          A query can result in multiple queries.
private  Map<String,Object> parameters
          Search parameters.
private  String query
          The submitted query.
private  Map<String,List<String>> queryTokens
          A Map used for caching query tokens created for a query.
private  boolean success
          Indicates if the search was successful.
private  boolean wasDeleteQuery
          Indicates if the query was to delete certain documents.
 
Constructor Summary
LuceneSearcher()
           
 
Method Summary
 void addQueryTokens(String query, List<String> queryTokens)
          Adds the query tokens for the given query.
 void exportResults(File outputFile)
          Exports the results to the provided file.
 int freq(List<Hit> patternsToSearchIn, String annotationType, boolean inMatchedSpan, boolean inContext)
           
 int freq(List<Hit> patternsToSearchIn, String annotationType, String feature, String value, boolean inMatchedSpan, boolean inContext)
           
 int freq(String corpusToSearchIn, String annotationSetToSearchIn, String annotationType)
           
 int freq(String corpusToSearchIn, String annotationSetToSearchIn, String annotationType, String featureName)
           
 int freq(String corpusToSearchIn, String annotationSetToSearchIn, String annotationType, String featureName, String value)
           
 Map<String,Integer> freqForAllValues(List<Hit> patternsToSearchIn, String annotationType, String feature, boolean inMatchedSpan, boolean inContext)
           
 Map<String,List<String>> getAnnotationTypesMap()
          Gets the map of found annotation types and annotation features.
 Integer getContextWindow()
          Gets the number of base token annotations to show in the context.
 Hit[] getHits()
          Gets the found hits (annic patterns).
 String[] getIndexedAnnotationSetNames()
          Returns the names of the annotation sets that are indexed.
 Map<String,Object> getParameters()
          Gets the search parameters set by user.
 String getQuery()
          Gets the submitted query.
 List<String> getQueryTokens(String query)
          Gets the query tokens for the given query.
 Hit[] next(int numberOfHits)
          Returns the next numberOfHits hits; -1 indicates all.
 boolean search(String query, Map<String,Object> parameters)
          Returns true or false, indicating whether results were found.
 void setLuceneDatastore(LuceneDataStoreImpl datastore)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

indexLocations

private List<String> indexLocations
A List of index locations. It allows searching at multiple locations.


query

private String query
The submitted query.


contextWindow

private int contextWindow
The number of base token annotations to show in the left and right context of the pattern. The default is 5.
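
The effect of a context window can be sketched as follows. This is only an illustration, assuming tokens are held in a plain list and the match is given as a half-open index range; the class ContextWindowSketch and the method withContext are hypothetical and are not part of ANNIC.

```java
import java.util.Arrays;
import java.util.List;

class ContextWindowSketch {
    /**
     * Returns the matched tokens plus up to contextWindow tokens on each side.
     * matchStart is inclusive, matchEnd is exclusive (an assumption of this sketch).
     */
    static List<String> withContext(List<String> tokens, int matchStart,
                                    int matchEnd, int contextWindow) {
        // Clamp both ends so the window never runs past the document.
        int from = Math.max(0, matchStart - contextWindow);
        int to = Math.min(tokens.size(), matchEnd + contextWindow);
        return tokens.subList(from, to);
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h");
        // The match is the single token "d"; two tokens of context on each side.
        System.out.println(withContext(tokens, 3, 4, 2)); // prints [b, c, d, e, f]
    }
}
```

Near the start or end of a document the window is simply truncated, so fewer context tokens are shown on that side.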


annicPatterns

private List<Pattern> annicPatterns
Found patterns.


annotationTypesMap

public Map<String,List<String>> annotationTypesMap
Found annotation types in the annic patterns. The map keeps a record of the found annotation types and the features of each of them.


parameters

private Map<String,Object> parameters
Search parameters.


corpusToSearchIn

private String corpusToSearchIn
Corpus to search in.


annotationSetToSearchIn

private String annotationSetToSearchIn
Annotation set to search in.


luceneHits

private Hits luceneHits
Hits returned by Lucene.


wasDeleteQuery

private boolean wasDeleteQuery
Indicates if the query was to delete certain documents.


luceneSearchThreads

private List<LuceneSearchThread> luceneSearchThreads
A query can result in multiple queries. For example, (A|B)C is converted into two queries: AC and BC. A separate thread is started for each query.
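
Expanding an alternation into separate queries can be sketched as follows, so that (A|B)C yields AC and BC. This is a minimal illustration, assuming a simplified syntax with non-nested parenthesised groups; the class AlternationExpander and the method expand are hypothetical and do not reproduce ANNIC's actual query parser or its threading.

```java
import java.util.ArrayList;
import java.util.List;

class AlternationExpander {
    /** Expands non-nested groups such as (A|B) into one query per alternative. */
    static List<String> expand(String query) {
        List<String> results = new ArrayList<>();
        results.add("");
        int i = 0;
        while (i < query.length()) {
            char c = query.charAt(i);
            if (c == '(') {
                int close = query.indexOf(')', i);
                if (close < 0) {
                    throw new IllegalArgumentException("Unbalanced parenthesis in: " + query);
                }
                // Every partial query so far is combined with every alternative.
                String[] alternatives = query.substring(i + 1, close).split("\\|", -1);
                List<String> next = new ArrayList<>();
                for (String prefix : results) {
                    for (String alternative : alternatives) {
                        next.add(prefix + alternative);
                    }
                }
                results = next;
                i = close + 1;
            } else {
                // A literal character is appended to every partial query.
                for (int j = 0; j < results.size(); j++) {
                    results.set(j, results.get(j) + c);
                }
                i++;
            }
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(expand("(A|B)C")); // prints [AC, BC]
    }
}
```

Each string in the returned list would correspond to one of the queries for which a separate search thread is started.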


success

private boolean success
Indicates if the search was successful.


luceneSearchThreadIndex

private int luceneSearchThreadIndex
Tells which thread to use to retrieve results from.


fwdIterationEnded

private boolean fwdIterationEnded
Tells whether the end of the found results has been reached.


datastore

private LuceneDataStoreImpl datastore
Used with freq method for statistics.


queryTokens

private Map<String,List<String>> queryTokens
A Map used for caching query tokens created for a query.

Constructor Detail

LuceneSearcher

public LuceneSearcher()
Method Detail

next

public Hit[] next(int numberOfHits)
           throws SearchException
Returns the next numberOfHits hits; -1 indicates all.

Specified by:
next in interface Searcher
Returns:
Throws:
SearchException
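
The paging behaviour, including the -1 convention, can be sketched as follows. This is an illustration only: it uses strings in place of Hit objects and a single in-memory list in place of the search threads. The class HitCursor is hypothetical, and returning an empty array once the results are exhausted is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HitCursor {
    private final List<String> hits;
    private int position = 0;

    HitCursor(List<String> hits) {
        this.hits = new ArrayList<>(hits);
    }

    /** Returns up to numberOfHits further hits; -1 returns all remaining hits. */
    String[] next(int numberOfHits) {
        int remaining = hits.size() - position;
        // -1 means "everything that is left"; other negative values yield nothing.
        int count = (numberOfHits == -1)
                ? remaining
                : Math.min(Math.max(numberOfHits, 0), remaining);
        String[] batch = hits.subList(position, position + count).toArray(new String[0]);
        position += count;
        return batch;
    }

    public static void main(String[] args) {
        HitCursor cursor = new HitCursor(Arrays.asList("hit1", "hit2", "hit3"));
        System.out.println(Arrays.toString(cursor.next(2)));  // prints [hit1, hit2]
        System.out.println(Arrays.toString(cursor.next(-1))); // prints [hit3]
    }
}
```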

search

public boolean search(String query,
                      Map<String,Object> parameters)
               throws SearchException
Returns true or false, indicating whether results were found.

Specified by:
search in interface Searcher
Returns:
Throws:
SearchException

getQuery

public String getQuery()
Gets the submitted query.

Specified by:
getQuery in interface Searcher
Returns:

getContextWindow

public Integer getContextWindow()
Gets the number of base token annotations to show in the context.

Returns:

getHits

public Hit[] getHits()
Gets the found hits (annic patterns).

Specified by:
getHits in interface Searcher
Returns:

getAnnotationTypesMap

public Map<String,List<String>> getAnnotationTypesMap()
Gets the map of found annotation types and annotation features. This method must be called only after a call to the getIndexedAnnotationSetNames() method; otherwise it does not guarantee correct results. The results have the following format:

Key: CorpusName;AnnotationSetName;AnnotationType
Value: the respective features

Specified by:
getAnnotationTypesMap in interface Searcher
Returns:
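
Reading the keys of such a map can be sketched as follows. This is a minimal illustration, assuming that none of the three parts contains a semicolon; the class AnnotationTypeKey and the sample corpus, annotation set and feature names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AnnotationTypeKey {
    final String corpusName;
    final String annotationSetName;
    final String annotationType;

    private AnnotationTypeKey(String corpusName, String annotationSetName,
                              String annotationType) {
        this.corpusName = corpusName;
        this.annotationSetName = annotationSetName;
        this.annotationType = annotationType;
    }

    /** Splits a key of the form CorpusName;AnnotationSetName;AnnotationType. */
    static AnnotationTypeKey parse(String key) {
        String[] parts = key.split(";", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected three ';'-separated parts: " + key);
        }
        return new AnnotationTypeKey(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        // Sample data standing in for the map returned by getAnnotationTypesMap().
        Map<String, List<String>> typesMap = new LinkedHashMap<>();
        typesMap.put("corpus1;Key;Person", Arrays.asList("gender", "title"));
        for (Map.Entry<String, List<String>> entry : typesMap.entrySet()) {
            AnnotationTypeKey key = parse(entry.getKey());
            System.out.println(key.annotationType + " in set " + key.annotationSetName
                    + " of " + key.corpusName + " has features " + entry.getValue());
        }
    }
}
```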

getIndexedAnnotationSetNames

public String[] getIndexedAnnotationSetNames()
                                      throws SearchException
Returns the names of the annotation sets that are indexed. Each entry has the following format:

corpusName;annotationSetName

where corpusName is the name of the corpus to which annotationSetName belongs.

Specified by:
getIndexedAnnotationSetNames in interface Searcher
Returns:
Throws:
SearchException
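
Grouping the returned entries by corpus can be sketched as follows. This is only an illustration, assuming the first semicolon separates the corpus name from the annotation set name; the class IndexedSetNames, the method groupByCorpus and the sample entries are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class IndexedSetNames {
    /** Groups entries of the form corpusName;annotationSetName by corpus name. */
    static Map<String, List<String>> groupByCorpus(String[] entries) {
        Map<String, List<String>> byCorpus = new LinkedHashMap<>();
        for (String entry : entries) {
            int separator = entry.indexOf(';');
            if (separator < 0) {
                throw new IllegalArgumentException("Missing ';' in entry: " + entry);
            }
            String corpusName = entry.substring(0, separator);
            String annotationSetName = entry.substring(separator + 1);
            byCorpus.computeIfAbsent(corpusName, k -> new ArrayList<>())
                    .add(annotationSetName);
        }
        return byCorpus;
    }

    public static void main(String[] args) {
        // Sample data standing in for the array returned by getIndexedAnnotationSetNames().
        String[] entries = {"corpus1;Key", "corpus1;Original", "corpus2;Key"};
        System.out.println(groupByCorpus(entries));
        // prints {corpus1=[Key, Original], corpus2=[Key]}
    }
}
```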

getParameters

public Map<String,Object> getParameters()
Gets the search parameters set by user.

Specified by:
getParameters in interface Searcher
Returns:

getQueryTokens

public List<String> getQueryTokens(String query)
Gets the query tokens for the given query.

Parameters:
query -
Returns:

addQueryTokens

public void addQueryTokens(String query,
                           List<String> queryTokens)
Adds the query tokens for the given query.

Parameters:
query -
queryTokens -
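
The caching role of getQueryTokens and addQueryTokens can be sketched as follows. This is a minimal illustration built on a plain HashMap; the class QueryTokenCache is hypothetical, and both the defensive copy on insertion and the null result for an unknown query are assumptions of this sketch, not documented behaviour of LuceneSearcher.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class QueryTokenCache {
    // Maps a query string to the tokens created for it.
    private final Map<String, List<String>> queryTokens = new HashMap<>();

    /** Stores a copy of the tokens so later changes by the caller do not affect the cache. */
    void addQueryTokens(String query, List<String> tokens) {
        queryTokens.put(query, new ArrayList<>(tokens));
    }

    /** Returns the cached tokens, or null if the query has not been seen. */
    List<String> getQueryTokens(String query) {
        return queryTokens.get(query);
    }

    public static void main(String[] args) {
        QueryTokenCache cache = new QueryTokenCache();
        cache.addQueryTokens("(A|B)C", Arrays.asList("A", "B", "C"));
        System.out.println(cache.getQueryTokens("(A|B)C")); // prints [A, B, C]
        System.out.println(cache.getQueryTokens("unseen")); // prints null
    }
}
```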

exportResults

public void exportResults(File outputFile)
Exports the results to the provided file. This method has not been implemented yet.

Specified by:
exportResults in interface Searcher

freq

public int freq(String corpusToSearchIn,
                String annotationSetToSearchIn,
                String annotationType,
                String featureName,
                String value)
         throws SearchException
Specified by:
freq in interface Searcher
Throws:
SearchException
See Also:
StatsCalculator#freq(String, String, String, String, String)

freq

public int freq(String corpusToSearchIn,
                String annotationSetToSearchIn,
                String annotationType)
         throws SearchException
Specified by:
freq in interface Searcher
Throws:
SearchException
See Also:
StatsCalculator#freq(String, String, String, String, String)

freq

public int freq(String corpusToSearchIn,
                String annotationSetToSearchIn,
                String annotationType,
                String featureName)
         throws SearchException
Specified by:
freq in interface Searcher
Throws:
SearchException
See Also:
StatsCalculator#freq(String, String, String, String, String)

freq

public int freq(List<Hit> patternsToSearchIn,
                String annotationType,
                String feature,
                String value,
                boolean inMatchedSpan,
                boolean inContext)
         throws SearchException
Specified by:
freq in interface Searcher
Throws:
SearchException
See Also:
StatsCalculator#freq(List, String, String, String, boolean, boolean)

freq

public int freq(List<Hit> patternsToSearchIn,
                String annotationType,
                boolean inMatchedSpan,
                boolean inContext)
         throws SearchException
Specified by:
freq in interface Searcher
Throws:
SearchException
See Also:
StatsCalculator#freq(List, String, String, String, boolean, boolean)

freqForAllValues

public Map<String,Integer> freqForAllValues(List<Hit> patternsToSearchIn,
                                            String annotationType,
                                            String feature,
                                            boolean inMatchedSpan,
                                            boolean inContext)
                                     throws SearchException
Specified by:
freqForAllValues in interface Searcher
Throws:
SearchException
See Also:
StatsCalculator#freq(List, String, String, boolean, boolean)

setLuceneDatastore

public void setLuceneDatastore(LuceneDataStoreImpl datastore)