gate.creole.tokeniser
Class DefaultTokeniser
java.lang.Object
gate.util.AbstractFeatureBearer
gate.creole.AbstractResource
gate.creole.AbstractProcessingResource
gate.creole.AbstractLanguageAnalyser
gate.creole.tokeniser.DefaultTokeniser
- All Implemented Interfaces:
- ANNIEConstants, Executable, LanguageAnalyser, ProcessingResource, Resource, Benchmarkable, FeatureBearer, NameBearer, Serializable
public class DefaultTokeniser
- extends AbstractLanguageAnalyser
- implements Benchmarkable
A composed tokeniser containing a SimpleTokeniser and a
Transducer.
The simple tokeniser tokenises the document and the transducer processes its
output.
- See Also:
- Serialized Form
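The two-stage arrangement described above (a simple tokeniser whose output is then post-processed by a transducer) can be illustrated with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in: `ComposedTokeniserSketch`, `simpleTokenise`, `postProcess` and the suffix list are not GATE classes or rules, and the real `SimpleTokeniser` and `Transducer` operate on document annotations rather than on lists of strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a composed tokeniser; none of these names are GATE API.
final class ComposedTokeniserSketch {

    // Illustrative suffixes that stage 2 re-attaches to a preceding apostrophe (an assumption).
    private static final Set<String> SUFFIXES =
            new HashSet<>(Arrays.asList("s", "ll", "re", "ve", "d", "m"));

    /** Stage 1 stand-in: runs of letters/digits become tokens, each punctuation char is its own token. */
    static List<String> simpleTokenise(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                word.append(c);
                continue;
            }
            // Any non-alphanumeric character ends the current word.
            if (word.length() > 0) {
                tokens.add(word.toString());
                word.setLength(0);
            }
            // Whitespace is dropped; other characters are emitted as single tokens.
            if (!Character.isWhitespace(c)) {
                tokens.add(String.valueOf(c));
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    /** Stage 2 stand-in: merges an apostrophe with a following known suffix into one token. */
    static List<String> postProcess(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);
            boolean mergeable = token.equals("'")
                    && i + 1 < tokens.size()
                    && SUFFIXES.contains(tokens.get(i + 1));
            if (mergeable) {
                out.add("'" + tokens.get(i + 1));
                i++; // skip the suffix that was just merged
            } else {
                out.add(token);
            }
        }
        return out;
    }

    /** The composition: stage 2 consumes the output of stage 1. */
    static List<String> tokenise(String text) {
        return postProcess(simpleTokenise(text));
    }

    public static void main(String[] args) {
        System.out.println(tokenise("It's   here.")); // prints [It, 's, here, .]
    }
}
```

The point of the sketch is only the ordering: the second stage never sees raw text, it sees what the first stage produced.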
| Fields inherited from interface gate.creole.ANNIEConstants |
ANNOTATION_COREF_FEATURE_NAME, DATE_ANNOTATION_TYPE, DATE_POSTED_ANNOTATION_TYPE, DEFAULT_FILE, DOCUMENT_COREF_FEATURE_NAME, JOB_ID_ANNOTATION_TYPE, LOCATION_ANNOTATION_TYPE, LOOKUP_ANNOTATION_TYPE, LOOKUP_CLASS_FEATURE_NAME, LOOKUP_INSTANCE_FEATURE_NAME, LOOKUP_LANGUAGE_FEATURE_NAME, LOOKUP_MAJOR_TYPE_FEATURE_NAME, LOOKUP_MINOR_TYPE_FEATURE_NAME, LOOKUP_ONTOLOGY_FEATURE_NAME, MONEY_ANNOTATION_TYPE, ORGANIZATION_ANNOTATION_TYPE, PERSON_ANNOTATION_TYPE, PERSON_GENDER_FEATURE_NAME, PLUGIN_DIR, PR_NAMES, SENTENCE_ANNOTATION_TYPE, SPACE_TOKEN_ANNOTATION_TYPE, TOKEN_ANNOTATION_TYPE, TOKEN_CATEGORY_FEATURE_NAME, TOKEN_KIND_FEATURE_NAME, TOKEN_LENGTH_FEATURE_NAME, TOKEN_ORTH_FEATURE_NAME, TOKEN_STRING_FEATURE_NAME |
| Methods inherited from class gate.creole.AbstractResource |
checkParameterValues, getBeanInfo, getName, getParameterValue, getParameterValue, removeResourceListeners, setName, setParameterValue, setParameterValue, setParameterValues, setParameterValues, setResourceListeners |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
DEF_TOK_DOCUMENT_PARAMETER_NAME
public static final String DEF_TOK_DOCUMENT_PARAMETER_NAME
- See Also:
- Constant Field Values
DEF_TOK_ANNOT_SET_PARAMETER_NAME
public static final String DEF_TOK_ANNOT_SET_PARAMETER_NAME
- See Also:
- Constant Field Values
DEF_TOK_TOKRULES_URL_PARAMETER_NAME
public static final String DEF_TOK_TOKRULES_URL_PARAMETER_NAME
- See Also:
- Constant Field Values
DEF_TOK_GRAMRULES_URL_PARAMETER_NAME
public static final String DEF_TOK_GRAMRULES_URL_PARAMETER_NAME
- See Also:
- Constant Field Values
DEF_TOK_ENCODING_PARAMETER_NAME
public static final String DEF_TOK_ENCODING_PARAMETER_NAME
- See Also:
- Constant Field Values
tokeniser
protected SimpleTokeniser tokeniser
- the simple tokeniser used for tokenisation
transducer
protected Transducer transducer
- the transducer used for post-processing
DefaultTokeniser
public DefaultTokeniser()
init
public Resource init()
throws ResourceInstantiationException
- Initialise this resource, and return it.
- Specified by:
init in interface Resource
- Overrides:
init in class AbstractProcessingResource
- Throws:
ResourceInstantiationException
cleanup
public void cleanup()
- Description copied from class:
AbstractProcessingResource
- Should clear all internal data of the resource. Currently does nothing.
- Specified by:
cleanup in interface Resource
- Overrides:
cleanup in class AbstractProcessingResource
execute
public void execute()
throws ExecutionException
- Description copied from class:
AbstractProcessingResource
- Runs the resource. Subclasses are expected to override this
method, so the default implementation signals an
exception.
- Specified by:
execute in interface Executable
- Overrides:
execute in class AbstractProcessingResource
- Throws:
ExecutionException
interrupt
public void interrupt()
- Notifies all the PRs in this controller that they should stop their
execution as soon as possible.
- Specified by:
interrupt in interface Executable
- Overrides:
interrupt in class AbstractProcessingResource
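As a rough illustration of what "stop as soon as possible" usually means for a composed resource, the sketch below forwards an interrupt request to two sub-stages, each of which checks a flag between units of work. This is an assumption about the general pattern, not a description of how DefaultTokeniser implements interrupt(); `InterruptForwardingSketch`, `Stage` and `Composite` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of cooperative interruption; not GATE code.
final class InterruptForwardingSketch {

    /** Stand-in for one processing stage that polls an interrupt flag. */
    static final class Stage {
        private final AtomicBoolean interrupted = new AtomicBoolean(false);

        /** Requests that this stage stop; it takes effect at the next flag check. */
        void interrupt() {
            interrupted.set(true);
        }

        boolean isInterrupted() {
            return interrupted.get();
        }

        /** Processes up to {@code units} work items and returns how many were completed. */
        int run(int units) {
            int done = 0;
            while (done < units && !isInterrupted()) {
                done++; // one unit of (simulated) work
            }
            return done;
        }
    }

    /** Stand-in for a composed resource that owns two stages. */
    static final class Composite {
        final Stage first = new Stage();
        final Stage second = new Stage();

        /** Forwards the request so that whichever stage is running can stop. */
        void interrupt() {
            first.interrupt();
            second.interrupt();
        }
    }
}
```

The request is cooperative: a stage that never checks its flag would not stop, which is why the description says "as soon as possible" rather than "immediately".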
setTokeniserRulesURL
public void setTokeniserRulesURL(URL tokeniserRulesURL)
getTokeniserRulesURL
public URL getTokeniserRulesURL()
setEncoding
public void setEncoding(String encoding)
getEncoding
public String getEncoding()
setTransducerGrammarURL
public void setTransducerGrammarURL(URL transducerGrammarURL)
getTransducerGrammarURL
public URL getTransducerGrammarURL()
setAnnotationSetName
public void setAnnotationSetName(String annotationSetName)
getAnnotationSetName
public String getAnnotationSetName()
setBenchmarkId
public void setBenchmarkId(String benchmarkId)
- Description copied from interface:
Benchmarkable
- This method sets the benchmarkID for this resource. The resource
must use this as the prefix for any sub-events it logs.
- Specified by:
setBenchmarkId in interface Benchmarkable
- Parameters:
benchmarkId - the benchmark ID, which must not contain spaces
because the space is already used as a separator in the log. You can use
Benchmark.createBenchmarkId(String, String) to create it.
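The no-spaces constraint can be made concrete with a small, self-contained sketch. It does not reproduce Benchmark.createBenchmarkId(String, String); `BenchmarkIdSketch`, `makeId` and `isValid` are hypothetical helpers, and both the underscore replacement and the "." separator between parent and child are assumptions chosen for illustration.

```java
// Hypothetical helpers for building space-free benchmark IDs; not GATE code.
final class BenchmarkIdSketch {

    /**
     * Builds an ID from an optional parent ID and a resource name.
     * Whitespace runs in the name become "_" and the parts are joined with "."
     * (both conventions are assumptions made for this sketch).
     */
    static String makeId(String parentId, String resourceName) {
        String safeName = resourceName.trim().replaceAll("\\s+", "_");
        if (parentId == null || parentId.isEmpty()) {
            return safeName;
        }
        return parentId + "." + safeName;
    }

    /** Returns true if the ID is non-empty and contains no whitespace. */
    static boolean isValid(String id) {
        return id != null
                && !id.isEmpty()
                && id.chars().noneMatch(Character::isWhitespace);
    }

    public static void main(String[] args) {
        String id = makeId("pipeline", "Default Tokeniser");
        System.out.println(id);          // prints pipeline.Default_Tokeniser
        System.out.println(isValid(id)); // prints true
    }
}
```

An ID that passes a check like `isValid` can be written to a space-separated log line without being split into two fields.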
getBenchmarkId
public String getBenchmarkId()
- Description copied from interface:
Benchmarkable
- Returns the benchmark ID of this resource.
- Specified by:
getBenchmarkId in interface Benchmarkable
- Returns: