gate.corpora
Class TikaFormat
java.lang.Object
gate.util.AbstractFeatureBearer
gate.creole.AbstractResource
gate.creole.AbstractLanguageResource
gate.DocumentFormat
gate.corpora.TikaFormat
- All Implemented Interfaces:
- LanguageResource, Resource, FeatureBearer, NameBearer, Serializable
@CreoleResource(name="Apache Tika Document Format",
isPrivate=true,
autoinstances=)
public class TikaFormat
extends DocumentFormat
- See Also:
- Serialized Form
| Methods inherited from class gate.DocumentFormat |
addStatusListener, areEqual, decideBetweenThreeMimeTypes, decideBetweenTwoMimeTypes, fireStatusChanged, getDocumentFormat, getDocumentFormat, getDocumentFormat, getElement2StringMap, getFeatures, getMarkupElementsMap, getMimeType, getMimeTypeForString, getShouldCollectRepositioning, getSupportedFileSuffixes, guessTypeUsingMagicNumbers, removeStatusListener, runMagicNumbers, setElement2StringMap, setFeatures, setMarkupElementsMap, setMimeType, setShouldCollectRepositioning, unpackMarkup |
| Methods inherited from class gate.creole.AbstractResource |
checkParameterValues, getBeanInfo, getName, getParameterValue, getParameterValue, removeResourceListeners, setName, setParameterValue, setParameterValue, setParameterValues, setParameterValues, setResourceListeners |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
TikaFormat
public TikaFormat()
init
public Resource init()
throws ResourceInstantiationException
- Description copied from class:
AbstractResource
- Initialise this resource, and return it.
- Specified by:
init in interface Resource
- Overrides:
init in class AbstractResource
- Throws:
ResourceInstantiationException
supportsRepositioning
public Boolean supportsRepositioning()
- Description copied from class:
DocumentFormat
- Returns true if this document format can collect repositioning
information during the unpack phase.
Subclasses of DocumentFormat that can collect repositioning
information should override this method.
- Overrides:
supportsRepositioning in class DocumentFormat
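The override pattern described above can be sketched as follows. This is a minimal illustration only: SketchFormat, RepositioningSketchFormat, PlainSketchFormat and RepositioningCheck are hypothetical stand-ins, not GATE classes, and the base-class default of false is an assumption made for the sketch.

```java
// Hypothetical stand-in for a document format base class (NOT gate.DocumentFormat).
abstract class SketchFormat {
    // Assumed default for this sketch: the format cannot collect repositioning information.
    public Boolean supportsRepositioning() {
        return Boolean.FALSE;
    }
}

// A format that can collect repositioning information overrides the method.
class RepositioningSketchFormat extends SketchFormat {
    @Override
    public Boolean supportsRepositioning() {
        return Boolean.TRUE;
    }
}

// A format that keeps the inherited default.
class PlainSketchFormat extends SketchFormat {
}

// Caller-side guard: query the capability before asking for repositioning data.
class RepositioningCheck {
    static String describe(SketchFormat format) {
        return format.supportsRepositioning()
                ? "collect repositioning info"
                : "skip repositioning info";
    }
}
```

A caller would typically consult supportsRepositioning() before deciding which unpackMarkup overload to use; whether TikaFormat itself returns true is not stated on this page.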
unpackMarkup
public void unpackMarkup(Document doc)
throws DocumentFormatException
- Description copied from class:
DocumentFormat
- Unpack the markup in the document. This converts markup from the
native format (e.g. XML, RTF) into annotations in GATE format.
Uses the markupElementsMap to determine which elements to convert, and
what annotation type names to use.
- Specified by:
unpackMarkup in class DocumentFormat
- Throws:
DocumentFormatException
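The role of markupElementsMap described above can be sketched in isolation. The class MarkupMapSketch and its methods are hypothetical and do not call GATE. The sketch assumes that a null map means every element is converted under its own name, and that a non-null map restricts conversion to the elements it lists; check this against the DocumentFormat documentation before relying on it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of element-to-annotation-type mapping (not GATE code).
class MarkupMapSketch {

    // Returns the annotation type for an element, or null if the element is skipped.
    static String annotationTypeFor(String element, Map<String, String> markupElementsMap) {
        if (markupElementsMap == null) {
            // Assumption: with no map, the element name doubles as the annotation type.
            return element;
        }
        // Assumption: with a map, unmapped elements are not converted.
        return markupElementsMap.get(element);
    }

    // Converts a sequence of element names into the annotation types that would be created.
    static List<String> convert(List<String> elements, Map<String, String> markupElementsMap) {
        List<String> annotationTypes = new ArrayList<>();
        for (String element : elements) {
            String type = annotationTypeFor(element, markupElementsMap);
            if (type != null) {
                annotationTypes.add(type);
            }
        }
        return annotationTypes;
    }
}
```

For example, with a map of p to paragraph and h1 to heading, the elements p, b, h1 would yield the annotation types paragraph and heading under these assumptions, and b would be dropped.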
unpackMarkup
public void unpackMarkup(Document doc,
RepositioningInfo repInfo,
RepositioningInfo ampCodingInfo)
throws DocumentFormatException
- Specified by:
unpackMarkup in class DocumentFormat
- Throws:
DocumentFormatException