java.lang.Object
  gate.corpora.DocumentXmlUtils
public class DocumentXmlUtils
This class contains useful static methods for working with the GATE XML
format. Many of the methods in this class were originally in DocumentImpl, but as they are not specific to any one implementation of the
Document interface they have been moved here.
| Field Summary | |
|---|---|
| static int | DOC_SIZE_MULTIPLICATION_FACTOR: This field is used when creating StringBuffers for toXml() methods. |
| static Map | entitiesMap: A map initialized in init() containing the entities that need to be replaced in strings. |
| Constructor Summary | |
|---|---|
| DocumentXmlUtils() | |
| Method Summary | |
|---|---|
| static void | annotationSetToXml(AnnotationSet anAnnotationSet, StringBuffer buffer): This method saves an AnnotationSet as XML. |
| static void | annotationSetToXml(AnnotationSet anAnnotationSet, String annotationSetNameToUse, StringBuffer buffer): This method saves an AnnotationSet as XML. |
| static void | buildEntityMapFromString(String aScanString, TreeMap aMapToFill): This method takes aScanString and searches it for those chars from entitiesMap that appear in the string. |
| static StringBuffer | combinedNormalisation(String inputString): Combines replaceCharsWithEntities and filterNonXmlChars in a single method. |
| static StringBuffer | featuresToXml(FeatureMap aFeatureMap, Map normalizedFeatureNames): This method saves a FeatureMap as XML elements. |
| static StringBuffer | filterNonXmlChars(StringBuffer aStrBuffer): This method filters out any non-XML chars (see http://www.w3c.org/TR/2000/REC-xml-20001006#charsets). All non-XML chars are replaced with 0x20 (the space char), which ensures that there won't be any problems the next time the document is loaded. |
| static boolean | isXmlChar(char ch): This method decides whether a char is a valid XML char. |
| static StringBuffer | replaceCharsWithEntities(String anInputString): This method replaces every char that appears in anInputString and is also in entitiesMap with its corresponding entity. |
| static String | textWithNodes(TextualDocument doc, String aText): Returns the document's text interspersed with <Node> elements at all points where the document has an annotation beginning or ending. |
| static String | toXml(TextualDocument doc): Returns a GateXml document, a custom XML format for which there is a reader inside GATE called gate.xml.GateFormatXmlHandler. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final int DOC_SIZE_MULTIPLICATION_FACTOR
public static Map entitiesMap
| Constructor Detail |
|---|
public DocumentXmlUtils()
| Method Detail |
|---|
public static String toXml(TextualDocument doc)
doc - the document to serialize.
public static StringBuffer featuresToXml(FeatureMap aFeatureMap,
Map normalizedFeatureNames)
aFeatureMap - the feature map that has to be saved as XML.
public static StringBuffer combinedNormalisation(String inputString)
public static StringBuffer filterNonXmlChars(StringBuffer aStrBuffer)
aStrBuffer - represents the input String that is filtered. If aStrBuffer is
null then an empty string will be returned.
public static boolean isXmlChar(char ch)
ch - the char to be tested
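The character test and the space substitution described for isXmlChar and filterNonXmlChars can be sketched without GATE on the classpath. The following is a minimal standalone illustration based on the Char production of XML 1.0, not the library's actual source. The class name XmlCharFilterSketch is hypothetical, the sketch returns a StringBuilder rather than a StringBuffer, and it treats lone surrogate chars as non-XML, which is an assumption about how a char-by-char check behaves.

```java
// Hypothetical standalone sketch; not the GATE implementation.
final class XmlCharFilterSketch {

    /**
     * Returns true if ch falls in one of the Basic Multilingual Plane ranges
     * of the XML 1.0 Char production: #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD.
     * Surrogate halves are rejected (an assumption of this sketch).
     */
    static boolean isXmlChar(char ch) {
        return ch == 0x9 || ch == 0xA || ch == 0xD
                || (ch >= 0x20 && ch <= 0xD7FF)
                || (ch >= 0xE000 && ch <= 0xFFFD);
    }

    /**
     * Copies the input, replacing every non-XML char with a space (0x20).
     * A null input yields an empty result, mirroring the documented behaviour.
     */
    static StringBuilder filterNonXmlChars(CharSequence input) {
        StringBuilder out = new StringBuilder();
        if (input == null) {
            return out;
        }
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            out.append(isXmlChar(ch) ? ch : ' ');
        }
        return out;
    }

    public static void main(String[] args) {
        // NUL and vertical tab are not valid XML 1.0 chars, so both become spaces.
        System.out.println(filterNonXmlChars("a\u0000b\u000Bc")); // prints "a b c"
    }
}
```

Replacing rather than deleting the offending chars keeps the text length unchanged, so character offsets into the text stay valid after filtering.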
public static StringBuffer replaceCharsWithEntities(String anInputString)
anInputString - the string to be analyzed. If it is null, the
empty string is returned.
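The replacement performed by replaceCharsWithEntities can be illustrated with a short standalone sketch. This page does not list the contents of entitiesMap, so the map below is an assumption: it holds only the five predefined XML entities. The class name EntityReplaceSketch and the ENTITIES field are hypothetical, and the code is not taken from GATE.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical standalone sketch; not the GATE implementation.
final class EntityReplaceSketch {

    // Assumed contents: the five predefined XML entities.
    // The real entitiesMap may hold a different set.
    static final Map<Character, String> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put('<', "&lt;");
        ENTITIES.put('>', "&gt;");
        ENTITIES.put('&', "&amp;");
        ENTITIES.put('\'', "&apos;");
        ENTITIES.put('"', "&quot;");
    }

    /**
     * Replaces each char of the input that has an entry in ENTITIES with the
     * mapped entity; other chars are copied unchanged. A null input yields an
     * empty result, mirroring the documented behaviour.
     */
    static StringBuilder replaceCharsWithEntities(String input) {
        StringBuilder out = new StringBuilder();
        if (input == null) {
            return out;
        }
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            String entity = ENTITIES.get(ch);
            if (entity != null) {
                out.append(entity);
            } else {
                out.append(ch);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(replaceCharsWithEntities("a < b & c")); // prints "a &lt; b &amp; c"
    }
}
```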
public static String textWithNodes(TextualDocument doc,
String aText)
public static void buildEntityMapFromString(String aScanString,
TreeMap aMapToFill)
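This page does not say what buildEntityMapFromString stores in aMapToFill. The sketch below assumes one plausible reading: each key is the offset of a special char in the scanned string and each value is the char found there. The class name EntityOffsetSketch, the SPECIAL_CHARS constant, and the generic types on the TreeMap are all hypothetical, and the set of special chars is again assumed to be the five predefined XML entities.

```java
import java.util.TreeMap;

// Hypothetical standalone sketch; not the GATE implementation.
final class EntityOffsetSketch {

    // Assumed set of chars that would need an entity replacement.
    static final String SPECIAL_CHARS = "<>&'\"";

    /**
     * Scans the string and records, for every special char found,
     * its offset (key) and the char itself (value).
     * Does nothing if either argument is null.
     */
    static void buildEntityMapFromString(String scanString,
                                         TreeMap<Integer, Character> mapToFill) {
        if (scanString == null || mapToFill == null) {
            return;
        }
        for (int i = 0; i < scanString.length(); i++) {
            char ch = scanString.charAt(i);
            if (SPECIAL_CHARS.indexOf(ch) >= 0) {
                mapToFill.put(i, ch);
            }
        }
    }

    public static void main(String[] args) {
        TreeMap<Integer, Character> offsets = new TreeMap<>();
        buildEntityMapFromString("a<b&c", offsets);
        System.out.println(offsets); // prints "{1=<, 3=&}"
    }
}
```

A TreeMap keeps its keys sorted, so under this reading the recorded offsets can be visited in document order.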
public static void annotationSetToXml(AnnotationSet anAnnotationSet,
StringBuffer buffer)
anAnnotationSet - The annotation set that has to be saved as XML.
public static void annotationSetToXml(AnnotationSet anAnnotationSet,
String annotationSetNameToUse,
StringBuffer buffer)
anAnnotationSet - The annotation set that has to be saved as XML.
annotationSetNameToUse - The standard annotationSetToXml(AnnotationSet, StringBuffer) uses the name that belongs to the provided annotation set;
this method, however, allows one to store the provided annotation set under a different annotation set name.