java.lang.Object
  gate.corpora.DocumentContentImpl
public class DocumentContentImpl
Represents the commonalities between all sorts of document contents.
| Constructor Summary | |
|---|---|
| DocumentContentImpl() | Default construction. |
| DocumentContentImpl(String s) | For ranges. |
| DocumentContentImpl(URL u, String encoding, Long start, Long end) | Construction from URL and offsets. |
| Method Summary | | |
|---|---|---|
| boolean | equals(Object other) | Two documents are the same if their contents are the same. |
| DocumentContent | getContent(Long start, Long end) | Return the contents under a particular span. |
| String | getOriginalContent() | Return the original content of the document, as received during the loading phase or on construction from a string. |
| int | hashCode() | Calculate the hash value for the object. |
| Long | size() | The size of this content (e.g. character length for textual content). |
| String | toString() | Returns the String representing the content in the case of a textual document. |
| Methods inherited from class java.lang.Object |
|---|
| clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
| Constructor Detail |
|---|
public DocumentContentImpl()

public DocumentContentImpl(URL u,
String encoding,
Long start,
Long end)
throws IOException
Throws: IOException

public DocumentContentImpl(String s)
| Method Detail |
|---|
public DocumentContent getContent(Long start,
Long end)
throws InvalidOffsetException
Conceptually, annotation offsets are defined as falling in between characters, with "0" pointing before the first character. Because of that, the offset where an annotation ends and the offset where the space after it starts are the same.
So this is what the "abcde" string looks like with the offsets explicitly included: 0a1b2c3d4e5
"ab cd" would then look like this: 0a1b2 3c4d5
with the following annotations:
Token "ab" [0,2]
SpaceToken " " [2,3]
Token "cd" [3,5]
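The offset model above matches the convention of Java's own `String.substring(begin, end)`, where the start is inclusive and the end is exclusive. The following is a minimal sketch using plain strings only; the class `OffsetSketch` and its `span` helper are hypothetical names introduced for illustration and are not part of GATE.

```java
public class OffsetSketch {

    // Hypothetical helper: returns the text lying between two
    // "between-character" offsets (start inclusive, end exclusive).
    public static String span(String text, int start, int end) {
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "ab cd"; // offsets: 0a1b2 3c4d5

        // Token "ab" covers [0,2]
        System.out.println("[" + span(text, 0, 2) + "]");
        // SpaceToken " " covers [2,3]; it starts where the first token ends
        System.out.println("[" + span(text, 2, 3) + "]");
        // Token "cd" covers [3,5]; 5 is the offset after the last character
        System.out.println("[" + span(text, 3, 5) + "]");
    }
}
```

Because adjacent spans share a boundary offset, the three spans concatenate back to the original string with no gaps or overlaps.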
Specified by: getContent in interface DocumentContent
Parameters:
start - the beginning index, inclusive.
end - the ending index, exclusive.
Throws:
InvalidOffsetException - if start is negative, or end is larger than the length of this DocumentContent object, or start is larger than end.

public String toString()
Overrides: toString in class Object

public Long size()
Specified by: size in interface DocumentContent

public boolean equals(Object other)
Overrides: equals in class Object

public int hashCode()
Overrides: hashCode in class Object

public String getOriginalContent()
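The three conditions under which getContent is documented to throw InvalidOffsetException can be sketched as a standalone range check. This is not GATE's implementation: `OffsetCheckSketch`, `getSpan`, and `BadOffsetException` are hypothetical names, and the stand-in exception is made unchecked here purely to keep the example short.

```java
public class OffsetCheckSketch {

    // Hypothetical stand-in for InvalidOffsetException (unchecked for brevity).
    public static class BadOffsetException extends RuntimeException {
        public BadOffsetException(String message) {
            super(message);
        }
    }

    // Returns the text between start (inclusive) and end (exclusive),
    // rejecting the three cases listed in the getContent documentation.
    public static String getSpan(String content, Long start, Long end) {
        if (start < 0) {
            throw new BadOffsetException("start is negative: " + start);
        }
        if (end > content.length()) {
            throw new BadOffsetException("end is past the content length: " + end);
        }
        if (start > end) {
            throw new BadOffsetException("start " + start + " is larger than end " + end);
        }
        return content.substring(start.intValue(), end.intValue());
    }

    public static void main(String[] args) {
        System.out.println(getSpan("abcde", 1L, 3L)); // valid span
        try {
            getSpan("abcde", 3L, 1L); // start larger than end
        } catch (BadOffsetException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Note that a span with start equal to end is accepted under these rules and yields empty content, and that end may equal the content length because it is exclusive.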