public class SentenceBasedDocumentTextSplitter extends AbstractSplittingDocumentSupplierDecorator

| Modifier and Type | Field and Description |
|---|---|
| private static org.slf4j.Logger | LOGGER |
| private opennlp.tools.sentdetect.SentenceDetectorME | sentenceDetector |

Inherited fields: documentSource, queue

| Constructor and Description |
|---|
| SentenceBasedDocumentTextSplitter(org.dice_research.topicmodeling.preprocessing.docsupplier.DocumentSupplier documentSource, opennlp.tools.sentdetect.SentenceDetectorME sentenceDetector) |

| Modifier and Type | Method and Description |
|---|---|
| private org.dice_research.topicmodeling.utils.doc.Document | createDocument(String documentText, int start, int end, org.dice_research.topicmodeling.utils.doc.TermTokenizedText tttext, org.dice_research.topicmodeling.utils.doc.Document document) |
| protected void | splitDocument(org.dice_research.topicmodeling.utils.doc.Document document)<br>In this method the splitter should split up the given document and add all new documents to the AbstractSplittingDocumentSupplierDecorator.queue. |
| private void | splitDocument(org.dice_research.topicmodeling.utils.doc.Document document, String text) |
| private void | splitDocument(org.dice_research.topicmodeling.utils.doc.Document document, String text, List<org.dice_research.topicmodeling.lang.Term> terms) |

Inherited methods: addToQueue, getDecoratedDocumentSupplier, getNextDocument, setDecoratedDocumentSupplier; getNextDocumentId, setDocumentStartId

private static final org.slf4j.Logger LOGGER
private opennlp.tools.sentdetect.SentenceDetectorME sentenceDetector
public SentenceBasedDocumentTextSplitter(org.dice_research.topicmodeling.preprocessing.docsupplier.DocumentSupplier documentSource,
opennlp.tools.sentdetect.SentenceDetectorME sentenceDetector)
protected void splitDocument(org.dice_research.topicmodeling.utils.doc.Document document)
Description copied from class: AbstractSplittingDocumentSupplierDecorator
In this method the splitter should split up the given document and add all new documents to the AbstractSplittingDocumentSupplierDecorator.queue.
Specified by: splitDocument in class AbstractSplittingDocumentSupplierDecorator
private void splitDocument(org.dice_research.topicmodeling.utils.doc.Document document,
String text,
List<org.dice_research.topicmodeling.lang.Term> terms)
private void splitDocument(org.dice_research.topicmodeling.utils.doc.Document document,
String text)
private org.dice_research.topicmodeling.utils.doc.Document createDocument(String documentText, int start, int end, org.dice_research.topicmodeling.utils.doc.TermTokenizedText tttext, org.dice_research.topicmodeling.utils.doc.Document document)
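
The splitDocument contract above (split the given document, then add each new document to the queue) can be illustrated with a minimal sketch. This is not the library's implementation, which this page does not show. The class name SentenceSplitSketch and its splitDocument(String, Deque) method are hypothetical, plain strings stand in for Document objects, and the JDK's java.text.BreakIterator is swapped in for SentenceDetectorME so the example runs without an OpenNLP model file.

```java
import java.text.BreakIterator;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;

// Hypothetical illustration class; it is not part of the documented library.
class SentenceSplitSketch {

    // Splits the text at sentence boundaries and appends every non-empty
    // sentence to the queue, mirroring the "split, then enqueue" contract.
    // Assumption: BreakIterator replaces opennlp's SentenceDetectorME here.
    static void splitDocument(String text, Deque<String> queue) {
        BreakIterator boundaries = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        boundaries.setText(text);
        int start = boundaries.first();
        for (int end = boundaries.next();
             end != BreakIterator.DONE;
             start = end, end = boundaries.next()) {
            // Each [start, end) span is one sentence, possibly with trailing whitespace.
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                queue.add(sentence);
            }
        }
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>();
        splitDocument("First sentence. Second one! Third?", queue);
        // Prints one queued sentence per line.
        queue.forEach(System.out::println);
    }
}
```

The start and end offsets play the same role as the int start and int end parameters of createDocument: they delimit the slice of the original text that becomes one new document. How the real class copies properties from the source document into each new one is not described on this page, so the sketch leaves that out.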
Copyright © 2015–2020. All rights reserved.