public class EntityBasedTokenizer extends AbstractNerPropagationPreprocessor implements EntityTokenSurfaceFormMappingSupplier
| Modifier and Type | Field and Description |
|---|---|
| private static String | CHARS_TO_INSERT |
| private static String | CHARS_TO_REPLACE |
| protected EntityTermMapping | surfaceFormsMapping |

CORRECT_SURFACE_FORM_ID, ENTITY_TOKEN_GONE_THROUGH_POS_TAGGING_ID

| Constructor and Description |
|---|
| EntityBasedTokenizer() |

| Modifier and Type | Method and Description |
|---|---|
| EntityTermMapping | getLastEntityTokenSurfaceFormMapping() Returns the mapping created since the last call of this method. |
| protected org.dice_research.topicmodeling.lang.Term[] | getTokensAfterPosTagging(org.dice_research.topicmodeling.utils.doc.ner.NamedEntityInText entity, String surfaceForm) |
| protected String | processEntity(org.dice_research.topicmodeling.utils.doc.ner.NamedEntityInText entity, String surfaceForm, Set<String> tokens) |
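
The summary of getLastEntityTokenSurfaceFormMapping() says that it returns the mapping created since the last call of the method. One plausible reading is a hand-off pattern: each call returns the entries collected so far and starts a fresh mapping. The sketch below illustrates that reading only. SimpleMapping, SimpleSupplier, record and getLastMapping are hypothetical stand-ins, not the library's EntityTermMapping or EntityBasedTokenizer, and the reset behaviour is an assumption drawn from the one-line description.

```java
import java.util.ArrayList;
import java.util.List;

public class MappingHandOffSketch {

    /** Hypothetical stand-in for EntityTermMapping: it only records surface forms. */
    static class SimpleMapping {
        final List<String> surfaceForms = new ArrayList<>();
    }

    /** Hypothetical supplier with assumed "since the last call" semantics. */
    static class SimpleSupplier {
        private SimpleMapping current = new SimpleMapping();

        /** Stores a surface form in the mapping that is currently being built. */
        void record(String surfaceForm) {
            current.surfaceForms.add(surfaceForm);
        }

        /** Hands out the mapping built so far and starts a new, empty one. */
        SimpleMapping getLastMapping() {
            SimpleMapping result = current;
            current = new SimpleMapping();
            return result;
        }
    }

    public static void main(String[] args) {
        SimpleSupplier supplier = new SimpleSupplier();

        supplier.record("Berlin");
        supplier.record("New York");
        SimpleMapping first = supplier.getLastMapping();

        supplier.record("Leipzig");
        SimpleMapping second = supplier.getLastMapping();

        System.out.println(first.surfaceForms);  // prints [Berlin, New York]
        System.out.println(second.surfaceForms); // prints [Leipzig]
    }
}
```

Under this assumption, a caller that processes documents one at a time would fetch the mapping after each document, so that entries from different documents are not mixed.
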

preprocessNamedEntities

private static final String CHARS_TO_REPLACE
private static final String CHARS_TO_INSERT
protected EntityTermMapping surfaceFormsMapping
protected String processEntity(org.dice_research.topicmodeling.utils.doc.ner.NamedEntityInText entity, String surfaceForm, Set<String> tokens)
processEntity in class AbstractNerPropagationPreprocessor

protected org.dice_research.topicmodeling.lang.Term[] getTokensAfterPosTagging(org.dice_research.topicmodeling.utils.doc.ner.NamedEntityInText entity, String surfaceForm)

public EntityTermMapping getLastEntityTokenSurfaceFormMapping()
Returns the mapping created since the last call of this method.
Specified by: getLastEntityTokenSurfaceFormMapping in interface EntityTokenSurfaceFormMappingSupplier

Copyright © 2015–2020. All rights reserved.