public class UnitEnglish extends Object implements IUnitLanguage
Examples:
"Give me two hundred birds" -> "Give me 200 birds"
"$80 thousand and three hundred four" -> "$ 80304"
"Show the first ten screws with a length of two inches" -> "Show the first 10 screws with a length of 0.0508 m"
"10 miles" -> "16093.44 m"

"one" can be a numeral, but it does not have to be. To handle such cases, the Stanford pipeline is run on sentences containing "one". The word is converted to a number only if it is not in an "nmod" relation with any other word. This handles sentences like "The color is different from the old one.", which is left unchanged by parsing.
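The numeral examples above can be illustrated with a minimal, self-contained sketch. It is not the actual UnitEnglish implementation: the class NumeralSketch, its word table (a stand-in for identifierToMultiplier), and the way "$80" is split into "$ 80" are assumptions made for illustration. The word "one" is deliberately left out of the table, because telling the numeral from the pronoun needs the dependency parse described above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the real UnitEnglish code.
final class NumeralSketch {
    // Assumed stand-in for identifierToMultiplier; "one" is omitted on purpose.
    private static final Map<String, Long> VALUES = new HashMap<>();
    static {
        String[] small = {"two", "three", "four", "five", "six", "seven", "eight", "nine", "ten",
            "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen", "seventeen",
            "eighteen", "nineteen"};
        for (int i = 0; i < small.length; i++) VALUES.put(small[i], (long) (i + 2));
        String[] tens = {"twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"};
        for (int i = 0; i < tens.length; i++) VALUES.put(tens[i], (long) ((i + 2) * 10));
        VALUES.put("hundred", 100L);
        VALUES.put("thousand", 1_000L);
        VALUES.put("million", 1_000_000L);
    }

    private static boolean isNumeral(String token) {
        return token.matches("\\d+") || VALUES.containsKey(token.toLowerCase());
    }

    private static void append(StringBuilder out, String token) {
        if (out.length() > 0) out.append(' ');
        out.append(token);
    }

    static String replaceNumerals(String input) {
        // Assumption: a currency sign directly before a digit is split off as its own token.
        String[] tokens = input.replaceAll("\\$(?=\\d)", "\\$ ").trim().split("\\s+");
        StringBuilder out = new StringBuilder();
        long total = 0;       // value of the completed thousand/million groups
        long current = 0;     // value of the group being built
        boolean inRun = false;
        for (int i = 0; i < tokens.length; i++) {
            String t = tokens[i];
            if (t.matches("\\d+")) {
                current += Long.parseLong(t);
                inRun = true;
            } else if (isNumeral(t)) {
                long v = VALUES.get(t.toLowerCase());
                if (v == 100) {
                    current = Math.max(current, 1) * 100;
                } else if (v >= 1000) {
                    total += Math.max(current, 1) * v;
                    current = 0;
                } else {
                    current += v;
                }
                inRun = true;
            } else if (inRun && t.equalsIgnoreCase("and")
                    && i + 1 < tokens.length && isNumeral(tokens[i + 1])) {
                // "and" inside a numeral ("thousand and three") is a connective: skip it.
            } else {
                if (inRun) {
                    append(out, Long.toString(total + current));
                    total = 0;
                    current = 0;
                    inRun = false;
                }
                append(out, t);
            }
        }
        if (inRun) append(out, Long.toString(total + current));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceNumerals("Give me two hundred birds"));           // Give me 200 birds
        System.out.println(replaceNumerals("$80 thousand and three hundred four")); // $ 80304
    }
}
```

The sketch keeps a running group value and a total, so "80 thousand and three hundred four" becomes 80 * 1000 + (3 * 100 + 4) = 80304.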
| Modifier and Type | Field and Description |
|---|---|
| (package private) HashMap<String,Double> | identifierToMultiplier |
| (package private) HashMap<String,org.apache.commons.lang3.tuple.ImmutablePair<Double,String>> | identifierToUnit |
| private static org.slf4j.Logger | log |
| (package private) StanfordNLPConnector | stanford |
| Constructor and Description |
|---|
| UnitEnglish(StanfordNLPConnector stanford): Stanford is needed to handle sentences containing "one". |
| Modifier and Type | Method and Description |
|---|---|
| String | convert(String q): Converts all occurring natural language numerals to digits. |
| private String | convertToBaseUnit(String str): Converts occurring units to their base unit. |
| private String | insertWhitespacebeforePunctuation(String input) |
| private void | loadResource() |
| static void | main(String[] args) |
| private Double | parseWord(String s) |
| private String | prettyAppendDouble(String out, Double val): Appends a String representation of a Double to the input string. |
| private String | replaceNumerals(String replaceThis): Converts occurring natural language numerals to digits in any given String. |
private static org.slf4j.Logger log
HashMap<String,org.apache.commons.lang3.tuple.ImmutablePair<Double,String>> identifierToUnit
StanfordNLPConnector stanford
public UnitEnglish(StanfordNLPConnector stanford)
public String convert(String q)
Description copied from interface: IUnitLanguage
Examples: "$80 million" -> "$ 80000000" "10 miles" -> "16093.44 m"
Specified by: convert in interface IUnitLanguage
Parameters: q - any string which may or may not contain numerals or units to convert.

private String replaceNumerals(String replaceThis)
This only works when natural language numerals are stored in identifierToMultiplier. Call UnitEnglish#loadTabSplit(File) beforehand to load the data. To add numerals that should be recognized, edit the resource file.
Parameters: replaceThis - String which may or may not contain something to convert.

private String convertToBaseUnit(String str)
Parameters: str - any string which may or may not contain a unit.

private String prettyAppendDouble(String out, Double val)
Parameters: out - String to append to. val - value to append.

private Double parseWord(String s)
Parameters: s - String to be parsed to Double. Also accepts all numeral words defined in the English file.

private void loadResource() throws IOException
Throws: IOException

public static void main(String[] args)
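
The base-unit conversion described for convertToBaseUnit can be sketched as a lookup from a unit identifier to a (factor, base unit) pair, mirroring the shape of identifierToUnit. This is a hypothetical illustration, not the class's real code: the class UnitSketch, its table entries, and the regular expression are assumptions, and Map.Entry stands in for ImmutablePair. BigDecimal is used here instead of Double so the printed result carries no binary floating-point noise.

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the real UnitEnglish code.
final class UnitSketch {
    // Assumed stand-in for identifierToUnit: identifier -> (factor to base unit, base unit).
    private static final Map<String, Map.Entry<Double, String>> UNITS = Map.of(
        "inch", Map.entry(0.0254, "m"),
        "inches", Map.entry(0.0254, "m"),
        "mile", Map.entry(1609.344, "m"),
        "miles", Map.entry(1609.344, "m"));

    // A number followed by whitespace and a word that may be a unit.
    private static final Pattern QUANTITY =
        Pattern.compile("(\\d+(?:\\.\\d+)?)\\s+([A-Za-z]+)");

    static String convertToBaseUnit(String input) {
        Matcher m = QUANTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            Map.Entry<Double, String> unit = UNITS.get(m.group(2).toLowerCase());
            String replacement = m.group(); // unknown words are left untouched
            if (unit != null) {
                BigDecimal value = new BigDecimal(m.group(1))
                    .multiply(BigDecimal.valueOf(unit.getKey()));
                replacement = value.stripTrailingZeros().toPlainString() + " " + unit.getValue();
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(convertToBaseUnit("10 miles"));                // 16093.44 m
        System.out.println(convertToBaseUnit("a length of 2 inches"));    // a length of 0.0508 m
    }
}
```

The sketch expects digits, so it would run after numeral words have already been replaced, which matches the order implied by the "two inches" example.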
Copyright © 2016–2017 Pivotal Software, Inc. All rights reserved.