Class IQuestionCsvParser

java.lang.Object
org.aksw.qa.commons.qald.IQuestionCsvParser

public class IQuestionCsvParser extends Object
Converts IQuestions to CSV rows and CSV rows back to IQuestions.
Author:
jhuth
  • Field Details

    • ARRAY_SPLIT

      public static String ARRAY_SPLIT
      Some columns can hold multiple values. A string holding the content of such a column is split on this separator string.
    • NULL_DEFAULT

      public static String NULL_DEFAULT
      The value written when a column is declared for a CSV but the corresponding field in the IQuestion is null.
  • Constructor Details

    • IQuestionCsvParser

      public IQuestionCsvParser()
  • Method Details

    • csvRowToQuestion

      private static IQuestion csvRowToQuestion(String[] csvRow, IQuestionCsvParser.Column... columns) throws IOException
      Parameters:
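      csvRow - a single CSV row, with terminators decoded and already split into cells
      columns - the column definitions, in the order of the CSV columns
      Returns:
      an IQuestion populated from the row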
      Throws:
      IOException
    • csvToQuestionList

      public static List<IQuestion> csvToQuestionList(au.com.bytecode.opencsv.CSVReader reader, IQuestionCsvParser.Column... columns) throws IOException
      Reads a CSV whose source is defined by the given CSVReader, which also defines the separator and quote characters. The CSVReader can also be set up to skip the first n lines.

      Define the column structure of the CSV by passing IQuestionCsvParser.Column objects. Their order has to match the column order of the CSV.

      For example, the following call:

      csvToQuestionList(reader, Column.ID(), Column.ignore(), Column.question("en"))

      will parse the first column of the CSV via IQuestion#setId(), ignore the second column, and parse the third column as the question in English.

      Trailing CSV columns (those with no matching IQuestionCsvParser.Column) are ignored, as are IQuestionCsvParser.Columns defined beyond the last CSV column.

      An IQuestionCsvParser.Column can appear multiple times. For a column with a single value (e.g. a boolean flag or a SPARQL query), the rightmost column is used. For columns that can hold array contents (e.g. golden answers, keywords), all defined columns are parsed.

      Some columns can hold arrays, e.g. IQuestionCsvParser.Column.goldenAnswers() and IQuestionCsvParser.Column.keywords(String). The array separator string can be set via ARRAY_SPLIT. Arrays can be preceded and concluded by square or curly brackets, so the toString() output of most Collections is a valid input format.

      Parameters:
      reader - a CSVReader configured for your input. For Google Docs CSV exports, you can use readerForGoogleDocsCsvExports(Reader, int)
      columns - as described above
      Returns:
      A list of IQuestions with the parseable information set.
      Throws:
      IOException - if the CSV cannot be read or a value cannot be parsed
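      The column rules described above (surplus cells and surplus column definitions are ignored, the rightmost single-valued column wins, multi-valued columns accumulate) can be illustrated with a small self-contained model. This is only a sketch: Kind, Col and mapRow are hypothetical names that do not exist in the library, and cells of multi-valued columns are stored as-is rather than split on ARRAY_SPLIT.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the column rules; none of these names belong to the library.
class ColumnMappingSketch {

    enum Kind { IGNORE, SINGLE, MULTI }

    static final class Col {
        final Kind kind;
        final String field;

        Col(Kind kind, String field) {
            this.kind = kind;
            this.field = field;
        }
    }

    static Map<String, List<String>> mapRow(String[] row, Col... columns) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        // Cells without a column definition and definitions without a cell are both skipped.
        int n = Math.min(row.length, columns.length);
        for (int i = 0; i < n; i++) {
            Col col = columns[i];
            if (col.kind == Kind.IGNORE) {
                continue;
            }
            if (col.kind == Kind.SINGLE) {
                // A later (more rightward) column replaces any earlier value.
                List<String> value = new ArrayList<>();
                value.add(row[i]);
                fields.put(col.field, value);
            } else {
                // Multi-valued columns accumulate across all their occurrences.
                fields.computeIfAbsent(col.field, k -> new ArrayList<>()).add(row[i]);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] row = {"7", "skip me", "kw1", "kw2", "ASK {}", "SELECT ?x {}", "trailing"};
        Map<String, List<String>> fields = mapRow(row,
                new Col(Kind.SINGLE, "id"),
                new Col(Kind.IGNORE, null),
                new Col(Kind.MULTI, "keywords"),
                new Col(Kind.MULTI, "keywords"),
                new Col(Kind.SINGLE, "sparql"),
                new Col(Kind.SINGLE, "sparql"));
        System.out.println(fields);
    }
}
```

      In this model the seventh cell has no column definition and is dropped, the two keyword cells end up in one list, and only the second SPARQL cell survives.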
    • questionListToCsv

      public static void questionListToCsv(au.com.bytecode.opencsv.CSVWriter writer, boolean columnDescriptorgRow, List<IQuestion> questions, IQuestionCsvParser.Column... columns) throws IOException
      Writes all data specified by the columns from all given IQuestions to the CSVWriter. See the documentation of csvToQuestionList(CSVReader, Column...) for more information.

      If no information is present in the IQuestion for a declared IQuestionCsvParser.Column (e.g. the field is null), NULL_DEFAULT is written.

      Parameters:
      writer - the CSVWriter to write to
      columnDescriptorgRow - if true, the first row will contain the names of the columns
      questions - the IQuestions to write
      columns - the columns to write, in output order
      Throws:
      IOException
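      The null substitution described above can be sketched in isolation. The following is a minimal illustration, not the library's code: toRow is a hypothetical helper, a plain Map stands in for an IQuestion, and the string "null" is only a placeholder because the actual value of NULL_DEFAULT is not stated in this documentation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; a Map stands in for the fields of an IQuestion.
class NullDefaultSketch {

    // Placeholder only; the library's actual NULL_DEFAULT value is not stated here.
    static String NULL_DEFAULT = "null";

    static String[] toRow(Map<String, String> fields, String... columnNames) {
        String[] row = new String[columnNames.length];
        for (int i = 0; i < columnNames.length; i++) {
            String value = fields.get(columnNames[i]);
            // Missing or null fields are written as NULL_DEFAULT so the row keeps its shape.
            row[i] = (value == null) ? NULL_DEFAULT : value;
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, String> question = new HashMap<>();
        question.put("id", "7");
        question.put("sparql", null);
        System.out.println(String.join(",", toRow(question, "id", "sparql", "answerType")));
    }
}
```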
    • parseBoolean

      private static Boolean parseBoolean(String boolStr) throws IOException
      Parses a boolean. Unlike Boolean.parseBoolean(String), this accepts no values other than "true" and "false" (ignoring case).
      Parameters:
      boolStr - the string to parse
      Returns:
      a parsed boolean
      Throws:
      IOException - if boolStr is not in {"true","false"}, ignoring case
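      Since the method is private, its behaviour can only be approximated from this description. A minimal sketch of such a strict parser, under the assumption that the input is not trimmed first, might look as follows; the class and method names are hypothetical.

```java
import java.io.IOException;

// Hypothetical sketch, not the library's actual implementation.
class StrictBooleanSketch {

    // Accepts only "true" or "false" (case-insensitive); anything else, including null, is rejected.
    static Boolean parseStrictBoolean(String boolStr) throws IOException {
        if ("true".equalsIgnoreCase(boolStr)) {
            return Boolean.TRUE;
        }
        if ("false".equalsIgnoreCase(boolStr)) {
            return Boolean.FALSE;
        }
        throw new IOException("Not a boolean: " + boolStr);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(parseStrictBoolean("TRUE"));
        // Boolean.parseBoolean is lenient: any string other than "true" silently becomes false.
        System.out.println(Boolean.parseBoolean("yes"));
        try {
            parseStrictBoolean("yes");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

      The strict variant surfaces typos in a CSV cell as an error instead of quietly storing false.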
    • parseStringArray

      private static List<String> parseStringArray(String arrayStr)
      Splits a string on ARRAY_SPLIT. If the first and last characters are square or curly brackets, they are removed.
      Parameters:
      arrayStr - the string to split
      Returns:
      the list of elements found in arrayStr
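      The splitting behaviour can be sketched as follows. This is an approximation based on the description only: the separator "," is a stand-in for the unstated default of ARRAY_SPLIT, and trimming whitespace around each element is an assumption made here so that Collection.toString() output round-trips cleanly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch, not the library's actual implementation.
class ArraySplitSketch {

    // Stand-in separator; the library's actual ARRAY_SPLIT default is not stated here.
    static String ARRAY_SPLIT = ",";

    static List<String> parseStringArray(String arrayStr) {
        String s = arrayStr.trim();
        // Drop one leading square or curly bracket, if present.
        if (!s.isEmpty() && (s.charAt(0) == '[' || s.charAt(0) == '{')) {
            s = s.substring(1);
        }
        // Drop one trailing square or curly bracket, if present.
        if (!s.isEmpty()) {
            char last = s.charAt(s.length() - 1);
            if (last == ']' || last == '}') {
                s = s.substring(0, s.length() - 1);
            }
        }
        List<String> elements = new ArrayList<>();
        if (s.trim().isEmpty()) {
            return elements;
        }
        // Pattern.quote treats the separator literally rather than as a regular expression.
        for (String part : s.split(Pattern.quote(ARRAY_SPLIT))) {
            elements.add(part.trim()); // trimming is an assumption of this sketch
        }
        return elements;
    }

    public static void main(String[] args) {
        String cell = Arrays.asList("a", "b", "c").toString();
        System.out.println(parseStringArray(cell));
        System.out.println(parseStringArray("{x,y}"));
    }
}
```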
    • readerForGoogleDocsCsvExports

      public static au.com.bytecode.opencsv.CSVReader readerForGoogleDocsCsvExports(Reader reader, int skipLines)
      Returns a reader that uses a comma (,) as separator and a double quote (") as quote character. This is sufficient for reading data generated with the Google Docs CSV export.
      Parameters:
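      reader - the underlying Reader supplying the CSV data
      skipLines - the number of leading lines to skip
      Returns:
      a CSVReader configured as described above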