Class RecordReaderJsonArray

java.lang.Object
org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,T>
net.sansa_stack.hadoop.core.RecordReaderGenericBase<com.google.gson.JsonElement,com.google.gson.JsonElement,com.google.gson.JsonElement,com.google.gson.JsonElement>
net.sansa_stack.hadoop.format.gson.json.RecordReaderJsonArray
All Implemented Interfaces:
Closeable, AutoCloseable

public class RecordReaderJsonArray extends RecordReaderGenericBase<com.google.gson.JsonElement,com.google.gson.JsonElement,com.google.gson.JsonElement,com.google.gson.JsonElement>
  • Field Details

  • Constructor Details

    • RecordReaderJsonArray

      public RecordReaderJsonArray()
    • RecordReaderJsonArray

      public RecordReaderJsonArray(com.google.gson.Gson gson)
    • RecordReaderJsonArray

      public RecordReaderJsonArray(RecordReaderConf conf, com.google.gson.Gson gson)
  • Method Details

    • initialize

      public void initialize(org.apache.hadoop.mapreduce.InputSplit inputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext context) throws IOException
      Description copied from class: RecordReaderGenericBase
      Read out config parameters (prefixes, length thresholds, ...) and examine the codec in order to set an internal flag indicating whether the stream is encoded.
      Overrides:
      initialize in class RecordReaderGenericBase<com.google.gson.JsonElement,com.google.gson.JsonElement,com.google.gson.JsonElement,com.google.gson.JsonElement>
      Throws:
      IOException
    • effectiveInputStream

      protected InputStream effectiveInputStream(InputStream base)
      Always replace the first character (which is either a comma or an open bracket) with an open bracket in order to mimic the start of a JSON array.
      Overrides:
      effectiveInputStream in class RecordReaderGenericBase<com.google.gson.JsonElement,com.google.gson.JsonElement,com.google.gson.JsonElement,com.google.gson.JsonElement>
      Parameters:
      base - The base input stream
      Returns:
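      The first-character replacement described above can be sketched with plain java.io classes. This is an illustrative sketch only, not the actual implementation of effectiveInputStream: the class name FirstCharToBracket and the helper withOpenBracket are hypothetical, and the sketch assumes the leading character is a single byte.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the SANSA API.
public class FirstCharToBracket {

    // Drops the first byte of 'base' (assumed to be ',' or '[') and
    // prepends '[' so the remaining bytes read like the start of a JSON array.
    public static InputStream withOpenBracket(InputStream base) throws IOException {
        int first = base.read(); // consume the leading ',' or '['
        if (first == -1) {
            return base; // empty input: nothing to replace
        }
        InputStream bracket = new ByteArrayInputStream(new byte[] {'['});
        return new SequenceInputStream(bracket, base);
    }

    public static void main(String[] args) throws IOException {
        // A split that starts in the middle of an array, at an element separator.
        byte[] tail = ",{\"a\":1},{\"b\":2}]".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = withOpenBracket(new ByteArrayInputStream(tail))) {
            // prints [{"a":1},{"b":2}]
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

      With this substitution, a split that begins at an element separator can be handed to a regular JSON array parser as if it began at the array start.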
    • parse

      protected Stream<com.google.gson.JsonElement> parse(InputStream in, boolean isProbe)
      Description copied from class: RecordReaderGenericBase
      Create a flowable from the input stream. The input stream may be incorrectly positioned, in which case the Flowable is expected to indicate this by raising an error event.
      Specified by:
      parse in class RecordReaderGenericBase<com.google.gson.JsonElement,com.google.gson.JsonElement,com.google.gson.JsonElement,com.google.gson.JsonElement>
      Parameters:
      in - A supplier of input streams. May supply the same underlying stream on each call, hence at most a single stream should be taken from the supplier. Supplied streams are safe to use in try-with-resources blocks (possibly using CloseShieldInputStream). Taken streams should be closed by the client code.
      isProbe - Whether the parser should be configured for probing. For example, it is desirable to suppress parse errors during probing. Also, for probing the parser may optimize itself for minimizing latency of yielding items rather than overall throughput.
      Returns:
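      The contract for the stream supplier (take at most one stream, close it in the client code) might look as follows in calling code. This is a minimal sketch under assumptions: the class SingleStreamFromSupplier and the method readOnce are hypothetical names, and the supplier is modelled as a Callable<InputStream>, as in the parse(Callable<InputStream>) overload.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;

// Hypothetical client code, not part of the SANSA API.
public class SingleStreamFromSupplier {

    // Takes exactly one stream from the supplier and closes it when done.
    // The supplier may hand out the same underlying stream on every call,
    // so it is deliberately invoked only once.
    public static String readOnce(Callable<InputStream> supplier) throws Exception {
        try (InputStream in = supplier.call()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "[{\"a\":1}]".getBytes(StandardCharsets.UTF_8);
        Callable<InputStream> supplier = () -> new ByteArrayInputStream(data);
        // prints [{"a":1}]
        System.out.println(readOnce(supplier));
    }
}
```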
    • parse

      protected io.reactivex.rxjava3.core.Flowable<com.google.gson.JsonElement> parse(Callable<InputStream> inputStreamSupplier)