Class RecordReaderRdfTrigDataset

java.lang.Object
org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,T>
net.sansa_stack.hadoop.core.RecordReaderGenericBase<U,G,A,T>
net.sansa_stack.hadoop.format.jena.base.RecordReaderGenericRdfBase<U,G,A,T>
net.sansa_stack.hadoop.format.jena.base.RecordReaderGenericRdfAccumulatingBase<org.apache.jena.sparql.core.Quad,org.apache.jena.graph.Node,org.aksw.jenax.arq.dataset.api.DatasetOneNg,org.aksw.jenax.arq.dataset.api.DatasetOneNg>
net.sansa_stack.hadoop.format.jena.trig.RecordReaderRdfTrigDataset
All Implemented Interfaces:
Closeable, AutoCloseable

public class RecordReaderRdfTrigDataset extends RecordReaderGenericRdfAccumulatingBase<org.apache.jena.sparql.core.Quad,org.apache.jena.graph.Node,org.aksw.jenax.arq.dataset.api.DatasetOneNg,org.aksw.jenax.arq.dataset.api.DatasetOneNg>
RecordReader for the TriG RDF format that groups consecutive quads sharing the same graph IRI into Datasets.
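The grouping behaviour described above can be illustrated with a minimal, library-free sketch. It is not the class's actual implementation: SimpleQuad and GraphGroup are hypothetical stand-ins for Jena's Quad and for DatasetOneNg, and graph names are plain strings. The sketch assumes that "consecutive" means a new group starts whenever the graph changes, so quads of the same graph that are not adjacent end up in separate groups.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ConsecutiveGraphGrouping {
    /** Hypothetical stand-in for org.apache.jena.sparql.core.Quad. */
    static final class SimpleQuad {
        final String graph, subject, predicate, object;
        SimpleQuad(String graph, String subject, String predicate, String object) {
            this.graph = graph;
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Hypothetical stand-in for a dataset holding exactly one named graph. */
    static final class GraphGroup {
        final String graph;
        final List<SimpleQuad> quads = new ArrayList<>();
        GraphGroup(String graph) {
            this.graph = graph;
        }
    }

    /** Starts a new group each time the graph differs from the previous quad's graph. */
    static List<GraphGroup> groupConsecutive(List<SimpleQuad> quads) {
        List<GraphGroup> result = new ArrayList<>();
        GraphGroup current = null;
        for (SimpleQuad q : quads) {
            if (current == null || !current.graph.equals(q.graph)) {
                current = new GraphGroup(q.graph);
                result.add(current);
            }
            current.quads.add(q);
        }
        return result;
    }

    public static void main(String[] args) {
        List<SimpleQuad> input = Arrays.asList(
                new SimpleQuad("http://ex/g1", "s1", "p", "o"),
                new SimpleQuad("http://ex/g1", "s2", "p", "o"),
                new SimpleQuad("http://ex/g2", "s3", "p", "o"),
                new SimpleQuad("http://ex/g1", "s4", "p", "o"));
        for (GraphGroup g : groupConsecutive(input)) {
            System.out.println(g.graph + " -> " + g.quads.size() + " quad(s)");
        }
    }
}
```

In this sketch the last quad forms its own group even though its graph already appeared earlier, because the run of equal graph names was interrupted.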
  • Field Details

    • RECORD_MINLENGTH_KEY

      public static final String RECORD_MINLENGTH_KEY
    • RECORD_MAXLENGTH_KEY

      public static final String RECORD_MAXLENGTH_KEY
    • RECORD_PROBECOUNT_KEY

      public static final String RECORD_PROBECOUNT_KEY
    • ELEMENT_PROBECOUNT_KEY

      public static final String ELEMENT_PROBECOUNT_KEY
    • PREFIXES_MAXLENGTH_KEY

      public static final String PREFIXES_MAXLENGTH_KEY
    • trigFwdPatternGraphFollowedByCurlyBrace

      protected static final CustomPattern trigFwdPatternGraphFollowedByCurlyBrace
      This is the pattern for directives or TriG data where graphs are separated by '{'. This pattern was used
    • trigFwdPatternNew

      protected static final CustomPattern trigFwdPatternNew
      This is the pattern for directives or (compact) IRIs. TriG graph names must be IRIs.
    • trigFwdPattern

      protected static final CustomPattern trigFwdPattern
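The forward patterns above mark positions where parsing may plausibly start: at a directive or at a (compact) IRI. The following sketch shows the general idea using java.util.regex. The regular expression is a simplified, hypothetical approximation written for illustration only; it is not the pattern held in these fields, and CustomPattern is not used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CandidateStartSketch {
    // Hypothetical, simplified pattern (an assumption, not the library's):
    //   1. a directive keyword: @prefix, @base, PREFIX, BASE
    //   2. an IRI in angle brackets
    //   3. a compact IRI such as ex:graph
    static final Pattern CANDIDATE = Pattern.compile(
            "@?(?:base|prefix)\\b|<[^<>\\s]*>|[A-Za-z][\\w-]*:[\\w-]*",
            Pattern.CASE_INSENSITIVE);

    /** Returns the start offsets of all candidate positions, in ascending order. */
    static List<Integer> candidateOffsets(CharSequence text) {
        List<Integer> offsets = new ArrayList<>();
        Matcher m = CANDIDATE.matcher(text);
        while (m.find()) {
            offsets.add(m.start());
        }
        return offsets;
    }

    public static void main(String[] args) {
        String chunk = "   <http://ex/g> { <http://ex/s> <http://ex/p> <http://ex/o> }";
        // Each offset is only a candidate; a real reader would still have to
        // verify it by attempting to parse from that position.
        System.out.println(candidateOffsets(chunk));
    }
}
```

A match only identifies a candidate. Whether the position is a valid record start still has to be confirmed by probing, that is, by attempting to parse from there.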
  • Constructor Details

    • RecordReaderRdfTrigDataset

      public RecordReaderRdfTrigDataset()
  • Method Details

    • parse

      protected Stream<org.apache.jena.sparql.core.Quad> parse(InputStream in, boolean isProbe)
      Description copied from class: RecordReaderGenericBase
      Create a flowable from the input stream. The input stream may be incorrectly positioned, in which case the Flowable is expected to indicate this by raising an error event.
      Specified by:
      parse in class RecordReaderGenericBase<org.apache.jena.sparql.core.Quad,org.apache.jena.graph.Node,org.aksw.jenax.arq.dataset.api.DatasetOneNg,org.aksw.jenax.arq.dataset.api.DatasetOneNg>
      Parameters:
      in - A supplier of input streams. May supply the same underlying stream on each call, so at most one stream should be taken from the supplier. Supplied streams are safe to use in try-with-resources blocks (possibly using CloseShieldInputStream). Taken streams should be closed by the client code.
      isProbe - Whether the parser should be configured for probing. For example, it is desirable to suppress parse errors during probing. Also, for probing, the parser may optimize itself for minimizing the latency of yielding items rather than overall throughput.
      Returns:
    • parse

      protected io.reactivex.rxjava3.core.Flowable<org.apache.jena.sparql.core.Quad> parse(Callable<InputStream> inputStreamSupplier)
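The supplier contract described in the parameter notes above (take at most one stream, use it in a try-with-resources block, close it in client code) can be sketched as follows. This is a plain-Java illustration under stated assumptions, not code from this class: readOnce and the in-memory supplier are hypothetical, and no RxJava or Jena types are involved.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

class SupplierUsageSketch {
    /**
     * Takes exactly one stream from the supplier, reads it fully and closes it.
     * The supplier is deliberately called only once, because it may hand out
     * the same underlying stream on every call.
     */
    static String readOnce(Callable<InputStream> inputStreamSupplier) throws Exception {
        try (InputStream in = inputStreamSupplier.call()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        byte[] data = "<http://ex/g> { <http://ex/s> <http://ex/p> <http://ex/o> }"
                .getBytes(StandardCharsets.UTF_8);
        // Hypothetical in-memory supplier that counts how often it is invoked.
        Callable<InputStream> supplier = () -> {
            calls.incrementAndGet();
            return new ByteArrayInputStream(data);
        };
        System.out.println(readOnce(supplier));
        System.out.println("supplier calls: " + calls.get());
    }
}
```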