Class RecordReaderGenericRdfTripleBase
java.lang.Object
org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,T>
net.sansa_stack.hadoop.core.RecordReaderGenericBase<U,G,A,T>
net.sansa_stack.hadoop.format.jena.base.RecordReaderGenericRdfBase<T,T,T,T>
net.sansa_stack.hadoop.format.jena.base.RecordReaderGenericRdfNonAccumulatingBase<org.apache.jena.graph.Triple>
net.sansa_stack.hadoop.format.jena.base.RecordReaderGenericRdfTripleBase
- All Implemented Interfaces:
Closeable, AutoCloseable
- Direct Known Subclasses:
RecordReaderRdfNTriples, RecordReaderRdfTurtleTriple
public class RecordReaderGenericRdfTripleBase
extends RecordReaderGenericRdfNonAccumulatingBase<org.apache.jena.graph.Triple>
-
Nested Class Summary
Nested classes/interfaces inherited from class net.sansa_stack.hadoop.core.RecordReaderGenericBase
RecordReaderGenericBase.ReadTooFarException -
Field Summary
Fields inherited from class net.sansa_stack.hadoop.format.jena.base.RecordReaderGenericRdfBase
baseIri, baseIriKey, headerBytesKey, lang, prefixesMaxLengthKey, prefixMap
Fields inherited from class net.sansa_stack.hadoop.core.RecordReaderGenericBase
accumulating, codec, currentKey, currentValue, datasetFlow, decompressor, EMPTY_BYTE_ARRAY, enableStats, isEncoded, isFirstSplit, maxExtraByteCount, maxRecordLength, maxRecordLengthKey, minRecordLength, minRecordLengthKey, postambleBytes, preambleBytes, probeElementCount, probeElementCountKey, probeRecordCount, probeRecordCountKey, rawStream, recordFlowCloseable, recordStartPattern, regionStartSearchReadOverRegionEnd, regionStartSearchReadOverSplitEnd, skipRecordCount, split, splitEnd, splitId, splitLength, splitName, splitStart, stream, tailByteBuffer, tailBytes, tailEltBuffer, tailElts, tailEltsTime, tailRecordOffset, totalEltCount, totalRecordCount -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
protected Stream<org.apache.jena.graph.Triple> | parse(InputStream inputStream, boolean isProbe) | Create a flowable from the input stream.
Methods inherited from class net.sansa_stack.hadoop.format.jena.base.RecordReaderGenericRdfBase
initialize, setupParser
Methods inherited from class net.sansa_stack.hadoop.core.RecordReaderGenericBase
abbreviate, abbreviate, abbreviateAsUTF8, aggregate, aggregate, close, convert, createMatcherFactory, createRecordFlow, detectTail, didHitSplitBound, effectiveInputStream, effectiveInputStreamSupp, findFirstPositionWithProbeSuccess, findNextRegion, getCurrentKey, getCurrentValue, getPos, getPosition, getProgress, getStats, initRecordFlow, lines, logClose, logUnexpectedClose, nextKeyValue, parseFromSeekable, prober, setStreamToInterval, unbufferedStream
-
Constructor Details
-
RecordReaderGenericRdfTripleBase
-
-
Method Details
-
parse
Description copied from class: RecordReaderGenericBase
Create a flowable from the input stream. The input stream may be incorrectly positioned, in which case the Flowable is expected to indicate this by raising an error event.
- Specified by:
parse in class RecordReaderGenericBase<org.apache.jena.graph.Triple, org.apache.jena.graph.Triple, org.apache.jena.graph.Triple, org.apache.jena.graph.Triple>
- Parameters:
inputStream - A supplier of input streams. May supply the same underlying stream on each call, hence at most a single stream should be taken from the supplier. Supplied streams are safe to use in try-with-resources blocks (possibly using CloseShieldInputStream). Taken streams should be closed by the client code.
isProbe - Whether the parser should be configured for probing. For example, it is desirable to suppress parse errors during probing. Also, for probing the parser may optimize itself for minimizing latency of yielding items rather than overall throughput.
- Returns:
-
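The probing contract described for isProbe can be illustrated without Jena or Hadoop. The following is a minimal JDK-only sketch, not the SANSA implementation. ProbeAwareParseSketch, SimpleTriple, and reportedErrors are hypothetical names, and the whitespace splitting is only a crude stand-in for a real N-Triples parser. It also assumes that "suppress parse errors" means "do not report them, while still failing the stream"; the actual parser setup may behave differently.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical illustration only; not part of SANSA, Jena, or Hadoop. */
final class ProbeAwareParseSketch {

    /** Hypothetical stand-in for org.apache.jena.graph.Triple. */
    record SimpleTriple(String subject, String predicate, String object) {}

    /** Diagnostics that a real parser would hand to its logger or error handler. */
    final List<String> reportedErrors = new ArrayList<>();

    /**
     * Lazily parses one triple per line. A malformed line, such as one read
     * from a badly positioned stream, makes the stream fail on consumption.
     */
    Stream<SimpleTriple> parse(InputStream inputStream, boolean isProbe) {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(inputStream, StandardCharsets.UTF_8));
        return reader.lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty() && !line.startsWith("#"))
                .map(line -> parseLine(line, isProbe));
    }

    private SimpleTriple parseLine(String line, boolean isProbe) {
        // Split into subject, predicate, and "object ." (the object may contain spaces).
        String[] parts = line.split("\\s+", 3);
        String object = (parts.length == 3 && parts[2].endsWith("."))
                ? parts[2].substring(0, parts[2].length() - 1).trim()
                : "";
        if (object.isEmpty()) {
            // Assumption: probing keeps quiet, a regular read reports the problem.
            if (!isProbe) {
                reportedErrors.add("Malformed line: " + line);
            }
            // In both modes the failure still surfaces to the consumer.
            throw new IllegalArgumentException("Not a triple: " + line);
        }
        return new SimpleTriple(parts[0], parts[1], object);
    }

    public static void main(String[] args) {
        byte[] aligned = "<s1> <p> <o1> .\n<s2> <p> <o2> .\n".getBytes(StandardCharsets.UTF_8);
        // Simulates a stream that starts in the middle of a record.
        byte[] misaligned = "o1> .\n<s2> <p> <o2> .\n".getBytes(StandardCharsets.UTF_8);

        ProbeAwareParseSketch sketch = new ProbeAwareParseSketch();
        List<SimpleTriple> triples = sketch.parse(new ByteArrayInputStream(aligned), false)
                .collect(Collectors.toList());
        System.out.println("Aligned input yielded " + triples.size() + " triples");

        try {
            sketch.parse(new ByteArrayInputStream(misaligned), true)
                    .collect(Collectors.toList());
        } catch (IllegalArgumentException e) {
            System.out.println("Probe rejected the offset; errors reported: "
                    + sketch.reportedErrors.size());
        }
    }
}
```

In this sketch a caller that is searching for a record boundary would call parse with isProbe set to true at each candidate offset and treat an exception as "wrong position", then re-read from the accepted offset with isProbe set to false.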