Package net.sansa_stack.hadoop.core
Class SeekableSourceOverSplit
java.lang.Object
net.sansa_stack.hadoop.core.SeekableSourceOverSplit
- All Implemented Interfaces:
Closeable, AutoCloseable, org.aksw.commons.io.buffer.array.HasArrayOps<byte[]>, org.aksw.commons.io.input.ReadableChannelFactory<byte[]>, org.aksw.commons.io.input.ReadableChannelSource<byte[]>, org.aksw.commons.io.input.SeekableReadableChannelSource<byte[]>
public class SeekableSourceOverSplit
extends Object
implements org.aksw.commons.io.input.SeekableReadableChannelSource<byte[]>, Closeable
A seekable source over a split (usually a Hadoop input split). When a read attempts to cross the
split boundary, a "transition" action is called.
This action may scan ahead for an end position after the split boundary.
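The boundary behavior described above can be illustrated with a small self-contained sketch. It does not use the SANSA classes: `SplitReadSketch`, `scanForEnd`, and `readSplit` are hypothetical names, and the choice of a newline as the end marker is an assumption made only for illustration. The actual transition action in this class may work differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SplitReadSketch {
    /**
     * Hypothetical transition action: scan ahead from the split boundary
     * for an end position. Here the end is assumed to lie just past the
     * first newline at or after the boundary.
     */
    static int scanForEnd(byte[] data, int splitEnd) {
        for (int i = splitEnd; i < data.length; i++) {
            if (data[i] == '\n') {
                return i + 1;
            }
        }
        return data.length; // no marker found: the end is the end of the data
    }

    /**
     * Reads the split [splitStart, splitEnd). If the split boundary lies
     * before the end of the data, the transition action extends the read
     * to the end position it finds.
     */
    static byte[] readSplit(byte[] data, int splitStart, int splitEnd) {
        int end = splitEnd;
        if (splitEnd < data.length) {
            end = scanForEnd(data, splitEnd);
        }
        return Arrays.copyOfRange(data, splitStart, end);
    }

    public static void main(String[] args) {
        byte[] data = "aaa\nbbbbb\nccc\n".getBytes(StandardCharsets.US_ASCII);
        // The split boundary at offset 6 falls in the middle of the second record.
        byte[] read = readSplit(data, 0, 6);
        System.out.print(new String(read, StandardCharsets.US_ASCII));
    }
}
```

In this sketch the read for the split ending at offset 6 continues to offset 10, so the record that straddles the boundary is returned whole.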
-
Field Summary
Fields (Modifier and Type, Field, Description)
protected NavigableMap<Long,Long> absPosToBlockOffset
protected org.aksw.commons.io.input.ReadableChannel<byte[]> baseStream
    The total number of bytes that need to be read from base until the split boundary is reached.
protected org.aksw.commons.io.input.SeekableReadableChannel<byte[]> debufferedHead
protected org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> headBuffer
protected org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> postambleBuffer
    The postamble buffer is only served if a limit is set via SeekableSourceOverSplit.Channel.setLimit(long). If no limit is set, the remainder of the stream is consumed, which is assumed to include the postamble.
protected NavigableMap<Long,Integer> posToIndex
protected org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> tailBuffer
-
Constructor Summary
Constructors (Constructor, Description)
SeekableSourceOverSplit(org.aksw.commons.io.input.ReadableChannel<byte[]> baseStream, org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> headBuffer, org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> tailBuffer, org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> postambleBuffer, NavigableMap<Long, Long> absPosToBlockOffset)
    If true then the headStream can no longer be used.
-
Method Summary
Methods (Modifier and Type, Method, Description)
void close()
protected static SeekableSourceOverSplit create(org.aksw.commons.io.input.ReadableChannel<byte[]> baseStream, org.aksw.commons.io.input.ReadableChannel<byte[]> headStream, byte[] postambleBytes, NavigableMap<Long, Long> blockOffsetToAbsPos)
static SeekableSourceOverSplit createForBlockEncodedStream(org.aksw.commons.io.hadoop.SeekableInputStream inn, long splitPoint, byte[] postambleBytes)
static SeekableSourceOverSplit createForNonEncodedStream(org.aksw.commons.io.hadoop.SeekableInputStream in, long splitPoint, byte[] postambleBytes)
org.aksw.commons.io.buffer.array.ArrayOps<byte[]> getArrayOps()
long getBlockForPos(long pos)
protected org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> getBufferByBaseOffset(long baseOffset)
protected org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> getBufferByIndex(int index)
protected org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> getBufferByIndexUnsafe(int index)
org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> getHeadBuffer()
long getHeadSize()
long getKnownSize()
org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> getTailBuffer()
net.sansa_stack.hadoop.core.SeekableSourceOverSplit.Channel newReadableChannel()
protected void setupTailBuffer()
long size()
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.aksw.commons.io.input.SeekableReadableChannelSource
newReadableChannel, newReadableChannel, newReadableChannel
-
Field Details
-
baseStream
protected org.aksw.commons.io.input.ReadableChannel<byte[]> baseStream
The total number of bytes that need to be read from base until the split boundary is reached. A value of -1 indicates unknown. For non-encoded streams this is simply the length of the split.
-
headBuffer
protected org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> headBuffer -
tailBuffer
protected org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> tailBuffer -
postambleBuffer
protected org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> postambleBuffer
The postamble buffer is only served if a limit is set via SeekableSourceOverSplit.Channel.setLimit(long). If no limit is set, the remainder of the stream is consumed, which is assumed to include the postamble.
-
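The two cases in the postambleBuffer description can be sketched as follows. This is a simplified stand-in, not the behavior of SeekableSourceOverSplit.Channel itself: `PostambleSketch` and `readAll` are hypothetical names, a negative limit is an assumed convention for "no limit set", and the XML-like data is only an example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class PostambleSketch {
    /**
     * Hypothetical helper. A negative limit stands for "no limit set"
     * (an assumption of this sketch, not of the documented API).
     */
    static byte[] readAll(byte[] stream, long limit, byte[] postamble) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (limit < 0) {
            // No limit: consume the remainder of the stream, which is
            // assumed to already contain the postamble.
            out.write(stream, 0, stream.length);
        } else {
            // Limit set: serve bytes up to the limit, then the postamble.
            int n = (int) Math.min(limit, stream.length);
            out.write(stream, 0, n);
            out.write(postamble, 0, postamble.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] stream = "<r><a/><b/></r>".getBytes(StandardCharsets.US_ASCII);
        byte[] postamble = "</r>".getBytes(StandardCharsets.US_ASCII);

        // With a limit of 7 bytes the postamble closes the truncated document.
        System.out.println(new String(readAll(stream, 7, postamble), StandardCharsets.US_ASCII));
        // Without a limit the stream is returned unchanged.
        System.out.println(new String(readAll(stream, -1, postamble), StandardCharsets.US_ASCII));
    }
}
```

The point of the sketch is that a reader stopped early by a limit still sees well-terminated data, while an unlimited reader relies on the stream's own ending.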
debufferedHead
protected org.aksw.commons.io.input.SeekableReadableChannel<byte[]> debufferedHead -
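posToIndex
protected NavigableMap<Long,Integer> posToIndex
-
absPosToBlockOffset
protected NavigableMap<Long,Long> absPosToBlockOffset
-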
-
Constructor Details
-
Method Details
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException
-
getBlockForPos
public long getBlockForPos(long pos) -
getKnownSize
public long getKnownSize() -
getAbsPosToBlockOffset
- Returns:
- null if the underlying stream is not based on blocks; otherwise a map of byte-offsets (starting from zero) to block offsets
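A map from byte offsets to block offsets is typically queried with a floor lookup. The following is a minimal sketch under that assumption; `BlockLookupSketch` and `blockForPos` are hypothetical names, the sample offsets are made up, and the documented getBlockForPos(long) may use different lookup rules.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

final class BlockLookupSketch {
    /**
     * Hypothetical lookup: returns the block offset recorded for the
     * greatest byte offset that is less than or equal to pos.
     */
    static long blockForPos(NavigableMap<Long, Long> absPosToBlockOffset, long pos) {
        Map.Entry<Long, Long> entry = absPosToBlockOffset.floorEntry(pos);
        if (entry == null) {
            throw new IllegalArgumentException("No block starts at or before position " + pos);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        // Made-up sample: decoded byte offsets (from zero) mapped to block offsets.
        NavigableMap<Long, Long> map = new TreeMap<>();
        map.put(0L, 0L);
        map.put(900L, 4096L);
        map.put(1800L, 8192L);

        System.out.println(blockForPos(map, 1000L)); // prints 4096
    }
}
```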
-
getBufferByBaseOffset
protected org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> getBufferByBaseOffset(long baseOffset) -
getBufferByIndex
protected org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> getBufferByIndex(int index) -
getBufferByIndexUnsafe
protected org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> getBufferByIndexUnsafe(int index) -
setupTailBuffer
protected void setupTailBuffer() -
getHeadBuffer
public org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> getHeadBuffer() -
getTailBuffer
public org.aksw.commons.io.buffer.array.BufferOverReadableChannel<byte[]> getTailBuffer() -
createForNonEncodedStream
public static SeekableSourceOverSplit createForNonEncodedStream(org.aksw.commons.io.hadoop.SeekableInputStream in, long splitPoint, byte[] postambleBytes) -
createForBlockEncodedStream
public static SeekableSourceOverSplit createForBlockEncodedStream(org.aksw.commons.io.hadoop.SeekableInputStream inn, long splitPoint, byte[] postambleBytes) -
getHeadSize
public long getHeadSize() -
newReadableChannel
public net.sansa_stack.hadoop.core.SeekableSourceOverSplit.Channel newReadableChannel() throws IOException
- Specified by:
newReadableChannel in interface org.aksw.commons.io.input.ReadableChannelFactory<byte[]>
- Specified by:
newReadableChannel in interface org.aksw.commons.io.input.SeekableReadableChannelSource<byte[]>
- Throws:
IOException
-
size
- Specified by:
size in interface org.aksw.commons.io.input.ReadableChannelSource<byte[]>
- Throws:
IOException
-
getArrayOps
public org.aksw.commons.io.buffer.array.ArrayOps<byte[]> getArrayOps()
- Specified by:
getArrayOps in interface org.aksw.commons.io.buffer.array.HasArrayOps<byte[]>
-