Class BZip2CodecAdapted
java.lang.Object
org.aksw.commons.io.hadoop.compress.bzip2.BZip2CodecAdapted
- All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.io.compress.CompressionCodec, org.apache.hadoop.io.compress.SplittableCompressionCodec
@Public
@Evolving
public class BZip2CodecAdapted
extends Object
implements org.apache.hadoop.conf.Configurable, org.apache.hadoop.io.compress.SplittableCompressionCodec
This class provides output and input streams for bzip2 compression
and decompression. It uses the native bzip2 library on the system
if possible, else it uses a pure-Java implementation of the bzip2
algorithm. The configuration parameter
io.compression.codec.bzip2.library can be used to control this
behavior.
In pure-Java mode, the Compressor and Decompressor interfaces are
not implemented. Consequently, the CompressionCodec methods that
take a Compressor or Decompressor argument throw
UnsupportedOperationException in that mode.
Currently, support for splittability is available only in the
pure-Java mode; therefore, if a SplitCompressionInputStream is
requested, the pure-Java implementation is used, regardless of the
setting of the configuration parameter mentioned above.
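Splittability rests on a property of the bzip2 format itself: every compressed block starts with the 48-bit marker 0x314159265359, and blocks are packed at bit granularity rather than byte granularity. The following JDK-only sketch is not this class's implementation. It uses the hypothetical names BlockMarkerScan and findBlockMarker to illustrate how a reader that starts at an arbitrary offset could locate the next block start.

```java
/** Illustrative only: locates bzip2 block markers in a byte array. */
final class BlockMarkerScan {
    /** 48-bit magic number that starts every bzip2 compressed block. */
    static final long BLOCK_MAGIC = 0x314159265359L;
    private static final long MASK_48 = 0xFFFFFFFFFFFFL;

    /**
     * Returns the bit offset of the first block marker that starts at or
     * after fromBit, or -1 if none is found. Bits are read most significant
     * bit first, which is how bzip2 packs its bit stream.
     */
    static long findBlockMarker(byte[] data, long fromBit) {
        long totalBits = (long) data.length * 8;
        long window = 0;   // sliding window over the last 48 bits read
        long bitsSeen = 0;
        for (long bit = Math.max(0, fromBit); bit < totalBits; bit++) {
            int current = (data[(int) (bit >>> 3)] >>> (7 - (int) (bit & 7))) & 1;
            window = ((window << 1) | current) & MASK_48;
            bitsSeen++;
            if (bitsSeen >= 48 && window == BLOCK_MAGIC) {
                return bit - 47; // offset of the marker's first bit
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stream header "BZh9" followed directly by one block marker.
        byte[] data = {'B', 'Z', 'h', '9', 0x31, 0x41, 0x59, 0x26, 0x53, 0x59, 0, 0};
        System.out.println(findBlockMarker(data, 0)); // prints 32
    }
}
```

Because a marker can begin at any bit position, the scan advances one bit at a time rather than one byte at a time.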
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.hadoop.io.compress.CompressionCodec
org.apache.hadoop.io.compress.CompressionCodec.Util
Nested classes/interfaces inherited from interface org.apache.hadoop.io.compress.SplittableCompressionCodec
org.apache.hadoop.io.compress.SplittableCompressionCodec.READ_MODE
-
Constructor Summary
Constructors
BZip2CodecAdapted()
    Creates a new instance of BZip2CodecAdapted.
BZip2CodecAdapted(int bufferSize)
-
Method Summary
Modifier and Type, Method, Description
org.apache.hadoop.io.compress.Compressor createCompressor()
    Create a new Compressor for use by this CompressionCodec.
org.apache.hadoop.io.compress.Decompressor createDecompressor()
    Create a new Decompressor for use by this CompressionCodec.
org.apache.hadoop.io.compress.CompressionInputStream createInputStream(InputStream in)
    Create a CompressionInputStream that will read from the given input stream and return a stream for uncompressed data.
org.apache.hadoop.io.compress.CompressionInputStream createInputStream(InputStream in, org.apache.hadoop.io.compress.Decompressor decompressor)
    Create a CompressionInputStream that will read from the given InputStream with the given Decompressor, and return a stream for uncompressed data.
org.apache.hadoop.io.compress.SplitCompressionInputStream createInputStream(InputStream seekableIn, org.apache.hadoop.io.compress.Decompressor decompressor, long start, long end, org.apache.hadoop.io.compress.SplittableCompressionCodec.READ_MODE readMode)
    Creates a CompressionInputStream for reading uncompressed data in one of the two reading modes.
org.apache.hadoop.io.compress.CompressionOutputStream createOutputStream(OutputStream out)
    Create a CompressionOutputStream that will write to the given OutputStream.
org.apache.hadoop.io.compress.CompressionOutputStream createOutputStream(OutputStream out, org.apache.hadoop.io.compress.Compressor compressor)
    Create a CompressionOutputStream that will write to the given OutputStream with the given Compressor.
Class<? extends org.apache.hadoop.io.compress.Compressor> getCompressorType()
    Get the type of Compressor needed by this CompressionCodec.
org.apache.hadoop.conf.Configuration getConf()
    Return the configuration used by this object.
Class<? extends org.apache.hadoop.io.compress.Decompressor> getDecompressorType()
    Get the type of Decompressor needed by this CompressionCodec.
String getDefaultExtension()
    .bz2 is recognized as the default extension for compressed BZip2 files.
void setConf(org.apache.hadoop.conf.Configuration conf)
    Set the configuration to be used by this object.
-
Constructor Details
-
BZip2CodecAdapted
public BZip2CodecAdapted()
Creates a new instance of BZip2CodecAdapted.
-
BZip2CodecAdapted
public BZip2CodecAdapted(int bufferSize)
-
-
Method Details
-
setConf
public void setConf(org.apache.hadoop.conf.Configuration conf)
Set the configuration to be used by this object.
- Specified by:
setConf in interface org.apache.hadoop.conf.Configurable
- Parameters:
conf - the configuration object.
-
getConf
public org.apache.hadoop.conf.Configuration getConf()
Return the configuration used by this object.
- Specified by:
getConf in interface org.apache.hadoop.conf.Configurable
- Returns:
- the configuration object used by this object.
-
createOutputStream
public org.apache.hadoop.io.compress.CompressionOutputStream createOutputStream(OutputStream out) throws IOException
Create a CompressionOutputStream that will write to the given OutputStream.
- Specified by:
createOutputStream in interface org.apache.hadoop.io.compress.CompressionCodec
- Parameters:
out - the location for the final output stream
- Returns:
- a stream the user can write uncompressed data to, to have it compressed
- Throws:
IOException
-
createOutputStream
public org.apache.hadoop.io.compress.CompressionOutputStream createOutputStream(OutputStream out, org.apache.hadoop.io.compress.Compressor compressor) throws IOException
Create a CompressionOutputStream that will write to the given OutputStream with the given Compressor.
- Specified by:
createOutputStream in interface org.apache.hadoop.io.compress.CompressionCodec
- Parameters:
out - the location for the final output stream
compressor - compressor to use
- Returns:
- a stream the user can write uncompressed data to, to have it compressed
- Throws:
IOException
-
getCompressorType
public Class<? extends org.apache.hadoop.io.compress.Compressor> getCompressorType()
Get the type of Compressor needed by this CompressionCodec.
- Specified by:
getCompressorType in interface org.apache.hadoop.io.compress.CompressionCodec
- Returns:
- the type of compressor needed by this codec.
-
createCompressor
public org.apache.hadoop.io.compress.Compressor createCompressor()
Create a new Compressor for use by this CompressionCodec.
- Specified by:
createCompressor in interface org.apache.hadoop.io.compress.CompressionCodec
- Returns:
- a new compressor for use by this codec
-
createInputStream
public org.apache.hadoop.io.compress.CompressionInputStream createInputStream(InputStream in) throws IOException
Create a CompressionInputStream that will read from the given input stream and return a stream for uncompressed data.
- Specified by:
createInputStream in interface org.apache.hadoop.io.compress.CompressionCodec
- Parameters:
in - the stream to read compressed bytes from
- Returns:
- a stream to read uncompressed bytes from
- Throws:
IOException
-
createInputStream
public org.apache.hadoop.io.compress.CompressionInputStream createInputStream(InputStream in, org.apache.hadoop.io.compress.Decompressor decompressor) throws IOException
Create a CompressionInputStream that will read from the given InputStream with the given Decompressor, and return a stream for uncompressed data.
- Specified by:
createInputStream in interface org.apache.hadoop.io.compress.CompressionCodec
- Parameters:
in - the stream to read compressed bytes from
decompressor - decompressor to use
- Returns:
- a stream to read uncompressed bytes from
- Throws:
IOException
-
createInputStream
public org.apache.hadoop.io.compress.SplitCompressionInputStream createInputStream(InputStream seekableIn, org.apache.hadoop.io.compress.Decompressor decompressor, long start, long end, org.apache.hadoop.io.compress.SplittableCompressionCodec.READ_MODE readMode) throws IOException
Creates a CompressionInputStream for reading uncompressed data in one of the two reading modes: continuous or blocked.
- Specified by:
createInputStream in interface org.apache.hadoop.io.compress.SplittableCompressionCodec
- Parameters:
seekableIn - The InputStream
start - The start offset into the compressed stream
end - The end offset into the compressed stream
readMode - Controls whether progress is reported continuously or only at block boundaries.
- Returns:
- CompressionInputStream for BZip2 aligned at block boundaries
- Throws:
IOException
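The start and end parameters are byte offsets into the compressed stream. As a rough illustration of where such offsets might come from, the JDK-only sketch below cuts a file of a given length into fixed-size ranges. The names SplitRanges and compute are hypothetical and not part of this class. The sketch covers only how a caller could choose the ranges, not how the codec aligns them to block boundaries.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: derives [start, end) byte ranges for a compressed file. */
final class SplitRanges {
    /** Returns consecutive ranges of at most splitSize bytes covering fileLength bytes. */
    static List<long[]> compute(long fileLength, long splitSize) {
        if (fileLength < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and splitSize > 0");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            long end = Math.min(start + splitSize, fileLength); // last range may be shorter
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 2500-byte file cut into ranges of at most 1000 bytes.
        for (long[] range : compute(2500, 1000)) {
            System.out.println(range[0] + " .. " + range[1]);
        }
    }
}
```

Each resulting pair would then be a candidate for the start and end arguments of one call. Since the method returns a stream aligned at block boundaries, the offsets themselves do not need to coincide with block starts.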
-
getDecompressorType
public Class<? extends org.apache.hadoop.io.compress.Decompressor> getDecompressorType()
Get the type of Decompressor needed by this CompressionCodec.
- Specified by:
getDecompressorType in interface org.apache.hadoop.io.compress.CompressionCodec
- Returns:
- the type of decompressor needed by this codec.
-
createDecompressor
public org.apache.hadoop.io.compress.Decompressor createDecompressor()
Create a new Decompressor for use by this CompressionCodec.
- Specified by:
createDecompressor in interface org.apache.hadoop.io.compress.CompressionCodec
- Returns:
- a new decompressor for use by this codec
-
getDefaultExtension
public String getDefaultExtension()
.bz2 is recognized as the default extension for compressed BZip2 files.
- Specified by:
getDefaultExtension in interface org.apache.hadoop.io.compress.CompressionCodec
- Returns:
- A String telling the default bzip2 file extension
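A file extension is only a naming convention, so callers sometimes pair the extension check with a look at the first bytes of the file. The JDK-only sketch below assumes the returned extension is the literal string ".bz2", including the dot, and relies on the bzip2 stream header: the bytes "BZh" followed by a block-size digit from '1' to '9'. The names Bzip2Sniff, hasDefaultExtension and hasBzip2Header are hypothetical and not part of this class.

```java
/** Illustrative only: checks a file name and leading bytes for bzip2 conventions. */
final class Bzip2Sniff {
    /** Assumed value of the default extension, including the leading dot. */
    static final String DEFAULT_EXTENSION = ".bz2";

    /** True if the file name ends with the default bzip2 extension. */
    static boolean hasDefaultExtension(String fileName) {
        return fileName.endsWith(DEFAULT_EXTENSION);
    }

    /** True if the bytes start with "BZh" followed by a digit from '1' to '9'. */
    static boolean hasBzip2Header(byte[] head) {
        return head.length >= 4
                && head[0] == 'B'
                && head[1] == 'Z'
                && head[2] == 'h'
                && head[3] >= '1'
                && head[3] <= '9';
    }

    public static void main(String[] args) {
        System.out.println(hasDefaultExtension("data.nt.bz2"));                  // prints true
        System.out.println(hasBzip2Header(new byte[] {'B', 'Z', 'h', '9', 0x31})); // prints true
    }
}
```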
-