Class RDFDataMgrEx

java.lang.Object
org.aksw.jenax.sparql.query.rx.RDFDataMgrEx

public class RDFDataMgrEx extends Object
Extensions that help open an InputStream of unknown content by probing it against the languages registered with the Jena RIOT system. This includes languages based on triples, quads and result sets. Support for further types may be added in the future.
Author:
Claus Stadler, Dec 18, 2018
  • Field Details

    • DEFAULT_PROBE_LANGS

      public static final List<org.apache.jena.riot.Lang> DEFAULT_PROBE_LANGS
  • Constructor Details

    • RDFDataMgrEx

      public RDFDataMgrEx()
  • Method Details

    • toString

      public static String toString(org.apache.jena.rdf.model.Model model, org.apache.jena.riot.RDFFormat rdfFormat)
    • toString

      public static String toString(org.apache.jena.query.Dataset dataset, org.apache.jena.riot.RDFFormat rdfFormat)
    • isStdIn

      public static boolean isStdIn(String filenameOrIri)
    • getLang

      public static org.apache.jena.riot.Lang getLang(org.apache.jena.atlas.web.TypedInputStream tin)
      Map a TypedInputStream's media type to a Lang
      Parameters:
      tin -
      Returns:
    • read

      public static void read(org.apache.jena.rdf.model.Model model, org.apache.jena.atlas.web.TypedInputStream tin)
    • forceBuffered

      public static org.apache.jena.atlas.web.TypedInputStream forceBuffered(org.apache.jena.atlas.web.TypedInputStream tin)
      Return a TypedInputStream whose underlying InputStream supports marks. If the original stream already supports marks, it is returned as is.
      Parameters:
      tin -
      Returns:
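The "return as is, otherwise wrap" behavior described for forceBuffered can be sketched with plain java.io classes. This is a minimal illustration, not the library's implementation: the class MarkSupportSketch and the method ensureMarkSupported are hypothetical names, and a plain InputStream stands in for TypedInputStream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;

final class MarkSupportSketch {

    /** Returns the stream unchanged if it supports mark/reset, otherwise wraps it in a buffer. */
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    public static void main(String[] args) {
        // ByteArrayInputStream supports mark/reset out of the box.
        InputStream alreadyMarkable = new ByteArrayInputStream(new byte[] {1, 2, 3});

        // SequenceInputStream inherits InputStream.markSupported(), which returns false.
        InputStream notMarkable = new SequenceInputStream(
                new ByteArrayInputStream(new byte[] {1}),
                new ByteArrayInputStream(new byte[] {2}));

        // The first stream is handed back as is; the second one gets wrapped.
        System.out.println(ensureMarkSupported(alreadyMarkable) == alreadyMarkable);
        System.out.println(ensureMarkSupported(notMarkable).markSupported());
    }
}
```

After such a call, only the returned stream should be used, because a buffering wrapper may already have consumed bytes from the original.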
    • forceBuffered

      public static InputStream forceBuffered(InputStream in)
    • wrapInputStream

      public static org.apache.jena.atlas.web.TypedInputStream wrapInputStream(InputStream in, org.apache.jena.atlas.web.TypedInputStream proto)
      Wrap an InputStream as a TypedInputStream, taking the attributes from the given prototype TypedInputStream (proto).
      Parameters:
      in -
      proto -
      Returns:
    • decode

      public static InputStream decode(InputStream in, List<String> codecs) throws org.apache.commons.compress.compressors.CompressorException
      Throws:
      org.apache.commons.compress.compressors.CompressorException
    • decode

      public static InputStream decode(InputStream in, List<String> codecs, org.apache.commons.compress.compressors.CompressorStreamFactory csf) throws org.apache.commons.compress.compressors.CompressorException
      Decode a given input stream based on a sequence of codec names.
      Parameters:
      in -
      codecs -
      csf -
      Returns:
      Throws:
      org.apache.commons.compress.compressors.CompressorException
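Decoding by a sequence of codec names amounts to wrapping the stream once per codec. The sketch below illustrates the idea with java.util.zip only; it does not use Commons Compress or CompressorStreamFactory. The class name CodecChainSketch, the codec labels "gz" and "deflate", and the ordering convention (names listed in the order in which the codecs were applied during encoding) are assumptions of this sketch and are not taken from the library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

final class CodecChainSketch {

    /** Unwraps the codecs starting with the one that was applied last (the outermost layer). */
    static InputStream decode(InputStream in, List<String> codecs) throws IOException {
        InputStream result = in;
        for (int i = codecs.size() - 1; i >= 0; i--) {
            String codec = codecs.get(i);
            switch (codec) {
                case "gz": result = new GZIPInputStream(result); break;
                case "deflate": result = new InflaterInputStream(result); break;
                default: throw new IllegalArgumentException("Unknown codec: " + codec);
            }
        }
        return result;
    }

    /** Builds the matching encoder chain: data written to the result passes the first codec first. */
    static OutputStream encode(OutputStream out, List<String> codecs) throws IOException {
        OutputStream result = out;
        for (int i = codecs.size() - 1; i >= 0; i--) {
            String codec = codecs.get(i);
            switch (codec) {
                case "gz": result = new GZIPOutputStream(result); break;
                case "deflate": result = new DeflaterOutputStream(result); break;
                default: throw new IllegalArgumentException("Unknown codec: " + codec);
            }
        }
        return result;
    }

    /** Encodes and then decodes a string in memory using the same codec list. */
    static String roundTrip(String text, List<String> codecs) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (OutputStream out = encode(sink, codecs)) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            try (InputStream in = decode(new ByteArrayInputStream(sink.toByteArray()), codecs)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello", List.of("deflate", "gz")));
    }
}
```

Both loops run in reverse because the codec applied last forms the outermost layer of the byte stream, so its wrapper must sit closest to the raw stream.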
    • encode

      public static OutputStream encode(OutputStream out, List<String> codecs, org.apache.commons.compress.compressors.CompressorStreamFactory csf) throws org.apache.commons.compress.compressors.CompressorException
      Throws:
      org.apache.commons.compress.compressors.CompressorException
    • encoder

      public static Function<OutputStream,OutputStream> encoder(String... codecs)
    • encoder

      public static Function<OutputStream,OutputStream> encoder(List<String> codecs)
    • encoder

      public static Function<OutputStream,OutputStream> encoder(org.apache.commons.compress.compressors.CompressorStreamFactory csf, List<String> codecs)
    • probeEncodings

      public static InputStream probeEncodings(InputStream is, List<String> outEncodings) throws IOException
      Given an input stream with mark support, attempt to create a decoded input stream. The returned stream will be ready for further use with all detected encodings added to outEncodings.
      Parameters:
      is - An input stream that must have mark support
      outEncodings - Output argument. Detected encodings will be added to that list (if not null).
      Throws:
      IOException
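Encoding detection on a mark-supporting stream can be sketched as follows: mark, peek at the leading bytes, reset, and wrap the stream in a decoder when a known signature is found. The sketch only recognizes the gzip magic bytes (0x1f 0x8b) and uses java.util.zip; the class ProbeEncodingsSketch, its method names, and the label "gz" are hypothetical, and the actual set of codecs probed by probeEncodings is not specified here.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class ProbeEncodingsSketch {

    /** Strips gzip layers as long as the stream starts with the gzip magic bytes. */
    static InputStream probeGzipLayers(InputStream is, List<String> outEncodings) throws IOException {
        if (!is.markSupported()) {
            throw new IllegalArgumentException("Input stream must support mark/reset");
        }
        InputStream current = is;
        while (true) {
            current.mark(2);            // remember the position before peeking
            int b0 = current.read();
            int b1 = current.read();
            current.reset();            // rewind so no data is lost
            if (b0 != 0x1f || b1 != 0x8b) {
                return current;         // no further gzip layer detected
            }
            if (outEncodings != null) {
                outEncodings.add("gz"); // the outermost layer is recorded first
            }
            // Re-buffer so that the next iteration can mark/reset on the decoded data.
            current = new BufferedInputStream(new GZIPInputStream(current));
        }
    }

    /** Helper for the demo: gzip-compresses a byte array in memory. */
    static byte[] gzip(byte[] data) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (GZIPOutputStream out = new GZIPOutputStream(sink)) {
                out.write(data);
            }
            return sink.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Helper for the demo: returns the list of detected encodings. */
    static List<String> detect(byte[] data) {
        List<String> encodings = new ArrayList<>();
        try (InputStream decoded = probeGzipLayers(new ByteArrayInputStream(data), encodings)) {
            decoded.readAllBytes();     // the decoded stream is ready for further use
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return encodings;
    }

    public static void main(String[] args) {
        byte[] plain = "some content".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        System.out.println(detect(gzip(gzip(plain))));
    }
}
```

A signature check like this is a heuristic: uncompressed content that happens to begin with the same two bytes would be misdetected and fail in the GZIPInputStream constructor.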
    • probeEntityInfo

      public static RdfEntityInfo probeEntityInfo(Callable<InputStream> inSupp, Iterable<org.apache.jena.riot.Lang> candidates) throws IOException
      Probe an input stream for any encodings (e.g. compression codecs) and for the content type of the decoded data.
       try (InputStream in = ...) {
         RdfEntityInfo entityInfo = RDFDataMgrEx.probeEntityInfo(in, RDFDataMgrEx.DEFAULT_PROBE_LANGS);
       }
       
      Parameters:
      inSupp -
      candidates -
      Returns:
      Throws:
      IOException
    • probeEntityInfo

      public static RdfEntityInfo probeEntityInfo(Path path, Iterable<org.apache.jena.riot.Lang> candidates) throws IOException
      Throws:
      IOException
    • probeEntityInfo

      public static RdfEntityInfo probeEntityInfo(InputStream in, Iterable<org.apache.jena.riot.Lang> candidates) throws IOException
      This method closes the input stream when it returns.
      Throws:
      IOException
    • probeLang

      public static org.apache.jena.atlas.web.TypedInputStream probeLang(InputStream in, Iterable<org.apache.jena.riot.Lang> candidates, Collection<Map.Entry<org.apache.jena.riot.Lang,Throwable>> errorCollector)
    • probeLang

      public static org.apache.jena.atlas.web.TypedInputStream probeLang(InputStream in, Iterable<org.apache.jena.riot.Lang> candidates)
      Determine the RDF content of the given input stream. The returned input stream buffers the given stream if needed. Only the returned stream should be used after using this function. The following example shows how to obtain a Lang from the probing result:
       TypedInputStream tin = RDFDataMgrEx.probeLang(in, RDFDataMgrEx.DEFAULT_PROBE_LANGS);
       Lang lang = RDFLanguages.contentTypeToLang(tin.getContentType());
       
    • setDefaultMark

      public static void setDefaultMark(InputStream in)
    • probeLang

      public static org.apache.jena.atlas.web.TypedInputStream probeLang(InputStream in, Iterable<org.apache.jena.riot.Lang> candidates, boolean tryAllCandidates, Collection<Map.Entry<org.apache.jena.riot.Lang,Throwable>> errorCollector)
      Probe the content of the input stream against a given set of candidate languages. Wraps the input stream as a BufferedInputStream and can thus also probe on STDIN. This is also the reason why the method does not take an InputStream supplier as argument. The result is a TypedInputStream which combines the BufferedInputStream with content type information.
      Parameters:
      in -
      candidates -
      tryAllCandidates - If true, do not accept the first successful candidate; instead, try all candidates and pick the one that yields the most data.
      Returns:
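The probing strategy described above (buffer the stream, test each candidate against the buffered data, rewind) can be sketched without Jena. In this illustration the candidates are hypothetical toy "parsers" that either return the number of items they recognized or throw; ProbeCandidatesSketch, PROBE_LIMIT and probeText are invented names, and the real method's probe size, scoring and error collection may differ.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

final class ProbeCandidatesSketch {

    /** Number of leading bytes inspected while probing (an arbitrary choice for this sketch). */
    static final int PROBE_LIMIT = 8192;

    /**
     * Returns the name of the winning candidate, or null if every candidate fails.
     * The stream is rewound afterwards, so it can be read from the start.
     */
    static String probe(BufferedInputStream in, Map<String, ToIntFunction<String>> candidates,
                        boolean tryAll) throws IOException {
        in.mark(PROBE_LIMIT);
        byte[] head = new byte[PROBE_LIMIT];
        int n = in.readNBytes(head, 0, PROBE_LIMIT);
        in.reset();
        String prefix = new String(head, 0, n, StandardCharsets.UTF_8);

        String best = null;
        int bestScore = -1;
        for (Map.Entry<String, ToIntFunction<String>> e : candidates.entrySet()) {
            int score;
            try {
                score = e.getValue().applyAsInt(prefix);
            } catch (RuntimeException ex) {
                continue; // this candidate rejected the data
            }
            if (score > bestScore) {
                best = e.getKey();
                bestScore = score;
            }
            if (!tryAll) {
                break;    // accept the first successful candidate
            }
        }
        return best;
    }

    /** Demo helper: probes a string and returns "winner|content read after probing". */
    static String probeText(String text, boolean tryAll) {
        Map<String, ToIntFunction<String>> candidates = new LinkedHashMap<>();
        candidates.put("words", s -> 1);            // accepts anything, yields one item
        candidates.put("integers", s -> {           // yields one item per parsed integer
            int count = 0;
            for (String token : s.trim().split("\\s+")) {
                Integer.parseInt(token);            // throws NumberFormatException on failure
                count++;
            }
            return count;
        });
        try (BufferedInputStream in = new BufferedInputStream(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)))) {
            String winner = probe(in, candidates, tryAll);
            return winner + "|" + new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(probeText("1 2 3", false));
        System.out.println(probeText("1 2 3", true));
    }
}
```

With tryAll set to false the first candidate that succeeds wins, even when a later candidate would recognize more of the data; with tryAll set to true every candidate is scored and the highest score wins.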
    • probeLang

      public static org.apache.jena.atlas.web.TypedInputStream probeLang(InputStream in, Iterable<org.apache.jena.riot.Lang> candidates, boolean tryAllCandidates)
    • peek

      public static void peek(InputStream in)
    • open

      public static org.apache.jena.atlas.web.TypedInputStream open(String src, Iterable<org.apache.jena.riot.Lang> probeLangs, Collection<Map.Entry<org.apache.jena.riot.Lang,Throwable>> errorCollector)
      Attempts to open the given src and probe for the content type. To refer to STDIN, src may be '-' but not null.
      Parameters:
      src -
      probeLangs -
      Returns:
    • open

      public static org.apache.jena.atlas.web.TypedInputStream open(String src, Iterable<org.apache.jena.riot.Lang> probeLangs)
    • open

      public static org.apache.jena.atlas.web.TypedInputStream open(Path path, Iterable<org.apache.jena.riot.Lang> probeLangs, Collection<Map.Entry<org.apache.jena.riot.Lang,Throwable>> errorCollector)
      Open via NIO.
    • open

      public static org.apache.jena.atlas.web.TypedInputStream open(Path path, Iterable<org.apache.jena.riot.Lang> probeLangs)
    • probeForSpecificLang

      public static org.apache.jena.atlas.web.TypedInputStream probeForSpecificLang(org.apache.jena.atlas.web.TypedInputStream result, Iterable<org.apache.jena.riot.Lang> probeLangs, Collection<Map.Entry<org.apache.jena.riot.Lang,Throwable>> errorCollector)
    • parseTrigAgainstDataset

      public static org.apache.jena.query.Dataset parseTrigAgainstDataset(org.apache.jena.query.Dataset dataset, org.apache.jena.shared.PrefixMapping prefixMapping, InputStream in)
    • parseTurtleAgainstModel

      public static org.apache.jena.rdf.model.Model parseTurtleAgainstModel(org.apache.jena.rdf.model.Model model, org.apache.jena.shared.PrefixMapping prefixMapping, InputStream in)
      Parse the input stream as Turtle, prepending a serialization of the given prefix mapping. This is a workaround for Jena's RIOT framework (especially RDFParser) apparently not supporting the injection of a prefix mapping.
      Parameters:
      model -
      prefixMapping -
      in -
      Returns:
    • prependWithPrefixes

      public static InputStream prependWithPrefixes(InputStream in, org.apache.jena.shared.PrefixMapping prefixMapping)
      Convenience method to prepend prefixes to an input stream (in Turtle syntax)
      Parameters:
      in -
      prefixMapping -
      Returns:
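Prepending prefix declarations to a stream, as described for prependWithPrefixes and parseTurtleAgainstModel, can be sketched by concatenating a generated header with the original stream. This is an illustration under assumptions: a plain Map of prefix to namespace IRI stands in for Jena's PrefixMapping, the class PrependPrefixesSketch and the helper prependToString are hypothetical, and the header is hand-written Turtle rather than output from a Jena writer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class PrependPrefixesSketch {

    /** Returns a stream that first yields Turtle @prefix declarations and then the original data. */
    static InputStream prependWithPrefixes(InputStream in, Map<String, String> prefixes) {
        StringBuilder header = new StringBuilder();
        for (Map.Entry<String, String> e : prefixes.entrySet()) {
            header.append("@prefix ").append(e.getKey())
                  .append(": <").append(e.getValue()).append("> .\n");
        }
        InputStream headerStream =
                new ByteArrayInputStream(header.toString().getBytes(StandardCharsets.UTF_8));
        // SequenceInputStream reads the header to its end, then continues with the original stream.
        return new SequenceInputStream(headerStream, in);
    }

    /** Demo helper: applies the prefixes to a Turtle snippet and returns the combined text. */
    static String prependToString(String turtleBody, Map<String, String> prefixes) {
        InputStream body = new ByteArrayInputStream(turtleBody.getBytes(StandardCharsets.UTF_8));
        try (InputStream combined = prependWithPrefixes(body, prefixes)) {
            return new String(combined.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> prefixes = new LinkedHashMap<>();
        prefixes.put("ex", "http://example.org/");
        System.out.println(prependToString("ex:s ex:p ex:o .", prefixes));
    }
}
```

The combined stream can then be handed to a Turtle parser, which sees the prefix declarations before the first statement that relies on them.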
    • prependWithPrefixes

      public static InputStream prependWithPrefixes(InputStream in, org.apache.jena.shared.PrefixMapping prefixMapping, org.apache.jena.riot.RDFFormat fmt)
      Convenience method to prepend prefixes to an input stream (in a given format)
      Parameters:
      in -
      prefixMapping -
      Returns:
    • prependWithPrefixes

      public static org.apache.jena.atlas.web.TypedInputStream prependWithPrefixes(org.apache.jena.atlas.web.TypedInputStream in, org.apache.jena.shared.PrefixMapping prefixMapping)
    • newParserBuilderForReadAsGiven

      public static org.apache.jena.riot.RDFParserBuilder newParserBuilderForReadAsGiven(String baseIri)
      Return a preconfigured parser builder that retains blank node ids and relative IRIs
    • readAsGiven

      public static org.apache.jena.graph.Graph readAsGiven(org.apache.jena.graph.Graph graph, String uri)
    • readAsGiven

      public static org.apache.jena.rdf.model.Model readAsGiven(org.apache.jena.rdf.model.Model model, String uri)
    • readAsGiven

      public static org.apache.jena.sparql.core.DatasetGraph readAsGiven(org.apache.jena.sparql.core.DatasetGraph datasetGraph, String uri)
    • readAsGiven

      public static org.apache.jena.query.Dataset readAsGiven(org.apache.jena.query.Dataset dataset, String uri)
    • loadModelAsGiven

      public static org.apache.jena.rdf.model.Model loadModelAsGiven(String uri)
    • readAsGiven

      public static org.apache.jena.sparql.core.DatasetGraph readAsGiven(org.apache.jena.sparql.core.DatasetGraph datasetGraph, String uri, String baseIri)
    • readAsGiven

      public static org.apache.jena.query.Dataset readAsGiven(org.apache.jena.query.Dataset dataset, String uri, String baseIri)
    • readAsGiven

      public static org.apache.jena.sparql.core.DatasetGraph readAsGiven(org.apache.jena.sparql.core.DatasetGraph datasetGraph, InputStream in, org.apache.jena.riot.Lang lang)
    • readAsGiven

      public static org.apache.jena.query.Dataset readAsGiven(org.apache.jena.query.Dataset dataset, InputStream in, org.apache.jena.riot.Lang lang)
    • loadDatasetAsGiven

      public static org.apache.jena.query.Dataset loadDatasetAsGiven(String uri, String baseIri)
    • writeAsGiven

      public static void writeAsGiven(OutputStream out, org.apache.jena.rdf.model.Model model, org.apache.jena.riot.RDFFormat rdfFormat, String baseIri)
    • writeAsGiven

      public static void writeAsGiven(OutputStream out, org.apache.jena.query.Dataset dataset, org.apache.jena.riot.RDFFormat rdfFormat, String baseIri)
    • writeAsGiven

      public static void writeAsGiven(OutputStream out, org.apache.jena.sparql.core.DatasetGraph datasetGraph, org.apache.jena.riot.RDFFormat rdfFormat, String baseIri)
    • printParseRoundtrip

      public static org.apache.jena.query.Dataset printParseRoundtrip(org.apache.jena.query.Dataset dataset, org.apache.jena.riot.RDFFormat rdfFormat, org.apache.jena.query.Dataset result)
      Serialize a dataset in memory and return its deserialized version.
      Parameters:
      dataset - The input dataset.
      rdfFormat - The serialization format. Its lang is used for deserialization.
      result - The dataset to write to. If null then a new default dataset is created.
      Returns:
      The dataset obtained from deserialization.
    • printParseRoundtrip

      public static org.apache.jena.rdf.model.Model printParseRoundtrip(org.apache.jena.rdf.model.Model model, org.apache.jena.riot.RDFFormat rdfFormat, org.apache.jena.rdf.model.Model result)
      Serialize a model in memory and return its deserialized version.
      Parameters:
      model - The input model.
      rdfFormat - The serialization format. Its lang is used for deserialization.
      result - The model to write to. If null then a new default model is created.
      Returns:
      The model obtained from deserialization.
    • tryLoadResourceWithProperty

      public static <T extends org.apache.jena.rdf.model.RDFNode> Optional<T> tryLoadResourceWithProperty(String src, org.apache.jena.rdf.model.Property p, Class<T> clazz)
      Attempt to load a resource as an RDF model and locate the single resource with a given property.