Class QueryGenerationUtils

java.lang.Object
org.aksw.jenax.arq.util.syntax.QueryGenerationUtils

public class QueryGenerationUtils extends Object
  • Constructor Summary

    Constructors
    Constructor
    Description
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static Set<Set<org.apache.jena.sparql.core.Var>>
    analyzeDistinctVarSets(org.apache.jena.query.Query query)
    Checks whether a query uses DISTINCT and/or GROUP BY.
    static boolean
    anyHaystackHasTheNeedle(Collection<? extends Collection<?>> haystacks, Object needle)
     
    static org.apache.jena.query.Query
    constructToLateral(org.apache.jena.query.Query query, org.apache.jena.sparql.core.Quad quadVars, org.apache.jena.query.QueryType outputQueryType, boolean distinct, boolean project)
    Convert a construct query with multiple quads in the template into one with a single ?g ?s ?p ?o quad.
    static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query>
    createQueryCount(org.apache.jena.query.Query query)
    Given a query, derives a new one that counts the bindings of the original query's graph pattern.
    static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query>
    createQueryCount(org.apache.jena.query.Query query, Long itemLimit, Long rowLimit)
     
    static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query>
    createQueryCountCore(org.apache.jena.query.Query rawQuery, Long itemLimit, Long rowLimit)
     
    static org.apache.jena.query.Query
    createQueryCountCore(org.apache.jena.sparql.core.Var resultVar, org.apache.jena.query.Query rawQuery, Long itemLimit, Long rowLimit)
    Transform a SELECT query such that it yields the count of matching solution bindings within the given constraints. The rowLimit parameter only affects queries that make use of DISTINCT: it adds this limit to the graph pattern so that the pattern on which DISTINCT runs is limited to the given number of bindings.
    static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query>
    createQueryCountPartition(org.apache.jena.query.Query query, Collection<org.apache.jena.sparql.core.Var> partitionVars, Long itemLimit, Long rowLimit)
    Count the number of distinct bindings for the given variables.
    static org.apache.jena.query.Query
    createQueryQuad(org.apache.jena.sparql.core.Quad quad)
     
    static org.apache.jena.query.Query
    createQueryTriple(org.apache.jena.graph.Triple m)
     
    static boolean
    discard(org.apache.jena.query.Query query)
    Cardinality-preserving transformation for queries with group by where all group by expressions are projected: SELECT ?groupKeys ?nonGroupKeys { ...
    static org.apache.jena.query.Query
    discardUnbound(org.apache.jena.query.Query query)
    Injects a FILTER condition that discards 'empty' rows with respect to the projection.
    static org.apache.jena.query.Query
    distinct(org.apache.jena.query.Query query)
    Ensure that the query's result bindings are unique.
    static boolean
    everyHaystackHasAnyNeedle(Collection<? extends Collection<?>> haystacks, Collection<?> needles)
     
    static boolean
    everyNeedleIsInAnyHaystack(Collection<? extends Collection<?>> haystacks, Collection<?> needles)
     
    static boolean
    haystackHasAnyNeedle(Collection<?> haystack, Collection<?> needles)
     
    static boolean
    isEffectiveQueryResultStar(org.apache.jena.query.Query query)
    Test whether the query's distinguished variables are the same as those of SELECT * { ...
    static boolean
    needsWrappingByFeatures(org.apache.jena.query.Query query)
    Returns true if the query uses features that prevent it from being represented as a pair of graph pattern + projection.
    static boolean
    needsWrappingByFeatures(org.apache.jena.query.Query query, boolean includeSlice)
    Similar to #needsWrappingByFeatures(Query) but takes a flag that controls whether slice information (LIMIT / OFFSET) is considered.
    static boolean
    optimizeGroupByToDistinct(org.apache.jena.query.Query query)
    If discardNonGroupByProjection is true, then the result is NOT an equivalence transformation but a cardinality-preserving transformation: the result set size of a query transformed with this procedure equals that of the original one.
    static boolean
    optimizeGroupByToDistinct(org.apache.jena.query.Query query, boolean discardAggregators)
    SELECT ?v1 ?vN { } GROUP BY ?v1 ?vN ===> SELECT DISTINCT ?v1 ?vN { } If discardAggregators is true, the resulting query uses DISTINCT and has the same number of bindings: SELECT ?v COUNT(DISTINCT ?o) {} GROUP BY ?v ===> SELECT DISTINCT ?v {}
    static org.apache.jena.query.Query
    project(org.apache.jena.query.Query query, boolean distinct, Collection<org.apache.jena.sparql.core.Var> vars)
    Project only the given variables; the result is either the original query updated in place or a copy of it.
    static org.apache.jena.query.Query
    project(org.apache.jena.query.Query query, Collection<org.apache.jena.sparql.core.Var> resultVars)
     
    static List<org.apache.jena.sparql.syntax.ElementBind>
    quadToBinds(org.apache.jena.sparql.core.Quad quadVars, org.apache.jena.sparql.core.Quad quad, boolean isRealQuad)
    Create BIND(quad[i] AS ?quadVar[i]) statements for the given quads.
    static void
    removeNonProjectedVars(org.apache.jena.query.Query query, Collection<org.apache.jena.sparql.core.Var> allowedVars)
     
    static org.apache.jena.query.Query
    virtuosoFixForOrderedSlicing(org.apache.jena.query.Query query)
    A query transformation commonly needed for Virtuoso: using ORDER BY together with LIMIT and/or OFFSET fails for non-small limits/offsets.
    static org.apache.jena.query.Query
    wrapAsSubQuery(org.apache.jena.query.Query query)
     
    static org.apache.jena.query.Query
    wrapAsSubQuery(org.apache.jena.query.Query query, org.apache.jena.sparql.core.Var v)
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • QueryGenerationUtils

      public QueryGenerationUtils()
  • Method Details

    • virtuosoFixForOrderedSlicing

      public static org.apache.jena.query.Query virtuosoFixForOrderedSlicing(org.apache.jena.query.Query query)
      A query transformation commonly needed for Virtuoso: using ORDER BY together with LIMIT and/or OFFSET fails for non-small limits/offsets. This method rewrites the query so that a sub-query performs the ORDER BY and the outer query applies LIMIT / OFFSET:
       SELECT { } ORDER BY bar OFFSET foo
      becomes
       SELECT * { SELECT { } ORDER BY bar } OFFSET foo
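      The rewrite described above can be pictured with plain strings. The following is only a sketch under stated assumptions: the class OrderedSlicingSketch and the method wrapOrderedSlice are hypothetical names, the query is handled as text rather than as a Jena Query object, and a null offset or limit is taken to mean "not set". It is not the library's implementation.

```java
class OrderedSlicingSketch {
    /**
     * Hypothetical helper: wraps an already ordered SELECT query in an outer
     * SELECT * and attaches the slice (OFFSET / LIMIT) to the outer query only.
     * A null offset or limit means "not set" (an assumption of this sketch).
     */
    static String wrapOrderedSlice(String orderedSelect, Long offset, Long limit) {
        StringBuilder sb = new StringBuilder("SELECT * { ").append(orderedSelect).append(" }");
        if (offset != null) {
            sb.append(" OFFSET ").append(offset);
        }
        if (limit != null) {
            sb.append(" LIMIT ").append(limit);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The inner query keeps its ORDER BY; the slice moves outward.
        String inner = "SELECT ?s { ?s ?p ?o } ORDER BY ?s";
        System.out.println(wrapOrderedSlice(inner, 1000L, 50L));
    }
}
```

      The inner query is left untouched, so the ordering is computed once in the sub-query and the slice is applied to its result.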
    • createQueryQuad

      public static org.apache.jena.query.Query createQueryQuad(org.apache.jena.sparql.core.Quad quad)
    • createQueryTriple

      public static org.apache.jena.query.Query createQueryTriple(org.apache.jena.graph.Triple m)
    • wrapAsSubQuery

      public static org.apache.jena.query.Query wrapAsSubQuery(org.apache.jena.query.Query query, org.apache.jena.sparql.core.Var v)
    • wrapAsSubQuery

      public static org.apache.jena.query.Query wrapAsSubQuery(org.apache.jena.query.Query query)
    • createQueryCount

      public static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query> createQueryCount(org.apache.jena.query.Query query)
      Given a query, derives a new one that counts the bindings of the original query's graph pattern.
      Parameters:
      query -
      Returns:
    • createQueryCountPartition

      public static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query> createQueryCountPartition(org.apache.jena.query.Query query, Collection<org.apache.jena.sparql.core.Var> partitionVars, Long itemLimit, Long rowLimit)
      Count the number of distinct bindings for the given variables. If partitionVars is null, all variables are considered.
      For SELECT queries:
       SELECT COUNT(*) { SELECT DISTINCT partitionVars { original-select-query } }
      For CONSTRUCT queries:
       SELECT COUNT(*) { SELECT DISTINCT partitionVars { query pattern } }
      Parameters:
      query -
      partitionVars -
      rowLimit -
      itemLimit -
      Returns:
    • createQueryCount

      public static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query> createQueryCount(org.apache.jena.query.Query query, Long itemLimit, Long rowLimit)
    • everyNeedleIsInAnyHaystack

      public static boolean everyNeedleIsInAnyHaystack(Collection<? extends Collection<?>> haystacks, Collection<?> needles)
    • anyHaystackHasTheNeedle

      public static boolean anyHaystackHasTheNeedle(Collection<? extends Collection<?>> haystacks, Object needle)
    • everyHaystackHasAnyNeedle

      public static boolean everyHaystackHasAnyNeedle(Collection<? extends Collection<?>> haystacks, Collection<?> needles)
    • haystackHasAnyNeedle

      public static boolean haystackHasAnyNeedle(Collection<?> haystack, Collection<?> needles)
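      The four haystack/needle predicates above carry no description, so their behavior can only be read off their names. The sketch below shows one plausible reading using only java.util collections; the class name HaystackSketch is hypothetical and the semantics are an assumption, not a statement about the actual implementation.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;

class HaystackSketch {
    // True if at least one needle occurs in the single haystack.
    static boolean haystackHasAnyNeedle(Collection<?> haystack, Collection<?> needles) {
        return needles.stream().anyMatch(haystack::contains);
    }

    // True if at least one haystack contains the given needle.
    static boolean anyHaystackHasTheNeedle(Collection<? extends Collection<?>> haystacks, Object needle) {
        return haystacks.stream().anyMatch(h -> h.contains(needle));
    }

    // True if each haystack contains at least one of the needles.
    static boolean everyHaystackHasAnyNeedle(Collection<? extends Collection<?>> haystacks, Collection<?> needles) {
        return haystacks.stream().allMatch(h -> haystackHasAnyNeedle(h, needles));
    }

    // True if each needle occurs in at least one haystack.
    static boolean everyNeedleIsInAnyHaystack(Collection<? extends Collection<?>> haystacks, Collection<?> needles) {
        return needles.stream().allMatch(n -> anyHaystackHasTheNeedle(haystacks, n));
    }

    public static void main(String[] args) {
        // Variable names are plain strings here instead of Jena Var objects.
        List<Set<String>> haystacks = List.of(Set.of("?s", "?p"), Set.of("?o"));
        System.out.println(everyNeedleIsInAnyHaystack(haystacks, List.of("?s", "?o")));
        System.out.println(everyHaystackHasAnyNeedle(haystacks, List.of("?s")));
    }
}
```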
    • distinct

      public static org.apache.jena.query.Query distinct(org.apache.jena.query.Query query)
      Ensure that the query's result bindings are unique. In the simplest case this just adds DISTINCT to the projection, but it may do nothing if it is determined that the current projection already yields unique bindings. This is the case if a superset of the GROUP BY expressions is projected.
      Parameters:
      query -
      Returns:
    • project

      public static org.apache.jena.query.Query project(org.apache.jena.query.Query query, Collection<org.apache.jena.sparql.core.Var> resultVars)
    • project

      public static org.apache.jena.query.Query project(org.apache.jena.query.Query query, boolean distinct, Collection<org.apache.jena.sparql.core.Var> vars)
      Project only the given variables; the result is either the original query updated in place or a copy of it. If distinct == true, DISTINCT is not applied when the result for the underlying variables is already distinct. For example, applying 'DISTINCT ?s ?p ?c' to 'SELECT (expr1 AS ?s) (expr2 AS ?p) (FOO AS ?c) { } GROUP BY expr1 expr2' does not require wrapping the underlying query, because ?s ?p ?c is already distinct (in fact, ?s ?p alone is already distinct). The main contribution of this method is therefore taking care of the indirection via the expressions.
      Parameters:
      query -
      distinct -
      vars -
    • removeNonProjectedVars

      public static void removeNonProjectedVars(org.apache.jena.query.Query query, Collection<org.apache.jena.sparql.core.Var> allowedVars)
    • analyzeDistinctVarSets

      public static Set<Set<org.apache.jena.sparql.core.Var>> analyzeDistinctVarSets(org.apache.jena.query.Query query)
      Checks whether a query uses DISTINCT and/or GROUP BY. If so, a multimap from grouping expression to projection variables is constructed, and for each expression its set of related variables is returned. For example, e AS ?x, e AS ?y yields {{?x ?y}}.
      Parameters:
      query -
      Returns:
      The set of distinct var combinations; empty if there are none
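      The multimap construction described above can be sketched with plain maps and strings. Everything here is an assumption made for illustration: the class DistinctVarSetsSketch and its method are hypothetical, expressions and variables are represented as strings, and the treatment of grouping expressions that are not projected (they yield an empty set) is a choice of this sketch that the description does not specify.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DistinctVarSetsSketch {
    /**
     * projection maps each projected variable to the expression it is bound to;
     * groupExprs lists the grouping expressions. Returns, per grouping
     * expression, the set of variables that project it.
     */
    static Set<Set<String>> distinctVarSets(Map<String, String> projection, Collection<String> groupExprs) {
        // Multimap: grouping expression -> variables projecting that expression
        Map<String, Set<String>> exprToVars = new LinkedHashMap<>();
        for (String expr : groupExprs) {
            exprToVars.put(expr, new LinkedHashSet<>());
        }
        for (Map.Entry<String, String> entry : projection.entrySet()) {
            Set<String> vars = exprToVars.get(entry.getValue());
            if (vars != null) {
                vars.add(entry.getKey());
            }
        }
        return new LinkedHashSet<>(exprToVars.values());
    }

    public static void main(String[] args) {
        // SELECT (e AS ?x) (e AS ?y) { ... } GROUP BY e
        Map<String, String> projection = new LinkedHashMap<>();
        projection.put("?x", "e");
        projection.put("?y", "e");
        System.out.println(distinctVarSets(projection, List.of("e")));
    }
}
```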
    • optimizeGroupByToDistinct

      public static boolean optimizeGroupByToDistinct(org.apache.jena.query.Query query)
      If discardNonGroupByProjection is true, then the result is NOT an equivalence transformation but a cardinality-preserving transformation: the result set size of a query transformed with this procedure equals that of the original one. Converts GROUP BY to DISTINCT if all GROUP BY expressions appear in the projection. The projection of any non-group-by expressions is removed.
      - If _exactly_ the group keys are projected, the query can be turned into a DISTINCT query.
      - Removes superfluous DISTINCT, e.g. in SELECT DISTINCT ?groupKey ?derived WHERE { ... } GROUP BY ?groupKey: if a superset of the group keys is projected, then DISTINCT is not needed (each binding is uniquely identified by the group keys).
      Parameters:
      query -
    • discard

      public static boolean discard(org.apache.jena.query.Query query)
      Cardinality-preserving transformation for queries with group by where all group by expressions are projected: SELECT ?groupKeys ?nonGroupKeys { ... } GROUP BY ?groupKeys becomes SELECT DISTINCT ?groupKeys { ... }
      Parameters:
      query -
      Returns:
    • optimizeGroupByToDistinct

      public static boolean optimizeGroupByToDistinct(org.apache.jena.query.Query query, boolean discardAggregators)
      SELECT ?v1 ?vN { } GROUP BY ?v1 ?vN ===> SELECT DISTINCT ?v1 ?vN { }
      If discardAggregators is true, the resulting query uses DISTINCT and has the same number of bindings: SELECT ?v COUNT(DISTINCT ?o) {} GROUP BY ?v ===> SELECT DISTINCT ?v {}
      Parameters:
      query -
      discardAggregators - Discard all aggregators if applicable. The result is a query with the same number of bindings.
      Returns:
      true iff the query was modified
    • isEffectiveQueryResultStar

      public static boolean isEffectiveQueryResultStar(org.apache.jena.query.Query query)
      Test whether the query's distinguished variables are the same as those of SELECT * { ... }. Example: SELECT * { ?s a ?t } is effectively the same as SELECT ?t ?s { ?s a ?t }. Variable order does not matter.
      Returns:
    • createQueryCountCore

      public static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query> createQueryCountCore(org.apache.jena.query.Query rawQuery, Long itemLimit, Long rowLimit)
    • needsWrappingByFeatures

      public static boolean needsWrappingByFeatures(org.apache.jena.query.Query query)
      Returns true if the query uses features that prevent it from being represented as a pair of graph pattern + projection.
      Parameters:
      query -
      Returns:
    • needsWrappingByFeatures

      public static boolean needsWrappingByFeatures(org.apache.jena.query.Query query, boolean includeSlice)
      Similar to #needsWrappingByFeatures(Query) but takes a flag that controls whether slice information (LIMIT / OFFSET) is considered. Does not consider the DISTINCT flag!
      Parameters:
      query -
      Returns:
    • createQueryCountCore

      public static org.apache.jena.query.Query createQueryCountCore(org.apache.jena.sparql.core.Var resultVar, org.apache.jena.query.Query rawQuery, Long itemLimit, Long rowLimit)
      Transform a SELECT query such that it yields the count of matching solution bindings within the given constraints. The rowLimit parameter only affects queries that make use of DISTINCT: it adds this limit to the graph pattern so that the pattern on which DISTINCT runs is limited to the given number of bindings. It does not make sense to use rowLimit for queries with GROUP BY because that would alter the result.
       SELECT COUNT(*) {
         SELECT DISTINCT originalProjection {
           SELECT * { originalGraphPattern } LIMIT rowLimit
         } OFFSET originalOffset LIMIT min(originalLimit, itemLimit)
       }
      Parameters:
      resultVar - The output variable (COUNT(...) AS ?resultVar)
      rawQuery -
      itemLimit - Number of bindings to consider returned by the given select query
      rowLimit - Number of rows to consider within the query's graph pattern
      Returns:
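      The query shape and the min(originalLimit, itemLimit) rule above can be illustrated with a string-based sketch. The names CountLimitSketch, effectiveLimit and countQuery are hypothetical, and treating a null Long as "no limit" is an assumption suggested by the boxed parameter types; the real method works on Jena Query objects and may differ in detail.

```java
class CountLimitSketch {
    /** Smaller of two optional limits; null stands for "no limit" (assumption). */
    static Long effectiveLimit(Long originalLimit, Long itemLimit) {
        if (originalLimit == null) {
            return itemLimit;
        }
        if (itemLimit == null) {
            return originalLimit;
        }
        return Math.min(originalLimit, itemLimit);
    }

    /** Assembles the nested count query from textual fragments of a DISTINCT query. */
    static String countQuery(String resultVar, String projection, String pattern,
                             Long offset, Long originalLimit, Long itemLimit, Long rowLimit) {
        StringBuilder sb = new StringBuilder();
        sb.append("SELECT (COUNT(*) AS ").append(resultVar).append(") { ");
        sb.append("SELECT DISTINCT ").append(projection).append(" { ");
        // rowLimit restricts the pattern on which DISTINCT runs
        sb.append("SELECT * { ").append(pattern).append(" }");
        if (rowLimit != null) {
            sb.append(" LIMIT ").append(rowLimit);
        }
        sb.append(" }");
        // The original slice stays on the DISTINCT query, capped by itemLimit
        if (offset != null) {
            sb.append(" OFFSET ").append(offset);
        }
        Long limit = effectiveLimit(originalLimit, itemLimit);
        if (limit != null) {
            sb.append(" LIMIT ").append(limit);
        }
        sb.append(" }");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(countQuery("?c", "?s", "?s ?p ?o", null, 100L, 10L, 1000L));
    }
}
```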
    • constructToLateral

      public static org.apache.jena.query.Query constructToLateral(org.apache.jena.query.Query query, org.apache.jena.sparql.core.Quad quadVars, org.apache.jena.query.QueryType outputQueryType, boolean distinct, boolean project)
      Convert a construct query with multiple quads in the template into one with a single ?g ?s ?p ?o quad. This operation is NOT idempotent.
       CONSTRUCT {
         ?a ?b ?c .
         ?d ?e ?f .
       } WHERE {
         pattern
       }
       
      becomes
       CONSTRUCT {
         ?s ?p ?o
       }
       WHERE {
         SELECT DISTINCT ?s ?p ?o { # sub query is omitted if project and distinct are both false / DISTINCT is only added if distinct == true
           pattern
           LATERAL {
             { BIND(?a AS ?s) BIND(?b AS ?p) BIND(?c AS ?o) }
             UNION { BIND(?d AS ?s) BIND(?e AS ?p) BIND(?f AS ?o) }
           }
         }
       }
       
      Parameters:
      query - The input query to transform (remains unchanged)
      quadVars - The variables to use for projecting the (g s p o) components
      outputQueryType - Whether the resulting query is a construct query or a select one. SELECT implies project=true.
      project - Whether to create a sub query that only projects (g s p o)
      distinct - Whether to apply distinct (implies project=true)
      Returns:
    • quadToBinds

      public static List<org.apache.jena.sparql.syntax.ElementBind> quadToBinds(org.apache.jena.sparql.core.Quad quadVars, org.apache.jena.sparql.core.Quad quad, boolean isRealQuad)
      Create BIND(quad[i] AS ?quadVar[i]) statements for the given quads. Used with #constructToLateral(Query).
      Parameters:
      quadVars - A quad with components for use on the rhs of a bind which may thus only use variables
      quad - A quad with components for use on the lhs of a bind
      isRealQuad - Whether to create a BIND for the graph component.
      Returns:
      A list of ElementBind instances.
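      How the BIND statements feed into the LATERAL / UNION block of constructToLateral can be sketched with strings. This is an illustration only: QuadBindSketch and lateralUnion are hypothetical names, quads are modeled as String arrays in (g, s, p, o) order instead of Jena Quad objects, and it is assumed that isRealQuad == false means the graph component is skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class QuadBindSketch {
    /** One BIND(term AS ?var) per component; the graph slot is index 0. */
    static List<String> quadToBinds(String[] quadVars, String[] quad, boolean isRealQuad) {
        List<String> binds = new ArrayList<>();
        int start = isRealQuad ? 0 : 1; // assumption: skip the graph component for triples
        for (int i = start; i < 4; i++) {
            binds.add("BIND(" + quad[i] + " AS " + quadVars[i] + ")");
        }
        return binds;
    }

    /** Joins the BIND groups of all template quads into one LATERAL { ... UNION ... } block. */
    static String lateralUnion(String[] quadVars, List<String[]> quads, boolean isRealQuad) {
        return quads.stream()
                .map(q -> "{ " + String.join(" ", quadToBinds(quadVars, q, isRealQuad)) + " }")
                .collect(Collectors.joining(" UNION ", "LATERAL { ", " }"));
    }

    public static void main(String[] args) {
        String[] quadVars = {"?g", "?s", "?p", "?o"};
        // Template triples ?a ?b ?c and ?d ?e ?f; the graph slot is unused here.
        List<String[]> template = List.of(
                new String[] {null, "?a", "?b", "?c"},
                new String[] {null, "?d", "?e", "?f"});
        System.out.println(lateralUnion(quadVars, template, false));
    }
}
```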
    • discardUnbound

      public static org.apache.jena.query.Query discardUnbound(org.apache.jena.query.Query query)
      Injects a FILTER condition that discards 'empty' rows w.r.t. the projection. Creates a copy of the original query. Wraps the inner query with
       SELECT ?x1 ... ?xn {
         { ORIGINAL Query }
         FILTER(BOUND(?x1) || ... || BOUND(?xn)) # At least one variable must be bound
       }
      Parameters:
      query -
      Returns:
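      The wrapping described for discardUnbound can be sketched as string assembly. The class DiscardUnboundSketch and its method signature are hypothetical, and the query is treated as text rather than as a Jena Query object; the sketch only illustrates the shape of the generated FILTER.

```java
import java.util.List;
import java.util.stream.Collectors;

class DiscardUnboundSketch {
    /** Wraps the original query and keeps only rows with at least one bound projected variable. */
    static String discardUnbound(List<String> projectedVars, String originalQuery) {
        String projection = String.join(" ", projectedVars);
        // BOUND(?x1) || ... || BOUND(?xn)
        String condition = projectedVars.stream()
                .map(v -> "BOUND(" + v + ")")
                .collect(Collectors.joining(" || "));
        return "SELECT " + projection + " { { " + originalQuery + " } FILTER(" + condition + ") }";
    }

    public static void main(String[] args) {
        String original = "SELECT ?x ?y { OPTIONAL { ?x a ?y } }";
        System.out.println(discardUnbound(List.of("?x", "?y"), original));
    }
}
```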