Package org.aksw.jenax.arq.util.syntax
Class QueryGenerationUtils
java.lang.Object
org.aksw.jenax.arq.util.syntax.QueryGenerationUtils
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
static Set<Set<org.apache.jena.sparql.core.Var>> analyzeDistinctVarSets(org.apache.jena.query.Query query)
    Checks whether a query uses DISTINCT and/or GROUP BY.
static boolean anyHaystackHasTheNeedle(Collection<? extends Collection<?>> haystacks, Object needle)
static org.apache.jena.query.Query constructToLateral(org.apache.jena.query.Query query, org.apache.jena.sparql.core.Quad quadVars, org.apache.jena.query.QueryType outputQueryType, boolean distinct, boolean project)
    Convert a construct query with multiple quads in the template into one with a single ?g ?s ?p ?o quad.
static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query> createQueryCount(org.apache.jena.query.Query query)
    Given a query, derives a new one that counts the bindings of the original one's graph pattern.
static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query> createQueryCount(org.apache.jena.query.Query query, Long itemLimit, Long rowLimit)
static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query> createQueryCountCore(org.apache.jena.query.Query rawQuery, Long itemLimit, Long rowLimit)
static org.apache.jena.query.Query createQueryCountCore(org.apache.jena.sparql.core.Var resultVar, org.apache.jena.query.Query rawQuery, Long itemLimit, Long rowLimit)
    Transform a SELECT query such that it yields the count of matching solution bindings within the given constraints. The rowLimit parameter only affects queries that make use of DISTINCT: it adds this limit to the graph pattern such that the query pattern on which the distinct runs is limited to the given number of bindings.
static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query> createQueryCountPartition(org.apache.jena.query.Query query, Collection<org.apache.jena.sparql.core.Var> partitionVars, Long itemLimit, Long rowLimit)
    Count the number of distinct bindings for the given variables.
static org.apache.jena.query.Query createQueryQuad(org.apache.jena.sparql.core.Quad quad)
static org.apache.jena.query.Query createQueryTriple(org.apache.jena.graph.Triple m)
static boolean discard(org.apache.jena.query.Query query)
    Cardinality-preserving transformation for queries with group by where all group by expressions are projected: SELECT ?groupKeys ?nonGroupKeys { ...
static org.apache.jena.query.Query discardUnbound(org.apache.jena.query.Query query)
    Injects a FILTER condition that discards 'empty' rows w.r.t. the projection.
static org.apache.jena.query.Query distinct(org.apache.jena.query.Query query)
    Ensure that the query's result bindings are unique.
static boolean everyHaystackHasAnyNeedle(Collection<? extends Collection<?>> haystacks, Collection<?> needles)
static boolean everyNeedleIsInAnyHaystack(Collection<? extends Collection<?>> haystacks, Collection<?> needles)
static boolean haystackHasAnyNeedle(Collection<?> haystack, Collection<?> needles)
static boolean isEffectiveQueryResultStar(org.apache.jena.query.Query query)
    Test whether the query's distinguished variables are the same as those of SELECT * { ...
static boolean needsWrappingByFeatures(org.apache.jena.query.Query query)
    Returns true if the query uses features that prevent it from being represented as a pair of graph pattern + projection.
static boolean needsWrappingByFeatures(org.apache.jena.query.Query query, boolean includeSlice)
    Similar to #needsWrapping(Query) but includes a flag whether to include slice information (limit / offset).
static boolean optimizeGroupByToDistinct(org.apache.jena.query.Query query)
    If discardNonGroupByProjection is true then the result is NOT an equivalence transformation but a cardinality-preserving transformation: the result set size of a query transformed with this procedure is equal to that of the original one.
static boolean optimizeGroupByToDistinct(org.apache.jena.query.Query query, boolean discardAggregators)
    SELECT ?v1 ?vn { } GROUP BY ?v1 ?vn ===> SELECT DISTINCT ?v1 ?vn { } If discardAggregators is true then the resulting query becomes one with the same number of bindings using distinct: SELECT ?v COUNT(DISTINCT ?o) {} GROUP BY ?v ===> SELECT DISTINCT ?v {}
static org.apache.jena.query.Query project(org.apache.jena.query.Query query, boolean distinct, Collection<org.apache.jena.sparql.core.Var> vars)
    Return the original query or a copy of it depending on ... Update a query in place to project only the given variables.
static org.apache.jena.query.Query project(org.apache.jena.query.Query query, Collection<org.apache.jena.sparql.core.Var> resultVars)
static List<org.apache.jena.sparql.syntax.ElementBind> quadToBinds(org.apache.jena.sparql.core.Quad quadVars, org.apache.jena.sparql.core.Quad quad, boolean isRealQuad)
    Create BIND(quad[i] AS ?quadVar[i]) statements for the given quads.
static void removeNonProjectedVars(org.apache.jena.query.Query query, Collection<org.apache.jena.sparql.core.Var> allowedVars)
static org.apache.jena.query.Query virtuosoFixForOrderedSlicing(org.apache.jena.query.Query query)
    A commonly needed query transformation for Virtuoso: using ORDER BY with LIMIT and/or OFFSET fails for non-small limits/offsets.
static org.apache.jena.query.Query wrapAsSubQuery(org.apache.jena.query.Query query)
static org.apache.jena.query.Query wrapAsSubQuery(org.apache.jena.query.Query query, org.apache.jena.sparql.core.Var v)
-
Constructor Details
-
QueryGenerationUtils
public QueryGenerationUtils()
-
-
Method Details
-
virtuosoFixForOrderedSlicing
public static org.apache.jena.query.Query virtuosoFixForOrderedSlicing(org.apache.jena.query.Query query)
A commonly needed query transformation for Virtuoso: using ORDER BY with LIMIT and/or OFFSET fails for non-small limits/offsets. The solution implemented by this method is to rewrite the query such that a sub-query does the ORDER BY and an outer query applies LIMIT / OFFSET:
SELECT { } ORDER BY bar OFFSET foo
becomes
SELECT * { SELECT { } ORDER BY bar } OFFSET foo
-
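The rewrite described for virtuosoFixForOrderedSlicing can be illustrated on plain query strings. This is only a sketch and not the library's implementation, which operates on Jena Query objects. The class OrderedSliceSketch and its method are hypothetical names, and the input is assumed to be a SELECT query string that already carries its ORDER BY clause but no LIMIT/OFFSET.

```java
final class OrderedSliceSketch {
    /**
     * Wraps an ordered SELECT query (given as text, without LIMIT/OFFSET)
     * in an outer SELECT * that applies the slice.
     * A null offset or limit means "not set".
     */
    static String wrapOrderedSlice(String orderedSelect, Long offset, Long limit) {
        StringBuilder sb = new StringBuilder();
        // The inner query keeps the ORDER BY; the outer query only slices.
        sb.append("SELECT * { ").append(orderedSelect.trim()).append(" }");
        if (offset != null) {
            sb.append(" OFFSET ").append(offset);
        }
        if (limit != null) {
            sb.append(" LIMIT ").append(limit);
        }
        return sb.toString();
    }
}
```

Calling wrapOrderedSlice with an offset of 1000 and a limit of 10 leaves the sort inside the sub-query and appends both slice clauses to the outer query.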
createQueryQuad
public static org.apache.jena.query.Query createQueryQuad(org.apache.jena.sparql.core.Quad quad) -
createQueryTriple
public static org.apache.jena.query.Query createQueryTriple(org.apache.jena.graph.Triple m) -
wrapAsSubQuery
public static org.apache.jena.query.Query wrapAsSubQuery(org.apache.jena.query.Query query, org.apache.jena.sparql.core.Var v) -
wrapAsSubQuery
public static org.apache.jena.query.Query wrapAsSubQuery(org.apache.jena.query.Query query) -
createQueryCount
public static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query> createQueryCount(org.apache.jena.query.Query query)
Given a query, derives a new one that counts the bindings of the original one's graph pattern.
Parameters:
query -
Returns:
-
createQueryCountPartition
public static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query> createQueryCountPartition(org.apache.jena.query.Query query, Collection<org.apache.jena.sparql.core.Var> partitionVars, Long itemLimit, Long rowLimit)
Count the number of distinct bindings for the given variables. If partitionVars is null, all variables are considered.
For SELECT queries: SELECT COUNT(*) { SELECT DISTINCT partitionVars { original-select-query } }
For CONSTRUCT queries: SELECT COUNT(*) { SELECT DISTINCT partitionVars { query pattern } }
Parameters:
query -
partitionVars -
rowLimit -
itemLimit -
Returns:
-
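The SELECT template given for createQueryCountPartition can be sketched with plain strings. This is an illustration only, not the library's code: the class CountPartitionSketch, the method name, and the countVar parameter are hypothetical, variables are passed as names without the leading '?', and the itemLimit/rowLimit handling is left out.

```java
import java.util.List;
import java.util.stream.Collectors;

final class CountPartitionSketch {
    /**
     * Builds SELECT (COUNT(*) AS ?countVar) { SELECT DISTINCT vars { query } }.
     * A null partitionVars list is taken to mean "all variables" (DISTINCT *).
     * An empty list is not handled by this sketch.
     */
    static String countDistinct(String selectQuery, List<String> partitionVars, String countVar) {
        String projection = partitionVars == null
            ? "*"
            : partitionVars.stream().map(v -> "?" + v).collect(Collectors.joining(" "));
        return "SELECT (COUNT(*) AS ?" + countVar + ") { "
            + "SELECT DISTINCT " + projection + " { " + selectQuery.trim() + " } }";
    }
}
```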
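createQueryCount
public static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query> createQueryCount(org.apache.jena.query.Query query, Long itemLimit, Long rowLimit) -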
everyNeedleIsInAnyHaystack
public static boolean everyNeedleIsInAnyHaystack(Collection<? extends Collection<?>> haystacks, Collection<?> needles) -
anyHaystackHasTheNeedle
public static boolean anyHaystackHasTheNeedle(Collection<? extends Collection<?>> haystacks, Object needle) -
everyHaystackHasAnyNeedle
public static boolean everyHaystackHasAnyNeedle(Collection<? extends Collection<?>> haystacks, Collection<?> needles) -
haystackHasAnyNeedle
public static boolean haystackHasAnyNeedle(Collection<?> haystack, Collection<?> needles) -
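The four needle/haystack helpers are listed without descriptions. The sketch below shows one plausible reading of their names using only java.util collections. The class HaystackSketch is hypothetical, and the semantics are an assumption inferred from the method names rather than something the documentation confirms.

```java
import java.util.Collection;

final class HaystackSketch {
    /** True if at least one needle occurs in the haystack. */
    static boolean haystackHasAnyNeedle(Collection<?> haystack, Collection<?> needles) {
        return needles.stream().anyMatch(haystack::contains);
    }

    /** True if at least one haystack contains the needle. */
    static boolean anyHaystackHasTheNeedle(
            Collection<? extends Collection<?>> haystacks, Object needle) {
        return haystacks.stream().anyMatch(h -> h.contains(needle));
    }

    /** True if every needle occurs in at least one haystack. */
    static boolean everyNeedleIsInAnyHaystack(
            Collection<? extends Collection<?>> haystacks, Collection<?> needles) {
        return needles.stream().allMatch(n -> anyHaystackHasTheNeedle(haystacks, n));
    }

    /** True if every haystack contains at least one of the needles. */
    static boolean everyHaystackHasAnyNeedle(
            Collection<? extends Collection<?>> haystacks, Collection<?> needles) {
        return haystacks.stream().allMatch(h -> haystackHasAnyNeedle(h, needles));
    }
}
```

Under this reading, the haystacks would typically be sets of variables (for example, the distinct variable sets of a query) and the needles the variables of a requested projection.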
distinct
public static org.apache.jena.query.Query distinct(org.apache.jena.query.Query query)
Ensure that the query's result bindings are unique. In the simplest case this just adds DISTINCT to the projection, but it may do nothing if it is determined that the current projection already yields unique bindings. This is the case if a superset of the GROUP BY expressions is projected.
Parameters:
query -
Returns:
-
project
public static org.apache.jena.query.Query project(org.apache.jena.query.Query query, Collection<org.apache.jena.sparql.core.Var> resultVars) -
project
public static org.apache.jena.query.Query project(org.apache.jena.query.Query query, boolean distinct, Collection<org.apache.jena.sparql.core.Var> vars)
Return the original query or a copy of it depending on ... Update a query in place to project only the given variables. If distinct == true: DISTINCT is not applied if the result for the underlying variables is already distinct. For example, applying 'DISTINCT ?s ?p ?c' to 'SELECT (expr1 AS ?s) (expr2 AS ?p) (FOO AS ?c) { } GROUP BY expr1 expr2' does not have to wrap the underlying query, as ?s ?p ?c is already distinct (in fact ?s ?p is already distinct). So the main contribution of this method is taking care of the indirection via the expressions.
Parameters:
clone -
distinct -
vars -
-
removeNonProjectedVars
public static void removeNonProjectedVars(org.apache.jena.query.Query query, Collection<org.apache.jena.sparql.core.Var> allowedVars) -
analyzeDistinctVarSets
public static Set<Set<org.apache.jena.sparql.core.Var>> analyzeDistinctVarSets(org.apache.jena.query.Query query)
Checks whether a query uses DISTINCT and/or GROUP BY. If so, a multimap from grouping expression to projection variables is constructed, and for each expression the set of its related variables is returned. For example, e AS ?x, e AS ?y will yield {{?x ?y}}.
Parameters:
query -
Returns:
- The set of distinct var combinations; empty if there are none
-
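One possible reading of the analyzeDistinctVarSets description can be sketched with plain collections: group the projected variables by the grouping expression they alias. This is an assumption about the intended behaviour, not the library's implementation. The class DistinctVarSetsSketch, the use of strings for expressions and variables, and the choice to skip grouping expressions that are not projected are all hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class DistinctVarSetsSketch {
    /**
     * @param groupExprs grouping expressions as text, e.g. "e"
     * @param projection maps a projected variable name to the expression it aliases,
     *                   e.g. "x" -> "e" for (e AS ?x)
     * @return one variable set per grouping expression that is projected at least once
     */
    static Set<Set<String>> varSetsByGroupExpr(List<String> groupExprs,
                                               Map<String, String> projection) {
        Set<Set<String>> result = new LinkedHashSet<>();
        for (String expr : groupExprs) {
            Set<String> vars = new TreeSet<>();
            for (Map.Entry<String, String> e : projection.entrySet()) {
                if (e.getValue().equals(expr)) {
                    vars.add(e.getKey());
                }
            }
            // Assumption of this sketch: unprojected grouping expressions are skipped.
            if (!vars.isEmpty()) {
                result.add(vars);
            }
        }
        return result;
    }
}
```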
optimizeGroupByToDistinct
public static boolean optimizeGroupByToDistinct(org.apache.jena.query.Query query)
If discardNonGroupByProjection is true then the result is NOT an equivalence transformation but a cardinality-preserving transformation: the result set size of a query transformed with this procedure is equal to that of the original one. Converts GROUP BY to DISTINCT if all group by expressions appear in the projection. The projection of any non group-by expressions is removed.
- If _exactly_ the group keys are projected, the query can be turned to DISTINCT
- Removes superfluous DISTINCT, i.e. SELECT DISTINCT ?groupKey ?derived WHERE { ... } GROUP BY ?groupKey: if a superset of the group keys is projected then DISTINCT is not needed (as each binding is uniquely identified by the group keys)
Parameters:
query -
-
discard
public static boolean discard(org.apache.jena.query.Query query)
Cardinality-preserving transformation for queries with group by where all group by expressions are projected:
SELECT ?groupKeys ?nonGroupKeys { ... } GROUP BY ?groupKeys
becomes
SELECT DISTINCT ?groupKeys { ... }
Parameters:
query -
Returns:
-
optimizeGroupByToDistinct
public static boolean optimizeGroupByToDistinct(org.apache.jena.query.Query query, boolean discardAggregators)
SELECT ?v1 ?vn { } GROUP BY ?v1 ?vn ===> SELECT DISTINCT ?v1 ?vn { }
If discardAggregators is true then the resulting query becomes one with the same number of bindings using distinct:
SELECT ?v COUNT(DISTINCT ?o) {} GROUP BY ?v ===> SELECT DISTINCT ?v {}
Parameters:
query -
discardAggregators - Discard all aggregators if applicable. The result is a query with the same number of bindings.
Returns:
- true iff the query was modified
-
isEffectiveQueryResultStar
public static boolean isEffectiveQueryResultStar(org.apache.jena.query.Query query)
Test whether the query's distinguished variables are the same as those of SELECT * { ... }. Example: SELECT * { ?s a ?t } is effectively the same as SELECT ?t ?s { ?s a ?t }. Variable order does not matter.
Returns:
-
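The order-insensitive comparison described for isEffectiveQueryResultStar amounts to a set comparison. The sketch below shows this on variable names only; the class ResultStarSketch and its parameters are hypothetical, and how the library determines the variables that SELECT * would expose is not covered here.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

final class ResultStarSketch {
    /**
     * @param projectedVars variables listed in the SELECT clause, in any order
     * @param patternVars   variables that SELECT * would expose for the same pattern
     */
    static boolean isEffectivelyStar(List<String> projectedVars, Collection<String> patternVars) {
        // Converting both sides to sets makes the comparison independent of order.
        return new HashSet<>(projectedVars).equals(new HashSet<>(patternVars));
    }
}
```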
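createQueryCountCore
public static Map.Entry<org.apache.jena.sparql.core.Var,org.apache.jena.query.Query> createQueryCountCore(org.apache.jena.query.Query rawQuery, Long itemLimit, Long rowLimit) -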
needsWrappingByFeatures
public static boolean needsWrappingByFeatures(org.apache.jena.query.Query query)
Returns true if the query uses features that prevent it from being represented as a pair of graph pattern + projection.
Parameters:
query -
Returns:
-
needsWrappingByFeatures
public static boolean needsWrappingByFeatures(org.apache.jena.query.Query query, boolean includeSlice)
Similar to #needsWrapping(Query) but includes a flag whether to include slice information (limit / offset). Does not consider the DISTINCT flag!
Parameters:
query -
Returns:
-
createQueryCountCore
public static org.apache.jena.query.Query createQueryCountCore(org.apache.jena.sparql.core.Var resultVar, org.apache.jena.query.Query rawQuery, Long itemLimit, Long rowLimit)
Transform a SELECT query such that it yields the count of matching solution bindings within the given constraints. The rowLimit parameter only affects queries that make use of DISTINCT: it adds this limit to the graph pattern such that the query pattern on which the distinct runs is limited to the given number of bindings. It does not make sense to use it for queries with GROUP BY because that would alter the result.
SELECT COUNT(*) { SELECT DISTINCT originalProjection { SELECT * { originalGraphPattern } LIMIT rowLimit } OFFSET originalOffset LIMIT min(originalLimit, itemLimit) }
Parameters:
resultVar - The output variable (COUNT(...) AS ?resultVar)
rawQuery -
itemLimit - Number of bindings to consider returned by the given select query
rowLimit - Number of rows to consider within the query's graph pattern
Returns:
-
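The template given for createQueryCountCore combines two limits: rowLimit on the innermost pattern and min(originalLimit, itemLimit) on the DISTINCT sub-query. The following string-based sketch shows how these pieces fit together when any of the limits may be absent. It is not the library's implementation; the class CountCoreSketch, its method names, and the convention that null means "not set" are assumptions made for illustration.

```java
final class CountCoreSketch {
    /** Minimum of two optional limits; null means "no limit". */
    static Long minLimit(Long a, Long b) {
        if (a == null) return b;
        if (b == null) return a;
        return Math.min(a, b);
    }

    /** Assembles the count query for a DISTINCT projection over a graph pattern. */
    static String countQuery(String resultVar, String projection, String pattern,
                             Long originalOffset, Long originalLimit,
                             Long itemLimit, Long rowLimit) {
        StringBuilder sb = new StringBuilder();
        sb.append("SELECT (COUNT(*) AS ?").append(resultVar).append(") { ");
        sb.append("SELECT DISTINCT ").append(projection).append(" { ");
        // rowLimit restricts the bindings the DISTINCT operates on.
        sb.append("SELECT * { ").append(pattern).append(" }");
        if (rowLimit != null) sb.append(" LIMIT ").append(rowLimit);
        sb.append(" }");
        if (originalOffset != null) sb.append(" OFFSET ").append(originalOffset);
        // itemLimit caps the number of (distinct) bindings that are counted.
        Long limit = minLimit(originalLimit, itemLimit);
        if (limit != null) sb.append(" LIMIT ").append(limit);
        sb.append(" }");
        return sb.toString();
    }
}
```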
constructToLateral
public static org.apache.jena.query.Query constructToLateral(org.apache.jena.query.Query query, org.apache.jena.sparql.core.Quad quadVars, org.apache.jena.query.QueryType outputQueryType, boolean distinct, boolean project)
Convert a construct query with multiple quads in the template into one with a single ?g ?s ?p ?o quad. This operation is NOT idempotent.
CONSTRUCT { ?a ?b ?c . ?d ?e ?f . } WHERE { pattern }
becomes
CONSTRUCT { ?s ?p ?o } WHERE {
  SELECT DISTINCT ?s ?p ?o { # sub query is omitted if project and distinct are both false / DISTINCT is only added if distinct == true
    pattern
    LATERAL {
        { BIND(?a AS ?s) BIND(?b AS ?p) BIND(?c AS ?o) }
      UNION
        { BIND(?d AS ?s) BIND(?e AS ?p) BIND(?f AS ?o) }
    }
  }
}
Parameters:
query - The input query to transform (remains unchanged)
quadVars - The variables to use for projecting the (g s p o) components
outputQueryType - Whether the resulting query is a construct query or a select one. SELECT implies project=true.
project - Whether to create a sub query that only projects (g s p o)
distinct - Whether to apply distinct (implies project=true)
Returns:
-
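The shape of the constructToLateral output can be reproduced with a small string-based sketch, restricted to triples (no graph component) and to CONSTRUCT output. It is not the library's implementation: the class ConstructToLateralSketch and its methods are hypothetical, template triples are given as three-element string arrays, and the target variables are fixed to ?s ?p ?o.

```java
import java.util.List;
import java.util.stream.Collectors;

final class ConstructToLateralSketch {
    /** One UNION member: binds the terms of a template triple to ?s ?p ?o. */
    static String bindGroup(String[] triple) {
        return "{ BIND(" + triple[0] + " AS ?s) BIND(" + triple[1]
            + " AS ?p) BIND(" + triple[2] + " AS ?o) }";
    }

    static String toLateral(List<String[]> template, String pattern,
                            boolean distinct, boolean project) {
        String union = template.stream()
            .map(ConstructToLateralSketch::bindGroup)
            .collect(Collectors.joining(" UNION "));
        String body = pattern + " LATERAL { " + union + " }";
        // The sub-query is only needed when projecting; distinct implies project.
        if (distinct || project) {
            body = "SELECT " + (distinct ? "DISTINCT " : "") + "?s ?p ?o { " + body + " }";
        }
        return "CONSTRUCT { ?s ?p ?o } WHERE { " + body + " }";
    }
}
```

Each template triple contributes one UNION member, so every solution of the pattern yields one ?s ?p ?o row per template triple.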
quadToBinds
public static List<org.apache.jena.sparql.syntax.ElementBind> quadToBinds(org.apache.jena.sparql.core.Quad quadVars, org.apache.jena.sparql.core.Quad quad, boolean isRealQuad)
Create BIND(quad[i] AS ?quadVar[i]) statements for the given quads. Used with #constructToLateral(Query).
Parameters:
quadVars - A quad with components for use on the rhs of a bind, which may thus only use variables
quad - A quad with components for use on the lhs of a bind
isRealQuad - Whether to create a BIND for the graph component.
Returns:
- A list of ElementBind objects.
-
discardUnbound
public static org.apache.jena.query.Query discardUnbound(org.apache.jena.query.Query query)
Injects a FILTER condition that discards 'empty' rows w.r.t. the projection. Creates a copy of the original query. Wraps the inner query with
SELECT ?x1 ... ?xn {
  { ORIGINAL Query }
  FILTER(BOUND(?x1) || ... || BOUND(?xn)) # At least one variable must be bound
}
Parameters:
query -
Returns:
-
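The wrapper described for discardUnbound can be sketched on query strings as follows. This is an illustration under stated assumptions, not the library's code: the class DiscardUnboundSketch is hypothetical, the projected variables are passed explicitly as names without '?', and the list is assumed to be non-empty.

```java
import java.util.List;
import java.util.stream.Collectors;

final class DiscardUnboundSketch {
    /** Wraps a SELECT query so that rows with no bound projected variable are dropped. */
    static String discardUnbound(String selectQuery, List<String> vars) {
        String projection = vars.stream()
            .map(v -> "?" + v)
            .collect(Collectors.joining(" "));
        // At least one projected variable must be bound for a row to survive.
        String filter = vars.stream()
            .map(v -> "BOUND(?" + v + ")")
            .collect(Collectors.joining(" || "));
        return "SELECT " + projection + " { { " + selectQuery + " } FILTER(" + filter + ") }";
    }
}
```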