org.aksw.commons.jena.util
Class CommonProperties
java.lang.Object
org.aksw.commons.jena.util.CommonProperties
@Guarded
public class CommonProperties
- extends Object
- Author:
- Konrad Höffner
| Methods inherited from class java.lang.Object |
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
CommonProperties
public CommonProperties()
getCommonProperties
public static LinkedHashMap<String,Integer> getCommonProperties(@NotEmpty @NotNull String endpoint,
                                                                @NotEmpty @NotNull String where,
                                                                @Range(min=0.0,max=1.0) Double threshold,
                                                                @Min(value=1.0) Integer maxResultSize,
                                                                @Min(value=1.0) Integer sampleSize)
- For a given SPARQL where clause, creates a table of the most common properties of the matching instances.
Also available with a file cache: CachedCommonProperties.
The following example shows the 5 most common properties for the where clause "?s a dbpedia-owl:Settlement".
Attention: You may only use ?s, ?p and ?o as variable names for subject, predicate and object respectively.
| p | count |
| http://www.w3.org/1999/02/22-rdf-syntax-ns#type | 50 |
| http://www.w3.org/2000/01/rdf-schema#label | 50 |
| http://xmlns.com/foaf/0.1/page | 50 |
| http://www.w3.org/2000/01/rdf-schema#comment | 49 |
| http://purl.org/dc/terms/subject | 49 |
- Parameters:
endpoint - the URL of the SPARQL endpoint to be queried.
where - the contents of a SPARQL select "where" clause, which may only use ?s, ?p and ?o as variable names for subject, predicate and object.
threshold - a value between 0 and 1, specifying what fraction of the instances must have a property for it to be counted as a common property. Set to null if you want no restriction on this.
maxResultSize - a positive integer (at least 1), specifying the maximum number of properties to return.
sampleSize - the number of instances whose triples are examined. Set to null to look at all triples (may take a long time). Note that a sample may give a wrong result even for a big sample size, because the sample is not random: the selection depends on the SPARQL server (uses Virtuoso SPARQL for subqueries).
- Returns:
- the most common properties sorted by occurrence in descending order.
Each property p is counted at most once for each instance s, even if there are multiple triples (s,p,o).
Example: a threshold of 0.5 returns only properties that are used by at least half of the URIs examined.
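The counting rule described above can be illustrated with a minimal in-memory sketch. It is not the implementation of this class, which queries a SPARQL endpoint; the class name CommonPropertiesSketch, the method countCommonProperties and the sample triples are hypothetical and exist only for illustration. Treating a null maxResultSize as "no limit" is likewise an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of org.aksw.commons.jena.util.
class CommonPropertiesSketch {

    /**
     * Counts properties over in-memory triples, each given as {s, p, o}.
     * threshold == null means no restriction; maxResultSize == null is
     * treated as "no limit" (an assumption made for this sketch).
     */
    static LinkedHashMap<String, Integer> countCommonProperties(
            List<String[]> triples, Double threshold, Integer maxResultSize) {
        Set<String> instances = new HashSet<>();
        Set<String> seenPairs = new HashSet<>();
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String[] t : triples) {
            instances.add(t[0]);
            // Count each property p at most once per instance s,
            // even if there are several triples (s, p, o).
            if (seenPairs.add(t[0] + "\u0000" + t[1])) {
                counts.merge(t[1], 1, Integer::sum);
            }
        }

        // Sort by occurrence in descending order (the sort is stable).
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());

        LinkedHashMap<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            if (maxResultSize != null && result.size() >= maxResultSize) {
                break;
            }
            // Keep a property only if enough of the instances use it.
            if (threshold == null || e.getValue() >= threshold * instances.size()) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> triples = Arrays.asList(
                new String[] {"s1", "rdf:type", "A"},
                new String[] {"s1", "rdfs:label", "x"},
                new String[] {"s1", "rdfs:label", "y"}, // same (s, p): counted once
                new String[] {"s2", "rdf:type", "A"},
                new String[] {"s2", "rdfs:label", "z"},
                new String[] {"s3", "rdf:type", "A"},
                new String[] {"s3", "foaf:page", "w"});
        // With threshold 0.5 and three instances, a property needs a count of at least 1.5.
        System.out.println(countCommonProperties(triples, 0.5, 5));
        // prints {rdf:type=3, rdfs:label=2}
    }
}
```

In the sample data, rdfs:label occurs twice for s1 but contributes only one count for that instance, and foaf:page is dropped because it is used by fewer than half of the three instances.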
- See Also:
CachedCommonProperties
Copyright © 2012. All Rights Reserved.