Class EuclideanBlockingModule
- java.lang.Object
-
- org.aksw.limes.core.measures.mapper.space.blocking.EuclideanBlockingModule
-
- All Implemented Interfaces:
IBlockingModule
public class EuclideanBlockingModule extends Object implements IBlockingModule
- Author:
- Axel-C. Ngonga Ngomo (ngonga@informatik.uni-leipzig.de)
-
-
Constructor Summary
Constructors
- EuclideanBlockingModule(String props, String measureName, double threshold): Initializes the generator.
-
Method Summary
Methods (modifier and type, method, description)
- static ArrayList<ArrayList<Double>> addIdsToList(ArrayList<ArrayList<Double>> keys, TreeSet<String> propValues)
- static ArrayList<Double> copyList(ArrayList<Double> list)
- ArrayList<ArrayList<Integer>> getAllBlockIds(Instance a)
- ArrayList<ArrayList<Integer>> getAllSourceIds(Instance a, String sourceProps)
- ArrayList<Integer> getBlockId(Instance a): Computes the block ID for a given instance a.
- ArrayList<ArrayList<Integer>> getBlocksToCompare(ArrayList<Integer> blockId): Computes the IDs of all the blocks surrounding a given block for comparison. This is useful for parallelization, since blocking can be applied to both T and S to exploit locality.
Constructor Detail
-
EuclideanBlockingModule
public EuclideanBlockingModule(String props, String measureName, double threshold)
Initializes the generator. The basic idea is as follows: first, pick a random instance as the origin. This is the center relative to which the block IDs are computed. Each measure can return the blocking threshold that is equivalent to the similarity threshold given by the user. For Euclidean metrics, this value is the same; for metrics that squeeze space, it might not be. Note that the generation assumes that the size of props.split("|") equals the number of dimensions.
- Parameters:
props - List of properties that make up each dimension
measureName - Name of the measure used to compute the similarity of instances
threshold - General similarity threshold for the metric. This threshold is transformed into a distance threshold: for sim = a, d = (1 - a)/a. The space tiling is carried out according to distances, not similarities. Still, we can ensure that all points within the similarity range are found.
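The threshold transformation described above can be illustrated with a minimal sketch. It assumes the similarity is defined as sim = 1/(1 + d), which is the relation that yields d = (1 - a)/a. The class and method names are hypothetical and not part of the LIMES API. The sketch also counts dimensions by splitting on a literal pipe; since Java's String.split takes a regular expression, the pipe is quoted here, which is an assumption about the intended behavior.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of LIMES: illustrates the documented
// conversion from a similarity threshold to a distance threshold.
final class ThresholdConversionSketch {

    // Assumes sim = 1 / (1 + d), hence d = (1 - sim) / sim.
    static double similarityToDistance(double similarity) {
        if (similarity <= 0.0 || similarity > 1.0) {
            throw new IllegalArgumentException("similarity must be in (0, 1]");
        }
        return (1.0 - similarity) / similarity;
    }

    // Counts the dimensions encoded in a pipe-separated property string.
    // Pattern.quote makes the pipe a literal instead of a regex alternation.
    static int countDimensions(String props) {
        return props.split(Pattern.quote("|")).length;
    }
}
```

Under these assumptions, a similarity threshold of 0.5 corresponds to a distance threshold of 1.0, and a similarity threshold of 1.0 corresponds to a distance of 0.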
-
-
Method Detail
-
addIdsToList
public static ArrayList<ArrayList<Double>> addIdsToList(ArrayList<ArrayList<Double>> keys, TreeSet<String> propValues)
-
getBlockId
public ArrayList<Integer> getBlockId(Instance a)
Computes the block ID for a given instance a. The idea behind the blocking is to tile the target space into blocks of size threshold^dimensions. Each instance s from the source space is then compared with the block containing s and the blocks lying directly around it.
- Specified by:
getBlockId in interface IBlockingModule
- Parameters:
a - The instance whose block ID is to be computed
- Returns:
The ID of the block containing a
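As an illustration of the tiling idea, the following sketch maps a point to a block ID by dividing each coordinate by the block width and rounding down. It is an assumption-laden simplification: the origin is taken to be zero, the point is passed as a plain array rather than an Instance, and the names are hypothetical. The actual module may scale or offset coordinates differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the LIMES implementation: grid-based block IDs.
final class BlockIdSketch {

    // Maps each coordinate to the index of the grid cell that contains it.
    // blockWidth plays the role of the distance threshold.
    static List<Integer> blockId(double[] point, double blockWidth) {
        if (blockWidth <= 0.0) {
            throw new IllegalArgumentException("blockWidth must be positive");
        }
        List<Integer> id = new ArrayList<>(point.length);
        for (double coordinate : point) {
            // Math.floor keeps negative coordinates in the correct cell.
            id.add((int) Math.floor(coordinate / blockWidth));
        }
        return id;
    }
}
```

With a block width equal to the distance threshold, two points closer than the threshold can differ by at most one in each component of their block IDs, which is why only directly adjacent blocks need to be compared.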
-
getBlocksToCompare
public ArrayList<ArrayList<Integer>> getBlocksToCompare(ArrayList<Integer> blockId)
Computes the IDs of all the blocks surrounding a given block for comparison. This is useful for parallelization, since blocking can be applied to both T and S to exploit locality.
- Specified by:
getBlocksToCompare in interface IBlockingModule
- Parameters:
blockId - index of the block
- Returns:
the IDs of all the blocks surrounding the given block for comparison
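One way to enumerate the surrounding blocks is to vary each component of the block ID by -1, 0, and +1, which yields 3^n candidates in n dimensions (including the block itself). The sketch below is a hypothetical illustration of that enumeration under this assumption, not the module's actual code, and it uses List rather than ArrayList in its signature for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the LIMES implementation: neighbor enumeration.
final class NeighborBlocksSketch {

    // Returns the given block plus every block whose ID differs by at most
    // one in each dimension.
    static List<List<Integer>> blocksToCompare(List<Integer> blockId) {
        List<List<Integer>> result = new ArrayList<>();
        result.add(new ArrayList<>()); // start with one empty prefix
        for (int coordinate : blockId) {
            List<List<Integer>> extended = new ArrayList<>();
            for (List<Integer> prefix : result) {
                for (int offset = -1; offset <= 1; offset++) {
                    List<Integer> next = new ArrayList<>(prefix);
                    next.add(coordinate + offset);
                    extended.add(next);
                }
            }
            result = extended;
        }
        return result;
    }
}
```

Because each block's neighborhood depends only on its own ID, the comparisons for different blocks are independent of one another, which is what makes this layout attractive for parallel processing.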
-
getAllBlockIds
public ArrayList<ArrayList<Integer>> getAllBlockIds(Instance a)
- Specified by:
getAllBlockIds in interface IBlockingModule
-
getAllSourceIds
public ArrayList<ArrayList<Integer>> getAllSourceIds(Instance a, String sourceProps)
- Specified by:
getAllSourceIds in interface IBlockingModule