Class LinearSelfConfigurator

  • All Implemented Interfaces:
    ISelfConfigurator
    Direct Known Subclasses:
    BooleanSelfConfigurator

    public class LinearSelfConfigurator
    extends Object
    implements ISelfConfigurator
    Author:
    Axel-C. Ngonga Ngomo (ngonga@informatik.uni-leipzig.de), Mohamed Sherif (sherif@informatik.uni-leipzig.de), Klaus Lyko (lyko@informatik.uni-leipzig.de)
    • Field Detail

      • STRICT

        public boolean STRICT
      • ITERATIONS_MAX

        public int ITERATIONS_MAX
      • MIN_THRESHOLD

        public double MIN_THRESHOLD
      • source

        public ACache source
      • target

        public ACache target
      • learningRate

        public double learningRate
      • kappa

        public double kappa
      • min_coverage

        public double min_coverage
    • Constructor Detail

      • LinearSelfConfigurator

        public LinearSelfConfigurator​(ACache source,
                                      ACache target)
        Constructor
        Parameters:
        source - Source cache
        target - Target cache
      • LinearSelfConfigurator

        public LinearSelfConfigurator​(ACache source,
                                      ACache target,
                                      double minCoverage,
                                      double beta,
                                      Map<String,​String> measures)
        Constructor
        Parameters:
        source - Source cache
        target - Target cache
        minCoverage - Minimal coverage for a property to be considered for linking
        beta - Beta value for computing F_beta
        measures - Atomic measures
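        The beta parameter weights recall against precision in the F_beta score. The following is a minimal, self-contained sketch of the standard F_beta formula; the class and method names are hypothetical and are not part of LinearSelfConfigurator.

```java
// Hypothetical helper, not part of the documented class.
final class FBetaSketch {
    /**
     * Standard F_beta: (1 + beta^2) * P * R / (beta^2 * P + R).
     * Returns 0 when the denominator is 0 (an assumption made here
     * to avoid dividing by zero).
     */
    static double fBeta(double precision, double recall, double beta) {
        double betaSquared = beta * beta;
        double denominator = betaSquared * precision + recall;
        if (denominator == 0.0) {
            return 0.0;
        }
        return (1.0 + betaSquared) * precision * recall / denominator;
    }
}
```

        With beta = 1 precision and recall count equally; beta > 1 favors recall and beta < 1 favors precision.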
      • LinearSelfConfigurator

        public LinearSelfConfigurator​(ACache source,
                                      ACache target,
                                      double minCoverage,
                                      double beta)
        Constructor
        Parameters:
        source - Source cache
        target - Target cache
        minCoverage - Minimal coverage for a property to be considered for linking
        beta - Beta value for computing F_beta
    • Method Detail

      • setPFMType

        public void setPFMType​(LinearSelfConfigurator.QMeasureType qMeasureType)
        Sets the PFM (pseudo-F-measure) according to the given type. For the "reference" type, ReferencePseudoMeasures.class is used (Nikolov/D'Aquin/Motta, ESWC 2012).
      • setMeasure

        public void setMeasure​(IQualitativeMeasure pfm)
        Sets the PFM (pseudo-F-measure) to the given qualitative measure.
      • setDefaultMeasures

        public void setDefaultMeasures()
        Sets the default atomic measures.
      • getPropertyStats

        public static Map<String,​Double> getPropertyStats​(ACache c,
                                                                double minCoverage)
        Extracts all properties from a cache that have a coverage beyond minCoverage
        Parameters:
        c - Input cache
        minCoverage - Threshold for coverage
        Returns:
        Map of property to coverage
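        As an illustration of the idea, the sketch below computes coverage as the fraction of instances that carry at least one value for a property. This definition, the nested-map stand-in for a cache, the `>=` comparison, and all names are assumptions made for the example; the actual ACache API is not used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; a cache is modeled as instance URI -> (property -> values).
final class CoverageSketch {
    static Map<String, Double> propertyStats(Map<String, Map<String, Set<String>>> instances,
                                             double minCoverage) {
        // Count, per property, the instances that have at least one value.
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, Set<String>> properties : instances.values()) {
            for (Map.Entry<String, Set<String>> entry : properties.entrySet()) {
                if (!entry.getValue().isEmpty()) {
                    counts.merge(entry.getKey(), 1, Integer::sum);
                }
            }
        }
        // Keep the properties whose coverage reaches the threshold (assumed inclusive).
        Map<String, Double> stats = new HashMap<>();
        int total = instances.size();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            double coverage = (double) entry.getValue() / total;
            if (coverage >= minCoverage) {
                stats.put(entry.getKey(), coverage);
            }
        }
        return stats;
    }
}
```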
      • getAllInitialClassifiers

        public List<SimpleClassifier> getAllInitialClassifiers()
        Computes all initial classifiers that compare properties whose coverage is beyond the coverage threshold
        Returns:
        List of initial classifiers, each comparing a source property with a target property
      • getBestInitialClassifiers

        public List<SimpleClassifier> getBestInitialClassifiers()
        Computes the best initial mapping for each source property
        Returns:
        List of classifiers that each contain the best initial mappings
      • getBestInitialClassifiers

        public List<SimpleClassifier> getBestInitialClassifiers​(Set<String> measureList)
        Computes the best initial mapping for each source property
        Parameters:
        measureList - Names of the measures to be used, e.g. "jaccard", "levenshtein", "trigrams", "cosine", ...
        Returns:
        List of classifiers that each contain the best initial mappings
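        Selecting the best initial mapping per source property can be pictured as a per-key maximum over scored candidates. The sketch below assumes each candidate already carries a quality score; the Candidate type and its fields are hypothetical stand-ins, not the SimpleClassifier API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating "best candidate per source property".
final class BestInitialSketch {
    /** Hypothetical stand-in for a scored atomic classifier. */
    static final class Candidate {
        final String sourceProperty;
        final String targetProperty;
        final String measure;
        final double quality;

        Candidate(String sourceProperty, String targetProperty, String measure, double quality) {
            this.sourceProperty = sourceProperty;
            this.targetProperty = targetProperty;
            this.measure = measure;
            this.quality = quality;
        }
    }

    /** Keeps, for every source property, the candidate with the highest quality. */
    static List<Candidate> bestPerSourceProperty(List<Candidate> candidates) {
        Map<String, Candidate> best = new LinkedHashMap<>();
        for (Candidate candidate : candidates) {
            Candidate current = best.get(candidate.sourceProperty);
            if (current == null || candidate.quality > current.quality) {
                best.put(candidate.sourceProperty, candidate);
            }
        }
        return new ArrayList<>(best.values());
    }
}
```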
      • getMapping

        public AMapping getMapping​(List<SimpleClassifier> classifiers)
        Runs classifiers and retrieves the corresponding mappings
        Parameters:
        classifiers - List of classifiers
        Returns:
        AMapping generated by the list of classifiers
      • getOverallMapping

        public AMapping getOverallMapping​(Map<SimpleClassifier,​AMapping> mappings,
                                          double threshold)
        Computes the weighted linear combination of the similarity computed by the single classifiers
        Parameters:
        mappings - Maps classifiers to their results
        threshold - Similarity threshold for exclusion
        Returns:
        Resulting overall mapping
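        A weighted linear combination of per-classifier similarities can be sketched as follows. Results are modeled as plain maps from a "source|target" key to a similarity, and the weights as a parallel array; these representations and all names are assumptions for illustration and do not reflect the AMapping or SimpleClassifier types.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: overall(s, t) = sum_i weight_i * sim_i(s, t).
final class LinearCombinationSketch {
    static Map<String, Double> combine(List<Map<String, Double>> results,
                                       double[] weights,
                                       double threshold) {
        Map<String, Double> overall = new HashMap<>();
        for (int i = 0; i < results.size(); i++) {
            for (Map.Entry<String, Double> link : results.get(i).entrySet()) {
                // A link missing from one result contributes 0 for that classifier.
                overall.merge(link.getKey(), weights[i] * link.getValue(), Double::sum);
            }
        }
        // Exclude links whose combined similarity falls below the threshold.
        overall.values().removeIf(similarity -> similarity < threshold);
        return overall;
    }
}
```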
      • getInitialOverallClassifiers

        public List<SimpleClassifier> getInitialOverallClassifiers​(List<SimpleClassifier> classifiers)
        Updates the weights of the classifiers such that they map the initial conditions for a classifier
        Parameters:
        classifiers - Input classifiers
        Returns:
        Normed classifiers
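        The return value is described as "normed" classifiers. One plausible reading, assumed here and not confirmed by this page, is that the classifier weights are rescaled so that they sum to 1. A minimal sketch of that normalization, with hypothetical names:

```java
// Hypothetical helper; assumes "normed" means weights rescaled to sum to 1.
final class WeightNormalizationSketch {
    static double[] normalize(double[] weights) {
        double sum = 0.0;
        for (double weight : weights) {
            sum += weight;
        }
        double[] normed = weights.clone();
        if (sum == 0.0) {
            return normed; // nothing to rescale; return an unchanged copy
        }
        for (int i = 0; i < normed.length; i++) {
            normed[i] /= sum;
        }
        return normed;
    }
}
```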
      • computeNext

        public double computeNext​(List<SimpleClassifier> classifiers,
                                  int index)
        Aims to improve upon a particular classifier by checking whether adding a delta to its similarity worsens the total classifier
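        The description suggests a single hill-climbing step on one classifier. The sketch below shows such a step under stated assumptions: classifiers are reduced to an array of numeric parameters, the quality function is supplied by the caller, and the change is kept only if quality does not drop. None of these names come from the documented class.

```java
import java.util.function.ToDoubleFunction;

// Hypothetical helper: one coordinate-wise hill-climbing step.
final class CoordinateStepSketch {
    /**
     * Adds delta to parameters[index], keeps the change if quality does not
     * get worse, otherwise restores the old value. Returns the resulting quality.
     */
    static double step(double[] parameters, int index, double delta,
                       ToDoubleFunction<double[]> quality) {
        double before = quality.applyAsDouble(parameters);
        double oldValue = parameters[index];
        parameters[index] = oldValue + delta;
        double after = quality.applyAsDouble(parameters);
        if (after < before) {
            parameters[index] = oldValue; // the step worsened the result: undo it
            return before;
        }
        return after;
    }
}
```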
      • executeClassifier

        public AMapping executeClassifier​(SimpleClassifier c,
                                          double threshold)
        Runs a classifier and gets the mapping for it
        Parameters:
        c - Classifier
        threshold - Threshold for similarities
        Returns:
        Corresponding mapping
      • execute

        public AMapping execute​(String sourceProperty,
                                String targetProperty,
                                String measure,
                                double threshold)
        Runs measure(sourceProperty, targetProperty) >= threshold
        Parameters:
        sourceProperty - Source property
        targetProperty - Target property
        measure - Similarity measure
        threshold - Similarity threshold
        Returns:
        Corresponding AMapping
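        To make the condition measure(sourceProperty, targetProperty) >= threshold concrete, the sketch below compares one property value per instance using a word-level Jaccard similarity and keeps the pairs that reach the threshold. It is a naive all-pairs comparison with hypothetical names and data structures, written only for illustration; it does not show how this class evaluates measures.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: link all pairs whose Jaccard similarity >= threshold.
final class ThresholdedMeasureSketch {
    /** Jaccard similarity over lower-cased, whitespace-separated tokens. */
    static double jaccard(String a, String b) {
        Set<String> tokensA = new HashSet<>(Arrays.asList(a.toLowerCase().split("\\s+")));
        Set<String> tokensB = new HashSet<>(Arrays.asList(b.toLowerCase().split("\\s+")));
        Set<String> intersection = new HashSet<>(tokensA);
        intersection.retainAll(tokensB);
        Set<String> union = new HashSet<>(tokensA);
        union.addAll(tokensB);
        return union.isEmpty() ? 0.0 : (double) intersection.size() / union.size();
    }

    /** Inputs map instance URI -> value of the compared property. */
    static Map<String, Double> link(Map<String, String> sourceValues,
                                    Map<String, String> targetValues,
                                    double threshold) {
        Map<String, Double> links = new HashMap<>();
        for (Map.Entry<String, String> s : sourceValues.entrySet()) {
            for (Map.Entry<String, String> t : targetValues.entrySet()) {
                double similarity = jaccard(s.getValue(), t.getValue());
                if (similarity >= threshold) {
                    links.put(s.getKey() + "|" + t.getKey(), similarity);
                }
            }
        }
        return links;
    }
}
```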
      • getBestOneToOneMapping

        public AMapping getBestOneToOneMapping​(AMapping m)
        Gets the best target for each source and returns it
        Parameters:
        m - Input mapping
        Returns:
        Mapping that contains only the best target for each source
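        Keeping the best target per source amounts to a per-source maximum over similarities. The sketch below models a mapping as source -> (target -> similarity); this representation and the names are assumptions for the example, and tie-breaking is left to iteration order.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: keep only the highest-scoring target of every source.
final class BestTargetSketch {
    static Map<String, Map<String, Double>> bestTargetPerSource(
            Map<String, Map<String, Double>> mapping) {
        Map<String, Map<String, Double>> result = new HashMap<>();
        for (Map.Entry<String, Map<String, Double>> source : mapping.entrySet()) {
            String bestTarget = null;
            double bestSimilarity = Double.NEGATIVE_INFINITY;
            for (Map.Entry<String, Double> target : source.getValue().entrySet()) {
                if (target.getValue() > bestSimilarity) {
                    bestTarget = target.getKey();
                    bestSimilarity = target.getValue();
                }
            }
            if (bestTarget != null) { // sources without any target are dropped
                result.put(source.getKey(),
                        Collections.singletonMap(bestTarget, bestSimilarity));
            }
        }
        return result;
    }
}
```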
      • computeQuality

        public Double computeQuality​(AMapping map)
        Computes the quality of a mapping. By default, uses the specified PFM qMeasure. TODO: active learning variant.
        Parameters:
        map - Mapping to evaluate
        Returns:
        Quality of the mapping
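        A pseudo-F-measure estimates quality without reference links. The sketch below uses one formulation, assumed here for illustration: pseudo-precision is the number of linked sources divided by the number of links, and pseudo-recall is the number of linked sources plus linked targets divided by the total number of instances. The measure configured in this class may differ, and all names are hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; links are modeled as source URI -> set of target URIs.
final class PseudoMeasureSketch {
    /** Linked sources / total links; equals 1 when every source has exactly one target. */
    static double pseudoPrecision(Map<String, Set<String>> links) {
        int linkedSources = 0;
        int totalLinks = 0;
        for (Set<String> targets : links.values()) {
            if (!targets.isEmpty()) {
                linkedSources++;
                totalLinks += targets.size();
            }
        }
        return totalLinks == 0 ? 0.0 : (double) linkedSources / totalLinks;
    }

    /** (Linked sources + linked targets) / (|source| + |target|). */
    static double pseudoRecall(Map<String, Set<String>> links, int sourceSize, int targetSize) {
        int linkedSources = 0;
        Set<String> linkedTargets = new HashSet<>();
        for (Set<String> targets : links.values()) {
            if (!targets.isEmpty()) {
                linkedSources++;
                linkedTargets.addAll(targets);
            }
        }
        int total = sourceSize + targetSize;
        return total == 0 ? 0.0 : (double) (linkedSources + linkedTargets.size()) / total;
    }
}
```

        The two values would then be combined into an F_beta score using the beta value given to the constructor.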
      • setSupervisedBatch

        public void setSupervisedBatch​(AMapping reference)
        Set caches to trimmed caches according to the given reference mapping.
        Parameters:
        reference - Reference mapping used to trim the caches
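        Trimming can be pictured as restricting each cache to the instances that occur in the reference mapping. The sketch below assumes caches are maps keyed by instance URI and the reference mapping is source URI -> set of target URIs; these structures and names are hypothetical and stand in for ACache and AMapping.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: restrict caches to instances named in a reference mapping.
final class TrimCachesSketch {
    /** Returns a copy of the cache that holds only the URIs listed in keep. */
    static <V> Map<String, V> trim(Map<String, V> cache, Set<String> keep) {
        Map<String, V> trimmed = new HashMap<>(cache);
        trimmed.keySet().retainAll(keep);
        return trimmed;
    }

    /** Collects every target URI that appears in the reference mapping. */
    static Set<String> targetsOf(Map<String, Set<String>> reference) {
        Set<String> targets = new HashSet<>();
        for (Set<String> linked : reference.values()) {
            targets.addAll(linked);
        }
        return targets;
    }
}
```

        In this sketch the source cache would be trimmed with reference.keySet() and the target cache with targetsOf(reference).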
      • getSource

        public ACache getSource()
      • setSource

        public void setSource​(ACache source)
      • getTarget

        public ACache getTarget()
      • setTarget

        public void setTarget​(ACache target)