Interface IStringMeasure

    • Method Detail

      • getPrefixLength

        int getPrefixLength(int tokensNumber,
                            double threshold)
        Length of prefix to consider when mapping the input string with other strings.
        Parameters:
        tokensNumber - Size of the input string in tokens
        threshold - Similarity threshold
        Returns:
        Prefix length
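        The interface does not state how the prefix length is derived. As an illustration only, the sketch below assumes a Jaccard measure over token sets, for which the usual prefix-filtering bound is tokensNumber - ceil(threshold * tokensNumber) + 1. The class name JaccardPrefixSketch is hypothetical and not part of this API.

```java
// Hypothetical helper, assuming Jaccard similarity over token sets.
final class JaccardPrefixSketch {

    // Number of leading tokens that must be inspected so that any
    // candidate reaching the threshold shares at least one of them.
    static int getPrefixLength(int tokensNumber, double threshold) {
        return tokensNumber - (int) Math.ceil(threshold * tokensNumber) + 1;
    }
}
```

        With 10 tokens and a threshold of 0.75, this bound gives a prefix of 3 tokens. Other measures would need a different bound.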
      • getMidLength

        int getMidLength(int tokensNumber,
                         double threshold)
        Threshold for the length of the tokens to be indexed
        Parameters:
        tokensNumber - Number of tokens of current input
        threshold - Similarity threshold
        Returns:
        Length of tokens to be indexed
      • getSizeFilteringThreshold

        double getSizeFilteringThreshold(int tokensNumber,
                                         double threshold)
      • getAlpha

        int getAlpha(int xTokensNumber,
                     int yTokensNumber,
                     double threshold)
        Threshold for the positional filtering
        Parameters:
        xTokensNumber - Size of the first input string
        yTokensNumber - Size of the second input string
        threshold - Similarity threshold
        Returns:
        Threshold for positional filtering
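        The documentation above does not give a formula for this value. As a hedged sketch, assuming a Jaccard measure and assuming that alpha denotes the minimum token overlap two strings need in order to reach the threshold, the usual bound is ceil(threshold / (1 + threshold) * (xTokensNumber + yTokensNumber)). The class name JaccardAlphaSketch is hypothetical.

```java
// Hypothetical helper, assuming Jaccard similarity over token sets.
final class JaccardAlphaSketch {

    // Minimum overlap required for |x ∩ y| / |x ∪ y| >= threshold,
    // derived from |x ∪ y| = |x| + |y| - |x ∩ y|.
    static int getAlpha(int xTokensNumber, int yTokensNumber, double threshold) {
        double bound = threshold / (1.0 + threshold) * (xTokensNumber + yTokensNumber);
        return (int) Math.ceil(bound);
    }
}
```

        For two strings of 5 tokens each and a threshold of 0.8, the bound is about 4.44, so at least 5 shared tokens are needed. Because the bound is computed in floating point, values that land exactly on an integer may be rounded up by one.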
      • getSimilarity

        double getSimilarity(int overlap,
                             int lengthA,
                             int lengthB)
        Returns the similarity of two strings given their lengths and their overlap. Useful when these values are already known, so that they do not have to be computed anew.
        Parameters:
        overlap - Overlap of strings A and B
        lengthA - Length of A
        lengthB - Length of B
        Returns:
        Similarity of A and B
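        To show what computing a similarity from these three values can look like, here is a minimal sketch that assumes a Jaccard measure, where the union size follows from the two lengths and the overlap. The class name JaccardOverlapSketch is hypothetical; an actual implementation of this interface may use a different formula.

```java
// Hypothetical helper, assuming Jaccard similarity over token sets.
final class JaccardOverlapSketch {

    // Jaccard = |A ∩ B| / |A ∪ B|, with |A ∪ B| = |A| + |B| - |A ∩ B|.
    static double getSimilarity(int overlap, int lengthA, int lengthB) {
        int union = lengthA + lengthB - overlap;
        if (union == 0) {
            return 1.0; // assumption: two empty inputs count as identical
        }
        return (double) overlap / union;
    }
}
```

        For an overlap of 2 with lengths 3 and 4, the union has 5 elements and the similarity is 0.4.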
      • computableViaOverlap

        boolean computableViaOverlap()
        Returns true if this similarity function can be computed solely via getSimilarity(overlap, lengthA, lengthB).
        Returns:
        True if it's possible, else false
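        A caller might use this flag to decide between the overlap shortcut and a full comparison. The sketch below is one possible pattern, not the library's own code: OverlapMeasureSketch is a hypothetical cut-down mirror of the two methods documented above, and the fallback supplier stands in for whatever full computation the caller would otherwise run.

```java
import java.util.function.DoubleSupplier;

// Hypothetical mirror of the two relevant methods of the interface.
interface OverlapMeasureSketch {
    double getSimilarity(int overlap, int lengthA, int lengthB);
    boolean computableViaOverlap();
}

final class OverlapDispatchSketch {

    // Uses the overlap shortcut when the measure supports it,
    // otherwise defers to the caller-supplied full computation.
    static double similarity(OverlapMeasureSketch measure,
                             int overlap, int lengthA, int lengthB,
                             DoubleSupplier fallback) {
        if (measure.computableViaOverlap()) {
            return measure.getSimilarity(overlap, lengthA, lengthB);
        }
        return fallback.getAsDouble();
    }
}
```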