PHP.nl

levenshtein

Calculate Levenshtein distance between two strings

int **levenshtein**(string $string1, string $string2, int $insertion_cost = 1, int $replacement_cost = 1, int $deletion_cost = 1)

The Levenshtein distance is defined as the minimal number of characters you have to replace, insert or delete to transform `string1` into `string2`. The complexity of the algorithm is `O(m*n)`, where `n` and `m` are the lengths of `string1` and `string2` (rather good when compared to `similar_text`, which is `O(max(n,m)**3)`, but still expensive).
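To make the `O(m*n)` cost concrete, here is a minimal sketch of the textbook dynamic-programming approach with unit costs. The function name `levenshtein_sketch` is hypothetical and is not part of PHP; it is only meant to illustrate the definition above, not to describe how the built-in function is implemented.

```php
<?php
// Hypothetical helper (not part of PHP): textbook dynamic-programming
// Levenshtein distance with unit costs, comparing the strings byte by byte.
function levenshtein_sketch(string $a, string $b): int
{
    $m = strlen($a);
    $n = strlen($b);

    // $prev[$j] holds the distance between the bytes of $a processed
    // so far and the first $j bytes of $b. Row 0 is 0, 1, ..., $n.
    $prev = range(0, $n);

    for ($i = 1; $i <= $m; $i++) {
        $curr = [$i]; // turning $i bytes into an empty string takes $i deletions
        for ($j = 1; $j <= $n; $j++) {
            $cost = ($a[$i - 1] === $b[$j - 1]) ? 0 : 1;
            $curr[$j] = min(
                $prev[$j] + 1,         // delete a byte from $a
                $curr[$j - 1] + 1,     // insert a byte from $b
                $prev[$j - 1] + $cost  // replace, or keep when the bytes match
            );
        }
        $prev = $curr;
    }

    return $prev[$n];
}

var_dump(levenshtein_sketch('kitten', 'sitting')); // int(3)
```

The two nested loops visit every pair of positions once, which is where the `m*n` factor comes from.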

If `insertion_cost`, `replacement_cost` and/or `deletion_cost` are unequal to `1`, the algorithm adapts to choose the cheapest transforms. E.g. if `$insertion_cost + $deletion_cost < $replacement_cost`, no replacements will be done, but rather inserts and deletions instead.
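The effect of the cost parameters can be illustrated with a short sketch. The strings and cost values below are arbitrary choices for demonstration only.

```php
<?php
// Default costs (all 1): turning "cat" into "cut" takes one replacement.
$default = levenshtein('cat', 'cut');

// Replacement now costs 5, while one deletion plus one insertion costs 2,
// so the cheaper path is to delete "a" and insert "u".
$weighted = levenshtein('cat', 'cut', 1, 5, 1);

var_dump($default);  // int(1)
var_dump($weighted); // int(2)
```

Because insertion and deletion are priced separately, the result with custom costs can depend on the order of the two string arguments.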

`string1`: One of the strings being evaluated for Levenshtein distance.

`string2`: One of the strings being evaluated for Levenshtein distance.

`insertion_cost`: Defines the cost of insertion.

`replacement_cost`: Defines the cost of replacement.

`deletion_cost`: Defines the cost of deletion.

This function returns the Levenshtein distance between the two argument strings.
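As a quick illustration of the return value, the following sketch checks two boundary cases with default costs. The variable names are chosen here for illustration.

```php
<?php
$same  = levenshtein('carrot', 'carrot'); // identical strings need no edits
$empty = levenshtein('', 'pea');          // building "pea" from nothing takes three insertions

var_dump($same);  // int(0)
var_dump($empty); // int(3)
```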

Example: levenshtein() example

<?php
// input misspelled word
$input = 'carrrot';

// array of words to check against
$words  = array('apple','pineapple','banana','orange',
                'radish','carrot','pea','bean','potato');

// no shortest distance found, yet
$shortest = -1;

// loop through words to find the closest
foreach ($words as $word) {

    // calculate the distance between the input word,
    // and the current word
    $lev = levenshtein($input, $word);

    // check for an exact match
    if ($lev == 0) {

        // closest word is this one (exact match)
        $closest = $word;
        $shortest = 0;

        // break out of the loop; we've found an exact match
        break;
    }

    // if this distance is less than or equal to the shortest distance
    // found so far, OR if no shortest distance has been found yet
    if ($lev <= $shortest || $shortest < 0) {
        // set the closest match, and shortest distance
        $closest  = $word;
        $shortest = $lev;
    }
}

echo "Input word: $input\n";
if ($shortest == 0) {
    echo "Exact match found: $closest\n";
} else {
    echo "Did you mean: $closest?\n";
}

?>

The above example will output:

Input word: carrrot
Did you mean: carrot?
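The same search can be wrapped in a reusable function. The sketch below is one possible shape, not part of the original example: the name `closest_word` is hypothetical, and unlike the loop above it uses a strict `<` comparison, so ties go to the earliest candidate rather than the latest.

```php
<?php
// Hypothetical helper (not part of PHP): returns the candidate with the
// smallest Levenshtein distance to $input, or null for an empty list.
function closest_word(string $input, array $words): ?string
{
    $closest  = null;
    $shortest = PHP_INT_MAX; // no distance found yet

    foreach ($words as $word) {
        $lev = levenshtein($input, $word);

        if ($lev < $shortest) {
            $closest  = $word;
            $shortest = $lev;

            // an exact match cannot be beaten, so stop early
            if ($lev === 0) {
                break;
            }
        }
    }

    return $closest;
}

$words = ['apple', 'pineapple', 'banana', 'orange',
          'radish', 'carrot', 'pea', 'bean', 'potato'];

echo closest_word('carrrot', $words), "\n"; // carrot
```

Returning `null` for an empty list lets the caller distinguish "no candidates" from a real match.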

See also: `soundex`, `similar_text`, `metaphone`