levenshtein
Calculate Levenshtein distance between two strings
int **levenshtein**(string $string1, string $string2, int $insertion_cost = 1, int $replacement_cost = 1, int $deletion_cost = 1)
The Levenshtein distance is defined as the minimal number of
characters you have to replace, insert or delete to transform
`string1` into `string2`.
The complexity of the algorithm is `O(m*n)`,
where `n` and `m` are the
lengths of `string1` and `string2`
(rather good when compared to
`similar_text()`, which is `O(max(n,m)**3)`,
but still expensive).
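The `O(m*n)` cost comes from filling a table with one cell per pair of string positions. The following is a minimal sketch of that dynamic-programming idea, assuming unit costs; `levenshtein_sketch()` is a hypothetical helper written for illustration and is not PHP's internal implementation.

```php
<?php
// Hypothetical helper: the classic row-by-row dynamic-programming approach
// with unit costs. It works on bytes, like the built-in function.
function levenshtein_sketch(string $a, string $b): int
{
    $m = strlen($a);
    $n = strlen($b);

    // $prev[$j] is the distance between an empty prefix of $a
    // and the first $j bytes of $b (that is, $j insertions).
    $prev = range(0, $n);

    for ($i = 1; $i <= $m; $i++) {
        // Turning the first $i bytes of $a into an empty string takes $i deletions.
        $curr = [$i];
        for ($j = 1; $j <= $n; $j++) {
            $cost = ($a[$i - 1] === $b[$j - 1]) ? 0 : 1;
            $curr[$j] = min(
                $prev[$j] + 1,         // delete a byte of $a
                $curr[$j - 1] + 1,     // insert a byte of $b
                $prev[$j - 1] + $cost  // replace, or keep when the bytes match
            );
        }
        $prev = $curr;
    }

    return $prev[$n];
}

// The sketch should agree with the built-in function.
var_dump(levenshtein_sketch('kitten', 'sitting') === levenshtein('kitten', 'sitting'));
```

The two nested loops visit every one of the `m*n` cells exactly once, which is where the quadratic cost comes from. Only two rows are kept in memory at a time.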
If `insertion_cost`,
`replacement_cost` and/or `deletion_cost` are unequal to `1`,
the algorithm adapts to choose the cheapest transforms.
E.g. if `$insertion_cost + $deletion_cost < $replacement_cost`,
no replacements will be done, but rather inserts and deletions instead.
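A short illustration of how custom costs change the result may help here. The strings and cost values below are arbitrary choices for the example.

```php
<?php
// Default costs (all 1): a single replacement turns "a" into "b".
$default = levenshtein('a', 'b');

// Insertion and deletion cost 1 each, replacement costs 3.
// Deleting "a" and inserting "b" (total 2) is cheaper than replacing (3).
$custom = levenshtein('a', 'b', 1, 3, 1);

echo $default, "\n"; // 1
echo $custom, "\n";  // 2
```

Note the argument order: insertion, replacement, deletion.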
`string1`: One of the strings being evaluated for Levenshtein distance.
`string2`: One of the strings being evaluated for Levenshtein distance.
`insertion_cost`: Defines the cost of insertion.
`replacement_cost`: Defines the cost of replacement.
`deletion_cost`: Defines the cost of deletion.
This function returns the Levenshtein distance between the two argument strings.
Example: levenshtein() example
<?php
// input misspelled word
$input = 'carrrot';

// array of words to check against
$words = array('apple', 'pineapple', 'banana', 'orange',
               'radish', 'carrot', 'pea', 'bean', 'potato');

// no shortest distance found, yet
$shortest = -1;

// loop through words to find the closest
foreach ($words as $word) {

    // calculate the distance between the input word
    // and the current word
    $lev = levenshtein($input, $word);

    // check for an exact match
    if ($lev == 0) {

        // closest word is this one (exact match)
        $closest = $word;
        $shortest = 0;

        // break out of the loop; we've found an exact match
        break;
    }

    // if this distance is no greater than the shortest distance
    // found so far, OR if no shortest distance has been found yet
    if ($lev <= $shortest || $shortest < 0) {
        // set the closest match, and shortest distance
        $closest = $word;
        $shortest = $lev;
    }
}

echo "Input word: $input\n";
if ($shortest == 0) {
    echo "Exact match found: $closest\n";
} else {
    echo "Did you mean: $closest?\n";
}
?>
The above example will output:
Input word: carrrot
Did you mean: carrot?
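The example above always suggests some word, even when nothing in the list is close to the input. One possible variation, sketched here, wraps the search in a function and rejects matches beyond a maximum distance. The name `closest_word()` and the `$maxDistance` parameter are hypothetical and chosen for illustration only.

```php
<?php
// Hypothetical helper: returns the candidate with the smallest Levenshtein
// distance to $input, or null when no candidate is within $maxDistance edits.
function closest_word(string $input, array $words, int $maxDistance): ?string
{
    $closest = null;
    $shortest = PHP_INT_MAX;

    foreach ($words as $word) {
        $lev = levenshtein($input, $word);

        // Strict comparison: on a tie, the first candidate found is kept.
        if ($lev < $shortest) {
            $closest = $word;
            $shortest = $lev;
        }
    }

    return $shortest <= $maxDistance ? $closest : null;
}

$words = ['apple', 'banana', 'carrot', 'potato'];

var_dump(closest_word('carrrot', $words, 2));   // string(6) "carrot"
var_dump(closest_word('xylophone', $words, 2)); // NULL
```

A suitable threshold depends on the word lengths involved; the value 2 here is only an assumption for the demonstration.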
See also: `soundex()`, `similar_text()`, `metaphone()`