Author: sftriman Date: Apr 3, 2008 02:47
I am looking for a way, without defining a custom dictionary, to
get a list of suggested words for a misspelled word. Or better, "the"
most likely intended word for a misspelled word.
My base case to consider is:
dmr wjite saddle
which refers to a brand (DMR) and color (white) of a bike part
(saddle).
Ideally, dmr would return no suggestion, and wjite would return the
string "white", though I can certainly understand why "write" is
an equally good suggestion. I would be willing to define an add-on
dictionary of words to ignore, such as brands and abbreviations
known to me (DMR, for example), so that case can be handled.
ispell -a yields:
Author: David Filmer Date: Apr 3, 2008 12:29
sftriman wrote:
> get a list of suggested words for a misspelled word. Or better, "the"
> most likely intended word for a misspelled word.
Author: Joost Diepenmaat Date: Apr 3, 2008 12:34
sftriman <...@yahoo.com> writes:
> I am looking for a way to, without custom defining a dictionary, to
> get a list of suggested words for a misspelled word. Or better, "the"
> most likely intended word for a misspelled word.
Author: Ben Bullock Date: Apr 3, 2008 23:27
On Apr 3, 6:47 pm, sftriman <...@yahoo.com> wrote:
> I am looking for a way to, without custom defining a dictionary, to
> get a list of suggested words for a misspelled word. Or better, "the"
> most likely intended word for a misspelled word.
> from which I could easily pass on the dmr suggestions, but, scoring
> and evaluating the suggestions for wjite is harder. "white" and
> "write" are 'ranked' (I guess) 3rd, 4th, and 7th.
Author: Ted Zlatanov Date: Apr 4, 2008 08:58
On Thu, 3 Apr 2008 23:27:56 -0700 (PDT) Ben Bullock <...@gmail.com> wrote:
BB> On Apr 3, 6:47 pm, sftriman <...@yahoo.com> wrote:
>> I am looking for a way to, without custom defining a dictionary, to
>> get a list of suggested words for a misspelled word. Or better, "the"
>> most likely intended word for a misspelled word.
>> from which I could easily pass on the dmr suggestions, but, scoring
>> and evaluating the suggestions for wjite is harder. "white" and
>> "write" are 'ranked' (I guess) 3rd, 4th, and 7th.
BB> One thing which might help you rank the strings is the "Levenshtein
BB> distance". This gives you the "difference" between two strings as a
BB> number. I don't know if it is on CPAN but there is a module found
BB> here:
BB> The documentation is here:
BB> Presumably the string with the smallest Levenshtein distance from the
BB> input string would be the most likely candidate for the spelling
BB> checker, although some very rare words might have small distances.
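
To make the ranking idea quoted above concrete, here is a rough sketch of Levenshtein distance used to order a list of spelling suggestions. It is written in Python purely for illustration and is not the module mentioned in the thread. The function names `levenshtein` and `rank_suggestions` and the candidate list are made up for this example, and real `ispell -a` output may well look different.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character insertions, deletions
    and substitutions (standard dynamic-programming formulation)."""
    # Keep the shorter string as b so each row stays as small as possible.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca -> cb (or match)
            ))
        previous = current
    return previous[-1]


def rank_suggestions(word, suggestions):
    """Return suggestions ordered by distance to word.
    sorted() is stable, so ties keep the order the checker gave them."""
    target = word.lower()
    return sorted(suggestions, key=lambda s: levenshtein(target, s.lower()))


if __name__ == "__main__":
    # Hypothetical candidate list, not actual ispell output.
    candidates = ["wire", "white", "write"]
    for s in rank_suggestions("wjite", candidates):
        print(s, levenshtein("wjite", s))
```

Note that "white" and "write" are both one substitution away from "wjite", so the distance by itself cannot choose between them. It only pushes less similar candidates further down the list.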