NLP - Spell Checker

A spell checker (or spell corrector) is an application, program, or feature of a program that checks a text for misspellings and suggests candidate corrections. [1]

A basic spell checker scans the text, extracts its words, and compares each word against a list of known, correctly spelled words (a dictionary).

1. Spell Checker from Scratch

Create dictionary/vocabulary from a book
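A minimal sketch of this step, assuming the book is plain text and that a word is any run of ASCII letters. The names `tokenize` and `build_vocabulary` and the sample sentence are illustrative and not taken from the original notebook.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(text):
    """Count how often each word occurs in the text."""
    return Counter(tokenize(text))


# A short in-memory sample stands in for the book. In practice the text
# would be read from a file, e.g. open("book.txt", encoding="utf-8").read(),
# where "book.txt" is a hypothetical path.
sample = "The quick brown fox jumps over the lazy dog. The dog sleeps."
vocab = build_vocabulary(sample)

print(len(vocab))            # number of distinct words
print(vocab.most_common(2))  # [('the', 3), ('dog', 2)]
```

The word counts double as a frequency model: the more often a word appears in the book, the more likely it is to be the intended word.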

Create Spell Checker model

The model is based on Peter Norvig’s 21-line spelling corrector, which uses probability theory to choose the most likely correction. [2]
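The following sketch follows the structure of Norvig's corrector: generate every string one or two edits away from the input, keep those that are known words, and return the most frequent one. The tiny `CORPUS` string is an assumption made so the example runs on its own. Norvig's version trains on a large text file instead.

```python
import re
from collections import Counter

# Tiny stand-in corpus (an assumption for this sketch).
CORPUS = "spelling is hard but correct spelling matters and the spelling checker helps"
WORDS = Counter(re.findall(r"[a-z]+", CORPUS.lower()))
TOTAL = sum(WORDS.values())


def probability(word):
    """Relative frequency of the word in the corpus."""
    return WORDS[word] / TOTAL


def edits1(word):
    """All strings that are one edit away from the word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def edits2(word):
    """All strings that are two edits away from the word."""
    return {e2 for e1 in edits1(word) for e2 in edits1(e1)}


def known(words):
    """The subset of words that appear in the vocabulary."""
    return {w for w in words if w in WORDS}


def candidates(word):
    """Prefer the word itself, then one-edit words, then two-edit words."""
    return known([word]) or known(edits1(word)) or known(edits2(word)) or {word}


def correction(word):
    """Most probable spelling correction for the word."""
    return max(candidates(word), key=probability)


print(correction("speling"))  # spelling
print(correction("korrect"))  # correct
```

A word with no known candidate within two edits is returned unchanged, which is why `candidates` falls back to `{word}`.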

Note: If you want to see the spell checker algorithm in VB.NET, click here.

Test Spell Checker model
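One simple way to test a corrector is to run it over pairs of misspelled and expected words and report the fraction it gets right. The sketch below assumes the corrector is any function that maps a word to a word. `toy_correction` and the test pairs are hypothetical stand-ins so the example runs on its own, and the real model's correction function would be passed in its place.

```python
def evaluate(correct_fn, test_pairs):
    """Return the fraction of (misspelled, expected) pairs corrected properly."""
    if not test_pairs:
        return 0.0
    hits = sum(1 for wrong, right in test_pairs if correct_fn(wrong) == right)
    return hits / len(test_pairs)


# Hypothetical stand-in corrector backed by a fixed lookup table.
FIXES = {"speling": "spelling", "korrect": "correct"}


def toy_correction(word):
    return FIXES.get(word, word)


test_pairs = [
    ("speling", "spelling"),
    ("korrect", "correct"),
    ("wrod", "word"),  # not in the lookup table, so this one is missed
    ("the", "the"),    # already correct and must stay unchanged
]

accuracy = evaluate(toy_correction, test_pairs)
print(f"accuracy: {accuracy:.2f}")  # accuracy: 0.75
```

Including words that are already correct in the test set checks that the model does not "fix" valid input.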

2. Spell Checker using PySpellChecker

PySpellChecker is a pure Python spell checker based on Peter Norvig’s blog post on setting up a simple spell checking algorithm. [3]

It uses a Levenshtein distance algorithm to find permutations within an edit distance of 2 from the original word. It then compares all permutations (insertions, deletions, replacements, and transpositions) to known words in a word frequency list. Words that appear more often in the frequency list are more likely to be the correct result. [4]
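To make "an edit distance of 2" concrete, here is a sketch of an edit distance that counts the four operations named above, each at cost 1. It is the optimal string alignment variant of Damerau-Levenshtein distance and is meant only as an illustration of the measure. It is not taken from the library, which generates edited candidates instead of scoring word pairs.

```python
def edit_distance(a, b):
    """Optimal string alignment distance between a and b.

    Insertions, deletions, replacements, and swaps of two adjacent
    characters each cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # replacement or match
            )
            # Adjacent transposition, e.g. "teh" -> "the".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(edit_distance("teh", "the"))              # 1
print(edit_distance("hapenning", "happening"))  # 2
print(edit_distance("kitten", "sitting"))       # 3
```

Under this measure "hapenning" is within reach of "happening" at a maximum distance of 2, while "kitten" and "sitting" are too far apart to be suggested for each other.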

English model

Spanish model

References

[1] Wikipedia - Spell Checker.
[2] Peter Norvig - Spell Correct project.
[3] PyPI - PySpellChecker project site.
[4] PySpellChecker PDF manual.

