How to do fuzzy text matching in Python
While regular expressions come in handy when you want to check whether a certain pattern appears in a text, fuzzy text matching is extremely useful when you want to check whether two strings are similar.
The Levenshtein distance is a well-known metric that counts how many single-character edits (insertions, deletions or substitutions) separate two sequences (here, strings).
e.g. house => mouse : 1 step
In other words, it is well suited to auto-correcting mistakes in user input 😉
Or to returning the closest match to what a user typed (e.g. in a search engine).
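To make the idea of "steps" concrete, here is a minimal pure-Python sketch of the Levenshtein distance using the classic dynamic-programming approach, plus a small helper that picks the closest candidate. The function names levenshtein and closest are made up for this illustration; this is not how thefuzz is implemented, just a way to see what is being counted.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    # previous[j] is the distance between the part of a processed so far
    # (before the current character) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters into an empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def closest(query: str, candidates: list) -> str:
    """Return the candidate with the smallest distance to query (hypothetical helper)."""
    return min(candidates, key=lambda candidate: levenshtein(query, candidate))


print(levenshtein("house", "mouse"))                      # 1
print(closest("pyhton", ["python", "java", "haskell"]))  # python
```

For real workloads a library is the better choice, since this sketch runs in time proportional to the product of the two string lengths and is written for readability rather than speed.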
Thefuzz library
To compute similarity scores based on this kind of edit distance, we can use the thefuzz library.
Installation
pip3 install thefuzz[speedup]
Import
from thefuzz import fuzz
Using the simple ratio
The fuzz.ratio() function returns a score from 0 to 100 indicating how similar the two strings are.
fuzz.ratio("this is a test", "this is a test!")
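To get a feel for where a 0 to 100 score like this can come from, here is a rough pure-Python sketch. It assumes the score is twice the length of the longest common subsequence divided by the combined length of both strings, which is my understanding of the normalized similarity that recent thefuzz releases obtain from rapidfuzz; treat that as an assumption, not as the library's actual code. The names lcs_length and similarity_score are invented for this example.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    previous = [0] * (len(b) + 1)
    for char_a in a:
        current = [0]
        for j, char_b in enumerate(b, start=1):
            if char_a == char_b:
                # Matching characters extend the best subsequence found so far.
                current.append(previous[j - 1] + 1)
            else:
                # Otherwise carry over the better of the two neighbouring results.
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def similarity_score(a: str, b: str) -> float:
    """Score from 0 to 100; 100 means the strings are identical (assumed formula)."""
    total = len(a) + len(b)
    if total == 0:
        return 100.0  # two empty strings are treated as identical
    return 100 * 2 * lcs_length(a, b) / total


print(round(similarity_score("this is a test", "this is a test!"), 1))  # 96.6
```

The two strings share 14 characters in order out of 29 in total, hence 96.6 before rounding to a whole number; the library returns an integer score for the same pair.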
There are other methods besides the simple ratio if you need more; have a look at the documentation on GitHub, which is quite straightforward.
Bravo! You now know how to do fuzzy text matching in Python!