The word “fuzzy” describes something indistinct or vague, something that cannot be pinned down precisely. We all know what “search” means. That should give you a hint of what this blog post is about. Whenever you type something into the Google search engine, it usually returns good results, even when you misspell a word. How does it know what you meant? There are many different ways to misspell a word. How does it work out which word you had in mind?
Say hello to fuzzy search
Fuzzy search refers to searching for things that are approximately similar to what we are looking for. What counts as “approximately similar” depends on the algorithm. In practice, it comes down to working out what the user meant when they typed, let’s say, “ctash”. The user could have meant “clash” or “crash”. If the algorithm is good enough, it will use the context to return the right search results.
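One common way to make “approximately similar” concrete is edit distance: the number of single-character insertions, deletions, or substitutions needed to turn one string into another. The sketch below is a plain Levenshtein distance in Python. It is only an illustration, not a description of what any particular search engine does, and the function name `edit_distance` is our own.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    # previous[j] is the distance between the prefix of a seen so far
    # (before the current character) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning the first i characters of a into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if the characters match
            ))
        previous = current
    return previous[-1]


for candidate in ("clash", "crash", "trash"):
    print(candidate, edit_distance("ctash", candidate))
```

Both “clash” and “crash” come out exactly one edit away from “ctash”, so distance alone cannot choose between them. That tie is where context has to come in.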
What exactly is fuzzy search?
Fuzzy search, in our context here, is the process of locating web pages that are likely to be relevant to the search terms, even when the input does not exactly match the desired information. A fuzzy search is done by means of a fuzzy matching program. A fuzzy matching program returns a list of results ranked by relevance, even when the words and spellings in the query do not exactly match. Exact and highly relevant matches appear near the top of the list. For example, if a user types “Argintena” into a search engine, a list of hits is returned along with the question, “Did you mean Argentina?”. Alternative spellings, and words that sound the same but are spelled differently, are also suggested.
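To get a feel for the “Did you mean” behaviour, here is a minimal sketch using `difflib.get_close_matches` from Python’s standard library. The vocabulary list and the `did_you_mean` helper are made up for this example, and a real search engine works from a far larger index with more sophisticated ranking.

```python
import difflib

# Hypothetical vocabulary, standing in for the terms a search index knows about.
KNOWN_TERMS = ["Argentina", "Brazil", "Chile", "Germany"]


def did_you_mean(query, vocabulary=KNOWN_TERMS):
    """Return the closest known term, or None if the query is already
    known or nothing in the vocabulary is similar enough."""
    if query in vocabulary:
        return None
    # cutoff is a similarity threshold between 0 and 1; 0.6 is difflib's default.
    matches = difflib.get_close_matches(query, vocabulary, n=1, cutoff=0.6)
    return matches[0] if matches else None


suggestion = did_you_mean("Argintena")
if suggestion:
    print(f"Did you mean {suggestion}?")
```

Raising `cutoff` makes the suggestions stricter, while lowering it returns more, but noisier, candidates.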
Is it used only in search engines?
A fuzzy matching program can operate much like a spell checker and spelling-error corrector. Does it remind you of a certain tool you use almost every single day? Many text editors and related analysis tools use fuzzy matching algorithms to handle spell checks. When you are using MS Word, spelling errors are detected and underlined. If you right-click on one, it suggests a few options for the right spelling.
A fuzzy matching program can compensate for common input typing errors, as well as errors introduced by optical character recognition (OCR) scanning of printed documents. The program can return hits with content that contains a specified base word along with prefixes and suffixes. For example, if “hand” is entered as a search word, hits occur for sites containing words such as “handcuffs” or “shorthand.” The program can also find synonyms and related terms, working like an online thesaurus or encyclopedic cross-reference tool. For example, if you enter the word “finance” in the Ask search engine, the search results contain links related to accounting, stocks, loans, etc.
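The base-word behaviour described above, where “hand” also finds “handcuffs” and “shorthand”, can be approximated with a simple case-insensitive substring test. This is only a rough sketch. The helper name `contains_base_word` and the word list are invented for illustration, and the approach does nothing for typos or synonyms.

```python
def contains_base_word(base, words):
    """Return the words that contain `base` anywhere inside them, ignoring case."""
    base = base.lower()
    return [word for word in words if base in word.lower()]


words = ["handcuffs", "shorthand", "Handle", "head", "band"]
print(contains_base_word("hand", words))  # ['handcuffs', 'shorthand', 'Handle']
```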
Does it always work?
Not always. Fuzzy matching programs usually return a mix of relevant and irrelevant hits. Superfluous results are likely to occur for terms with multiple meanings, only one of which is the meaning the user intends. If the user has only a vague or general idea of the topic, or does not know exactly what to look for, the ratio of relevant hits to irrelevant hits tends to be low.
Even so, fuzzy searching is much more powerful than exact searching when used for research and investigation. It is especially useful when researching unfamiliar, foreign-language, or sophisticated terms whose proper spellings are not widely known. Imagine a world where you don’t know the exact spelling of something, and the search engine cannot guess it either! You would not be able to look things up as easily as you do today. Fuzzy searching can also be used to locate individuals based on incomplete or partially inaccurate identifying information.
Now that we understand the concept of fuzzy search, we will discuss fuzzy matching algorithms in the next blog post.