Lemmatization: reducing morphological variation and grouping words under one common root
JIRA CODE – JJ-134
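Lemmatization is the process of grouping together the different inflected forms of a word so they can be analysed as a single item. It is similar to stemming, but it takes the word's context (such as its part of speech) into account and maps each form to a valid dictionary word, the lemma. As a result, it links words with similar meaning to one word; for example, "better" maps to "good".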
Text preprocessing often includes stemming as well as lemmatization. The two terms are frequently confused, and some people treat them as the same. Lemmatization is usually preferred over stemming when accuracy matters, because it performs a morphological analysis of the words, whereas stemming only strips affixes by rule and can produce strings that are not real words.
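To make the difference between the two concrete, here is a minimal sketch in plain Python. It is a toy, not how NLTK works internally: the suffix list, the lemma dictionary and the names crude_stem and toy_lemmatize are all hypothetical, chosen only to show that a stemmer chops endings blindly while a lemmatizer looks the word up together with its part of speech.

```python
# Toy illustration only: real stemmers and lemmatizers are far more thorough.

# Hypothetical suffix list for a crude stemmer.
SUFFIXES = ("ing", "ies", "es", "s", "ed")

def crude_stem(word):
    """Strip the first matching suffix without checking the result is a word."""
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Hypothetical lemma dictionary keyed by (word, part of speech).
LEMMAS = {
    ("studies", "n"): "study",
    ("studies", "v"): "study",
    ("better", "a"): "good",
    ("meeting", "n"): "meeting",
    ("meeting", "v"): "meet",
}

def toy_lemmatize(word, pos):
    """Look up the lemma for a word and part of speech; fall back to the word."""
    return LEMMAS.get((word, pos), word)

samples = [("studies", "v"), ("better", "a"), ("meeting", "n"), ("meeting", "v")]
for word, pos in samples:
    print(word, pos, "stem:", crude_stem(word), "lemma:", toy_lemmatize(word, pos))
```

In this sketch the stemmer turns "studies" into "stud", which is not a word, and cannot relate "better" to "good". The lookup handles both, and it also keeps the noun "meeting" apart from the verb form that reduces to "meet".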
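1: Used in search engines, so that a query matches documents containing any inflected form of the word.
2: Used in compact indexing, since storing one lemma instead of every inflected form keeps the index smaller.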
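The two uses above can be sketched together as a small inverted index keyed by lemma. This is an illustrative sketch under stated assumptions: LEMMA_TABLE is a hypothetical hand-written lookup standing in for a real lemmatizer, and build_index and search are made-up helper names, not part of any library.

```python
from collections import defaultdict

# Hypothetical lemma table; a real system would call a lemmatizer here.
LEMMA_TABLE = {"rocks": "rock", "rocking": "rock", "rocked": "rock", "mice": "mouse"}

def lemma_of(token):
    """Return the lemma for a token, or the token itself if it is unknown."""
    return LEMMA_TABLE.get(token, token)

def build_index(docs):
    """Map each lemma to the set of document ids containing any of its forms."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[lemma_of(token)].add(doc_id)
    return index

def search(index, query):
    """Lemmatize the query the same way as the documents, then look it up."""
    return sorted(index.get(lemma_of(query.lower()), set()))

docs = {
    1: "the band rocks",
    2: "she rocked the boat",
    3: "mice like cheese",
}
index = build_index(docs)
print(search(index, "rocking"))  # matches documents 1 and 2
print(search(index, "mouse"))    # matches document 3
```

Because "rocks" and "rocked" are both stored under the single key "rock", the index holds fewer entries, and a query for "rocking" still finds both documents.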
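import nltk
from nltk.stem import WordNetLemmatizer
nltk.download("wordnet", quiet=True)  # one-time WordNet download (some NLTK versions also need "omw-1.4")
lemmatizer = WordNetLemmatizer()
# With no pos argument, the word is treated as a noun
print("rocks :", lemmatizer.lemmatize("rocks"))
# pos="a" marks "better" as an adjective, so it maps to "good"
print("better :", lemmatizer.lemmatize("better", pos="a"))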
Output:
rocks : rock
better : good