In text analysis, _______ is a common preprocessing step to reduce the dataset to its most basic form.
- Bag of Words
- Lemmatization
- Regularization
- Tokenization
Bag of Words is a common preprocessing step in text analysis that reduces a dataset to its most basic form: each document is represented as an unordered collection of its words, disregarding grammar and word order. The other options are different things entirely: Lemmatization reduces individual words to their dictionary (base) form, Tokenization splits raw text into individual units (tokens), and Regularization is a machine-learning technique for preventing overfitting, not a text-preprocessing step.
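A minimal sketch of the Bag of Words idea in Python; the example sentence and the deliberately naive whitespace-and-punctuation tokenizer are illustrative assumptions, not part of the quiz.

```python
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Represent text as an unordered multiset of word counts.

    Grammar and word order are discarded; only term frequencies remain.
    """
    # Deliberately simple tokenization: lowercase, split on whitespace,
    # strip surrounding punctuation. Real pipelines tokenize more carefully.
    tokens = (word.strip(".,!?;:").lower() for word in text.split())
    return Counter(token for token in tokens if token)

doc = "The cat sat on the mat. The mat was flat."
print(bag_of_words(doc))
# Counter({'the': 4, 'mat': 2, 'cat': 1, 'sat': 1, 'on': 1, 'was': 1, 'flat': 1})
```

Note that "The cat sat on the mat" and "the mat sat on the cat" yield the same counts, which is exactly the word-order blindness the explanation describes.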