Finding Discriminating Terms
This process is called Feature Selection
Eliminate stop words (e.g., “the”, “a”, etc.)
Apply Zipf’s Law (a term’s frequency falls off roughly in inverse proportion to its rank) to remove infrequent terms
Novel methods for feature selection using information theory
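The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method behind the slide: the stop-word list, the `min_df` cutoff used to drop infrequent terms, and the choice of information gain as the information-theoretic score are all hypothetical stand-ins, since the slide names none of them.

```python
import math
from collections import Counter

# Hypothetical stop-word list; a real system would use a much longer one.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "to", "in"}


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return df


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def information_gain(term, docs, labels):
    """Reduction in class entropy from knowing whether `term` is present."""
    classes = sorted(set(labels))

    def class_counts(subset):
        return [sum(1 for label in subset if label == c) for c in classes]

    present = [l for d, l in zip(docs, labels) if term in set(tokenize(d))]
    absent = [l for d, l in zip(docs, labels) if term not in set(tokenize(d))]

    conditional = 0.0
    for subset in (present, absent):
        if subset:  # skip empty splits to avoid dividing by zero
            weight = len(subset) / len(labels)
            conditional += weight * entropy(class_counts(subset))
    return entropy(class_counts(labels)) - conditional


def select_features(docs, labels, min_df=2, top_k=3):
    """Drop stop words and rare terms, then rank the rest by information gain.

    `min_df` and `top_k` are assumed tuning knobs, not values from the slide.
    """
    df = document_frequency(docs)
    candidates = [
        t for t, n in df.items() if t not in STOP_WORDS and n >= min_df
    ]
    # Highest gain first; ties are broken alphabetically for determinism.
    ranked = sorted(
        candidates, key=lambda t: (-information_gain(t, docs, labels), t)
    )
    return ranked[:top_k]
```

Calling `select_features(docs, labels)` on a list of document strings and a parallel list of class labels returns the surviving terms that best separate the classes. A term such as “the” is removed by the stop-word filter, and a term that appears in only one document is removed by the `min_df` cutoff before any scoring happens.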
[Figure: plot with axes labeled “# Words” and “Frequency of appearance”]