Sunday, 28 June 2015

Python/RegEx: Exclude Stop Words

I have this list:

stopwords = ['a', 'and', 'is']

and this sentence:

sentence = 'A Mule is Eating and drinking'

Expected output:

reduced = ['mule', 'eating', 'drinking']

I have so far:

reduced = filter(None, re.split(r'\W+', sentence.lower()))

Now how would you filter out the stopwords?

Edit: Note the conversion from uppercase to lowercase.
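
One possible approach, offered as a minimal sketch rather than the definitive answer: lowercase the sentence, extract the words with re.findall, and keep only those not in the stop word list. The names stopword_set and words are introduced here for illustration and are not part of the question. Converting the list to a set is an optional choice that makes each membership test fast.

```python
import re

stopwords = ['a', 'and', 'is']
sentence = 'A Mule is Eating and drinking'

# Assumption: a set is used only for faster lookups; the list would also work.
stopword_set = set(stopwords)

# Lowercase first, then pull out runs of word characters.
words = re.findall(r'\w+', sentence.lower())

# Keep every word that is not a stop word, preserving the original order.
reduced = [word for word in words if word not in stopword_set]

print(reduced)  # ['mule', 'eating', 'drinking']
```

Because re.findall returns only the matched words, no empty strings appear in the result, so the filter(None, ...) step is not needed with this variant.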
