How to Tokenize Sentences and Words in NLTK


NLTK Tokenizing Sentences: 

You need to import the sent_tokenize function from the nltk library for this program. It splits text into sentences using punctuation. (If you haven't used NLTK's tokenizers before, you may first need to download the Punkt tokenizer data with nltk.download('punkt').)

from nltk.tokenize import sent_tokenize

text = "I love python. I love nlp"

print(sent_tokenize(text))

 

Output:

['I love python.', 'I love nlp']
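To see roughly what sent_tokenize is doing, here is a minimal sketch using only the standard library. Note this is an illustration, not NLTK's actual algorithm: the real tokenizer (Punkt) is statistically trained and correctly handles abbreviations like "Dr.", which this naive regex would split on.

```python
import re

def naive_sent_tokenize(text):
    # Split after a ., !, or ? that is followed by whitespace;
    # the lookbehind keeps the punctuation attached to its sentence.
    return [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]

print(naive_sent_tokenize("I love python. I love nlp"))
# ['I love python.', 'I love nlp']
```

For this simple input the result matches sent_tokenize, but for real-world text you should prefer NLTK's trained tokenizer.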

 

NLTK Tokenizing Words: 

Just like sentence tokenization, you use the word_tokenize function to split text into words.

from nltk.tokenize import word_tokenize

text = "I love python. I love nlp"

print(word_tokenize(text))

 

Output:

['I', 'love', 'python', '.', 'I', 'love', 'nlp']
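Notice that word_tokenize treats punctuation as its own token ('python' and '.' come out separately). A rough sketch of that behavior using only the standard library might look like this; the real word_tokenize does more, such as splitting contractions ("don't" becomes "do" and "n't"), which this regex does not attempt.

```python
import re

def naive_word_tokenize(text):
    # Match either a run of word characters or a single
    # non-word, non-space character (i.e. a punctuation mark),
    # so "python." yields ['python', '.'].
    return re.findall(r"\w+|[^\w\s]", text)

print(naive_word_tokenize("I love python. I love nlp"))
# ['I', 'love', 'python', '.', 'I', 'love', 'nlp']
```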

 
