How to Tokenize Sentences and Words in NLTK
NLTK Tokenizing Sentences:
You need to import the sent_tokenize function from the NLTK library for this program. It splits text into sentences based on punctuation. Note that sent_tokenize relies on NLTK's pre-trained Punkt model, which you can download once with nltk.download('punkt').
from nltk.tokenize import sent_tokenize

text = "I love python. I love nlp"
print(sent_tokenize(text))
Output:
['I love python.', 'I love nlp']
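To get a feel for what punctuation-based splitting does, here is a minimal sketch using only the standard library. This is not how NLTK actually works internally (its Punkt tokenizer is a trained model that also handles abbreviations like "Dr."), just a simplified illustration of the idea; the function name naive_sent_tokenize is made up for this example.

```python
import re

def naive_sent_tokenize(text):
    # Split after a ., !, or ? that is followed by whitespace,
    # keeping the punctuation attached to the preceding sentence.
    return re.split(r'(?<=[.!?])\s+', text.strip())

print(naive_sent_tokenize("I love python. I love nlp"))
# → ['I love python.', 'I love nlp']
```

Because this sketch splits on every period, it would wrongly break "Dr. Smith arrived." into two pieces, which is exactly the kind of case NLTK's trained tokenizer handles better.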
NLTK Tokenizing Words:
Similar to sentence tokenization, you use the word_tokenize function to split text into individual words.
from nltk.tokenize import word_tokenize

text = "I love python. I love nlp"
print(word_tokenize(text))
Output:
['I', 'love', 'python', '.', 'I', 'love', 'nlp']
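Notice that the period comes out as its own token. A rough approximation of this behavior can be sketched with a single regular expression that treats runs of word characters and individual punctuation marks as separate tokens. Again, this is only an illustration, not NLTK's actual implementation (word_tokenize uses the more sophisticated Treebank tokenizer, which also splits contractions like "don't" into "do" and "n't"); naive_word_tokenize is a hypothetical name.

```python
import re

def naive_word_tokenize(text):
    # Match either a run of word characters or a single
    # non-word, non-space character (punctuation).
    return re.findall(r"\w+|[^\w\s]", text)

print(naive_word_tokenize("I love python. I love nlp"))
# → ['I', 'love', 'python', '.', 'I', 'love', 'nlp']
```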