SlideShare a Scribd company logo
1 of 28
NLTK
Alberts Pumpurs
90% of world's data generated over last two years
common
Internet
user
creates
Visual Textual
Instagram
Flickr
Vscocam
Facebook
Tumblr
Blogger
Twitter
Facebook
Emails
Costumer Reviews
Detecting hidden signals
World is full of unstructured, text-rich
data. Everything from emails to
customer tweets.
The information buried in all that
text holds the potential to deliver
valuable business insights
Text analytics is the practice of using
technology to gather, store and mine
textual information for hidden signals
that can be used to inform smarter
business decisions
An explosion of
unstructured
data
Many types of organizations are
experiencing explosive growth in their
unstructured enterprise data.
Same time that they have access to
external sources of data such as social
media, blogs, and mobile data.
Until now, much of this information
passed through the organization virtually
unanalyzed. Today, new tools for
handling large amounts of complex data
makes it easier to squeeze value from
such unlikely sources.
Text Processing
use cases
sentiment analysis
spam filtering
text categorization
topic detection
keyword frequency
plagiatism detection
document similarity
phrase extraction
Natural Language
Tool Kit
leading platform for building
Python programs to work with
human language data
NLTK Features
sentence and word tokenization
text calsification
corpora
parsing
clustring
part of speach tagging
text stemming
and mutch more..
Sentence
tokenization
Word
tokenization
Part of speech
tagging
Part of speech
tagging explanation
CC Coordinating conjunctin
CD Cardinal Number
DT Determiner
EX Existing “ there“
FW Foreign word
IN Preposition or subordination conjuction
JJ Adjective
JJR Adjective- comparative
JJS Adjective- superlative
LS List item marker
MD Modal
NN Noun- singular or mass
NNS Non-Plural
NP Proper noun- singular
nltk.help.upenn_tagset() //all tag sets
Chunking and NER
Text
clasification
Algorithms in
NLTK
Naive Bayes
Maximum Entropy
Decision Tree
Text clasification
Sentiment analysis
https://github.com/pumpurs/SentimentWordsLV/
Document similarity
detection
Tf-idf stands for term frequency-inverse document frequency, and the tf-idf weight is
a weight often used in information retrieval and text mining. This weight is a statistical
measure used to evaluate how important a word is to a document in a collection or
corpus.
Similarity and
concordance
Dispersion Plot
“Market and product reserch”
“Social CMS”
1.97 b social network users
“Costumer profiling / analytics”
70% of marketers used Facebook to gain
6.7 million people blog on blogging sites
pumpurs.alberts@gmail.com
Big Data, Startups, Text Analysis, Internet of Things, Web Development

More Related Content

What's hot

The Role of Natural Language Processing in Information Retrieval
The Role of Natural Language Processing in Information RetrievalThe Role of Natural Language Processing in Information Retrieval
The Role of Natural Language Processing in Information RetrievalTony Russell-Rose
 
Natural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - IntroductionNatural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - IntroductionAritra Mukherjee
 
Natural Language Processing and Machine Learning
Natural Language Processing and Machine LearningNatural Language Processing and Machine Learning
Natural Language Processing and Machine LearningKarthik Sankar
 
Big Data and Natural Language Processing
Big Data and Natural Language ProcessingBig Data and Natural Language Processing
Big Data and Natural Language ProcessingMichel Bruley
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introductionRobert Lujo
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingCloudxLab
 
Nltk - Boston Text Analytics
Nltk - Boston Text AnalyticsNltk - Boston Text Analytics
Nltk - Boston Text Analyticsshanbady
 
September 2021: Top10 Cited Articles in Natural Language Computing
September 2021: Top10 Cited Articles in Natural Language ComputingSeptember 2021: Top10 Cited Articles in Natural Language Computing
September 2021: Top10 Cited Articles in Natural Language Computingkevig
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)Yuriy Guts
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)VenkateshMurugadas
 
UCU NLP Summer Workshops 2017 - Part 2
UCU NLP Summer Workshops 2017 - Part 2UCU NLP Summer Workshops 2017 - Part 2
UCU NLP Summer Workshops 2017 - Part 2Yuriy Guts
 
Natural language processing using python
Natural language processing using pythonNatural language processing using python
Natural language processing using pythonPrakash Anand
 
Natural Language Processing for Games Research
Natural Language Processing for Games ResearchNatural Language Processing for Games Research
Natural Language Processing for Games ResearchJose Zagal
 
NLTK: Natural Language Processing made easy
NLTK: Natural Language Processing made easyNLTK: Natural Language Processing made easy
NLTK: Natural Language Processing made easyoutsider2
 
Introduction to natural language processing
Introduction to natural language processingIntroduction to natural language processing
Introduction to natural language processingMinh Pham
 
Natural Language Processing: L02 words
Natural Language Processing: L02 wordsNatural Language Processing: L02 words
Natural Language Processing: L02 wordsananth
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingYasir Khan
 

What's hot (20)

The Role of Natural Language Processing in Information Retrieval
The Role of Natural Language Processing in Information RetrievalThe Role of Natural Language Processing in Information Retrieval
The Role of Natural Language Processing in Information Retrieval
 
Natural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - IntroductionNatural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - Introduction
 
NLP
NLPNLP
NLP
 
Natural Language Processing and Machine Learning
Natural Language Processing and Machine LearningNatural Language Processing and Machine Learning
Natural Language Processing and Machine Learning
 
Big Data and Natural Language Processing
Big Data and Natural Language ProcessingBig Data and Natural Language Processing
Big Data and Natural Language Processing
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introduction
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Introduction to NLTK
Introduction to NLTKIntroduction to NLTK
Introduction to NLTK
 
Nltk - Boston Text Analytics
Nltk - Boston Text AnalyticsNltk - Boston Text Analytics
Nltk - Boston Text Analytics
 
September 2021: Top10 Cited Articles in Natural Language Computing
September 2021: Top10 Cited Articles in Natural Language ComputingSeptember 2021: Top10 Cited Articles in Natural Language Computing
September 2021: Top10 Cited Articles in Natural Language Computing
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)
 
UCU NLP Summer Workshops 2017 - Part 2
UCU NLP Summer Workshops 2017 - Part 2UCU NLP Summer Workshops 2017 - Part 2
UCU NLP Summer Workshops 2017 - Part 2
 
OpenNLP demo
OpenNLP demoOpenNLP demo
OpenNLP demo
 
Natural language processing using python
Natural language processing using pythonNatural language processing using python
Natural language processing using python
 
Natural Language Processing for Games Research
Natural Language Processing for Games ResearchNatural Language Processing for Games Research
Natural Language Processing for Games Research
 
NLTK: Natural Language Processing made easy
NLTK: Natural Language Processing made easyNLTK: Natural Language Processing made easy
NLTK: Natural Language Processing made easy
 
Introduction to natural language processing
Introduction to natural language processingIntroduction to natural language processing
Introduction to natural language processing
 
Natural Language Processing: L02 words
Natural Language Processing: L02 wordsNatural Language Processing: L02 words
Natural Language Processing: L02 words
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 

Similar to Python NLTK

Text mining introduction-1
Text mining   introduction-1Text mining   introduction-1
Text mining introduction-1Sumit Sony
 
Tovek Presentation by Livio Costantini
Tovek Presentation by Livio CostantiniTovek Presentation by Livio Costantini
Tovek Presentation by Livio Costantinimaxfalc
 
Predictive Text Analytics
Predictive Text AnalyticsPredictive Text Analytics
Predictive Text AnalyticsSeth Grimes
 
Text Analytics Overview, 2011
Text Analytics Overview, 2011Text Analytics Overview, 2011
Text Analytics Overview, 2011Seth Grimes
 
Web classification of Digital Libraries using GATE Machine Learning  
Web classification of Digital Libraries using GATE Machine Learning  	Web classification of Digital Libraries using GATE Machine Learning  
Web classification of Digital Libraries using GATE Machine Learning   sstose
 
Data Science - Part XI - Text Analytics
Data Science - Part XI - Text AnalyticsData Science - Part XI - Text Analytics
Data Science - Part XI - Text AnalyticsDerek Kane
 
Mining Opinion Features in Customer Reviews
Mining Opinion Features in Customer ReviewsMining Opinion Features in Customer Reviews
Mining Opinion Features in Customer ReviewsIJCERT JOURNAL
 
Natural Language Processing_in semantic web.pptx
Natural Language Processing_in semantic web.pptxNatural Language Processing_in semantic web.pptx
Natural Language Processing_in semantic web.pptxAlyaaMachi
 
Copy of 10text (2)
Copy of 10text (2)Copy of 10text (2)
Copy of 10text (2)Uma Se
 
Chapter 10 Data Mining Techniques
 Chapter 10 Data Mining Techniques Chapter 10 Data Mining Techniques
Chapter 10 Data Mining TechniquesHouw Liong The
 
Fast and accurate sentiment classification us and naive bayes model b516001
Fast and accurate sentiment classification  us and naive bayes model b516001Fast and accurate sentiment classification  us and naive bayes model b516001
Fast and accurate sentiment classification us and naive bayes model b516001Abhisek Sahoo
 
leewayhertz.com-Named Entity Recognition NER Unveiling the value in unstructu...
leewayhertz.com-Named Entity Recognition NER Unveiling the value in unstructu...leewayhertz.com-Named Entity Recognition NER Unveiling the value in unstructu...
leewayhertz.com-Named Entity Recognition NER Unveiling the value in unstructu...KristiLBurns
 
Content Analyst - Conceptualizing LSI Based Text Analytics White Paper
Content Analyst - Conceptualizing LSI Based Text Analytics White PaperContent Analyst - Conceptualizing LSI Based Text Analytics White Paper
Content Analyst - Conceptualizing LSI Based Text Analytics White PaperJohn Felahi
 
RDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-rRDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-rYanchang Zhao
 
Phrase Based Indexing and Information Retrivel
Phrase Based Indexing and Information RetrivelPhrase Based Indexing and Information Retrivel
Phrase Based Indexing and Information Retrivelbalaabirami
 
Phrase Based Indexing
Phrase Based IndexingPhrase Based Indexing
Phrase Based Indexingbalaabirami
 
ShortStory_bioCaster.pptx
ShortStory_bioCaster.pptxShortStory_bioCaster.pptx
ShortStory_bioCaster.pptxDeviPriyaRavi2
 

Similar to Python NLTK (20)

Text mining introduction-1
Text mining   introduction-1Text mining   introduction-1
Text mining introduction-1
 
Tovek Presentation by Livio Costantini
Tovek Presentation by Livio CostantiniTovek Presentation by Livio Costantini
Tovek Presentation by Livio Costantini
 
Top 10 Must-Know NLP Techniques for Data Scientists
Top 10 Must-Know NLP Techniques for Data ScientistsTop 10 Must-Know NLP Techniques for Data Scientists
Top 10 Must-Know NLP Techniques for Data Scientists
 
Predictive Text Analytics
Predictive Text AnalyticsPredictive Text Analytics
Predictive Text Analytics
 
Text Analytics Overview, 2011
Text Analytics Overview, 2011Text Analytics Overview, 2011
Text Analytics Overview, 2011
 
Web classification of Digital Libraries using GATE Machine Learning  
Web classification of Digital Libraries using GATE Machine Learning  	Web classification of Digital Libraries using GATE Machine Learning  
Web classification of Digital Libraries using GATE Machine Learning  
 
Data Science - Part XI - Text Analytics
Data Science - Part XI - Text AnalyticsData Science - Part XI - Text Analytics
Data Science - Part XI - Text Analytics
 
Mining Opinion Features in Customer Reviews
Mining Opinion Features in Customer ReviewsMining Opinion Features in Customer Reviews
Mining Opinion Features in Customer Reviews
 
Natural Language Processing_in semantic web.pptx
Natural Language Processing_in semantic web.pptxNatural Language Processing_in semantic web.pptx
Natural Language Processing_in semantic web.pptx
 
Copy of 10text (2)
Copy of 10text (2)Copy of 10text (2)
Copy of 10text (2)
 
Web and text
Web and textWeb and text
Web and text
 
Chapter 10 Data Mining Techniques
 Chapter 10 Data Mining Techniques Chapter 10 Data Mining Techniques
Chapter 10 Data Mining Techniques
 
Fast and accurate sentiment classification us and naive bayes model b516001
Fast and accurate sentiment classification  us and naive bayes model b516001Fast and accurate sentiment classification  us and naive bayes model b516001
Fast and accurate sentiment classification us and naive bayes model b516001
 
leewayhertz.com-Named Entity Recognition NER Unveiling the value in unstructu...
leewayhertz.com-Named Entity Recognition NER Unveiling the value in unstructu...leewayhertz.com-Named Entity Recognition NER Unveiling the value in unstructu...
leewayhertz.com-Named Entity Recognition NER Unveiling the value in unstructu...
 
Week12
Week12Week12
Week12
 
Content Analyst - Conceptualizing LSI Based Text Analytics White Paper
Content Analyst - Conceptualizing LSI Based Text Analytics White PaperContent Analyst - Conceptualizing LSI Based Text Analytics White Paper
Content Analyst - Conceptualizing LSI Based Text Analytics White Paper
 
RDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-rRDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-r
 
Phrase Based Indexing and Information Retrivel
Phrase Based Indexing and Information RetrivelPhrase Based Indexing and Information Retrivel
Phrase Based Indexing and Information Retrivel
 
Phrase Based Indexing
Phrase Based IndexingPhrase Based Indexing
Phrase Based Indexing
 
ShortStory_bioCaster.pptx
ShortStory_bioCaster.pptxShortStory_bioCaster.pptx
ShortStory_bioCaster.pptx
 

Python NLTK