Beyond Sentiment
Mining Social Media
Tom Reamy
Chief Knowledge Architect
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com
Agenda
 Introduction
– Text Analytics & Sentiment Analysis
 Expertise Analysis
– Basic Level Categories
– Categorization of Expertise
 Social Behavior Predictions
– Distinguishing Action from Expression
 Social Media – Wisdom of Crowds
– Cloud Sourcing technical support
 Questions
KAPS Group: General
 Knowledge Architecture Professional Services
 Virtual Company: Network of consultants – 8-10
 Partners – SAS, Smart Logic, Microsoft-FAST, Concept Searching, etc.
 Consulting, Strategy, Knowledge architecture audit
 Services:
– Text Analytics evaluation, development, consulting, customization
– Knowledge Representation – taxonomy, ontology, Prototype
– Metadata standards and implementation
– Knowledge Management: Collaboration, Expertise, e-learning
– Applied Theory – Faceted taxonomies, complexity theory, natural categories
Introduction to Text Analytics
Text Analytics Features
 Text Extraction (Noun phrase, themes, parts of speech)
– Catalogs with variants, rule based dynamic
– Multiple types, custom classes – entities, concepts, events
 Fact Extraction
– Relationships of entities – people-organizations-activities
– Ontologies – triples, RDF, etc. // Disambiguation
 Auto-categorization – Build on a Taxonomy
– Training sets – Bayesian, Vector space
– Boolean – Full search syntax – AND, OR, NOT, DIST#, SENT
 This is the most difficult to develop
 Foundation for all applications
Case Study – Categorization & Sentiment
Text Analytics and Text Mining
Data and Unstructured Content
 80% of content is unstructured – adding to semantic web is major
 Text Analytics – content into data
– Big Data meets Big Content
 Real integration of text and ontology
– Beyond “hasDescription”
– Improve accuracy of extracted entities, facts – disambiguation
• Pipeline – oil & gas OR research / Ford
– Add Concepts, not just “Things” – 68% want this
 Semantic Web + Text Analytics = real world value
 Linked Data + Text Analytics – best of both worlds
 Build superior foundation elements – taxonomies, categorization
Sentiment Analysis
Development Process
 Combination of Statistical and categorization rules
 Start with Training sets – examples of positive, negative, neutral documents (find good examples – forums, etc.)
 Develop a Statistical Model
 Generate domain positive and negative words and phrases
 Develop a taxonomy of Products & Features
 Develop rules for positive and negative statements
 Test and Refine
 Test and Refine again
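The rule-based part of this process can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word lists, the feature taxonomy, and the function names are all hypothetical, and the statistical model step is left out entirely. It shows how domain sentiment words, a small product-feature taxonomy, and a simple negation rule might combine to give sentiment per feature rather than per document.

```python
import re

# Hypothetical domain lexicon; in practice it would be generated from training sets.
POSITIVE = {"great", "love", "reliable", "fast"}
NEGATIVE = {"slow", "broken", "dropped", "terrible"}
NEGATORS = {"not", "never", "no"}

# Hypothetical taxonomy of product features and the terms that signal them.
FEATURES = {
    "battery": {"battery", "charge"},
    "coverage": {"signal", "coverage", "reception"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_sentence(tokens):
    """Sum +1/-1 for sentiment words, flipping the sign after a negator."""
    score = 0
    for i, tok in enumerate(tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def feature_sentiment(text):
    """Attach each sentence's score to every feature mentioned in that sentence."""
    results = {}
    for sentence in re.split(r"[.!?]+", text):
        tokens = tokenize(sentence)
        score = score_sentence(tokens)
        for feature, terms in FEATURES.items():
            if terms & set(tokens):
                results[feature] = results.get(feature, 0) + score
    return results


print(feature_sentiment("The battery is great. The signal is not reliable!"))
```

The "Test and Refine" steps would then amount to running such rules over held-out examples and adjusting the lexicon and rules where they misfire.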
Expertise Analysis
Basic Level Categories
 Levels: Superordinate – Basic – Subordinate
– Mammal – Dog – Golden Retriever
– Furniture – chair – kitchen chair
 Mid-level in a taxonomy / hierarchy
 Short and easy words, similarly perceived shapes
 Maximum distinctness and expressiveness
 Most commonly used labels
 First level named and understood by children
 Level at which most of our knowledge is organized
Basic Level Categories and Expertise
 Experts prefer lower, subordinate levels
– Novices prefer higher, superordinate levels
– General Populace prefers basic level
 Expertise Characterization for individuals, communities, documents, and sets of documents
 Experts chunk series of actions, ideas, etc.
– Novice – high level only
– Intermediate – steps in the series
– Expert – special language – based on deep connections
 Types of expert – technical, strategic
Expertise Analysis
Analytical Techniques
 Corpus context dependent
– Author748 – is general in scientific health care context, advanced in news health care context
 Need to generate overall expertise level for a corpus
 Also contextual rules
– “Tests” is general, high level
– “Predictive value of tests” is lower, more expert
 Develop expertise rules – similar to categorization rules
– Use basic level for subject
– Superordinate for general, subordinate for expert
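A minimal Python sketch of what such expertise rules might look like follows. The phrase lists, function names, and scoring formula are assumptions made for illustration; only the two quoted phrases come from the slide. It shows a contextual rule (the longer expert phrase is matched before the general term it contains) and a corpus-relative score, which is one possible way to make the result depend on the corpus context.

```python
import re

# Hypothetical phrase lists. "tests" and "predictive value of tests" come from
# the slide; the other phrases are invented for illustration.
GENERAL_TERMS = ["tests", "treatment"]
EXPERT_TERMS = ["predictive value of tests", "false positive rate"]


def count_phrases(text, phrases):
    """Count whole-phrase matches; return (count, text with matches blanked out)."""
    total = 0
    for phrase in sorted(phrases, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text, n = re.subn(pattern, " ", text)
        total += n
    return total, text


def expertise_score(text):
    """Score in [-1, 1]: negative leans general, positive leans expert."""
    text = text.lower()
    # Contextual rule: match expert phrases first so that "tests" inside
    # "predictive value of tests" is not also counted as a general term.
    expert, remainder = count_phrases(text, EXPERT_TERMS)
    general, _ = count_phrases(remainder, GENERAL_TERMS)
    if expert + general == 0:
        return 0.0
    return (expert - general) / (expert + general)


def relative_expertise(doc, corpus):
    """Compare one document with the average score of a non-empty corpus."""
    baseline = sum(expertise_score(d) for d in corpus) / len(corpus)
    return expertise_score(doc) - baseline
```

Under these assumptions the same document scores lower against a corpus of specialist papers than against a corpus of news articles, which mirrors the Author748 example.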
Expertise Analysis
Application areas
 Business & Customer intelligence / Social Media
– Combine with sentiment analysis – finer evaluation – what are experts saying, what are novices saying
– Deeper research into communities, customers
 Enterprise Content Management
– At publish time, software automatically gives an expertise level – present to author for validation
 Expertise location
– Generate automatic expertise characterization based on authored documents
Beyond Sentiment
Behavior Prediction – Case Study
 Telecommunications Customer Service
 Problem – distinguish customers likely to cancel from mere
threats
 Analyze customer support notes
 General issues – creative spelling, second hand reports
 Develop categorization rules
– First – distinguish cancellation calls – not simple
– Second – distinguish what is being cancelled – one line or all
– Third – distinguish real threats
Beyond Sentiment
Behavior Prediction – Case Study
 Basic Rule
– (START_20, (AND,
    (DIST_7, "[cancel]", "[cancel-what-cust]"),
    (NOT, (DIST_10, "[cancel]", (OR, "[one-line]", "[restore]", "[if]")))))
 Examples:
– customer called to say he will cancell his account if the does not stop receiving a call from the ad agency.
– cci and is upset that he has the asl charge and wants it off or her is going to cancel his act
– ask about the contract expiration date as she wanted to cxl teh acct
 Combine sophisticated rules with sentiment statistical training
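To make the logic of the basic rule concrete, here is a rough Python approximation. It assumes that START_20 restricts matching to the first 20 tokens and that DIST_n means "within n tokens of each other"; the actual operator semantics of the categorization engine may differ. The vocabularies standing in for the bracketed concept lists are hypothetical, as are all function names.

```python
import re

# Hypothetical vocabularies standing in for the bracketed concept lists.
CANCEL = {"cancel", "cancell", "cxl", "cancellation"}   # [cancel]
CANCEL_WHAT = {"account", "acct", "act", "service"}     # [cancel-what-cust]
EXCLUDE = {"line", "restore", "if"}                     # [one-line], [restore], [if]


def tokenize(text):
    """Lowercase the note and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def positions(tokens, vocab):
    """Indexes of all tokens that belong to the vocabulary."""
    return [i for i, tok in enumerate(tokens) if tok in vocab]


def within(tokens, vocab_a, vocab_b, max_dist):
    """True if a term from vocab_a is at most max_dist tokens from one in vocab_b."""
    return any(
        abs(i - j) <= max_dist
        for i in positions(tokens, vocab_a)
        for j in positions(tokens, vocab_b)
    )


def likely_cancellation(note, window=20):
    """Approximate the rule: START_20 (AND (DIST_7 ...) (NOT (DIST_10 ...)))."""
    tokens = tokenize(note)[:window]                      # START_20 (assumed meaning)
    return (
        within(tokens, CANCEL, CANCEL_WHAT, 7)            # DIST_7
        and not within(tokens, CANCEL, EXCLUDE, 10)       # NOT DIST_10
    )


print(likely_cancellation("ask about the contract expiration date as she wanted to cxl teh acct"))
print(likely_cancellation("customer called to say he will cancell his account if the does not stop"))
```

In this sketch the first note is flagged, while the second is not because the conditional "if" sits close to the cancel term. The variant spellings in the vocabulary are one simple way to cope with the creative spelling found in support notes.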
Beyond Sentiment - Wisdom of Crowds
Cloud / Crowd Sourcing Technical Support
 Example – Android User Forum
 Develop a taxonomy of products, features, problem areas
 Develop Categorization Rules:
– Find product & feature – forum structure
– Find problem areas in response
– Nearby Text for solution
 Automatic – simply expose lists of “solutions”
– Search Based application
 Human mediated – experts scan and clean up solutions
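The categorization steps on this slide could be approximated as in the Python sketch below. Everything in it is an assumption for illustration: the problem-area terms, the cue words that hint at a suggested fix, and the choice of "nearby text" as the matching sentence plus one neighbour on each side. It skips the product and feature step, which the slide takes from the forum structure.

```python
import re

# Hypothetical taxonomy fragment: problem area -> trigger terms.
PROBLEM_AREAS = {
    "screenshot": {"screenshot", "screen shot"},
    "install": {"install", "apk"},
}
# Hypothetical cue words that often introduce a suggested fix.
SOLUTION_CUES = {"need", "download", "use", "try", "install"}


def split_sentences(text):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_solutions(post, window=1):
    """Return {problem_area: [candidate solution sentences]} for one post."""
    sentences = split_sentences(post)
    found = {}
    for idx, sentence in enumerate(sentences):
        lowered = sentence.lower()
        for area, terms in PROBLEM_AREAS.items():
            if not any(term in lowered for term in terms):
                continue
            # "Nearby text": the matching sentence plus `window` neighbours.
            lo = max(0, idx - window)
            hi = min(len(sentences), idx + window + 1)
            for candidate in sentences[lo:hi]:
                words = set(re.findall(r"[a-z]+", candidate.lower()))
                if words & SOLUTION_CUES and candidate not in found.get(area, []):
                    found.setdefault(area, []).append(candidate)
    return found


post = ("How do I take a screenshot? "
        "You need to download the android sdk and use that. Good luck.")
print(extract_solutions(post))
```

The output of such a pass is the raw list of "solutions" that could be exposed automatically or handed to experts for clean-up.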
Beyond Sentiment - Wisdom of Crowds
Cloud / Crowd Sourcing Technical Support
 Quote:
 Originally Posted by jersey221
 you either need to be rooted and download a screenshot app from the
market like picme,shootme.or download the android sdk and use that..im
not quite sure about the sdk method.
 I use the SDK method and it isn't to bad a all. I'll get some pics up later, I
am still trying to get the time to update from fresh 1.0 to 1.1.
 Device(s): Fresh 2.1.1
 Thanks: 36
 Thanked 37 Times in 26 Posts
Beyond Sentiment - Wisdom of Crowds
Cloud / Crowd Sourcing Technical Support
 Quote: Originally Posted by jersey221
 its not on the marketplace its called taps of fire
 here's a download for it when you download it put it on your sd card then
look for it on a file manager like es file explorer
 or astro on you phone then click it and open in manager or something
like that and then install it and you should be good.
 TapsOfFire104.apk - tapsoffire - Taps Of Fire (1.0.4) - Project Hosting on
Google Code
 i am guessing my phone needs to be rooted for something like this to
happen.
 Device(s): rooted htc hero with fresh 1.1 rom
 Thanks: 21 - Thanked 3 Times in 3 Posts
Beyond Sentiment
Conclusions
 Text Analytics turns text into data – semantic web,
predictive analytics
 Sentiment Analysis needs good categorization
 Expertise Analysis can add a new dimension to sentiment
– More sophisticated Voice of the Customer
 Multiple Applications from Expertise analysis – search, BI,
CI, Enterprise Content Management, Expertise Location
 New Directions – Behavior Prediction, Crowd Sourcing, ?
 Text Analytics needs Cognitive Science
– Not just library science or data modeling or ontology
Questions?
Tom Reamy
[email protected]
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com
Resources
 Books
– Women, Fire, and Dangerous Things
• George Lakoff
– Knowledge, Concepts, and Categories
• Koen Lamberts and David Shanks
– Formal Approaches in Categorization
• Ed. Emmanuel Pothos and Andy Wills
– The Mind
• Ed. John Brockman
• Good introduction to a variety of cognitive science theories, issues, and new ideas
– Any cognitive science book written after 2009
Resources
 Conferences – Web Sites
– Text Analytics World
– http://www.textanalyticsworld.com
– Text Analytics Summit
– http://www.textanalyticsnews.com
– Semtech
– http://www.semanticweb.com
Resources
 Blogs
– SAS – http://blogs.sas.com/text-mining/
 Web Sites
– Taxonomy Community of Practice: http://finance.groups.yahoo.com/group/TaxoCoP/
– LinkedIn – Text Analytics Summit Group
– http://www.LinkedIn.com
– Whitepaper – CM and Text Analytics – http://www.textanalyticsnews.com/usa/contentmanagementmeetstextanalytics.pdf
– Whitepaper – Enterprise Content Categorization strategy and development – http://www.kapsgroup.com
Resources
 Articles
– Malt, B. C. 1995. Category coherence in cross-cultural perspective. Cognitive Psychology 29, 85-148
– Rifkin, A. 1985. Evidence for a basic level in event taxonomies. Memory & Cognition 13, 538-56
– Shaver, P., J. Schwartz, D. Kirson, C. O’Connor 1987. Emotion knowledge: further exploration of a prototype approach. Journal of Personality and Social Psychology 52, 1061-1086
– Tanaka, J. W. & M. E. Taylor 1991. Object categories and expertise: is the basic level in the eye of the beholder? Cognitive Psychology 23, 457-82