The document discusses a system called System U that uses computational methods to discover personality traits, values, needs, and emotional styles of individuals from their social media posts. It aims to provide personalized experiences at scale. The system analyzes text using psycholinguistic analytics and models to predict traits according to frameworks like the Big 5 personality traits. It was validated through studies showing its predictions correlated well with standard personality surveys for most people. Further field studies on Twitter showed traits could help predict who would respond to recommendations or help others.
Computational Discovery of Personality from Social Media
1. System U: Computational Discovery of Personality Traits from Social Media for Individualized Experience
Michelle Zhou
IBM Research, Almaden
mzhou@us.ibm.com
2. Outline
• Motivation
• System U Overview and Live Demo
• Methodology
• Validations
• Summary
3. “The perfect solution is to serve each consumer individually. The problem? There are 7 billion of them.”
Consumer products CMO, Singapore
IBM 2011 CMO Study
4. Individualization at Scale
Model personality traits distinguishing individuals [Ford ’05, O’Brien ’96, Neuman ’99, Gosling ’03, Wholan ’06]
Derive personality traits for hundreds of millions of individuals
5. Challenges
Lengthy standard psychometric tests
Reliability and freshness of test results
“Welcome to our store, would you like to take a personality test?”
6. A Silver Lining
Psycholinguistic studies: personality from text [Tausczik and Pennebaker ’10, Yarkoni ’10]
Hundreds of millions of people leave text footprints on social media
“I love food, .., with … together we … in… very…happy.”
Word category: Inclusive
Agreeableness
7. System U in a Nutshell
[Diagram labels: Social Media; Psycholinguistic Analytics; Personality Portrait (Big 5, Values, Needs, Emotion Style, Attitude); InkWell; VisWell; Engagement; Recommendation]
12. Discovering Big 5 Personality Traits
• Psychological characteristics reflecting individual differences
• Consistent and enduring
• Can change
• Link to many aspects of one’s life
– Problem/emotion coping
– Relationship selection
– Occupational proficiency
– Team performance
– . . .
outgoing/energetic vs. solitary/reserved
efficient/organized vs. easy-going/careless
[O’Brien ’96, Neuman ’99, Gosling ’03, Wholan ’06]
13. Discovering Fundamental Needs [Ford, 2005]
• Fundamental needs are universal [Aaker 1995, Maslow 1943]
• Often change with life events
• Link to many aspects of one’s life
– Brand/product choices
– Occupational choices
– . . .
18. Online Prediction of Personality Traits from Text
[Diagram labels: Social Media Posts; Predictive Models; Personality Traits (Big 5, Values, Needs, Emotional Style, Attitude, …)]
Example: “… great to have a chauffeur who can help us accomplish our goals …”
Words: Chauffeur, Accomplish, Goal, Special, License, …
Ideal: 0.37, 0.94, 0.23, 0.35, 0.13, …
Counts in the example: 1, 1, 1, 0, 0, …
19. Online Prediction of Personality Traits from Text
Additional processing
– Normalize counts with total words
– Linear combination of counts with learned coefficients to compute trait scores
– Normalize trait scores to give percentile scores
Example: “… great to have a chauffeur who can help us accomplish our goals …”
Words: Chauffeur, Accomplish, Goal, Special, License, …
Ideal: 0.37, 0.94, 0.23, 0.35, 0.13, …
Counts in the example: 1, 1, 1, 0, 0, …
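The three processing steps can be sketched in Python as follows. This is a minimal illustration, not System U's implementation: the coefficients are the "Ideal" row of the slide's example, while the function names, the regex tokenizer, the crude plural handling, and the reference population scores are all assumptions made for this sketch.

```python
import re
from bisect import bisect_right

# Coefficients for the "Ideal" trait, copied from the slide's example row.
IDEAL_COEFFICIENTS = {
    "chauffeur": 0.37,
    "accomplish": 0.94,
    "goal": 0.23,
    "special": 0.35,
    "license": 0.13,
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def normalized_counts(tokens, vocabulary):
    """Step 1: count each vocabulary word and divide by the total number of words."""
    total = len(tokens)
    counts = {word: 0 for word in vocabulary}
    for token in tokens:
        # Crude plural handling so that "goals" matches "goal" (an assumption).
        stem = token[:-1] if token.endswith("s") and token[:-1] in counts else token
        if stem in counts:
            counts[stem] += 1
    return {word: (n / total if total else 0.0) for word, n in counts.items()}


def trait_score(norm_counts, coefficients):
    """Step 2: linear combination of normalized counts with learned coefficients."""
    return sum(coefficients[word] * norm_counts[word] for word in coefficients)


def percentile(score, reference_scores):
    """Step 3: percentage of reference scores less than or equal to this score."""
    ordered = sorted(reference_scores)
    return 100.0 * bisect_right(ordered, score) / len(ordered)


text = "great to have a chauffeur who can help us accomplish our goals"
tokens = tokenize(text)
counts = normalized_counts(tokens, IDEAL_COEFFICIENTS)
score = trait_score(counts, IDEAL_COEFFICIENTS)

# Hypothetical raw scores of a reference population, invented for this sketch.
reference = [0.01, 0.05, 0.08, 0.12, 0.20]
print(round(score, 4), percentile(score, reference))
```

With the 12-word example, three vocabulary words occur once each, so the raw score is (0.37 + 0.94 + 0.23) / 12. The percentile then depends entirely on the reference population, which in practice would have to come from a large sample of scored users.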
21. How good are our results compared to standard psychometric studies?
How well can our results be used to predict or influence one’s behavior?
22. System U vs. Standard Surveys
• Participants
– Invited 1325 Twitter users at IBM, 650 responded, and 256 completed
• Method
– Participants took three sets of psychometric tests
• 50-item Big 5 (IPIP), 26-item basic values (Schwartz), and 52-item fundamental needs (our own)
– Participants rated how well each type of derived trait matches their perception of themselves
23. Results
• RV-coefficient correlation analysis of each type of trait
• For over 80% of the population, the correlation is statistically significant (80.8%, 98.21%, and 86.6% for Big 5 personality, basic values, and needs) [Gou et al. CHI 2014]
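For readers unfamiliar with the RV coefficient, here is a small Python sketch of its standard definition for two score matrices measured on the same rows: the squared Frobenius norm of the cross-product X'Y divided by the square root of the product of the squared Frobenius norms of X'X and Y'Y, after column-centering. The slides do not say how the coefficient was applied per participant or how significance was tested, so this shows only the definition. The function names and the sample data are assumptions for illustration.

```python
from math import sqrt


def center_columns(matrix):
    """Subtract each column's mean so that every column has mean zero."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]


def cross_product(a, b):
    """Return A^T B for two matrices with the same number of rows."""
    return [[sum(x * y for x, y in zip(col_a, col_b)) for col_b in zip(*b)]
            for col_a in zip(*a)]


def squared_frobenius(matrix):
    """Sum of squared entries, equal to trace(M M^T)."""
    return sum(value * value for row in matrix for value in row)


def rv_coefficient(x, y):
    """RV coefficient between two score matrices measured on the same rows."""
    xc, yc = center_columns(x), center_columns(y)
    s_xy = cross_product(xc, yc)
    s_xx = cross_product(xc, xc)
    s_yy = cross_product(yc, yc)
    return squared_frobenius(s_xy) / sqrt(squared_frobenius(s_xx) * squared_frobenius(s_yy))


# Invented data: rows are observations, columns are trait scores.
derived = [[0.2, 0.7, 0.5], [0.9, 0.1, 0.4], [0.4, 0.6, 0.8], [0.6, 0.3, 0.1]]
survey = [[0.3, 0.6, 0.6], [0.8, 0.2, 0.3], [0.5, 0.5, 0.9], [0.5, 0.4, 0.2]]
rv = rv_coefficient(derived, survey)
print(round(rv, 3))
```

The coefficient lies between 0 and 1 and equals 1 when one matrix is a shifted and uniformly scaled copy of the other.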
24. Field Studies on Twitter
Who are more likely to behave as asked and how?
– Respond to recommended services (“ads”)
– Answer strangers’ questions
– Help strangers spread information (e.g., SOS)
26. Study 1: Who Will Respond to Ads
Social message
Fine Lifestyle message
Fun message
27. Study 1: Who Will Respond to Ads
Method
– Identified 7290 Twitter users who tweeted about traveling to NYC in the near future
– Computed personality traits for each identified user
– Sent one of the three messages via Twitter to each person
28. Study 1: Who Will Respond to Ads
Results
• Relationships between traits and responses
– Avg response rates for some top-matched users are impressive (e.g., top 25% Extrovert for social msg: CTR 8.65, following 9.12, and RFR 5.66)
• Certain personality traits resulted in significantly higher successful responses
– A combination of high openness and low neuroticism presented 31% and 45% increases in clicking and following rates
29. Study 2: Who Will Answer Questions [Mahmud et al., IUI 2013]
Method
– Model a person’s ability, willingness, and readiness to answer questions
– Predict one’s likelihood to respond
– Optimization-based approach to answerer selection
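A rough Python sketch of how such a selection could look under strong simplifying assumptions. The slides do not specify the model or the optimization, so this swaps in a deliberately simple stand-in: the three scores are treated as independent probabilities and multiplied, and the k candidates with the highest product are chosen, which maximizes the expected number of responses under a fixed budget of k requests. The `Candidate` class, the scores, and the user names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    ability: float      # hypothetical score in [0, 1]
    willingness: float  # hypothetical score in [0, 1]
    readiness: float    # hypothetical score in [0, 1]


def response_likelihood(candidate):
    """Assumed combination: treat the three scores as independent probabilities."""
    return candidate.ability * candidate.willingness * candidate.readiness


def select_answerers(candidates, k):
    """Pick the k candidates that maximize the expected number of responses."""
    ranked = sorted(candidates, key=response_likelihood, reverse=True)
    return ranked[:k]


# Invented candidates for illustration only.
candidates = [
    Candidate("user_a", 0.9, 0.8, 0.5),
    Candidate("user_b", 0.6, 0.9, 0.9),
    Candidate("user_c", 0.4, 0.3, 0.8),
    Candidate("user_d", 0.8, 0.7, 0.9),
]
chosen = select_answerers(candidates, 2)
print([c.name for c in chosen])
```

A real system would presumably learn the likelihood from past responses instead of multiplying hand-set scores; the top-k step is only the simplest form the selection could take.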
30. Study 2: Who Will Answer Questions [Mahmud et al., IUI 2013]
Experiment Results
– Identified 500 Twitter users each for two domains
– Sent requests to 100 random users, used our work to select 100 among the remaining 400 users
– Compared random, baseline, and ours
                  TSA-tracker-1      TSA-tracker-2      Product
Baseline          42%                33%                31%
Live Experiment   Random Selection   Our Algorithm
TSA-Tracker-1     29%                66%
Product           26%                60%
31. Study 2: Who Will Spread Information and When [Lee et al., IUI 2014]
Method
– Modeled core features of an “information spreader”
• Willingness, readiness, activity time pattern
– Predicted the likelihood to respond and time-to-act
32. Study 2: Who Will Spread Information and When [Lee et al., IUI 2014]
Experiment Results
– Randomly selected 426 candidates who had recently tweeted about “bird flu” in July 2013
– Each approach selected top 100 candidates
Approach                          Retweeting Rate
Random People Contact             4%
Popular People Contact            9%
Our Approach                      19%
Approach                          Retweeting Rate
Random People Contact             4%
Popular People Contact            8.7%
Our Prediction Approach           18%
Our Approach + Wait time model    18.5%
33. Key Applications
Marketing
– Determine who, what, how, and when to target
Customer Care
– Agent-customer match making
– Real-time agent assistant
Smarter Workforce
– Recruitment
– Talent identification and development
– Risk identification and mitigation
34. Summary
• Psycholinguistic analysis derives deep understanding of individuals at scale
• Derived personality traits can be used to predict and influence individuals’ behavior in the real world
• Far-reaching implications on creating hyper-personalized social recommender systems
35. Acknowledgement
• Jilin Chen
• Eben Habor
• Liang Gou
• Jalal Mahmud
• Nimrod Megiddo
• Jeff Nichols
• Aditya Pal
• Jerre Schoudt
• Barton Smith
• Ying Xuan
• Huahai Yang
• Hernan Badenes
• Mateo Nicolas Bengualid
• Richard Gabriel
• Huiji Gao
• Chris Kau
• Mengdie Hu
• Kyumin Lee
• Tara Matthews
• Ruogu Yang
• Tom Zimmerman
36. References
• Chen, J., Hsieh, G., Mahmud, J., and Nichols, J. Understanding individuals’ personal values from social media word use. In ACM Proc. CSCW 2014.
• Ford, J. K. Brands Laid Bare. John Wiley & Sons, 2005.
• Gou, L., Zhou, M.X., and Yang, H. KnowMe and ShareMe: Understanding automatically discovered personality traits from social media and user sharing preferences. In ACM Proc. CHI 2014.
• Lee, K., Mahmud, J., Chen, J., Zhou, M.X., and Nichols, J. Who will retweet this? Automatically identifying and engaging strangers on Twitter to spread information. In ACM Proc. IUI 2014.
• Luo, L., Wang, F., Zhou, M.X., Pan, X., and Chen, H. Who’s got answers? Growing the pool of answerers in a smart enterprise Social Q&A system. In ACM Proc. IUI 2014.
• Mahmud, J., Zhou, M.X., Megiddo, N., Nichols, J., and Drews, C. Recommending Targeted Strangers from Whom to Solicit Information in Twitter. In ACM Proc. IUI 2013.
• Schwartz, S. H. Basic human values: Theory, measurement, and applications. Revue française de sociologie, 2006.
• Tausczik, Y. R., and Pennebaker, J. W. The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology 29, 1 (2010), 24–54.
• Yang, H., and Li, Y. Identifying user needs from social media. IBM Tech. Report (2013).
• Yarkoni, T. Personality in 100,000 words: A large-scale analysis of personality and word use among bloggers. Journal of Research in Personality 44, 3 (2010), 363–373.