What Makes Online Content Viral
MILKMAN*
Why are certain pieces of online content (e.g., advertisements, videos,
news articles) more viral than others? This article takes a psychological
approach to understanding diffusion. Using a unique data set of all the
New York Times articles published over a three-month period, the authors
examine how emotion shapes virality. The results indicate that positive
content is more viral than negative content, but the relationship between
emotion and social transmission is more complex than valence alone.
Virality is partially driven by physiological arousal. Content that evokes
high-arousal positive (awe) or negative (anger or anxiety) emotions is
more viral. Content that evokes low-arousal, or deactivating, emotions
(e.g., sadness) is less viral. These results hold even when the authors
control for how surprising, interesting, or practically useful content is (all
of which are positively linked to virality), as well as external drivers of
attention (e.g., how prominently content was featured). Experimental
results further demonstrate the causal impact of specific emotion on
transmission and illustrate that it is driven by the level of activation
induced. Taken together, these findings shed light on why people share
content and how to design more effective viral marketing campaigns.
Keywords: word of mouth, viral marketing, social transmission, online
content
that interpersonal communication affects attitudes and decision making (Asch 1956; Katz and Lazarsfeld 1955), and
recent work has demonstrated the causal impact of word of
mouth on product adoption and sales (Chevalier and Mayzlin 2006; Godes and Mayzlin 2009).
Although it is clear that social transmission is both frequent and important, less is known about why certain pieces
of online content are more viral than others. Some customer
service experiences spread throughout the blogosphere,
while others are never shared. Some newspaper articles earn
a position on their website's most e-mailed list, while others languish. Companies often create online ad campaigns
or encourage consumer-generated content in the hope that
people will share this content with others, but some of these
efforts take off while others fail. Is virality just random, as
some argue (e.g., Cashmore 2009), or might certain characteristics predict whether content will be highly shared?
This article examines how content characteristics affect
virality. In particular, we focus on how emotion shapes
social transmission. We do so in two ways. First, we analyze
a unique data set of nearly 7000 New York Times articles to
examine which articles make the newspaper's most e-mailed list. Controlling for external drivers of attention,
such as where an article was featured online and for how
long, we examine how content's valence (i.e., whether an
One reason people may share stories, news, and information is that they contain useful information. Coupons or
articles about good restaurants help people save money and
eat better. Consumers may share such practically useful
content for altruistic reasons (e.g., to help others) or for self-enhancement purposes (e.g., to appear knowledgeable; see
Wojnicki and Godes 2008). Practically useful content also has
social exchange value (Homans 1958), and people may share
it to generate reciprocity (Fehr, Kirchsteiger, and Riedl 1998).
Emotional aspects of content may also affect whether it is
shared (Heath, Bell, and Sternberg 2001). People report discussing many of their emotional experiences with others,
and customers report greater word of mouth at the extremes
of satisfaction (i.e., highly satisfied or highly dissatisfied;
Anderson 1998). People may share emotionally charged content to make sense of their experiences, reduce dissonance, or
deepen social connections (Festinger, Riecken, and Schachter
1956; Peters and Kashima 2007; Rime et al. 1991).
Emotional Valence and Social Transmission
Importantly, however, the social transmission of emotional content may be driven by more than just valence. In
addition to being positive or negative, emotions also differ
on the level of physiological arousal or activation they
evoke (Smith and Ellsworth 1985). Anger, anxiety, and sadness are all negative emotions, for example, but while anger
and anxiety are characterized by states of heightened
arousal or activation, sadness is characterized by low
arousal or deactivation (Barrett and Russell 1998).
We suggest that these differences in arousal shape social
transmission (see also Berger 2011). Arousal is a state of
mobilization. While low arousal or deactivation is characterized by relaxation, high arousal or activation is characterized by activity (for a review, see Heilman 1997). Indeed,
this excitatory state has been shown to increase action-related behaviors such as getting up to help others (Gaertner
and Dovidio 1977) and responding faster to offers in negotiations (Brooks and Schweitzer 2011). Given that sharing
information requires action, we suggest that activation
should have similar effects on social transmission and boost
the likelihood that content is highly shared.
If this is the case, even two emotions of the same valence
may have different effects on sharing if they induce different levels of activation. Consider something that makes people sad versus something that makes people angry. Both
emotions are negative, so a simple valence-based perspective would suggest that content that induces either emotion
should be less viral (e.g., people want to make their friends
feel good rather than bad). An arousal- or activation-based
analysis, however, provides a more nuanced perspective.
Although both emotions are negative, anger might increase
transmission (because it is characterized by high activation),
while sadness might actually decrease transmission
(because it is characterized by deactivation or inaction).
THE CURRENT RESEARCH
We collected information about all New York Times articles that appeared on the newspaper's home page
(www.nytimes.com) between August 30 and November 30, 2008
(6956 articles). We captured data using a web crawler that
visited the New York Times home page every 15 minutes
during the period in question. It recorded information about
every article on the home page and each article on the most
e-mailed list (updated every 15 minutes). We captured each
article's title, full text, author(s), topic area (e.g., opinion,
sports), and two-sentence summary created by the New York
Times. We also captured each article's section, page, and
publication date if it appeared in the print paper, as well as
the dates, times, locations, and durations of all appearances
it made on the New York Times home page. Of the articles
in our data set, 20% earned a position on the most e-mailed
list.
Article Coding
(n = 2566). For each dimension (awe, anger, anxiety, sadness, surprise, practical utility, and interest), a separate
group of three independent raters rated each article on a
five-point Likert scale according to the extent to which it
was characterized by the construct in question (1 = not at
all, and 5 = extremely). We gave raters feedback on their
coding of a test set of articles until it was clear that they
understood the relevant construct. Interrater reliability was
high on all dimensions (all α's > .70), indicating that content tends to evoke similar emotions across people. We
averaged scores across coders and standardized them (for
sample articles that scored highly on the different dimensions, see Table 1; for summary statistics, see Table 2; and
for correlations between variables, see the Appendix). We
assigned all uncoded articles a score of zero on each dimension after standardization (i.e., we assigned uncoded articles
the mean value), and we included a dummy in regression
analyses to control for uncoded stories (for a discussion of
this standard imputation methodology, see Cohen and
Cohen 1983). This enabled us to use the full set of articles
collected to analyze the relationship between other content
characteristics (that did not require manual coding) and
virality. Using only the coded subset of articles provides
similar results.
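The averaging, standardization, and mean imputation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the input layout (a dict mapping article IDs to rater scores, with `None` for uncoded articles), and the use of the population standard deviation are all assumptions.

```python
from statistics import mean, pstdev

def standardize_with_imputation(ratings):
    """Average rater scores, z-score them, and impute uncoded articles.

    ratings: dict of article_id -> list of rater scores, or None if uncoded
             (hypothetical input layout).
    Returns: dict of article_id -> (z_score, uncoded_dummy).
    """
    # Average across coders for every coded article.
    coded = {a: mean(r) for a, r in ratings.items() if r is not None}
    values = list(coded.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD; must be non-zero

    result = {}
    for article, scores in ratings.items():
        if scores is None:
            # Uncoded: zero after standardization (the mean), dummy = 1.
            result[article] = (0.0, 1)
        else:
            result[article] = ((coded[article] - mu) / sigma, 0)
    return result

example = standardize_with_imputation(
    {"a": [1, 2, 3], "b": [3, 4, 5], "c": None}
)
```

The dummy lets a regression absorb any systematic difference between coded and uncoded articles, so the imputed zeros do not bias the coefficients on the coded dimensions.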
Table 1

Emotionality
  "Redefining Depression as Mere Sadness"
  "When All Else Fails, Blaming the Patient Often Comes Next"
Positivity
  "Wide-Eyed New Arrivals Falling in Love with the City"
  "Tony Award for Philanthropy"
  (Low Scoring)
  "Web Rumors Tied to Korean Actress's Suicide"
  "Germany: Baby Polar Bear's Feeder Dies"
Awe
  "Rare Treatment Is Reported to Cure AIDS Patient"
  "The Promise and Power of RNA"
Anger
  "What Red Ink? Wall Street Paid Hefty Bonuses"
  "Loan Titans Paid McCain Adviser Nearly $2 Million"
Anxiety
  "For Stocks, Worst Single-Day Drop in Two Decades"
  "Home Prices Seem Far from Bottom"
Sadness
  "Maimed on 9/11, Trying to Be Whole Again"
  "Obama Pays Tribute to His Grandmother After She Dies"

Control Variables
Practical Utility
  "Voter Resources"
  "It Comes in Beige or Black, but You Make It Green" (a story about being environmentally friendly when disposing of old computers)
Interest
  "Love, Sex and the Changing Landscape of Infidelity"
  "Teams Prepare for the Courtship of LeBron James"
Surprise
  "Passion for Food Adjusts to Fit Passion for Running" (a story about a restaurateur who runs marathons)
  "Pecking, but No Order, on Streets of East Harlem" (a story about chickens in Harlem)
Additional Controls
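Table 2
Summary statistics

Variable              Mean       SD
Emotionality          7.43%      1.92%
Positivity            .98%       1.84%
Awe                   1.81       .71
Anger                 1.47       .51
Anxiety               1.55       .64
Sadness               1.31       .41
Practical utility     1.66       1.01
Interest              2.71       .85
Surprise              2.25       .87
Word count            1021.35    668.94
Complexity            11.08      1.54
Author fame           9.13       2.54
Author female         .29        .45
Author male           .66        .48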
Table 3
PREDICTOR VARIABLES

Variable
  Awe
  Anger
  Anxiety
  Sadness
Content Controls
  Practical utility
  Interest
  Surprise
Other Control Variables
  Word count
  Author fame
  Writing complexity
  Author gender
of its skew, we use the logarithm of this variable as a control in our analyses. We also control for variables that might
influence both transmission and the likelihood that an article possesses certain characteristics (e.g., evokes anger).
Writing complexity. We control for how difficult a piece
of writing is to read using the SMOG Complexity Index
(McLaughlin 1969). This widely used index variable essentially measures the grade-level appropriateness of the writing. Alternate complexity measures yield similar results.
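As a rough illustration of what such an index computes, the sketch below applies the commonly cited SMOG formula, grade = 1.0430 × sqrt(polysyllables × 30 / sentences) + 3.1291. The syllable counter is a crude vowel-group heuristic and the sentence splitter is a simple regular expression; both are assumptions made for illustration and are not the procedure used in the article.

```python
import math
import re

def count_syllables(word):
    # Crude heuristic (assumption): each run of vowels counts as one syllable.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def smog_grade(text):
    """Approximate SMOG grade level of a text."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        raise ValueError("text must contain at least one sentence")
    words = re.findall(r"[A-Za-z']+", text)
    # Polysyllabic words are those with three or more syllables.
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291

simple = smog_grade("The cat sat. The dog ran.")
harder = smog_grade("Physiological activation increases transmission.")
```

A text with no polysyllabic words scores the formula's floor of 3.1291, and the score rises with the density of long words per sentence.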
Author gender. Because male and female authors have
different writing styles (Koppel, Argamon, and Shimoni
2002; Milkman, Carmona, and Gleason 2007), we control
for the gender of an article's first author (male, female, or
unknown due to a missing byline). We classify gender using
a first name mapping list from prior research (Morton,
Zettelmeyer, and Silva-Risso 2003). For names that were
classified as gender neutral or did not appear on this list,
research assistants determined author gender by finding the
authors online.
Article length and day dummies. We also control for an
article's length in words. Longer articles may be more likely
to go into enough detail to inspire awe or evoke anger but
may simply be more viral because they contain more information.
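\[
\text{makes\_it}_{at} = \frac{1}{1 + \exp\left\{-\left[\alpha_t + \beta_1\, z\text{-emotionality}_{at} + \beta_2\, z\text{-positivity}_{at} + \beta_3\, z\text{-awe}_{at} + \beta_4\, z\text{-anger}_{at} + \beta_5\, z\text{-anxiety}_{at} + \beta_6\, z\text{-sadness}_{at} + \theta' X_{at}\right]\right\}}
\]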
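To make the logistic specification concrete, the following sketch computes a fitted probability from standardized predictors. The function name, argument names, and all numeric values are hypothetical and chosen only for illustration; they are not estimates from the article.

```python
import math

def fitted_probability(day_effect, coefficients, z_scores):
    """Logistic fitted probability that an article makes the list.

    day_effect:   day-specific intercept (hypothetical value)
    coefficients: dict of predictor name -> coefficient
    z_scores:     dict of predictor name -> standardized value
    """
    # Linear index: intercept plus the sum of coefficient * predictor.
    linear = day_effect + sum(
        coefficients[name] * z_scores[name] for name in coefficients
    )
    # Logistic link maps the linear index into (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))

# Illustrative numbers only (assumptions).
coefs = {"awe": 0.3, "anger": 0.4, "sadness": -0.2}
article = {"awe": 1.0, "anger": 0.0, "sadness": -0.5}
p = fitted_probability(-1.4, coefs, article)
```

Because predictors are standardized, each coefficient is the change in log-odds for a one-standard-deviation increase in that characteristic, which makes coefficients comparable across dimensions.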
Figure 1
[Diagram of the New York Times home page with labeled regions: Top Feature, Near Top Feature, Right Column, Middle Feature Bar, Bulleted Sub-Feature, More News, Bottom List.]
Notes: Portions with "X" through them always featured Associated Press and Reuters news stories, videos, blogs, or advertisements rather than articles by
New York Times reporters.
Table 4
AN ARTICLE'S LIKELIHOOD OF MAKING THE NEW YORK TIMES' MOST E-MAILED LIST AS A FUNCTION OF ITS CONTENT
CHARACTERISTICS

                                             Positivity     Emotionality   Specific       Including      Including      Only Coded
                                                                           Emotions       Controls       Section        Articles
                                                                                                         Dummies
                                             (1)            (2)            (3)            (4)            (5)            (6)
Emotion Predictors
  Positivity                                 .13*** (.03)   .11*** (.03)   .17*** (.03)   .16*** (.04)   .14*** (.04)   .23*** (.05)
  Emotionality                                              .27*** (.03)   .26*** (.03)   .22*** (.04)   .09* (.04)     .29*** (.06)
Specific Emotions
  Awe                                                                      .46*** (.05)   .34*** (.05)   .30*** (.06)   .36*** (.06)
  Anger                                                                    .44*** (.06)   .38*** (.09)   .29** (.10)    .37*** (.10)
  Anxiety                                                                  .20*** (.05)   .24*** (.07)   .21*** (.07)   .27*** (.07)
  Sadness                                                                  -.19*** (.05)  -.17* (.07)    -.12 (.07)     -.16* (.07)
Content Controls
  Practical utility                                                                       .34*** (.06)   .18** (.07)    .27*** (.06)
  Interest                                                                                .29*** (.06)   .31*** (.07)   .27*** (.07)
  Surprise                                                                                .16** (.06)    .24*** (.06)   .18** (.06)
Home Page Location Controls
  Top feature                                                                             .13*** (.02)   .11*** (.02)   .11*** (.03)
  Near top feature                                                                        .11*** (.01)   .10*** (.01)   .12*** (.01)
  Right column                                                                            .14*** (.01)   .10*** (.02)   .15*** (.02)
  Middle feature bar                                                                      .06*** (.00)   .05*** (.01)   .06*** (.01)
  Bulleted subfeature                                                                     .04** (.01)    .04** (.01)    .05* (.02)
  More news                                                                               .01 (.01)      .06*** (.01)   .01 (.02)
  Bottom list × 10                                                                        .06** (.02)    .11*** (.03)   .08** (.03)
Other Control Variables
  Word count × 10^-3                                                                      .52*** (.11)   .71*** (.12)   .57*** (.18)
  Complexity                                                                              .05 (.04)      .05 (.04)      .06 (.07)
  Author fame                                                                             .17*** (.02)   .15*** (.02)   .15*** (.03)
  Author female                                                                           .36*** (.08)   .33*** (.09)   .27* (.13)
  Uncredited                                                                              .39 (.26)      .56* (.27)     .50 (.37)
Newspaper location and web timing controls   No             No             No             Yes            Yes            Yes
Article section dummies                      No             No             No             No             Yes            No
Observations                                 6956           6956           6956           6956           6956           2566
McFadden's R2                                .00            .04            .07            .28            .36            .32
Log-pseudo-likelihood                        -3245.85       -3118.45       -3034.17       -2331.37       -2084.85       -904.76

Notes: Standard errors are in parentheses.
Figure 2
[Bar chart. Axis: % Change in Fitted Probability of Making the List. Bars: Anger (+1SD), Awe (+1SD), Positivity (+1SD), Emotionality (+1SD), Interest (+1SD), Surprise (+1SD), Sadness (+1SD).]
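One way to translate a logistic coefficient into a percentage change in fitted probability is sketched below. It assumes a baseline probability of 20% (the share of articles in the data set that made the list) and an illustrative coefficient; the function name and the choice of baseline are assumptions, and the figure may have been computed differently (e.g., at the means of all other predictors).

```python
import math

def pct_change_in_probability(baseline_prob, coefficient, sd_increase=1.0):
    """Percent change in fitted probability for a standardized predictor.

    baseline_prob: probability before the change (assumed, e.g., 0.20)
    coefficient:   logistic coefficient on the standardized predictor
    sd_increase:   size of the increase in standard deviations
    """
    # Convert the baseline probability to log-odds.
    logit = math.log(baseline_prob / (1.0 - baseline_prob))
    # Shift the log-odds and convert back to a probability.
    new_prob = 1.0 / (1.0 + math.exp(-(logit + coefficient * sd_increase)))
    return 100.0 * (new_prob / baseline_prob - 1.0)

# Illustrative: a coefficient of 0.38 at a 20% baseline.
change = pct_change_in_probability(0.20, 0.38)
```

Because the logistic function is nonlinear, the same coefficient implies a different percentage change at a different baseline, so any such figure should state the baseline it assumes.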
These findings also have important marketing implications. Considering the specific emotions content evokes
should help companies maximize revenue when placing
advertisements and should help online content providers
when pricing access to content (e.g., potentially charging
more for content that is more likely to be shared). It might
also be useful to feature or design content that evokes activating emotions because such content is likely to be shared
(thus increasing page views).
Our findings also shed light on how to design successful
viral marketing campaigns and craft contagious content.
While marketers often produce content that paints their
product in a positive light, our results suggest that content
will be more likely to be shared if it evokes high-arousal
emotions. Advertisements that make consumers content or
relaxed, for example, will not be as viral as those that amuse
them. Furthermore, while some marketers might shy away
from advertisements that evoke negative emotions, our
results suggest that negative emotion can actually increase
transmission if it is characterized by activation. BMW, for
example, created a series of short online films called "The
Hire" that it hoped would go viral and that included
car chases and story lines that often evoked anxiety (with
such titles as "Ambush" and "Hostage"). While one might
be concerned that negative emotion would hurt the brand,
our results suggest that it should increase transmission
because anxiety induces arousal. (Incidentally, "The Hire"
was highly successful, generating millions of views.) Following this line of reasoning, public health information
should be more likely to be passed on if it is framed to
evoke anger or anxiety rather than sadness.
Similar points apply to managing online consumer sentiment. While some consumer-generated content (e.g.,
reviews, blog posts) is positive, much is negative and can
build into consumer backlashes if it is not carefully managed. Mothers offended by a Motrin ad campaign, for example, banded together and began posting negative YouTube
videos and tweets (Petrecca 2008). Although it is impossible to address all negative sentiment, our results indicate
that certain types of negativity may be more important to
address because they are more likely to be shared. Customer
experiences that evoke anxiety or anger, for example,
should be more likely to be shared than those that evoke
sadness (and textual analysis can be used to distinguish different types of posts). Consequently, it may be more important to rectify experiences that make consumers anxious
rather than disappointed.
In conclusion, this research illuminates how content characteristics shape whether content becomes viral. When attempting
to generate word of mouth, marketers often try targeting
influentials, or opinion leaders (i.e., some small set of
special people who, whether through having more social
ties or being more persuasive, theoretically have more influence than others). Although this approach is pervasive,
Appendix
Correlations between variables (column numbers refer to the numbered row variables)

Variable                   1      2      3      4      5      6      7      8      9      10     11     12     13     14     15     16     17     18     19     20
 1. Emotionality           1.00
 2. Positivity             .04*   1.00
 3. Awe                    -.02   .02    1.00
 4. Anger                  .04*   -.16*  -.21*  1.00
 5. Anxiety                .03*   -.18*  -.11*  .50*   1.00
 6. Sadness                .00    -.18*  .08*   .42*   .45*   1.00
 7. Practical utility      .06*   .04*   -.11*  -.12*  .07*   -.05*  1.00
 8. Interest               .05*   .07*   .26*   -.13*  -.24*  -.19*  -.06*  1.00
 9. Surprise               -.10*  -.04*  .24*   -.01   .00    .05*   -.05*  .18*   1.00
10. Word count × 10^-3     .06*   -.05*  .04*   .02    .00    .00    -.01   .06*   .02*   1.00
11. Complexity             .05*   -.05*  -.04*  .10*   .13*   .05*   .01    -.11*  .04*   -.06*  1.00
12. Author fame            -.09*  -.03*  .06*   .01    .03*   .01    -.02   .00    .02    .01    .01    1.00
13. Author female          -.07*  .06*   .01    -.03*  .00    .00    .05*   -.01   .07*   .00    -.02*  .00    1.00
14. Missing                .21*   .03*   -.06*  .03*   -.02   .00    .01    .02    -.09*  -.01   .02*   -.71*  -.15*  1.00
15. Top feature            .01    -.02   -.03*  .06*   .06*   .05*   .02    -.03*  -.02*  .28*   .01    .00    -.02   .01    1.00
16. Near top feature       -.01   -.06*  -.02   .15*   .07*   .07*   -.03*  -.05*  .01    .27*   .06*   .06*   -.01   -.05*  .27*   1.00
17. Right column           .16*   .05*   .04*   .00    -.02   -.02   .05*   .06*   -.02*  .05*   -.01   -.03*  -.02   .16*   .02    -.04*  1.00
18. Bulleted subfeature    .00    -.02   -.05*  .09*   .08*   .06*   .04*   -.05*  -.04*  .07*   .03*   .03*   .01    -.04*  .12*   .12*   -.03*  1.00
19. More news              -.08*  -.11*  -.01   .07*   .06*   .06*   -.08*  -.04*  .07*   -.02   .09*   .05*   -.01   -.07*  .01    .10*   -.06*  -.05*  1.00
20. Middle feature bar     .11*   .10*   -.06*  -.06*  -.06*  -.05*  .00    .10*   .04*   .16*   -.06*  -.13*  .00    .13*   .02    -.05*  .07*   -.04*  -.08*  1.00
21. Bottom list            .03*   .15*   -.07*  -.11*  -.09*  -.06*  .06*   .09*   .04*   .29*   -.04*  -.06*  .05*   .00    .04*   -.05*  .10*   .00    -.09*  .13*