Reply to Garcia et al.: Common mistakes in measuring frequency dependent word characteristics
Authors: P. S. Dodds, E. M. Clark, S. Desu, M. R. Frank, A. J. Reagan, J. R. Williams, L. Mitchell, K. D. Harris, I. M. Kloumann, J. P. Bagrow, K. Megerdoomian, M. T. McMahon, B. F. Tivnan, C. M. Danforth
Abstract:
We demonstrate that the concerns expressed by Garcia et al. are misplaced, due to (1) a misreading of our findings in [1]; (2) a widespread failure to examine and present words in support of asserted summary quantities based on word usage frequencies; and (3) a range of misconceptions about word usage frequency, word rank, and expert-constructed word lists. In particular, we show that the English component of our study compares well statistically with two related surveys, that no survey design influence is apparent, and that estimates of measurement error do not explain the positivity biases reported in our work and that of others. We further demonstrate that, for the frequency dependence of positivity (whose nuances we explored in great detail in [1]), Garcia et al. did not perform a reanalysis of our data; they instead analyzed a different, statistically improper data set and introduced a nonlinearity before performing linear regression.
Submitted 28 May, 2015; v1 submitted 25 May, 2015; originally announced May 2015.
Human language reveals a universal positivity bias
Authors: Peter Sheridan Dodds, Eric M. Clark, Suma Desu, Morgan R. Frank, Andrew J. Reagan, Jake Ryland Williams, Lewis Mitchell, Kameron Decker Harris, Isabel M. Kloumann, James P. Bagrow, Karine Megerdoomian, Matthew T. McMahon, Brian F. Tivnan, Christopher M. Danforth
Abstract:
Using human evaluation of 100,000 words spread across 24 corpora in 10 languages diverse in origin and culture, we present evidence of a deep imprint of human sociality in language, observing that (1) the words of natural human language possess a universal positivity bias; (2) the estimated emotional content of words is consistent between languages under translation; and (3) this positivity bias is strongly independent of frequency of word usage. Alongside these general regularities, we describe inter-language variations in the emotional spectrum of languages which allow us to rank corpora. We also show how our word evaluations can be used to construct physical-like instruments for both real-time and offline measurement of the emotional content of large-scale texts.
Submitted 15 June, 2014; originally announced June 2014.