Showing 1–18 of 18 results for author: Minot, J R

  1. arXiv:2307.08580  [pdf, other]

    physics.soc-ph cs.CL

    The Resume Paradox: Greater Language Differences, Smaller Pay Gaps

    Authors: Joshua R. Minot, Marc Maier, Bradford Demarest, Nicholas Cheney, Christopher M. Danforth, Peter Sheridan Dodds, Morgan R. Frank

    Abstract: Over the past decade, the gender pay gap has remained steady with women earning 84 cents for every dollar earned by men on average. Many studies explain this gap through demand-side bias in the labor market represented through employers' job postings. However, few studies analyze potential bias from the worker supply-side. Here, we analyze the language in millions of US workers' resumes to investi…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: 24 pages, 15 figures

  2. arXiv:2110.06847  [pdf, other]

    cs.CL cs.CY cs.SI physics.soc-ph

    Ousiometrics and Telegnomics: The essence of meaning conforms to a two-dimensional powerful-weak and dangerous-safe framework with diverse corpora presenting a safety bias

    Authors: P. S. Dodds, T. Alshaabi, M. I. Fudolig, J. W. Zimmerman, J. Lovato, S. Beaulieu, J. R. Minot, M. V. Arnold, A. J. Reagan, C. M. Danforth

    Abstract: We define `ousiometrics' to be the study of essential meaning in whatever context that meaningful signals are communicated, and `telegnomics' as the study of remotely sensed knowledge. From work emerging through the middle of the 20th century, the essence of meaning has become generally accepted as being well captured by the three orthogonal dimensions of evaluation, potency, and activation (EPA).…

    Submitted 29 March, 2023; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: 40 pages (34 page main manuscript, 6 page appendix), 15 figures (9 main, 6 appendix), 4 tables

  3. arXiv:2106.10281  [pdf, other]

    cs.SI cs.CY physics.soc-ph

    Say Their Names: Resurgence in the collective attention toward Black victims of fatal police violence following the death of George Floyd

    Authors: Henry H. Wu, Ryan J. Gallagher, Thayer Alshaabi, Jane L. Adams, Joshua R. Minot, Michael V. Arnold, Brooke Foucault Welles, Randall Harp, Peter Sheridan Dodds, Christopher M. Danforth

    Abstract: The murder of George Floyd by police in May 2020 sparked international protests and renewed attention in the Black Lives Matter movement. Here, we characterize ways in which the online activity following George Floyd's death was unparalleled in its volume and intensity, including setting records for activity on Twitter, prompting the saddest day in the platform's history, and causing George Floyd'…

    Submitted 18 June, 2021; originally announced June 2021.

  4. arXiv:2106.01481  [pdf, other]

    physics.soc-ph cs.CL cs.SI

    Quantifying language changes surrounding mental health on Twitter

    Authors: Anne Marie Stupinski, Thayer Alshaabi, Michael V. Arnold, Jane Lydia Adams, Joshua R. Minot, Matthew Price, Peter Sheridan Dodds, Christopher M. Danforth

    Abstract: Mental health challenges are thought to afflict around 10% of the global population each year, with many going untreated due to stigma and limited access to services. Here, we explore trends in words and phrases related to mental health through a collection of 1-, 2-, and 3-grams parsed from a data stream of roughly 10% of all English tweets since 2012. We examine temporal dynamics of mental heal…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: 12 pages, 5 figures, 1 table

  5. arXiv:2105.12006  [pdf, other]

    cs.SI cs.CL

    The incel lexicon: Deciphering the emergent cryptolect of a global misogynistic community

    Authors: Kelly Gothard, David Rushing Dewhurst, Joshua R. Minot, Jane Lydia Adams, Christopher M. Danforth, Peter Sheridan Dodds

    Abstract: Evolving out of a gender-neutral framing of an involuntary celibate identity, the concept of `incels' has come to refer to an online community of men who bear antipathy towards themselves, women, and society-at-large for their perceived inability to find and maintain sexual relationships. By exploring incel language use on Reddit, a global online message board, we contextualize the incel community…

    Submitted 25 May, 2021; originally announced May 2021.

    Comments: 18 pages, 11 figures

  6. arXiv:2103.05841  [pdf, other]

    cs.CL stat.ML

    Interpretable bias mitigation for textual data: Reducing gender bias in patient notes while maintaining classification performance

    Authors: Joshua R. Minot, Nicholas Cheney, Marc Maier, Danne C. Elbers, Christopher M. Danforth, Peter Sheridan Dodds

    Abstract: Medical systems in general, and patient treatment decisions and outcomes in particular, are affected by bias based on gender and other demographic elements. As language models are increasingly applied to medicine, there is a growing interest in building algorithmic fairness into processes impacting patient care. Much of the work addressing this question has focused on biases encoded in language mo…

    Submitted 9 March, 2021; originally announced March 2021.

    Comments: 31 pages, 22 figures

  7. arXiv:2008.13078  [pdf, other]

    physics.soc-ph cs.IR physics.data-an

    Probability-turbulence divergence: A tunable allotaxonometric instrument for comparing heavy-tailed categorical distributions

    Authors: P. S. Dodds, J. R. Minot, M. V. Arnold, T. Alshaabi, J. L. Adams, D. R. Dewhurst, A. J. Reagan, C. M. Danforth

    Abstract: Real-world complex systems often comprise many distinct types of elements as well as many more types of networked interactions between elements. When the relative abundances of types can be measured well, we further observe heavy-tailed categorical distributions for type frequencies. For the comparison of type frequency distributions of two systems or a system with itself at different time points…

    Submitted 29 August, 2020; originally announced August 2020.

    Comments: 14 pages, 7 figures

  8. arXiv:2008.11305  [pdf, other]

    physics.soc-ph cs.SI

    Long-term word frequency dynamics derived from Twitter are corrupted: A bespoke approach to detecting and removing pathologies in ensembles of time series

    Authors: P. S. Dodds, J. R. Minot, M. V. Arnold, T. Alshaabi, J. L. Adams, D. R. Dewhurst, A. J. Reagan, C. M. Danforth

    Abstract: Maintaining the integrity of long-term data collection is an essential scientific practice. As a field evolves, so too will that field's measurement instruments and data storage systems, as they are invented, improved upon, and made obsolete. For data streams generated by opaque sociotechnical systems which may have episodic and unknown internal rule changes, detecting and accounting for shifts in…

    Submitted 27 August, 2020; v1 submitted 25 August, 2020; originally announced August 2020.

    Comments: 8 pages, 5 figures

  9. arXiv:2008.07301  [pdf, other]

    physics.soc-ph cs.SI

    Computational timeline reconstruction of the stories surrounding Trump: Story turbulence, narrative control, and collective chronopathy

    Authors: P. S. Dodds, J. R. Minot, M. V. Arnold, T. Alshaabi, J. L. Adams, A. J. Reagan, C. M. Danforth

    Abstract: Measuring the specific kind, temporal ordering, diversity, and turnover rate of stories surrounding any given subject is essential to developing a complete reckoning of that subject's historical impact. Here, we use Twitter as a distributed news and opinion aggregation source to identify and track the dynamics of the dominant day-scale stories around Donald Trump, the 45th President of the United…

    Submitted 30 September, 2022; v1 submitted 17 August, 2020; originally announced August 2020.

    Comments: 13 pages, 5 figures (4 main, 1 appendix), 1 table. Analysis complete for 6 calendar years, from 2015/01/01 through to 2021/12/31

    Journal ref: PLOS ONE, 2021, e0260592

  10. arXiv:2007.12988  [pdf, other]

    cs.SI cs.CL physics.soc-ph

    Storywrangler: A massive exploratorium for sociolinguistic, cultural, socioeconomic, and political timelines using Twitter

    Authors: Thayer Alshaabi, Jane L. Adams, Michael V. Arnold, Joshua R. Minot, David R. Dewhurst, Andrew J. Reagan, Christopher M. Danforth, Peter Sheridan Dodds

    Abstract: In real-time, social media data strongly imprints world events, popular culture, and day-to-day conversations by millions of ordinary people at a scale that is scarcely conventionalized and recorded. Vitally, and absent from many standard corpora such as books and news archives, sharing and commenting mechanisms are native to social media platforms, enabling us to quantify social amplification (i.…

    Submitted 16 July, 2021; v1 submitted 25 July, 2020; originally announced July 2020.

    Comments: Main text: 15 pages, 6 figures; Supplementary text: 23 pages, 11 figures, 15 tables. Website: https://storywrangling.org/

    Journal ref: Sci.Adv. 7 eabe6534 (2021)

  11. arXiv:2006.03526  [pdf, other]

    physics.soc-ph cs.SI

    Ratioing the President: An exploration of public engagement with Obama and Trump on Twitter

    Authors: Joshua R. Minot, Michael V. Arnold, Thayer Alshaabi, Christopher M. Danforth, Peter Sheridan Dodds

    Abstract: The past decade has witnessed a marked increase in the use of social media by politicians, most notably exemplified by the 45th President of the United States (POTUS), Donald Trump. On Twitter, POTUS messages consistently attract high levels of engagement as measured by likes, retweets, and replies. Here, we quantify the balance of these activities, also known as "ratios", and study their dynamics…

    Submitted 5 June, 2020; originally announced June 2020.

    Comments: 17 pages, 10 figures

  12. arXiv:2004.03516  [pdf, other]

    physics.soc-ph cs.SI

    Divergent modes of online collective attention to the COVID-19 pandemic are associated with future caseload variance

    Authors: David Rushing Dewhurst, Thayer Alshaabi, Michael V. Arnold, Joshua R. Minot, Christopher M. Danforth, Peter Sheridan Dodds

    Abstract: Using a random 10% sample of tweets authored from 2019-09-01 through 2020-04-30, we analyze the dynamic behavior of words (1-grams) used on Twitter to describe the ongoing COVID-19 pandemic. Across 24 languages, we find two distinct dynamic regimes: One characterizing the rise and subsequent collapse in collective attention to the initial Coronavirus outbreak in late January, and a second that rep…

    Submitted 19 May, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

    Comments: 12 + 4 pages, 11 + 4 figures, code + data + figures will soon be available at http://compstorylab.org/covid19ngrams/

  13. arXiv:2003.14291  [pdf, other]

    cs.SI physics.soc-ph

    Hurricanes and hashtags: Characterizing online collective attention for natural disasters

    Authors: Michael V. Arnold, David Rushing Dewhurst, Thayer Alshaabi, Joshua R. Minot, Jane L. Adams, Christopher M. Danforth, Peter Sheridan Dodds

    Abstract: We study collective attention paid towards hurricanes through the lens of $n$-grams on Twitter, a social media platform with global reach. Using hurricane name mentions as a proxy for awareness, we find that the exogenous temporal dynamics are remarkably similar across storms, but that overall collective attention varies widely even among storms causing comparable deaths and damage. We construct `…

    Submitted 31 March, 2020; originally announced March 2020.

    Comments: 31 pages (14 main, 17 Supplemental), 19 figures (5 main, 14 appendix)

  14. arXiv:2003.12614  [pdf, other]

    physics.soc-ph cs.SI

    How the world's collective attention is being paid to a pandemic: COVID-19 related n-gram time series for 24 languages on Twitter

    Authors: T. Alshaabi, J. R. Minot, M. V. Arnold, J. L. Adams, D. R. Dewhurst, A. J. Reagan, R. Muhamad, C. M. Danforth, P. S. Dodds

    Abstract: In confronting the global spread of the coronavirus disease COVID-19 pandemic we must have coordinated medical, operational, and political responses. In all efforts, data is crucial. Fundamentally, and in the possible absence of a vaccine for 12 to 18 months, we need universal, well-documented testing for both the presence of the disease as well as confirmed recovery through serological tests for…

    Submitted 6 January, 2021; v1 submitted 27 March, 2020; originally announced March 2020.

    Comments: 13 pages, 6 figures, 3 tables, website: http://compstorylab.org/covid19ngrams/

  15. The growing amplification of social media: Measuring temporal and social contagion dynamics for over 150 languages on Twitter for 2009-2020

    Authors: Thayer Alshaabi, David R. Dewhurst, Joshua R. Minot, Michael V. Arnold, Jane L. Adams, Christopher M. Danforth, Peter Sheridan Dodds

    Abstract: Working from a dataset of 118 billion messages running from the start of 2009 to the end of 2019, we identify and explore the relative daily use of over 150 languages on Twitter. We find that eight languages comprise 80% of all tweets, with English, Japanese, Spanish, and Portuguese being the most dominant. To quantify social spreading in each language over time, we compute the 'contagion ratio':…

    Submitted 8 March, 2021; v1 submitted 7 March, 2020; originally announced March 2020.

    Comments: 26 pages (15 main, 11 appendix), 13 figures (6 main, 7 appendix), and 4 online appendices available at http://compstorylab.org/storywrangler/papers/tlid/

  16. arXiv:2002.09770  [pdf, other]

    physics.soc-ph physics.data-an

    Allotaxonometry and rank-turbulence divergence: A universal instrument for comparing complex systems

    Authors: P. S. Dodds, J. R. Minot, M. V. Arnold, T. Alshaabi, J. L. Adams, D. R. Dewhurst, T. J. Gray, M. R. Frank, A. J. Reagan, C. M. Danforth

    Abstract: Complex systems often comprise many kinds of components which vary over many orders of magnitude in size: Populations of cities in countries, individual and corporate wealth in economies, species abundance in ecologies, word frequency in natural language, and node degree in complex networks. Here, we introduce `allotaxonometry' along with `rank-turbulence divergence' (RTD), a tunable instrument fo…

    Submitted 2 August, 2023; v1 submitted 22 February, 2020; originally announced February 2020.

    Comments: 36 pages, 10 main figures, 15 inset figures, 1 table; online appendices: http://compstorylab.org/allotaxonometry/

  17. arXiv:1910.00149  [pdf, other]

    physics.soc-ph cs.SI

    Fame and Ultrafame: Measuring and comparing daily levels of `being talked about' for United States' presidents, their rivals, God, countries, and K-pop

    Authors: Peter Sheridan Dodds, Joshua R. Minot, Michael V. Arnold, Thayer Alshaabi, Jane Lydia Adams, David Rushing Dewhurst, Andrew J. Reagan, Christopher M. Danforth

    Abstract: When building a global brand of any kind -- a political actor, clothing style, or belief system -- developing widespread awareness is a primary goal. Short of knowing any of the stories or products of a brand, being talked about in whatever fashion -- raw fame -- is, as Oscar Wilde would have it, better than not being talked about at all. Here, we measure, examine, and contrast the day-to-day raw…

    Submitted 29 October, 2021; v1 submitted 30 September, 2019; originally announced October 2019.

    Comments: 31 pages (21 pages main text, 10 pages appendix), 8 figures (7 in main text, 1 in appendix), 10 tables (1 in main text, 9 in appendix)

  18. arXiv:1906.11710  [pdf, other]

    physics.soc-ph cs.DS eess.SP physics.data-an

    The shocklet transform: A decomposition method for the identification of local, mechanism-driven dynamics in sociotechnical time series

    Authors: David Rushing Dewhurst, Thayer Alshaabi, Dilan Kiley, Michael V. Arnold, Joshua R. Minot, Christopher M. Danforth, Peter Sheridan Dodds

    Abstract: We introduce a qualitative, shape-based, timescale-independent time-domain transform used to extract local dynamics from sociotechnical time series---termed the Discrete Shocklet Transform (DST)---and an associated similarity search routine, the Shocklet Transform And Ranking (STAR) algorithm, that indicates time windows during which panels of time series display qualitatively-similar anomalous be…

    Submitted 18 December, 2019; v1 submitted 27 June, 2019; originally announced June 2019.

    Comments: 29 pages (20 body, 9 appendix), 20 figures (13 body, 7 appendix), three online appendices available at http://compstorylab.org/shocklets/ (two displaying interactive visualizations and one containing over 10,000 figures), open-source implementation of STAR algorithm and discrete shocklet transform available at https://gitlab.com/compstorylab/discrete-shocklet-transform