Search | arXiv e-print repository

Where there's a will there's a way: ChatGPT is used more for science in countries where it is prohibited

Authors: Honglin Bao, Mengyi Sun, Misha Teplitskiy

Abstract: Regulating AI is a key societal challenge, but which regulation methods are effective is unclear. This study measures the effectiveness of restricting AI services geographically, focusing on ChatGPT. OpenAI restricts ChatGPT access in several countries, including China and Russia. If restrictions are effective, ChatGPT use should be minimal in these countries. We measured use with a classifier based on distinctive word usage found in early versions of ChatGPT, e.g. "delve." We trained the classifier on pre- and post-ChatGPT "polished" abstracts and found it outperformed GPTZero and ZeroGPT on validation sets, including papers with self-reported AI use. Applying the classifier to preprints from Arxiv, BioRxiv, and MedRxiv showed ChatGPT was used in about 12.6% of preprints by August 2023, with 7.7% higher usage in restricted countries. The gap appeared before China's first major legal LLM became widely available. To test the possibility that, due to high demand, use in restricted countries would have been even higher without restrictions, we compared Asian countries with high expected demand (where English is not an official language) and found that use was higher in those with restrictions. ChatGPT use was correlated with higher views and downloads, but not citations or journal placement. Overall, restricting ChatGPT geographically has proven ineffective in science and possibly other domains, likely due to widespread workarounds.

Submitted 27 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: Three figures, two tables, 21 pages, and a 19-page appendix

arXiv:2304.06190 [pdf]

Do "bad" citations have "good" effects?

Authors: Honglin Bao, Misha Teplitskiy

Abstract: The scientific community discourages authors of research papers from citing papers that did not influence them. Such "rhetorical" citations are assumed to degrade the literature and incentives for good work. While a world where authors cite only substantively appears attractive, we argue that mandating substantive citing may have underappreciated consequences on the allocation of attention and dynamism in scientific literatures. We develop a novel agent-based model in which agents cite substantively and rhetorically. Agents first select papers to read based on their expected quality, read them and observe their actual quality, become influenced by those that are sufficiently good, and substantively cite them. Next, agents fill any remaining slots in the reference lists by (rhetorically) citing papers that support their narrative, regardless of whether they were actually influential. By turning rhetorical citing on-and-off, we find that rhetorical citing increases the correlation between quality and citations, increases citation churn, and reduces citation inequality. This occurs because rhetorical citing redistributes some citations from a stable set of elite-quality papers to a more dynamic set with high-to-moderate quality and high rhetorical value. Increasing the size of reference lists, often seen as an undesirable trend, amplifies the effects. In sum, rhetorical citing helps deconcentrate attention and makes it easier to displace incumbent ideas, so whether it is indeed undesirable depends on the metrics used to judge desirability.

Submitted 16 April, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

Comments: Main: 28 pages, one table, 5 figures; Appendix: 11 pages, 13 figures

arXiv:2209.01175 [pdf]

Intentional and serendipitous diffusion of ideas: Evidence from academic conferences

Authors: Misha Teplitskiy, Soya Park, Neil Thompson, David Karger

Abstract: This paper investigates the effects of seeing ideas presented in-person when they are easily accessible online. Presentations may increase the diffusion of ideas intentionally (when one attends the presentation of an idea of interest) and serendipitously (when one sees other ideas presented in the same session). We measure these effects in the context of 25 computer science conferences using data from the scheduling application Confer, which lets users browse papers, Like those of interest, and receive schedules of their presentations. We address endogeneity concerns in presentation attendance by exploiting scheduling conflicts: when a user Likes multiple papers that are presented at the same time, she cannot see them both, potentially affecting their diffusion. Estimates show that being able to see presentations increases citing of Liked papers within two years by 1.5 percentage points (62.5% boost over the baseline citation rate). Attention to Liked papers also spills over to non-Liked papers in the same session, increasing their citing by 0.5 percentage points (125% boost), and this serendipitous diffusion represents 30.5% of the total effect. Both diffusion types were concentrated among papers semantically close to an attendee's prior work, suggesting that there are inefficiencies in finding related research that conferences help overcome. Overall, even when ideas are easily accessible online, in-person presentations substantially increase diffusion, much of it serendipitous.

Submitted 19 January, 2024; v1 submitted 2 September, 2022; originally announced September 2022.

arXiv:2206.05330 [pdf, other]

The Gender Gap in Scholarly Self-Promotion on Social Media

Authors: Hao Peng, Misha Teplitskiy, Daniel M. Romero, Emőke-Ágnes Horvát

Abstract: Self-promotion in science is ubiquitous but may not be exercised equally by men and women. Research on self-promotion in other domains suggests that, due to bias in self-assessment and adverse reactions to non-gender-conforming behaviors ("pushback"), women tend to self-promote less often than men. We test whether this pattern extends to scholars by examining self-promotion over six years using 23M Tweets about 2.8M research papers by 3.5M authors. Overall, women are about 28% less likely than men to self-promote their papers even after accounting for important confounds, and this gap has grown over time. Moreover, differential adoption of Twitter does not explain the gender gap, which is large even in relatively gender-balanced broad research areas, where bias in self-assessment and pushback are expected to be smaller. Further, the gap increases with higher performance and status, being most pronounced for productive women from top-ranked institutions who publish in high-impact journals. Critically, we find differential returns with respect to gender: while self-promotion is associated with increased tweets of papers, the increase is smaller for women than for men. Our findings suggest that self-promotion varies meaningfully by gender and help explain gender differences in the visibility of scientific ideas.

Submitted 10 October, 2023; v1 submitted 10 June, 2022; originally announced June 2022.

arXiv:2107.04165 [pdf]

doi 10.1016/j.respol.2023.104911

Being Together in Place as a Catalyst for Scientific Advance

Authors: Eamon Duede, Misha Teplitskiy, Karim Lakhani, James Evans

Abstract: The COVID-19 pandemic necessitated social distancing at every level of society, including universities and research institutes, raising essential questions concerning the continuing importance of physical proximity for scientific and scholarly advance. Using customized author surveys about the intellectual influence of referenced work on scientists' own papers, combined with precise measures of geographical and semantic distance between focal and referenced works, we find that being at the same institution is strongly associated with intellectual influence on scientists' and scholars' published work. However, this influence increases with intellectual distance: the more different the referenced work done by colleagues at one's institution, the more influential it is on one's own. Universities worldwide constitute places where people doing very different work engage in sustained interactions through departments, committees, seminars, and communities. These interactions come to uniquely influence their published research, suggesting the need to replace rather than displace diverse engagements for sustainable advance.

Submitted 11 October, 2023; v1 submitted 8 July, 2021; originally announced July 2021.

arXiv:2101.02701 [pdf]

doi 10.1002/asi.24582

Does double-blind peer-review reduce bias? Evidence from a top computer science conference

Authors: Mengyi Sun, Jainabou Barry Danfa, Misha Teplitskiy

Abstract: Peer review is widely regarded as essential for advancing scientific research. However, reviewers may be biased by authors' prestige or other characteristics. Double-blind peer review, in which the authors' identities are masked from the reviewers, has been proposed as a way to reduce reviewer bias. Although intuitive, evidence for the effectiveness of double-blind peer review in reducing bias is limited and mixed. Here, we examine the effects of double-blind peer review on prestige bias by analyzing the peer review files of 5027 papers submitted to the International Conference on Learning Representations (ICLR), a top computer science conference that changed its reviewing policy from single-blind peer review to double-blind peer review in 2018. We find that after switching to double-blind review, the scores given to the most prestigious authors significantly decreased. However, because many of these papers were above the threshold for acceptance, the change did not affect paper acceptance decisions significantly. Nevertheless, we show that double-blind peer review may have improved the quality of the selections by limiting other (non-author-prestige) biases. Specifically, papers rejected in the single-blind format are cited more than those rejected under the double-blind format, suggesting that double-blind review better identifies poorer quality papers. Interestingly, an apparently unrelated change - the change of rating scale from 10 to 4 points - likely reduced prestige bias significantly, to an extent that affected papers' acceptance. These results provide some support for the effectiveness of double-blind review in reducing prestige bias, while opening new research directions on the impact of peer review formats.

Submitted 7 January, 2021; originally announced January 2021.

arXiv:2009.01896 [pdf, other]

Author Mentions in Science News Reveal Widespread Disparities Across Name-inferred Ethnicities

Authors: Hao Peng, Misha Teplitskiy, David Jurgens

Abstract: Media outlets play a key role in spreading scientific knowledge to the general public and raising the profile of researchers among their peers. Yet, how journalists choose to present researchers in their stories is poorly understood. Using a comprehensive dataset of 223,587 news stories from 288 U.S. outlets reporting on 100,486 research papers across all areas of science, we investigate if the authors' ethnicities, as inferred from names, are associated with whether journalists explicitly mention them by name. By focusing on research papers news outlets chose to cover, our analysis reduces concerns that differences in name mentions are driven by differences in research quality or newsworthiness. We find substantial disparities in name mention rates across ethnically-distinctive names. Researchers with non-Anglo names, especially those with East Asian and African names, are significantly less likely to be mentioned in news stories covering their research, even when comparing stories from a particular news outlet reporting on publications in a particular scientific venue on a particular research topic. The disparities are not fully explained by authors' affiliation locations, suggesting that pragmatic factors such as difficulties in scheduling interviews play only a partial role. Furthermore, among U.S.-based authors, journalists more often use authors' institutions instead of names when referring to non-Anglo-named authors, suggesting that journalists' rhetorical choices are also key. Overall, this study finds evidence of ethnic disparities in how researchers are described in the media coverage of their research, likely affecting thousands of non-Anglo-named scholars in our data alone.

Submitted 22 January, 2024; v1 submitted 3 September, 2020; originally announced September 2020.

Comments: 68 pages, 8 figures, 11 tables

arXiv:2002.10033 [pdf]

Status drives how we cite: Evidence from thousands of authors

Authors: Misha Teplitskiy, Eamon Duede, Michael Menietti, Karim R. Lakhani

Abstract: Researchers cite works for a variety of reasons, including some having nothing to do with acknowledging influence. The distribution of different citation types in the literature, and which papers attract which types, is poorly understood. We investigate high-influence and low-influence citations and the mechanisms producing them using 17,154 ground-truth citation types provided via survey by 9,380 authors systematically sampled across academic fields. Overall, 54% of citations denote little-to-no influence and these citations are concentrated among low status (lightly cited) papers. In contrast, high-influence citations are concentrated among high status (highly cited) papers through a number of steps that resemble a pipeline. Authors discover highly cited papers earlier in their projects, more often through social contacts, and read them more closely. Papers' status, above and beyond any quality differences, directly helps determine their pipeline: experimentally revealing or hiding citation counts during the survey shows that low counts cause lowered perceptions of quality. Accounting for citation types thus reveals a "double status effect": in addition to affecting how often a work is cited, status affects how meaningfully it is cited. Consequently, highly cited papers are even more influential than their raw citation counts suggest.

Submitted 31 August, 2020; v1 submitted 23 February, 2020; originally announced February 2020.

Comments: Heavily revised narrative in this version including new title, figures, abstract, and narrative structure. Central empirical findings unchanged

arXiv:1802.01270 [pdf]

doi 10.1016/j.respol.2018.06.014

The Social Structure of Consensus in Scientific Review

Authors: Misha Teplitskiy, Daniel Acuna, Aida Elamrani-Raoult, Konrad Kording, James Evans

Abstract: Personal connections between creators and evaluators of scientific works are ubiquitous, and the possibility of bias ever-present. Although connections have been shown to bias prospective judgments of (uncertain) future performance, it is unknown whether such biases occur in the much more concrete task of assessing the scientific validity of already completed work, and if so, why. This study presents evidence that personal connections between authors and reviewers of neuroscience manuscripts are associated with biased judgments and explores the mechanisms driving the effect. Using reviews from 7,981 neuroscience manuscripts submitted to the journal PLOS ONE, which instructs reviewers to evaluate manuscripts only on scientific validity, we find that reviewers favored authors close in the co-authorship network by ~0.11 points on a 1.0 - 4.0 scale for each step of proximity. PLOS ONE's validity-focused review and the substantial amount of favoritism shown by distant vs. very distant reviewers, both of whom should have little to gain from nepotism, point to the central role of substantive disagreements between scientists in different "schools of thought." The results suggest that removing bias from peer review cannot be accomplished simply by recusing the closely-connected reviewers, and highlight the value of recruiting reviewers embedded in diverse professional networks.

Submitted 5 February, 2018; originally announced February 2018.

Journal ref: Research Policy. 2018

arXiv:1712.06414 [pdf, other]

doi 10.1038/s41562-019-0541-6

The Wisdom of Polarized Crowds

Authors: Feng Shi, Misha Teplitskiy, Eamon Duede, James Evans

Abstract: As political polarization in the United States continues to rise, the question of whether polarized individuals can fruitfully cooperate becomes pressing. Although diversity of individual perspectives typically leads to superior team performance on complex tasks, strong political perspectives have been associated with conflict, misinformation and a reluctance to engage with people and perspectives beyond one's echo chamber. It is unclear whether self-selected teams of politically diverse individuals will create higher or lower quality outcomes. In this paper, we explore the effect of team political composition on performance through analysis of millions of edits to Wikipedia's Political, Social Issues, and Science articles. We measure editors' political alignments by their contributions to conservative versus liberal articles. A survey of editors validates that those who primarily edit liberal articles identify more strongly with the Democratic party and those who edit conservative ones with the Republican party. Our analysis then reveals that polarized teams (those consisting of a balanced set of politically diverse editors) create articles of higher quality than politically homogeneous teams. The effect appears most strongly in Wikipedia's Political articles, but is also observed in Social Issues and even Science articles. Analysis of article "talk pages" reveals that politically polarized teams engage in longer, more constructive, competitive, and substantively focused but linguistically diverse debates than political moderates. More intense use of Wikipedia policies by politically diverse teams suggests institutional design principles to help unleash the power of politically polarized teams.

Submitted 29 November, 2017; originally announced December 2017.

Journal ref: Nature Human Behaviour. 2019

arXiv:1506.07608 [pdf]

doi 10.1002/asi.23687

Amplifying the Impact of Open Access: Wikipedia and the Diffusion of Science

Authors: Misha Teplitskiy, Grace Lu, Eamon Duede

Abstract: With the rise of Wikipedia as a first-stop source for scientific knowledge, it is important to compare its representation of that knowledge to that of the academic literature. Here we identify the 250 most heavily used journals in each of 26 research fields (4,721 journals, 19.4M articles in total) indexed by the Scopus database, and test whether topic, academic status, and accessibility make articles from these journals more or less likely to be referenced on Wikipedia. We find that a journal's academic status (impact factor) and accessibility (open access policy) both strongly increase the probability of it being referenced on Wikipedia. Controlling for field and impact factor, the odds that an open access journal is referenced on the English Wikipedia are 47% higher compared to paywall journals. One of the implications of this study is that a major consequence of open access policies is to significantly amplify the diffusion of science, through an intermediary like Wikipedia, to a broad audience.

Submitted 3 June, 2016; v1 submitted 25 June, 2015; originally announced June 2015.

Comments: An earlier version of this paper was presented at the Wikipedia Workshop at the 9th International Conference on Web and Social Media (ICWSM), Oxford, UK. Forthcoming in: Journal of the Association for Information Science and Technology, 2015

Showing 1–11 of 11 results for author: Teplitskiy, M