-
Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews
Authors:
Weixin Liang,
Zachary Izzo,
Yaohui Zhang,
Haley Lepp,
Hancheng Cao,
Xuandong Zhao,
Lingjiao Chen,
Haotian Ye,
Sheng Liu,
Zhi Huang,
Daniel A. McFarland,
James Y. Zou
Abstract:
We present an approach for estimating the fraction of text in a large corpus which is likely to be substantially modified or produced by a large language model (LLM). Our maximum likelihood model leverages expert-written and AI-generated reference texts to accurately and efficiently examine real-world LLM-use at the corpus level. We apply this approach to a case study of scientific peer review in AI conferences that took place after the release of ChatGPT: ICLR 2024, NeurIPS 2023, CoRL 2023 and EMNLP 2023. Our results suggest that between 6.5% and 16.9% of text submitted as peer reviews to these conferences could have been substantially modified by LLMs, i.e. beyond spell-checking or minor writing updates. The circumstances in which generated text occurs offer insight into user behavior: the estimated fraction of LLM-generated text is higher in reviews which report lower confidence, were submitted close to the deadline, and from reviewers who are less likely to respond to author rebuttals. We also observe corpus-level trends in generated text which may be too subtle to detect at the individual level, and discuss the implications of such trends on peer review. We call for future interdisciplinary work to examine how LLM use is changing our information and knowledge practices.
Submitted 15 June, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
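The corpus-level maximum likelihood idea in the abstract above can be illustrated with a minimal sketch. It assumes each document already has a probability under a human-written reference distribution and under an AI-generated one, and models the corpus as the mixture (1 - alpha) * P + alpha * Q. The names `estimate_alpha`, `P_HUMAN`, and `P_AI`, the three-symbol toy feature, and the ternary search are all illustrative assumptions, not the authors' implementation.

```python
import math
import random


def log_likelihood(alpha, p_human, p_ai):
    """Log-likelihood of the mixture (1 - alpha) * P + alpha * Q over all documents."""
    return sum(math.log((1.0 - alpha) * p + alpha * q) for p, q in zip(p_human, p_ai))


def estimate_alpha(p_human, p_ai, tol=1e-6):
    """Maximise the concave mixture log-likelihood over alpha in [0, 1] by ternary search."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, p_human, p_ai) < log_likelihood(m2, p_human, p_ai):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


# Toy reference distributions over a 3-symbol feature (assumed, for illustration only).
P_HUMAN = [0.7, 0.2, 0.1]
P_AI = [0.2, 0.3, 0.5]
TRUE_ALPHA = 0.15

# Simulate a corpus in which a fraction TRUE_ALPHA of documents comes from the AI distribution.
rng = random.Random(0)
docs = []
for _ in range(20000):
    source = P_AI if rng.random() < TRUE_ALPHA else P_HUMAN
    docs.append(rng.choices(range(3), weights=source)[0])

alpha_hat = estimate_alpha([P_HUMAN[x] for x in docs], [P_AI[x] for x in docs])
print(f"estimated alpha = {alpha_hat:.3f}")
```

The estimate is a property of the whole corpus: no individual document is labelled as AI-written, which matches the abstract's point that corpus-level trends can be visible even when individual cases are too subtle to detect.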
-
The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices
Authors:
Hancheng Cao,
Jesse Dodge,
Kyle Lo,
Daniel A. McFarland,
Lucy Lu Wang
Abstract:
In recent years, funding agencies and journals have increasingly advocated for open science practices (e.g. data and method sharing) to improve the transparency, access, and reproducibility of science. However, quantifying these practices at scale has proven difficult. In this work, we leverage a large-scale dataset of 1.1M papers from arXiv that are representative of the fields of physics, math, and computer science to analyze the adoption of data and method link-sharing practices over time and their impact on article reception. To identify links to data and methods, we train a neural text classification model to automatically classify URL types based on contextual mentions in papers. We find evidence that the practice of link-sharing to methods and data is spreading as more papers include such URLs over time. Reproducibility efforts may also be spreading because the same links are being increasingly reused across papers (especially in computer science), and these links are increasingly concentrated within fewer web domains (e.g. GitHub) over time. Lastly, articles that share data and method links receive increased recognition in terms of citation count, with a stronger effect when the shared links are active (rather than defunct). Together, these findings demonstrate the increased spread and perceived value of data and method sharing practices in open science.
Submitted 4 October, 2023;
originally announced October 2023.
-
Can large language models provide useful feedback on research papers? A large-scale empirical analysis
Authors:
Weixin Liang,
Yuhui Zhang,
Hancheng Cao,
Binglu Wang,
Daisy Ding,
Xinyu Yang,
Kailas Vodrahalli,
Siyu He,
Daniel Smith,
Yian Yin,
Daniel McFarland,
James Zou
Abstract:
Expert feedback lays the foundation of rigorous research. However, the rapid growth of scholarly production and intricate knowledge specialization challenge the conventional scientific feedback mechanisms. High-quality peer reviews are increasingly difficult to obtain. Researchers who are more junior or from under-resourced settings have an especially hard time getting timely feedback. With the breakthrough of large language models (LLMs) such as GPT-4, there is growing interest in using LLMs to generate scientific feedback on research manuscripts. However, the utility of LLM-generated feedback has not been systematically studied. To address this gap, we created an automated pipeline using GPT-4 to provide comments on the full PDFs of scientific papers. We evaluated the quality of GPT-4's feedback through two large-scale studies. We first quantitatively compared GPT-4's generated feedback with human peer reviewer feedback in 15 Nature family journals (3,096 papers in total) and the ICLR machine learning conference (1,709 papers). The overlap in the points raised by GPT-4 and by human reviewers (average overlap 30.85% for Nature journals, 39.23% for ICLR) is comparable to the overlap between two human reviewers (average overlap 28.58% for Nature journals, 35.25% for ICLR). The overlap between GPT-4 and human reviewers is larger for the weaker papers. We then conducted a prospective user study with 308 researchers from 110 US institutions in the field of AI and computational biology to understand how researchers perceive feedback generated by our GPT-4 system on their own papers. Overall, more than half (57.4%) of the users found GPT-4 generated feedback helpful/very helpful and 82.4% found it more beneficial than feedback from at least some human reviewers. While our findings show that LLM-generated feedback can help researchers, we also identify several limitations.
Submitted 3 October, 2023;
originally announced October 2023.
-
Breaking Out of the Ivory Tower: A Large-scale Analysis of Patent Citations to HCI Research
Authors:
Hancheng Cao,
Yujie Lu,
Yuting Deng,
Daniel A. McFarland,
Michael S. Bernstein
Abstract:
What is the impact of human-computer interaction research on industry? While it is impossible to track all research impact pathways, the growing literature on translational research impact measurement offers patent citations as one measure of how industry recognizes and draws on research in its inventions. In this paper, we perform a large-scale measurement study primarily of 70,000 patent citations to premier HCI research venues, tracing how HCI research is cited in United States patents over the last 30 years. We observe that 20.1% of papers from these venues, including 60-80% of papers at UIST and 13% of papers in a broader dataset of SIGCHI-sponsored venues overall, are cited by patents, far greater than premier venues in science overall (9.7%) and NLP (11%). However, the time lag between a patent and its paper citations is long (10.5 years) and getting longer, suggesting that HCI research and practice may not be efficiently connected.
Submitted 31 January, 2023;
originally announced January 2023.
-
A Bayesian actor-oriented multilevel relational event model with hypothesis testing procedures
Authors:
Fabio Vieira,
Roger Leenders,
Daniel McFarland,
Joris Mulder
Abstract:
Relational event network data are becoming increasingly available. Consequently, statistical models for such data have also surfaced. These models mainly focus on the analysis of single networks, while in many applications, multiple independent event sequences are observed, which are likely to display similar social interaction dynamics. Furthermore, statistical methods for testing hypotheses about social interaction behavior are underdeveloped. Therefore, the contribution of the current paper is twofold. First, we present a multilevel extension of the dynamic actor-oriented model, which allows researchers to model sender and receiver processes separately. The multilevel formulation enables principled probabilistic borrowing of information across networks to accurately estimate drivers of social dynamics. Second, a flexible methodology is proposed to test hypotheses about common and heterogeneous social interaction drivers across relational event sequences. Social interaction data between children and teachers in classrooms are used to showcase the methodology.
Submitted 7 June, 2023; v1 submitted 22 April, 2022;
originally announced April 2022.
-
Will This Idea Spread Beyond Academia? Understanding Knowledge Transfer of Scientific Concepts across Text Corpora
Authors:
Hancheng Cao,
Mengjie Cheng,
Zhepeng Cen,
Daniel A. McFarland,
Xiang Ren
Abstract:
What kind of basic research ideas are more likely to get applied in practice? There is a long line of research investigating patterns of knowledge transfer, but it generally focuses on documents as the unit of analysis and follows their transfer into practice for a specific scientific domain. Here we study translational research at the level of scientific concepts for all scientific fields. We do this through text mining and predictive modeling using three corpora: 38.6 million paper abstracts, 4 million patent documents, and 0.28 million clinical trials. We extract scientific concepts (i.e., phrases) from corpora as instantiations of "research ideas", create concept-level features as motivated by literature, and then follow the trajectories of over 450,000 new concepts (which emerged between 1995 and 2014) to identify factors that lead only a small proportion of these ideas to be used in inventions and drug trials. Results from our analysis suggest several mechanisms that distinguish which scientific concepts will be adopted in practice, and which will not. We also demonstrate that our derived features can be used to explain and predict knowledge transfer with high accuracy. Our work provides greater understanding of knowledge transfer for researchers, practitioners, and government agencies interested in encouraging translational research.
Submitted 13 October, 2020;
originally announced October 2020.
-
A Preliminary Investigation in the Molecular Basis of Host Shutoff Mechanism in SARS-CoV
Authors:
Niharika Pandala,
Casey A. Cole,
Devaun McFarland,
Anita Nag,
Homayoun Valafar
Abstract:
Recent events leading to the worldwide pandemic of COVID-19 have demonstrated the effective use of genomic sequencing technologies to establish the genetic sequence of this virus. In contrast, the COVID-19 pandemic has demonstrated the absence of computational approaches to understand the molecular basis of this infection rapidly. Here we present an integrated approach to the study of the nsp1 protein in SARS-CoV-1, which plays an essential role in maintaining the expression of viral proteins and further disabling the host protein expression, also known as the host shutoff mechanism. We present three independent methods of evaluating two potential binding sites speculated to participate in host shutoff by nsp1. We have combined results from computed models of nsp1, with deep mining of all existing protein structures (using PDBMine), and binding site recognition (using msTALI) to examine the two sites consisting of residues 55-59 and 73-80. Based on our preliminary results, we conclude that the residues 73-80 appear as the regions that facilitate the critical initial steps in the function of nsp1. Given the 90% sequence identity between nsp1 from SARS-CoV-1 and SARS-CoV-2, we conjecture the same critical initiation step in the function of COVID-19 nsp1.
Submitted 23 July, 2020;
originally announced July 2020.
-
Assessing the Precision and Recall of msTALI as Applied to an Active-Site Study on Fold Families
Authors:
Devaun McFarland,
Homayoun Valafar
Abstract:
Proteins execute various activities required by biological cells. Further, they structurally support and promote important biochemical reactions which functionally are sparked by active-sites. Active-sites are regions where reactions and binding events take place directly; they foster protein purpose. Describing functional relationships depends on factors that incorporate sequence, structure, and the biochemical properties of amino acids that form proteins. Our approach to active-site description is computational, and many other approaches utilizing available protein data fall short of ideal. Successful recognition of functional interactions is crucial to advancements in protein annotation and the bioinformatics field at large. This research outlines our Multiple Structure Torsion Angle Alignment (msTALI) as a suitable strategy for addressing active-site identification by comparing results to other existing methods. Specifically, we address the precision of msTALI across three protein families. Our target proteins are PDBIDs 1A2B, 1B4V, 1B8S, 1COY, 1CXZ, 3COX, 1D7E, 1DPF, 1F9I, 1FTN, 1IJH, 1KOU, 1NWZ, 2PHY, and 1SIC.
Submitted 7 May, 2020;
originally announced May 2020.
-
Mapping Three Decades of Intellectual Change in Academia
Authors:
Daniel Ramage,
Christopher D. Manning,
Daniel A. McFarland
Abstract:
Research on the development of science has focused on the creation of multidisciplinary teams. However, while this coming together of people is symmetrical, the ideas, methods, and vocabulary of science have a directional flow. We present a statistical model of the text of dissertation abstracts from 1980 to 2010, revealing for the first time the large-scale flow of language across fields. Results of the analysis include identifying methodological fields that export broadly, emerging topical fields that borrow heavily and expand, and old topical fields that grow insular and retract. Particular findings show a growing split between molecular and ecological forms of biology and a sea change in the humanities and social sciences driven by the rise of gender and ethnic studies.
Submitted 18 June, 2020; v1 submitted 2 April, 2020;
originally announced April 2020.
-
The Diversity-Innovation Paradox in Science
Authors:
Bas Hofstra,
Vivek V. Kulkarni,
Sebastian Munoz-Najar Galvez,
Bryan He,
Dan Jurafsky,
Daniel A. McFarland
Abstract:
Prior work finds a diversity paradox: diversity breeds innovation, and yet, underrepresented groups that diversify organizations have less successful careers within them. Does the diversity paradox hold for scientists as well? We study this by utilizing a near-population of ~1.2 million US doctoral recipients from 1977-2015 and following their careers into publishing and faculty positions. We use text analysis and machine learning to answer a series of questions: How do we detect scientific innovations? Are underrepresented groups more likely to generate scientific innovations? And are the innovations of underrepresented groups adopted and rewarded? Our analyses show that underrepresented groups produce higher rates of scientific novelty. However, their novel contributions are devalued and discounted: e.g., novel contributions by gender and racial minorities are taken up by other scholars at lower rates than novel contributions by gender and racial majorities, and equally impactful contributions of gender and racial minorities are less likely to result in successful scientific careers than for majority groups. These results suggest there may be unwarranted reproduction of stratification in academic careers that discounts diversity's role in innovation and partly explains the underrepresentation of some groups in academia.
Submitted 15 January, 2020; v1 submitted 4 September, 2019;
originally announced September 2019.
-
Citation Classification for Behavioral Analysis of a Scientific Field
Authors:
David Jurgens,
Srijan Kumar,
Raine Hoover,
Dan McFarland,
Dan Jurafsky
Abstract:
Citations are an important indicator of the state of a scientific field, reflecting how authors frame their work, and influencing uptake by future scholars. However, our understanding of citation behavior has been limited to small-scale manual citation analysis. We perform the largest behavioral study of citations to date, analyzing how citations are both framed and taken up by scholars in one entire field: natural language processing. We introduce a new dataset of nearly 2,000 citations annotated for function and centrality, and use it to develop a state-of-the-art classifier and label the entire ACL Reference Corpus. We then study how citations are framed by authors and use both papers and online traces to track how citations are followed by readers. We demonstrate that authors are sensitive to discourse structure and publication venue when citing, that online readers follow temporal links to previous and future work rather than methodological links, and that how a paper cites related work is predictive of its citation count. Finally, we use changes in citation roles to show that the field of NLP is undergoing a significant increase in consensus.
Submitted 1 September, 2016;
originally announced September 2016.
-
A modified ziggurat algorithm for generating exponentially- and normally-distributed pseudorandom numbers
Authors:
Christopher D McFarland
Abstract:
The Ziggurat Algorithm is a very fast rejection sampling method for generating PseudoRandom Numbers (PRNs) from common statistical distributions. The algorithm divides a distribution into rectangular layers that stack on top of each other (resembling a Ziggurat), subsuming the desired distribution. Random values within these rectangular layers are then sampled by rejection. This implementation splits layers into two types: those constituting the majority that fall completely under the distribution and can be sampled extremely fast without a rejection test, and a few additional layers that encapsulate the fringe of the distribution and require a rejection test. This method offers speedups of 65% for exponentially- and 82% for normally-distributed PRNs when compared to the best available C implementations of these generators. Even greater speedups are obtained when the algorithm is extended to the Python and MATLAB/OCTAVE programming environments.
Submitted 21 April, 2014; v1 submitted 26 March, 2014;
originally announced March 2014.
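To make the layered rejection idea in the abstract above concrete, here is a minimal sketch of a textbook-style ziggurat sampler for the unit exponential distribution. It is not the modified algorithm the paper proposes: the layer count, the bisection used to find the base-layer edge, and the names `build_tables` and `sample_exponential` are illustrative assumptions. Each layer has equal area; a draw that lands left of the next layer's edge is accepted without a rejection test, and only the fringe of each layer needs a comparison against the density.

```python
import math
import random


def build_tables(n_layers=64):
    """Return (xs, ys, r): layer edges for f(x) = exp(-x) split into equal-area layers."""

    def top_edge(r):
        # Height reached by the topmost layer if the base layer ends at x = r.
        v = (r + 1.0) * math.exp(-r)  # base rectangle area plus tail area
        x, y = r, math.exp(-r)
        for layer in range(1, n_layers):
            y += v / x
            if layer == n_layers - 1:
                return y
            if y >= 1.0:
                return math.inf  # stack overshoots f(0) = 1 too early: r is too small
            x = -math.log(y)
        return y

    lo, hi = 0.01, 50.0  # top_edge(lo) > 1 and top_edge(hi) < 1
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if top_edge(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    r = hi
    v = (r + 1.0) * math.exp(-r)
    xs = [v / math.exp(-r), r]  # xs[0] is the virtual width of the base layer
    ys = [0.0, math.exp(-r)]
    for i in range(1, n_layers):
        ys.append(ys[i] + v / xs[i])
        xs.append(-math.log(min(ys[-1], 1.0)))
    xs[-1], ys[-1] = 0.0, 1.0  # close the stack exactly at the mode
    return xs, ys, r


def sample_exponential(tables, rng):
    xs, ys, r = tables
    n_layers = len(xs) - 1
    while True:
        i = rng.randrange(n_layers)
        x = rng.random() * xs[i]
        if i == 0:
            if x < r:
                return x  # inside the base rectangle
            return r - math.log(1.0 - rng.random())  # tail, using memorylessness
        if x < xs[i + 1]:
            return x  # fully under the curve: no rejection test needed
        y = ys[i] + rng.random() * (ys[i + 1] - ys[i])
        if y < math.exp(-x):
            return x  # fringe of the layer: accepted by the rejection test


tables = build_tables()
rng = random.Random(1)
samples = [sample_exponential(tables, rng) for _ in range(50000)]
mean = sum(samples) / len(samples)
print(f"sample mean = {mean:.3f}")
```

With more layers the fringe region shrinks, so a larger share of draws takes the fast path; the sample mean printed at the end should sit close to 1, the mean of the unit exponential.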
-
Citing for High Impact
Authors:
Xiaolin Shi,
Jure Leskovec,
Daniel A. McFarland
Abstract:
The question of citation behavior has always intrigued scientists from various disciplines. While general citation patterns have been widely studied in the literature, we develop the notion of citation projection graphs by investigating the citations among the publications that a given paper cites. We investigate how patterns of citations vary between various scientific disciplines and how such patterns reflect the scientific impact of the paper. We find that idiosyncratic citation patterns are characteristic of low impact papers, while narrow, discipline-focused citation patterns are common for medium impact papers. Our results show that crossing-community, or bridging, citation patterns are high risk and high reward, since such patterns are characteristic of both low and high impact papers. Last, we observe that citation networks have recently been trending toward more bridging and interdisciplinary forms.
Submitted 20 April, 2010;
originally announced April 2010.
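The citation projection graph described in the abstract above, the citations among the publications that a given paper cites, can be sketched as an induced subgraph. The dictionary representation, the function name `citation_projection`, and the toy paper identifiers are assumptions for illustration only and do not come from the paper.

```python
def citation_projection(cites, paper):
    """Return the citation graph induced on the references of `paper`.

    `cites` maps each paper id to the set of paper ids it cites.
    """
    refs = cites.get(paper, set())
    # Keep only edges whose source and target are both cited by `paper`.
    return {ref: cites.get(ref, set()) & refs for ref in refs}


# Toy citation data (hypothetical identifiers).
cites = {
    "P": {"A", "B", "C"},
    "A": {"B", "X"},
    "B": {"C"},
    "C": set(),
}
projection = citation_projection(cites, "P")
print(projection)
```

In this toy example the edge from A to X is dropped because P does not cite X, so the projection contains only the links A to B and B to C.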