Showing 1–13 of 13 results for author: McFarland, D

Searching in archive cs.
  1. arXiv:2403.07183 [pdf, other]

    cs.CL cs.AI cs.LG cs.SI

    Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews

    Authors: Weixin Liang, Zachary Izzo, Yaohui Zhang, Haley Lepp, Hancheng Cao, Xuandong Zhao, Lingjiao Chen, Haotian Ye, Sheng Liu, Zhi Huang, Daniel A. McFarland, James Y. Zou

    Abstract: We present an approach for estimating the fraction of text in a large corpus which is likely to be substantially modified or produced by a large language model (LLM). Our maximum likelihood model leverages expert-written and AI-generated reference texts to accurately and efficiently examine real-world LLM-use at the corpus level. We apply this approach to a case study of scientific peer review in…

    Submitted 15 June, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: 46 pages, 31 figures, ICML '24

    ACM Class: I.2.7

  2. arXiv:2310.03193 [pdf]

    cs.DL cs.CL cs.CY physics.hist-ph physics.soc-ph

    The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices

    Authors: Hancheng Cao, Jesse Dodge, Kyle Lo, Daniel A. McFarland, Lucy Lu Wang

    Abstract: In recent years, funding agencies and journals increasingly advocate for open science practices (e.g. data and method sharing) to improve the transparency, access, and reproducibility of science. However, quantifying these practices at scale has proven difficult. In this work, we leverage a large-scale dataset of 1.1M papers from arXiv that are representative of the fields of physics, math, and co…

    Submitted 4 October, 2023; originally announced October 2023.

  3. arXiv:2310.01783 [pdf, other]

    cs.LG cs.AI cs.CL cs.HC

    Can large language models provide useful feedback on research papers? A large-scale empirical analysis

    Authors: Weixin Liang, Yuhui Zhang, Hancheng Cao, Binglu Wang, Daisy Ding, Xinyu Yang, Kailas Vodrahalli, Siyu He, Daniel Smith, Yian Yin, Daniel McFarland, James Zou

    Abstract: Expert feedback lays the foundation of rigorous research. However, the rapid growth of scholarly production and intricate knowledge specialization challenge the conventional scientific feedback mechanisms. High-quality peer reviews are increasingly difficult to obtain. Researchers who are more junior or from under-resourced settings have especially hard times getting timely feedback. With the brea…

    Submitted 3 October, 2023; originally announced October 2023.

  4. arXiv:2301.13431 [pdf, other]

    cs.HC cs.CY cs.DL

    Breaking Out of the Ivory Tower: A Large-scale Analysis of Patent Citations to HCI Research

    Authors: Hancheng Cao, Yujie Lu, Yuting Deng, Daniel A. McFarland, Michael S. Bernstein

    Abstract: What is the impact of human-computer interaction research on industry? While it is impossible to track all research impact pathways, the growing literature on translational research impact measurement offers patent citations as one measure of how industry recognizes and draws on research in its inventions. In this paper, we perform a large-scale measurement study primarily of 70,000 patent citatio…

    Submitted 31 January, 2023; originally announced January 2023.

    Comments: accepted to CHI 2023

  5. arXiv:2204.10676 [pdf, other]

    stat.ME cs.SI

    A Bayesian actor-oriented multilevel relational event model with hypothesis testing procedures

    Authors: Fabio Vieira, Roger Leenders, Daniel McFarland, Joris Mulder

    Abstract: Relational event network data are becoming increasingly available. Consequently, statistical models for such data have also surfaced. These models mainly focus on the analysis of single networks, while in many applications, multiple independent event sequences are observed, which are likely to display similar social interaction dynamics. Furthermore, statistical methods for testing hypotheses abou…

    Submitted 7 June, 2023; v1 submitted 22 April, 2022; originally announced April 2022.

  6. arXiv:2010.06657 [pdf, other]

    cs.CY cs.CL cs.DL

    Will This Idea Spread Beyond Academia? Understanding Knowledge Transfer of Scientific Concepts across Text Corpora

    Authors: Hancheng Cao, Mengjie Cheng, Zhepeng Cen, Daniel A. McFarland, Xiang Ren

    Abstract: What kind of basic research ideas are more likely to get applied in practice? There is a long line of research investigating patterns of knowledge transfer, but it generally focuses on documents as the unit of analysis and follow their transfer into practice for a specific scientific domain. Here we study translational research at the level of scientific concepts for all scientific fields. We do t…

    Submitted 13 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020 Findings

  7. arXiv:2007.13469 [pdf]

    q-bio.BM cs.CE q-bio.QM

    A Preliminary Investigation in the Molecular Basis of Host Shutoff Mechanism in SARS-CoV

    Authors: Niharika Pandala, Casey A. Cole, Devaun McFarland, Anita Nag, Homayoun Valafar

    Abstract: Recent events leading to the worldwide pandemic of COVID-19 have demonstrated the effective use of genomic sequencing technologies to establish the genetic sequence of this virus. In contrast, the COVID-19 pandemic has demonstrated the absence of computational approaches to understand the molecular basis of this infection rapidly. Here we present an integrated approach to the study of the nsp1 pro…

    Submitted 23 July, 2020; originally announced July 2020.

    Comments: Consists of 9 pages, 8 figures and 7 tables. 11th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics 2020

  8. arXiv:2005.10739 [pdf]

    q-bio.QM cs.CE

    Assessing the Precision and Recall of msTALI as Applied to an Active-Site Study on Fold Families

    Authors: Devaun McFarland, Homayoun Valafar

    Abstract: Proteins execute various activities required by biological cells. Further, they structurally support and promote important biochemical reactions which functionally are sparked by active-sites. Active-sites are regions where reactions and binding events take place directly; they foster protein purpose. Describing functional relationships depends on factors that incorporate sequence, structure, a…

    Submitted 7 May, 2020; originally announced May 2020.

    Comments: 8 pages, 3 figures, 5 tables. This is an extended version of a similar abridged or short version used in conference

  9. arXiv:2004.01291 [pdf]

    cs.DL stat.AP

    Mapping Three Decades of Intellectual Change in Academia

    Authors: Daniel Ramage, Christopher D. Manning, Daniel A. McFarland

    Abstract: Research on the development of science has focused on the creation of multidisciplinary teams. However, while this coming together of people is symmetrical, the ideas, methods, and vocabulary of science have a directional flow. We present a statistical model of the text of dissertation abstracts from 1980 to 2010, revealing for the first time the large-scale flow of language across fields. Results…

    Submitted 18 June, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

    Comments: 10 pages and 6 figures plus appendix of 5 pages and 1 figure

  10. arXiv:1909.02063 [pdf]

    cs.SI cs.CL stat.AP stat.ML

    The Diversity-Innovation Paradox in Science

    Authors: Bas Hofstra, Vivek V. Kulkarni, Sebastian Munoz-Najar Galvez, Bryan He, Dan Jurafsky, Daniel A. McFarland

    Abstract: Prior work finds a diversity paradox: diversity breeds innovation, and yet, underrepresented groups that diversify organizations have less successful careers within them. Does the diversity paradox hold for scientists as well? We study this by utilizing a near-population of ~1.2 million US doctoral recipients from 1977-2015 and following their careers into publishing and faculty positions. We use…

    Submitted 15 January, 2020; v1 submitted 4 September, 2019; originally announced September 2019.

    Comments: Updated paper; tightened up terminology, added better theoretical explanation, tested for a mechanism in the updated paper, added robustness analyses, updated and improved metrics across the board

  11. arXiv:1609.00435 [pdf, other]

    cs.CL cs.DL

    Citation Classification for Behavioral Analysis of a Scientific Field

    Authors: David Jurgens, Srijan Kumar, Raine Hoover, Dan McFarland, Dan Jurafsky

    Abstract: Citations are an important indicator of the state of a scientific field, reflecting how authors frame their work, and influencing uptake by future scholars. However, our understanding of citation behavior has been limited to small-scale manual citation analysis. We perform the largest behavioral study of citations to date, analyzing how citations are both framed and taken up by scholars in one e…

    Submitted 1 September, 2016; originally announced September 2016.

  12. arXiv:1403.6870 [pdf, other]

    cs.MS

    A modified ziggurat algorithm for generating exponentially- and normally-distributed pseudorandom numbers

    Authors: Christopher D McFarland

    Abstract: The Ziggurat Algorithm is a very fast rejection sampling method for generating PseudoRandom Numbers (PRNs) from common statistical distributions. The algorithm divides a distribution into rectangular layers that stack on top of each other (resembling a Ziggurat), subsuming the desired distribution. Random values within these rectangular layers are then sampled by rejection. This implementation spl…

    Submitted 21 April, 2014; v1 submitted 26 March, 2014; originally announced March 2014.

  13. arXiv:1004.3351 [pdf, other]

    cs.DL

    Citing for High Impact

    Authors: Xiaolin Shi, Jure Leskovec, Daniel A. McFarland

    Abstract: The question of citation behavior has always intrigued scientists from various disciplines. While general citation patterns have been widely studied in the literature we develop the notion of citation projection graphs by investigating the citations among the publications that a given paper cites. We investigate how patterns of citations vary between various scientific disciplines and how such pat…

    Submitted 20 April, 2010; originally announced April 2010.

    Comments: 10 pages, 6 figures, 1 table

    ACM Class: H.3.7; H.4.0