Showing 1–13 of 13 results for author: Hofman, J

  1. arXiv:2311.18807  [pdf, other]

    cs.LG stat.ME

    Pre-registration for Predictive Modeling

    Authors: Jake M. Hofman, Angelos Chatzimparmpas, Amit Sharma, Duncan J. Watts, Jessica Hullman

    Abstract: Amid rising concerns of reproducibility and generalizability in predictive modeling, we explore the possibility and potential benefits of introducing pre-registration to the field. Despite notable advancements in predictive modeling, spanning core machine learning tasks to various scientific applications, challenges such as overlooked contextual factors, data-dependent decision-making, and uninten…

    Submitted 30 November, 2023; originally announced November 2023.

  2. arXiv:2308.16491  [pdf, other]

    cs.CY cs.AI cs.LG cs.SI

    In-class Data Analysis Replications: Teaching Students while Testing Science

    Authors: Kristina Gligoric, Tiziano Piccardi, Jake Hofman, Robert West

    Abstract: Science is facing a reproducibility crisis. Previous work has proposed incorporating data analysis replications into classrooms as a potential solution. However, despite the potential benefits, it is unclear whether this approach is feasible, and if so, what the involved stakeholders (students, educators, and scientists) should expect from it. Can students perform a data analysis replication over th…

    Submitted 31 August, 2023; originally announced August 2023.

  3. arXiv:2308.07832  [pdf, ps, other]

    cs.LG cs.AI stat.ME

    REFORMS: Reporting Standards for Machine Learning Based Science

    Authors: Sayash Kapoor, Emily Cantrell, Kenny Peng, Thanh Hien Pham, Christopher A. Bail, Odd Erik Gundersen, Jake M. Hofman, Jessica Hullman, Michael A. Lones, Momin M. Malik, Priyanka Nanayakkara, Russell A. Poldrack, Inioluwa Deborah Raji, Michael Roberts, Matthew J. Salganik, Marta Serra-Garcia, Brandon M. Stewart, Gilles Vandewiele, Arvind Narayanan

    Abstract: Machine learning (ML) methods are proliferating in scientific research. However, the adoption of these methods has been accompanied by failures of validity, reproducibility, and generalizability. These failures can hinder scientific progress, lead to false consensus around invalid claims, and undermine the credibility of ML-based science. ML methods are often applied and fail in similar ways acros…

    Submitted 19 September, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

  4. arXiv:2308.01535  [pdf]

    cs.HC cs.CL cs.CY

    Comparing scalable strategies for generating numerical perspectives

    Authors: Hancheng Cao, Sofia Eleni Spatharioti, Daniel G. Goldstein, Jake M. Hofman

    Abstract: Numerical perspectives help people understand extreme and unfamiliar numbers (e.g., $330 billion is about $1,000 per person in the United States). While research shows perspectives to be helpful, generating them at scale is challenging both because it is difficult to identify what makes some analogies more helpful than others, and because what is most helpful can vary based on the context in whi…

    Submitted 3 August, 2023; originally announced August 2023.

  5. arXiv:2307.03744  [pdf, other]

    cs.HC

    Comparing Traditional and LLM-based Search for Consumer Choice: A Randomized Experiment

    Authors: Sofia Eleni Spatharioti, David M. Rothschild, Daniel G. Goldstein, Jake M. Hofman

    Abstract: Recent advances in the development of large language models are rapidly changing how online applications function. LLM-based search tools, for instance, offer a natural language interface that can accommodate complex queries and provide detailed, direct responses. At the same time, there have been concerns about the veracity of the information provided by LLM-based tools due to potential mistakes…

    Submitted 8 November, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

  6. Multi-Target Multiplicity: Flexibility and Fairness in Target Specification under Resource Constraints

    Authors: Jamelle Watson-Daniels, Solon Barocas, Jake M. Hofman, Alexandra Chouldechova

    Abstract: Prediction models have been widely adopted as the basis for decision-making in domains as diverse as employment, education, lending, and health. Yet, few real world problems readily present themselves as precisely formulated prediction tasks. In particular, there are often many reasonable target variable options. Prior work has argued that this is an important and sometimes underappreciated choice…

    Submitted 23 June, 2023; originally announced June 2023.

    Comments: Conference on Fairness, Accountability, and Transparency (FAccT '23), June 12-15, 2023, Chicago, IL, USA

    Journal ref: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, June 2023, Pages 297-311

  7. arXiv:2005.04343  [pdf, other]

    cs.CY cs.CR cs.HC cs.LG

    How good is good enough for COVID19 apps? The influence of benefits, accuracy, and privacy on willingness to adopt

    Authors: Gabriel Kaptchuk, Daniel G. Goldstein, Eszter Hargittai, Jake Hofman, Elissa M. Redmiles

    Abstract: A growing number of contact tracing apps are being developed to complement manual contact tracing. A key question is whether users will be willing to adopt these contact tracing apps. In this work, we survey over 4,500 Americans to evaluate (1) the effect of both accuracy and privacy concerns on reported willingness to install COVID19 contact tracing apps and (2) how different groups of users weig…

    Submitted 18 May, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

  8. arXiv:1802.07810  [pdf, other]

    cs.AI cs.CY

    Manipulating and Measuring Model Interpretability

    Authors: Forough Poursabzi-Sangdeh, Daniel G. Goldstein, Jake M. Hofman, Jennifer Wortman Vaughan, Hanna Wallach

    Abstract: With machine learning models being increasingly used to aid decision making even in high-stakes domains, there has been a growing interest in developing interpretable models. Although many supposedly interpretable models have been proposed, there have been relatively few experimental studies investigating whether these models achieve their intended effects, such as making people more closely follo…

    Submitted 15 August, 2021; v1 submitted 21 February, 2018; originally announced February 2018.

    ACM Class: I.2

  9. arXiv:1611.09414  [pdf, other]

    stat.ME cs.AI stat.AP

    Split-door criterion: Identification of causal effects through auxiliary outcomes

    Authors: Amit Sharma, Jake M. Hofman, Duncan J. Watts

    Abstract: We present a method for estimating causal effects in time series data when fine-grained information about the outcome of interest is available. Specifically, we examine what we call the split-door setting, where the outcome variable can be split into two parts: one that is potentially affected by the cause being studied and another that is independent of it, with both parts sharing the same (unobs…

    Submitted 14 June, 2018; v1 submitted 28 November, 2016; originally announced November 2016.

    ACM Class: H.2.8; I.2.6; H.3.3

  10. arXiv:1602.01013  [pdf, other]

    cs.SI physics.data-an physics.soc-ph

    Exploring limits to prediction in complex social systems

    Authors: Travis Martin, Jake M. Hofman, Amit Sharma, Ashton Anderson, Duncan J. Watts

    Abstract: How predictable is success in complex social systems? In spite of a recent profusion of prediction studies that exploit online social and information network data, this question remains unanswered, in part because it has not been adequately specified. In this paper we attempt to clarify the question by presenting a simple stylized model of success that attributes prediction error to one of two gen…

    Submitted 2 February, 2016; originally announced February 2016.

    Comments: 12 pages, 6 figures, Proceedings of the 25th ACM International World Wide Web Conference (WWW) 2016

  11. Estimating the Causal Impact of Recommendation Systems from Observational Data

    Authors: Amit Sharma, Jake M. Hofman, Duncan J. Watts

    Abstract: Recommendation systems are an increasingly prominent part of the web, accounting for up to a third of all traffic on several of the world's most popular sites. Nevertheless, little is known about how much activity such systems actually cause over and above activity that would have occurred via other means (e.g., search) if recommendations were absent. Although the ideal way to estimate the causal…

    Submitted 19 October, 2015; originally announced October 2015.

    Comments: ACM Conference on Economics and Computation (EC 2015)

    ACM Class: J.4

  12. arXiv:1311.1704  [pdf, other]

    cs.IR cs.AI cs.LG stat.ML

    Scalable Recommendation with Poisson Factorization

    Authors: Prem Gopalan, Jake M. Hofman, David M. Blei

    Abstract: We develop a Bayesian Poisson matrix factorization model for forming recommendations from sparse user behavior data. These data are large user/item matrices where each user has provided feedback on only a small subset of items, either explicitly (e.g., through star ratings) or implicitly (e.g., through views or purchases). In contrast to traditional matrix factorization approaches, Poisson factori…

    Submitted 20 May, 2014; v1 submitted 7 November, 2013; originally announced November 2013.

  13. arXiv:0905.0106  [pdf, ps, other]

    physics.soc-ph cs.CY physics.data-an

    Characterizing Individual Communication Patterns

    Authors: R. Dean Malmgren, Jake M. Hofman, Luis A. N. Amaral, Duncan J. Watts

    Abstract: The increasing availability of electronic communication data, such as that arising from e-mail exchange, presents social and information scientists with new possibilities for characterizing individual behavior and, by extension, identifying latent structure in human populations. Here, we propose a model of individual e-mail communication that is sufficiently rich to capture meaningful variabilit…

    Submitted 1 May, 2009; originally announced May 2009.

    Comments: 9 pages, 6 figures, to appear in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'09), June 28-July 1, Paris, France