Showing 1–13 of 13 results for author: Vincent, N

Searching in archive cs.
  1. arXiv:2405.14614  [pdf, ps, other]

    cs.CY cs.ET cs.IR

    Push and Pull: A Framework for Measuring Attentional Agency

    Authors: Zachary Wojtowicz, Shrey Jain, Nicholas Vincent

    Abstract: We propose a framework for measuring attentional agency - the ability to allocate one's attention according to personal desires, goals, and intentions - on digital platforms. Platforms extend people's limited powers of attention by extrapolating their preferences to large collections of previously unconsidered informational objects. However, platforms typically also allow people to influence one a…

    Submitted 23 May, 2024; originally announced May 2024.

  2. A Canary in the AI Coal Mine: American Jews May Be Disproportionately Harmed by Intellectual Property Dispossession in Large Language Model Training

    Authors: Heila Precel, Allison McDonald, Brent Hecht, Nicholas Vincent

    Abstract: Systemic property dispossession from minority groups has often been carried out in the name of technological progress. In this paper, we identify evidence that the current paradigm of large language models (LLMs) likely continues this long history. Examining common LLM training datasets, we find that a disproportionate amount of content authored by Jewish Americans is used for training without the…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Preprint, to appear in CHI 2024 proceedings

  3. arXiv:2311.11350  [pdf, ps, other]

    cs.CY

    An Alternative to Regulation: The Case for Public AI

    Authors: Nicholas Vincent, David Bau, Sarah Schwettmann, Joshua Tan

    Abstract: Can governments build AI? In this paper, we describe an ongoing effort to develop ``public AI'' -- publicly accessible AI models funded, provisioned, and governed by governments or other public bodies. Public AI presents both an alternative and a complement to standard regulatory approaches to AI, but it also suggests new technical and policy challenges. We present a roadmap for how the ML researc…

    Submitted 19 November, 2023; originally announced November 2023.

    Comments: To be presented at Regulatable ML @ NeurIPS2023 workshop

  4. arXiv:2310.04329  [pdf, other]

    cs.HC

    Pika: Empowering Non-Programmers to Author Executable Governance Policies in Online Communities

    Authors: Leijie Wang, Nicholas Vincent, Julija Rukanskaitė, Amy X. Zhang

    Abstract: Internet users have formed a wide array of online communities with nuanced and diverse community goals and norms. However, most online platforms only offer a limited set of governance models in their software infrastructure and leave little room for customization. Consequently, technical proficiency becomes a prerequisite for online communities to build governance policies in code, excluding non-p…

    Submitted 27 February, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: Conditionally accepted by CHI'2024

  5. The Dimensions of Data Labor: A Road Map for Researchers, Activists, and Policymakers to Empower Data Producers

    Authors: Hanlin Li, Nicholas Vincent, Stevie Chancellor, Brent Hecht

    Abstract: Many recent technological advances (e.g. ChatGPT and search engines) are possible only because of massive amounts of user-generated data produced through user interactions with computing systems or scraped from the web (e.g. behavior logs, user-generated content, and artwork). However, data producers have little say in what data is captured, how it is used, or who it benefits. Organizations with t…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: To appear at the 2023 ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT)

  6. arXiv:2303.16302  [pdf]

    cs.SI

    Retracted Articles about COVID-19 Vaccines Enable Vaccine Misinformation on Twitter

    Authors: Rod Abhari, Esteban Villa-Turek, Nicholas Vincent, Henry Dambanemuya, Emőke-Ágnes Horvát

    Abstract: Retracted scientific articles about COVID-19 vaccines have proliferated false claims about vaccination harms and discouraged vaccine acceptance. Our study analyzed the topical content of 4,876 English-language tweets about retracted COVID-19 vaccine research and found that 27.4% of tweets contained retraction-related misinformation. Misinformed tweets either ignored the retraction, or less commonl…

    Submitted 28 March, 2023; originally announced March 2023.

  7. arXiv:2203.04228  [pdf, other]

    cs.SI

    Online Engagement with Retracted Articles: Who, When, and How?

    Authors: Henry K. Dambanemuya, Rod Abhari, Nicholas Vincent, Emőke-Ágnes Horvát

    Abstract: Retracted research discussed on social media can spread misinformation. Yet we lack an understanding of how retracted articles are mentioned by academic and non-academic users. This is especially relevant on Twitter due to the platform's prominent role in science communication. Here, we analyze the pre- and post-retraction differences in Twitter attention and engagement metrics for over 3,800 retr…

    Submitted 29 January, 2024; v1 submitted 8 March, 2022; originally announced March 2022.

    ACM Class: K.4.0

  8. arXiv:2105.05241  [pdf, ps, other]

    cs.CL cs.CY cs.LG

    Addressing "Documentation Debt" in Machine Learning Research: A Retrospective Datasheet for BookCorpus

    Authors: Jack Bandy, Nicholas Vincent

    Abstract: Recent literature has underscored the importance of dataset documentation work for machine learning, and part of this work involves addressing "documentation debt" for datasets that have been used widely but documented sparsely. This paper aims to help address documentation debt for BookCorpus, a popular text dataset for training large language models. Notably, researchers have used BookCorpus to…

    Submitted 11 May, 2021; originally announced May 2021.

    Comments: Working paper

  9. arXiv:2012.09995  [pdf, other]

    cs.CY

    Data Leverage: A Framework for Empowering the Public in its Relationship with Technology Companies

    Authors: Nicholas Vincent, Hanlin Li, Nicole Tilly, Stevie Chancellor, Brent Hecht

    Abstract: Many powerful computing technologies rely on implicit and explicit data contributions from the public. This dependency suggests a potential source of leverage for the public in its relationship with technology companies: by reducing, stopping, redirecting, or otherwise manipulating data contributions, the public can reduce the effectiveness of many lucrative technologies. In this paper, we synthes…

    Submitted 17 February, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

    Comments: This is a preprint. The paper will be presented at the 2021 Conference on Fairness, Accountability, and Transparency (FAccT 2021)

  10. Behavioral Use Licensing for Responsible AI

    Authors: Danish Contractor, Daniel McDuff, Julia Haines, Jenny Lee, Christopher Hines, Brent Hecht, Nicholas Vincent, Hanlin Li

    Abstract: With the growing reliance on artificial intelligence (AI) for many different applications, the sharing of code, data, and models is important to ensure the replicability and democratization of scientific knowledge. Many high-profile academic publishing venues expect code and models to be submitted and released with papers. Furthermore, developers often want to release these assets to encourage dev…

    Submitted 20 October, 2022; v1 submitted 4 November, 2020; originally announced November 2020.

    Comments: Paper published at ACM FAccT 2022

  11. arXiv:2004.10265  [pdf]

    cs.CY cs.IR

    A Deeper Investigation of the Importance of Wikipedia Links to the Success of Search Engines

    Authors: Nicholas Vincent, Brent Hecht

    Abstract: A growing body of work has highlighted the important role that Wikipedia's volunteer-created content plays in helping search engines achieve their core goal of addressing the information needs of millions of people. In this paper, we report the results of an investigation into the incidence of Wikipedia links in search engine results pages (SERPs). Our results extend prior work by considering thre…

    Submitted 21 April, 2020; originally announced April 2020.

    Comments: This is a pre-print of a paper accepted to the non-archival track of the WikiWorkshop at the Web Conference 2020

  12. arXiv:1912.00757  [pdf]

    cs.CY

    Mapping the Potential and Pitfalls of "Data Dividends" as a Means of Sharing the Profits of Artificial Intelligence

    Authors: Nicholas Vincent, Yichun Li, Renee Zha, Brent Hecht

    Abstract: Identifying strategies to more broadly distribute the economic winnings of AI technologies is a growing priority in HCI and other fields. One idea gaining prominence centers on "data dividends", or sharing the profits of AI technologies with the people who generated the data on which these technologies rely. Despite the rapidly growing discussion around data dividends - including backing by promin…

    Submitted 18 November, 2019; originally announced December 2019.

    Comments: This is a working draft. It has not been peer-reviewed and is intended for internal discussion in the computing community

  13. arXiv:1906.08576  [pdf]

    cs.CY

    Measuring the Importance of User-Generated Content to Search Engines

    Authors: Nicholas Vincent, Isaac Johnson, Patrick Sheehan, Brent Hecht

    Abstract: Search engines are some of the most popular and profitable intelligent technologies in existence. Recent research, however, has suggested that search engines may be surprisingly dependent on user-created content like Wikipedia articles to address user information needs. In this paper, we perform a rigorous audit of the extent to which Google leverages Wikipedia and other user-generated content to…

    Submitted 20 June, 2019; originally announced June 2019.

    Comments: This version includes a bibliography entry that was missing from the first version of the text due to a processing error. This is a preprint of a paper accepted at ICWSM 2019. Please cite that version instead