
Showing 1–3 of 3 results for author: Wolken, S

  1. arXiv:2406.17633  [pdf, other]

    cs.CL cs.LG

    Knowledge Distillation in Automated Annotation: Supervised Text Classification with LLM-Generated Training Labels

    Authors: Nicholas Pangakis, Samuel Wolken

    Abstract: Computational social science (CSS) practitioners often rely on human-labeled data to fine-tune supervised text classifiers. We assess the potential for researchers to augment or replace human-generated training data with surrogate training labels from generative large language models (LLMs). We introduce a recommended workflow and test this LLM application by replicating 14 classification tasks an…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: In Proceedings of the Sixth Workshop on Natural Language Processing and Computational Social Science

  2. arXiv:2310.18863  [pdf, other]

    cs.IR cs.CY physics.soc-ph

    The diminishing state of shared reality on US television news

    Authors: Homa Hosseinmardi, Samuel Wolken, David M. Rothschild, Duncan J. Watts

    Abstract: The potential for a large, diverse population to coexist peacefully is thought to depend on the existence of a "shared reality": a public sphere in which participants are exposed to similar facts about similar topics. A generation ago, broadcast television news was widely considered to serve this function; however, since the rise of cable news in the 1990s, critics and scholars have worried that…

    Submitted 28 October, 2023; originally announced October 2023.

  3. arXiv:2306.00176  [pdf, other]

    cs.CL cs.AI

    Automated Annotation with Generative AI Requires Validation

    Authors: Nicholas Pangakis, Samuel Wolken, Neil Fasching

    Abstract: Generative large language models (LLMs) can be a powerful tool for augmenting text annotation procedures, but their performance varies across annotation tasks due to prompt quality, text data idiosyncrasies, and conceptual difficulty. Because these challenges will persist even as LLM technology improves, we argue that any automated annotation process using an LLM must validate the LLM's performanc…

    Submitted 31 May, 2023; originally announced June 2023.