Showing 1–10 of 10 results for author: Alvarez, R M

  1. arXiv:2405.07010  [pdf, other]

    cs.CY cs.CL

    Deciphering public attention to geoengineering and climate issues using machine learning and dynamic analysis

    Authors: Ramit Debnath, Pengyu Zhang, Tianzhu Qin, R. Michael Alvarez, Shaun D. Fitzgerald

    Abstract: As the conversation around using geoengineering to combat climate change intensifies, it is imperative to engage the public and deeply understand their perspectives on geoengineering research, development, and potential deployment. Through a comprehensive data-driven investigation, this paper explores the types of news that captivate public interest in geoengineering. We delved into 30,773 English…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: 46 pages, 6 main figures and SI

    ACM Class: J.4; K.4

  2. arXiv:2405.04716  [pdf, other]

    cs.CY cs.AI cs.LG cs.NE

    Physics-based deep learning reveals rising heating demand heightens air pollution in Norwegian cities

    Authors: Cong Cao, Ramit Debnath, R. Michael Alvarez

    Abstract: Policymakers frequently analyze air quality and climate change in isolation, disregarding their interactions. This study explores the influence of specific climate factors on air quality by contrasting a regression model with K-Means Clustering, Hierarchical Clustering, and Random Forest techniques. We employ Physics-based Deep Learning (PBDL) and Long Short-Term Memory (LSTM) to examine the air p…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 52 pages, 23 figures

    ACM Class: K.4.1; J.2; I.2

  3. arXiv:2404.08816  [pdf, other]

    cs.CL econ.EM

    Evaluating the Quality of Answers in Political Q&A Sessions with Large Language Models

    Authors: R. Michael Alvarez, Jacob Morrier

    Abstract: This paper presents a new approach to evaluating the quality of answers in political question-and-answer sessions. We propose to measure an answer's quality based on the degree to which it allows us to infer the initial question accurately. This conception of answer quality inherently reflects their relevance to initial questions. Drawing parallels with semantic search, we argue that this measurem…

    Submitted 12 April, 2024; originally announced April 2024.

  4. arXiv:2302.07371  [pdf, other]

    cs.CL cs.CY

    BiasTestGPT: Using ChatGPT for Social Bias Testing of Language Models

    Authors: Rafal Kocielnik, Shrimai Prabhumoye, Vivian Zhang, Roy Jiang, R. Michael Alvarez, Anima Anandkumar

    Abstract: Pretrained Language Models (PLMs) harbor inherent social biases that can result in harmful real-world implications. Such social biases are measured through the probability values that PLMs output for different social groups and attributes appearing in a set of test sentences. However, bias testing is currently cumbersome since the test sentences are generated either from a limited set of manual te…

    Submitted 6 December, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    MSC Class: 68T50

    ACM Class: I.2.7; J.5; K.4.1

  5. arXiv:2211.11798  [pdf, other]

    cs.CL cs.AI

    Can You Label Less by Using Out-of-Domain Data? Active & Transfer Learning with Few-shot Instructions

    Authors: Rafal Kocielnik, Sara Kangaslahti, Shrimai Prabhumoye, Meena Hari, R. Michael Alvarez, Anima Anandkumar

    Abstract: Labeling social-media data for custom dimensions of toxicity and social bias is challenging and labor-intensive. Existing transfer and active learning approaches meant to reduce annotation effort require fine-tuning, which suffers from over-fitting to noise and can cause domain shift with small sample sizes. In this work, we propose a novel Active Transfer Few-shot Instructions (ATF) approach whic…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted to NeurIPS Workshop on Transfer Learning for Natural Language Processing, 2022, New Orleans

  6. arXiv:2203.02818  [pdf, other]

    cs.LG stat.ML

    Fuzzy Forests For Feature Selection in High-Dimensional Survey Data: An Application to the 2020 U.S. Presidential Election

    Authors: Sreemanti Dey, R. Michael Alvarez

    Abstract: An increasingly common methodological issue in the field of social science is high-dimensional and highly correlated datasets that are unamenable to the traditional deductive framework of study. Analysis of candidate choice in the 2020 Presidential Election is one area in which this issue presents itself: in order to test the many theories explaining the outcome of the election, it is necessary to…

    Submitted 5 March, 2022; originally announced March 2022.

    Comments: Paper presented at The 3rd International Conference on Applied Machine Learning and Data Analytics, December 16-17 2021, where it was named the Best Paper of the conference

  7. arXiv:2102.12596  [pdf, other]

    cs.SI cs.LG

    Dynamic Social Media Monitoring for Fast-Evolving Online Discussions

    Authors: Maya Srikanth, Anqi Liu, Nicholas Adams-Cohen, Jian Cao, R. Michael Alvarez, Anima Anandkumar

    Abstract: Tracking and collecting fast-evolving online discussions provides vast data for studying social media usage and its role in people's public lives. However, collecting social media data using a static set of keywords fails to satisfy the growing need to monitor dynamic conversations and to study fast-changing topics. We propose a dynamic keyword search method to maximize the coverage of relevant in…

    Submitted 24 February, 2021; originally announced February 2021.

    Comments: Preprint, Under Review

  8. arXiv:2006.09693  [pdf, other]

    stat.ML cs.LG

    FREEtree: A Tree-based Approach for High Dimensional Longitudinal Data With Correlated Features

    Authors: Yuancheng Xu, Athanasse Zafirov, R. Michael Alvarez, Dan Kojis, Min Tan, Christina M. Ramirez

    Abstract: This paper proposes FREEtree, a tree-based method for high dimensional longitudinal data with correlated features. Popular machine learning approaches, like Random Forests, commonly used for variable selection do not perform well when there are correlated features and do not account for data observed over time. FREEtree deals with longitudinal data by using a piecewise random effects model. It als…

    Submitted 17 June, 2020; originally announced June 2020.

  9. arXiv:2005.02442  [pdf, other]

    cs.CY

    Reliable and Efficient Long-Term Social Media Monitoring

    Authors: Jian Cao, Nicholas Adams-Cohen, R. Michael Alvarez

    Abstract: Social media data is now widely used by many academic researchers. However, long-term social media data collection projects, which most typically involve collecting data from public-use APIs, often encounter issues when relying on local-area network servers (LANs) to collect high-volume streaming social media data over long periods of time. In this technical report, we present a cloud-based data c…

    Submitted 16 November, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

  10. arXiv:1911.05332  [pdf, other]

    cs.LG cs.CY cs.SI stat.ML

    Finding Social Media Trolls: Dynamic Keyword Selection Methods for Rapidly-Evolving Online Debates

    Authors: Anqi Liu, Maya Srikanth, Nicholas Adams-Cohen, R. Michael Alvarez, Anima Anandkumar

    Abstract: Online harassment is a significant social problem. Prevention of online harassment requires rapid detection of harassing, offensive, and negative social media posts. In this paper, we propose the use of word embedding models to identify offensive and harassing social media messages in two aspects: detecting fast-changing topics for more effective data collection and representing word semantics in…

    Submitted 15 November, 2019; v1 submitted 13 November, 2019; originally announced November 2019.

    Comments: AI for Social Good workshop at NeurIPS (2019)