Showing 1–10 of 10 results for author: Aguirre, C

  1. arXiv:2311.08472  [pdf, other]

    cs.CL

    Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models

    Authors: Carlos Aguirre, Kuleen Sasse, Isabel Cachola, Mark Dredze

    Abstract: Recently, work in NLP has shifted to few-shot (in-context) learning, with large language models (LLMs) performing well across a range of tasks. However, while fairness evaluations have become a standard for supervised methods, little is known about the fairness of LLMs as prediction systems. Further, common standard methods for fairness involve access to models weights or are applied during finetu…

    Submitted 14 November, 2023; originally announced November 2023.

  2. arXiv:2305.12671  [pdf, other]

    cs.LG cs.CY

    Transferring Fairness using Multi-Task Learning with Limited Demographic Information

    Authors: Carlos Aguirre, Mark Dredze

    Abstract: Training supervised machine learning systems with a fairness loss can improve prediction fairness across different demographic groups. However, doing so requires demographic annotations for training data, without which we cannot produce debiased classifiers for most tasks. Drawing inspiration from transfer learning methods, we investigate whether we can utilize demographic data from a related task…

    Submitted 15 April, 2024; v1 submitted 21 May, 2023; originally announced May 2023.

  3. arXiv:2211.07932  [pdf, other]

    cs.CL

    Using Open-Ended Stressor Responses to Predict Depressive Symptoms across Demographics

    Authors: Carlos Aguirre, Mark Dredze, Philip Resnik

    Abstract: Stressors are related to depression, but this relationship is complex. We investigate the relationship between open-ended text responses about stressors and depressive symptoms across gender and racial/ethnic groups. First, we use topic models and other NLP tools to find thematic and vocabulary differences when reporting stressors across demographic groups. We train language models using self-repo…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 6 pages

  4. arXiv:2103.10550  [pdf, other]

    cs.CL

    Gender and Racial Fairness in Depression Research using Social Media

    Authors: Carlos Aguirre, Keith Harrigian, Mark Dredze

    Abstract: Multiple studies have demonstrated that behavior on internet-based social media platforms can be indicative of an individual's mental health status. The widespread availability of such data has spurred interest in mental health research from a computational lens. While previous research has raised concerns about possible biases in models produced from this data, no study has quantified how these b…

    Submitted 18 March, 2021; originally announced March 2021.

    Comments: Accepted to EACL 2021

  5. arXiv:2011.05233  [pdf, other]

    cs.CL

    On the State of Social Media Data for Mental Health Research

    Authors: Keith Harrigian, Carlos Aguirre, Mark Dredze

    Abstract: Data-driven methods for mental health treatment and surveillance have become a major focus in computational science research in the last decade. However, progress in the domain, in terms of both medical understanding and system performance, remains bounded by the availability of adequate data. Prior systematic reviews have not necessarily made it possible to measure the degree to which data-relate…

    Submitted 25 April, 2021; v1 submitted 10 November, 2020; originally announced November 2020.

    Comments: Originally submitted to ICWSM in January 2020. v1 updated November 2020. v2 updated April 2021, to appear at CLPsych 2021. Supplementary material at https://github.com/kharrigian/mental-health-datasets

  6. arXiv:2002.00994  [pdf, other]

    astro-ph.IM cs.LG

    Scalable End-to-end Recurrent Neural Network for Variable star classification

    Authors: Ignacio Becker, Karim Pichara, Márcio Catelan, Pavlos Protopapas, Carlos Aguirre, Fatemeh Nikzat

    Abstract: During the last decade, considerable effort has been made to perform automatic classification of variable stars using machine learning techniques. Traditionally, light curves are represented as a vector of descriptors or features used as input for many algorithms. Some features are computationally expensive, cannot be updated quickly and hence for large datasets such as the LSST cannot be applied.…

    Submitted 3 February, 2020; originally announced February 2020.

    Comments: 15 pages, 17 figures. To be published in MNRAS

  7. arXiv:1912.07747  [pdf]

    cs.IR cs.CL cs.LG

    Pipelines for Procedural Information Extraction from Scientific Literature: Towards Recipes using Machine Learning and Data Science

    Authors: Huichen Yang, Carlos A. Aguirre, Maria F. De La Torre, Derek Christensen, Luis Bobadilla, Emily Davich, Jordan Roth, Lei Luo, Yihong Theis, Alice Lam, T. Yong-** Han, David Buttler, William H. Hsu

    Abstract: This paper describes a machine learning and data science pipeline for structured information extraction from documents, implemented as a suite of open-source tools and extensions to existing tools. It centers around a methodology for extracting procedural information in the form of recipes, stepwise procedures for creating an artifact (in this case synthesizing a nanomaterial), from published scie…

    Submitted 16 December, 2019; originally announced December 2019.

    Comments: 15th International Conference on Document Analysis and Recognition Workshops (ICDARW 2019)

    Report number: 2019-1

    MSC Class: I.2.7; I.2.6; H.3.3; H.3.4; I.2.10; I.5.4

    ACM Class: I.2.7; I.2.6; H.3.3; H.3.4; I.2.10; I.5.4

  8. arXiv:1907.07768  [pdf, other]

    cs.IR cs.CR cs.LG cs.SI stat.ML

    A Novel Approach for Detection and Ranking of Trendy and Emerging Cyber Threat Events in Twitter Streams

    Authors: Avishek Bose, Vahid Behzadan, Carlos Aguirre, William H. Hsu

    Abstract: We present a new machine learning and text information extraction approach to detection of cyber threat events in Twitter that are novel (previously non-extant) and developing (marked by significance with respect to similarity with a previously detected event). While some existing approaches to event detection measure novelty and trendiness, typically as independent criteria and occasionally as a…

    Submitted 12 July, 2019; originally announced July 2019.

    Comments: 9 pages, 3 figures, and 5 tables

  9. arXiv:1810.09440  [pdf, other]

    astro-ph.IM cs.LG stat.ML

    Deep multi-survey classification of variable stars

    Authors: Carlos Aguirre, Karim Pichara, Ignacio Becker

    Abstract: During the last decade, a considerable amount of effort has been made to classify variable stars using different machine learning techniques. Typically, light curves are represented as vectors of statistical descriptors or features that are used to train various algorithms. These features demand big computational powers that can last from hours to days, making impossible to create scalable and eff…

    Submitted 21 October, 2018; originally announced October 2018.

    Comments: Accepted for publication in Monthly Notices of the Royal Astronomical Society

  10. arXiv:1211.5986  [pdf, ps, other]

    physics.data-an cs.IR math.NA

    Signal recognition and adapted filtering by non-commutative tomography

    Authors: Carlos Aguirre, R. Vilela Mendes

    Abstract: Tomograms, a generalization of the Radon transform to arbitrary pairs of non-commuting operators, are positive bilinear transforms with a rigorous probabilistic interpretation which provide a full characterization of the signal and are robust in the presence of noise. Tomograms based on the time-frequency operator pair, were used in the past for component separation and denoising. Here we show how…

    Submitted 26 November, 2012; originally announced November 2012.

    Comments: 19 pages, 7 figures. arXiv admin note: substantial text overlap with arXiv:1107.0929

    Journal ref: IET Signal Processing 8 (2014) 67 - 75