Showing 1–13 of 13 results for author: Schrouff, J

  1. arXiv:2406.17433

    cs.LG

    Mind the Graph When Balancing Data for Fairness or Robustness

    Authors: Jessica Schrouff, Alexis Bellot, Amal Rannen-Triki, Alan Malek, Isabela Albuquerque, Arthur Gretton, Alexander D'Amour, Silvia Chiappa

    Abstract: Failures of fairness or robustness in machine learning predictive settings can be due to undesired dependencies between covariates, outcomes and auxiliary factors of variation. A common strategy to mitigate these failures is data balancing, which attempts to remove those undesired dependencies. In this work, we define conditions on the training distribution for data balancing to lead to fair or ro…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.04824

    cs.LG stat.ML

    FunBO: Discovering Acquisition Functions for Bayesian Optimization with FunSearch

    Authors: Virginia Aglietti, Ira Ktena, Jessica Schrouff, Eleni Sgouritsa, Francisco J. R. Ruiz, Alan Malek, Alexis Bellot, Silvia Chiappa

    Abstract: The sample efficiency of Bayesian optimization algorithms depends on carefully crafted acquisition functions (AFs) guiding the sequential collection of function evaluations. The best-performing AF can vary significantly across optimization problems, often requiring ad-hoc and problem-specific choices. This work tackles the challenge of designing novel AFs that perform well across a variety of expe…

    Submitted 1 July, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:2212.11254

    stat.ML cs.AI cs.LG

    Adapting to Latent Subgroup Shifts via Concepts and Proxies

    Authors: Ibrahim Alabdulmohsin, Nicole Chiou, Alexander D'Amour, Arthur Gretton, Sanmi Koyejo, Matt J. Kusner, Stephen R. Pfohl, Olawale Salaudeen, Jessica Schrouff, Katherine Tsai

    Abstract: We address the problem of unsupervised domain adaptation when the source domain differs from the target domain because of a shift in the distribution of a latent subgroup. When this subgroup confounds all observed data, neither covariate shift nor label shift assumptions apply. We show that the optimal target predictor can be non-parametrically identified with the help of concept and proxy variabl…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: Authors listed in alphabetical order

  4. Detecting Shortcut Learning for Fair Medical AI using Shortcut Testing

    Authors: Alexander Brown, Nenad Tomasev, Jan Freyberg, Yuan Liu, Alan Karthikesalingam, Jessica Schrouff

    Abstract: Machine learning (ML) holds great promise for improving healthcare, but it is critical to ensure that its use will not propagate or amplify health disparities. An important step is to characterize the (un)fairness of ML models - their tendency to perform differently across subgroups of the population - and to understand its underlying mechanisms. One potential driver of algorithmic unfairness, sho…

    Submitted 16 June, 2023; v1 submitted 21 July, 2022; originally announced July 2022.

  5. arXiv:2205.15860

    cs.LG

    A Reduction to Binary Approach for Debiasing Multiclass Datasets

    Authors: Ibrahim Alabdulmohsin, Jessica Schrouff, Oluwasanmi Koyejo

    Abstract: We propose a novel reduction-to-binary (R2B) approach that enforces demographic parity for multiclass classification with non-binary sensitive attributes via a reduction to a sequence of binary debiasing tasks. We prove that R2B satisfies optimality and bias guarantees and demonstrate empirically that it can lead to an improvement over two baselines: (1) treating multiclass problems as multi-label…

    Submitted 10 October, 2022; v1 submitted 31 May, 2022; originally announced May 2022.

    Comments: 18 pages, 5 figures

    ACM Class: I.2.6; I.2.10

    Journal ref: In Neural Information Processing Systems (NeurIPS), 2022

  6. arXiv:2204.03969

    cs.LG

    Disability prediction in multiple sclerosis using performance outcome measures and demographic data

    Authors: Subhrajit Roy, Diana Mincu, Lev Proleev, Negar Rostamzadeh, Chintan Ghate, Natalie Harris, Christina Chen, Jessica Schrouff, Nenad Tomasev, Fletcher Lee Hartsell, Katherine Heller

    Abstract: Literature on machine learning for multiple sclerosis has primarily focused on the use of neuroimaging data such as magnetic resonance imaging and clinical laboratory tests for disease identification. However, studies have shown that these modalities are not consistent with disease activity such as symptoms or disease progression. Furthermore, the cost of collecting data from these modalities is h…

    Submitted 8 April, 2022; originally announced April 2022.

  7. arXiv:2202.13028

    cs.AI cs.HC

    Healthsheet: Development of a Transparency Artifact for Health Datasets

    Authors: Negar Rostamzadeh, Diana Mincu, Subhrajit Roy, Andrew Smart, Lauren Wilcox, Mahima Pushkarna, Jessica Schrouff, Razvan Amironesei, Nyalleng Moorosi, Katherine Heller

    Abstract: Machine learning (ML) approaches have demonstrated promising results in a wide range of healthcare applications. Data plays a crucial role in developing ML-based healthcare systems that directly affect people's lives. Many of the ethical issues surrounding the use of ML in healthcare stem from structural inequalities underlying the way we collect, use, and handle data. Developing guidelines to imp…

    Submitted 25 February, 2022; originally announced February 2022.

  8. arXiv:2202.01034

    cs.LG cs.CY stat.ML

    Diagnosing failures of fairness transfer across distribution shift in real-world medical settings

    Authors: Jessica Schrouff, Natalie Harris, Oluwasanmi Koyejo, Ibrahim Alabdulmohsin, Eva Schnider, Krista Opsahl-Ong, Alex Brown, Subhrajit Roy, Diana Mincu, Christina Chen, Awa Dieng, Yuan Liu, Vivek Natarajan, Alan Karthikesalingam, Katherine Heller, Silvia Chiappa, Alexander D'Amour

    Abstract: Diagnosing and mitigating changes in model fairness under distribution shift is an important component of the safe deployment of machine learning in healthcare settings. Importantly, the success of any mitigation strategy strongly depends on the structure of the shift. Despite this, there has been little discussion of how to empirically assess the structure of a distribution shift that one is enco…

    Submitted 10 February, 2023; v1 submitted 2 February, 2022; originally announced February 2022.

    Journal ref: Advances in Neural Information Processing Systems 35 (NeurIPS 2022)

  9. arXiv:2106.08641

    cs.LG

    Best of both worlds: local and global explanations with human-understandable concepts

    Authors: Jessica Schrouff, Sebastien Baur, Shaobo Hou, Diana Mincu, Eric Loreaux, Ralph Blanes, James Wexler, Alan Karthikesalingam, Been Kim

    Abstract: Interpretability techniques aim to provide the rationale behind a model's decision, typically by explaining either an individual prediction (local explanation, e.g. 'why is this patient diagnosed with this condition') or a class of predictions (global explanation, e.g. 'why is this set of patients diagnosed with this condition in general'). While there are many methods focused on either one, few f…

    Submitted 31 January, 2022; v1 submitted 16 June, 2021; originally announced June 2021.

  10. Concept-based model explanations for Electronic Health Records

    Authors: Diana Mincu, Eric Loreaux, Shaobo Hou, Sebastien Baur, Ivan Protsyuk, Martin G Seneviratne, Anne Mottram, Nenad Tomasev, Alan Karthikesalingam, Jessica Schrouff

    Abstract: Recurrent Neural Networks (RNNs) are often used for sequential modeling of adverse outcomes in electronic health records (EHRs) due to their ability to encode past clinical states. These deep, recurrent architectures have displayed increased performance compared to other modeling approaches in a number of tasks, fueling the interest in deploying deep models in clinical settings. One of the key ele…

    Submitted 8 March, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

    Journal ref: CHIL '21: Proceedings of the Conference on Health, Inference, and Learning, 2021

  11. arXiv:2011.03395

    cs.LG stat.ML

    Underspecification Presents Challenges for Credibility in Modern Machine Learning

    Authors: Alexander D'Amour, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, Jonathan Deaton, Jacob Eisenstein, Matthew D. Hoffman, Farhad Hormozdiari, Neil Houlsby, Shaobo Hou, Ghassen Jerfel, Alan Karthikesalingam, Mario Lucic, Yian Ma, Cory McLean, Diana Mincu, Akinori Mitani, Andrea Montanari, Zachary Nado, Vivek Natarajan, Christopher Nielson, Thomas F. Osborne, et al. (15 additional authors not shown)

    Abstract: ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these failures. An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain. Underspecification is common in modern ML pipelines, such as those based on deep learning. Predict…

    Submitted 24 November, 2020; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: Updates: Updated statistical analysis in Section 6; Additional citations

  12. arXiv:1905.06707

    cs.LG cs.PL cs.SE stat.ML

    Inferring Javascript types using Graph Neural Networks

    Authors: Jessica Schrouff, Kai Wohlfahrt, Bruno Marnette, Liam Atkinson

    Abstract: The recent use of 'Big Code' with state-of-the-art deep learning methods offers promising avenues to ease program source code writing and correction. As a first step towards automatic code repair, we implemented a graph neural network model that predicts token types for Javascript programs. The predictions achieve an accuracy above 90%, which improves on previous similar work.

    Submitted 16 May, 2019; originally announced May 2019.

    Comments: Published at the Representation Learning on Graphs and Manifolds ICLR 2019 workshop (https://rlgm.github.io/papers/)

  13. Interpreting weight maps in terms of cognitive or clinical neuroscience: nonsense?

    Authors: Jessica Schrouff, Janaina Mourao-Miranda

    Abstract: Since machine learning models have been applied to neuroimaging data, researchers have drawn conclusions from the derived weight maps. In particular, weight maps of classifiers between two conditions are often described as a proxy for the underlying signal differences between the conditions. Recent studies have however suggested that such weight maps could not reliably recover the source of the ne…

    Submitted 30 April, 2018; originally announced April 2018.

    Comments: conference article

    Journal ref: 2018 International Workshop on Pattern Recognition in Neuroimaging (PRNI), Singapore, Singapore, 2018, pp. 1-4