Skip to main content

Showing 1–13 of 13 results for author: Hoffman, S C

.
  1. arXiv:2405.04912  [pdf, other

    q-bio.BM cs.LG physics.chem-ph

    GP-MoLFormer: A Foundation Model For Molecular Generation

    Authors: Jerret Ross, Brian Belgodere, Samuel C. Hoffman, Vijil Chenthamarakshan, Youssef Mroueh, Payel Das

    Abstract: Transformer-based models trained on large and general purpose datasets consisting of molecular strings have recently emerged as a powerful tool for successfully modeling various structure-property relations. Inspired by this success, we extend the paradigm of training chemical language transformers on large-scale chemical datasets to generative tasks in this work. Specifically, we propose GP-MoLFo… ▽ More

    Submitted 4 April, 2024; originally announced May 2024.

  2. arXiv:2302.09190  [pdf, other

    cs.LG cs.CY

    Function Composition in Trustworthy Machine Learning: Implementation Choices, Insights, and Questions

    Authors: Manish Nagireddy, Moninder Singh, Samuel C. Hoffman, Evaline Ju, Karthikeyan Natesan Ramamurthy, Kush R. Varshney

    Abstract: Ensuring trustworthiness in machine learning (ML) models is a multi-dimensional task. In addition to the traditional notion of predictive performance, other notions such as privacy, fairness, robustness to distribution shift, adversarial robustness, interpretability, explainability, and uncertainty quantification are important considerations to evaluate and improve (if deficient). However, these s… ▽ More

    Submitted 17 February, 2023; originally announced February 2023.

  3. arXiv:2210.05594  [pdf, other

    cs.LG cs.CY

    Navigating Ensemble Configurations for Algorithmic Fairness

    Authors: Michael Feffer, Martin Hirzel, Samuel C. Hoffman, Kiran Kate, Parikshit Ram, Avraham Shinnar

    Abstract: Bias mitigators can improve algorithmic fairness in machine learning models, but their effect on fairness is often not stable across data splits. A popular approach to train more stable models is ensemble learning, but unfortunately, it is unclear how to combine ensembles with mitigators to best navigate trade-offs between fairness and predictive performance. To that end, we built an open-source l… ▽ More

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: arXiv admin note: text overlap with arXiv:2202.00751

  4. arXiv:2207.07174  [pdf, other

    cs.LG stat.ML

    Causal Graphs Underlying Generative Models: Path to Learning with Limited Data

    Authors: Samuel C. Hoffman, Kahini Wadhawan, Payel Das, Prasanna Sattigeri, Karthikeyan Shanmugam

    Abstract: Training generative models that capture rich semantics of the data and interpreting the latent representations encoded by such models are very important problems in unsupervised learning. In this work, we provide a simple algorithm that relies on perturbation experiments on latent codes of a pre-trained generative autoencoder to uncover a causal graph that is implied by the generative model. We le… ▽ More

    Submitted 14 July, 2022; originally announced July 2022.

  5. Accelerating Material Design with the Generative Toolkit for Scientific Discovery

    Authors: Matteo Manica, Jannis Born, Joris Cadow, Dimitrios Christofidellis, Ashish Dave, Dean Clarke, Yves Gaetan Nana Teukam, Giorgio Giannone, Samuel C. Hoffman, Matthew Buchan, Vijil Chenthamarakshan, Timothy Donovan, Hsiang Han Hsu, Federico Zipoli, Oliver Schilter, Akihiro Kishimoto, Lisa Hamada, Inkit Padhi, Karl Wehden, Lauren McHugh, Alexy Khrabrov, Payel Das, Seiji Takeda, John R. Smith

    Abstract: With the growing availability of data within various scientific domains, generative models hold enormous potential to accelerate scientific discovery. They harness powerful representations learned from datasets to speed up the formulation of novel hypotheses with the potential to impact material discovery broadly. We present the Generative Toolkit for Scientific Discovery (GT4SD). This extensible… ▽ More

    Submitted 31 January, 2023; v1 submitted 8 July, 2022; originally announced July 2022.

    Comments: 15 pages, 2 figures

    Journal ref: Nature Partner Journals (npj) Computational Materials 9, 69 (2023)

  6. arXiv:2204.09042  [pdf, other

    q-bio.QM cs.LG q-bio.BM stat.ML

    Accelerating Inhibitor Discovery With A Deep Generative Foundation Model: Validation for SARS-CoV-2 Drug Targets

    Authors: Vijil Chenthamarakshan, Samuel C. Hoffman, C. David Owen, Petra Lukacik, Claire Strain-Damerell, Daren Fearon, Tika R. Malla, Anthony Tumber, Christopher J. Schofield, Helen M. E. Duyvesteyn, Wanwisa Dejnirattisai, Loic Carrique, Thomas S. Walter, Gavin R. Screaton, Tetiana Matviiuk, Aleksandra Mojsilovic, Jason Crain, Martin A. Walsh, David I. Stuart, Payel Das

    Abstract: The discovery of novel inhibitor molecules for emerging drug-target proteins is widely acknowledged as a challenging inverse design problem: Exhaustive exploration of the vast chemical search space is impractical, especially when the target structure or active molecules are unknown. Here we validate experimentally the broad utility of a deep generative framework trained at-scale on protein sequenc… ▽ More

    Submitted 14 October, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: Revised title, abstract, and text; additional figures

  7. arXiv:2202.00751  [pdf, other

    cs.LG cs.CY

    An Empirical Study of Modular Bias Mitigators and Ensembles

    Authors: Michael Feffer, Martin Hirzel, Samuel C. Hoffman, Kiran Kate, Parikshit Ram, Avraham Shinnar

    Abstract: There are several bias mitigators that can reduce algorithmic bias in machine learning models but, unfortunately, the effect of mitigators on fairness is often not stable when measured across different data splits. A popular approach to train more stable models is ensemble learning. Ensembles, such as bagging, boosting, voting, or stacking, have been successful at making predictive performance mor… ▽ More

    Submitted 1 February, 2022; originally announced February 2022.

  8. arXiv:2112.01625  [pdf, other

    cs.LG physics.chem-ph

    Sample-Efficient Generation of Novel Photo-acid Generator Molecules using a Deep Generative Model

    Authors: Samuel C. Hoffman, Vijil Chenthamarakshan, Dmitry Yu. Zubarev, Daniel P. Sanders, Payel Das

    Abstract: Photo-acid generators (PAGs) are compounds that release acids ($H^+$ ions) when exposed to light. These compounds are critical components of the photolithography processes that are used in the manufacture of semiconductor logic and memory chips. The exponential increase in the demand for semiconductors has highlighted the need for discovering novel photo-acid generators. While de novo molecule des… ▽ More

    Submitted 2 December, 2021; originally announced December 2021.

  9. arXiv:2109.12151  [pdf, other

    cs.LG cs.AI

    AI Explainability 360: Impact and Design

    Authors: Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Q. Vera Liao, Ronny Luss, Aleksandra Mojsilovic, Sami Mourad, Pablo Pedemonte, Ramya Raghavendra, John Richards, Prasanna Sattigeri, Karthikeyan Shanmugam, Moninder Singh, Kush R. Varshney, Dennis Wei, Yunfeng Zhang

    Abstract: As artificial intelligence and machine learning algorithms become increasingly prevalent in society, multiple stakeholders are calling for these algorithms to provide explanations. At the same time, these stakeholders, whether they be affected citizens, government regulators, domain experts, or system developers, have different explanation needs. To address these needs, in 2019, we created AI Expl… ▽ More

    Submitted 24 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: text overlap with arXiv:1909.03012

    Journal ref: IAAI 2022

  10. arXiv:2004.01215  [pdf, other

    cs.LG q-bio.QM stat.ML

    CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models

    Authors: Vijil Chenthamarakshan, Payel Das, Samuel C. Hoffman, Hendrik Strobelt, Inkit Padhi, Kar Wai Lim, Benjamin Hoover, Matteo Manica, Jannis Born, Teodoro Laino, Aleksandra Mojsilovic

    Abstract: The novel nature of SARS-CoV-2 calls for the development of efficient de novo drug design approaches. In this study, we propose an end-to-end framework, named CogMol (Controlled Generation of Molecules), for designing new drug-like small molecules targeting novel viral proteins with high affinity and off-target selectivity. CogMol combines adaptive pre-training of a molecular SMILES Variational Au… ▽ More

    Submitted 23 June, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

  11. arXiv:1909.03012  [pdf, other

    cs.AI cs.CV cs.HC stat.ML

    One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques

    Authors: Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Q. Vera Liao, Ronny Luss, Aleksandra Mojsilović, Sami Mourad, Pablo Pedemonte, Ramya Raghavendra, John Richards, Prasanna Sattigeri, Karthikeyan Shanmugam, Moninder Singh, Kush R. Varshney, Dennis Wei, Yunfeng Zhang

    Abstract: As artificial intelligence and machine learning algorithms make further inroads into society, calls are increasing from multiple stakeholders for these algorithms to explain their outputs. At the same time, these stakeholders, whether they be affected citizens, government regulators, domain experts, or system developers, present different requirements for explanations. Toward addressing these need… ▽ More

    Submitted 14 September, 2019; v1 submitted 6 September, 2019; originally announced September 2019.

  12. arXiv:1810.01943  [pdf, other

    cs.AI

    AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias

    Authors: Rachel K. E. Bellamy, Kuntal Dey, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovic, Seema Nagar, Karthikeyan Natesan Ramamurthy, John Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, Yunfeng Zhang

    Abstract: Fairness is an increasingly important concern as machine learning models are used to support decision making in high-stakes applications such as mortgage lending, hiring, and prison sentencing. This paper introduces a new open source Python toolkit for algorithmic fairness, AI Fairness 360 (AIF360), released under an Apache v2.0 license {https://github.com/ibm/aif360). The main objectives of this… ▽ More

    Submitted 3 October, 2018; originally announced October 2018.

    Comments: 20 pages

  13. arXiv:1805.09910  [pdf, other

    stat.ML cs.CY cs.LG

    Fairness GAN

    Authors: Prasanna Sattigeri, Samuel C. Hoffman, Vijil Chenthamarakshan, Kush R. Varshney

    Abstract: In this paper, we introduce the Fairness GAN, an approach for generating a dataset that is plausibly similar to a given multimedia dataset, but is more fair with respect to protected attributes in allocative decision making. We propose a novel auxiliary classifier GAN that strives for demographic parity or equality of opportunity and show empirical results on several datasets, including the CelebF… ▽ More

    Submitted 24 May, 2018; originally announced May 2018.