Showing 1–17 of 17 results for author: Schiff, Y

  1. arXiv:2406.07524  [pdf, other]

    cs.CL cs.AI cs.LG

    Simple and Effective Masked Diffusion Language Models

    Authors: Subham Sekhar Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov

    Abstract: While diffusion models excel at generating high-quality images, prior work reports a significant performance gap between diffusion and autoregressive (AR) methods in language modeling. In this work, we show that simple masked discrete diffusion is more performant than previously thought. We apply an effective training recipe that improves the performance of masked diffusion models and derive a sim…

    Submitted 11 June, 2024; originally announced June 2024.

    Report number: cr07

  2. arXiv:2403.03234  [pdf, other]

    q-bio.GN cs.LG

    Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling

    Authors: Yair Schiff, Chia-Hsiang Kao, Aaron Gokaslan, Tri Dao, Albert Gu, Volodymyr Kuleshov

    Abstract: Large-scale sequence modeling has sparked rapid advances that now extend into biology and genomics. However, modeling genomic sequences introduces challenges such as the need to model long-range token interactions, the effects of upstream and downstream regions of the genome, and the reverse complementarity (RC) of DNA. Here, we propose an architecture motivated by these challenges that builds off…

    Submitted 5 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: ICML 2024; Code to reproduce our experiments is available at https://github.com/kuleshov-group/caduceus

  3. arXiv:2402.04467  [pdf, other]

    cs.LG math.DS

    DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems

    Authors: Yair Schiff, Zhong Yi Wan, Jeffrey B. Parker, Stephan Hoyer, Volodymyr Kuleshov, Fei Sha, Leonardo Zepeda-Núñez

    Abstract: Learning dynamics from dissipative chaotic systems is notoriously difficult due to their inherent instability, as formalized by their positive Lyapunov exponents, which exponentially amplify errors in the learned dynamics. However, many of these systems exhibit ergodicity and an attractor: a compact and highly complex manifold, to which trajectories converge in finite-time, that supports an invari…

    Submitted 5 June, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: ICML 2024; Code to reproduce our experiments is available at https://github.com/google-research/swirl-dynamics/tree/main/swirl_dynamics/projects/ergodic

  4. arXiv:2306.08757  [pdf, other]

    cs.LG cs.CV

    InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models

    Authors: Yingheng Wang, Yair Schiff, Aaron Gokaslan, Weishen Pan, Fei Wang, Christopher De Sa, Volodymyr Kuleshov

    Abstract: While diffusion models excel at generating high-quality samples, their latent variables typically lack semantic meaning and are not suitable for representation learning. Here, we propose InfoDiffusion, an algorithm that augments diffusion models with low-dimensional latent variables that capture high-level factors of variation in the data. InfoDiffusion relies on a learning objective regularized w…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: ICML 2023

  5. arXiv:2304.10819  [pdf, other]

    cs.LG cs.AI stat.ML

    Auditing and Generating Synthetic Data with Controllable Trust Trade-offs

    Authors: Brian Belgodere, Pierre Dognin, Adam Ivankay, Igor Melnyk, Youssef Mroueh, Aleksandra Mojsilovic, Jiri Navratil, Apoorva Nitsure, Inkit Padhi, Mattia Rigotti, Jerret Ross, Yair Schiff, Radhika Vedpathak, Richard A. Young

    Abstract: Real-world data often exhibits bias, imbalance, and privacy risks. Synthetic datasets have emerged to address these issues. This paradigm relies on generative AI models to generate unbiased, privacy-preserving data while maintaining fidelity to the original data. However, assessing the trustworthiness of synthetic datasets and models is a critical challenge. We introduce a holistic auditing framew…

    Submitted 9 June, 2024; v1 submitted 21 April, 2023; originally announced April 2023.

    Comments: submitted

  6. arXiv:2208.06665  [pdf, other]

    cs.LG

    Cloud-Based Real-Time Molecular Screening Platform with MolFormer

    Authors: Brian Belgodere, Vijil Chenthamarakshan, Payel Das, Pierre Dognin, Toby Kurien, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, Richard A. Young

    Abstract: With the prospect of automating a number of chemical tasks with high fidelity, chemical language processing models are emerging at a rapid speed. Here, we present a cloud-based real-time platform that allows users to virtually screen molecules of interest. For this purpose, molecular embeddings inferred from a recently proposed large chemical language model, named MolFormer, are leveraged. The pla…

    Submitted 13 August, 2022; originally announced August 2022.

    Comments: Paper accepted at ECML PKDD 2022 demo track

  7. arXiv:2206.06672  [pdf, other]

    cs.LG stat.ML

    Semi-Autoregressive Energy Flows: Exploring Likelihood-Free Training of Normalizing Flows

    Authors: Phillip Si, Zeyi Chen, Subham Sekhar Sahoo, Yair Schiff, Volodymyr Kuleshov

    Abstract: Training normalizing flow generative models can be challenging due to the need to calculate computationally expensive determinants of Jacobians. This paper studies the likelihood-free training of flows and proposes the energy objective, an alternative sample-based loss based on proper scoring rules. The energy objective is determinant-free and supports flexible model architectures that are not eas…

    Submitted 22 June, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: 9 pages, 3 figures, 8 tables, 11 pages appendix

    MSC Class: 68T37 (Primary) 68T07 (Secondary)

  8. arXiv:2205.13684  [pdf, other]

    stat.ML cs.LG math.PR math.ST

    Learning with Stochastic Orders

    Authors: Carles Domingo-Enrich, Yair Schiff, Youssef Mroueh

    Abstract: Learning high-dimensional distributions is often done with explicit likelihood modeling or implicit modeling via minimizing integral probability metrics (IPMs). In this paper, we expand this learning paradigm to stochastic orders, namely, the convex or Choquet order between probability measures. Towards this end, exploiting the relation between convex orders and optimal transport, we introduce the…

    Submitted 9 November, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: Code available at https://github.com/yair-schiff/stochastic-orders-ICMN

  9. arXiv:2205.11718  [pdf, other]

    cs.LG

    Semi-Parametric Inducing Point Networks and Neural Processes

    Authors: Richa Rastogi, Yair Schiff, Alon Hacohen, Zhaozhi Li, Ian Lee, Yuntian Deng, Mert R. Sabuncu, Volodymyr Kuleshov

    Abstract: We introduce semi-parametric inducing point networks (SPIN), a general-purpose architecture that can query the training set at inference time in a compute-efficient manner. Semi-parametric architectures are typically more compact than parametric models, but their computational complexity is often quadratic. In contrast, SPIN attains linear complexity via a cross-attention mechanism between datapoi…

    Submitted 30 March, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: ICLR 2023 conference paper

  10. arXiv:2106.04765  [pdf, other]

    cs.LG cs.AI

    Predicting Deep Neural Network Generalization with Perturbation Response Curves

    Authors: Yair Schiff, Brian Quanz, Payel Das, Pin-Yu Chen

    Abstract: The field of Deep Learning is rich with empirical evidence of human-like performance on a variety of prediction tasks. However, despite these successes, the recent Predicting Generalization in Deep Learning (PGDL) NeurIPS 2020 competition suggests that there is a need for more robust and efficient measures of network generalization. In this work, we propose a new framework for evaluating the gener…

    Submitted 26 October, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021

  11. arXiv:2106.04464  [pdf, other]

    physics.chem-ph cs.LG math.AT

    Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations

    Authors: Yair Schiff, Vijil Chenthamarakshan, Samuel Hoffman, Karthikeyan Natesan Ramamurthy, Payel Das

    Abstract: Deep generative models have emerged as a powerful tool for learning useful molecular representations and designing novel molecules with desired properties, with applications in drug discovery and material design. However, most existing deep generative models are restricted due to lack of spatial information. Here we propose augmentation of deep generative models with topological data analysis (TDA…

    Submitted 15 February, 2022; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: Accepted to ICASSP, 2022

  12. arXiv:2106.00774  [pdf, other]

    stat.ML cs.LG math.NA

    Optimizing Functionals on the Space of Probabilities with Input Convex Neural Networks

    Authors: David Alvarez-Melis, Yair Schiff, Youssef Mroueh

    Abstract: Gradient flows are a powerful tool for optimizing functionals in general metric spaces, including the space of probabilities endowed with the Wasserstein metric. A typical approach to solving this optimization problem relies on its connection to the dynamic formulation of optimal transport and the celebrated Jordan-Kinderlehrer-Otto (JKO) scheme. However, this formulation involves optimization ove…

    Submitted 30 November, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

  13. arXiv:2104.03469  [pdf, other]

    cs.LG

    Gi and Pal Scores: Deep Neural Network Generalization Statistics

    Authors: Yair Schiff, Brian Quanz, Payel Das, Pin-Yu Chen

    Abstract: The field of Deep Learning is rich with empirical evidence of human-like performance on a variety of regression, classification, and control tasks. However, despite these successes, the field lacks strong theoretical error bounds and consistent measures of network generalization and learned invariances. In this work, we introduce two new measures, the Gi-score and Pal-score, that capture a deep ne…

    Submitted 9 June, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

    Comments: Accepted to RobustML Workshop at ICLR 2021

    ACM Class: I.2.6; G.3; I.5.1

  14. arXiv:2012.11696  [pdf, other]

    cs.CV cs.LG

    Image Captioning as an Assistive Technology: Lessons Learned from VizWiz 2020 Challenge

    Authors: Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, Richard A. Young, Brian Belgodere

    Abstract: Image captioning has recently demonstrated impressive progress largely owing to the introduction of neural network algorithms trained on curated datasets like MS-COCO. Often work in this field is motivated by the promise of deployment of captioning systems in practical applications. However, the scarcity of data and contexts in many competition datasets renders the utility of systems trained on the…

    Submitted 18 June, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

    Comments: In submission to JAIR. Copyright may be transferred without notice, after which this version may no longer be accessible

  15. arXiv:2012.11691  [pdf, other]

    cs.CV cs.LG

    Alleviating Noisy Data in Image Captioning with Cooperative Distillation

    Authors: Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff

    Abstract: Image captioning systems have made substantial progress, largely due to the availability of curated datasets like Microsoft COCO or Vizwiz that have accurate descriptions of their corresponding images. Unfortunately, scarce availability of such cleanly labeled data results in trained algorithms producing captions that can be terse and idiosyncratically specific to details in the image. We propose…

    Submitted 21 December, 2020; originally announced December 2020.

    Comments: CVPR 2020 VizWiz Challenge

  16. arXiv:2011.01843  [pdf, other]

    cs.LG cs.AI

    Tabular Transformers for Modeling Multivariate Time Series

    Authors: Inkit Padhi, Yair Schiff, Igor Melnyk, Mattia Rigotti, Youssef Mroueh, Pierre Dognin, Jerret Ross, Ravi Nair, Erik Altman

    Abstract: Tabular datasets are ubiquitous in data science applications. Given their importance, it seems natural to apply state-of-the-art deep learning algorithms in order to fully unlock their potential. Here we propose neural network models that represent tabular time series that can optionally leverage their hierarchical structure. This results in two architectures for tabular time series: one for learn…

    Submitted 11 February, 2021; v1 submitted 3 November, 2020; originally announced November 2020.

    Comments: Accepted to ICASSP, 2021; https://github.com/IBM/TabFormer

  17. arXiv:2010.08548  [pdf, other]

    q-bio.BM cs.LG

    Characterizing the Latent Space of Molecular Deep Generative Models with Persistent Homology Metrics

    Authors: Yair Schiff, Vijil Chenthamarakshan, Karthikeyan Natesan Ramamurthy, Payel Das

    Abstract: Deep generative models are increasingly becoming integral parts of the in silico molecule design pipeline and have dual goals of learning the chemical and structural features that render candidate molecules viable while also being flexible enough to generate novel designs. Specifically, Variational Auto Encoders (VAEs) are generative models in which encoder-decoder network pairs are trained to rec…

    Submitted 7 June, 2021; v1 submitted 18 October, 2020; originally announced October 2020.

    Comments: Accepted to and presented as spotlight poster at the Topological Data Analysis and Beyond Workshop at the 34th Conference on Neural Information Processing Systems (NeurIPS 2020)