Showing 1–50 of 548 results for author: Bengio, Y

  1. arXiv:2406.08506  [pdf, other]

    physics.chem-ph cs.LG q-bio.BM

    RGFN: Synthesizable Molecular Generation Using GFlowNets

    Authors: Michał Koziarski, Andrei Rekesh, Dmytro Shevchuk, Almer van der Sloot, Piotr Gaiński, Yoshua Bengio, Cheng-Hao Liu, Mike Tyers, Robert A. Batey

    Abstract: Generative models hold great promise for small molecule discovery, significantly increasing the size of search space compared to traditional in silico screening libraries. However, most existing machine learning methods for small molecule generation suffer from poor synthesizability of candidate compounds, making experimental validation difficult. In this paper we propose Reaction-GFlowNet (RGFN),… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

  2. arXiv:2406.07529  [pdf, other]

    cs.LG

    MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation

    Authors: Lu Li, Tianyu Zhang, Zhiqi Bu, Suyuchen Wang, Huan He, Jie Fu, Yonghui Wu, Jiang Bian, Yong Chen, Yoshua Bengio

    Abstract: Model merging has emerged as an effective approach to combine multiple single-task models, fine-tuned from the same pre-trained model, into a multitask model. This process typically involves computing a weighted average of the model parameters without any additional training. Existing model-merging methods focus on enhancing average task accuracy. However, interference and conflicts between the ob… ▽ More

    Submitted 18 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  3. arXiv:2406.06462  [pdf, other]

    cs.CV cs.LG

    VCR: Visual Caption Restoration

    Authors: Tianyu Zhang, Suyuchen Wang, Lu Li, Ge Zhang, Perouz Taslakian, Sai Rajeswar, Jie Fu, Bang Liu, Yoshua Bengio

    Abstract: We introduce Visual Caption Restoration (VCR), a novel vision-language task that challenges models to accurately restore partially obscured texts using pixel-level hints within images. This task stems from the observation that text embedded in images is intrinsically different from common visual elements and natural language due to the need to align the modalities of vision, text, and text embedde… ▽ More

    Submitted 24 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: 17 pages, 2 figures

  4. arXiv:2406.05426  [pdf, other]

    cs.LG

    Baking Symmetry into GFlowNets

    Authors: George Ma, Emmanuel Bengio, Yoshua Bengio, Dinghuai Zhang

    Abstract: GFlowNets have exhibited promising performance in generating diverse candidates with high rewards. These networks generate objects incrementally and aim to learn a policy that assigns probability of sampling objects in proportion to rewards. However, the current training pipelines of GFlowNets do not consider the presence of isomorphic actions, which are actions resulting in symmetric or isomorphi… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

  5. arXiv:2405.20971  [pdf, other]

    cs.LG cs.CV

    Amortizing intractable inference in diffusion models for vision, language, and control

    Authors: Siddarth Venkatraman, Moksh Jain, Luca Scimeca, Minsu Kim, Marcin Sendera, Mohsin Hasan, Luke Rowe, Sarthak Mittal, Pablo Lemos, Emmanuel Bengio, Alexandre Adam, Jarrid Rector-Brooks, Yoshua Bengio, Glen Berseth, Nikolay Malkin

    Abstract: Diffusion models have emerged as effective distribution estimators in vision, language, and reinforcement learning, but their use as priors in downstream tasks poses an intractable posterior inference problem. This paper studies amortized sampling of the posterior over data, $\mathbf{x}\sim p^{\rm post}(\mathbf{x})\propto p(\mathbf{x})r(\mathbf{x})$, in a model that consists of a diffusion generat… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: Code: https://github.com/GFNOrg/diffusion-finetuning

  6. arXiv:2405.18540  [pdf, other]

    cs.CL cs.CR cs.LG

    Learning diverse attacks on large language models for robust red-teaming and safety tuning

    Authors: Seanie Lee, Minsu Kim, Lynn Cherif, David Dobre, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Moksh Jain

    Abstract: Red-teaming, or identifying prompts that elicit harmful responses, is a critical step in ensuring the safe and responsible deployment of large language models (LLMs). Developing effective protection against many modes of attack prompts requires discovering diverse attacks. Automated red-teaming typically uses reinforcement learning to fine-tune an attacker language model to generate prompts that e… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  7. arXiv:2405.13956  [pdf, other]

    cs.LG

    Attention as an RNN

    Authors: Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Mohamed Osama Ahmed, Yoshua Bengio, Greg Mori

    Abstract: The advent of Transformers marked a significant breakthrough in sequence modelling, providing a highly performant architecture capable of leveraging GPU parallelism. However, Transformers are computationally expensive at inference time, limiting their applications, particularly in low-resource settings (e.g., mobile and embedded devices). Addressing this, we (1) begin by showing that attention can… ▽ More

    Submitted 28 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  8. arXiv:2405.13012  [pdf]

    cs.CL cs.AI

    Divergent Creativity in Humans and Large Language Models

    Authors: Antoine Bellemare-Pepin, François Lespinasse, Philipp Thölke, Yann Harel, Kory Mathewson, Jay A. Olson, Yoshua Bengio, Karim Jerbi

    Abstract: The recent surge in the capabilities of Large Language Models (LLMs) has led to claims that they are approaching a level of creativity akin to human capabilities. This idea has sparked a blend of excitement and apprehension. However, a critical piece that has been missing in this discourse is a systematic evaluation of LLM creativity, particularly in comparison to human divergent thinking. To brid… ▽ More

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: First two and last listed authors are corresponding authors. The first two listed authors contributed equally to this work

  9. arXiv:2405.12205  [pdf, other]

    cs.AI cs.LG

    Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving

    Authors: Aniket Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy Lillicrap, Danilo Rezende, Yoshua Bengio, Michael Mozer, Sanjeev Arora

    Abstract: Metacognitive knowledge refers to humans' intuitive knowledge of their own thinking and reasoning processes. Today's best LLMs clearly possess some reasoning processes. The paper gives evidence that they also have metacognitive knowledge, including ability to name skills and procedures to apply given a task. We explore this primarily in context of math reasoning, developing a prompt-guided interac… ▽ More

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Preprint. Under review

  10. arXiv:2405.06624  [pdf, other]

    cs.AI

    Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

    Authors: David "davidad" Dalrymple, Joar Skalse, Yoshua Bengio, Stuart Russell, Max Tegmark, Sanjit Seshia, Steve Omohundro, Christian Szegedy, Ben Goldhaber, Nora Ammann, Alessandro Abate, Joe Halpern, Clark Barrett, Ding Zhao, Tan Zhi-Xuan, Jeannette Wing, Joshua Tenenbaum

    Abstract: Ensuring that AI systems reliably and robustly avoid harmful or dangerous behaviours is a crucial challenge, especially for AI systems with a high degree of autonomy and general intelligence, or systems used in safety-critical contexts. In this paper, we will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI. The core feature of these appro… ▽ More

    Submitted 17 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

  11. arXiv:2405.01616  [pdf, other]

    q-bio.BM cs.AI cs.LG

    Generative Active Learning for the Search of Small-molecule Protein Binders

    Authors: Maksym Korablyov, Cheng-Hao Liu, Moksh Jain, Almer M. van der Sloot, Eric Jolicoeur, Edward Ruediger, Andrei Cristian Nica, Emmanuel Bengio, Kostiantyn Lapchevskyi, Daniel St-Cyr, Doris Alexandra Schuetz, Victor Ion Butoi, Jarrid Rector-Brooks, Simon Blackburn, Leo Feng, Hadi Nekoei, SaiKrishna Gottipati, Priyesh Vijayan, Prateek Gupta, Ladislav Rampášek, Sasikanth Avancha, Pierre-Luc Bacon, William L. Hamilton, Brooks Paige, Sanchit Misra , et al. (9 additional authors not shown)

    Abstract: Despite substantial progress in machine learning for scientific discovery in recent years, truly de novo design of small molecules which exhibit a property of interest remains a significant challenge. We introduce LambdaZero, a generative active learning approach to search for synthesizable molecules. Powered by deep reinforcement learning, LambdaZero learns to search over the vast space of molecu… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

  12. arXiv:2404.10094  [pdf, other]

    cs.LG q-bio.QM

    Towards DNA-Encoded Library Generation with GFlowNets

    Authors: Michał Koziarski, Mohammed Abukalam, Vedant Shah, Louis Vaillancourt, Doris Alexandra Schuetz, Moksh Jain, Almer van der Sloot, Mathieu Bourgey, Anne Marinier, Yoshua Bengio

    Abstract: DNA-encoded libraries (DELs) are a powerful approach for rapidly screening large numbers of diverse compounds. One of the key challenges in using DELs is library design, which involves choosing the building blocks that will be combinatorially combined to produce the final library. In this paper we consider the task of protein-protein interaction (PPI) biased DEL design. To this end, we evaluate se… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

  13. arXiv:2404.09932  [pdf, other]

    cs.LG cs.AI cs.CL cs.CY

    Foundational Challenges in Assuring Alignment and Safety of Large Language Models

    Authors: Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi , et al. (13 additional authors not shown)

    Abstract: This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose $200+$ concrete research questions.

    Submitted 15 April, 2024; originally announced April 2024.

  14. arXiv:2403.14443  [pdf, other]

    cs.AI cs.CL cs.GT cs.LG cs.MA cs.SI

    Language Models Can Reduce Asymmetry in Information Markets

    Authors: Nasim Rahaman, Martin Weiss, Manuel Wüthrich, Yoshua Bengio, Li Erran Li, Chris Pal, Bernhard Schölkopf

    Abstract: This work addresses the buyer's inspection paradox for information markets. The paradox is that buyers need to access information to determine its value, while sellers need to limit access to prevent theft. To study this, we introduce an open-source simulated digital marketplace where intelligent agents, powered by language models, buy and sell information on behalf of external participants. The c… ▽ More

    Submitted 21 March, 2024; originally announced March 2024.

  15. arXiv:2403.07041  [pdf, other]

    cs.LG cs.NE

    Ant Colony Sampling with GFlowNets for Combinatorial Optimization

    Authors: Minsu Kim, Sanghyeok Choi, Hyeonah Kim, Jiwoo Son, Jinkyoo Park, Yoshua Bengio

    Abstract: This paper introduces the Generative Flow Ant Colony Sampler (GFACS), a neural-guided probabilistic search algorithm for solving combinatorial optimization (CO). GFACS integrates generative flow networks (GFlowNets), an emerging amortized inference method, with ant colony optimization (ACO), a promising probabilistic search algorithm. Specifically, we use GFlowNets to learn a constructive policy i… ▽ More

    Submitted 22 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: 23 pages, 5 figures

  16. arXiv:2403.04571  [pdf, ps, other]

    cs.AI

    Machine learning and information theory concepts towards an AI Mathematician

    Authors: Yoshua Bengio, Nikolay Malkin

    Abstract: The current state-of-the-art in artificial intelligence is impressive, especially in terms of mastery of language, but not so much in terms of mathematical reasoning. What could be missing? Can we learn something useful about that gap from how the brains of mathematicians go about their craft? This essay builds on the idea that current deep learning mostly succeeds at system 1 abilities -- which c… ▽ More

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: To appear in the Bulletin of the AMS, 2024

  17. arXiv:2402.10309  [pdf, other]

    cs.LG

    Discrete Probabilistic Inference as Control in Multi-path Environments

    Authors: Tristan Deleu, Padideh Nouri, Nikolay Malkin, Doina Precup, Yoshua Bengio

    Abstract: We consider the problem of sampling from a discrete and structured distribution as a sequential decision problem, where the objective is to find a stochastic policy such that objects are sampled at the end of this sequential process proportionally to some predefined reward. While we could use maximum entropy Reinforcement Learning (MaxEnt RL) to solve this problem for some distributions, it has be… ▽ More

    Submitted 27 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  18. arXiv:2402.08797  [pdf, other]

    cs.CY

    Computing Power and the Governance of Artificial Intelligence

    Authors: Girish Sastry, Lennart Heim, Haydn Belfield, Markus Anderljung, Miles Brundage, Julian Hazell, Cullen O'Keefe, Gillian K. Hadfield, Richard Ngo, Konstantin Pilz, George Gor, Emma Bluemke, Sarah Shoker, Janet Egan, Robert F. Trager, Shahar Avin, Adrian Weller, Yoshua Bengio, Diane Coyle

    Abstract: Computing power, or "compute," is crucial for the development and deployment of artificial intelligence (AI) capabilities. As a result, governments and companies have started to leverage compute as a means to govern AI. For example, governments are investing in domestic compute capacity, controlling the flow of compute to competing countries, and subsidizing compute access to certain sectors. Howe… ▽ More

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: Figures can be accessed at: https://github.com/lheim/CPGAI-Figures

  19. arXiv:2402.06121  [pdf, other]

    cs.LG stat.ML

    Iterated Denoising Energy Matching for Sampling from Boltzmann Densities

    Authors: Tara Akhound-Sadegh, Jarrid Rector-Brooks, Avishek Joey Bose, Sarthak Mittal, Pablo Lemos, Cheng-Hao Liu, Marcin Sendera, Siamak Ravanbakhsh, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Alexander Tong

    Abstract: Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient -- and… ▽ More

    Submitted 26 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: Published at ICML 2024. Code for iDEM is available at https://github.com/jarridrb/dem

  20. arXiv:2402.05098  [pdf, other]

    cs.LG stat.ML

    Improved off-policy training of diffusion samplers

    Authors: Marcin Sendera, Minsu Kim, Sarthak Mittal, Pablo Lemos, Luca Scimeca, Jarrid Rector-Brooks, Alexandre Adam, Yoshua Bengio, Nikolay Malkin

    Abstract: We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into… ▽ More

    Submitted 26 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: 24 pages; changed title from v2; code: https://github.com/GFNOrg/gfn-diffusion

  21. arXiv:2402.01207  [pdf, other]

    cs.LG cs.AI stat.ME

    Efficient Causal Graph Discovery Using Large Language Models

    Authors: Thomas Jiralerspong, Xiaoyin Chen, Yash More, Vedant Shah, Yoshua Bengio

    Abstract: We propose a novel framework that leverages LLMs for full causal graph discovery. While previous LLM-based methods have used a pairwise query approach, this requires a quadratic number of queries which quickly becomes impractical for larger causal graphs. In contrast, the proposed framework uses a breadth-first search (BFS) approach which allows it to use only a linear number of queries. We also s… ▽ More

    Submitted 13 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  22. arXiv:2312.07511  [pdf, other]

    cs.LG cs.AI q-bio.QM stat.ML

    A Hitchhiker's Guide to Geometric GNNs for 3D Atomic Systems

    Authors: Alexandre Duval, Simon V. Mathis, Chaitanya K. Joshi, Victor Schmidt, Santiago Miret, Fragkiskos D. Malliaros, Taco Cohen, Pietro Liò, Yoshua Bengio, Michael Bronstein

    Abstract: Recent advances in computational modelling of atomic systems, spanning molecules, proteins, and materials, represent them as geometric graphs with atoms embedded as nodes in 3D Euclidean space. In these graphs, the geometric attributes transform according to the inherent physical symmetries of 3D atomic systems, including rotations and translations in Euclidean space, as well as node permutations.… ▽ More

    Submitted 13 March, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

  23. arXiv:2312.03911  [pdf, other]

    cs.LG stat.CO stat.ME stat.ML

    Improving Gradient-guided Nested Sampling for Posterior Inference

    Authors: Pablo Lemos, Nikolay Malkin, Will Handley, Yoshua Bengio, Yashar Hezaveh, Laurence Perreault-Levasseur

    Abstract: We present a performant, general-purpose gradient-guided nested sampling algorithm, ${\tt GGNS}$, combining the state of the art in differentiable programming, Hamiltonian slice sampling, clustering, mode separation, dynamic nested sampling, and parallelization. This unique combination allows ${\tt GGNS}$ to scale well with dimensionality and perform competitively on a variety of synthetic and rea… ▽ More

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 10 pages, 5 figures. Code available at https://github.com/Pablo-Lemos/GGNS

  24. arXiv:2311.16176  [pdf, other]

    cs.LG cs.AI cs.CV

    Mitigating Biases with Diverse Ensembles and Diffusion Models

    Authors: Luca Scimeca, Alexander Rubinstein, Damien Teney, Seong Joon Oh, Armand Mihai Nicolicioiu, Yoshua Bengio

    Abstract: Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to a phenomenon known as shortcut learning, where a model relies on erroneous, easy-to-learn cues while ignoring reliable ones. In this work, we propose an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs) to mitigate this form of bias. We show that at particular… ▽ More

    Submitted 6 March, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

    Comments: arXiv admin note: text overlap with arXiv:2310.02230

  25. arXiv:2311.15268  [pdf, other]

    cs.LG cs.AI

    Unlearning via Sparse Representations

    Authors: Vedant Shah, Frederik Träuble, Ashish Malik, Hugo Larochelle, Michael Mozer, Sanjeev Arora, Yoshua Bengio, Anirudh Goyal

    Abstract: Machine \emph{unlearning}, which involves erasing knowledge about a \emph{forget set} from a trained model, can prove to be costly and infeasible by existing techniques. We propose a nearly compute-free zero-shot unlearning technique based on a discrete representational bottleneck. We show that the proposed technique efficiently unlearns the forget set and incurs negligible damage to the model's p… ▽ More

    Submitted 26 November, 2023; originally announced November 2023.

  26. arXiv:2311.00936  [pdf, other]

    cs.LG cs.CV q-bio.PE

    SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data

    Authors: Mélisande Teng, Amna Elmustafa, Benjamin Akera, Yoshua Bengio, Hager Radi Abdelwahed, Hugo Larochelle, David Rolnick

    Abstract: Biodiversity is declining at an unprecedented rate, impacting ecosystem services necessary to ensure food, water, and human health and well-being. Understanding the distribution of species and their habitats is crucial for conservation policy planning. However, traditional methods in ecology for species distribution models (SDMs) generally focus either on narrow sets of species or narrow geographi… ▽ More

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023) Track on Datasets and Benchmarks

  27. arXiv:2310.19054  [pdf, other]

    cs.LG

    Object-centric architectures enable efficient causal representation learning

    Authors: Amin Mansouri, Jason Hartford, Yan Zhang, Yoshua Bengio

    Abstract: Causal representation learning has showed a variety of settings in which we can disentangle latent variables with identifiability guarantees (up to some reasonable equivalence class). Common to all of these approaches is the assumption that (1) the latent variables are represented as $d$-dimensional vectors, and (2) that the observations are the output of some injective generative function of thes… ▽ More

    Submitted 29 October, 2023; originally announced October 2023.

  28. arXiv:2310.18807  [pdf, other]

    cs.AI cs.CV

    OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning

    Authors: Rim Assouel, Pau Rodriguez, Perouz Taslakian, David Vazquez, Yoshua Bengio

    Abstract: A key aspect of human intelligence is the ability to imagine -- composing learned concepts in novel ways -- to make sense of new scenarios. Such capacity is not yet attained for machine learning systems. In this work, in the context of visual reasoning, we show how modularity can be leveraged to derive a compositional data augmentation framework inspired by imagination. Our method, denoted Object-… ▽ More

    Submitted 28 October, 2023; originally announced October 2023.

  29. arXiv:2310.18780  [pdf, other]

    cs.LG cs.AI eess.SP

    Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions

    Authors: Stefano Massaroli, Michael Poli, Daniel Y. Fu, Hermann Kumbong, Rom N. Parnichkun, Aman Timalsina, David W. Romero, Quinn McIntyre, Beidi Chen, Atri Rudra, Ce Zhang, Christopher Re, Stefano Ermon, Yoshua Bengio

    Abstract: Recent advances in attention-free sequence models rely on convolutions as alternatives to the attention operator at the core of Transformers. In particular, long convolution sequence models have achieved state-of-the-art performance in many domains, but incur a significant cost during auto-regressive inference workloads -- naively requiring a full pass (or caching of activations) over the input se… ▽ More

    Submitted 28 October, 2023; originally announced October 2023.

  30. arXiv:2310.17688  [pdf, other]

    cs.CY cs.AI cs.CL cs.LG

    Managing extreme AI risks amid rapid progress

    Authors: Yoshua Bengio, Geoffrey Hinton, Andrew Yao, Dawn Song, Pieter Abbeel, Trevor Darrell, Yuval Noah Harari, Ya-Qin Zhang, Lan Xue, Shai Shalev-Shwartz, Gillian Hadfield, Jeff Clune, Tegan Maharaj, Frank Hutter, Atılım Güneş Baydin, Sheila McIlraith, Qiqi Gao, Ashwin Acharya, David Krueger, Anca Dragan, Philip Torr, Stuart Russell, Daniel Kahneman, Jan Brauner, Sören Mindermann

    Abstract: Artificial Intelligence (AI) is progressing rapidly, and companies are shifting their focus to developing generalist AI systems that can autonomously act and pursue goals. Increases in capabilities and autonomy may soon massively amplify AI's impact, with risks that include large-scale social harms, malicious uses, and an irreversible loss of human control over autonomous AI systems. Although rese… ▽ More

    Submitted 22 May, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: Published in Science: https://www.science.org/doi/10.1126/science.adn0117

  31. arXiv:2310.14935  [pdf]

    cs.LG q-bio.GN

    Causal machine learning for single-cell genomics

    Authors: Alejandro Tejada-Lapuerta, Paul Bertin, Stefan Bauer, Hananeh Aliee, Yoshua Bengio, Fabian J. Theis

    Abstract: Advances in single-cell omics allow for unprecedented insights into the transcription profiles of individual cells. When combined with large-scale perturbation screens, through which specific biological mechanisms can be targeted, these technologies allow for measuring the effect of targeted perturbations on the whole transcriptome. These advances provide an opportunity to better understand the ca… ▽ More

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 35 pages, 7 figures, 3 tables, 1 box

  32. arXiv:2310.14782  [pdf, other]

    cs.LG cs.AI

    Towards equilibrium molecular conformation generation with GFlowNets

    Authors: Alexandra Volokhova, Michał Koziarski, Alex Hernández-García, Cheng-Hao Liu, Santiago Miret, Pablo Lemos, Luca Thiede, Zichao Yan, Alán Aspuru-Guzik, Yoshua Bengio

    Abstract: Sampling diverse, thermodynamically feasible molecular conformations plays a crucial role in predicting properties of a molecule. In this paper we propose to use GFlowNet for sampling conformations of small molecules from the Boltzmann distribution, as determined by the molecule's energy. The proposed approach can be used in combination with energy estimation methods of different fidelity and disc… ▽ More

    Submitted 20 October, 2023; originally announced October 2023.

  33. arXiv:2310.08774  [pdf, other]

    q-bio.PE cs.LG stat.ML

    PhyloGFN: Phylogenetic inference with generative flow networks

    Authors: Mingyang Zhou, Zichao Yan, Elliot Layne, Nikolay Malkin, Dinghuai Zhang, Moksh Jain, Mathieu Blanchette, Yoshua Bengio

    Abstract: Phylogenetics is a branch of computational biology that studies the evolutionary relationships among biological entities. Its long history and numerous applications notwithstanding, inference of phylogenetic trees from sequence data remains challenging: the high complexity of tree space poses a significant obstacle for the current combinatorial and probabilistic techniques. In this paper, we adopt… ▽ More

    Submitted 24 March, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

  34. arXiv:2310.08338  [pdf]

    eess.AS cs.SD q-bio.NC

    A cry for help: Early detection of brain injury in newborns

    Authors: Charles C. Onu, Samantha Latremouille, Arsenii Gorin, Junhao Wang, Innocent Udeogu, Uchenna Ekwochi, Peter O. Ubuane, Omolara A. Kehinde, Muhammad A. Salisu, Datonye Briggs, Yoshua Bengio, Doina Precup

    Abstract: Since the 1960s, neonatal clinicians have known that newborns suffering from certain neurological conditions exhibit altered crying patterns such as the high-pitched cry in birth asphyxia. Despite an annual burden of over 1.5 million infant deaths and disabilities, early detection of neonatal brain injuries due to asphyxia remains a challenge, particularly in developing countries where the majorit… ▽ More

    Submitted 3 November, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

  35. arXiv:2310.06682  [pdf, other]

    cs.LG

    On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions

    Authors: Alvaro Carbonero, Alexandre Duval, Victor Schmidt, Santiago Miret, Alex Hernandez-Garcia, Yoshua Bengio, David Rolnick

    Abstract: The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorporate the geometric configuration of all atoms. However, in practice not all this information may be readily available, e.g. when evaluating the potentially unknown binding of adsorbates to catalyst. In this paper, we investigate whether it is possible to predic… ▽ More

    Submitted 10 October, 2023; originally announced October 2023.

  36. arXiv:2310.04925  [pdf, other]

    cs.LG

    Crystal-GFN: sampling crystals with desirable properties and constraints

    Authors: Mila AI4Science, Alex Hernandez-Garcia, Alexandre Duval, Alexandra Volokhova, Yoshua Bengio, Divya Sharma, Pierre Luc Carrier, Yasmine Benabed, Michał Koziarski, Victor Schmidt

    Abstract: Accelerating material discovery holds the potential to greatly help mitigate the climate crisis. Discovering new solid-state materials such as electrocatalysts, super-ionic conductors or photovoltaic materials can have a crucial impact, for instance, in improving the efficiency of renewable energy production and storage. In this paper, we introduce Crystal-GFN, a generative model of crystal struct… ▽ More

    Submitted 13 December, 2023; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: Main paper (10 pages) + references + appendix

  37. arXiv:2310.04363  [pdf, other]

    cs.LG cs.CL

    Amortizing intractable inference in large language models

    Authors: Edward J. Hu, Moksh Jain, Eric Elmoznino, Younesse Kaddar, Guillaume Lajoie, Yoshua Bengio, Nikolay Malkin

    Abstract: Autoregressive large language models (LLMs) compress knowledge from their training data through next-token conditional distributions. This limits tractable querying of this knowledge to start-to-end autoregressive sampling. However, many tasks of interest -- including sequence continuation, infilling, and other forms of constrained generation -- involve sampling from intractable posterior distribu… ▽ More

    Submitted 13 March, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: ICLR 2024; 23 pages; code: https://github.com/GFNOrg/gfn-lm-tuning

  38. arXiv:2310.03579  [pdf, other]

    cs.AI q-bio.MN

    Causal Inference in Gene Regulatory Networks with GFlowNet: Towards Scalability in Large Systems

    Authors: Trang Nguyen, Alexander Tong, Kanika Madan, Yoshua Bengio, Dianbo Liu

    Abstract: Understanding causal relationships within Gene Regulatory Networks (GRNs) is essential for unraveling the gene interactions in cellular processes. However, causal discovery in GRNs is a challenging problem for multiple reasons including the existence of cyclic feedback loops and uncertainty that yields diverse possible causal structures. Previous works in this area either ignore cyclic dynamics (a… ▽ More

    Submitted 5 October, 2023; originally announced October 2023.

  39. arXiv:2310.03419  [pdf, other]

    cs.LG cs.AI

    Pre-Training and Fine-Tuning Generative Flow Networks

    Authors: Ling Pan, Moksh Jain, Kanika Madan, Yoshua Bengio

    Abstract: Generative Flow Networks (GFlowNets) are amortized samplers that learn stochastic policies to sequentially generate compositional objects from a given unnormalized reward distribution. They can generate diverse sets of high-reward objects, which is an important consideration in scientific discovery tasks. However, as they are typically trained from a given extrinsic reward function, it remains an… ▽ More

    Submitted 5 October, 2023; originally announced October 2023.

  40. arXiv:2310.02823  [pdf, other]

    cs.LG stat.ML

    Learning to Scale Logits for Temperature-Conditional GFlowNets

    Authors: Minsu Kim, Joohwan Ko, Taeyoung Yun, Dinghuai Zhang, Ling Pan, Woochang Kim, Jinkyoo Park, Emmanuel Bengio, Yoshua Bengio

    Abstract: GFlowNets are probabilistic models that sequentially generate compositional structures through a stochastic policy. Among GFlowNets, temperature-conditional GFlowNets can introduce temperature-based controllability for exploration and exploitation. We propose \textit{Logit-scaling GFlowNets} (Logit-GFN), a novel architectural design that greatly accelerates the training of temperature-conditional…

    Submitted 2 June, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: ICML 2024, 23 pages, 21 figures

  41. arXiv:2310.02779  [pdf, other]

    cs.LG cs.GT

    Expected flow networks in stochastic environments and two-player zero-sum games

    Authors: Marco Jiralerspong, Bilun Sun, Danilo Vucetic, Tianyu Zhang, Yoshua Bengio, Gauthier Gidel, Nikolay Malkin

    Abstract: Generative flow networks (GFlowNets) are sequential sampling models trained to match a given distribution. GFlowNets have been successfully applied to various structured object generation tasks, sampling a diverse set of high-reward objects quickly. We propose expected flow networks (EFlowNets), which extend GFlowNets to stochastic environments. We show that EFlowNets outperform other GFlowNet for…

    Submitted 13 March, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: ICLR 2024; code: https://github.com/GFNOrg/AdversarialFlowNetworks

  42. arXiv:2310.02710  [pdf, other]

    cs.LG stat.ML

    Local Search GFlowNets

    Authors: Minsu Kim, Taeyoung Yun, Emmanuel Bengio, Dinghuai Zhang, Yoshua Bengio, Sungsoo Ahn, Jinkyoo Park

    Abstract: Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their rewards. GFlowNets exhibit a remarkable ability to generate diverse samples, yet occasionally struggle to consistently produce samples with high rewards due to over-exploration on wide sample space. This paper proposes to train GFlowNets with local search, which…

    Submitted 22 March, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 (Spotlight paper), 18 pages, 17 figures

  43. arXiv:2310.02679  [pdf, other]

    cs.LG cs.AI stat.CO stat.ME stat.ML

    Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization

    Authors: Dinghuai Zhang, Ricky T. Q. Chen, Cheng-Hao Liu, Aaron Courville, Yoshua Bengio

    Abstract: We tackle the problem of sampling from intractable high-dimensional density functions, a fundamental task that often appears in machine learning and statistics. We extend recent sampling-based approaches that leverage controlled stochastic processes to model approximate samples from these target densities. The main drawback of these approaches is that the training objective requires full trajector…

    Submitted 9 March, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: Accepted by ICLR 2024

  44. arXiv:2310.02423  [pdf, other]

    cs.LG stat.ML

    Delta-AI: Local objectives for amortized inference in sparse graphical models

    Authors: Jean-Pierre Falet, Hae Beom Lee, Nikolay Malkin, Chen Sun, Dragos Secrieru, Thomas Jiralerspong, Dinghuai Zhang, Guillaume Lajoie, Yoshua Bengio

    Abstract: We present a new algorithm for amortized inference in sparse probabilistic graphical models (PGMs), which we call $Δ$-amortized inference ($Δ$-AI). Our approach is based on the observation that when the sampling of variables in a PGM is seen as a sequence of actions taken by an agent, sparsity of the PGM enables local credit assignment in the agent's policy learning objective. This yields a local…

    Submitted 13 March, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: ICLR 2024; 19 pages, code: https://github.com/GFNOrg/Delta-AI/

  45. arXiv:2310.02230  [pdf, other]

    cs.CV cs.AI

    Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks

    Authors: Luca Scimeca, Alexander Rubinstein, Armand Mihai Nicolicioiu, Damien Teney, Yoshua Bengio

    Abstract: Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to shortcut learning phenomena, where a model may rely on erroneous, easy-to-learn, cues while ignoring reliable ones. In this work, we propose an ensemble diversification framework exploiting the generation of synthetic counterfactuals using Diffusion Probabilistic Models (DPMs). We discover tha…

    Submitted 18 November, 2023; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: Accepted at Neural Information Processing Systems (NeurIPS) 2023 - Workshop on Diffusion Models

  46. arXiv:2310.01807  [pdf, other]

    cs.AI cs.LG

    Discrete, compositional, and symbolic representations through attractor dynamics

    Authors: Andrew Nam, Eric Elmoznino, Nikolay Malkin, Chen Sun, Yoshua Bengio, Guillaume Lajoie

    Abstract: Compositionality is an important feature of discrete symbolic systems, such as language and programs, as it enables them to have infinite capacity despite a finite symbol set. It serves as a useful abstraction for reasoning in both cognitive science and in AI, yet the interface between continuous and symbolic processing is often imposed by fiat at the algorithmic level, such as by means of quantiz…

    Submitted 3 October, 2023; originally announced October 2023.

  47. arXiv:2310.00229  [pdf, other]

    cs.AI cs.LG

    Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning

    Authors: Mingde Zhao, Safa Alver, Harm van Seijen, Romain Laroche, Doina Precup, Yoshua Bengio

    Abstract: Inspired by human conscious planning, we propose Skipper, a model-based reinforcement learning framework utilizing spatio-temporal abstractions to generalize better in novel situations. It automatically decomposes the given task into smaller, more manageable subtasks, and thus enables sparse decision-making and focused computation on the relevant parts of the environment. The decomposition relies…

    Submitted 16 March, 2024; v1 submitted 29 September, 2023; originally announced October 2023.

    Comments: ICLR 2024 Camera-Ready

  48. arXiv:2309.17388  [pdf, other]

    cs.LG

    Tree Cross Attention

    Authors: Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed

    Abstract: Cross Attention is a popular method for retrieving information from a set of context tokens for making predictions. At inference time, for each prediction, Cross Attention scans the full set of $\mathcal{O}(N)$ tokens. In practice, however, often only a small subset of tokens are required for good performance. Methods such as Perceiver IO are cheap at inference as they distill the information to a…

    Submitted 1 March, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: Accepted by ICLR 2024

  49. arXiv:2308.08708  [pdf, other]

    cs.AI cs.CY cs.LG q-bio.NC

    Consciousness in Artificial Intelligence: Insights from the Science of Consciousness

    Authors: Patrick Butlin, Robert Long, Eric Elmoznino, Yoshua Bengio, Jonathan Birch, Axel Constant, George Deane, Stephen M. Fleming, Chris Frith, Xu Ji, Ryota Kanai, Colin Klein, Grace Lindsay, Matthias Michel, Liad Mudrik, Megan A. K. Peters, Eric Schwitzgebel, Jonathan Simon, Rufin VanRullen

    Abstract: Whether current or near-term AI systems could be conscious is a topic of scientific interest and increasing public concern. This report argues for, and exemplifies, a rigorous and empirically grounded approach to AI consciousness: assessing existing AI systems in detail, in light of our best-supported neuroscientific theories of consciousness. We survey several prominent scientific theories of con…

    Submitted 22 August, 2023; v1 submitted 16 August, 2023; originally announced August 2023.

  50. arXiv:2307.06951   

    cs.AI cs.LG

    AI For Global Climate Cooperation 2023 Competition Proceedings

    Authors: Yoshua Bengio, Prateek Gupta, Lu Li, Soham Phade, Sunil Srinivasa, Andrew Williams, Tianyu Zhang, Yang Zhang, Stephan Zheng

    Abstract: The international community must collaborate to mitigate climate change and sustain economic growth. However, collaboration is hard to achieve, partly because no global authority can ensure compliance with international climate agreements. Combining AI with climate-economic simulations offers a promising solution to design international frameworks, including negotiation protocols and climate agree…

    Submitted 10 July, 2023; originally announced July 2023.