-
Acoustic Mode Hybridization in a Single Dimer of Gold Nanoparticles
Authors:
Adrien Girard,
Hélène Gehan,
Alain Mermet,
Jean Lermé,
Alice Berthelot,
Emmanuel Cottancin,
Aurelien Crut,
Jeremie Margueritat,
Christophe Bonnet
Abstract:
The acoustic vibrations of single monomers and dimers of gold nanoparticles were investigated by measuring for the first time their ultra-low frequency micro-Raman scattering. This experiment provides access not only to the frequency of the detected vibrational modes, but also to their damping rate, which is obscured by inhomogeneous effects in measurements on ensembles of nano-objects. This allows a detailed analysis of the mechanical coupling occurring between two close nanoparticles (mediated by the polymer surrounding them) in the dimer case. Such coupling induces the hybridization of the vibrational modes of each nanoparticle, leading to the appearance in the Raman spectra of two ultra-low frequency modes corresponding to the out-of-phase longitudinal and transverse (with respect to the dimer axis) quasi-translations of the nanoparticles. Additionally, it is also shown to shift the frequency of the quadrupolar modes of the nanoparticles. Experimental results are interpreted using finite-element simulations, which enable the unambiguous identification of the detected modes and, despite the simplifications made, lead to a reasonable reproduction of their measured frequencies and quality factors. The demonstrated feasibility of low frequency Raman scattering experiments on single nano-objects opens up new possibilities to improve the understanding of nanoscale vibrations, this technique being complementary with single nano-object time-resolved spectroscopy as it gives access to different vibrational modes.
Submitted 15 January, 2024;
originally announced January 2024.
-
Combinatorial Optimization with Policy Adaptation using Latent Space Search
Authors:
Felix Chalumeau,
Shikha Surana,
Clement Bonnet,
Nathan Grinsztajn,
Arnu Pretorius,
Alexandre Laterre,
Thomas D. Barrett
Abstract:
Combinatorial Optimization underpins many real-world applications and yet, designing performant algorithms to solve these complex, typically NP-hard, problems remains a significant research challenge. Reinforcement Learning (RL) provides a versatile framework for designing heuristics across a broad spectrum of problem domains. However, despite notable progress, RL has not yet supplanted industrial solvers as the go-to solution. Current approaches emphasize pre-training heuristics that construct solutions but often rely on search procedures with limited variance, such as stochastically sampling numerous solutions from a single policy or employing computationally expensive fine-tuning of the policy on individual problem instances. Building on the intuition that performant search at inference time should be anticipated during pre-training, we propose COMPASS, a novel RL approach that parameterizes a distribution of diverse and specialized policies conditioned on a continuous latent space. We evaluate COMPASS across three canonical problems - Travelling Salesman, Capacitated Vehicle Routing, and Job-Shop Scheduling - and demonstrate that our search strategy (i) outperforms state-of-the-art approaches on 11 standard benchmarking tasks and (ii) generalizes better, surpassing all other approaches on a set of 18 procedurally transformed instance distributions.
Submitted 28 May, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX
Authors:
Clément Bonnet,
Daniel Luo,
Donal Byrne,
Shikha Surana,
Sasha Abramowitz,
Paul Duckworth,
Vincent Coyette,
Laurence I. Midgley,
Elshadai Tegegn,
Tristan Kalloniatis,
Omayma Mahjoub,
Matthew Macfarlane,
Andries P. Smit,
Nathan Grinsztajn,
Raphael Boige,
Cemlyn N. Waters,
Mohamed A. Mimouni,
Ulrich A. Mbou Sob,
Ruan de Kock,
Siddarth Singh,
Daniel Furelos-Blanco,
Victor Le,
Arnu Pretorius,
Alexandre Laterre
Abstract:
Open-source reinforcement learning (RL) environments have played a crucial role in driving progress in the development of AI algorithms. In modern RL research, there is a need for simulated environments that are performant, scalable, and modular to enable their utilization in a wider range of potential real-world applications. Therefore, we present Jumanji, a suite of diverse RL environments specifically designed to be fast, flexible, and scalable. Jumanji provides a suite of environments focusing on combinatorial problems frequently encountered in industry, as well as challenging general decision-making tasks. By leveraging the efficiency of JAX and hardware accelerators like GPUs and TPUs, Jumanji enables rapid iteration of research ideas and large-scale experimentation, ultimately empowering more capable agents. Unlike existing RL environment suites, Jumanji is highly customizable, allowing users to tailor the initial state distribution and problem complexity to their needs. Furthermore, we provide actor-critic baselines for each environment, accompanied by preliminary findings on scaling and generalization scenarios. Jumanji aims to set a new standard for speed, adaptability, and scalability of RL environments.
Submitted 15 March, 2024; v1 submitted 16 June, 2023;
originally announced June 2023.
-
A PDMP to model the stochastic influence of quiescence dynamics in blood cancers
Authors:
Céline Bonnet
Abstract:
In this article, we will see a new approach to study the impact of a small microscopic population of cancer cells on a macroscopic population of healthy cells, with an example inspired by pathological hematopoiesis. Hematopoiesis is the biological phenomenon of blood cell production by differentiation of cells called hematopoietic stem cells (HSCs). We will study the dynamics of a stochastic $4$-dimensional process describing the evolution over time of the number of healthy and cancer stem cells and the number of healthy and mutant red blood cells. The model takes into account the amplification between stem cells and red blood cells as well as the regulation of this amplification as a function of the number of red blood cells (healthy and mutant). A single cancer HSC is considered while other populations are in large numbers. We assume that the unique cancer HSC randomly switches between an active and a quiescent state. We show the convergence in law of this process towards a piecewise deterministic Markov process (PDMP), when the population size goes to infinity. We then study the long time behaviour of this limit process. We show the existence and uniqueness of an absolutely continuous invariant probability measure with respect to the Lebesgue measure for the limit PDMP obtained previously. We describe the support of the invariant probability and show that the process converges in total variation towards it, using theory developed by M. Benaim et al. We finally identify the invariant probability using its infinitesimal generator. Thanks to this probabilistic approach, we obtain a stationary system of partial differential equations describing the impact of cancer HSC quiescent phases and regulation on the cell density of the hematopoietic system studied.
Submitted 11 March, 2023;
originally announced March 2023.
-
Site frequency spectrum of a rescued population under rare resistant mutations
Authors:
Céline Bonnet,
Hélène Leman
Abstract:
The aim of this article is to study the impact of resistance acquisition on the distribution of neutral mutations in a cell population under therapeutic pressure. The cell population is modeled by a bi-type branching process. Initially, the cells all carry type $0$, associated with a negative growth rate. Mutations towards type $1$ are assumed to be rare and random, and lead to the survival of cells under treatment, i.e. type $1$ is associated with a positive growth rate, and thus models the acquisition of a resistance. Cells also carry neutral mutations, acquired at birth and accumulated by inheritance, that do not affect their type. We describe the expectation of the "Site Frequency Spectrum" (SFS), which is an index of neutral mutation distribution in a population, under the asymptotic of rare events of resistance acquisition and of large initial population. Precisely, we give asymptotically-equivalent expressions of the expected number of neutral mutations shared by both a small and a large number of cells. To identify the influence of relatives on the SFS, our work also led us to study in detail subcritical binary Galton-Watson trees, where each leaf is marked with a small probability. As a by-product of this study, we thus provide the law of the generation of a randomly chosen leaf in such a Galton-Watson tree conditioned on the number of marks.
Submitted 24 April, 2023; v1 submitted 7 March, 2023;
originally announced March 2023.
-
Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function
Authors:
Clément Bonnet,
Laurence Midgley,
Alexandre Laterre
Abstract:
Meta-gradient Reinforcement Learning (RL) allows agents to self-tune their hyper-parameters in an online fashion during training. In this paper, we identify a bias in the meta-gradient of current meta-gradient RL approaches. This bias comes from using the critic that is trained using the meta-learned discount factor for the advantage estimation in the outer objective which requires a different discount factor. Because the meta-learned discount factor is typically lower than the one used in the outer objective, the resulting bias can cause the meta-gradient to favor myopic policies. We propose a simple solution to this issue: we eliminate this bias by using an alternative, \emph{outer} value function in the estimation of the outer loss. To obtain this outer value function we add a second head to the critic network and train it alongside the classic critic, using the outer loss discount factor. On an illustrative toy problem, we show that the bias can cause catastrophic failure of current meta-gradient RL approaches, and show that our proposed solution fixes it. We then apply our method to a more complex environment and demonstrate that fixing the meta-gradient bias can significantly improve performance.
Submitted 18 November, 2022;
originally announced November 2022.
-
Winner Takes It All: Training Performant RL Populations for Combinatorial Optimization
Authors:
Nathan Grinsztajn,
Daniel Furelos-Blanco,
Shikha Surana,
Clément Bonnet,
Thomas D. Barrett
Abstract:
Applying reinforcement learning (RL) to combinatorial optimization problems is attractive as it removes the need for expert knowledge or pre-solved instances. However, it is unrealistic to expect an agent to solve these (often NP-)hard problems in a single shot at inference due to their inherent complexity. Thus, leading approaches often implement additional search strategies, from stochastic sampling and beam search to explicit fine-tuning. In this paper, we argue for the benefits of learning a population of complementary policies, which can be simultaneously rolled out at inference. To this end, we introduce Poppy, a simple training procedure for populations. Instead of relying on a predefined or hand-crafted notion of diversity, Poppy induces an unsupervised specialization targeted solely at maximizing the performance of the population. We show that Poppy produces a set of complementary policies, and obtains state-of-the-art RL results on four popular NP-hard problems: traveling salesman, capacitated vehicle routing, 0-1 knapsack, and job-shop scheduling.
Submitted 13 November, 2023; v1 submitted 7 October, 2022;
originally announced October 2022.
-
YALTAPy and YALTAPy_Online: Python toolboxes for the $H_\infty$-stability analysis of classical and fractional systems with commensurate delays
Authors:
Hugo Cavalera,
Jayvir Raj,
Guilherme Mazanti,
Catherine Bonnet
Abstract:
The aim of this paper is to give a presentation of the Python toolbox YALTAPy dedicated to the stability study of standard and fractional delay systems as well as its online version YALTAPy_Online. Both toolboxes are derived from YALTA whose functionalities will be recalled here. Examples will be given to show how these toolboxes may be used.
Submitted 11 September, 2022;
originally announced September 2022.
-
One Step at a Time: Pros and Cons of Multi-Step Meta-Gradient Reinforcement Learning
Authors:
Clément Bonnet,
Paul Caron,
Thomas Barrett,
Ian Davies,
Alexandre Laterre
Abstract:
Self-tuning algorithms that adapt the learning process online encourage more effective and robust learning. Among all the methods available, meta-gradients have emerged as a promising approach. They leverage the differentiability of the learning rule with respect to some hyper-parameters to adapt them in an online fashion. Although meta-gradients can be accumulated over multiple learning steps to avoid myopic updates, this is rarely used in practice. In this work, we demonstrate that whilst multi-step meta-gradients do provide a better learning signal in expectation, this comes at the cost of a significant increase in variance, hindering performance. In the light of this analysis, we introduce a novel method mixing multiple inner steps that enjoys a more accurate and robust meta-gradient signal, essentially trading off bias and variance in meta-gradient estimation. When applied to the Snake game, the mixing meta-gradient algorithm can cut the variance by a factor of 3 while achieving similar or higher performance.
Submitted 30 October, 2021;
originally announced November 2021.
-
$L_2$ and BIBO stability of systems with variable delays
Authors:
Catherine Bonnet,
Jonathan R. Partington
Abstract:
This paper considers $L_2$ and BIBO stability and stabilization issues for systems with time-varying delays which can be of retarded or neutral type. An important role is played by a nominal system with fixed delays which are close to the time-varying ones. Under stability or stabilizability conditions of this nominal system, sufficient conditions are given in order to ensure similar properties for the system with time-varying delays.
Submitted 13 March, 2020;
originally announced March 2020.
-
Privacy in Data Service Composition
Authors:
Mahmoud Barhamgi,
Charith Perera,
Chia-Mu Yu,
Djamal Benslimane,
David Camacho,
Christine Bonnet
Abstract:
In modern information systems different information features, about the same individual, are often collected and managed by autonomous data collection services that may have different privacy policies. Answering many end-users' legitimate queries requires the integration of data from multiple such services. However, data integration is often hindered by the lack of a trusted entity, often called a mediator, with which the services can share their data and delegate the enforcement of their privacy policies. In this paper, we propose a flexible privacy-preserving data integration approach for answering data integration queries without the need for a trusted mediator. In our approach, services are allowed to enforce their privacy policies locally. The mediator is considered to be untrusted, and only has access to encrypted information to allow it to link data subjects across the different services. Services, by virtue of a new privacy requirement, dubbed k-Protection, limiting privacy leaks, cannot infer information about the data held by each other. End-users, in turn, have access to privacy-sanitized data only. We evaluated our approach using an example and a real dataset from the healthcare application domain. The results are promising from both the privacy preservation and the performance perspectives.
Submitted 3 January, 2020;
originally announced January 2020.
-
Large fluctuations in multi-scale modeling for rest erythropoiesis
Authors:
Celine Bonnet,
Sylvie Méléard
Abstract:
Erythropoiesis is a mechanism for the production of red blood cells by cellular differentiation. It is based on amplification steps due to an interplay between renewal and differentiation in the successive cell compartments from stem cells to red blood cells. We will study this mechanism from a stochastic point of view to explain unexpected fluctuations in the red blood cell numbers, as surprisingly observed by biologists and medical doctors in a rest erythropoiesis. We consider three compartments: stem cells, progenitors and red blood cells. The dynamics of each cell type is characterized by its division rate and by the renewal and differentiation probabilities at each division event. We model the global population dynamics by a three-dimensional stochastic decomposable branching process. We show that the amplification mechanism is given by the inverse of the small difference between the differentiation and renewal probabilities. Introducing a parameter $K$ which scales simultaneously the size of the first component, the differentiation and renewal probabilities and the red blood cell death rate, we describe the asymptotic behavior of the process for large $K$. We show that each compartment has its own size scale and its own time scale. Focussing on the third component, we prove that the red blood cell population size, conveniently renormalized (in time and size), is expanded in an unusual way inducing large fluctuations. The proofs are based on a fine study of the different scales involved in the model and on the use of different convergence and averaging techniques.
Submitted 10 December, 2019;
originally announced December 2019.
-
Multiscale population dynamics in reproductive biology: singular perturbation reduction in deterministic and stochastic models
Authors:
Celine Bonnet,
Keltoum Chahour,
Frédérique Clément,
Marie Postel,
Romain Yvinec
Abstract:
In this study, we describe different modeling approaches for ovarian follicle population dynamics, based on either ordinary (ODE), partial (PDE) or stochastic (SDE) differential equations, and accounting for interactions between follicles. We put a special focus on representing the population-level feedback exerted by growing ovarian follicles onto the activation of quiescent follicles. We take advantage of the timescale difference existing between the growth and activation processes to apply model reduction techniques in the framework of singular perturbations. We first study the linear versions of the models to derive theoretical results on the convergence to the limit models. In the nonlinear cases, we provide detailed numerical evidence of convergence to the limit behavior. We reproduce the main semi-quantitative features characterizing the ovarian follicle pool, namely a bimodal distribution of the whole population, and a slope break in the decay of the quiescent pool with aging.
Submitted 4 March, 2019;
originally announced March 2019.