Data-Driven Discovery of Molecular Photoswitches with Multioutput Gaussian Processes
Authors:
Ryan-Rhys Griffiths,
Jake L. Greenfield,
Aditya R. Thawani,
Arian R. Jamasb,
Henry B. Moss,
Anthony Bourached,
Penelope Jones,
William McCorkindale,
Alexander A. Aldrick,
Matthew J. Fuchter Alpha A. Lee
Abstract:
Photoswitchable molecules display two or more isomeric forms that may be accessed using light. Separating the electronic absorption bands of these isomers is key to selectively addressing a specific isomer and achieving high photostationary states whilst overall red-shifting the absorption bands serves to limit material damage due to UV-exposure and increases penetration depth in photopharmacologi…
▽ More
Photoswitchable molecules display two or more isomeric forms that may be accessed using light. Separating the electronic absorption bands of these isomers is key to selectively addressing a specific isomer and achieving high photostationary states whilst overall red-shifting the absorption bands serves to limit material damage due to UV-exposure and increases penetration depth in photopharmacological applications. Engineering these properties into a system through synthetic design however, remains a challenge. Here, we present a data-driven discovery pipeline for molecular photoswitches underpinned by dataset curation and multitask learning with Gaussian processes. In the prediction of electronic transition wavelengths, we demonstrate that a multioutput Gaussian process (MOGP) trained using labels from four photoswitch transition wavelengths yields the strongest predictive performance relative to single-task models as well as operationally outperforming time-dependent density functional theory (TD-DFT) in terms of the wall-clock time for prediction. We validate our proposed approach experimentally by screening a library of commercially available photoswitchable molecules. Through this screen, we identified several motifs that displayed separated electronic absorption bands of their isomers, exhibited red-shifted absorptions, and are suited for information transfer and photopharmacological applications. Our curated dataset, code, as well as all models are made available at https://github.com/Ryan-Rhys/The-Photoswitch-Dataset
△ Less
Submitted 7 August, 2022; v1 submitted 28 June, 2020;
originally announced August 2020.
Achieving Robustness to Aleatoric Uncertainty with Heteroscedastic Bayesian Optimisation
Authors:
Ryan-Rhys Griffiths,
Alexander A. Aldrick,
Miguel Garcia-Ortegon,
Vidhi R. Lalchand,
Alpha A. Lee
Abstract:
Bayesian optimisation is a sample-efficient search methodology that holds great promise for accelerating drug and materials discovery programs. A frequently-overlooked modelling consideration in Bayesian optimisation strategies however, is the representation of heteroscedastic aleatoric uncertainty. In many practical applications it is desirable to identify inputs with low aleatoric noise, an exam…
▽ More
Bayesian optimisation is a sample-efficient search methodology that holds great promise for accelerating drug and materials discovery programs. A frequently-overlooked modelling consideration in Bayesian optimisation strategies however, is the representation of heteroscedastic aleatoric uncertainty. In many practical applications it is desirable to identify inputs with low aleatoric noise, an example of which might be a material composition which consistently displays robust properties in response to a noisy fabrication process. In this paper, we propose a heteroscedastic Bayesian optimisation scheme capable of representing and minimising aleatoric noise across the input space. Our scheme employs a heteroscedastic Gaussian process (GP) surrogate model in conjunction with two straightforward adaptations of existing acquisition functions. First, we extend the augmented expected improvement (AEI) heuristic to the heteroscedastic setting and second, we introduce the aleatoric noise-penalised expected improvement (ANPEI) heuristic. Both methodologies are capable of penalising aleatoric noise in the suggestions and yield improved performance relative to homoscedastic Bayesian optimisation and random sampling on toy problems as well as on two real-world scientific datasets. Code is available at: \url{https://github.com/Ryan-Rhys/Heteroscedastic-BO}
△ Less
Submitted 20 May, 2022; v1 submitted 17 October, 2019;
originally announced October 2019.