Search | arXiv e-print repository

arXiv:2403.12000 [pdf, other]

doi 10.5281/zenodo.7088404

Notochord: a Flexible Probabilistic Model for Real-Time MIDI Performance

Authors: Victor Shepardson, Jack Armitage, Thor Magnusson

Abstract: Deep learning-based probabilistic models of musical data are producing increasingly realistic results and promise to enter creative workflows of many kinds. Yet they have been little-studied in a performance setting, where the results of user actions typically ought to feel instantaneous. To enable such study, we designed Notochord, a deep probabilistic model for sequences of structured events, and trained an instance of it on the Lakh MIDI dataset. Our probabilistic formulation allows interpretable interventions at a sub-event level, which enables one model to act as a backbone for diverse interactive musical functions including steerable generation, harmonization, machine improvisation, and likelihood-based interfaces. Notochord can generate polyphonic and multi-track MIDI, and respond to inputs with latency below ten milliseconds. Training code, model checkpoints and interactive examples are provided as open source software.

Submitted 18 March, 2024; originally announced March 2024.

Comments: 12 pages, 6 figures. Proceedings of the 3rd Conference on AI Music Creativity (2022, September 17)

arXiv:2309.04491 [pdf, ps, other]

A Context-Sensitive Approach to XAI in Music Performance

Authors: Nicola Privato, Jack Armitage

Abstract: The rapidly evolving field of Explainable Artificial Intelligence (XAI) has generated significant interest in developing methods to make AI systems more transparent and understandable. However, the problem of explainability cannot be exhaustively solved in the abstract, as there is no single approach that can be universally applied to generate adequate explanations for any given AI system, and this is especially true in the arts. In this position paper, we propose an Explanatory Pragmatism (EP) framework for XAI in music performance, emphasising the importance of context and audience in the development of explainability requirements. By tailoring explanations to specific audiences and continuously refining them based on feedback, EP offers a promising direction for enhancing the transparency and interpretability of AI systems in broad artistic applications and more specifically to music performance.

Submitted 5 September, 2023; originally announced September 2023.

arXiv:2308.05496 [pdf, other]

Exploring XAI for the Arts: Explaining Latent Space in Generative Music

Authors: Nick Bryan-Kinns, Berker Banar, Corey Ford, Courtney N. Reed, Yixiao Zhang, Simon Colton, Jack Armitage

Abstract: Explainable AI has the potential to support more interactive and fluid co-creative AI systems which can creatively collaborate with people. To do this, creative AI models need to be amenable to debugging by offering eXplainable AI (XAI) features which are inspectable, understandable, and modifiable. However, currently there is very little XAI for the arts. In this work, we demonstrate how a latent variable model for music generation can be made more explainable; specifically we extend MeasureVAE which generates measures of music. We increase the explainability of the model by: i) using latent space regularisation to force some specific dimensions of the latent space to map to meaningful musical attributes, ii) providing a user interface feedback loop to allow people to adjust dimensions of the latent space and observe the results of these changes in real-time, iii) providing a visualisation of the musical attributes in the latent space to help people understand and predict the effect of changes to latent space dimensions. We suggest that in doing so we bridge the gap between the latent space and the generated musical outcomes in a meaningful way which makes the model and its outputs more explainable and more debuggable.

Submitted 10 August, 2023; originally announced August 2023.

arXiv:2211.04160 [pdf, other]

A Neural Network Subgrid Model of the Early Stages of Planet Formation

Authors: Thomas Pfeil, Miles Cranmer, Shirley Ho, Philip J. Armitage, Tilman Birnstiel, Hubert Klahr

Abstract: Planet formation is a multi-scale process in which the coagulation of $\mathrm{μm}$-sized dust grains in protoplanetary disks is strongly influenced by the hydrodynamic processes on scales of astronomical units ($\approx 1.5\times 10^8 \,\mathrm{km}$). Studies are therefore dependent on subgrid models to emulate the micro physics of dust coagulation on top of a large scale hydrodynamic simulation. Numerical simulations which include the relevant physical effects are complex and computationally expensive. Here, we present a fast and accurate learned effective model for dust coagulation, trained on data from high resolution numerical coagulation simulations. Our model captures details of the dust coagulation process that were so far not tractable with other dust coagulation prescriptions with similar computational efficiency.

Submitted 8 November, 2022; originally announced November 2022.

Comments: 6 pages, 4 figures, accepted at the Machine Learning and the Physical Sciences workshop, NeurIPS 2022

arXiv:2210.02339 [pdf, other]

doi 10.3847/1538-4357/ad088c

Particle clustering in turbulence: Prediction of spatial and statistical properties with deep learning

Authors: Yan-Mong Chan, Natascha Manger, Yin Li, Chao-Chin Yang, Zhaohuan Zhu, Philip J. Armitage, Shirley Ho

Abstract: We investigate the utility of deep learning for modeling the clustering of particles that are aerodynamically coupled to turbulent fluids. Using a Lagrangian particle module within the Athena++ hydrodynamics code, we simulate the dynamics of particles in the Epstein drag regime within a periodic domain of isotropic forced hydrodynamic turbulence. This setup is an idealized model relevant to the collisional growth of micron to mm-sized dust particles in early stage planet formation. The simulation data are used to train a U-Net deep learning model to predict gridded three-dimensional representations of the particle density and velocity fields, given as input the corresponding fluid fields. The trained model qualitatively captures the filamentary structure of clustered particles in a highly non-linear regime. We assess model fidelity by calculating metrics of the density field (the radial distribution function) and of the velocity field (the relative velocity and the relative radial velocity between particles). Although trained only on the spatial fields, the model predicts these statistical quantities with errors that are typically <10%. Our results suggest that, given appropriately expanded training data, deep learning could complement direct numerical simulations in predicting particle clustering within turbulent flows.

Submitted 6 January, 2024; v1 submitted 5 October, 2022; originally announced October 2022.

Comments: 20 pages, 13 figures; Accepted to ApJ on 2023 October 30

Journal ref: ApJ 960 19 (2024)

arXiv:2207.11717 [pdf, other]

A Priority Map for Vision-and-Language Navigation with Trajectory Plans and Feature-Location Cues

Authors: Jason Armitage, Leonardo Impett, Rico Sennrich

Abstract: In a busy city street, a pedestrian surrounded by distractions can pick out a single sign if it is relevant to their route. Artificial agents in outdoor Vision-and-Language Navigation (VLN) are also confronted with detecting supervisory signal on environment features and location in inputs. To boost the prominence of relevant features in transformer-based architectures without costly preprocessing and pretraining, we take inspiration from priority maps - a mechanism described in neuropsychological studies. We implement a novel priority map module and pretrain on auxiliary tasks using low-sample datasets with high-level representations of routes and environment-related references to urban features. A hierarchical process of trajectory planning - with subsequent parameterised visual boost filtering on visual inputs and prediction of corresponding textual spans - addresses the core challenges of cross-modal alignment and feature-level localisation. The priority map module is integrated into a feature-location framework that doubles the task completion rates of standalone transformers and attains state-of-the-art performance on the Touchdown benchmark for VLN. Code and data are referenced in Appendix C.

Submitted 18 November, 2022; v1 submitted 24 July, 2022; originally announced July 2022.

Comments: Accepted to WACV 2023

ACM Class: I.2

arXiv:2101.04117 [pdf, other]

doi 10.1073/pnas.2026053118

A Bayesian neural network predicts the dissolution of compact planetary systems

Authors: Miles Cranmer, Daniel Tamayo, Hanno Rein, Peter Battaglia, Samuel Hadden, Philip J. Armitage, Shirley Ho, David N. Spergel

Abstract: Despite over three hundred years of effort, no solutions exist for predicting when a general planetary configuration will become unstable. We introduce a deep learning architecture to push forward this problem for compact systems. While current machine learning algorithms in this area rely on scientist-derived instability metrics, our new technique learns its own metrics from scratch, enabled by a novel internal structure inspired from dynamics theory. Our Bayesian neural network model can accurately predict not only if, but also when a compact planetary system with three or more planets will go unstable. Our model, trained directly from short N-body time series of raw orbital elements, is more than two orders of magnitude more accurate at predicting instability times than analytical estimators, while also reducing the bias of existing machine learning algorithms by nearly a factor of three. Despite being trained on compact resonant and near-resonant three-planet configurations, the model demonstrates robust generalization to both non-resonant and higher multiplicity configurations, in the latter case outperforming models fit to that specific set of integrations. The model computes instability estimates up to five orders of magnitude faster than a numerical integrator, and unlike previous efforts provides confidence intervals on its predictions. Our inference model is publicly available in the SPOCK package, with training code open-sourced.

Submitted 11 January, 2021; originally announced January 2021.

Comments: 8 content pages, 7 appendix and references. 8 figures. Source code at: https://github.com/MilesCranmer/bnn_chaos_model; inference code at: https://github.com/dtamayo/spock

arXiv:2008.11450 [pdf, other]

Training Multimodal Systems for Classification with Multiple Objectives

Authors: Jason Armitage, Shramana Thakur, Rishi Tripathi, Jens Lehmann, Maria Maleshkova

Abstract: We learn about the world from a diverse range of sensory information. Automated systems lack this ability as investigation has centred on processing information presented in a single form. Adapting architectures to learn from multiple modalities creates the potential to learn rich representations of the world - but current multimodal systems only deliver marginal improvements on unimodal approaches. Neural networks learn sampling noise during training with the result that performance on unseen data is degraded. This research introduces a second objective over the multimodal fusion process learned with variational inference. Regularisation methods are implemented in the inner training loop to control variance and the modular structure stabilises performance as additional neurons are added to layers. This framework is evaluated on a multilabel classification task with textual and visual inputs to demonstrate the potential for multiple objectives and probabilistic methods to lower variance and improve generalisation.

Submitted 26 August, 2020; originally announced August 2020.

Journal ref: Proceedings of the 1st International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 17th Extended Semantic Web Conference (ESWC 2020)

arXiv:2008.06376 [pdf, other]

doi 10.1145/3340531.3412783

MLM: A Benchmark Dataset for Multitask Learning with Multiple Languages and Modalities

Authors: Jason Armitage, Endri Kacupaj, Golsa Tahmasebzadeh, Swati, Maria Maleshkova, Ralph Ewerth, Jens Lehmann

Abstract: In this paper, we introduce the MLM (Multiple Languages and Modalities) dataset - a new resource to train and evaluate multitask systems on samples in multiple modalities and three languages. The generation process and inclusion of semantic data provide a resource that further tests the ability for multitask systems to learn relationships between entities. The dataset is designed for researchers and developers who build applications that perform multiple tasks on data encountered on the web and in digital archives. A second version of MLM provides a geo-representative subset of the data with weighted samples for countries of the European Union. We demonstrate the value of the resource in developing novel applications in the digital humanities with a motivating use case and specify a benchmark set of tasks to retrieve modalities and locate entities in the dataset. Evaluation of baseline multitask and single task systems on the full and geo-representative versions of MLM demonstrate the challenges of generalising on diverse data. In addition to the digital humanities, we expect the resource to contribute to research in multimodal representation learning, location estimation, and scene understanding.

Submitted 4 September, 2020; v1 submitted 14 August, 2020; originally announced August 2020.

Journal ref: Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 2967-2974. 2020

arXiv:1910.13325 [pdf]

Fragment Graphical Variational AutoEncoding for Screening Molecules with Small Data

Authors: John Armitage, Leszek J. Spalek, Malgorzata Nguyen, Mark Nikolka, Ian E. Jacobs, Lorena Marañón, Iyad Nasrallah, Guillaume Schweicher, Ivan Dimov, Dimitrios Simatos, Iain McCulloch, Christian B. Nielsen, Gareth Conduit, Henning Sirringhaus

Abstract: In the majority of molecular optimization tasks, predictive machine learning (ML) models are limited due to the unavailability and cost of generating big experimental datasets on the specific task. To circumvent this limitation, ML models are trained on big theoretical datasets or experimental indicators of molecular suitability that are either publicly available or inexpensive to acquire. These approaches produce a set of candidate molecules which have to be ranked using limited experimental data or expert knowledge. Under the assumption that structure is related to functionality, here we use a molecular fragment-based graphical autoencoder to generate unique structural fingerprints to efficiently search through the candidate set. We demonstrate that fragment-based graphical autoencoding reduces the error in predicting physical characteristics such as the solubility and partition coefficient in the small data regime compared to other extended circular fingerprints and string based approaches. We further demonstrate that this approach is capable of providing insight into real world molecular optimization problems, such as searching for stabilization additives in organic semiconductors by accurately predicting 92% of test molecules given 69 training examples. This task is a model example of black box molecular optimization as there is minimal theoretical and experimental knowledge to accurately predict the suitability of the additives.

Submitted 30 October, 2019; v1 submitted 21 October, 2019; originally announced October 2019.

Showing 1–10 of 10 results for author: Armitage, J