-
Hardware Aware Evolutionary Neural Architecture Search using Representation Similarity Metric
Authors:
Nilotpal Sinha,
Abd El Rahman Shabayek,
Anis Kacem,
Peyman Rostami,
Carl Shneider,
Djamila Aouada
Abstract:
Hardware-aware Neural Architecture Search (HW-NAS) is a technique used to automatically design the architecture of a neural network for a specific task and target hardware. However, evaluating the performance of candidate architectures is a key challenge in HW-NAS, as it requires significant computational resources. To address this challenge, we propose an efficient hardware-aware evolution-based NAS approach called HW-EvRSNAS. Our approach re-frames the neural architecture search problem as finding an architecture with performance similar to that of a reference model for a target hardware, while adhering to a cost constraint for that hardware. This is achieved through a representation similarity metric known as Representation Mutual Information (RMI), employed as a proxy performance evaluator. It measures the mutual information between the hidden layer representations of a reference model and those of sampled architectures using a single training batch. We also use a penalty term that penalizes the search process in proportion to how far an architecture's hardware cost is from the desired hardware cost threshold. This significantly reduces search time relative to the literature, reaching speedups of up to 8000x and thereby lowering CO2 emissions. The proposed approach is evaluated on two different search spaces while using lower computational resources. Furthermore, our approach is thoroughly examined on six different edge devices under various hardware cost constraints.
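The penalty idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption for illustration: the function name `penalized_fitness`, the relative-distance form of the penalty, and the `penalty_weight` parameter are hypothetical and are not taken from the paper.

```python
def penalized_fitness(similarity, hw_cost, cost_threshold, penalty_weight=1.0):
    """Illustrative fitness: proxy score minus a hardware-cost penalty.

    similarity     -- proxy performance score (e.g. a similarity to a reference model)
    hw_cost        -- measured hardware cost of the candidate (e.g. latency)
    cost_threshold -- desired hardware cost for the target device
    penalty_weight -- hypothetical scaling factor (an assumption, not from the paper)
    """
    if cost_threshold <= 0:
        raise ValueError("cost_threshold must be positive")
    # Assumed form: penalty grows linearly with the relative distance
    # between the candidate's cost and the desired threshold.
    penalty = penalty_weight * abs(hw_cost - cost_threshold) / cost_threshold
    return similarity - penalty


# A candidate 20% away from the threshold loses 0.5 * 0.2 = 0.1 of its score.
score = penalized_fitness(0.9, hw_cost=12.0, cost_threshold=10.0, penalty_weight=0.5)
```

In an evolutionary search, a score of this kind would be used to rank candidates, so architectures far from the cost target are less likely to survive selection.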
Submitted 7 November, 2023;
originally announced November 2023.
-
Impact of Disentanglement on Pruning Neural Networks
Authors:
Carl Shneider,
Peyman Rostami,
Anis Kacem,
Nilotpal Sinha,
Abd El Rahman Shabayek,
Djamila Aouada
Abstract:
Deploying deep neural networks on edge devices to accomplish task-specific objectives in the real world requires a reduction in their memory footprint, power consumption, and latency. This can be realized via efficient model compression. Disentangled latent representations produced by variational autoencoder (VAE) networks are a promising approach for achieving model compression because they mainly retain task-specific information, discarding information that is useless for the task at hand. We make use of the Beta-VAE framework combined with a standard criterion for pruning to investigate the impact of forcing the network to learn disentangled representations on the pruning process for the task of classification. In particular, we perform experiments on the MNIST and CIFAR10 datasets, examine disentanglement challenges, and propose a path forward for future work.
Submitted 19 July, 2023;
originally announced July 2023.
-
A Survey on Deep Learning-Based Monocular Spacecraft Pose Estimation: Current State, Limitations and Prospects
Authors:
Leo Pauly,
Wassim Rharbaoui,
Carl Shneider,
Arunkumar Rathinam,
Vincent Gaudilliere,
Djamila Aouada
Abstract:
Estimating the pose of an uncooperative spacecraft is an important computer vision problem for enabling the deployment of automatic vision-based systems in orbit, with applications ranging from on-orbit servicing to space debris removal. Following the general trend in computer vision, more and more works have been focusing on leveraging Deep Learning (DL) methods to address this problem. However, despite promising research-stage results, major challenges still prevent the use of such methods in real-life missions. In particular, the deployment of such computation-intensive algorithms is still under-investigated, while the performance drop when training on synthetic and testing on real images remains to be mitigated. The primary goal of this survey is to describe the current DL-based methods for spacecraft pose estimation in a comprehensive manner. The secondary goal is to help define the limitations towards the effective deployment of DL-based spacecraft pose estimation solutions for reliable autonomous vision-based applications. To this end, the survey first summarises the existing algorithms according to two approaches: hybrid modular pipelines and direct end-to-end regression methods. A comparison of algorithms is presented not only in terms of pose accuracy but also with a focus on network architectures and models' sizes, keeping potential deployment in mind. Then, current monocular spacecraft pose estimation datasets used to train and test these methods are discussed. The data generation methods (simulators and testbeds), the domain gap and the performance drop between synthetically generated and lab/space collected images, and the potential solutions are also discussed. Finally, the paper presents open research questions and future directions in the field, drawing parallels with other computer vision applications.
Submitted 17 May, 2023; v1 submitted 12 May, 2023;
originally announced May 2023.
-
Probabilistic prediction of Dst storms one-day-ahead using Full-Disk SoHO Images
Authors:
A. Hu,
C. Shneider,
A. Tiwari,
E. Camporeale
Abstract:
We present a new model for the probability that the Disturbance storm time (Dst) index exceeds -100 nT, with a lead time between 1 and 3 days. $Dst$ provides essential information about the strength of the ring current around the Earth caused by the protons and electrons from the solar wind, and it is routinely used as a proxy for geomagnetic storms. The model is developed using an ensemble of Convolutional Neural Networks (CNNs) that are trained using SoHO images (MDI, EIT and LASCO). The relationship between the SoHO images and the solar wind has been investigated by many researchers, but these studies have not explicitly considered using SoHO images to predict the $Dst$ index.
This work presents a novel methodology to train the individual models and to learn the optimal ensemble weights iteratively, by using a customized class-balanced mean square error (CB-MSE) loss function tied to a least-squares (LS) based ensemble.
The proposed model can predict the probability that Dst<-100 nT 24 hours ahead with a True Skill Statistic (TSS) of 0.62 and Matthews Correlation Coefficient (MCC) of 0.37. The weighted TSS and MCC from Guastavino et al. (2021) are 0.68 and 0.47, respectively. An additional validation during non-Earth-directed CME periods is also conducted, which yields good TSS and MCC scores.
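For readers unfamiliar with the two skill scores quoted above, the sketch below computes TSS and MCC from the counts of a binary confusion matrix using their standard definitions. The function name `tss_and_mcc` and the example counts are illustrative assumptions. They are not taken from the paper, and the paper's weighted variants are not reproduced here.

```python
import math


def tss_and_mcc(tp, fp, fn, tn):
    """Return (TSS, MCC) for a binary classifier from confusion-matrix counts."""
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both classes must be present to compute TSS")
    # TSS = true positive rate minus false positive rate; ranges over [-1, 1].
    tss = tp / (tp + fn) - fp / (fp + tn)
    # MCC = correlation between predicted and observed labels; ranges over [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return tss, mcc


# Hypothetical counts: 40 hits, 10 false alarms, 10 misses, 40 correct rejections.
tss, mcc = tss_and_mcc(tp=40, fp=10, fn=10, tn=40)
```

TSS does not depend on the ratio of storm to non-storm samples, whereas MCC does, which is one reason the two scores are usually reported together for rare events.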
Submitted 29 March, 2022; v1 submitted 21 March, 2022;
originally announced March 2022.
-
The data-driven future of high energy density physics
Authors:
Peter W. Hatfield,
Jim A. Gaffney,
Gemma J. Anderson,
Suzanne Ali,
Luca Antonelli,
Suzan Başeğmez du Pree,
Jonathan Citrin,
Marta Fajardo,
Patrick Knapp,
Brendan Kettle,
Bogdan Kustowski,
Michael J. MacDonald,
Derek Mariscal,
Madison E. Martin,
Taisuke Nagayama,
Charlotte A. J. Palmer,
J. Luc Peterson,
Steven Rose,
J J Ruby,
Carl Shneider,
Matt J. V. Streeter,
Will Trickey,
Ben Williams
Abstract:
The study of plasma physics under conditions of extreme temperatures, densities and electromagnetic field strengths is significant for our understanding of astrophysics, nuclear fusion and fundamental physics. These extreme physical systems are strongly non-linear and very difficult to understand theoretically or optimize experimentally. Here, we argue that machine learning models and data-driven methods are in the process of reshaping our exploration of these extreme systems that have hitherto proven far too non-linear for human researchers. From a fundamental perspective, our understanding can be helped by the way in which machine learning models can rapidly discover complex interactions in large data sets. From a practical point of view, the newest generation of extreme physics facilities can perform experiments multiple times a second (as opposed to ~daily), moving away from human-based control towards automatic control based on real-time interpretation of diagnostic data and updates of the physics model. To make the most of these emerging opportunities, we advance proposals for the community in terms of research design, training, best practices, and support for synthetic diagnostics and data analysis.
Submitted 22 November, 2021;
originally announced November 2021.
-
A Machine-Learning-Ready Dataset Prepared from the Solar and Heliospheric Observatory Mission
Authors:
Carl Shneider,
Andong Hu,
Ajay K. Tiwari,
Monica G. Bobra,
Karl Battams,
Jannis Teunissen,
Enrico Camporeale
Abstract:
We present a Python tool to generate a standard dataset from solar images that allows for user-defined selection criteria and a range of pre-processing steps. Our Python tool works with all image products from both the Solar and Heliospheric Observatory (SoHO) and Solar Dynamics Observatory (SDO) missions. We discuss a dataset produced from the SoHO mission's multi-spectral images which is free of missing or corrupt data as well as planetary transits in coronagraph images, and is temporally synced making it ready for input to a machine learning system. Machine-learning-ready images are a valuable resource for the community because they can be used, for example, for forecasting space weather parameters. We illustrate the use of this data with a 3-5 day-ahead forecast of the north-south component of the interplanetary magnetic field (IMF) observed at Lagrange point one (L1). For this use case, we apply a deep convolutional neural network (CNN) to a subset of the full SoHO dataset and compare with baseline results from a Gaussian Naive Bayes classifier.
Submitted 4 August, 2021;
originally announced August 2021.
-
Ideas for Improving the Field of Machine Learning: Summarizing Discussion from the NeurIPS 2019 Retrospectives Workshop
Authors:
Shagun Sodhani,
Mayoore S. Jaiswal,
Lauren Baker,
Koustuv Sinha,
Carl Shneider,
Peter Henderson,
Joel Lehman,
Ryan Lowe
Abstract:
This report documents ideas for improving the field of machine learning, which arose from discussions at the ML Retrospectives workshop at NeurIPS 2019. The goal of the report is to disseminate these ideas more broadly, and in turn encourage continuing discussion about how the field could improve along these axes. We focus on topics that were most discussed at the workshop: incentives for encouraging alternate forms of scholarship, re-structuring the review process, participation from academia and industry, and how we might better train computer scientists as scientists. Videos from the workshop can be accessed at https://slideslive.com/neurips/west-114-115-retrospectives-a-venue-for-selfreflection-in-ml-research
Submitted 20 July, 2020;
originally announced July 2020.
-
Single-Frame Super-Resolution of Solar Magnetograms: Investigating Physics-Based Metrics & Losses
Authors:
Anna Jungbluth,
Xavier Gitiaux,
Shane A. Maloney,
Carl Shneider,
Paul J. Wright,
Alfredo Kalaitzis,
Michel Deudon,
Atılım Güneş Baydin,
Yarin Gal,
Andrés Muñoz-Jaramillo
Abstract:
Breakthroughs in our understanding of physical phenomena have traditionally followed improvements in instrumentation. Studies of the magnetic field of the Sun, and its influence on the solar dynamo and space weather events, have benefited from improvements in resolution and measurement frequency of new instruments. However, in order to fully understand the solar cycle, high-quality data across time-scales longer than the typical lifespan of a solar instrument are required. At the moment, discrepancies between measurement surveys prevent the combined use of all available data. In this work, we show that machine learning can help bridge the gap between measurement surveys by learning to super-resolve low-resolution magnetic field images and translate between characteristics of contemporary instruments in orbit. We also introduce the notion of physics-based metrics and losses for super-resolution to preserve underlying physics and constrain the solution space of possible super-resolution outputs.
Submitted 4 November, 2019;
originally announced November 2019.
-
Probabilistic Super-Resolution of Solar Magnetograms: Generating Many Explanations and Measuring Uncertainties
Authors:
Xavier Gitiaux,
Shane A. Maloney,
Anna Jungbluth,
Carl Shneider,
Paul J. Wright,
Atılım Güneş Baydin,
Michel Deudon,
Yarin Gal,
Alfredo Kalaitzis,
Andrés Muñoz-Jaramillo
Abstract:
Machine learning techniques have been successfully applied to super-resolution tasks on natural images where visually pleasing results are sufficient. However, in many scientific domains this is not adequate, and estimates of errors and uncertainties are crucial. To address this issue, we propose a Bayesian framework that decomposes uncertainties into epistemic and aleatoric uncertainties. We test the validity of our approach by super-resolving images of the Sun's magnetic field and by generating maps measuring the range of possible high resolution explanations compatible with a given low resolution magnetogram.
Submitted 4 November, 2019;
originally announced November 2019.
-
Constraining regular and turbulent magnetic field strengths in M51 via Faraday depolarization
Authors:
Carl Shneider,
Marijke Haverkorn,
Andrew Fletcher,
Anvar Shukurov
Abstract:
We employ an analytical model that incorporates both wavelength-dependent and wavelength-independent depolarization to describe radio polarimetric observations of polarization at $λλλ\, 3.5, 6.2, 20.5$ cm in M51 (NGC 5194). The aim is to constrain both the regular and turbulent magnetic field strengths in the disk and halo, modeled as a two- or three-layer magneto-ionic medium, via differential Faraday rotation and internal Faraday dispersion, along with wavelength-independent depolarization arising from turbulent magnetic fields. A reduced chi-squared analysis is used for the statistical comparison of predicted to observed polarization maps to determine the best-fit magnetic field configuration at each of four radial rings spanning $2.4 - 7.2$ kpc in $1.2$ kpc increments. We find that a two-layer modeling approach provides a better fit to the observations than a three-layer model, where the near and far sides of the halo are taken to be identical, although the resulting best-fit magnetic field strengths are comparable. This implies that all of the signal from the far halo is depolarized at these wavelengths. We find a total magnetic field in the disk of approximately $18~μ$G and a total magnetic field strength in the halo of $\sim 4-6~μ$G. Both turbulent and regular magnetic field strengths in the disk exceed those in the halo by a factor of a few. About half of the turbulent magnetic field in the disk is anisotropic, but in the halo all turbulence is only isotropic.
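The reduced chi-squared comparison mentioned above follows a standard recipe, sketched here in general form. The function name `reduced_chi_squared`, its arguments, and the sample numbers are illustrative assumptions. The paper's actual maps, uncertainties, and number of fitted parameters are not reproduced.

```python
def reduced_chi_squared(observed, predicted, sigma, n_free_params):
    """Reduced chi-squared between observed and model-predicted values.

    observed, predicted, sigma -- equal-length sequences (sigma = 1-sigma errors)
    n_free_params              -- number of fitted model parameters
    """
    if not (len(observed) == len(predicted) == len(sigma)):
        raise ValueError("input sequences must have equal length")
    dof = len(observed) - n_free_params  # degrees of freedom
    if dof <= 0:
        raise ValueError("need more data points than free parameters")
    # Sum of squared residuals, each normalised by its measurement error.
    chi2 = sum(((o - p) / s) ** 2 for o, p, s in zip(observed, predicted, sigma))
    return chi2 / dof


# Hypothetical values: four measurements, a model with two free parameters.
value = reduced_chi_squared(
    observed=[1.0, 2.0, 3.0, 4.0],
    predicted=[1.1, 1.9, 3.2, 3.8],
    sigma=[0.1, 0.1, 0.1, 0.1],
    n_free_params=2,
)
```

A value near 1 indicates that the model matches the data to within the stated errors, so candidate field configurations can be ranked by how close their value comes to 1.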
Submitted 22 June, 2014;
originally announced June 2014.
-
Depolarization of synchrotron radiation in a multilayer magneto-ionic medium
Authors:
Carl Shneider,
Marijke Haverkorn,
Andrew Fletcher,
Anvar Shukurov
Abstract:
Depolarization of diffuse radio synchrotron emission is classified in terms of wavelength-independent and wavelength-dependent depolarization in the context of regular magnetic fields and of both isotropic and anisotropic turbulent magnetic fields. Previous analytical formulas for depolarization due to differential Faraday rotation are extended to include internal Faraday dispersion concomitantly, for a multilayer synchrotron emitting and Faraday rotating magneto-ionic medium. In particular, depolarization equations for a two- and three-layer system (disk-halo, halo-disk-halo) are explicitly derived. To both serve as a `user's guide' to the theoretical machinery and as an approach for disentangling line-of-sight depolarization contributions in face-on galaxies, the analytical framework is applied to data from a small region in the face-on grand-design spiral galaxy M51. The effectiveness of the multiwavelength observations in constraining the pool of physical depolarization scenarios is illustrated for a two- and three-layer model along with a Faraday screen system for an observationally motivated magnetic field configuration.
Submitted 14 May, 2014;
originally announced May 2014.