-
Spectral Determinants of Almost Equilateral Quantum Graphs
Authors:
Jonathan Harrison,
Tracy Weyand
Abstract:
Kirchhoff's matrix tree theorem of 1847 connects the number of spanning trees of a graph to the spectral determinant of the discrete Laplacian [6]. Recently an analogue was obtained for quantum graphs relating the number of spanning trees to the spectral determinant of a Laplacian acting on functions on a metric graph with standard (Neumann-like) vertex conditions [11]. This result holds for quantum graphs where the edge lengths are close together. A quantum graph where the edge lengths are all equal is called equilateral. Here we consider equilateral graphs where we perturb the length of a single edge (almost equilateral graphs). We analyze the spectral determinant of almost equilateral complete graphs, complete bipartite graphs, and circulant graphs. This provides a measure of how fast the spectral determinant changes with respect to changes in an edge length. We apply these results to estimate the width of a window of edge lengths where the connection between the number of spanning trees and the spectral determinant can be observed. The results suggest the connection holds for a much wider window of edge lengths than is required in [11].
Submitted 25 June, 2024;
originally announced June 2024.
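The discrete matrix tree theorem cited in the abstract above can be checked on small graphs. The following is a minimal sketch, not taken from the paper: it builds the discrete Laplacian L = D - A, deletes one row and column, and takes the determinant, which by the theorem equals the number of spanning trees. All function names are illustrative, and the sketch covers only the discrete (combinatorial) Laplacian, not the quantum graph analogue studied in the paper.

```python
from fractions import Fraction


def laplacian(n, edges):
    """Discrete Laplacian L = D - A of an undirected simple graph on n vertices."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L


def determinant(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    n = len(M)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return det


def count_spanning_trees(n, edges):
    """Matrix tree theorem: any cofactor of L equals the number of spanning trees."""
    L = laplacian(n, edges)
    minor = [row[1:] for row in L[1:]]  # delete row 0 and column 0
    return int(determinant(minor))


def complete_graph_edges(n):
    """Edge list of the complete graph K_n on vertices 0..n-1."""
    return [(u, v) for u in range(n) for v in range(u + 1, n)]


if __name__ == "__main__":
    print(count_spanning_trees(4, complete_graph_edges(4)))
```

For the complete graph on n vertices the count should agree with Cayley's formula n^(n-2), which gives a quick sanity check for the sketch.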
-
$\bar{b}c$ susceptibilities from fully relativistic lattice QCD
Authors:
Judd Harrison
Abstract:
We compute the $\bar{h}c$ (pseudo)scalar, (axial-)vector and (axial-)tensor susceptibilities as a function of $u=m_c/m_h$ between $u=m_c/m_b$ and $u=0.8$ using fully relativistic lattice QCD, employing nonperturbative current renormalisation and using the second generation 2+1+1 MILC HISQ gluon field configurations. We include ensembles with $a\approx 0.09\mathrm{fm}$, $0.06\mathrm{fm}$, $0.045\mathrm{fm}$ and $0.033\mathrm{fm}$ and we are able to reach the physical $b$-quark on the two finest ensembles. At the physical $m_h=m_b$ point we find $\overline{m}_b^2 χ_{1^+}={0.720(34)\times 10^{-2}}$, $\overline{m}_b^2 χ_{1^-}={1.161(54)\times 10^{-2}}$, $χ_{0^-}={2.374(33)\times 10^{-2}}$, $χ_{0^+}={0.609(14)\times 10^{-2}}$. Our results for the (pseudo)scalar, vector and axial-vector are compatible with the expected small size of nonperturbative effects at $u=m_c/m_b$. We also give the first nonperturbative determination of the tensor susceptibilities, finding $\overline{m}_b^2 χ_{T}={0.891(44)\times 10^{-2}}$ and $\overline{m}_b^2 χ_{AT}={0.441(33)\times 10^{-2}}$. Our value of $\overline{m}_b^2χ_{AT}$ is in good agreement with the $\mathcal{O}(α_s)$ perturbation theory, while our result for $\overline{m}_b^2χ_{T}$ is in tension with the $\mathcal{O}(α_s)$ perturbation theory at the level of $2σ$. These results will allow for dispersively bounded parameterisations to be employed using lattice inputs for the full set of $h\to c$ semileptonic form factors in future calculations, for heavy-quark masses in the range $1.25\times m_c \leq m_h \leq m_b$.
Submitted 2 May, 2024;
originally announced May 2024.
-
First 2D electron density measurements using Coherence Imaging Spectroscopy in the MAST-U Super-X divertor
Authors:
N. Lonigro,
R. Doyle,
J. S. Allcock,
B. Lipschultz,
K. Verhaegh,
C. Bowman,
D. Brida,
J. Harrison,
O. Myatra,
S. Silburn,
C. Theiler,
T. A. Wijkamp,
MAST-U Team,
the EUROfusion Tokamak Exploitation Team
Abstract:
2D profiles of electron density and neutral temperature are inferred from multi-delay Coherence Imaging Spectroscopy data of divertor plasmas using a non-linear inversion technique. The inference is based on imaging the spectral line-broadening of Balmer lines and can differentiate between the Doppler and Stark broadening components by measuring the fringe contrast at multiple interferometric delays simultaneously. The model has been applied to images generated from simulated density profiles to evaluate its performance. Typical mean absolute errors of 30 percent are achieved, which are consistent with Monte Carlo uncertainty propagation accounting for noise, uncertainties in the calibrations, and in the model inputs. The analysis has been tested on experimental data from the MAST-U Super-X divertor, where it infers typical electron densities of 2-3 $\times 10^{19}$ m$^{-3}$ and neutral temperatures of 0-2 eV during beam-heated L-mode discharges. The results are shown to be in reasonable agreement with the other available diagnostics.
Submitted 18 April, 2024;
originally announced April 2024.
-
Variational Bayesian Last Layers
Authors:
James Harrison,
John Willes,
Jasper Snoek
Abstract:
We introduce a deterministic variational formulation for training Bayesian last layer neural networks. This yields a sampling-free, single-pass model and loss that effectively improves uncertainty estimation. Our variational Bayesian last layer (VBLL) can be trained and evaluated with only quadratic complexity in last layer width, and is thus (nearly) computationally free to add to standard architectures. We experimentally investigate VBLLs, and show that they improve predictive accuracy, calibration, and out of distribution detection over baselines across both regression and classification. Finally, we investigate combining VBLL layers with variational Bayesian feature learning, yielding a lower variance collapsed variational inference method for Bayesian neural networks.
Submitted 17 April, 2024;
originally announced April 2024.
-
Scalability in Building Component Data Annotation: Enhancing Facade Material Classification with Synthetic Data
Authors:
Josie Harrison,
Alexander Hollberg,
Yinan Yu
Abstract:
Computer vision models trained on Google Street View images can create material cadastres. However, current approaches need manually annotated datasets that are difficult to obtain and often have class imbalance. To address these challenges, this paper fine-tuned a Swin Transformer model on a synthetic dataset generated with DALL-E and compared the performance to a similar manually annotated dataset. Although manual annotation remains the gold standard, the synthetic dataset performance demonstrates a reasonable alternative. The findings will ease annotation needed to develop material cadastres, offering architects insights into opportunities for material reuse, thus contributing to the reduction of demolition waste.
Submitted 12 April, 2024;
originally announced April 2024.
-
Two-dimensional inference of divertor plasma characteristics: advancements to a multi-instrument Bayesian analysis system
Authors:
Daniel Greenhouse,
Chris Bowman,
Bruce Lipschultz,
Kevin Verhaegh,
James Harrison,
Alexandre Fil
Abstract:
An integrated data analysis system based on Bayesian inference has been developed for application to data from multiple diagnostics over the two-dimensional cross-section of tokamak divertors. Tests of the divertor multi-instrument Bayesian analysis system (D-MIBAS) on a synthetic data set (including realistic experimental uncertainties) generated from SOLPS-ITER predictions of the MAST-U divertor have been performed. The resulting inference was within 6\%, 5\% and 30\% median absolute percentage error of the SOLPS-predicted electron temperature, electron density and neutral atomic hydrogen density, respectively, across a two-dimensional poloidal cross-section of the MAST-U Super-X outer divertor.
To accommodate molecular contributions to Balmer emission, an advanced emission model has been developed which is shown to be crucial for inference accuracy. Our D-MIBAS system utilises a mesh aligned to poloidal magnetic flux-surfaces, throughout the divertor, with plasma parameters assigned to each mesh vertex and collectively considered in the inference. This allowed comprehensive forward models to multiple diagnostics and the inclusion of expected physics. This is shown to be important for inference precision when including molecular contributions to Balmer emission. These developments pave the way for accurate two-dimensional electron temperature, electron density and neutral atomic hydrogen density inferences for MAST-U divertor experimental data for the first time.
Submitted 19 March, 2024;
originally announced March 2024.
-
Risk-Sensitive Soft Actor-Critic for Robust Deep Reinforcement Learning under Distribution Shifts
Authors:
Tobias Enders,
James Harrison,
Maximilian Schiffer
Abstract:
We study the robustness of deep reinforcement learning algorithms against distribution shifts within contextual multi-stage stochastic combinatorial optimization problems from the operations research domain. In this context, risk-sensitive algorithms promise to learn robust policies. While this field is of general interest to the reinforcement learning community, most studies to date focus on theoretical results rather than real-world performance. With this work, we aim to bridge this gap by formally deriving a novel risk-sensitive deep reinforcement learning algorithm while providing numerical evidence for its efficacy. Specifically, we introduce discrete Soft Actor-Critic for the entropic risk measure by deriving a version of the Bellman equation for the respective Q-values. We establish a corresponding policy improvement result and infer a practical algorithm. We introduce an environment that represents typical contextual multi-stage stochastic combinatorial optimization problems and perform numerical experiments to empirically validate our algorithm's robustness against realistic distribution shifts, without compromising performance on the training distribution. We show that our algorithm is superior to risk-neutral Soft Actor-Critic as well as to two benchmark approaches for robust deep reinforcement learning. Thereby, we provide the first structured analysis on the robustness of reinforcement learning under distribution shifts in the realm of contextual multi-stage stochastic combinatorial optimization problems.
Submitted 15 February, 2024;
originally announced February 2024.
-
Universal Neural Functionals
Authors:
Allan Zhou,
Chelsea Finn,
James Harrison
Abstract:
A challenging problem in many modern machine learning tasks is to process weight-space features, i.e., to transform or extract information from the weights and gradients of a neural network. Recent works have developed promising weight-space models that are equivariant to the permutation symmetries of simple feedforward networks. However, they are not applicable to general architectures, since the permutation symmetries of a weight space can be complicated by recurrence or residual connections. This work proposes an algorithm that automatically constructs permutation equivariant models, which we refer to as universal neural functionals (UNFs), for any weight space. Among other applications, we demonstrate how UNFs can be substituted into existing learned optimizer designs, and find promising improvements over prior methods when optimizing small image classifiers and language models. Our results suggest that learned optimizers can benefit from considering the (symmetry) structure of the weight space they optimize. We open-source our library for constructing UNFs at https://github.com/AllanYangZhou/universal_neural_functional.
Submitted 7 February, 2024;
originally announced February 2024.
-
Singular Control of (Reflected) Brownian Motion: A Computational Method Suitable for Queueing Applications
Authors:
Baris Ata,
J. Michael Harrison,
Nian Si
Abstract:
Motivated by applications in queueing theory, we consider a class of singular stochastic control problems whose state space is the d-dimensional positive orthant. The original problem is approximated by a drift control problem, to which we apply a recently developed computational method that is feasible for dimensions up to d=30 or more. To show that nearly optimal solutions are obtainable using this method, we present computational results for a variety of examples, including queueing network examples that have appeared previously in the literature.
Submitted 16 April, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Authors:
Avi Singh,
John D. Co-Reyes,
Rishabh Agarwal,
Ankesh Anand,
Piyush Patil,
Xavier Garcia,
Peter J. Liu,
James Harrison,
Jaehoon Lee,
Kelvin Xu,
Aaron Parisi,
Abhishek Kumar,
Alex Alemi,
Alex Rizkowsky,
Azade Nova,
Ben Adlam,
Bernd Bohnet,
Gamaleldin Elsayed,
Hanie Sedghi,
Igor Mordatch,
Isabelle Simpson,
Izzeddin Gur,
Jasper Snoek,
Jeffrey Pennington,
Jiri Hron
, et al. (16 additional authors not shown)
Abstract:
Fine-tuning language models (LMs) on human-generated data remains a prevalent practice. However, the performance of such models is often limited by the quantity and diversity of high-quality human data. In this paper, we explore whether we can go beyond human data on tasks where we have access to scalar feedback, for example, on math problems where one can verify correctness. To do so, we investigate a simple self-training method based on expectation-maximization, which we call ReST$^{EM}$, where we (1) generate samples from the model and filter them using binary feedback, (2) fine-tune the model on these samples, and (3) repeat this process a few times. Testing on advanced MATH reasoning and APPS coding benchmarks using PaLM-2 models, we find that ReST$^{EM}$ scales favorably with model size and significantly surpasses fine-tuning only on human data. Overall, our findings suggest self-training with feedback can substantially reduce dependence on human-generated data.
Submitted 17 April, 2024; v1 submitted 11 December, 2023;
originally announced December 2023.
-
Benchmarking Pathology Feature Extractors for Whole Slide Image Classification
Authors:
Georg Wölflein,
Dyke Ferber,
Asier R. Meneghetti,
Omar S. M. El Nahhas,
Daniel Truhn,
Zunamys I. Carrero,
David J. Harrison,
Ognjen Arandjelović,
Jakob Nikolas Kather
Abstract:
Weakly supervised whole slide image classification is a key task in computational pathology, which involves predicting a slide-level label from a set of image patches constituting the slide. Constructing models to solve this task involves multiple design choices, often made without robust empirical or conclusive theoretical justification. To address this, we conduct a comprehensive benchmarking of feature extractors to answer three critical questions: 1) Is stain normalisation still a necessary preprocessing step? 2) Which feature extractors are best for downstream slide-level classification? 3) How does magnification affect downstream performance? Our study constitutes the most comprehensive evaluation of publicly available pathology feature extractors to date, involving more than 10,000 training runs across 14 feature extractors, 9 tasks, 5 datasets, 3 downstream architectures, 2 levels of magnification, and various preprocessing setups. Our findings challenge existing assumptions: 1) We observe empirically, and by analysing the latent space, that skipping stain normalisation and image augmentations does not degrade performance, while significantly reducing memory and computational demands. 2) We develop a novel evaluation metric to compare relative downstream performance, and show that the choice of feature extractor is the most consequential factor for downstream performance. 3) We find that lower-magnification slides are sufficient for accurate slide-level classification. Contrary to previous patch-level benchmarking studies, our approach emphasises clinical relevance by focusing on slide-level biomarker prediction tasks in a weakly supervised setting with external validation cohorts. Our findings stand to streamline digital pathology workflows by minimising preprocessing needs and informing the selection of feature extractors.
Submitted 21 June, 2024; v1 submitted 20 November, 2023;
originally announced November 2023.
-
DenseNet and Support Vector Machine classifications of major depressive disorder using vertex-wise cortical features
Authors:
Vladimir Belov,
Tracy Erwin-Grabner,
Ling-Li Zeng,
Christopher R. K. Ching,
Andre Aleman,
Alyssa R. Amod,
Zeynep Basgoze,
Francesco Benedetti,
Bianca Besteher,
Katharina Brosch,
Robin Bülow,
Romain Colle,
Colm G. Connolly,
Emmanuelle Corruble,
Baptiste Couvy-Duchesne,
Kathryn Cullen,
Udo Dannlowski,
Christopher G. Davey,
Annemiek Dols,
Jan Ernsting,
Jennifer W. Evans,
Lukas Fisch,
Paola Fuentes-Claramonte,
Ali Saffet Gonul,
Ian H. Gotlib
, et al. (63 additional authors not shown)
Abstract:
Major depressive disorder (MDD) is a complex psychiatric disorder that affects the lives of hundreds of millions of individuals around the globe. Even today, researchers debate if morphological alterations in the brain are linked to MDD, likely due to the heterogeneity of this disorder. The application of deep learning tools to neuroimaging data, capable of capturing complex non-linear patterns, has the potential to provide diagnostic and predictive biomarkers for MDD. However, previous attempts to demarcate MDD patients and healthy controls (HC) based on segmented cortical features via linear machine learning approaches have reported low accuracies. In this study, we used globally representative data from the ENIGMA-MDD working group containing an extensive sample of people with MDD (N=2,772) and HC (N=4,240), which allows a comprehensive analysis with generalizable results. Based on the hypothesis that integration of vertex-wise cortical features can improve classification performance, we evaluated the classification of a DenseNet and a Support Vector Machine (SVM), with the expectation that the former would outperform the latter. As we analyzed a multi-site sample, we additionally applied the ComBat harmonization tool to remove potential nuisance effects of site. We found that both classifiers exhibited close to chance performance (balanced accuracy DenseNet: 51%; SVM: 53%), when estimated on unseen sites. Slightly higher classification performance (balanced accuracy DenseNet: 58%; SVM: 55%) was found when the cross-validation folds contained subjects from all sites, indicating site effect. In conclusion, the integration of vertex-wise morphometric features and the use of the non-linear classifier did not lead to the differentiability between MDD and HC. Our results support the notion that MDD classification on this combination of features and classifiers is unfeasible.
Submitted 18 November, 2023;
originally announced November 2023.
-
Long-legged Divertors and Neutral Baffling as a Solution to the Tokamak Power Exhaust Challenge
Authors:
K. Verhaegh,
J. R. Harrison,
D. Moulton,
B. Lipschultz,
N. Lonigro,
N. Osborne,
P. Ryan,
C. Theiler,
T. Wijkamp,
D. Brida,
C. Cowley,
G. Derks,
R. Doyle,
F. Federici,
B. Kool,
O. Février,
A. Hakola,
S. Henderson,
H. Reimerdes,
A. J. Thornton,
N. Vianello,
M. Wischmeier,
L. Xiang
Abstract:
Exhausting the power from the hot fusion core to the plasma facing components is one of fusion's biggest challenges. The MAST Upgrade tokamak uniquely integrates strong containment of neutrals within the exhaust area (divertor), away from the hot fusion core, with extreme divertor shaping. This enables improving power exhaust through long-legged divertors with a high magnetic field gradient (total flux expansion). This study shows compelling MAST-U results for the improved power exhaust of long-legged, totally flux expanded, divertors, without any adverse impact to the hot fusion core, representing a significant step forward in addressing the fusion power exhaust challenge. Our comparative analysis of various divertor shapes demonstrates that even modest adjustments can significantly enhance exhaust performance while preserving core plasma performance. Through novel analysis, we attribute the reductions in particle and power loads to the expanded plasma-neutral interaction volume within long-legged divertors, in agreement with reduced models and simulation results. Strong segregation of neutrals enables the benefits of long-legged, totally flux expanded, divertors to be retrieved. Our study underscores the critical role of strategic divertor shaping in enhancing exhaust performance, stability and core-edge integration; signifying an essential advancement towards sustainable fusion energy.
Submitted 14 June, 2024; v1 submitted 14 November, 2023;
originally announced November 2023.
-
Investigations of atomic \& molecular processes of NBI-heated discharges in the MAST Upgrade Super-X divertor with implications for reactors
Authors:
K. Verhaegh,
J. R. Harrison,
B. Lipschultz,
N. Lonigro,
S. Kobussen,
D. Moulton,
N. Osborne,
P. Ryan,
C. Theiler,
T. Wijkamp,
D. Brida,
G. Derks,
R. Doyle,
F. Federici,
A. Hakola,
S. Henderson,
B. Kool,
S. Newton,
R. Osawa,
X. Pope,
H. Reimerdes,
N. Vianello,
M. Wischmeier
Abstract:
This experimental study presents an in-depth investigation of the performance of the MAST-U Super-X divertor during NBI-heated operation (up to 2.5 MW) focussing on volumetric ion sources and sinks as well as power losses during detachment.
The particle balance and power loss analysis revealed the crucial role of Molecular Activated Recombination and Dissociation (MAR and MAD) ion sinks in divertor particle and power balance, which remain pronounced in the change from ohmic to higher power (NBI heated) L-mode conditions. The importance of MAR and MAD remains with double the absorbed NBI heating. MAD results in significant power dissipation (up to $\sim 20 \%$ of $P_{SOL}$), mostly in the cold ($T_e < 5$ eV) detached region. Theoretical and experimental evidence is found for the potential contribution of $D^-$ to MAR and MAD, which warrants further study.
These results suggest that MAR and MAD can be relevant in higher power conditions than the ohmic conditions studied previously. Post-processing reactor-scale simulations show that MAR and MAD can play a significant role in divertor physics and synthetic diagnostic signals of reactor-scale devices, which are currently underestimated in exhaust simulations. This raises implications for the accuracy of reactor-scale divertor simulations of particularly tightly baffled (alternative) divertor configurations.
Submitted 1 April, 2024; v1 submitted 14 November, 2023;
originally announced November 2023.
-
Ookami: An A64FX Computing Resource
Authors:
A. C. Calder,
E. Siegmann,
C. Feldman,
S. Chheda,
D. C. Smolarski,
F. D. Swesty,
A. Curtis,
J. Dey,
D. Carlson,
B. Michalowicz,
R. J. Harrison
Abstract:
We present a look at Ookami, a project providing community access to a testbed supercomputer with the ARM-based A64FX processors developed by a collaboration between RIKEN and Fujitsu and deployed in the Japanese supercomputer Fugaku. We describe the project, provide details about the user base and education/training program, and present highlights from performance studies of two astrophysical simulation codes.
Submitted 7 November, 2023;
originally announced November 2023.
-
Holographic imaging of antiferromagnetic domains with in-situ magnetic field
Authors:
Jack Harrison,
Hariom Jani,
Junxiong Hu,
Manohar Lal,
Jheng-Cyuan Lin,
Horia Popescu,
Jason Brown,
Nicolas Jaouen,
A. Ariando,
Paolo G. Radaelli
Abstract:
Lensless coherent x-ray imaging techniques have great potential for high-resolution imaging of magnetic systems with a variety of in-situ perturbations. Despite many investigations of ferromagnets, extending these techniques to the study of other magnetic materials, primarily antiferromagnets, is lacking. Here, we demonstrate the first (to our knowledge) study of an antiferromagnet using holographic imaging through the "holography with extended reference by autocorrelation linear differential operation" technique. Energy-dependent contrast with both linearly and circularly polarised x-rays is demonstrated. Antiferromagnetic domains and topological textures are studied in the presence of applied magnetic fields, demonstrating quasi-cyclic domain reconfiguration up to 500 mT.
Submitted 13 October, 2023;
originally announced October 2023.
-
A multi-institutional pediatric dataset of clinical radiology MRIs by the Children's Brain Tumor Network
Authors:
Ariana M. Familiar,
Anahita Fathi Kazerooni,
Hannah Anderson,
Aliaksandr Lubneuski,
Karthik Viswanathan,
Rocky Breslow,
Nastaran Khalili,
Sina Bagheri,
Debanjan Haldar,
Meen Chul Kim,
Sherjeel Arif,
Rachel Madhogarhia,
Thinh Q. Nguyen,
Elizabeth A. Frenkel,
Zeinab Helili,
Jessica Harrison,
Keyvan Farahani,
Marius George Linguraru,
Ulas Bagci,
Yury Velichko,
Jeffrey Stevens,
Sarah Leary,
Robert M. Lober,
Stephani Campion,
Amy A. Smith
, et al. (15 additional authors not shown)
Abstract:
Pediatric brain and spinal cancers remain the leading cause of cancer-related death in children. Advancements in clinical decision-support in pediatric neuro-oncology utilizing the wealth of radiology imaging data collected through standard care, however, have significantly lagged other domains. Such data is ripe for use with predictive analytics such as artificial intelligence (AI) methods, which require large datasets. To address this unmet need, we provide a multi-institutional, large-scale pediatric dataset of 23,101 multi-parametric MRI exams acquired through routine care for 1,526 brain tumor patients, as part of the Children's Brain Tumor Network. This includes longitudinal MRIs across various cancer diagnoses, with associated patient-level clinical information, digital pathology slides, as well as tissue genotype and omics data. To facilitate downstream analysis, treatment-naïve images for 370 subjects were processed and released through the NCI Childhood Cancer Data Initiative via the Cancer Data Service. Through ongoing efforts to continuously build these imaging repositories, our aim is to accelerate discovery and translational AI models with real-world data, to ultimately empower precision medicine for children.
Submitted 2 October, 2023;
originally announced October 2023.
-
Drift Control of High-Dimensional RBM: A Computational Method Based on Neural Networks
Authors:
Baris Ata,
J. Michael Harrison,
Nian Si
Abstract:
Motivated by applications in queueing theory, we consider a stochastic control problem whose state space is the $d$-dimensional positive orthant. The controlled process $Z$ evolves as a reflected Brownian motion whose covariance matrix is exogenously specified, as are its directions of reflection from the orthant's boundary surfaces. A system manager chooses a drift vector $θ(t)$ at each time $t$ based on the history of $Z$, and the cost rate at time $t$ depends on both $Z(t)$ and $θ(t)$. In our initial problem formulation, the objective is to minimize expected discounted cost over an infinite planning horizon, after which we treat the corresponding ergodic control problem. Extending earlier work by Han et al. (Proceedings of the National Academy of Sciences, 2018, 8505-8510), we develop and illustrate a simulation-based computational method that relies heavily on deep neural network technology. For test problems studied thus far, our method is accurate to within a fraction of one percent, and is computationally feasible in dimensions up to at least $d=30$.
Submitted 16 April, 2024; v1 submitted 20 September, 2023;
originally announced September 2023.
-
A Further Study of Linux Kernel Hugepages on A64FX with FLASH, an Astrophysical Simulation Code
Authors:
Catherine Feldman,
Smeet Chheda,
Alan C. Calder,
Eva Siegmann,
John Dey,
Tony Curtis,
Robert J. Harrison
Abstract:
We present an expanded study of the performance of FLASH when using Linux Kernel Hugepages on Ookami, an HPE Apollo 80 A64FX platform. FLASH is a multi-scale, multi-physics simulation code written principally in modern Fortran and makes use of the PARAMESH library to manage a block-structured adaptive mesh. Our initial study used only the Fujitsu compiler to utilize standard hugepages (hp), but further investigation allowed us to utilize hp for multiple compilers by linking to the Fujitsu library libmpg and transparent hugepages (thp) by enabling it at the node level. By comparing the results of hardware counters and in-code timers, we found that hp and thp do not significantly impact the runtime performance of FLASH. Interestingly, there is a significant reduction in the TLB misses, differences in cache and memory access counters, and strange behavior is observed when using thp.
Submitted 8 September, 2023;
originally announced September 2023.
-
A Practical Deep Learning-Based Acoustic Side Channel Attack on Keyboards
Authors:
Joshua Harrison,
Ehsan Toreini,
Maryam Mehrnezhad
Abstract:
With recent developments in deep learning, the ubiquity of microphones and the rise in online services via personal devices, acoustic side channel attacks present a greater threat to keyboards than ever. This paper presents a practical implementation of a state-of-the-art deep learning model in order to classify laptop keystrokes, using a smartphone integrated microphone. When trained on keystrokes recorded by a nearby phone, the classifier achieved an accuracy of 95%, the highest accuracy seen without the use of a language model. When trained on keystrokes recorded using the video-conferencing software Zoom, an accuracy of 93% was achieved, a new best for the medium. Our results prove the practicality of these side channel attacks via off-the-shelf equipment and algorithms. We discuss a series of mitigation methods to protect users against this series of attacks.
Submitted 2 August, 2023;
originally announced August 2023.
-
Impacts of seasonality and parasitism on honey bee population dynamics
Authors:
Jun Chen,
Jordy O Rodriguez Rincon,
Gloria DeGrandi-Hoffman,
Jennifer Fewell,
Jon Harrison,
Yun Kang
Abstract:
The honeybee plays an extremely important role in ecosystem stability and diversity and in the production of bee pollinated crops. Honey bees and other pollinators are under threat from the combined effects of nutritional stress, parasitism, pesticides, and climate change that impact the timing, duration, and variability of seasonal events. To understand how parasitism and seasonality influence honey bee colonies separately and interactively, we developed a non-autonomous nonlinear honeybee-parasite interaction differential equation model that incorporates seasonality into the egg-laying rate of the queen. Our theoretical results show that parasitism negatively impacts the honey bee population either by decreasing colony size or destabilizing population dynamics through supercritical or subcritical Hopf-bifurcations depending on conditions. Our bifurcation analysis and simulations suggest that seasonality alone may have positive or negative impacts on the survival of honey bee colonies. More specifically, our study indicates that (1) the timing of the maximum egg-laying rate seems to determine when seasonality has positive or negative impacts; and (2) when the period of seasonality is large it can lead to the colony collapsing. Our study further suggests that the synergistic influences of parasitism and seasonality can lead to complicated dynamics that may positively and negatively impact the honey bee colony's survival. Our work partially uncovers the intrinsic effects of climate change and parasites, which potentially provide essential insights into how best to maintain or improve a honey bee colony's health.
Submitted 23 June, 2023;
originally announced June 2023.
-
Deep Multiple Instance Learning with Distance-Aware Self-Attention
Authors:
Georg Wölflein,
Lucie Charlotte Magister,
Pietro Liò,
David J. Harrison,
Ognjen Arandjelović
Abstract:
Traditional supervised learning tasks require a label for every instance in the training set, but in many real-world applications, labels are only available for collections (bags) of instances. This problem setting, known as multiple instance learning (MIL), is particularly relevant in the medical domain, where high-resolution images are split into smaller patches, but labels apply to the image as a whole. Recent MIL models are able to capture correspondences between patches by employing self-attention, allowing them to weigh each patch differently based on all other patches in the bag. However, these approaches still do not consider the relative spatial relationships between patches within the larger image, which is especially important in computational pathology. To this end, we introduce a novel MIL model with distance-aware self-attention (DAS-MIL), which explicitly takes into account relative spatial information when modelling the interactions between patches. Unlike existing relative position representations for self-attention which are discrete, our approach introduces continuous distance-dependent terms into the computation of the attention weights, and is the first to apply relative position representations in the context of MIL. We evaluate our model on a custom MNIST-based MIL dataset that requires the consideration of relative spatial information, as well as on CAMELYON16, a publicly available cancer metastasis detection dataset, where we achieve a test AUROC score of 0.91. On both datasets, our model outperforms existing MIL approaches that employ absolute positional encodings, as well as existing relative position representation schemes applied to MIL. Our code is available at https://anonymous.4open.science/r/das-mil.
Submitted 20 May, 2023; v1 submitted 17 May, 2023;
originally announced May 2023.
-
Graph Reinforcement Learning for Network Control via Bi-Level Optimization
Authors:
Daniele Gammelli,
James Harrison,
Kaidi Yang,
Marco Pavone,
Filipe Rodrigues,
Francisco C. Pereira
Abstract:
Optimization problems over dynamic networks have been extensively studied and widely used in the past decades to formulate numerous real-world problems. However, (1) traditional optimization-based approaches do not scale to large networks, and (2) the design of good heuristics or approximation algorithms often requires significant manual trial-and-error. In this work, we argue that data-driven strategies can automate this process and learn efficient algorithms without compromising optimality. To do so, we present network control problems through the lens of reinforcement learning and propose a graph network-based framework to handle a broad class of problems. Instead of naively computing actions over high-dimensional graph elements, e.g., edges, we propose a bi-level formulation where we (1) specify a desired next state via RL, and (2) solve a convex program to best achieve it, leading to drastically improved scalability and performance. We further highlight a collection of desirable features to system designers, investigate design decisions, and present experiments on real-world control problems showing the utility, scalability, and flexibility of our framework.
Submitted 15 May, 2023;
originally announced May 2023.
-
Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies
Authors:
Oscar Li,
James Harrison,
Jascha Sohl-Dickstein,
Virginia Smith,
Luke Metz
Abstract:
Unrolled computation graphs are prevalent throughout machine learning but present challenges to automatic differentiation (AD) gradient estimation methods when their loss functions exhibit extreme local sensitivity, discontinuity, or blackbox characteristics. In such scenarios, online evolution strategies methods are a more capable alternative, while being more parallelizable than vanilla evolution strategies (ES) by interleaving partial unrolls and gradient updates. In this work, we propose a general class of unbiased online evolution strategies methods. We analytically and empirically characterize the variance of this class of gradient estimators and identify the one with the least variance, which we term Noise-Reuse Evolution Strategies (NRES). Experimentally, we show NRES results in faster convergence than existing AD and ES methods in terms of wall-clock time and number of unroll steps across a variety of applications, including learning dynamical systems, meta-training learned optimizers, and reinforcement learning.
Submitted 9 December, 2023; v1 submitted 21 April, 2023;
originally announced April 2023.
-
The role of plasma-atom and molecule interactions on power & particle balance during detachment on the MAST Upgrade Super-X divertor
Authors:
Kevin Verhaegh,
Bruce Lipschultz,
James Harrison,
Fabio Federici,
David Moulton,
Nicola Lonigro,
Stijn Kobussen,
Martin O'Mullane,
Nick Osborne,
Peter Ryan,
Tijs Wijkamp,
Bob Kool,
Effy Rose,
Christian Theiler,
Andrew Thornton
Abstract:
This paper presents the first quantitative analysis of the detachment processes in the MAST Upgrade Super-X divertor (SXD). We identify an unprecedented impact of plasma-molecular interactions involving molecular ions (likely $D_2^+$), resulting in strong ion sinks (Molecular Activated Recombination - MAR), leading to a reduction of ion target flux. The MAR ion sinks exceed the divertor ion sources before electron-ion recombination (EIR) starts to occur, suggesting that significant ionisation occurs outside of the divertor chamber. In the EIR region, $T_e \ll 0.2$ eV is observed and MAR remains significant in these deep detached phases. The total ion sink strength demonstrates the capability for particle (ion) exhaust in the Super-X Configuration.
Molecular Activated Dissociation (MAD) is the dominant volumetric neutral atom creation process and can lead to an electron cooling of 20% of $P_{SOL}$. The measured total radiative power losses in the divertor chamber are consistent with inferred hydrogenic radiative power losses. This suggests that intrinsic divertor impurity radiation, despite the carbon walls, is minor in the divertor chamber. This contrasts with previous TCV results, which may be associated with enhanced plasma-neutral interactions and reduced chemical erosion in the detached, tightly baffled SXD.
The above observations have also been made in higher heat flux (narrower SOL width) type I ELMy H-mode discharges. This provides evidence that the characterisation in this paper may be general.
Submitted 2 October, 2023; v1 submitted 18 April, 2023;
originally announced April 2023.
-
$B \rightarrow D^*$ vector, axial-vector and tensor form factors for the full $q^2$ range from lattice QCD
Authors:
Judd Harrison,
Christine T. H. Davies
Abstract:
We compute the complete set of SM and tensor $B_{(s)}\to D_{(s)}^*\ell\barν$ semileptonic form factors across the full kinematic range of the decay using second generation MILC $n_f=2+1+1$ HISQ gluon field configurations and HISQ valence quarks, with the heavy-HISQ method. Lattice spacings range from $0.09\mathrm{fm}$ to $0.044\mathrm{fm}$ with pion masses from $\approx 300\mathrm{MeV}$ down to the physical value and heavy quark masses ranging between $\approx 1.5 m_c$ and $4.1 m_c \approx 0.9 m_b$; currents are normalised nonperturbatively. Using the recent $B_{(s)}\to D^*_{(s)}\ell\barν_\ell$ data from Belle and LHCb together with our form factors we determine a model independent value of $V_{cb}=39.03(56)_\mathrm{exp}(67)_\mathrm{latt}\times 10^{-3}$, in agreement with previous exclusive determinations and in tension with the inclusive result at the level of $3.6σ$. We observe a $\approx 1σ$ tension between the shape of the differential decay rates computed using our form factors and those measured by Belle. We compute a lattice-only SM value for the ratio of semitauonic and semimuonic decay rates, $R(D^*)=0.273(15)$, which we find to be closer to the recent Belle measurement and HFLAV average than theory predictions using fits to experimental differential rate data for $B\to D^*\ell\barν_\ell$. Determining $V_{cb}$ using the total rate for $B\to D^*\ellν$ gives a value in agreement with inclusive results. We compute the longitudinal polarisation fraction for the semitauonic mode, $F_L^{D^*}=0.395(24)$, which is in tension at the level of $2.2σ$ with the recent Belle measurement. Our calculation combines $B\to D^*$ and $B_s\to D_s^*$ lattice results, and we provide an update which supersedes our previous lattice computation of the $B_s\to D_s^*$ form factors. We also give the chiral perturbation theory needed to analyse the tensor form factors.
Submitted 26 January, 2024; v1 submitted 6 April, 2023;
originally announced April 2023.
-
Spatially reconfigurable topological textures in freestanding antiferromagnetic nanomembranes
Authors:
Hariom Jani,
Jack Harrison,
Sonu Hooda,
Saurav Prakash,
Proloy Nandi,
Junxiong Hu,
Zhiyang Zeng,
Jheng-Cyuan Lin,
Ganesh ji Omar,
Jörg Raabe,
Simone Finizio,
Aaron Voon-Yew Thean,
A Ariando,
Paolo G Radaelli
Abstract:
Antiferromagnets hosting real-space topological spin textures are promising platforms to model fundamental ultrafast phenomena and explore spintronics. However, to date, they have only been fabricated epitaxially on specific symmetry-matched crystalline substrates, to preserve their intrinsic magneto-crystalline order. This curtails their integration with dissimilar supports, markedly restricting the scope of fundamental and applied investigations. Here, we circumvent this limitation by designing detachable crystalline antiferromagnetic nanomembranes of $α$-Fe$_{2}$O$_{3}$, that can be transferred onto other desirable supports after growth. We develop transmission-based antiferromagnetic vector-mapping to show that these nanomembranes harbour rich topological phenomenology at room temperature. Moreover, we exploit their extreme flexibility to demonstrate three-dimensional reconfiguration of antiferromagnetic properties, driven locally via flexure-induced strains. This allows us to spatially design antiferromagnetic states outside their typical thermal stability window. Integration of such freestanding antiferromagnetic layers with flat or curved nanostructures could enable spin texture designs tailored by magnetoelastic-/geometric-effects in the quasi-static and dynamical regimes, opening new explorations into curvilinear antiferromagnetism and unconventional computing.
Submitted 6 March, 2023;
originally announced March 2023.
-
Quantizing graphs, one way or two?
Authors:
Jon Harrison
Abstract:
Quantum graphs were introduced to model free electrons in organic molecules using a self-adjoint Hamiltonian on a network of intervals. A second graph quantization describes wave propagation on a graph by specifying scattering matrices at the vertices. A question that is frequently raised is the extent to which these models are the same or complementary. In particular, are all energy independent unitary vertex scattering matrices associated with a self-adjoint Hamiltonian? Here we review results related to this issue. In addition, we observe that a self-adjoint Dirac operator with four component spinors produces a secular equation for the graph spectrum that matches the secular equation associated with wave propagation on the graph when the Dirac operator describes particles with zero mass and the vertex conditions do not allow spin rotation at the vertices.
Submitted 16 February, 2024; v1 submitted 14 February, 2023;
originally announced February 2023.
-
Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems
Authors:
Tobias Enders,
James Harrison,
Marco Pavone,
Maximilian Schiffer
Abstract:
We consider the sequential decision-making problem of making proactive request assignment and rejection decisions for a profit-maximizing operator of an autonomous mobility on demand system. We formalize this problem as a Markov decision process and propose a novel combination of multi-agent Soft Actor-Critic and weighted bipartite matching to obtain an anticipative control policy. Thereby, we factorize the operator's otherwise intractable action space, but still obtain a globally coordinated decision. Experiments based on real-world taxi data show that our method outperforms state of the art benchmarks with respect to performance, stability, and computational tractability.
Submitted 10 May, 2023; v1 submitted 14 December, 2022;
originally announced December 2022.
-
General-Purpose In-Context Learning by Meta-Learning Transformers
Authors:
Louis Kirsch,
James Harrison,
Jascha Sohl-Dickstein,
Luke Metz
Abstract:
Modern machine learning requires system designers to specify aspects of the learning pipeline, such as losses, architectures, and optimizers. Meta-learning, or learning-to-learn, instead aims to learn those aspects, and promises to unlock greater capabilities with less manual effort. One particularly ambitious goal of meta-learning is to train general-purpose in-context learning algorithms from scratch, using only black-box models with minimal inductive bias. Such a model takes in training data, and produces test-set predictions across a wide range of problems, without any explicit definition of an inference model, training loss, or optimization algorithm. In this paper we show that Transformers and other black-box models can be meta-trained to act as general-purpose in-context learners. We characterize transitions between algorithms that generalize, algorithms that memorize, and algorithms that fail to meta-train at all, induced by changes in model size, number of tasks, and meta-optimization. We further show that the capabilities of meta-trained algorithms are bottlenecked by the accessible state size (memory) determining the next prediction, unlike standard models which are thought to be bottlenecked by parameter count. Finally, we propose practical interventions such as biasing the training distribution that improve the meta-training and meta-generalization of general-purpose in-context learning algorithms.
Submitted 9 January, 2024; v1 submitted 8 December, 2022;
originally announced December 2022.
-
Adaptive Robust Model Predictive Control via Uncertainty Cancellation
Authors:
Rohan Sinha,
James Harrison,
Spencer M. Richards,
Marco Pavone
Abstract:
We propose a learning-based robust predictive control algorithm that compensates for significant uncertainty in the dynamics for a class of discrete-time systems that are nominally linear with an additive nonlinear component. Such systems commonly model the nonlinear effects of an unknown environment on a nominal system. We optimize over a class of nonlinear feedback policies inspired by certainty equivalent "estimate-and-cancel" control laws pioneered in classical adaptive control to achieve significant performance improvements in the presence of uncertainties of large magnitude, a setting in which existing learning-based predictive control algorithms often struggle to guarantee safety. In contrast to previous work in robust adaptive MPC, our approach allows us to take advantage of structure (i.e., the numerical predictions) in the a priori unknown dynamics learned online through function approximation. Our approach also extends typical nonlinear adaptive control methods to systems with state and input constraints even when we cannot directly cancel the additive uncertain function from the dynamics. We apply contemporary statistical estimation techniques to certify the system's safety through persistent constraint satisfaction with high probability. Moreover, we propose using Bayesian meta-learning algorithms that learn calibrated model priors to help satisfy the assumptions of the control design in challenging settings. Finally, we show in simulation that our method can accommodate more significant unknown dynamics terms than existing methods and that the use of Bayesian meta-learning allows us to adapt to the test environments more rapidly.
Submitted 2 December, 2022;
originally announced December 2022.
-
VeLO: Training Versatile Learned Optimizers by Scaling Up
Authors:
Luke Metz,
James Harrison,
C. Daniel Freeman,
Amil Merchant,
Lucas Beyer,
James Bradbury,
Naman Agrawal,
Ben Poole,
Igor Mordatch,
Adam Roberts,
Jascha Sohl-Dickstein
Abstract:
While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers. In this work, we leverage the same scaling approach behind the success of deep learning to learn versatile optimizers. We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates. Meta-trained with approximately four thousand TPU-months of compute on a wide variety of optimization tasks, our optimizer not only exhibits compelling performance, but optimizes in interesting and unexpected ways. It requires no hyperparameter tuning, instead automatically adapting to the specifics of the problem being optimized. We open source our learned optimizer, meta-training code, the associated train and test data, and an extensive optimizer benchmark suite with baselines at velo-code.github.io.
Submitted 17 November, 2022;
originally announced November 2022.
-
In-silico analysis of the influence of pulmonary vein configuration on left atrial haemodynamics and thrombus formation in a large cohort
Authors:
Jordi Mill,
Josquin Harrison,
Benoit Legghe,
Andy L. Olivares,
Xabier Morales,
Jerome Noailly,
Xavier Iriart,
Hubert Cochet,
Maxime Sermesant,
Oscar Camara
Abstract:
Atrial fibrillation (AF) is considered the most common human arrhythmia. Around 99% of thrombi in non-valvular AF are formed in the left atrial appendage (LAA). Studies suggest that abnormal LAA haemodynamics and the subsequently stagnated flow are the factors triggering clot formation. However, the relation between LAA morphology, the blood pattern and the triggering is not fully understood. Moreover, the impact of structures such as the pulmonary veins (PVs) on LA haemodynamics has not been thoroughly studied due to the difficulties of acquiring appropriate data. On the other hand, in-silico studies and flow simulations allow a thorough analysis of haemodynamics, analysing the 4D nature of blood flow patterns under different boundary conditions. However, the small number of cases reported in the literature for these studies has been a limitation. The main goal of this work was to study the influence of PVs on left atrium (LA) and LAA haemodynamics. Computational fluid dynamics simulations were run on 52 patients, the largest cohort so far in the literature, where different parameters were individually studied: pulmonary vein orientation and configuration; LAA and LA volumes and their ratio; and flow velocities. Our computational analysis showed that the right pulmonary vein height and angulation have a great influence on LA haemodynamics. Additionally, we found that an LAA with great bending, with its tip pointing towards the mitral valve, could favour flow stagnation.
Submitted 19 October, 2022;
originally announced October 2022.
-
HoechstGAN: Virtual Lymphocyte Staining Using Generative Adversarial Networks
Authors:
Georg Wölflein,
In Hwa Um,
David J Harrison,
Ognjen Arandjelović
Abstract:
The presence and density of specific types of immune cells are important to understand a patient's immune response to cancer. However, immunofluorescence staining required to identify T cell subtypes is expensive, time-consuming, and rarely performed in clinical settings. We present a framework to virtually stain Hoechst images (which are cheap and widespread) with both CD3 and CD8 to identify T cell subtypes in clear cell renal cell carcinoma using generative adversarial networks. Our proposed method jointly learns both staining tasks, incentivising the network to incorporate mutually beneficial information from each task. We devise a novel metric to quantify the virtual staining quality, and use it to evaluate our method.
Submitted 17 October, 2022; v1 submitted 13 October, 2022;
originally announced October 2022.
-
Expanding the Deployment Envelope of Behavior Prediction via Adaptive Meta-Learning
Authors:
Boris Ivanovic,
James Harrison,
Marco Pavone
Abstract:
Learning-based behavior prediction methods are increasingly being deployed in real-world autonomous systems, e.g., in fleets of self-driving vehicles, which are beginning to commercially operate in major cities across the world. Despite their advancements, however, the vast majority of prediction systems are specialized to a set of well-explored geographic regions or operational design domains, complicating deployment to additional cities, countries, or continents. Towards this end, we present a novel method for efficiently adapting behavior prediction models to new environments. Our approach leverages recent advances in meta-learning, specifically Bayesian regression, to augment existing behavior prediction models with an adaptive layer that enables efficient domain transfer via offline fine-tuning, online adaptation, or both. Experiments across multiple real-world datasets demonstrate that our method can efficiently adapt to a variety of unseen environments.
Submitted 23 May, 2023; v1 submitted 23 September, 2022;
originally announced September 2022.
-
A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases
Authors:
James Harrison,
Luke Metz,
Jascha Sohl-Dickstein
Abstract:
Learned optimizers -- neural networks that are trained to act as optimizers -- have the potential to dramatically accelerate training of machine learning models. However, even when meta-trained across thousands of tasks at huge computational expense, blackbox learned optimizers often struggle with stability and generalization when applied to tasks unlike those in their meta-training set. In this paper, we use tools from dynamical systems to investigate the inductive biases and stability properties of optimization algorithms, and apply the resulting insights to designing inductive biases for blackbox optimizers. Our investigation begins with a noisy quadratic model, where we characterize conditions in which optimization is stable, in terms of eigenvalues of the training dynamics. We then introduce simple modifications to a learned optimizer's architecture and meta-training procedure which lead to improved stability, and improve the optimizer's inductive bias. We apply the resulting learned optimizer to a variety of neural network training tasks, where it outperforms the current state of the art learned optimizer -- at matched optimizer computational overhead -- with regard to optimization performance and meta-training speed, and is capable of generalization to tasks far different from those it was meta-trained on.
Submitted 22 September, 2022;
originally announced September 2022.
-
Advertising Media and Target Audience Optimization via High-dimensional Bandits
Authors:
Wenjia Ba,
J. Michael Harrison,
Harikesh S. Nair
Abstract:
We present a data-driven algorithm that advertisers can use to automate their digital ad-campaigns at online publishers. The algorithm enables the advertiser to search across available target audiences and ad-media to find the best possible combination for its campaign via online experimentation. The problem of finding the best audience-ad combination is complicated by a number of distinctive challenges, including (a) a need for active exploration to resolve prior uncertainty and to speed the search for profitable combinations, (b) many combinations to choose from, giving rise to high-dimensional search formulations, and (c) very low success probabilities, typically just a fraction of one percent. Our algorithm (designated LRDL, an acronym for Logistic Regression with Debiased Lasso) addresses these challenges by combining four elements: a multiarmed bandit framework for active exploration; a Lasso penalty function to handle high dimensionality; an inbuilt debiasing kernel that handles the regularization bias induced by the Lasso; and a semi-parametric regression model for outcomes that promotes cross-learning across arms. The algorithm is implemented as a Thompson Sampler, and to the best of our knowledge, it is the first that can practically address all of the challenges above. Simulations with real and synthetic data show the method is effective and document its superior performance against several benchmarks from the recent high-dimensional bandit literature.
Submitted 17 September, 2022;
originally announced September 2022.
-
Can One Hear the Spanning Trees of a Quantum Graph?
Authors:
Jonathan Harrison,
Tracy Weyand
Abstract:
Kirchhoff showed that the number of spanning trees of a graph is the spectral determinant of the combinatorial Laplacian divided by the number of vertices; we reframe this result in the quantum graph setting. We prove that the spectral determinant of the Laplace operator on a finite connected metric graph with standard (Neumann-Kirchhoff) vertex conditions determines the number of spanning trees when the lengths of the edges of the metric graph are sufficiently close together. To obtain this result, we analyze an equilateral quantum graph whose spectrum is closely related to spectra of discrete graph operators and then use the continuity of the spectral determinant under perturbations of the edge lengths.
Submitted 3 March, 2023; v1 submitted 2 September, 2022;
originally announced September 2022.
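The discrete statement quoted in the abstract above (spanning trees = spectral determinant of the combinatorial Laplacian divided by the number of vertices) can be checked on small graphs. The following is only an illustrative sketch and is not taken from the paper: the function names are hypothetical, and it assumes a connected simple graph, for which the product of the nonzero Laplacian eigenvalues equals the sum of the principal (n-1)x(n-1) minors. It does not touch the quantum (metric) graph case.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant via Gaussian elimination over Fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def laplacian(n, edges):
    """Combinatorial Laplacian (degree minus adjacency) on vertices 0..n-1."""
    lap = [[0] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    return lap


def principal_minor(matrix, k):
    """Matrix with row k and column k removed."""
    return [
        [x for j, x in enumerate(row) if j != k]
        for i, row in enumerate(matrix)
        if i != k
    ]


def spectral_determinant(lap):
    """Product of the nonzero eigenvalues, assuming a connected graph.

    Computed as the sum of principal (n-1)x(n-1) minors, which avoids
    any floating-point eigenvalue computation.
    """
    return sum(determinant(principal_minor(lap, k)) for k in range(len(lap)))


def spanning_trees(n, edges):
    """Spanning tree count as spectral determinant / number of vertices."""
    return spectral_determinant(laplacian(n, edges)) / n


# Complete graph K4 and the 5-cycle as small test cases.
k4_edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]
c5_edges = [(i, (i + 1) % 5) for i in range(5)]
print(spanning_trees(4, k4_edges))
print(spanning_trees(5, c5_edges))
```

Exact rational arithmetic is used so that the integer counts come out without rounding; a floating-point eigenvalue solver would work equally well for a quick numerical check.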
-
On Using Linux Kernel Huge Pages with FLASH, an Astrophysical Simulation Code
Authors:
Alan C. Calder,
Catherine Feldman,
Eva Siegmann,
John Dey,
Anthony Curtis,
Smeet Chheda,
Robert J. Harrison
Abstract:
We present efforts at improving the performance of FLASH, a multi-scale, multi-physics simulation code principally for astrophysical applications, by using huge pages on Ookami, an HPE Apollo 80 A64FX platform. FLASH is written principally in modern Fortran and makes use of the PARAMESH library to manage a block-structured adaptive mesh. We explored options for enabling the use of huge pages with several compilers, but we were only able to successfully use huge pages when compiling with the Fujitsu compiler. The use of huge pages substantially reduced the number of translation lookaside buffer misses, but overall performance gains were marginal.
Submitted 27 July, 2022;
originally announced July 2022.
-
Multi-site benchmark classification of major depressive disorder using machine learning on cortical and subcortical measures
Authors:
Vladimir Belov,
Tracy Erwin-Grabner,
Ali Saffet Gonul,
Alyssa R. Amod,
Amar Ojha,
Andre Aleman,
Annemiek Dols,
Anouk Scharntee,
Aslihan Uyar-Demir,
Ben J Harrison,
Benson M. Irungu,
Bianca Besteher,
Bonnie Klimes-Dougan,
Brenda W. J. H. Penninx,
Bryon A. Mueller,
Carlos Zarate,
Christopher G. Davey,
Christopher R. K. Ching,
Colm G. Connolly,
Cynthia H. Y. Fu,
Dan J. Stein,
Danai Dima,
David E. J. Linden,
David M. A. Mehler,
Edith Pomarol-Clotet
, et al. (41 additional authors not shown)
Abstract:
Machine learning (ML) techniques have gained popularity in the neuroimaging field due to their potential for classifying neuropsychiatric disorders. However, the diagnostic predictive power of the existing algorithms has been limited by small sample sizes, lack of representativeness, data leakage, and/or overfitting. Here, we overcome these limitations with the largest multi-site sample size to date (n=5,356) to provide a generalizable ML classification benchmark of major depressive disorder (MDD). Using brain measures from standardized ENIGMA analysis pipelines in FreeSurfer, we were able to classify MDD vs healthy controls (HC) with around 62% balanced accuracy, but when harmonizing the data using ComBat balanced accuracy dropped to approximately 52%. Similar results were observed in stratified groups according to age of onset, antidepressant use, number of episodes and sex. Future studies incorporating higher dimensional brain imaging/phenotype features, and/or using more advanced machine and deep learning methods may achieve more encouraging prospects.
Submitted 25 October, 2022; v1 submitted 16 June, 2022;
originally announced June 2022.
-
Spectroscopic investigations of detachment on the MAST Upgrade Super-X divertor
Authors:
Kevin Verhaegh,
Bruce Lipschultz,
James Harrison,
Nick Osborne,
Aelwyn Williams,
Peter Ryan,
James Clark,
Fabio Federici,
Bob Kool,
Tijs Wijkamp,
Alexandre Fil,
David Moulton,
Omkar Myatra,
Andrew Thornton,
Thomas Bosman,
Geof Cunningham,
Basil Duval,
Stuart Henderson,
Rory Scannell,
the MAST Upgrade team
Abstract:
We present the first analysis of the atomic and molecular processes at play during detachment in the MAST-U Super-X divertor using divertor spectroscopy data. Our analysis indicates detachment in the MAST-U Super-X divertor can be separated into four sequential phases: First, the ionisation region detaches from the target at detachment onset leaving a region of increased molecular densities downstream. The plasma interacts with these molecules, resulting in molecular ions ($D_2^+$ and/or $D_2^- \rightarrow D + D^-$) that further react with the plasma leading to Molecular Activated Recombination and Dissociation (MAR and MAD), which results in excited atoms and significant Balmer line emission. Second, the MAR region detaches from the target leaving a sub-eV temperature region downstream. Third, an onset of strong emission from electron-ion recombination (EIR) ensues. Finally, the electron density decays near the target, resulting in a density front moving upstream.
The analysis in this paper indicates that plasma-molecule interactions have a larger impact than previously reported and play a critical role in the intensity and interpretation of hydrogen atomic line emission characteristics on MAST-U. Furthermore, we find that the Fulcher band emission profile in the divertor can be used as a proxy for the ionisation region and may also be employed as a plasma temperature diagnostic for improving the separation of hydrogenic emission arising from electron-impact excitation and that from plasma-molecular interactions.
We provide evidence for the presence of low electron temperatures ($<0.5$ eV) during detachment phases III-IV based on quantitative spectroscopy analysis and a Boltzmann relation of the high-n Balmer line transitions, together with an analysis of the brightness of high-n Balmer lines.
Submitted 18 October, 2022; v1 submitted 5 April, 2022;
originally announced April 2022.
-
Practical tradeoffs between memory, compute, and performance in learned optimizers
Authors:
Luke Metz,
C. Daniel Freeman,
James Harrison,
Niru Maheswaranathan,
Jascha Sohl-Dickstein
Abstract:
Optimization plays a costly and crucial role in developing machine learning systems. In learned optimizers, the few hyperparameters of commonly used hand-designed optimizers, e.g. Adam or SGD, are replaced with flexible parametric functions. The parameters of these functions are then optimized so that the resulting learned optimizer minimizes a target loss on a chosen class of models. Learned optimizers can both reduce the number of required training steps and improve the final test loss. However, they can be expensive to train, and once trained can be expensive to use due to computational and memory overhead for the optimizer itself. In this work, we identify and quantify the design features governing the memory, compute, and performance trade-offs for many learned and hand-designed optimizers. We further leverage our analysis to construct a learned optimizer that is both faster and more memory efficient than previous work. Our model and training code are open source.
Submitted 16 July, 2022; v1 submitted 22 March, 2022;
originally announced March 2022.
-
FPIC: A Novel Semantic Dataset for Optical PCB Assurance
Authors:
Nathan Jessurun,
Olivia P. Dizon-Paradis,
Jacob Harrison,
Shajib Ghosh,
Mark M. Tehranipoor,
Damon L. Woodard,
Navid Asadizanjani
Abstract:
Outsourced printed circuit board (PCB) fabrication necessitates increased hardware assurance capabilities. Several assurance techniques based on automated optical inspection (AOI) have been proposed that leverage PCB images acquired using digital cameras. We review state-of-the-art AOI techniques and observe a strong, rapid trend toward machine learning (ML) solutions. These require significant amounts of labeled ground truth data, which is lacking in the publicly available PCB data space. We contribute the FICS PCB Image Collection (FPIC) dataset to address this need. Additionally, we outline new hardware security methodologies enabled by our data set.
Submitted 14 March, 2023; v1 submitted 16 February, 2022;
originally announced February 2022.
-
Graph Meta-Reinforcement Learning for Transferable Autonomous Mobility-on-Demand
Authors:
Daniele Gammelli,
Kaidi Yang,
James Harrison,
Filipe Rodrigues,
Francisco C. Pereira,
Marco Pavone
Abstract:
Autonomous Mobility-on-Demand (AMoD) systems represent an attractive alternative to existing transportation paradigms, currently challenged by urbanization and increasing travel needs. By centrally controlling a fleet of self-driving vehicles, these systems provide mobility service to customers and are currently starting to be deployed in a number of cities around the world. Current learning-based approaches for controlling AMoD systems are limited to the single-city scenario, whereby the service operator is allowed to take an unlimited number of operational decisions within the same transportation system. However, real-world system operators can hardly afford to fully re-train AMoD controllers for every city they operate in, as this could result in a high number of poor-quality decisions during training, making the single-city strategy a potentially impractical solution. To address these limitations, we propose to formalize the multi-city AMoD problem through the lens of meta-reinforcement learning (meta-RL) and devise an actor-critic algorithm based on recurrent graph neural networks. In our approach, AMoD controllers are explicitly trained such that a small amount of experience within a new city will produce good system performance. Empirically, we show how control policies learned through meta-RL are able to achieve near-optimal performance on unseen cities by learning rapidly adaptable policies, thus making them more robust not only to novel environments, but also to distribution shifts common in real-world operations, such as special events, unexpected congestion, and dynamic pricing schemes.
Submitted 14 February, 2022;
originally announced February 2022.
-
A route towards stable homochiral topological textures in A-type antiferromagnets
Authors:
Jack Harrison,
Hariom Jani,
Paolo G. Radaelli
Abstract:
Topologically protected whirling magnetic textures could emerge as data carriers in next-generation post-Moore computing. Such textures are abundantly observed in ferromagnets (FMs); however, their antiferromagnetic (AFM) counterparts are expected to be even more relevant for device applications, as they promise ultra-fast, deflection-free dynamics whilst being robust against external fields. Unfortunately, they have remained elusive, hence identifying materials hosting such textures is key to developing this technology. Here, we present comprehensive micromagnetic and analytical models investigating topological textures in the broad material class of A-type antiferromagnets, specifically focusing on the prototypical case of $α\text{-Fe}_2 \text{O}_3$, an emerging candidate for AFM spintronics. By exploiting a symmetry-breaking interfacial Dzyaloshinskii-Moriya interaction (iDMI), it is possible to stabilize a wide topological family, including AFM (anti)merons and bimerons and the hitherto undiscovered AFM skyrmions. Whilst iDMI enforces homochirality and improves the stability of these textures, the widely tunable anisotropy and exchange interactions enable unprecedented control of their core dimensions. We then present a unifying framework to model the scaling of texture sizes based on a simple dimensional analysis. As the parameters required to host and tune homochiral AFM textures may be obtained by rational materials design of $α\text{-Fe}_2 \text{O}_3$, it could emerge as a promising platform to initiate AFM topological spintronics.
Submitted 30 November, 2021;
originally announced November 2021.
-
Planets or asteroids? A geochemical method to constrain the masses of White Dwarf pollutants
Authors:
Andrew M. Buchan,
Amy Bonsor,
Oliver Shorttle,
Jon Wade,
John Harrison,
Lena Noack,
Detlev Koester
Abstract:
Polluted white dwarfs that have accreted planetary material provide a unique opportunity to probe the geology of exoplanetary systems. However, the nature of the bodies which pollute white dwarfs is not well understood: are they small asteroids, minor planets, or even terrestrial planets? We present a novel method to infer pollutant masses from detections of Ni, Cr and Si. During core--mantle differentiation, these elements exhibit variable preference for metal and silicate at different pressures (i.e., object masses), affecting their abundances in the core and mantle. We model core--mantle differentiation self-consistently using data from metal--silicate partitioning experiments. We place statistical constraints on the differentiation pressures, and hence masses, of bodies which pollute white dwarfs by incorporating this calculation into a Bayesian framework. We show that Ni observations are best suited to constraining pressure when pollution is mantle-like, while Cr and Si are better for core-like pollution. We find 3 systems (WD0449-259, WD1350-162 and WD2105-820) whose abundances are best explained by the accretion of fragments of small parent bodies ($<0.2M_\oplus$). For 2 systems (GD61 and WD0446-255), the best model suggests the accretion of fragments of Earth-sized bodies, although the observed abundances remain consistent ($<3σ$) with the accretion of undifferentiated material. This suggests that polluted white dwarfs potentially accrete planetary bodies of a range of masses. However, our results are subject to inevitable degeneracies and limitations given current data. To constrain pressure more confidently, we require serendipitous observation of (nearly) pure core and/or mantle material.
Submitted 16 November, 2021;
originally announced November 2021.
-
On the Problem of Reformulating Systems with Uncertain Dynamics as a Stochastic Differential Equation
Authors:
Thomas Lew,
Apoorva Sharma,
James Harrison,
Edward Schmerling,
Marco Pavone
Abstract:
We identify an issue in recent approaches to learning-based control that reformulate systems with uncertain dynamics using a stochastic differential equation. Specifically, we discuss the approximation that replaces a model with fixed but uncertain parameters (a source of epistemic uncertainty) with a model subject to external disturbances modeled as a Brownian motion (corresponding to aleatoric uncertainty).
Submitted 11 November, 2021;
originally announced November 2021.
-
Bayesian Embeddings for Few-Shot Open World Recognition
Authors:
John Willes,
James Harrison,
Ali Harakeh,
Chelsea Finn,
Marco Pavone,
Steven Waslander
Abstract:
As autonomous decision-making agents move from narrow operating environments to unstructured worlds, learning systems must move from a closed-world formulation to an open-world and few-shot setting in which agents continuously learn new classes from small amounts of information. This stands in stark contrast to modern machine learning systems that are typically designed with a known set of classes and a large number of examples for each class. In this work we extend embedding-based few-shot learning algorithms to the open-world recognition setting. We combine Bayesian non-parametric class priors with an embedding-based pre-training scheme to yield a highly flexible framework which we refer to as few-shot learning for open world recognition (FLOWR). We benchmark our framework on open-world extensions of the common MiniImageNet and TieredImageNet few-shot learning datasets. Our results show, compared to prior methods, strong classification accuracy performance and up to a 12% improvement in H-measure (a measure of novel class detection) from our non-parametric open-world few-shot learning scheme.
Submitted 5 October, 2022; v1 submitted 28 July, 2021;
originally announced July 2021.
-
Hoechst Is All You Need: Lymphocyte Classification with Deep Learning
Authors:
Jessica Cooper,
In Hwa Um,
Ognjen Arandjelović,
David J Harrison
Abstract:
Multiplex immunofluorescence and immunohistochemistry benefit patients by allowing cancer pathologists to identify several proteins expressed on the surface of cells, enabling cell classification, better understanding of the tumour micro-environment, more accurate diagnoses, prognoses, and tailored immunotherapy based on the immune status of individual patients. However, they are expensive and time-consuming processes which require complex staining and imaging techniques by expert technicians. Hoechst staining is much cheaper and easier to perform, but is not typically used in this case as it binds to DNA rather than to the proteins targeted by immunofluorescent techniques, and it was not previously thought possible to differentiate cells expressing these proteins based only on DNA morphology. In this work we show otherwise, training a deep convolutional neural network to identify cells expressing three proteins (T lymphocyte markers CD3 and CD8, and the B lymphocyte marker CD20) with greater than 90% precision and recall, from Hoechst 33342 stained tissue only. Our model learns previously unknown morphological features associated with expression of these proteins which can be used to accurately differentiate lymphocyte subtypes for use in key prognostic metrics such as assessment of immune cell infiltration, and thereby predict and improve patient outcomes without the need for costly multiplex immunofluorescence.
Submitted 16 July, 2021; v1 submitted 9 July, 2021;
originally announced July 2021.
-
On computing bound states of the Dirac and Schrödinger Equations
Authors:
Gregory Beylkin,
Joel Anderson,
Robert J. Harrison
Abstract:
We cast the quantum chemistry problem of computing bound states as that of solving a set of auxiliary eigenvalue problems for a family of parameterized compact integral operators. The compactness of operators assures that their spectrum is discrete and bounded with the only possible accumulation point at zero. We show that, by changing the parameter, we can always find the bound states, i.e., the eigenfunctions that satisfy the original equations and are normalizable. While for the non-relativistic equations these properties may not be surprising, it is remarkable that the same holds for the relativistic equations where the spectrum of the original relativistic operators does not have a lower bound. We demonstrate that starting from an arbitrary initialization of the iteration leads to the solution, as dictated by the properties of compact operators.
Submitted 5 July, 2021;
originally announced July 2021.