-
Vector-like Quark Stabilised Higgs Inflation: Implications for Particle Phenomenology, Primordial Gravitational Waves and the Hubble Tension
Authors:
John McDonald
Abstract:
The Standard Model Higgs potential is very likely to be metastable, in which case Higgs Inflation is likely to require an extension in order to stabilise the potential. Here we consider stabilisation by adding $n_{Q} \leq 3$ Vector-Like Quarks (VLQs) of mass $m_{Q}$: $T$ vector quarks transforming as $({\bf 3}, {\bf 1}, 2/3)$ and $B$ vector quarks transforming as $({\bf 3}, {\bf 1}, -1/3)$. Requiring stability of the finite-temperature effective potential, and assuming $m_{t}$ equals its mean value, we find that the upper bounds on $m_{Q}$ for $T$ quarks are 5.8 TeV (for $n_{Q} = 2$) and 55 TeV (for $n_{Q} = 3$). The corresponding absolute stability upper bounds are 4.4 TeV and 29 TeV. Small upper bounds are obtained for $B$ quarks. For renormalisation in the Einstein frame (Prescription I) the predictions are almost indistinguishable from the classical values: $n_s = 0.966$ and $r = 3.3 \times 10^{-3}$. Renormalisation in the Jordan frame (Prescription II) predicts larger values of $n_{s}$ and $r$, with $n_{s}$ generally in the range 0.980 to 0.990 and $r$ of the order of 0.01. The predicted range of $n_{s}$ is consistent with the CMB range obtained in Hubble tension solutions which modify the sound horizon at decoupling, whilst the predicted values of $r$ will be easily observable by forthcoming CMB experiments. The observational upper bound on $r$ generally imposes a stronger constraint on $m_{Q}$, with the $T$ quark upper bound equal to 2.4 TeV for $n_{Q} = 2$ and 13 TeV for $n_{Q} = 3$. We conclude that VLQ-stabilised Higgs Inflation with Prescription II renormalisation favours 1-10 TeV vector-like quarks that will be accessible to future colliders, and predicts a tensor-to-scalar ratio that will be observable in forthcoming CMB experiments and values of $n_{s}$ that favour an early-time solution to the Hubble tension.
Submitted 8 July, 2024; v1 submitted 2 July, 2024;
originally announced July 2024.
-
Resonant Conversion of Gravitational Waves in Neutron Star Magnetospheres
Authors:
Jamie I. McDonald,
Sebastian A. R. Ellis
Abstract:
High frequency gravitational waves are the subject of rapidly growing interest in the theoretical and experimental community. In this work we calculate the resonant conversion of gravitational waves into photons in the magnetospheres of neutron stars via the inverse Gertsenshtein mechanism. The resonance occurs in regions where the vacuum birefringence effects cancel the classical plasma contribution to the photon dispersion relation, leading to a massless photon in the medium which becomes kinematically matched to the graviton. We set limits on the amplitude of a possible stochastic background of gravitational waves using X-ray and IR flux measurements of neutron stars. Using Chandra ($2-8\,\text{keV}$) and NuSTAR ($3-79\,\text{keV}$) observations of RX J1856.6-3754, we set strain limits $h_c^{\rm lim} \simeq 10^{-26} - 10^{-24}$ in the frequency range $ 5\times 10^{17}\, {\rm Hz} \lesssim f \lesssim 2\times 10^{19}\,\text{Hz}$. Our limits are many orders of magnitude stronger than existing constraints from individual neutron stars at the same frequencies. We also use recent JWST observations of the magnetar 4U 0142+61 in the range $2.7\times 10^{13}\, {\rm Hz} \lesssim f \lesssim 5.9\times 10^{13}\, {\rm Hz} $, setting a limit $h_{\rm c}^{\rm lim} \simeq 5 \times 10^{-19}$. These constraints are in complementary frequency ranges to laboratory searches with CAST, OSQAR and ALPS II. We expect these limits to be improved both in reach and breadth with a more exhaustive use of telescope data across the full spectrum of frequencies and targets.
Submitted 26 June, 2024;
originally announced June 2024.
-
A reduction of the "cycles plus $K_4$'s" problem
Authors:
Aseem Dalal,
Jessica McDonald,
Songling Shan
Abstract:
Let $H$ be a 2-regular graph and let $G$ be obtained from $H$ by gluing in vertex-disjoint copies of $K_4$. The "cycles plus $K_4$'s" problem is to show that $G$ is 4-colourable; this is a special case of the \emph{Strong Colouring Conjecture}. In this paper we reduce the "cycles plus $K_4$'s" problem to a specific 3-colourability problem. In the 3-colourability problem, vertex-disjoint triangles are glued (in a limited way) onto a disjoint union of triangles and paths of length at most 12, and we ask for 3-colourability of the resulting graph.
Submitted 25 June, 2024;
originally announced June 2024.
-
Monadic ortholattices: completions and duality
Authors:
John Harding,
Joseph McDonald,
Miguel Peinado
Abstract:
We show that the variety of monadic ortholattices is closed under MacNeille and canonical completions. In each case, the completion of $L$ is obtained by forming an associated dual space $X$ that is a monadic orthoframe. This is a set with an orthogonality relation and an additional binary relation satisfying certain conditions. For the MacNeille completion, $X$ is formed from the non-zero elements of $L$, and for the canonical completion, $X$ is formed from the proper filters of $L$. The corresponding completion of $L$ is then obtained as the ortholattice of bi-orthogonally closed subsets of $X$ with an additional operation defined through the binary relation of $X$.
With the introduction of a suitable topology on an orthoframe, as was done by Goldblatt and Bimbó, we obtain a dual adjunction between the categories of monadic ortholattices and monadic orthospaces. A restriction of this dual adjunction provides a dual equivalence.
Submitted 10 June, 2024;
originally announced June 2024.
-
On orientations with forbidden out-degrees
Authors:
Owen Henderschedt,
Jessica McDonald
Abstract:
Let $G$ be a $d$-regular graph and let $F\subseteq\{0, 1, 2, \ldots, d\}$ be a list of forbidden out-degrees. Akbari, Dalirrooyfard, Ehsani, Ozeki, and Sherkati conjectured that if $|F|<\tfrac{1}{2}d$, then $G$ should admit an $F$-avoiding orientation, i.e., an orientation where no out-degrees are in the forbidden list $F$. The conjecture is known for $d\leq 4$ due to work of Ma and Lu, and here we extend this to $d\leq 6$. The conjecture has also been studied in a generalized version, where $d, F$ are changed from constant values to functions $d(v), F(v)$ that vary over all $v\in V(G)$. We provide support for this generalized version by verifying it for some new cases, including when $G$ is 2-degenerate and when every $F(v)$ has some specific structure.
Submitted 7 June, 2024;
originally announced June 2024.
-
Total coloring graphs with large maximum degree
Authors:
Aseem Dalal,
Jessica McDonald,
Songling Shan
Abstract:
We prove that for any graph $G$, the total chromatic number of $G$ is at most $\Delta(G)+2\left\lceil \frac{|V(G)|}{\Delta(G)+1} \right\rceil$. This saves one color in comparison with a result of Hind from 1992. In particular, our result says that if $\Delta(G)\ge \frac{1}{2}|V(G)|$, then $G$ has a total coloring using at most $\Delta(G)+4$ colors. When $G$ is regular and has a sufficient number of vertices, we can actually save an additional two colors. Specifically, we prove that for any $0<\varepsilon <1$, there exists $n_0\in \mathbb{N}$ such that: if $G$ is an $r$-regular graph on $n \ge n_0$ vertices with $r\ge \frac{1}{2}(1+\varepsilon) n$, then $\chi_T(G) \le \Delta(G)+2$. This confirms the Total Coloring Conjecture for such graphs $G$.
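As a reader's illustration (not part of the paper), the bound stated in the abstract can be evaluated directly. The sketch below is a minimal one that assumes only the formula as quoted; the function name `total_coloring_upper_bound` and its parameter names are hypothetical. It also shows why a maximum degree of at least half the vertex count gives at most $\Delta(G)+4$ colors: in that case the ceiling term is at most 2.

```python
def total_coloring_upper_bound(n_vertices: int, max_degree: int) -> int:
    """Evaluate Delta + 2 * ceil(n / (Delta + 1)), the bound quoted in the abstract.

    The function and argument names are illustrative assumptions, not from the paper.
    """
    if n_vertices < 1 or max_degree < 0:
        raise ValueError("need n_vertices >= 1 and max_degree >= 0")
    # Integer ceiling division avoids floating-point rounding.
    ceiling_term = -(-n_vertices // (max_degree + 1))
    return max_degree + 2 * ceiling_term


def dense_case_holds(n_vertices: int, max_degree: int) -> bool:
    """Check the 'Delta >= n/2 implies bound <= Delta + 4' consequence for one pair."""
    if 2 * max_degree < n_vertices:
        return True  # hypothesis not satisfied, nothing to check
    return total_coloring_upper_bound(n_vertices, max_degree) <= max_degree + 4
```

For example, with 10 vertices and maximum degree 5 the ceiling term is $\lceil 10/6 \rceil = 2$, so the function returns 9, which equals $\Delta(G)+4$.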
Submitted 12 May, 2024;
originally announced May 2024.
-
Scalable and Efficient Hierarchical Visual Topological Mapping
Authors:
Saravanabalagi Ramachandran,
Jonathan Horgan,
Ganesh Sistu,
John McDonald
Abstract:
Hierarchical topological representations can significantly reduce search times within mapping and localization algorithms. Although recent research has shown the potential for such approaches, limited consideration has been given to the suitability and comparative performance of different global feature representations within this context. In this work, we evaluate state-of-the-art hand-crafted and learned global descriptors using a hierarchical topological mapping technique on benchmark datasets and present results of a comprehensive evaluation of the impact of the global descriptor used. Although learned descriptors have been incorporated into place recognition methods to improve retrieval accuracy and enhance overall recall, the problem of scalability and efficiency when applied to longer trajectories has not been adequately addressed in a majority of research studies. Based on our empirical analysis of multiple runs, we identify that continuity and distinctiveness are crucial characteristics for an optimal global descriptor that enable efficient and scalable hierarchical mapping, and present a methodology for quantifying and contrasting these characteristics across different global descriptors. Our study demonstrates that the use of global descriptors based on an unsupervised learned Variational Autoencoder (VAE) excels in these characteristics and achieves significantly lower runtime. It runs on a consumer grade desktop, up to 2.3x faster than the second best global descriptor, NetVLAD, and up to 9.5x faster than the hand-crafted descriptor, PHOG, on the longest track evaluated (St Lucia, 17.6 km), without sacrificing overall recall performance.
Submitted 7 April, 2024;
originally announced April 2024.
-
Automated Calibration of Parallel and Distributed Computing Simulators: A Case Study
Authors:
Jesse McDonald,
Maximilian Horzela,
Frédéric Suter,
Henri Casanova
Abstract:
Many parallel and distributed computing research results are obtained in simulation, using simulators that mimic real-world executions on some target system. Each such simulator is configured by picking values for parameters that define the behavior of the underlying simulation models it implements. The main concern for a simulator is accuracy: simulated behaviors should be as close as possible to those observed in the real-world target system. This requires that values for each of the simulator's parameters be carefully picked, or "calibrated," based on ground-truth real-world executions. Examining the current state of the art shows that simulator calibration, at least in the field of parallel and distributed computing, is often undocumented (and thus perhaps often not performed) and, when documented, is described as a labor-intensive, manual process. In this work we evaluate the benefit of automating simulation calibration using simple algorithms. Specifically, we use a real-world case study from the field of High Energy Physics and compare automated calibration to calibration performed by a domain scientist. Our main finding is that automated calibration is on par with or significantly outperforms the calibration performed by the domain scientist. Furthermore, automated calibration makes it straightforward to operate desirable trade-offs between simulation accuracy and simulation speed.
Submitted 1 July, 2024; v1 submitted 20 March, 2024;
originally announced March 2024.
-
Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale
Authors:
Dan Zhao,
Siddharth Samsi,
Joseph McDonald,
Baolin Li,
David Bestor,
Michael Jones,
Devesh Tiwari,
Vijay Gadepally
Abstract:
As research and deployment of AI grow, the computational burden to support and sustain its progress inevitably does too. To train or fine-tune state-of-the-art models in NLP, computer vision, etc., some form of AI hardware acceleration is virtually a requirement. Recent large language models require considerable resources to train and deploy, resulting in significant energy usage, potential carbon emissions, and massive demand for GPUs and other hardware accelerators. However, this surge carries large implications for energy sustainability at the HPC/datacenter level. In this paper, we study the aggregate effect of power-capping GPUs on GPU temperature and power draw at a research supercomputing center. With the right amount of power-capping, we show significant decreases in both temperature and power draw, reducing power consumption and potentially improving hardware life-span with minimal impact on job performance. While power-capping reduces power draw by design, the aggregate system-wide effect on overall energy consumption is less clear; for instance, if users notice job performance degradation from GPU power-caps, they may request additional GPU-jobs to compensate, negating any energy savings or even worsening energy consumption. To our knowledge, our work is the first to conduct and make available a detailed analysis of the effects of GPU power-capping at the supercomputing scale. We hope our work will inspire HPCs/datacenters to further explore, evaluate, and communicate the impact of power-capping AI hardware accelerators for more sustainable AI.
Submitted 24 February, 2024;
originally announced February 2024.
-
A gallery of maximum-entropy distributions: 14 and 21 moments
Authors:
Stefano Boccelli,
Fabien Giroux,
James G. McDonald
Abstract:
This work explores the different shapes that can be realized by the one-particle velocity distribution functions (VDFs) associated with the fourth-order maximum-entropy moment method. These distributions take the form of an exponential of a polynomial of the particle velocity, with terms up to fourth order. The 14- and 21-moment approximations are investigated. Various non-equilibrium gas states are probed throughout moment space. The resulting maximum-entropy distributions deviate strongly from the equilibrium VDF, and show a number of lobes and branches. The Maxwellian and the anisotropic Gaussian distributions are recovered as special cases. The eigenvalues associated with the maximum-entropy system of transport equations are also illustrated for some selected gas states. Anisotropic and/or asymmetric non-equilibrium states are seen to be associated with a non-uniform spatial propagation of perturbations.
Submitted 28 February, 2024;
originally announced February 2024.
-
A Benchmark Dataset for Tornado Detection and Prediction using Full-Resolution Polarimetric Weather Radar Data
Authors:
Mark S. Veillette,
James M. Kurdzo,
Phillip M. Stepanian,
John Y. N. Cho,
Siddharth Samsi,
Joseph McDonald
Abstract:
Weather radar is the primary tool used by forecasters to detect and warn for tornadoes in near-real time. In order to assist forecasters in warning the public, several algorithms have been developed to automatically detect tornadic signatures in weather radar observations. Recently, Machine Learning (ML) algorithms, which learn directly from large amounts of labeled data, have been shown to be highly effective for this purpose. Since tornadoes are extremely rare events within the corpus of all available radar observations, the selection and design of training datasets for ML applications is critical for the performance, robustness, and ultimate acceptance of ML algorithms. This study introduces a new benchmark dataset, TorNet, to support development of ML algorithms in tornado detection and prediction. TorNet contains full-resolution, polarimetric, Level-II WSR-88D data sampled from 10 years of reported storm events. A number of ML baselines for tornado detection are developed and compared, including a novel deep learning (DL) architecture capable of processing raw radar imagery without the need for manual feature extraction required for existing ML algorithms. Despite not benefiting from manual feature engineering or other preprocessing, the DL model shows increased detection performance compared to non-DL and operational baselines. The TorNet dataset, as well as source code and model weights of the DL baseline trained in this work, are made freely available.
Submitted 26 January, 2024;
originally announced January 2024.
-
Numerical simulation of rarefied supersonic flows using a fourth-order maximum-entropy moment method with interpolative closure
Authors:
Stefano Boccelli,
Willem Kaufmann,
Thierry E. Magin,
James G. McDonald
Abstract:
Maximum-entropy moment methods allow for the modelling of gases from the continuum regime to strongly rarefied conditions. The development of approximated solutions to the entropy maximization problem has made these methods computationally affordable. In this work, we apply a fourth-order maximum-entropy moment method to the study of supersonic rarefied flows. For such conditions, we compare the maximum-entropy solutions to results obtained from the kinetic theory of gases at different Knudsen numbers. The analysis is performed for both a simplified model of a gas with a single translational degree of freedom (5-moment system) and for a typical gas with three degrees of freedom (14-moment system). The maximum-entropy method is applied to the study of the Sod shock-tube problem at various rarefaction levels, and to the simulation of two-dimensional low-collisional crossed supersonic jets. We show that, in rarefied supersonic conditions, it is important to employ accurate estimates of the wave speeds. Since analytical expressions are not presently available, we propose an approximation, valid for the 14-moment system. In these conditions, the solution of the maximum-entropy system is shown to realize large degrees of non-equilibrium and to approach the Junk subspace, yet provides a good overall accuracy and agreement with the kinetic theory. Numerical procedures for reaching second-order accurate discretizations are discussed, as well as the implementation of the 14-moment solver on Graphics Processing Units (GPUs).
Submitted 26 January, 2024;
originally announced January 2024.
-
WoodScape Motion Segmentation for Autonomous Driving -- CVPR 2023 OmniCV Workshop Challenge
Authors:
Saravanabalagi Ramachandran,
Nathaniel Cibik,
Ganesh Sistu,
John McDonald
Abstract:
Motion segmentation is a complex yet indispensable task in autonomous driving. The challenges introduced by the ego-motion of the cameras, radial distortion in fisheye lenses, and the need for temporal consistency make the task more complicated, rendering traditional and standard Convolutional Neural Network (CNN) approaches less effective. The consequent laborious data labeling, representation of diverse and uncommon scenarios, and extensive data capture requirements underscore the imperative of synthetic data for improving machine learning model performance. To this end, we employ the PD-WoodScape synthetic dataset developed by Parallel Domain, alongside the WoodScape fisheye dataset. Thus, we present the WoodScape fisheye motion segmentation challenge for autonomous driving, held as part of the CVPR 2023 Workshop on Omnidirectional Computer Vision (OmniCV). As one of the first competitions focused on fisheye motion segmentation, we aim to explore and evaluate the potential and impact of utilizing synthetic data in this domain. In this paper, we provide a detailed analysis of the competition, which attracted the participation of 112 global teams and a total of 234 submissions. This study delineates the complexities inherent in the task of motion segmentation, emphasizes the significance of fisheye datasets, articulates the necessity for synthetic datasets and the resultant domain gap they engender, and outlines the foundational blueprint for devising successful solutions. Subsequently, we delve into the details of the baseline experiments and winning methods, evaluating their qualitative and quantitative results and providing useful insights.
Submitted 16 January, 2024; v1 submitted 31 December, 2023;
originally announced January 2024.
-
Shock induced ignition and transition to detonation in the presence of mechanically induced non-linear acoustic forcing
Authors:
Wentian Wang,
James McDonald,
Matei Ioan Radulescu
Abstract:
We address the problem of shock induced ignition and transition to detonation in a reactive medium in the presence of mechanically induced fluctuations by a moving oscillating piston. For the inert problem prior to ignition, we provide a closed form model for the generation of the train of compression and expansions, their steepening into a train of N-shock waves and their reflection on the lead shock, as well as the distribution of the energy dissipation rate in the induction zone. The model is found to be in excellent agreement with numerics. Reactive calculations were performed for hydrogen and ethylene fuels using a novel high-fidelity numerical Lagrangian scheme. Different regimes of ignition and transition to detonation were observed, controlled by the time scale of the forcing and the two time scales of the chemistry: the induction and reaction times. Two novel hot spot cascade mechanisms were identified. The first relies on the coherence between the sequence of hot spot formation set by the piston forcing and forward wave interaction with the lead shock, generalizing the classic runaway in fast flames. The second hot spot cascade is triggered by the feedback between the pressure pulse generated by the first generation hot spot cascade and the shock. For slow forcing, the sensitization is through a modification to the classic run-away process, while the high frequency regime leads to very localized sub-critical hot-spot formation controlled by the cumulative energy dissipation of the first generation shocks at a distance comparable to the shock formation location.
Submitted 8 December, 2023;
originally announced December 2023.
-
Predicting Ion Sequestration in Charged Polymers with the Steepest-Entropy-Ascent Quantum Thermodynamic Framework
Authors:
Jared McDonald,
Michael R. von Spakovsky,
William T. Reynolds Jr
Abstract:
The steepest-entropy-ascent quantum thermodynamic framework is used to investigate the effectiveness of multi-chain polyethyleneimine-methylenephosphonic acid in sequestering rare-earth ions (Eu$^{+3}$) from aqueous solutions. The framework applies a thermodynamic equation of motion to a discrete energy eigenstructure to model the binding kinetics of europium ions to reactive sites of the polymer chains. The energy eigenstructure is generated using a non-Markovian Monte Carlo model that estimates energy level degeneracies. The equation of motion is used to determine the occupation probability of each energy level, describing the unique path through thermodynamic state space by which the polymer system sequesters rare-earth ions from solution. A second Monte Carlo simulation is conducted to relate the kinetic path in state space to physical descriptors associated with the polymer, including the radius of gyration, tortuosity, and Eu-neighbor distribution functions. These descriptors are used to visualize the evolution of the polymer during the sequestration process. The fraction of sequestered Eu$^{+3}$ ions depends upon the total energy of the system, with lower energy resulting in higher sequestration. The kinetics of the overall sequestration are dependent on the steepest-entropy-ascent principle used by the equation of motion to generate a unique kinetic path from an initial non-equilibrium state.
Submitted 1 November, 2023;
originally announced November 2023.
-
Adiabatic Axion-Photon Mixing Near Neutron Stars
Authors:
Jonas Tjemsland,
Jamie McDonald,
Samuel J. Witte
Abstract:
One of the promising new proposals to search for axions in astrophysical environments is to look for narrow radio lines produced from the resonant conversion of axion dark matter falling through the magnetospheres of neutron stars. For sufficiently strong magnetic fields, axion masses in the $\mathcal{O}(10\,\mu{\rm eV})$ range, and axion-photon couplings $g_{a\gamma} \gtrsim 10^{-12} \, {\rm GeV^{-1}}$, the conversion can become hyper-efficient, allowing axion-photon and photon-axion transitions to occur with $\mathcal{O}(1)$ probabilities. Despite the strong mixing between these particles, the observable radio flux emanating from the magnetosphere is expected to be heavily suppressed -- this is a consequence of the fact that photons sourced by infalling axions have a high probability of converting back into axions before escaping the magnetosphere. In this work, we study the evolution of the axion and photon phase space near the surface of highly magnetized neutron stars in the adiabatic regime, quantifying for the first time the properties of the radio flux that arise at high axion-photon couplings. We show that previous attempts to mimic the scaling in this regime have been overly conservative in their treatment, and that the suppression can be largely circumvented for radio observations targeting neutron star populations.
Submitted 4 January, 2024; v1 submitted 27 October, 2023;
originally announced October 2023.
-
From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference
Authors:
Siddharth Samsi,
Dan Zhao,
Joseph McDonald,
Baolin Li,
Adam Michaleas,
Michael Jones,
William Bergeron,
Jeremy Kepner,
Devesh Tiwari,
Vijay Gadepally
Abstract:
Large language models (LLMs) have exploded in popularity due to their new generative capabilities that go far beyond prior state-of-the-art. These technologies are increasingly being leveraged in various domains such as law, finance, and medicine. However, these models carry significant computational challenges, especially the compute and energy costs required for inference. Inference energy costs already receive less attention than the energy costs of training LLMs -- despite how often these large models are called on to conduct inference in reality (e.g., ChatGPT). As these state-of-the-art LLMs see increasing usage and deployment in various domains, a better understanding of their resource utilization is crucial for cost-savings, scaling performance, efficient hardware usage, and optimal inference strategies.
In this paper, we describe experiments conducted to study the computational and energy utilization of inference with LLMs. We benchmark and conduct a preliminary analysis of the inference performance and inference energy costs of different sizes of LLaMA -- a recent state-of-the-art LLM -- developed by Meta AI on two generations of popular GPUs (NVIDIA V100 \& A100) and two datasets (Alpaca and GSM8K) to reflect the diverse set of tasks/benchmarks for LLMs in research and practice. We present the results of multi-node, multi-GPU inference using model sharding across up to 32 GPUs. To our knowledge, our work is one of the first to study LLM inference performance from the perspective of computational and energy resources at this scale.
Submitted 4 October, 2023;
originally announced October 2023.
-
Generalized Ray Tracing for Axions in Astrophysical Plasmas
Authors:
J. I. McDonald,
S. J. Witte
Abstract:
Ray tracing plays a vital role in black hole imaging, modeling the emission mechanisms of pulsars, and deriving signatures from physics beyond the Standard Model. In this work we focus on one specific application of ray tracing, namely, predicting radio signals generated from the resonant conversion of axion dark matter in the strongly magnetized plasma surrounding neutron stars. The production and propagation of low-energy photons in these environments are sensitive to both the anisotropic response of the background plasma and curved spacetime; here, we employ a fully covariant framework capable of treating both effects. We implement this both via forward and backward ray tracing. In forward ray tracing, photons are sampled at the point of emission and propagated to infinity, whilst in the backward-tracing approach, photons are traced backwards from an image plane to the point of production. We explore various approximations adopted in prior work, quantifying the importance of gravity, plasma anisotropy, the neutron star mass and radius, and imposing the proper kinematic matching of the resonance. Finally, using a more realistic model for the charge distribution of magnetar magnetospheres, we revisit the sensitivity of current and future radio and sub-mm telescopes to spectral lines emanating from the Galactic Center Magnetar, showing such observations may extend sensitivity to axion masses $m_a \sim \mathcal{O}({\rm few}) \times 10^{-3}$ eV, potentially even probing parameter space of the QCD axion.
Submitted 6 November, 2023; v1 submitted 15 September, 2023;
originally announced September 2023.
-
Balanced-chromatic number and Hadwiger-like conjectures
Authors:
Andrea Jiménez,
Jessica Mcdonald,
Reza Naserasr,
Kathryn Nurse,
Daniel A. Quiroz
Abstract:
Motivated by different characterizations of planar graphs and the 4-Color Theorem, several structural results concerning graphs of high chromatic number have been obtained. Toward strengthening some of these results, we consider the \emph{balanced chromatic number}, $χ_b(\hat{G})$, of a signed graph $\hat{G}$. This is the minimum number of parts into which the vertices of a signed graph can be partitioned so that none of the parts induces a negative cycle. This extends the notion of the chromatic number of a graph since $χ(G)=χ_b(\tilde{G})$, where $\tilde{G}$ denotes the signed graph obtained from~$G$ by replacing each edge with a pair of (parallel) positive and negative edges. We introduce a signed version of Hadwiger's conjecture as follows.
Conjecture: If a signed graph $\hat{G}$ has no negative loop and no $\tilde{K_t}$-minor, then its balanced chromatic number is at most $t-1$.
We prove that this conjecture is, in fact, equivalent to Hadwiger's conjecture and show its relation to the Odd Hadwiger Conjecture.
Motivated by these results, we also consider the relation between subdivisions and balanced chromatic number. We prove that if $(G, σ)$ has no negative loop and no $\tilde{K_t}$-subdivision, then it admits a balanced $\frac{79}{2}t^2$-coloring. This qualitatively generalizes a result of Kawarabayashi (2013) on totally odd subdivisions.
Submitted 2 August, 2023;
originally announced August 2023.
-
Modelling high-Mach-number rarefied crossflows past a flat plate using the maximum-entropy moment method
Authors:
Stefano Boccelli,
Pietro Parodi,
Thierry E. Magin,
James G. McDonald
Abstract:
The 10 and 14-moment maximum-entropy methods are applied to the study of high-Mach-number non-reacting crossflows past a flat plate at large degrees of rarefaction. The moment solutions are compared to particle-based kinetic solutions, showing a varying degree of accuracy. At a Knudsen number of 0.1, the 10-moment method is able to reproduce the shock layer, while it fails to predict the low-density wake region, due to the lack of a heat flux. Conversely, the 14-moment method results in accurate predictions of both regions. At a Knudsen number of 1, the 10-moment method produces unphysical results in both the shock layer and in the wake. The 14-moment method also shows a reduced accuracy, but manages to predict a reasonable shock region, free of unphysical sub-shocks, and in qualitative agreement with the kinetic solution. Accuracy is partially lost in the wake, where the 14-moment method predicts a thin unphysical high-density layer, concentrated on the centreline. An analysis of the velocity distribution functions (VDF) indicates strongly non-Maxwellian shapes, and the presence of distinct particle populations, in the wake, crossing each other at the centreline. The particle-based and the 14-moment method VDFs are in qualitative agreement.
Submitted 1 August, 2023;
originally announced August 2023.
-
Axion-Photon Conversion in 3D Media and Astrophysical Plasmas
Authors:
J. I. McDonald,
B. Garbrecht,
P. Millington
Abstract:
With axions now a primary candidate for dark matter, understanding their indirect astrophysical signatures is of paramount importance. Key to this is the production of photons from axions in magnetised astrophysical plasmas. While simple formulae for axion-photon mixing in 1D were sketched several decades ago, there has recently been renewed interest in robust calculations for this process in arbitrary 3D plasmas. These calculations are vital for understanding, amongst other things, the radio production from axion dark matter conversion in neutron stars, which may lead to indirect axion dark matter detection with current telescopes or future searches, e.g., by the SKA. In this paper, we derive the relevant transport equations in magnetised plasmas. These equations describe both the production and propagation of photons in an arbitrary 3D medium due to the resonant conversion of axions into photons. They also fully incorporate the refraction of photons, and we find no evidence for a conjectured phenomenon of dephasing. Our result is free of divergences that plagued previous calculations, and our kinetic theory description provides a direct link between ray tracing and the production mechanism. These results mark an important step toward solving one of the major open questions concerning indirect searches of axions in recent years, namely how to compute the photon production rate from axions in arbitrary 3D plasmas.
Submitted 20 December, 2023; v1 submitted 21 July, 2023;
originally announced July 2023.
-
Group connectivity of 3-edge-connected signed graphs
Authors:
Alejandra Brewer Castano,
Jessica McDonald,
Kathryn Nurse
Abstract:
Jaeger, Linial, Payan, and Tarsi introduced the notion of $A$-connectivity for graphs in 1992, and proved a decomposition for cubic graphs from which $A$-connectivity follows for all 3-edge-connected graphs when $|A|\geq 6$. The concept of $A$-connectivity was generalized to signed graphs by Li, Luo, Ma, and Zhang in 2018 and they proved that all 4-edge-connected flow-admissible signed graphs are $A$-connected when $|A|\geq 4$ and $|A|\neq 5$. We prove that all 3-edge-connected flow-admissible signed graphs are $A$-connected when $|A|\geq 6$ and $|A|\neq 7$. Our proof is based on a decomposition that is a signed-graph analogue of the decomposition found by Jaeger et al., and which may be of independent interest.
Submitted 7 June, 2023;
originally announced June 2023.
-
Direct and ordinal products realized by triangular norm operators with no zero divisors
Authors:
Joseph McDonald
Abstract:
In this note we continue the work of Chon, as well as Mezzomo, Bedregal, and Santiago, by studying algebraic operations on fuzzy posets and bounded fuzzy lattices. We first prove that fuzzy posets are closed under finite direct products whenever the triangular norm realizing the product construction has no zero divisors. This result is then extended to the case of bounded fuzzy lattices. Some immediate consequences are then obtained within the setting of direct products realized by triangular norms with no nilpotent elements as well as strictly monotone and cancellative triangular norms. We then introduce a triangular norm based construction of ordinal products and similarly show that fuzzy posets are closed under ordinal products whenever the triangular norm realizing the product construction has no zero divisors.
Submitted 15 June, 2024; v1 submitted 1 June, 2023;
originally announced June 2023.
-
Higgs Inflation via the Metastable Standard Model Potential, Generalised Renormalisation Frame Prescriptions and Predictions for Primordial Gravitational Waves
Authors:
J. McDonald
Abstract:
Higgs Inflation via the unmodified metastable Standard Model Higgs Potential is possible if the effective Planck mass in the Jordan frame increases after inflation ends. Here we consider the predictions of this model independently of the dynamics responsible for the Planck mass transition. The classical predictions are the same as for conventional Higgs Inflation. The quantum corrections are dependent upon the conformal frame in which the effective potential is calculated. We generalise beyond the usual Prescription I and II renormalisation frame choices to include intermediate frames characterised by a parameter $α$. We find that the model predicts a well-defined correlation between the values of the scalar spectral index $n_{s}$ and tensor-to-scalar ratio $r$. For values of $n_{s}$ varying between the 2-$σ$ Planck observational limits, we find that $r$ varies between 0.002 and 0.005 as $n_{s}$ increases, compared to the classical prediction of 0.003. Therefore significantly larger or smaller values of $r$ are possible, which are correlated with larger or smaller values of $n_{s}$. This can be tested via the detection of primordial gravitational waves by the next generation of CMB polarisation experiments.
Submitted 18 September, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Predicting Polymer Brush Behavior in Solvents using the Steepest-Entropy-Ascent Quantum Thermodynamic Framework
Authors:
Jared McDonald,
Michael R. von Spakovsky,
William T. Reynolds Jr
Abstract:
The steepest-entropy-ascent quantum thermodynamic (SEAQT) framework is utilized to study the effects of temperature on polymer brushes. The brushes are represented by a discrete energy spectrum and energy degeneracies obtained through the Replica-Exchange Wang-Landau algorithm. The SEAQT equation of motion is applied to the density of states to establish a unique kinetic path from an initial thermodynamic state to a stable equilibrium state. The kinetic path describes the brush's evolution in state space as it interacts with a thermal reservoir. The predicted occupation probabilities along the kinetic path are used to determine expected thermodynamic and structural properties. The polymer density profile of a polystyrene brush in cyclohexane solvent is predicted using the equation of motion, and it agrees qualitatively with experimental density profiles. The Flory-Huggins parameter chosen to describe brush-solvent interactions affects the solvent distribution in the brush but has minimal impact on the polymer density profile. Three types of non-equilibrium kinetic paths with differing amounts of entropy production are considered: a heating path, a cooling path, and a heating-cooling path. Properties such as tortuosity, radius of gyration, brush density, solvent density, and brush chain conformations are calculated for each path.
Submitted 1 November, 2023; v1 submitted 8 April, 2023;
originally announced April 2023.
-
Effect of Moisture Absorption on Curing of Wind Blades during Repair
Authors:
Sagar P. Shah,
Michael N. Olaya,
Evgenia Plaka,
Joseph McDonald,
Christopher J. Hansen,
Marianna Maiarù
Abstract:
Efficient structural repair of wind turbine blades is essential to limiting global warming and reducing the Levelized Cost of Energy (LCOE). Repairs carried out up-tower are sensitive to environmental conditions whose effect on the material properties during processing needs to be accounted for to accurately predict the repair outcome. This study investigates the effect of moisture content from environmental exposure on the cure kinetics of an infusion resin system used in wind turbine blade manufacturing and repair and provides an experimentally validated finite element tool for the analysis of cure cycle repairs as a function of repair geometry and moisture content. Moisture absorption tests on the two-part infusion system reported up to 12% moisture uptake by the curing agent under high temperature and relative humidity conditions. Differential scanning calorimetric measurements of resin in the presence of moisture revealed an accelerated cure behavior. Numerical predictions of a repair model agreed very well with the corresponding lab-scale repair and revealed a substantial temperature lag within the repair patch which resulted in thermal gradients and spatial distribution of the degree of cure. It was shown that the repair geometry and the accelerated-cure kinetics greatly influenced the temperature and cure distribution within the repair. The proposed approach can be used to reduce turbine downtime by minimizing the curing time.
Submitted 3 April, 2023;
originally announced April 2023.
-
Searching for Time-Dependent Axion Dark Matter Signals in Pulsars
Authors:
R. A. Battye,
M. J. Keith,
J. I. McDonald,
S. Srinivasan,
B. W. Stappers,
P. Weltevrede
Abstract:
Axion dark matter can be converted into photons in the magnetospheres of neutron stars leading to a spectral line centred on the Compton wavelength of the axion. Due to the rotation of the star and the plasma effects in the magnetosphere the signal is predicted to be periodic with significant time variation - a unique smoking gun for axion dark matter. As a proof of principle and to develop the methodology, we carry out the first time domain search of the signal using data from PSR J2144$-$3933 taken as part of the MeerTIME project on MeerKAT telescope. We search for specific signal templates using a matched filter technique and discuss when a time-domain analysis (as is typically the case in pulsar observations) gives greater sensitivity to the axion-coupling to photons in comparison to a simple time-averaged total flux study. We do not find any candidate signals and, hence, impose an upper limit on the axion-to-photon coupling of $g_{aγγ}<4\times 10^{-11}\,{\rm GeV}^{-1}$ over the mass range $m_{\rm a}=3.9-4.7\,μ{\rm eV}$ using this data. This limit relies on PSR J2144$-$3933 not being an extremely aligned rotator, as strongly supported by simple arguments based on the observed pulse profile width. We discuss the possibilities of improving this limit using future observations with MeerKAT and also SKA1-mid and the possibility of using other objects. Finally, to evade modelling uncertainties in axion radio signals, we also carry out a generic ``any periodic-signal search" in the data, finding no evidence for an axion signal.
Submitted 28 November, 2023; v1 submitted 21 March, 2023;
originally announced March 2023.
-
Another proof of Seymour's 6-flow theorem
Authors:
Matt DeVos,
Jessica McDonald,
Kathryn Nurse
Abstract:
In 1981 Seymour proved his famous 6-flow theorem asserting that every 2-edge-connected graph has a nowhere-zero flow in the group ${\mathbb Z}_2 \times {\mathbb Z}_3$ (in fact, he offers two proofs of this result). In this note we give a new short proof of a generalization of this theorem where ${\mathbb Z}_2 \times {\mathbb Z}_3$-valued functions are found subject to certain boundary constraints.
Submitted 16 February, 2023;
originally announced February 2023.
-
A Green(er) World for A.I
Authors:
Dan Zhao,
Nathan C. Frey,
Joseph McDonald,
Matthew Hubbell,
David Bestor,
Michael Jones,
Andrew Prout,
Vijay Gadepally,
Siddharth Samsi
Abstract:
As research and practice in artificial intelligence (A.I.) grow in leaps and bounds, the resources necessary to sustain and support their operations also grow at an increasing pace. While innovations and applications from A.I. have brought significant advances, from applications to vision and natural language to improvements to fields like medical imaging and materials engineering, their costs should not be neglected. As we embrace a world with ever-increasing amounts of data as well as research and development of A.I. applications, we are sure to face an ever-mounting energy footprint to sustain these computational budgets, data storage needs, and more. But, is this sustainable and, more importantly, what kind of setting is best positioned to nurture such sustainable A.I. in both research and practice? In this paper, we outline our outlook for Green A.I. -- a more sustainable, energy-efficient and energy-aware ecosystem for developing A.I. across the research, computing, and practitioner communities alike -- and the steps required to arrive there. We present a bird's eye view of various areas for potential changes and improvements from the ground floor of AI's operational and hardware optimizations for datacenters/HPCs to the current incentive structures in the world of A.I. research and practice, and more. We hope these points will spur further discussion, and action, on some of these issues and their potential solutions.
Submitted 27 January, 2023;
originally announced January 2023.
-
A Review of the Trends and Challenges in Adopting Natural Language Processing Methods for Education Feedback Analysis
Authors:
Thanveer Shaik,
Xiaohui Tao,
Yan Li,
Christopher Dann,
Jacquie Mcdonald,
Petrea Redmond,
Linda Galligan
Abstract:
Artificial Intelligence (AI) is a fast-growing area of study that is stretching its presence to many business and research domains. Machine learning, deep learning, and natural language processing (NLP) are subsets of AI that tackle different areas of data processing and modelling. This review article presents an overview of the impact of AI on education, outlining current opportunities. In the education domain, student feedback data is crucial to uncover the merits and demerits of existing services provided to students. AI can assist in identifying the areas of improvement in educational infrastructure, learning management systems, teaching practices and study environment. NLP techniques play a vital role in analyzing student feedback in textual format. This research focuses on existing NLP methodologies and applications that could be adapted to educational domain applications like sentiment annotations, entity annotations, text summarization, and topic modelling. Trends and challenges in adopting NLP in education were reviewed and explored. Context-based challenges in NLP like sarcasm, domain-specific language, ambiguity, and aspect-based sentiment analysis are explained with existing methodologies to overcome them. Research community approaches to extract the semantic meaning of emoticons and special characters in feedback which conveys user opinion and challenges in adopting NLP in education are explored.
Submitted 20 January, 2023;
originally announced January 2023.
-
Leptogenesis via Inflaton Mass Terms in Non-Minimally Coupled Inflation
Authors:
Kit Lloyd-Stubbs,
John McDonald
Abstract:
We consider a model of baryogenesis based on adding lepton number-violating quadratic mass terms to the inflaton potential of a non-minimally coupled inflation model. The $L$-violating mass terms generate a lepton asymmetry in a complex inflaton field via the mass term Affleck-Dine mechanism, which is transferred to the Standard Model (SM) sector when the inflaton decays to right-handed (RH) neutrinos. The model is minimal in that it requires only the SM sector, RH neutrinos, and a non-minimally coupled inflaton sector. We find that baryon isocurvature fluctuations can be observable in metric inflation but are negligible in Palatini inflation. The model is compatible with reheating temperatures that may be detectable in the observable primordial gravitational waves predicted by metric inflation.
Submitted 11 July, 2023; v1 submitted 19 December, 2022;
originally announced December 2022.
-
A Deep Learning-based Velocity Dealiasing Algorithm Derived from the WSR-88D Open Radar Product Generator
Authors:
Mark S. Veillette,
James M. Kurdzo,
Phillip M. Stepanian,
Joseph McDonald,
Siddharth Samsi,
John Y. N. Cho
Abstract:
Radial velocity estimates provided by Doppler weather radar are critical measurements used by operational forecasters for the detection and monitoring of life-impacting storms. The sampling methods used to produce these measurements are inherently susceptible to aliasing, which produces ambiguous velocity values in regions with high winds, and needs to be corrected using a velocity dealiasing algorithm (VDA). In the US, the Weather Surveillance Radar-1988 Doppler (WSR-88D) Open Radar Product Generator (ORPG) is a processing environment that provides a world-class VDA; however, this algorithm is complex and can be difficult to port to other radar systems outside of the WSR-88D network. In this work, a Deep Neural Network (DNN) is used to emulate the 2-dimensional WSR-88D ORPG dealiasing algorithm. It is shown that a DNN, specifically a customized U-Net, is highly effective for building VDAs that are accurate, fast, and portable to multiple radar types. To train the DNN model, a large dataset is generated containing aligned samples of folded and dealiased velocity pairs. This dataset contains samples collected from WSR-88D Level-II and Level-III archives, and uses the ORPG dealiasing algorithm output as a source of truth. Using this dataset, a U-Net is trained to produce the number of folds at each point of a velocity image. Several performance metrics are presented using WSR-88D data. The algorithm is also applied to other non-WSR-88D radar systems to demonstrate portability to other hardware/software interfaces. A discussion of the broad applicability of this method is presented, including how other Level-III algorithms may benefit from this approach.
Submitted 30 March, 2023; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Empirical Macroeconomics and DSGE Modeling in Statistical Perspective
Authors:
Daniel J. McDonald,
Cosma Rohilla Shalizi
Abstract:
Dynamic stochastic general equilibrium (DSGE) models have been a ubiquitous, and controversial, part of macroeconomics for decades. In this paper, we approach DSGEs purely as statistical models. We do this by applying two common model validation checks to the canonical Smets and Wouters 2007 DSGE: (1) we simulate the model and see how well it can be estimated from its own simulation output, and (2) we see how well it can seem to fit nonsense data. We find that (1) even with centuries' worth of data, the model remains poorly estimated, and (2) when we swap series at random, so that (e.g.) what the model gets as the inflation rate is really hours worked, what it gets as hours worked is really investment, etc., the fit is often only slightly impaired, and in a large percentage of cases actually improves (even out of sample). Taken together, these findings cast serious doubt on the meaningfulness of parameter estimates for this DSGE, and on whether this specification represents anything structural about the economy. Constructively, our approaches can be used for model validation by anyone working with macroeconomic time series.
Submitted 31 October, 2022; v1 submitted 28 October, 2022;
originally announced October 2022.
-
Fast and Efficient Scene Categorization for Autonomous Driving using VAEs
Authors:
Saravanabalagi Ramachandran,
Jonathan Horgan,
Ganesh Sistu,
John McDonald
Abstract:
Scene categorization is a useful precursor task that provides prior knowledge for many advanced computer vision tasks with a broad range of applications in content-based image indexing and retrieval systems. Despite the success of data driven approaches in the field of computer vision such as object detection, semantic segmentation, etc., their application in learning high-level features for scene recognition has not achieved the same level of success. We propose to generate a fast and efficient intermediate interpretable generalized global descriptor that captures coarse features from the image and use a classification head to map the descriptors to 3 scene categories: Rural, Urban and Suburban. We train a Variational Autoencoder in an unsupervised manner and map images to a constrained multi-dimensional latent space and use the latent vectors as compact embeddings that serve as global descriptors for images. The experimental results evidence that the VAE latent vectors capture coarse information from the image, supporting their usage as global descriptors. The proposed global descriptor is very compact with an embedding length of 128, significantly faster to compute, and is robust to seasonal and illuminational changes, while capturing sufficient scene information required for scene categorization.
Submitted 26 October, 2022;
originally announced October 2022.
-
Axion detection with phonon-polaritons revisited
Authors:
David J. E. Marsh,
Jamie I. McDonald,
Alexander J. Millar,
Jan Schütte-Engel
Abstract:
In the presence of a background magnetic field, axion dark matter induces an electric field and can thus excite phonon-polaritons in suitable materials. We revisit the calculation of the axion-photon conversion power output from such materials, accounting for finite volume effects, and material losses. Our calculation shows how phonon-polaritons can be converted to propagating photons at the material boundary, offering a route to detecting the signal. Using the dielectric functions of GaAs, Al$_2$O$_3$, and SiO$_2$, a fit to our loss model leads to a signal of lower magnitude than previous calculations. We demonstrate how knowledge of resonances in the dielectric function can directly be used to calculate the sensitivity of any material to axion dark matter. We argue that a combination of low losses encountered at $\mathcal{O}(1)$ K temperatures and near future improvements in detector dark count allow one to probe the QCD axion in the mass range $m_a\approx 100$ meV. This provides further impetus to examine novel materials and further develop detectors in the THz regime. We also discuss possible tuning methods to scan the axion mass.
Submitted 17 April, 2023; v1 submitted 26 September, 2022;
originally announced September 2022.
-
GP-net: Flexible Viewpoint Grasp Proposal
Authors:
Anna Konrad,
John McDonald,
Rudi Villing
Abstract:
We present the Grasp Proposal Network (GP-net), a Convolutional Neural Network model which can generate 6-DoF grasps from flexible viewpoints, e.g. as experienced by mobile manipulators. To train GP-net, we synthetically generate a dataset containing depth-images and ground-truth grasp information. In real-world experiments, we use the EGAD evaluation benchmark to evaluate GP-net against two commonly used algorithms, the Volumetric Grasping Network (VGN) and the Grasp Pose Detection package (GPD), on a PAL TIAGo mobile manipulator. In contrast to the state-of-the-art methods in robotic grasping, GP-net can be used for grasping objects from flexible, unknown viewpoints without the need to define the workspace and achieves a grasp success of 54.4% compared to 51.6% for VGN and 44.2% for GPD. We provide a ROS package along with our code and pre-trained models at https://aucoroboticsmu.github.io/GP-net/.
Submitted 12 October, 2023; v1 submitted 21 September, 2022;
originally announced September 2022.
-
Exponential Family Trend Filtering on Lattices
Authors:
Veeranjaneyulu Sadhanala,
Robert Bassett,
James Sharpnack,
Daniel J. McDonald
Abstract:
Trend filtering is a modern approach to nonparametric regression that is more adaptive to local smoothness than splines or similar basis procedures. Existing analyses of trend filtering focus on estimating a function corrupted by homoskedastic Gaussian noise, but our work extends this technique to general exponential family distributions. This extension is motivated by the need to study massive, gridded climate data derived from polar-orbiting satellites. We present algorithms tailored to large problems, theoretical results for general exponential family likelihoods, and principled methods for tuning parameter selection without excess computation.
Submitted 19 September, 2022;
originally announced September 2022.
-
An Evaluation of Low Overhead Time Series Preprocessing Techniques for Downstream Machine Learning
Authors:
Matthew L. Weiss,
Joseph McDonald,
David Bestor,
Charles Yee,
Daniel Edelman,
Michael Jones,
Andrew Prout,
Andrew Bowne,
Lindsey McEvoy,
Vijay Gadepally,
Siddharth Samsi
Abstract:
In this paper we address the application of pre-processing techniques to multi-channel time series data with varying lengths, which we refer to as the alignment problem, for downstream machine learning. The misalignment of multi-channel time series data may occur for a variety of reasons, such as missing data, varying sampling rates, or inconsistent collection times. We consider multi-channel time series data collected from the MIT SuperCloud High Performance Computing (HPC) center, where different job start times and varying run times of HPC jobs result in misaligned data. This misalignment makes it challenging to build AI/ML approaches for tasks such as compute workload classification. Building on previous supervised classification work with the MIT SuperCloud Dataset, we address the alignment problem via three broad, low overhead approaches: sampling a fixed subset from a full time series, performing summary statistics on a full time series, and sampling a subset of coefficients from time series mapped to the frequency domain. Our best performing models achieve a classification accuracy greater than 95%, outperforming previous approaches to multi-channel time series classification with the MIT SuperCloud Dataset by 5%. These results indicate our low overhead approaches to solving the alignment problem, in conjunction with standard machine learning techniques, are able to achieve high levels of classification accuracy, and serve as a baseline for future approaches to addressing the alignment problem, such as kernel methods.
Submitted 12 September, 2022;
originally announced September 2022.
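The three low-overhead alignment approaches named in the abstract above (fixed-subset sampling, summary statistics, and a subset of frequency-domain coefficients) can be illustrated with a minimal NumPy sketch. The function names, the choice of statistics, and the zero-padding rule are illustrative assumptions, not the authors' implementation; each function only shows how a series of arbitrary length can be mapped to a fixed-size feature array.

```python
import numpy as np

# Hypothetical helpers: names and parameter choices are assumptions for illustration.
# Each takes a (time, channels) array of any length and returns a fixed-size array.

def sample_fixed_subset(series, length):
    """Keep `length` evenly spaced time steps from a (time, channels) array."""
    idx = np.linspace(0, series.shape[0] - 1, num=length).round().astype(int)
    return series[idx]

def summary_statistics(series):
    """Per-channel mean, std, min and max, concatenated into one vector."""
    return np.concatenate([
        series.mean(axis=0),
        series.std(axis=0),
        series.min(axis=0),
        series.max(axis=0),
    ])

def frequency_coefficients(series, n_coeffs):
    """Magnitudes of the first `n_coeffs` real-FFT coefficients per channel.

    Short series are zero-padded so that at least `n_coeffs` coefficients exist.
    """
    n = max(series.shape[0], 2 * n_coeffs)
    spectrum = np.abs(np.fft.rfft(series, n=n, axis=0))
    return spectrum[:n_coeffs]
```

With any of these, two jobs of different duration yield features of identical shape, which is what a standard downstream classifier requires. Note that in this sketch the frequency bins of series with different lengths do not correspond to the same physical frequencies.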
-
High Velocity Stars in SDSS/APOGEE DR17
Authors:
Fredi Quispe-Huaynasi,
Fernando Roig,
Devin J. McDonald,
Veronica Loaiza-Tacuri,
Steven R. Majewski,
Fabio C. Wanderley,
Katia Cunha,
Claudio B. Pereira,
Sten Hasselquist,
Simone Daflon
Abstract:
We report 23 stars having Galactocentric velocities larger than $450~\mathrm{km\,s}^{-1}$ in the final data release of the APOGEE survey. This sample was generated using space velocities derived by complementing the high quality radial velocities from the APOGEE project in Sloan Digital Sky Survey's Data Release 17 (DR17) with distances and proper motions from Gaia early Data Release 3 (eDR3). We analyze the observed kinematics and derived dynamics of these stars, considering different potential models for the Galaxy. We find that three stars could be unbound depending on the adopted potential, but in general all of the stars show typical kinematics of halo stars. The APOGEE DR17 spectroscopic results and Gaia eDR3 photometry are used to assess the stellar parameters and chemical properties of the stars. All of the stars belong to the red giant branch, and, in general, they follow the abundance pattern of typical halo stars. There are a few exceptions that would deserve further analysis through high-resolution spectroscopy. In particular, we identify a high velocity Carbon-Enhanced Metal-Poor (CEMP) star, with Galactocentric velocity of $482~\mathrm{km\,s}^{-1}$. We do not confirm any hypervelocity star in the sample, but this result is very sensitive to the adopted distances, and less sensitive to the Galactic potential.
Submitted 8 September, 2022;
originally announced September 2022.
-
Predicting Non-Equilibrium Folding Behavior of Polymer Chains using the Steepest-Entropy-Ascent Quantum Thermodynamic Framework
Authors:
Jared McDonald,
Michael R. von Spakovsky,
William T. Reynolds Jr
Abstract:
The Replica Exchange Wang-Landau Method is used to estimate the energy landscape of a polymer composed of a simple hydrophobic and polar sequence using the HP protein model. Calculations of state transitions between the energy levels of the derived energy landscape are made using an equation of motion from the steepest-entropy-ascent quantum thermodynamic (SEAQT) framework. The SEAQT framework makes it possible to determine the unique kinetic paths from an arbitrary quasi-equilibrium or non-equilibrium initial state to stable equilibrium. Calculations performed with SEAQT require significantly reduced computational time versus comparable Monte Carlo simulations while providing otherwise unavailable thermodynamic and structural properties. Expected values for state averaged structural parameters are used to produce representative reconstructions of the calculated state-based evolution. Results show continuous transitions between states with no distinct folding phases. Changes in chain conformations during heating and cooling are more drastic along non-equilibrium paths than along quasi-equilibrium paths. In addition, SEAQT-derived kinetics are compared to experimentally derived intensity profiles describing the kinetics of the cytochrome c protein using Rouse dynamic relations.
Submitted 29 January, 2023; v1 submitted 19 August, 2022;
originally announced August 2022.
-
Realizability conditions for relativistic gases with a non-zero heat flux
Authors:
Stefano Boccelli,
James G. McDonald
Abstract:
This work introduces a limitation on the minimum value that can be assumed by the energy of a relativistic gas in the presence of a non-zero heat flux. Such a limitation arises from the non-negativity of the particle distribution function, and is found by solving the Hamburger moment problem. The resulting limitation is seen to recover the Taub inequality in the case of a zero heat flux, but is more strict if a non-zero heat flux is considered. These results imply that, in order for the distribution function to be non-negative, (i) the energy of a gas must be larger than a minimum threshold; (ii) the heat flux, on the other hand, has a maximum value determined by the energy and the pressure tensor; and (iii) there exists an upper limit for the adiabatic index $Γ$ of the relativistic equation of state, and that limit decreases in the presence of a heat flux and pressure anisotropy, asymptoting to a value $Γ= 1$. The latter point implies that the Synge equation of state is formally incompatible with a relativistic gas showing a heat flux, except in certain gas states.
Submitted 16 August, 2022;
originally announced August 2022.
-
Topological duality for orthomodular lattices
Authors:
Joseph McDonald,
Katalin Bimbó
Abstract:
A class of ordered relational topological spaces is described, which we call orthomodular spaces. Our construction of these spaces involves adding a topology to the class of orthomodular frames introduced by Hartonas, along the lines of Bimbó's topologization of the class of orthoframes employed by Goldblatt in his representation of ortholattices. We then prove that the category of orthomodular lattices and homomorphisms is dually equivalent to the category of orthomodular spaces and certain continuous frame morphisms, which we call continuous weak p-morphisms. It is well-known that orthomodular lattices provide an algebraic semantics for the quantum logic Q. Hence, as an application of our duality, we develop a topological semantics for Q using orthomodular spaces and prove soundness and completeness.
Submitted 25 April, 2023; v1 submitted 15 August, 2022;
originally announced August 2022.
-
Higgs Inflation Using the Unstable Standard Model Potential
Authors:
John McDonald
Abstract:
It is likely that the Higgs potential of the Standard Model is unstable, turning negative at $φ> Λ\sim 10^{10}$ GeV. Here we consider whether it is possible to have Higgs Inflation on the positive stable region of the potential at $φ< Λ$. To do this we add a non-minimally coupled induced gravity sector with scalar $χ$ to the Standard Model. For an appropriate form for the non-minimal coupling of $χ$, we show that it is possible to have conventional Higgs inflation at small $φ< Λ$ if the effective Planck mass in the Jordan frame during inflation is sufficiently small, with a phase transition to $χ\neq 0$ at the end of Higgs inflation which increases the Jordan frame Planck mass to its presently observed value. In the Einstein frame this corresponds to a suppression of the Higgs kinetic and potential terms at the end of inflation. We show that the predictions of Higgs inflation at tree level are unaltered from conventional Higgs Inflation, with the exception of the magnitude of the Higgs field during inflation. Hence Higgs Inflation can be achieved using the potential of the unmodified Standard Model.
Submitted 13 September, 2023; v1 submitted 8 August, 2022;
originally announced August 2022.
-
sparsegl: An R Package for Estimating Sparse Group Lasso
Authors:
Xiaoxuan Liang,
Aaron Cohen,
Anibal Solón Heinsfeld,
Franco Pestilli,
Daniel J. McDonald
Abstract:
The sparse group lasso is a high-dimensional regression technique that is useful for problems whose predictors have a naturally grouped structure and where sparsity is encouraged at both the group and individual predictor level. In this paper we discuss a new R package for computing such regularized models. The intention is to provide highly optimized solution routines enabling analysis of very large datasets, especially in the context of sparse design matrices.
Submitted 30 October, 2023; v1 submitted 4 August, 2022;
originally announced August 2022.
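To make concrete what "sparsity at both the group and individual predictor level" means in the abstract above, the following minimal sketch evaluates a sparse group lasso penalty in its commonly used form: an l1 term on all coefficients plus a group-wise l2 term. The function name, the mixing parameter `alpha`, and the square-root-of-group-size weights are assumptions chosen for illustration; this is plain NumPy, not the sparsegl R package's API, and the package's exact weighting may differ.

```python
import numpy as np

def sparse_group_lasso_penalty(beta, groups, lam, alpha):
    """Illustrative penalty: lam * (alpha * ||beta||_1
    + (1 - alpha) * sum_g sqrt(p_g) * ||beta_g||_2).

    `groups` assigns a group label to each coefficient; sqrt(p_g) weights
    (p_g = group size) are an assumed, commonly used convention.
    """
    beta = np.asarray(beta, dtype=float)
    groups = np.asarray(groups)
    l1_term = np.abs(beta).sum()
    group_term = 0.0
    for g in np.unique(groups):
        block = beta[groups == g]
        group_term += np.sqrt(block.size) * np.linalg.norm(block)
    return lam * (alpha * l1_term + (1.0 - alpha) * group_term)
```

Setting `alpha = 1` reduces this to the ordinary lasso penalty and `alpha = 0` to the group lasso penalty; intermediate values trade off the two kinds of sparsity.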
-
Superradiance in Stars: Non-equilibrium approach to damping of fields in stellar media
Authors:
Francesca Chadha-Day,
Björn Garbrecht,
Jamie McDonald
Abstract:
Superradiance in black holes is well-understood but a general treatment for superradiance in stars has until now been lacking. This is surprising given the ease with which we can observe isolated neutron stars and the array of signatures which would result from stellar superradiance. In this work, we present the first systematic pipeline for computing superradiance rates in rotating stars. Our method can be used with any Lagrangian describing the interaction between the superradiant field and the constituents of the star. Our scheme falls into two parts: firstly we show how field theory at finite density can be used to express the absorption of long wavelength modes into the star in terms of microphysical scattering processes. This allows us to derive a damped equation of motion for the bosonic field. We then feed this into an effective theory for long wavelengths (the so-called worldline formalism) to describe the amplification of superradiant modes of arbitrary multipole moment for a rapidly rotating star. Our method places stellar superradiance on a firm theoretical footing and allows the calculation of the superradiance rate arising from any interaction between a bosonic field and stellar matter.
Submitted 15 July, 2022;
originally announced July 2022.
-
Woodscape Fisheye Object Detection for Autonomous Driving -- CVPR 2022 OmniCV Workshop Challenge
Authors:
Saravanabalagi Ramachandran,
Ganesh Sistu,
Varun Ravi Kumar,
John McDonald,
Senthil Yogamani
Abstract:
Object detection is a comprehensively studied problem in autonomous driving. However, it has been relatively less explored in the case of fisheye cameras. The strong radial distortion breaks the translation invariance inductive bias of Convolutional Neural Networks. Thus, we present the WoodScape fisheye object detection challenge for autonomous driving which was held as part of the CVPR 2022 Workshop on Omnidirectional Computer Vision (OmniCV). This is one of the first competitions focused on fisheye camera object detection. We encouraged the participants to design models which work natively on fisheye images without rectification. We used CodaLab to host the competition based on the publicly available WoodScape fisheye dataset. In this paper, we provide a detailed analysis of the competition, which attracted the participation of 120 global teams and a total of 1492 submissions. We briefly discuss the details of the winning methods and analyze their qualitative and quantitative results.
Submitted 26 June, 2022;
originally announced June 2022.
-
ViT-BEVSeg: A Hierarchical Transformer Network for Monocular Birds-Eye-View Segmentation
Authors:
Pramit Dutta,
Ganesh Sistu,
Senthil Yogamani,
Edgar Galván,
John McDonald
Abstract:
Generating a detailed near-field perceptual model of the environment is an important and challenging problem in both self-driving vehicles and autonomous mobile robotics. A Bird's Eye View (BEV) map, providing a panoptic representation, is a commonly used approach that provides a simplified 2D representation of the vehicle surroundings with accurate semantic level segmentation for many downstream tasks. Current state-of-the-art approaches to generate BEV-maps employ a Convolutional Neural Network (CNN) backbone to create feature-maps which are passed through a spatial transformer to project the derived features onto the BEV coordinate frame. In this paper, we evaluate the use of vision transformers (ViT) as a backbone architecture to generate BEV maps. Our network architecture, ViT-BEVSeg, employs standard vision transformers to generate a multi-scale representation of the input image. The resulting representation is then provided as an input to a spatial transformer decoder module which outputs segmentation maps in the BEV grid. We evaluate our approach on the nuScenes dataset demonstrating a considerable improvement in the performance relative to state-of-the-art approaches.
Submitted 31 May, 2022;
originally announced May 2022.
-
Great Power, Great Responsibility: Recommendations for Reducing Energy for Training Language Models
Authors:
Joseph McDonald,
Baolin Li,
Nathan Frey,
Devesh Tiwari,
Vijay Gadepally,
Siddharth Samsi
Abstract:
The energy requirements of current natural language processing models continue to grow at a rapid, unsustainable pace. Recent works highlighting this problem conclude there is an urgent need for methods that reduce the energy needs of NLP and machine learning more broadly. In this article, we investigate techniques that can be used to reduce the energy consumption of common NLP applications. In particular, we focus on techniques to measure energy usage and different hardware and datacenter-oriented settings that can be tuned to reduce energy consumption for training and inference for language models. We characterize the impact of these settings on metrics such as computational performance and energy consumption through experiments conducted on a high performance computing system as well as popular cloud computing platforms. These techniques can lead to significant reduction in energy consumption when training language models or using them for inference. For example, power-capping, which limits the maximum power a GPU can consume, can enable a 15% decrease in energy usage with marginal increase in overall computation time when training a transformer-based language model.
Submitted 19 May, 2022;
originally announced May 2022.
-
The MIT Supercloud Workload Classification Challenge
Authors:
Benny J. Tang,
Qiqi Chen,
Matthew L. Weiss,
Nathan Frey,
Joseph McDonald,
David Bestor,
Charles Yee,
William Arcand,
Chansup Byun,
Daniel Edelman,
Matthew Hubbell,
Michael Jones,
Jeremy Kepner,
Anna Klein,
Adam Michaleas,
Peter Michaleas,
Lauren Milechin,
Julia Mullen,
Andrew Prout,
Albert Reuther,
Antonio Rosa,
Andrew Bowne,
Lindsey McEvoy,
Baolin Li,
Devesh Tiwari
, et al. (2 additional authors not shown)
Abstract:
High-Performance Computing (HPC) centers and cloud providers support an increasingly diverse set of applications on heterogeneous hardware. As Artificial Intelligence (AI) and Machine Learning (ML) workloads have become an increasingly large share of the compute workloads, new approaches to optimized resource usage, allocation, and deployment of new AI frameworks are needed. By identifying compute workloads and their utilization characteristics, HPC systems may be able to better match available resources with the application demand. By leveraging datacenter instrumentation, it may be possible to develop AI-based approaches that can identify workloads and provide feedback to researchers and datacenter operators for improving operational efficiency. To enable this research, we released the MIT Supercloud Dataset, which provides detailed monitoring logs from the MIT Supercloud cluster. This dataset includes CPU and GPU usage by jobs, memory usage, and file system logs. In this paper, we present a workload classification challenge based on this dataset. We introduce a labelled dataset that can be used to develop new approaches to workload classification and present initial results based on existing approaches. The goal of this challenge is to foster algorithmic innovations in the analysis of compute workloads that can achieve higher accuracy than existing methods. Data and code will be made publicly available via the Datacenter Challenge website: https://dcc.mit.edu.
Submitted 13 April, 2022; v1 submitted 12 April, 2022;
originally announced April 2022.
-
Dark Matter In Extreme Astrophysical Environments
Authors:
Masha Baryakhtar,
Regina Caputo,
Djuna Croon,
Kerstin Perez,
Emanuele Berti,
Joseph Bramante,
Malte Buschmann,
Richard Brito,
Thomas Y. Chen,
Philippa S. Cole,
Adam Coogan,
William E. East,
Joshua W. Foster,
Marios Galanis,
Maurizio Giannotti,
Bradley J. Kavanagh,
Ranjan Laha,
Rebecca K. Leane,
Benjamin V. Lehmann,
Gustavo Marques-Tavares,
Jamie McDonald,
Ken K. Y. Ng,
Nirmal Raj,
Laura Sagunski,
Jeremy Sakstein
, et al. (15 additional authors not shown)
Abstract:
Exploring dark matter via observations of extreme astrophysical environments -- defined here as heavy compact objects such as white dwarfs, neutron stars, and black holes, as well as supernovae and compact object merger events -- has been a major field of growth since the last Snowmass process. Theoretical work has highlighted the utility of current and near-future observatories to constrain novel dark matter parameter space across the full mass range. This includes gravitational wave instruments and observatories spanning the electromagnetic spectrum, from radio to gamma-rays. While recent searches already provide leading sensitivity to various dark matter models, this work also highlights the need for theoretical astrophysics research to better constrain the properties of these extreme astrophysical systems. The unique potential of these search signatures to probe dark matter adds motivation to proposed next-generation astronomical and gravitational wave instruments.
Submitted 7 November, 2022; v1 submitted 15 March, 2022;
originally announced March 2022.