Search | arXiv e-print repository

Evaluating Ensemble Methods for News Recommender Systems

Abstract: News recommendation is crucial for facilitating individuals' access to articles, particularly amid the increasingly digital landscape of news consumption. Consequently, extensive research is dedicated to News Recommender Systems (NRS) with increasingly sophisticated algorithms. Despite this sustained scholarly inquiry, there exists a notable research gap regarding the potential synergy achievable by amalgamating these algorithms to yield superior outcomes. This paper endeavours to address this gap by demonstrating how ensemble methods can be used to combine many diverse state-of-the-art algorithms to achieve superior results on the Microsoft News dataset (MIND). Additionally, we identify scenarios where ensemble methods fail to improve results and offer explanations for this occurrence. Our findings demonstrate that a combination of NRS algorithms can outperform individual algorithms, provided that the base learners are sufficiently diverse, with improvements of up to 5% observed for an ensemble consisting of a content-based BERT approach and the collaborative filtering LSTUR algorithm. Additionally, our results demonstrate the absence of any improvement when combining insufficiently distinct methods. These findings provide insight into successful approaches of ensemble methods in NRS and advocate for the development of better systems through appropriate ensemble solutions.

Submitted 23 June, 2024; originally announced June 2024.

arXiv:2406.14483 [pdf, other]

Valid Error Bars for Neural Weather Models using Conformal Prediction

Authors: Vignesh Gopakumar, Joel Oskarrson, Ander Gray, Lorenzo Zanisi, Stanislas Pamela, Daniel Giles, Matt Kusner, Marc Deisenroth

Abstract: Neural weather models have shown immense potential as inexpensive and accurate alternatives to physics-based models. However, most models trained to perform weather forecasting do not quantify the uncertainty associated with their forecasts. This limits the trust in the model and the usefulness of the forecasts. In this work we construct and formalise a conformal prediction framework as a post-processing method for estimating this uncertainty. The method is model-agnostic and gives calibrated error bounds for all variables, lead times and spatial locations. No modifications are required to the model and the computational cost is negligible compared to model training. We demonstrate the usefulness of the conformal prediction framework on a limited area neural weather model for the Nordic region. We further explore the advantages of the framework for deterministic and probabilistic models.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.11937 [pdf, other]

Using graph neural networks to reconstruct charged pion showers in the CMS High Granularity Calorimeter

Authors: M. Aamir, B. Acar, G. Adamov, T. Adams, C. Adloff, S. Afanasiev, C. Agrawal, C. Agrawal, A. Ahmad, H. A. Ahmed, S. Akbar, N. Akchurin, B. Akgul, B. Akgun, R. O. Akpinar, E. Aktas, A. AlKadhim, V. Alexakhin, J. Alimena, J. Alison, A. Alpana, W. Alshehri, P. Alvarez Dominguez, M. Alyari, C. Amendola , et al. (550 additional authors not shown)

Abstract: A novel method to reconstruct the energy of hadronic showers in the CMS High Granularity Calorimeter (HGCAL) is presented. The HGCAL is a sampling calorimeter with very fine transverse and longitudinal granularity. The active media are silicon sensors and scintillator tiles read out by SiPMs, and the absorbers are a combination of lead and Cu/CuW in the electromagnetic section, and steel in the hadronic section. The shower reconstruction method is based on graph neural networks and it makes use of a dynamic reduction network architecture. It is shown that the algorithm is able to capture and mitigate the main effects that normally hinder the reconstruction of hadronic showers using classical reconstruction methods, by compensating for fluctuations in the multiplicity, energy, and spatial distributions of the shower's constituents. The performance of the algorithm is evaluated using test beam data collected in 2018 with a prototype of the CMS HGCAL accompanied by a section of the CALICE AHCAL prototype. The capability of the method to mitigate the impact of energy leakage from the calorimeter is also demonstrated.

Submitted 30 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: Prepared for submission to JINST

arXiv:2405.02350 [pdf, ps, other]

What makes Models Compositional? A Theoretical View: With Supplement

Authors: Parikshit Ram, Tim Klinger, Alexander G. Gray

Abstract: Compositionality is thought to be a key component of language, and various compositional benchmarks have been developed to empirically probe the compositional generalization of existing sequence processing models. These benchmarks often highlight failures of existing models, but it is not clear why these models fail in this way. In this paper, we seek to theoretically understand the role the compositional structure of the models plays in these failures and how this structure relates to their expressivity and sample complexity. We propose a general neuro-symbolic definition of compositional functions and their compositional complexity. We then show how various existing general and special purpose sequence processing models (such as recurrent, convolution and attention-based ones) fit this definition and use it to analyze their compositional complexity. Finally, we provide theoretical guarantees for the expressivity and systematic generalization of compositional models that explicitly depend on our proposed definition, and highlight factors which drive poor empirical performance.

Submitted 2 May, 2024; originally announced May 2024.

Comments: Extended version of the original IJCAI 2024 paper with detailed supplementary materials (27 pages, 7 figures)

arXiv:2405.01445 [pdf]

Depth-resolved profile of the interfacial ferromagnetism in $CaMnO_{3}/CaRuO_{3}$ superlattices

Authors: J. R. Paudel, A. Mansouri Tehrani, M. Terilli, M. Kareev, J. Grassi, R. K. Sah, L. Wu, V. N. Strocov, C. Klewe, P. Shafer, J. Chakhalian, N. A. Spaldin, A. X. Gray

Abstract: Emergent magnetic phenomena at interfaces represent a frontier in materials science, pivotal for advancing technologies in spintronics and magnetic storage. In this letter, we utilize a suite of advanced X-ray spectroscopic and scattering techniques to investigate emergent interfacial ferromagnetism in oxide superlattices comprised of antiferromagnetic CaMnO3 and paramagnetic CaRuO3. Our findings challenge prior theoretical models by demonstrating that the ferromagnetism extends beyond the interfacial layer into multiple unit cells of CaMnO3 and exhibits an asymmetric profile. Complementary density functional calculations reveal that the interfacial ferromagnetism is driven by the double exchange mechanism, facilitated by charge transfer from Ru to Mn ions. Additionally, defect chemistry, particularly the presence of oxygen vacancies, likely plays a crucial role in modifying the magnetic moments at the interface, leading to the observed asymmetry between the top and bottom CaMnO3 interfacial magnetic layers. Our findings underscore the potential of manipulating interfacial ferromagnetism through point defect engineering.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2403.17394 [pdf, other]

IQMDose3D: a software tool for reconstructing the dose in patient using patient planning CT images and the signals measured by IQM detector

Authors: Aitang Xing, Gary Goozee, Alison Gray, Vaughan Moutrie, Sankar Arumugam, Shrikant Deshpande, Anthony Espinoza, Vasilis Kondilis, Marjorie McDonald, Philip Vial

Abstract: The integral quality monitor (IQM) system compares the signal measured with a large volume chamber mounted to the linear accelerator's head to the signal calculated using the patient DICOM RT plan for patient-specific quality assurance (PSQA). A method was developed to reconstruct the dose in patients using the signal measured by the IQM chamber and patient planning CT images. A software tool named IQMDose3D was implemented to automate this procedure and integrated into the IQM-based PSQA workflow. IQMDose3D enables physicists to evaluate PSQA from a clinical perspective by comparing the delivered plan to the approved clinical plan in terms of the clinical goals and dose-volume histogram (DVH), in addition to the three-dimensional (3D) gamma map and gamma pass rate.

Submitted 29 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

Comments: Accepted by ICCR 2024 conference

arXiv:2403.17365 [pdf, other]

AutoMRISimQA: an automated system for daily quality control of a 3T MRI simulator

Authors: Aitang Xing, Gary Goozee, Gary Liney, Sankar Arumugam, Shrikant Deshpande, Anthony Espinoza, Alison Gray, Vasilis Kondilis, Doaa Elwadia, Robba Rai, Lois Holloway

Abstract: A software system named AutoMRISimQA was developed to monitor the daily performance of a wide-bore 3T scanner (MRI) which was designed and dedicated to radiotherapy simulation. The system can monitor the performance of the MRI simulator not only by using image quality indices such as signal-to-noise ratio (SNR), uniformity, ghosting and contrast, but also by performing a quick quantitative check of the geometric accuracy and the external lasers. It was implemented into the daily clinical workflow in 2013 and has been used for more than 10 years. It was also seamlessly integrated with QAtrack, allowing continuous monitoring of the consistency of the MRI simulator's performance.

Submitted 29 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

Comments: Accepted by conference ICCR 2024

arXiv:2403.16887 [pdf]

ChatGPT "contamination": estimating the prevalence of LLMs in the scholarly literature

Authors: Andrew Gray

Abstract: The use of ChatGPT and similar Large Language Model (LLM) tools in scholarly communication and academic publishing has been widely discussed since they became easily accessible to a general audience in late 2022. This study uses keywords known to be disproportionately present in LLM-generated text to provide an overall estimate for the prevalence of LLM-assisted writing in the scholarly literature. For the publishing year 2023, it is found that several of those keywords show a distinctive and disproportionate increase in their prevalence, individually and in combination. It is estimated that at least 60,000 papers (slightly over 1% of all articles) were LLM-assisted, though this number could be extended and refined by analysis of other characteristics of the papers or by identification of further indicative keywords.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 12 pages, 6 figures

arXiv:2403.13006 [pdf, other]

Gaussian process surrogate models for the properties of micro-tearing modes in spherical tokamaks

Authors: William Hornsby, Ander Gray, James Buchanan, Daniel Kennedy, Bhavin Patel, Francis Casson, Colin Roach, Mikkel Lykkegaard, Huy Nguyen, Nikolaos Papadimas, Ben Fourcin, Jordan Hart

Abstract: Spherical tokamaks (STs) have many desirable features that make them a suitable choice for fusion power plants. To understand their confinement properties, accurate calculation of turbulent micro-instabilities is necessary for tokamak design. Presented is a novel surrogate model for micro-tearing modes (MTMs), the micro-instability thought to be dominant in high beta STs. Direct numerical calculation of micro-instabilities is computationally expensive and is a significant bottleneck in integrated plasma modelling. The considerable number of geometric and thermodynamic parameters, the interactions that influence these coefficients and the resolutions needed to accurately resolve these modes make direct numerical simulation for parameter space exploration computationally extremely challenging. However, this and the dearth of accurate reduced physics models for MTMs make it suitable for surrogate modelling using Gaussian Process Regression, a modern machine learning technique. This paper outlines the further development of a data-driven reduced-order model across a spherical tokamak reactor-relevant parameter space utilising Gaussian Process Regression (GPR) and classification, techniques from machine learning. To build the original simple GP model these two components were used in an active learning loop to maximise the efficiency of data acquisition, thus minimising computational cost. The 'simple' GP was seen to show a plateau of fidelity with more data and to be under-confident, particularly in areas of parameter space close to marginal stability. It is postulated that the presence of multiple sub-types of MTM could be the root cause, with the underlying function being less smooth than expected. An expansion of the model using clustering algorithms to find optimal sub models using a mixture of experts approach is shown to greatly improve the variances in the outputs of the GP model.

Submitted 14 March, 2024; originally announced March 2024.

Comments: Submitted to the IAEA FEC 2023 conference

arXiv:2402.13440 [pdf, other]

A Neuro-Symbolic Approach to Multi-Agent RL for Interpretability and Probabilistic Decision Making

Authors: Chitra Subramanian, Miao Liu, Naweed Khan, Jonathan Lenchner, Aporva Amarnath, Sarathkrishna Swaminathan, Ryan Riegel, Alexander Gray

Abstract: Multi-agent reinforcement learning (MARL) is well-suited for runtime decision-making in optimizing the performance of systems where multiple agents coexist and compete for shared resources. However, applying common deep learning-based MARL solutions to real-world problems suffers from issues of interpretability, sample efficiency, partial observability, etc. To address these challenges, we present an event-driven formulation, where decision-making is handled by distributed co-operative MARL agents using neuro-symbolic methods. The recently introduced neuro-symbolic Logical Neural Networks (LNN) framework serves as a function approximator for the RL, to train a rules-based policy that is both logical and interpretable by construction. To enable decision-making under uncertainty and partial observability, we developed a novel probabilistic neuro-symbolic framework, Probabilistic Logical Neural Networks (PLNN), which combines the capabilities of logical reasoning with probabilistic graphical models. In PLNN, the upward/downward inference strategy, inherited from LNN, is coupled with belief bounds by setting the activation function for the logical operator associated with each neural network node to a probability-respecting generalization of the Fréchet inequalities. These PLNN nodes form the unifying element that combines probabilistic logic and Bayes Nets, permitting inference for variables with unobserved states. We demonstrate our contributions by addressing key MARL challenges for power sharing in a system-on-chip application.

Submitted 20 February, 2024; originally announced February 2024.

ACM Class: I.2.6

arXiv:2402.04302 [pdf]

Ultrafast terahertz field control of the emergent magnetic and electronic interactions at oxide interfaces

Authors: A. M. Derrico, M. Basini, V. Unikandanunni, J. R. Paudel, M. Kareev, M. Terilli, T. -C. Wu, A. Alostaz, C. Klewe, P. Shafer, A. Gloskovskii, C. Schlueter, C. M. Schneider, J. Chakhalian, S. Bonetti, A. X. Gray

Abstract: Ultrafast electric-field control of emergent electronic and magnetic states at oxide interfaces offers exciting prospects for the development of new generations of energy-efficient devices. Here, we demonstrate that the electronic structure and emergent ferromagnetic interfacial state in epitaxial LaNiO3/CaMnO3 superlattices can be effectively controlled using intense single-cycle THz electric-field pulses. We employ a combination of polarization-dependent X-ray absorption spectroscopy with magnetic circular dichroism and X-ray resonant magnetic reflectivity to measure a detailed magneto-optical profile and thickness of the ferromagnetic interfacial layer. Then, we use time-resolved and temperature-dependent magneto-optical Kerr effect, along with transient optical reflectivity and transmissivity measurements, to disentangle multiple correlated electronic and magnetic processes driven by ultrafast high-field (~1 MV/cm) THz pulses. These processes include an initial sub-picosecond electronic response, consistent with non-equilibrium Joule heating; a rapid (~270 fs) demagnetization of the ferromagnetic interfacial layer, driven by THz-field-induced nonequilibrium spin-polarized currents; and subsequent multi-picosecond dynamics, possibly indicative of a change in the magnetic state of the superlattice due to the transfer of spin angular momentum to the lattice. Our findings shed light on the intricate interplay of electronic and magnetic phenomena in this strongly correlated material system, suggesting a promising avenue for efficient control of two-dimensional ferromagnetic states at oxide interfaces using ultrafast electric-field pulses.

Submitted 6 February, 2024; originally announced February 2024.

arXiv:2311.05967 [pdf, other]

doi 10.1088/1741-4326/ad313a

Plasma Surrogate Modelling using Fourier Neural Operators

Authors: Vignesh Gopakumar, Stanislas Pamela, Lorenzo Zanisi, Zongyi Li, Ander Gray, Daniel Brennand, Nitesh Bhatia, Gregory Stathopoulos, Matt Kusner, Marc Peter Deisenroth, Anima Anandkumar, JOREK Team, MAST Team

Abstract: Predicting plasma evolution within a Tokamak reactor is crucial to realizing the goal of sustainable fusion. Capabilities in forecasting the spatio-temporal evolution of plasma rapidly and accurately allow us to quickly iterate over design and control strategies on current Tokamak devices and future reactors. Modelling plasma evolution using numerical solvers is often expensive, consuming many hours on supercomputers, and hence, we need alternative inexpensive surrogate models. We demonstrate accurate predictions of plasma evolution both in simulation and experimental domains using deep learning-based surrogate modelling tools, viz., Fourier Neural Operators (FNO). We show that FNO has a speedup of six orders of magnitude over traditional solvers in predicting the plasma dynamics simulated from magnetohydrodynamic models, while maintaining a high accuracy (MSE in the normalised domain $\approx 10^{-5}$). Our modified version of the FNO is capable of solving multi-variable Partial Differential Equations (PDE), and can capture the dependence among the different variables in a single model. FNOs can also predict plasma evolution on real-world experimental data observed by the cameras positioned within the MAST Tokamak, i.e., cameras looking across the central solenoid and the divertor in the Tokamak. We show that FNOs are able to accurately forecast the evolution of plasma and have the potential to be deployed for real-time monitoring. We also illustrate their capability in forecasting the plasma shape, the locations of interactions of the plasma with the central solenoid and the divertor for the full (available) duration of the plasma shot within MAST. The FNO offers a viable alternative for surrogate modelling as it is quick to train and infer, and requires fewer data points, while being able to do zero-shot super-resolution and getting high-fidelity solutions.

Submitted 18 June, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

Journal ref: Nucl. Fusion 64 056025 (2024)

arXiv:2309.16467 [pdf, other]

Compositional Program Generation for Few-Shot Systematic Generalization

Authors: Tim Klinger, Luke Liu, Soham Dan, Maxwell Crouse, Parikshit Ram, Alexander Gray

Abstract: Compositional generalization is a key ability of humans that enables us to learn new concepts from only a handful of examples. Neural machine learning models, including the now ubiquitous Transformers, struggle to generalize in this way, and typically require thousands of examples of a concept during training in order to generalize meaningfully. This difference in ability between humans and artificial neural architectures motivates this study on a neuro-symbolic architecture called the Compositional Program Generator (CPG). CPG has three key features: modularity, composition, and abstraction, in the form of grammar rules, that enable it to generalize both systematically to new concepts in a few-shot manner, as well as productively by length on various sequence-to-sequence language tasks. For each input, CPG uses a grammar of the input language and a parser to generate a parse in which each grammar rule is assigned its own unique semantic module, a probabilistic copy or substitution program. Instances with the same parse are always processed with the same composed modules, while those with different parses may be processed with different modules. CPG learns parameters for the modules and is able to learn the semantics for new rules and types incrementally, without forgetting or retraining on rules it's already seen. It achieves perfect generalization on both the SCAN and COGS benchmarks using just 14 examples for SCAN and 22 examples for COGS, state-of-the-art accuracy with a 1000x improvement in sample efficiency.

Submitted 18 January, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: 7 pages of text with 1 page of references

arXiv:2309.09785 [pdf, other]

Gaussian Process Regression models for the properties of micro-tearing modes in spherical tokamak

Authors: William Hornsby, Ander Gray, James Buchanan, Bhavin Patel, Daniel Kennedy, Francis Casson, Colin Roach, Mikkel Lykkegaard, Huy Nguyen, Nikolaos Papadimas, Ben Fourcin, Jordan Hart

Abstract: Spherical tokamaks (STs) have many desirable features that make them an attractive choice for a future fusion power plant. Power plant viability is intrinsically related to plasma heat and particle confinement and this is often determined by the level of micro-instability driven turbulence. Accurate calculation of the properties of turbulent micro-instabilities is therefore critical for tokamak design; however, the evaluation of these properties is computationally expensive. The considerable number of geometric and thermodynamic parameters and the high resolutions required to accurately resolve these instabilities make repeated use of direct numerical simulations in integrated modelling workflows extremely computationally challenging and create the need for fast, accurate, reduced-order models. This paper outlines the development of a data-driven reduced-order model, often termed a surrogate model, for the properties of micro-tearing modes (MTMs) across a spherical tokamak reactor-relevant parameter space utilising Gaussian Process Regression (GPR) and classification, techniques from machine learning. These two components are used in an active learning loop to maximise the efficiency of data acquisition, thus minimising computational cost. The high-fidelity gyrokinetic code GS2 is used to calculate the linear properties of the MTMs: the mode growth rate, frequency and normalised electron heat flux; core components of a quasi-linear transport model. Five-fold cross-validation and direct validation on unseen data are used to ascertain the performance of the resulting surrogate models.

Submitted 14 November, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

arXiv:2308.13292 [pdf, other]

A Bayesian Active Learning Approach to Comparative Judgement

Authors: Andy Gray, Alma Rahat, Tom Crick, Stephen Lindsay

Abstract: Assessment is a crucial part of education. Traditional marking is a source of inconsistencies and unconscious bias, placing a high cognitive load on the assessors. An approach to address these issues is comparative judgement (CJ). In CJ, the assessor is presented with a pair of items and is asked to select the better one. Following a series of comparisons, a rank is derived using a ranking model, for example, the BTM, based on the results. While CJ is considered a reliable method for marking, there are concerns around transparency, and the ideal number of pairwise comparisons to generate a reliable estimation of the rank order is not known. Additionally, there have been attempts to generate a method of selecting pairs that should be compared next in an informative manner, but some existing methods are known to have created their own bias within results inflating the reliability metric used. As a result, a random selection approach is usually deployed. We propose a novel Bayesian approach to CJ (BCJ) for determining the ranks of compared items alongside a new way to select the pairs to present to the marker(s) using active learning (AL), addressing the key shortcomings of traditional CJ. Furthermore, we demonstrate how the entire approach may provide transparency by providing the user insights into how it is making its decisions and, at the same time, being more efficient. Results from our experiments confirm that the proposed BCJ combined with entropy-driven AL pair-selection method is superior to other alternatives. We also find that the more comparisons done, the more accurate BCJ becomes, which solves the issue the current method has of the model deteriorating if too many comparisons are performed. As our approach can generate the complete predicted rank distribution for an item, we also show how this can be utilised in devising a predicted grade, guided by the assessor.

Submitted 25 August, 2023; originally announced August 2023.

Comments: 16 pages

arXiv:2308.05796 [pdf, other]

doi 10.1103/PhysRevB.109.075169

Crystalline-electromagnetic responses of higher order topological semimetals

Authors: Mark R. Hirsbrunner, Alexander D. Gray, Taylor L. Hughes

Abstract: Previous work has shown that time-reversal symmetric Weyl semimetals with a quadrupolar arrangement of first-order Weyl nodes exhibit a mixed crystalline-electromagnetic response. For systems with higher order Weyl nodes, which are attached to both surface and hinge Fermi arcs, additional phenomena appear on surfaces of codimension $n>1$, such as electromagnetic responses of the hinges. Here we construct a model possessing a quadrupole of higher order Weyl nodes to study the interplay between higher order topology and mixed crystalline-electromagnetic responses. We show that the higher order nature of the Weyl nodes yields a dipole of Dirac nodes on certain surfaces, leading to a mixed crystalline-electromagnetic \emph{surface} response that binds charge to dislocations and momentum-density to magnetic fields. In addition, we show that the model possesses a bulk quadrupole moment of crystal-momentum that provides a link between the bulk and surface responses of the system.

Submitted 10 August, 2023; originally announced August 2023.

Report number: Phys. Rev. B 109, 075169

arXiv:2307.02689 [pdf, other]

Learning Symbolic Rules over Abstract Meaning Representations for Textual Reinforcement Learning

Authors: Subhajit Chaudhury, Sarathkrishna Swaminathan, Daiki Kimura, Prithviraj Sen, Keerthiram Murugesan, Rosario Uceda-Sosa, Michiaki Tatsubori, Achille Fokoue, Pavan Kapanipathi, Asim Munawar, Alexander Gray

Abstract: Text-based reinforcement learning agents have predominantly been neural network-based models with embeddings-based representation, learning uninterpretable policies that often do not generalize well to unseen games. On the other hand, neuro-symbolic methods, specifically those that leverage an intermediate formal representation, are gaining significant attention in language understanding tasks. This is because of their advantages ranging from inherent interpretability, the lesser requirement of training data, and being generalizable in scenarios with unseen data. Therefore, in this paper, we propose a modular, NEuro-Symbolic Textual Agent (NESTA) that combines a generic semantic parser with a rule induction system to learn abstract interpretable rules as policies. Our experiments on established text-based game benchmarks show that the proposed NESTA method outperforms deep reinforcement learning-based techniques by achieving better generalization to unseen test games and learning from fewer training interactions.

Submitted 5 July, 2023; originally announced July 2023.

Comments: ACL 2023

arXiv:2306.15041 [pdf]

A Comparison of Neuroelectrophysiology Databases

Authors: Priyanka Subash, Alex Gray, Misque Boswell, Samantha L. Cohen, Rachael Garner, Sana Salehi, Calvary Fisher, Samuel Hobel, Satrajit Ghosh, Yaroslav Halchenko, Benjamin Dichter, Russell A. Poldrack, Chris Markiewicz, Dora Hermes, Arnaud Delorme, Scott Makeig, Brendan Behan, Alana Sparks, Stephen R Arnott, Zhengjia Wang, John Magnotti, Michael S. Beauchamp, Nader Pouratian, Arthur W. Toga, Dominique Duncan

Abstract: As data sharing has become more prevalent, three pillars - archives, standards, and analysis tools - have emerged as critical components in facilitating effective data sharing and collaboration. This paper compares four freely available intracranial neuroelectrophysiology data repositories: Data Archive for the BRAIN Initiative (DABI), Distributed Archives for Neurophysiology Data Integration (DANDI), OpenNeuro, and Brain-CODE. The aim of this review is to describe archives that provide researchers with tools to store, share, and reanalyze both human and non-human neurophysiology data based on criteria that are of interest to the neuroscientific community. The Brain Imaging Data Structure (BIDS) and Neurodata Without Borders (NWB) are utilized by these archives to make data more accessible to researchers by implementing a common standard. As the necessity for integrating large-scale analysis into data repository platforms continues to grow within the neuroscientific community, this article will highlight the various analytical and customizable tools developed within the chosen archives that may advance the field of neuroinformatics.

Submitted 30 August, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

Comments: 22 pages, 6 figures, 5 tables

arXiv:2306.13906 [pdf, other]

doi 10.1145/3587102.3588792

Can GPT-4 Support Analysis of Textual Data in Tasks Requiring Highly Specialized Domain Expertise?

Authors: Jaromir Savelka, Kevin D. Ashley, Morgan A Gray, Hannes Westermann, Huihui Xu

Abstract: We evaluated the capability of generative pre-trained transformers (GPT-4) in analysis of textual data in tasks that require highly specialized domain expertise. Specifically, we focused on the task of analyzing court opinions to interpret legal concepts. We found that GPT-4, prompted with annotation guidelines, performs on par with well-trained law student annotators. We observed that, with a relatively minor decrease in performance, GPT-4 can perform batch predictions leading to significant cost reductions. However, employing chain-of-thought prompting did not lead to noticeably improved performance on this task. Further, we demonstrated how to analyze GPT-4's predictions to identify and mitigate deficiencies in annotation guidelines, and subsequently improve the performance of the model. Finally, we observed that the model is quite brittle, as small formatting related changes in the prompt had a high impact on the predictions. These findings can be leveraged by researchers and practitioners who engage in semantic/pragmatic annotations of texts in the context of the tasks requiring highly specialized domain expertise.

Submitted 24 June, 2023; originally announced June 2023.

Journal ref: ITiCSE 2023: Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 1. June 2023. Pages 117 - 123

arXiv:2306.10452 [pdf, other]

MISMATCH: Fine-grained Evaluation of Machine-generated Text with Mismatch Error Types

Authors: Keerthiram Murugesan, Sarathkrishna Swaminathan, Soham Dan, Subhajit Chaudhury, Chulaka Gunasekara, Maxwell Crouse, Diwakar Mahajan, Ibrahim Abdelaziz, Achille Fokoue, Pavan Kapanipathi, Salim Roukos, Alexander Gray

Abstract: With the growing interest in large language models, the need for evaluating the quality of machine text compared to reference (typically human-generated) text has become a focus of attention. Most recent works focus either on task-specific evaluation metrics or study the properties of machine-generated text captured by the existing metrics. In this work, we propose a new evaluation scheme to model human judgments in 7 NLP tasks, based on the fine-grained mismatches between a pair of texts. Inspired by the recent efforts in several NLP tasks for fine-grained evaluation, we introduce a set of 13 mismatch error types such as spatial/geographic errors, entity errors, etc, to guide the model for better prediction of human judgments. We propose a neural framework for evaluating machine texts that uses these mismatch error types as auxiliary tasks and re-purposes the existing single-number evaluation metrics as additional scalar features, in addition to textual features extracted from the machine and reference texts. Our experiments reveal key insights about the existing metrics via the mismatch errors. We show that the mismatch errors between the sentence pairs on the held-out datasets from 7 NLP tasks align well with the human evaluation.

Submitted 17 June, 2023; originally announced June 2023.

Comments: Accepted at ACL 2023 (ACL Findings Long)

arXiv:2306.09525 [pdf, other]

Explaining Legal Concepts with Augmented Large Language Models (GPT-4)

Authors: Jaromir Savelka, Kevin D. Ashley, Morgan A. Gray, Hannes Westermann, Huihui Xu

Abstract: Interpreting the meaning of legal open-textured terms is a key task of legal professionals. An important source for this interpretation is how the term was applied in previous court cases. In this paper, we evaluate the performance of GPT-4 in generating factually accurate, clear and relevant explanations of terms in legislation. We compare the performance of a baseline setup, where GPT-4 is directly asked to explain a legal term, to an augmented approach, where a legal information retrieval module is used to provide relevant context to the model, in the form of sentences from case law. We found that the direct application of GPT-4 yields explanations that appear to be of very high quality on their surface. However, detailed analysis uncovered limitations in terms of the factual accuracy of the explanations. Further, we found that the augmentation leads to improved quality, and appears to eliminate the issue of hallucination, where models invent incorrect statements. These findings open the door to the building of systems that can autonomously retrieve relevant sentences from case law and condense them into a useful explanation for legal scholars, educators or practicing lawyers alike.

Submitted 22 June, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

arXiv:2306.09234 [pdf, other]

doi 10.3847/1538-4357/acfcb1

Updated observing scenarios and multi-messenger implications for the International Gravitational-wave Network's O4 and O5

Authors: R. Weizmann Kiendrebeogo, Amanda M. Farah, Emily M. Foley, Abigail Gray, Nina Kunert, Anna Puecher, Andrew Toivonen, R. Oliver VandenBerg, Shreya Anand, Tomás Ahumada, Viraj Karambelkar, Michael W. Coughlin, Tim Dietrich, S. Zacharie Kam, Peter T. H. Pang, Leo P. Singer, Niharika Sravan

Abstract: Advanced LIGO and Virgo's third observing run brought another binary neutron star merger (BNS) and the first neutron-star black hole mergers. While no confirmed kilonovae were identified in conjunction with any of these events, continued improvements of analyses surrounding GW170817 allow us to project constraints on the Hubble Constant ($H_0$), the Galactic enrichment from $r$-process nucleosynthesis, and ultra-dense matter possible from forthcoming events. Here, we describe the expected constraints based on the latest expected event rates from the international gravitational-wave network (IGWN) and analyses of GW170817. We show the expected detection rate of gravitational waves and their counterparts, as well as how sensitive potential constraints are to the observed numbers of counterparts. We intend this analysis as support for the community when creating scientifically driven electromagnetic follow-up proposals. During the next observing run O4, we predict an annual detection rate of electromagnetic counterparts from BNS of $0.43^{+0.58}_{-0.26}$ ($1.97^{+2.68}_{-1.2}$) for the Zwicky Transient Facility (Rubin Observatory).

Submitted 12 December, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

Journal ref: The Astrophysical Journal , 2023, 958, 158

arXiv:2305.20018 [pdf, other]

Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency

Authors: Maxwell Crouse, Ramon Astudillo, Tahira Naseem, Subhajit Chaudhury, Pavan Kapanipathi, Salim Roukos, Alexander Gray

Abstract: We introduce Logical Offline Cycle Consistency Optimization (LOCCO), a scalable, semi-supervised method for training a neural semantic parser. Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text that are then used as new supervision. To increase the quality of annotations, our method utilizes a count-based prior over valid formal meaning representations and a cycle-consistency score produced by a neural text generation model as additional signals. Both the prior and semantic parser are updated in an alternate fashion from full passes over the training data, which can be seen as approximating the marginalization of latent structures through stochastic variational inference. The use of a count-based prior, frozen text generation model, and offline annotation process yields an approach with negligible complexity and latency increases as compared to conventional self-learning. As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model. We demonstrate the utility of LOCCO on the well-known WebNLG benchmark where we obtain an improvement of 2 points against a self-learning parser under equivalent conditions, an improvement of 1.3 points against the previous state-of-the-art parser, and competitive text generation performance in terms of BLEU score.

Submitted 31 May, 2023; originally announced May 2023.

arXiv:2305.15022 [pdf, other]

Hierarchical clustering with dot products recovers hidden tree structure

Authors: Annie Gray, Alexander Modell, Patrick Rubin-Delanchy, Nick Whiteley

Abstract: In this paper we offer a new perspective on the well established agglomerative clustering algorithm, focusing on recovery of hierarchical structure. We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance. We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model. The key technical innovations are to understand how hierarchical information in this model translates into tree geometry which can be recovered from data, and to characterise the benefits of simultaneously growing sample size and data dimension. We demonstrate superior tree recovery performance with real data over existing approaches such as UPGMA, Ward's method, and HDBSCAN.

Submitted 1 March, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

arXiv:2304.06688 [pdf]

Direct experimental evidence of tunable charge transfer at the $LaNiO_{3}/CaMnO_{3}$ ferromagnetic interface

Authors: J. R. Paudel, M. Terilli, T. -C. Wu, J. D. Grassi, A. M. Derrico, R. K. Sah, M. Kareev, C. Klewe, P. Shafer, A. Gloskovskii, C. Schlueter, V. N. Strocov, J. Chakhalian, A. X. Gray

Abstract: Interfacial charge transfer in oxide heterostructures gives rise to a rich variety of electronic and magnetic phenomena. Designing heterostructures where one of the thin-film components exhibits a metal-insulator transition opens a promising avenue for controlling such phenomena both statically and dynamically. In this letter, we utilize a combination of depth-resolved soft X-ray standing-wave and hard X-ray photoelectron spectroscopies in conjunction with polarization-dependent X-ray absorption spectroscopy to investigate the effects of the metal-insulator transition in $LaNiO_{3}$ on the electronic and magnetic states at the $LaNiO_{3}/CaMnO_{3}$ interface. We report on a direct observation of the reduced effective valence state of the interfacial Mn cations in the metallic superlattice with an above-critical $LaNiO_{3}$ thickness (6 u.c.) due to the leakage of itinerant Ni 3d $e_{g}$ electrons into the interfacial $CaMnO_{3}$ layer. Conversely, in an insulating superlattice with a below-critical $LaNiO_{3}$ thickness of 2 u.c., a homogeneous effective valence state of Mn is observed throughout the $CaMnO_{3}$ layers due to the blockage of charge transfer across the interface. The ability to switch and tune interfacial charge transfer enables precise control of the emergent ferromagnetic state at the $LaNiO_{3}/CaMnO_{3}$ interface and, thus, has far-reaching consequences on the future strategies for the design of next-generation spintronic devices.

Submitted 13 April, 2023; originally announced April 2023.

arXiv:2302.13940 [pdf, other]

Activity Report on the Seventh African School of Fundamental Physics and Applications (ASP2022)

Authors: Kétévi A. Assamagan, Bobby Acharya, Kenneth Cecire, Christine Darve, Fernando Ferroni, Julia Ann Gray, Azwinndini Muronga

Abstract: The African School of Fundamental Physics and Applications, also known as the African School of Physics (ASP), was initiated in 2010, as a three-week biennial event, to offer additional training in fundamental and applied physics to African students with a minimum of three-year university education. Since its inception, ASP has grown to be much more than a school. ASP has become a series of activities and events with directed ethos towards physics as an engine for development in Africa. We report on the seventh African School of Physics, ASP2022, organized at Nelson Mandela University from November 28 to December 8, 2022. ASP2022 included programs for university students, high school teachers and high school pupils.

Submitted 27 February, 2023; originally announced February 2023.

Comments: 18 pages

arXiv:2302.03100 [pdf, other]

doi 10.1063/5.0141869

Observation of Coherently Coupled Cation Spin Dynamics in an Insulating Ferrimagnetic Oxide

Authors: C. Klewe, P. Shafer, J. E. Shoup, C. Kons, Y. Pogoryelov, R. Knut, B. A. Gray, H. -M. Jeon, B. M. Howe, O. Karis, Y. Suzuki, E. Arenholz, D. A. Arena, S. Emori

Abstract: Many technologically useful magnetic oxides are ferrimagnetic insulators, which consist of chemically distinct cations. Here, we examine the spin dynamics of different magnetic cations in ferrimagnetic NiZnAl-ferrite (Ni$_{0.65}$Zn$_{0.35}$Al$_{0.8}$Fe$_{1.2}$O$_4$) under continuous microwave excitation. Specifically, we employ time-resolved x-ray ferromagnetic resonance to separately probe Fe$^{2+/3+}$ and Ni$^{2+}$ cations on different sublattice sites. Our results show that the precessing cation moments retain a rigid, collinear configuration to within $\approx$2$^\circ$. Moreover, the effective spin relaxation is identical to within $<$10% for all magnetic cations in the ferrite. We thus validate the oft-assumed "ferromagnetic-like" dynamics in resonantly driven ferrimagnetic oxides, where the magnetic moments from different cations precess as a coherent, collective magnetization.

Submitted 6 February, 2023; originally announced February 2023.

Journal ref: Appl. Phys. Lett. 122, 132401 (2023)

arXiv:2301.10414 [pdf, other]

Towards a Unification of Logic and Information Theory

Authors: Luis A. Lastras, Barry Trager, Jonathan Lenchner, Wojtek Szpankowski, Chai Wah Wu, Mark Squillante, Alex Gray

Abstract: This article introduces a theory of communication that covers the following generic scenario: Alice knows more than Bob about a certain set of logic propositions and Alice and Bob wish to communicate as efficiently as possible with the shared goal that, following their communication, Bob should be able to deduce a particular logic proposition that Alice knows to be true. We assume that our logic system is propositional logic, and we build on top of one of the legendary works in this area, namely the work of Carnap and Bar-Hillel on a theory of semantic information. Our main contribution is a collection of theorems studying various different assumptions on what Alice and Bob know and what their goal is. These theorems all provide sharp upper and lower bounds phrased in terms of an entropy-like function that we call $Λ$, in reference to its apparent connection to problems of communication involving logic. It turns out that when the goal is to communicate only a portion of the knowledge that Alice possesses, the optimum communication cost is lower than most people seem to assume, yet unavoidably, such optimum communication strategies end up allowing Bob to prove even more things than originally intended. Another interesting outcome is that in some scenarios, Alice need not know the logic statements that Bob knows in order to attain asymptotically the same communication efficiency as if she knew the statement, in a nod to the famous Slepian-Wolf and Wyner-Ziv results from source coding theory. Our work also introduces practical codes, which are comprised of a combination of linear codes and enumerative source codes, which turn out to be asymptotically optimal for some scenarios.

Submitted 16 April, 2024; v1 submitted 25 January, 2023; originally announced January 2023.

arXiv:2301.05507 [pdf, ps, other]

Correlation-Based And-Operations Can Be Copulas: A Proof

Authors: Enrique Miralles-Dolz, Ander Gray, Edoardo Patelli, Scott Ferson, Vladik Kreinovich, Olga Kosheleva

Abstract: In many practical situations, we know the probabilities $a$ and $b$ of two events $A$ and $B$, and we want to estimate the joint probability ${\rm Prob}(A\,\&\,B)$. The algorithm that estimates the joint probability based on the known values $a$ and $b$ is called an and-operation. An important case when such a reconstruction is possible is when we know the correlation between $A$ and $B$; we call the resulting and-operation correlation-based. On the other hand, in statistics, there is a widely used class of and-operations known as copulas. Empirical evidence seems to indicate that the correlation-based and-operation derived in https://doi.org/10.1007/978-3-031-08971-8_64 is a copula, but until now, no proof of this statement was available. In this paper, we provide such a proof.

Submitted 13 January, 2023; originally announced January 2023.

arXiv:2301.05131 [pdf, other]

Toward Theoretical Guidance for Two Common Questions in Practical Cross-Validation based Hyperparameter Selection

Authors: Parikshit Ram, Alexander G. Gray, Horst C. Samulowitz, Gregory Bramble

Abstract: We show, to our knowledge, the first theoretical treatments of two common questions in cross-validation based hyperparameter selection: (1) After selecting the best hyperparameter using a held-out set, we train the final model using {\em all} of the training data -- since this may or may not improve future generalization error, should one do this? (2) During optimization such as via SGD (stochastic gradient descent), we must set the optimization tolerance $ρ$ -- since it trades off predictive accuracy with computation cost, how should one set it? Toward these problems, we introduce the {\em hold-in risk} (the error due to not using the whole training data), and the {\em model class mis-specification risk} (the error due to having chosen the wrong model class) in a theoretical view which is simple, general, and suggests heuristics that can be used when faced with a dataset instance. In proof-of-concept studies in synthetic data where theoretical quantities can be controlled, we show that these heuristics can, respectively, (1) always perform at least as well as always performing retraining or never performing retraining, (2) either improve performance or reduce computational overhead by $2\times$ with no loss in predictive performance.

Submitted 12 January, 2023; originally announced January 2023.

Comments: Extended version of the paper appearing at the SIAM International Conference on Data Mining 2023 (SDM23)

arXiv:2301.02798 [pdf]

Modulation-Doping a Correlated Electron Insulator

Authors: Debasish Mondal, Smruti Rekha Mahapatra, Abigail M Derrico, Rajeev Kumar Rai, Jay R Paudel, Christoph Schlueter, Andrei Gloskovskii, Rajdeep Banerjee, Frank M F DeGroot, Dipankar D Sarma, Awadhesh Narayan, Pavan Nukala, Alexander X Gray, Naga Phani B Aetukuri

Abstract: Correlated electron materials (CEMs) host a rich variety of condensed matter phases. Vanadium dioxide (VO2) is a prototypical CEM with a temperature-dependent metal-to-insulator (MIT) transition with a concomitant crystal symmetry change. External control of MIT in VO2 - especially without inducing structural changes - has been a long-standing challenge. In this work, we design and synthesize modulation-doped VO2-based thin film heterostructures that closely emulate a textbook example of filling control in a correlated electron insulator. Using a combination of charge transport, hard x-ray photoelectron spectroscopy, and structural characterization, we show that the insulating state can be doped to achieve carrier densities greater than 5x10^21 cm^(-3) without inducing any measurable structural changes. We find that the MIT temperature (T_MIT) continuously decreases with increasing carrier concentration. Remarkably, the insulating state is robust even at doping concentrations as high as ~0.2 e-/vanadium. Finally, our work reveals modulation-doping as a viable method for electronic control of phase transitions in correlated electron oxides with the potential for use in future devices based on electric-field controlled phase transitions.

Submitted 7 January, 2023; originally announced January 2023.

Comments: Main paper 21 pages, 5 Figures. Supporting Information, 18 Pages,15 SI Figures and 1 SI Section

arXiv:2301.01669 [pdf]

Ultra-thin Epitaxial MgB2 on SiC: Substrate Surface Polarity Dependent Properties

Authors: Weibing Yang, Leila Kasaei, Hussein Hijazi, Sylvie Rangan, Yao-wen Yeh, Raj K. Sah, Jay R. Paudel, Ke Chen, Alexander X. Gray, Philip Batson, Leonard C. Feldman, Xiaoxing Xi

Abstract: High quality, ultrathin, superconducting films are required for advanced devices such as hot-electron bolometers, superconducting nanowire single photon detectors, and quantum applications. Using Hybrid Physical-Chemical Vapor Deposition (HPCVD), we show that MgB2 films as thin as 4 nm can be fabricated on the carbon terminated 6H-SiC (0001) surface with a superconducting transition temperature above 33K and a rms roughness of 0.7 nm. Remarkably, the film quality is a function of the SiC surface termination, with the C-terminated surface preferred to the Si-terminated surface. To understand the MgB2 thin film/ SiC substrate interactions giving rise to this difference, we characterized the interfacial structures using Rutherford backscattering spectroscopy/channeling, electron energy loss spectroscopy, and x-ray photoemission spectroscopy. The MgB2/SiC interface structure is complex and different for the two terminations. Both terminations incorporate substantial unintentional oxide layers influencing MgB2 growth and morphology, but with different extent, intermixing and interface chemistry. In this paper, we report measurements of transport, resistivity, and critical superconducting temperature of MgB2/SiC that are different for the two terminations, and link interfacial structure variations to observed differences. The result shows that the C face of SiC is a preferred substrate for the deposition of ultrathin superconducting MgB2 films.

Submitted 4 January, 2023; originally announced January 2023.

arXiv:2212.02593 [pdf, ps, other]

Characterizing the Nonequilibrium Response of FeRh Thin Films using Time-Domain Thermoreflectance (TDTR)

Authors: Renee M. Harton, Alejandro Ceballos, Vivek Unikandanunni, Alexander Gray, Stefano Bonetti, Peter Krüger, Frances Hellman

Abstract: Time-Domain Thermoreflectance (TDTR) characterization of FeRh throughout its first-order antiferromagnetic (AF) to ferromagnetic (FM) transition shows that the transient reflectance, $\Delta R(t)/R$, strongly depends on the magnetic order of the sample. Using TDTR, which uses optical pulses to induce small temperature excursions, we have found that the $\Delta R(t)/R$ of the AF phase exhibits a large negative response, while the response of the FM phase is positive. This magnetic phase sensitivity has allowed us to study the transient response of both the AF and FM phase to the pump pulse excitation and the mixed phase of the material. These results are significant since the ultrafast properties of antiferromagnetic materials and mixed antiferromagnetic and ferromagnetic materials are difficult to detect using other conventional techniques. We have found that the AF phase exhibits a strong subpicosecond signal not observed in the FM phase. The magnetic phase dependence of the sign of $\Delta R(t)/R$ is qualitatively explained using the results of ab-initio density functional theory (DFT) calculations. Using the two-temperature model, we found that the change in the thermalization time across the transition is caused by differences in both the electronic heat capacity and the electron-phonon coupling factor of the AF and FM phases. The electron-phonon coupling constant in the AF phase is also determined using the two-temperature model as implemented in the NTMpy code package. For the FM phase, we provide bounds on the magnitude of the electron-phonon coupling factor. These results indicate that TDTR can be used to study the transient properties of magnetic materials that are otherwise challenging to probe.

Submitted 12 March, 2024; v1 submitted 5 December, 2022; originally announced December 2022.

Comments: 14 pages, 6 figures

arXiv:2211.04740 [pdf, other]

Performance of the CMS High Granularity Calorimeter prototype to charged pion beams of 20$-$300 GeV/c

Authors: B. Acar, G. Adamov, C. Adloff, S. Afanasiev, N. Akchurin, B. Akgün, M. Alhusseini, J. Alison, J. P. Figueiredo de sa Sousa de Almeida, P. G. Dias de Almeida, A. Alpana, M. Alyari, I. Andreev, U. Aras, P. Aspell, I. O. Atakisi, O. Bach, A. Baden, G. Bakas, A. Bakshi, S. Banerjee, P. DeBarbaro, P. Bargassa, D. Barney, F. Beaudette , et al. (435 additional authors not shown)

Abstract: The upgrade of the CMS experiment for the high luminosity operation of the LHC comprises the replacement of the current endcap calorimeter by a high granularity sampling calorimeter (HGCAL). The electromagnetic section of the HGCAL is based on silicon sensors interspersed between lead and copper (or copper tungsten) absorbers. The hadronic section uses layers of stainless steel as an absorbing medium and silicon sensors as an active medium in the regions of high radiation exposure, and scintillator tiles read out directly by silicon photomultipliers in the remaining regions. As part of the development of the detector and its readout electronic components, a section of a silicon-based HGCAL prototype detector along with a section of the CALICE AHCAL prototype was exposed to muons, electrons and charged pions in beam test experiments at the H2 beamline at the CERN SPS in October 2018. The AHCAL uses the same technology as foreseen for the HGCAL but with much finer longitudinal segmentation. The performance of the calorimeters in terms of energy response and resolution, longitudinal and transverse shower profiles is studied using negatively charged pions, and is compared to GEANT4 predictions. This is the first report summarizing results of hadronic showers measured by the HGCAL prototype using beam test data.

Submitted 27 May, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

Comments: Accepted for publication by JINST

arXiv:2210.12207 [pdf, other]

doi 10.3847/1538-4357/ac9cd9

A New Distance to the Supernova Remnant DA 530 Based on HI Absorption of Polarized Emission

Authors: Rebecca A. Booth, Roland Kothes, Tom Landecker, Jo-Anne Brown, Andrew Gray, Tyler Foster, Eric Greisen

Abstract: Supernova remnants (SNRs) are significant contributors of matter and energy to the interstellar medium. Understanding the impact and the mechanism of this contribution requires knowledge of the physical size, energy, and expansion rate of individual SNRs, which can only come if reliable distances can be obtained. We aim to determine the distance to the SNR DA 530 (G93.3+6.9), an object of low surface brightness. To achieve this, we used the Dominion Radio Astrophysical Observatory Synthesis Telescope and the National Radio Astronomy Observatory Very Large Array to observe the absorption by intervening HI of the polarized emission from DA 530. Significant absorption was detected at velocities $-28$ and $-67$ km/s (relative to the local standard of rest), corresponding to distances of 4.4 and 8.3 kpc, respectively. Based on the radio and X-ray characteristics of DA 530, we conclude that the minimum distance is 4.4$^{+0.4}_{-0.2}$ kpc. At this minimum distance, the diameter of the SNR is 34$^{+4}_{-1}$ pc, and the elevation above the Galactic plane is 537$^{+40}_{-32}$ pc. The $-67$ km/s absorption likely occurs in gas whose velocity is not determined by Galactic rotation. We present a new data processing method for combining Stokes $Q$ and $U$ observations of the emission from an SNR into a single HI absorption spectrum, which avoids the difficulties of the noise-bias subtraction required for the calculation of polarized intensity. The polarized absorption technique can be applied to determine distances to many more SNRs.

Submitted 21 October, 2022; originally announced October 2022.

arXiv:2208.14531 [pdf, other]

Hollow Rectangular Waveguide-fed Holographic Beamforming Antenna Additively Manufactured (3D Printed) with Conductive Polymer

Authors: Insang Yoo, Jonah Gollub, Shengrong Ye, Allen Gray, Okan Yurduseven, Manohar D. Deshpande, David R. Smith

Abstract: We present the design and fabrication of 3D printed holographic beamforming antennas. The antennas utilize additively manufactured hollow rectangular waveguides that feed radiating rectilinear slots inserted into the upper conducting wall. The lengths of the individual slots are altered to implement a holographic beamforming solution designed using a coupled dipole formalism. For rapid verification, the designed antennas are fabricated using a desktop dual-extrusion fused filament 3D printer. The body of each antenna and its inner conducting surface are respectively printed using polylactic acid and biodegradable conductive polyester composite material (i.e., Electrifi), which is later deposited with a layer of copper on its surface to improve surface conductivity and reduce surface roughness. The beamforming performance of the fabricated antennas is confirmed via experiments. The 3D printed metasurface antennas using the proposed fabrication technique illustrate emerging capabilities in the rapid prototyping of complex electromagnetic structures.

Submitted 30 August, 2022; originally announced August 2022.

arXiv:2208.11665 [pdf, other]

Statistical exploration of the Manifold Hypothesis

Authors: Nick Whiteley, Annie Gray, Patrick Rubin-Delanchy

Abstract: The Manifold Hypothesis is a widely accepted tenet of Machine Learning which asserts that nominally high-dimensional data are in fact concentrated near a low-dimensional manifold, embedded in high-dimensional space. This phenomenon is observed empirically in many real world situations, has led to development of a wide range of statistical methods in the last few decades, and has been suggested as a key factor in the success of modern AI technologies. We show that rich and sometimes intricate manifold structure in data can emerge from a generic and remarkably simple statistical model -- the Latent Metric Model -- via elementary concepts such as latent variables, correlation and stationarity. This establishes a general statistical explanation for why the Manifold Hypothesis seems to hold in so many situations. Informed by the Latent Metric Model we derive procedures to discover and interpret the geometry of high-dimensional data, and explore hypotheses about the data generating mechanism. These procedures operate under minimal assumptions and make use of well known, scalable graph-analytic algorithms.

Submitted 9 February, 2024; v1 submitted 24 August, 2022; originally announced August 2022.

MSC Class: 62R20; 62R40; 62G05; 62G20; 62R07; 62-08; 62H25; 62H30

arXiv:2208.01133 [pdf, ps, other]

A note on split extension classifiers of perfect objects

Authors: James R. A. Gray

Abstract: We show that for a pointed protomodular category $\mathbb{C}$ satisfying a certain condition on those Huq commutators which exist, if $X$ is a perfect object in $\mathbb{C}$ such that the split extension classifier $[X]$ exists, then the centralizer of the \emph{conjugation} morphism $c_X : X\to [X]$ is trivial and hence $[X]$ has trivial center.

Submitted 1 August, 2022; originally announced August 2022.

arXiv:2207.10583 [pdf, other]

doi 10.1007/978-3-031-08971-8_64

Correlated Boolean Operators for Uncertainty Logic

Authors: Enrique Miralles-Dolz, Ander Gray, Edoardo Patelli, Scott Ferson

Abstract: We present a correlated \textit{and} gate which may be used to propagate uncertainty and dependence through Boolean functions, since any Boolean function may be expressed as a combination of \textit{and} and \textit{not} operations. We argue that the \textit{and} gate is a bivariate copula family, which has the interpretation of constructing bivariate Bernoulli random variables following a given Pearson correlation coefficient and marginal probabilities. We show how this copula family may be used to propagate uncertainty in the form of probabilities of events, probability intervals, and probability boxes, with only partial or no knowledge of the dependency between events, expressed as an interval for the correlation coefficient. These results generalise previous results by Fréchet on the conjunction of two events with unknown dependencies. We show an application propagating uncertainty through a fault tree for a pressure tank. This paper comes with an open-source Julia library for performing uncertainty logic.

Submitted 21 July, 2022; originally announced July 2022.

Journal ref: Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2022. Communications in Computer and Information Science, vol 1601, pages 798--811
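
The correlated \textit{and} gate described in the abstract above can be illustrated with a short sketch. It is not the paper's Julia library: it only uses the textbook identity for two Bernoulli events with marginals a and b and Pearson correlation r, P(A and B) = ab + r·sqrt(a(1-a)b(1-b)), together with the Fréchet bounds. The function names are hypothetical, and clamping an infeasible r to the Fréchet bounds is an assumption made here for illustration.

```python
import math


def frechet_bounds(a, b):
    """Fréchet bounds on P(A and B) when the dependence is unknown."""
    return max(0.0, a + b - 1.0), min(a, b)


def correlated_and(a, b, r):
    """P(A and B) for marginals a, b and Pearson correlation r.

    Assumption of this sketch: if r is not attainable for the given
    marginals, the result is clamped to the Fréchet bounds.
    """
    spread = math.sqrt(a * (1.0 - a) * b * (1.0 - b))
    p = a * b + r * spread
    lo, hi = frechet_bounds(a, b)
    return min(max(p, lo), hi)


def and_interval(a, b, r_lo, r_hi):
    """Interval for P(A and B) when r is only known to lie in [r_lo, r_hi].

    The conjunction probability is non-decreasing in r, so evaluating
    the two endpoints is sufficient.
    """
    return correlated_and(a, b, r_lo), correlated_and(a, b, r_hi)


if __name__ == "__main__":
    print(correlated_and(0.3, 0.6, 0.0))       # independence: a * b
    print(and_interval(0.3, 0.6, -1.0, 1.0))   # no dependence knowledge
```

With r = 0 the gate reduces to the product rule for independent events, and with r ranging over [-1, 1] the interval coincides with the Fréchet bounds, which matches the abstract's remark that the construction generalises Fréchet's result.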

arXiv:2207.06149 [pdf, ps, other]

A note on the relationship between action accessible and weakly action representable categories

Authors: James Richard Andrew Gray

Abstract: The main purpose of this paper is to show that the converse of the known implication weakly action representable implies action accessible is false. In particular we show that both action accessibility, as well as the (at least formally stronger) condition requiring the existence of all normalizers do not imply weakly-action-representability even for varieties. In addition we show that in contrast to both action accessibility and the condition requiring the existence of all normalizers, weakly-action representability is not necessarily inherited by Birkhoff subcategories.

Submitted 13 July, 2022; originally announced July 2022.

arXiv:2206.12855 [pdf, ps, other]

A note on the Huq-commutativity of normal monomorphisms

Authors: James Richard Andrew Gray, Tamar Janelidze-Gray

Abstract: We give an alternative criterion for when a pair of Bourn-normal monomorphisms Huq-commute in a unital category. We use this to prove that in a unital category, in which a morphism is a monomorphism if and only if its kernel is a zero morphism, a pair of Bourn-normal monomorphisms with the same codomain Huq-commute as soon as they have trivial pullback. As corollaries we show that several facts known only in the protomodular context are in fact true in more general contexts.

Submitted 26 June, 2022; originally announced June 2022.

arXiv:2204.01805 [pdf]

Using Elo Rating as a Metric for Comparative Judgement in Educational Assessment

Authors: Andy Gray, Alma Rahat, Tom Crick, Stephen Lindsay, Darren Wallace

Abstract: Marking and feedback are essential features of teaching and learning, across the overwhelming majority of educational settings and contexts. However, it can take a great deal of time and effort for teachers to mark assessments, and to provide useful feedback to the students. Furthermore, it also creates a significant cognitive load on the assessors, especially in ensuring fairness and equity. Therefore, an alternative approach to marking called comparative judgement (CJ), inspired by the law of comparative judgment (LCJ), has been proposed in the educational space. This pairwise comparison for as many pairs as possible can then be used to rank all submissions. Studies suggest that CJ is highly reliable and accurate while making it quick for the teachers. Alternative studies have questioned this claim, suggesting that the process can increase bias in the results as the same submission is shown many times to an assessor for increasing reliability. Additionally, studies have also found that CJ can result in the overall marking process taking longer than a more traditional method of marking as information about many pairs must be collected. In this paper, we investigate Elo, which has been extensively used in rating players in zero-sum games such as chess. We experimented on a large-scale Twitter dataset on the topic of a recent major UK political event ("Brexit", the UK's political exit from the European Union) to ask users which tweet they found funnier between a pair selected from ten tweets. Our analysis of the data reveals that the Elo rating is statistically significantly similar to the CJ ranking with a Kendall's tau score of 0.96 and a p-value of $1.5 \times 10^{-5}$. We finish with an informed discussion regarding the potential wider application of this approach to a range of educational contexts.

Submitted 4 April, 2022; originally announced April 2022.

Comments: 12 pages, 4 figures, one table, pre-review version
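
As a rough illustration of how Elo ratings can turn pairwise judgements into a ranking, as the abstract above describes, here is a minimal sketch of the standard Elo update. The K-factor of 32, the starting rating of 1500, and the function names are assumptions chosen for illustration; the abstract does not state the settings the authors used.

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score of A against B (logistic, base 10, scale 400)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def elo_rank(items, judgements, k=32.0, start=1500.0):
    """Rank items from (winner, loser) pairwise judgements.

    k and start are illustrative defaults, not values from the paper.
    Returns the items sorted from highest to lowest rating, plus the ratings.
    """
    ratings = {item: start for item in items}
    for winner, loser in judgements:
        e_win = expected_score(ratings[winner], ratings[loser])
        delta = k * (1.0 - e_win)   # the winner scored 1, expected e_win
        ratings[winner] += delta
        ratings[loser] -= delta     # zero-sum: total rating is conserved
    ranking = sorted(items, key=lambda item: ratings[item], reverse=True)
    return ranking, ratings


if __name__ == "__main__":
    tweets = ["a", "b", "c"]
    votes = [("a", "b"), ("b", "c"), ("a", "c")]
    ranking, ratings = elo_rank(tweets, votes)
    print(ranking)
    print(ratings)
```

Each judgement updates only the two items compared, so a ranking is available after every comparison rather than only once many pairs have been collected.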

arXiv:2201.05793 [pdf, other]

A Benchmark for Generalizable and Interpretable Temporal Question Answering over Knowledge Bases

Authors: Sumit Neelam, Udit Sharma, Hima Karanam, Shajith Ikbal, Pavan Kapanipathi, Ibrahim Abdelaziz, Nandana Mihindukulasooriya, Young-Suk Lee, Santosh Srivastava, Cezar Pendus, Saswati Dana, Dinesh Garg, Achille Fokoue, G P Shrivatsa Bhargav, Dinesh Khandelwal, Srinivas Ravishankar, Sairam Gurajada, Maria Chang, Rosario Uceda-Sosa, Salim Roukos, Alexander Gray, Guilherme Lima, Ryan Riegel, Francois Luus, L Venkata Subramaniam

Abstract: Knowledge Base Question Answering (KBQA) tasks that involve complex reasoning are emerging as an important research direction. However, most existing KBQA datasets focus primarily on generic multi-hop reasoning over explicit facts, largely ignoring other reasoning types such as temporal, spatial, and taxonomic reasoning. In this paper, we present a benchmark dataset for temporal reasoning, TempQA-WD, to encourage research in extending the present approaches to target a more challenging set of complex reasoning tasks. Specifically, our benchmark is a temporal question answering dataset with the following advantages: (a) it is based on Wikidata, which is the most frequently curated, openly available knowledge base, (b) it includes intermediate SPARQL queries to facilitate the evaluation of semantic parsing based approaches for KBQA, and (c) it generalizes to multiple knowledge bases: Freebase and Wikidata. The TempQA-WD dataset is available at https://github.com/IBM/tempqa-wd.

Submitted 15 January, 2022; originally announced January 2022.

Comments: 7 pages, 2 figures, 7 tables. arXiv admin note: substantial text overlap with arXiv:2109.13430

arXiv:2112.08829 [pdf, ps, other]

doi 10.1016/j.jpaa.2022.107293

Algebraic logoi

Authors: D. Bourn, A. S. Cigoli, J. R. A. Gray, T. Van der Linden

Abstract: We introduce normal cores, as well as the more general action cores, in the context of a semi-abelian category, and further generalise those to split extension cores in the context of a homological category. We prove that, if the category is moreover well-powered with (small) joins, then the existence of split extension cores is equivalent to the condition that the change-of-base functors in the fibration of points are geometric. We call a finitely complete category that satisfies this condition an algebraic logos. We give examples of such categories, compare them with algebraically coherent ones, and study equivalent conditions as well as stability under common categorical operations.

Submitted 23 September, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

Comments: Revision with changes throughout the text; new final section; 22 pages

MSC Class: 18E13; 18B25; 20F12; 17A99; 08C05

Journal ref: J. Pure Appl. Algebra 227 (2023), 107293

arXiv:2112.07051 [pdf]

doi 10.1093/database/baac035

A Simple Standard for Sharing Ontological Mappings (SSSOM)

Authors: Nicolas Matentzoglu, James P. Balhoff, Susan M. Bello, Chris Bizon, Matthew Brush, Tiffany J. Callahan, Christopher G Chute, William D. Duncan, Chris T. Evelo, Davera Gabriel, John Graybeal, Alasdair Gray, Benjamin M. Gyori, Melissa Haendel, Henriette Harmse, Nomi L. Harris, Ian Harrow, Harshad Hegde, Amelia L. Hoyt, Charles T. Hoyt, Dazhi Jiao, Ernesto Jiménez-Ruiz, Simon Jupp, Hyeongsik Kim, Sebastian Koehler , et al. (19 additional authors not shown)

Abstract: Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Are they associated in some other way? Such relationships between the mapped terms are often not documented, leading to incorrect assumptions and making them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Also, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones. The Simple Standard for Sharing Ontological Mappings (SSSOM) addresses these problems by: 1. Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. 2. Defining an easy to use table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data standards. 3. Implementing open and community-driven collaborative workflows designed to evolve the standard continuously to address changing requirements and mapping practices. 4. Providing reference tools and software libraries for working with the standard. In this paper, we present the SSSOM standard, describe several use cases, and survey some existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable, and Reusable (FAIR). The SSSOM specification is at http://w3id.org/sssom/spec.

Submitted 13 December, 2021; originally announced December 2021.

Comments: Corresponding author: Christopher J. Mungall <[email protected]>

arXiv:2112.03324 [pdf, other]

Neuro-Symbolic Inductive Logic Programming with Logical Neural Networks

Authors: Prithviraj Sen, Breno W. S. R. de Carvalho, Ryan Riegel, Alexander Gray

Abstract: Recent work on neuro-symbolic inductive logic programming has led to promising approaches that can learn explanatory rules from noisy, real-world data. While some proposals approximate logical operators with differentiable operators from fuzzy or real-valued logic that are parameter-free thus diminishing their capacity to fit the data, other approaches are only loosely based on logic making it difficult to interpret the learned "rules". In this paper, we propose learning rules with the recently proposed logical neural networks (LNN). Compared to others, LNNs offer strong connection to classical Boolean logic thus allowing for precise interpretation of learned rules while harboring parameters that can be trained with gradient-based optimization to effectively fit the data. We extend LNNs to induce rules in first-order logic. Our experiments on standard benchmarking tasks confirm that LNN rules are highly interpretable and can achieve comparable or higher accuracy due to their flexible parameterization.

Submitted 6 December, 2021; originally announced December 2021.

arXiv:2111.06855 [pdf, other]

doi 10.1088/1748-0221/17/05/P05022

Response of a CMS HGCAL silicon-pad electromagnetic calorimeter prototype to 20-300 GeV positrons

Authors: B. Acar, G. Adamov, C. Adloff, S. Afanasiev, N. Akchurin, B. Akgün, F. Alam Khan, M. Alhusseini, J. Alison, A. Alpana, G. Altopp, M. Alyari, S. An, S. Anagul, I. Andreev, P. Aspell, I. O. Atakisi, O. Bach, A. Baden, G. Bakas, A. Bakshi, S. Bannerjee, P. Bargassa, D. Barney, F. Beaudette , et al. (364 additional authors not shown)

Abstract: The Compact Muon Solenoid Collaboration is designing a new high-granularity endcap calorimeter, HGCAL, to be installed later this decade. As part of this development work, a prototype system was built, with an electromagnetic section consisting of 14 double-sided structures, providing 28 sampling layers. Each sampling layer has a hexagonal module, where a multipad large-area silicon sensor is glued between an electronics circuit board and a metal baseplate. The sensor pads of approximately 1 cm$^2$ are wire-bonded to the circuit board and are read out by custom integrated circuits. The prototype was extensively tested with beams at CERN's Super Proton Synchrotron in 2018. Based on the data collected with beams of positrons, with energies ranging from 20 to 300 GeV, measurements of the energy resolution and linearity, the position and angular resolutions, and the shower shapes are presented and compared to a detailed Geant4 simulation.

Submitted 31 March, 2022; v1 submitted 12 November, 2021; originally announced November 2021.

arXiv:2111.03570 [pdf, ps, other]

Why the 1-Wasserstein distance is the area between the two marginal CDFs

Authors: Marco De Angelis, Ander Gray

Abstract: We elucidate why the 1-Wasserstein distance $W_1$ coincides with the area between the two marginal cumulative distribution functions (CDFs). We first describe the Wasserstein distance in terms of copulas, and then show that $W_1$ with the Euclidean distance is attained with the $M$ copula. Two random variables whose dependence is given by the $M$ copula manifest perfect (positive) dependence. If we express the random variables in terms of their CDFs, it is intuitive to see that the distance between two such random variables coincides with the area between the two CDFs.

Submitted 5 November, 2021; originally announced November 2021.

Comments: 6 pages, 1 figure, a pedagogical note
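
The identity stated in the abstract above can be checked numerically on small samples. The sketch below is an illustration only, not code from the note: it computes the area between two empirical CDFs and compares it with the mean absolute difference of the sorted samples, which is the cost of the perfectly positively dependent (comonotone, $M$-copula) pairing. The function names are hypothetical, and the second function assumes equal sample sizes.

```python
from bisect import bisect_right


def w1_cdf_area(x, y):
    """Area between the empirical CDFs of two samples."""
    xs, ys = sorted(x), sorted(y)
    grid = sorted(xs + ys)
    area = 0.0
    # Both empirical CDFs are constant on each interval [left, right).
    for left, right in zip(grid, grid[1:]):
        f_x = bisect_right(xs, left) / len(xs)
        f_y = bisect_right(ys, left) / len(ys)
        area += abs(f_x - f_y) * (right - left)
    return area


def w1_comonotone(x, y):
    """Mean |x_(i) - y_(i)| over sorted samples (comonotone coupling).

    Assumes equal sample sizes so that order statistics pair one-to-one.
    """
    xs, ys = sorted(x), sorted(y)
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    x = [0.0, 1.0, 3.0]
    y = [5.0, 6.0, 8.0]           # x shifted by 5
    print(w1_cdf_area(x, y))      # area between the two step CDFs
    print(w1_comonotone(x, y))    # cost of pairing sorted values
```

For the shifted sample both quantities equal the shift, which is the simplest case in which the area between the CDFs and the transport cost can be seen to agree.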

arXiv:2110.14458 [pdf]

doi 10.1116/6.0001584

Emergent phenomena at oxide interfaces studied with standing-wave photoelectron spectroscopy

Authors: C. -T. Kuo, G. Conti, J. E. Rault, C. M. Schneider, S. Nemšák, A. X. Gray

Abstract: Emergent phenomena at complex-oxide interfaces have become a vibrant field of study in the past two decades due to the rich physics and a wide range of possibilities for creating new states of matter and novel functionalities for potential devices. Electronic-structural characterization of such phenomena presents a unique challenge due to the lack of direct yet non-destructive techniques for probing buried layers and interfaces with the required Angstrom-level resolution, as well as element and orbital specificity. In this review article, we survey several recent studies wherein soft x-ray standing-wave photoelectron spectroscopy, a relatively newly developed technique, is used to investigate buried oxide interfaces exhibiting emergent phenomena such as metal-insulator transition, interfacial ferromagnetism, and two-dimensional electron gas. Advantages, challenges, and future applications of this methodology are also discussed.

Submitted 27 October, 2021; originally announced October 2021.

arXiv:2110.10973 [pdf, other]

LOA: Logical Optimal Actions for Text-based Interaction Games

Authors: Daiki Kimura, Subhajit Chaudhury, Masaki Ono, Michiaki Tatsubori, Don Joven Agravante, Asim Munawar, Akifumi Wachi, Ryosuke Kohita, Alexander Gray

Abstract: We present Logical Optimal Actions (LOA), an action decision architecture of reinforcement learning applications with a neuro-symbolic framework which is a combination of neural network and symbolic knowledge acquisition approach for natural language interaction games. The demonstration for LOA experiments consists of a web-based interactive platform for text-based games and visualization for acquired knowledge for improving interpretability for trained rules. This demonstration also provides a comparison module with other neuro-symbolic approaches as well as non-symbolic state-of-the-art agent models on the same text-based games. Our LOA also provides open-sourced implementation in Python for the reinforcement learning environment to facilitate an experiment for studying neuro-symbolic agents. Code: https://github.com/ibm/loa

Submitted 21 October, 2021; originally announced October 2021.

Comments: ACL-IJCNLP 2021 (demo paper)

Showing 1–50 of 213 results for author: Gray, A