Verifying Peephole Rewriting In SSA Compiler IRs

Authors: Siddharth Bhat, Alex Keizer, Chris Hughes, Andrés Goens, Tobias Grosser

Abstract: There is an increasing need for domain-specific reasoning in modern compilers. This has fueled the use of tailored intermediate representations (IRs) based on static single assignment (SSA), like in the MLIR compiler framework. Interactive theorem provers (ITPs) provide strong guarantees for the end-to-end verification of compilers (e.g., CompCert). However, modern compilers and their IRs evolve at a rate that makes proof engineering alongside them prohibitively expensive. Nevertheless, well-scoped push-button automated verification tools such as the Alive peephole verifier for LLVM-IR gained recognition in domains where SMT solvers offer efficient (semi) decision procedures. In this paper, we aim to combine the convenience of automation with the versatility of ITPs for verifying peephole rewrites across domain-specific IRs. We formalize a core calculus for SSA-based IRs that is generic over the IR and covers so-called regions (nested scoping used by many domain-specific IRs in the MLIR ecosystem). Our mechanization in the Lean proof assistant provides a user-friendly frontend for translating MLIR syntax into our calculus. We provide scaffolding for defining and verifying peephole rewrites, offering tactics to eliminate the abstraction overhead of our SSA calculus. We prove correctness theorems about peephole rewriting, as well as two classical program transformations. To evaluate our framework, we consider three use cases from the MLIR ecosystem that cover different levels of abstractions: (1) bitvector rewrites from LLVM, (2) structured control flow, and (3) fully homomorphic encryption. We envision that our mechanization provides a foundation for formally verified rewrites on new domain-specific IRs.

Submitted 4 July, 2024; originally announced July 2024.

Comments: accepted at ITP 2024

arXiv:2406.08121 [pdf, ps, other]

Moments of derivatives of the Riemann zeta function: Characteristic polynomials and the hybrid formula

Authors: Christopher Hughes, Andrew Pearce-Crump

Abstract: We conjecture results about the moments of mixed derivatives of the Riemann zeta function, evaluated at the non-trivial zeros of the Riemann zeta function. We do this in two different ways, both giving us the same conjecture. In the first, we find asymptotics for the moments of derivatives of the characteristic polynomials of matrices in the Circular Unitary Ensemble. In the second, we consider the hybrid model approach first proposed by Gonek, Hughes and Keating.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2405.18595 [pdf]

doi 10.1126/science.adj0625

Isotopic evidence of long-lived volcanism on Io

Authors: Katherine de Kleer, Ery C. Hughes, Francis Nimmo, John Eiler, Amy E. Hofmann, Statia Luszcz-Cook, Kathy Mandt

Abstract: Jupiter's moon Io hosts extensive volcanism driven by tidal heating. The isotopic composition of Io's inventory of volatile elements, including sulfur and chlorine, reflects its outgassing and mass loss history and provides an avenue for exploring its evolution. We used millimeter observations of Io's atmosphere to measure sulfur isotopes in gaseous SO2 and SO, and chlorine isotopes in gaseous NaCl and KCl. We find $^{34}$S/$^{32}$S=0.0595$\pm$0.0038 ($δ^{34}$S=+347$\pm$86 per mille), which is highly enriched compared to average Solar System values and indicates that Io has lost 94 to 99% of its available sulfur. Our measurement of $^{37}$Cl/$^{35}$Cl=0.403$\pm$0.028 ($δ^{37}$Cl=+263$\pm$88 per mille) shows chlorine is similarly enriched. These measurements indicate that Io has been volcanically active for most or all of its history, with potentially higher outgassing and mass-loss rates at earlier times.

Submitted 28 May, 2024; originally announced May 2024.

Comments: This is the author's version of the work. It is posted here by permission of the AAAS for personal use, not for redistribution. The definitive version was published in Science on May 10, 2024, DOI: 10.1126/science.adj0625

Journal ref: Science, Volume 384, Issue 6696, pp. 682-687 (2024)

arXiv:2405.15583 [pdf, other]

Transfer Learning with Informative Priors: Simple Baselines Better than Previously Reported

Authors: Ethan Harvey, Mikhail Petrov, Michael C. Hughes

Abstract: We pursue transfer learning to improve classifier accuracy on a target task with few labeled examples available for training. Recent work suggests that using a source task to learn a prior distribution over neural net weights, not just an initialization, can boost target task performance. In this study, we carefully compare transfer learning with and without source task informed priors across 5 datasets. We find that standard transfer learning informed by an initialization only performs far better than reported in previous comparisons. The relative gains of methods using informative priors over standard transfer learning vary in magnitude across datasets. For the scenario of 5-300 examples per class, we find negative or negligible gains on 2 datasets, modest gains (between 1.5-3 points of accuracy) on 2 other datasets, and substantial gains (>8 points) on one dataset. Among methods using informative priors, we find that an isotropic covariance appears competitive with learned low-rank covariance matrix while being substantially simpler to understand and tune. Further analysis suggests that the mechanistic justification for informed priors -- hypothesized improved alignment between train and test loss landscapes -- is not consistently supported due to high variability in empirical landscapes. We release code to allow independent reproduction of all experiments.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.13505 [pdf, other]

Euclid: ERO -- NISP-only sources and the search for luminous $z=6-8$ galaxies

Authors: J. R. Weaver, S. Taamoli, C. J. R. McPartland, L. Zalesky, N. Allen, S. Toft, D. B. Sanders, H. Atek, R. A. A. Bowler, D. Stern, C. J. Conselice, B. Mobasher, I. Szapudi, P. R. M. Eisenhardt, G. Murphree, I. Valdes, K. Ito, S. Belladitta, P. A. Oesch, S. Serjeant, D. J. Mortlock, N. A. Hatch, M. Kluge, B. Milvang-Jensen, G. Rodighiero, et al. (163 additional authors not shown)

Abstract: This paper presents a search for high redshift galaxies from the Euclid Early Release Observations program "Magnifying Lens." The 1.5 deg$^2$ area covered by the twin Abell lensing cluster fields is comparable in size to the few other deep near-infrared surveys such as COSMOS, and so provides an opportunity to significantly increase known samples of rare UV-bright galaxies at $z\approx6-8$ ($M_{\rm UV}\lesssim-22$). Beyond their still uncertain role in reionisation, these UV-bright galaxies are ideal laboratories from which to study galaxy formation and constrain the bright-end of the UV luminosity function. Of the 501994 sources detected from a combined $Y_{\rm E}$, $J_{\rm E}$, and $H_{\rm E}$ NISP detection image, 168 do not have any appreciable VIS/$I_{\rm E}$ flux. These objects span a range in spectral colours, separated into two classes: 139 extremely red sources; and 29 Lyman-break galaxy candidates. Best-fit redshifts and spectral templates suggest the former is composed of both $z\gtrsim5$ dusty star-forming galaxies and $z\approx1-3$ quiescent systems. The latter is composed of more homogeneous Lyman break galaxies at $z\approx6-8$. In both cases, contamination by L- and T-type dwarfs cannot be ruled out with Euclid images alone. Additional contamination from instrumental persistence is investigated using a novel time series analysis. This work lays the foundation for future searches within the Euclid Deep Fields, where thousands more $z\gtrsim6$ Lyman break systems and extremely red sources will be identified.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 22 pages, 13 figures, paper submitted as part of the A&A special issue `Euclid on Sky', which contains Euclid key reference papers and first results from the Euclid Early Release Observations

arXiv:2405.13504 [pdf, other]

Euclid: Early Release Observations -- A preview of the Euclid era through a galaxy cluster magnifying lens

Authors: H. Atek, R. Gavazzi, J. R. Weaver, J. M. Diego, T. Schrabback, N. A. Hatch, N. Aghanim, H. Dole, W. G. Hartley, S. Taamoli, G. Congedo, Y. Jimenez-Teja, J. -C. Cuillandre, E. Bañados, S. Belladitta, R. A. A. Bowler, M. Franco, M. Jauzac, G. Mahler, J. Richard, P. -F. Rocci, S. Serjeant, S. Toft, D. Abriola, P. Bergamini, et al. (178 additional authors not shown)

Abstract: We present the first analysis of the Euclid Early Release Observations (ERO) program that targets fields around two lensing clusters, Abell 2390 and Abell 2764. We use VIS and NISP imaging to produce photometric catalogs for a total of $\sim 500\,000$ objects. The imaging data reach a $5\,σ$ typical depth in the range 25.1-25.4 AB in the NISP bands, and 27.1-27.3 AB in the VIS band. Using the Lyman-break method in combination with photometric redshifts, we identify $30$ Lyman-break galaxy (LBG) candidates at $z>6$ and 139 extremely red sources (ERSs), most likely at lower redshift. The deeper VIS imaging compared to NISP means we can routinely identify high-redshift Lyman breaks of the order of $3$ magnitudes, which reduces contamination by brown dwarf stars and low-redshift galaxies. Spectroscopic follow-up campaigns of such bright sources will help constrain both the bright end of the ultraviolet galaxy luminosity function and the quasar luminosity function at $z>6$, and constrain the physical nature of these objects. Additionally, we have performed a combined strong lensing and weak lensing analysis of A2390, and demonstrate how Euclid will contribute to better constraining the virial mass of galaxy clusters. From these data, we also identify optical and near-infrared counterparts of known $z>0.6$ clusters, which exhibit strong lensing features, establishing the ability of Euclid to characterize high-redshift clusters. Finally, we provide a glimpse of Euclid's ability to map the intracluster light out to larger radii than current facilities, enabling a better understanding of the cluster assembly history and mapping of the dark matter distribution. This initial dataset illustrates the diverse spectrum of legacy science that will be enabled by the Euclid survey.

Submitted 22 May, 2024; originally announced May 2024.

Comments: Paper submitted as part of the A&A special issue `Euclid on Sky', which contains Euclid key reference papers and first results from the Euclid Early Release Observations. 17 pages, 12 figures

arXiv:2405.13491 [pdf, other]

Euclid. I. Overview of the Euclid mission

Authors: Euclid Collaboration, Y. Mellier, Abdurro'uf, J. A. Acevedo Barroso, A. Achúcarro, J. Adamek, R. Adam, G. E. Addison, N. Aghanim, M. Aguena, V. Ajani, Y. Akrami, A. Al-Bahlawan, A. Alavi, I. S. Albuquerque, G. Alestas, G. Alguero, A. Allaoui, S. W. Allen, V. Allevato, A. V. Alonso-Tetilla, B. Altieri, A. Alvarez-Candal, A. Amara, L. Amendola, et al. (1086 additional authors not shown)

Abstract: The current standard model of cosmology successfully describes a variety of measurements, but the nature of its main ingredients, dark matter and dark energy, remains unknown. Euclid is a medium-class mission in the Cosmic Vision 2015-2025 programme of the European Space Agency (ESA) that will provide high-resolution optical imaging, as well as near-infrared imaging and spectroscopy, over about 14,000 deg^2 of extragalactic sky. In addition to accurate weak lensing and clustering measurements that probe structure formation over half of the age of the Universe, its primary probes for cosmology, these exquisite data will enable a wide range of science. This paper provides a high-level overview of the mission, summarising the survey characteristics, the various data-processing steps, and data products. We also highlight the main science objectives and expected performance.

Submitted 22 May, 2024; originally announced May 2024.

Comments: Paper submitted as part of the A&A special issue `Euclid on Sky'

arXiv:2403.11527 [pdf, ps, other]

Connecting 2-Forms, Conformal Transformations, Curvature Invariants and Topological Classes in Einstein Spacetimes

Authors: Jack C. M. Hughes, Fedor V. Kusmartsev

Abstract: The unique nature of the Lorentz group in four dimensions is the root cause of the many remarkable properties of the Einstein spacetimes, in particular their operational structure on the 2-forms. We show how this operational structure can be used for two ends. First, it allows for a simple generalization of the Birkhoff theorem to Schwarzschild (A)de-Sitter spacetime. Second, it provides the means to construct an Abelian endomorphism group on the space of 2-forms. It is observed that taking the trace over this group element-wise induces a further Abelian group which may be identified with a tensor representation of conformal transformations, giving Einstein spacetimes access to their own conformal equivalence class. A further trace over the group yields the curvature invariants of the spacetime. The Kretschmann scalar becomes the topological Euler density, which may be linked in a simple way to the Hawking temperature of horizons.

Submitted 18 March, 2024; originally announced March 2024.

Comments: 19 pages, submitted to EPJ C

arXiv:2403.10658 [pdf, other]

InterLUDE: Interactions between Labeled and Unlabeled Data to Enhance Semi-Supervised Learning

Authors: Zhe Huang, Xiaowei Yu, Dajiang Zhu, Michael C. Hughes

Abstract: Semi-supervised learning (SSL) seeks to enhance task performance by training on both labeled and unlabeled data. Mainstream SSL image classification methods mostly optimize a loss that additively combines a supervised classification objective with a regularization term derived solely from unlabeled data. This formulation neglects the potential for interaction between labeled and unlabeled images. In this paper, we introduce InterLUDE, a new approach to enhance SSL made of two parts that each benefit from labeled-unlabeled interaction. The first part, embedding fusion, interpolates between labeled and unlabeled embeddings to improve representation learning. The second part is a new loss, grounded in the principle of consistency regularization, that aims to minimize discrepancies in the model's predictions between labeled versus unlabeled inputs. Experiments on standard closed-set SSL benchmarks and a medical SSL task with an uncurated unlabeled set show clear benefits to our approach. On the STL-10 dataset with only 40 labels, InterLUDE achieves 3.2% error rate, while the best previous method reports 14.9%.

Submitted 15 March, 2024; originally announced March 2024.

Comments: Semi-supervised Learning; Vision Transformers

arXiv:2403.06024 [pdf, other]

Semi-Supervised Multimodal Multi-Instance Learning for Aortic Stenosis Diagnosis

Authors: Zhe Huang, Xiaowei Yu, Benjamin S. Wessler, Michael C. Hughes

Abstract: Automated interpretation of ultrasound imaging of the heart (echocardiograms) could improve the detection and treatment of aortic stenosis (AS), a deadly heart disease. However, existing deep learning pipelines for assessing AS from echocardiograms have two key limitations. First, most methods rely on limited 2D cineloops, thereby ignoring widely available Doppler imaging that contains important complementary information about pressure gradients and blood flow abnormalities associated with AS. Second, obtaining labeled data is difficult. There are often far more unlabeled echocardiogram recordings available, but these remain underutilized by existing methods. To overcome these limitations, we introduce Semi-supervised Multimodal Multiple-Instance Learning (SMMIL), a new deep learning framework for automatic interpretation for structural heart diseases like AS. When deployed, SMMIL can combine information from two input modalities, spectral Dopplers and 2D cineloops, to produce a study-level AS diagnosis. During training, SMMIL can combine a smaller labeled set and an abundant unlabeled set of both modalities to improve its classifier. Experiments demonstrate that SMMIL outperforms recent alternatives at 3-level AS severity classification as well as several clinically relevant AS detection tasks.

Submitted 9 March, 2024; originally announced March 2024.

Comments: Echocardiography; Multimodal; Semi-supervised Learning; Multiple-Instance Learning

arXiv:2403.03647 [pdf, ps, other]

The elementary theory of the 2-category of small categories

Authors: Calum Hughes, Adrian Miranda

Abstract: We give an elementary description of $2$-categories $\mathbf{Cat}\left(\mathcal{E}\right)$ of internal categories, functors and natural transformations, where $\mathcal{E}$ is a category modelling Lawvere's elementary theory of the category of sets (ETCS). This extends Bourke's characterisation of $2$-categories $\mathbf{Cat}\left(\mathcal{E}\right)$ where $\mathcal{E}$ has pullbacks to take account of the extra properties in ETCS, and Lawvere's characterisation of the (one dimensional) category of small categories to take account of the two-dimensional structure. Important two-dimensional concepts which we introduce include $2$-well-pointedness, full-subobject classifiers, and the categorified axiom of choice. Along the way, we show how generating families (resp. orthogonal factorisation systems) on $\mathcal{E}$ give rise to generating families (resp. orthogonal factorisation systems) on $\mathbf{Cat}\left(\mathcal{E}\right)_{1}$, results which we believe are of independent interest.

Submitted 29 May, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

Comments: v2. 37 pages. Updated definition of 2D natural numbers object in order to give it a genuine 2D universal property. Other minor changes following referee report including some reorganisation of material for better flow. To appear in the Theory and Applications of Categories special volume for Bill Lawvere

MSC Class: 03B30; 03E30; 03G30; 18A15; 18B05; 18B25; 18B50; 18D40; 18N10

arXiv:2402.10945 [pdf, other]

Multiplicity Based Background Subtraction for Jets in Heavy Ion Collisions

Authors: Tanner Mengel, Patrick Steffanic, Charles Hughes, Antonio Carlos Oliveira Da Silva, Christine Nattrass

Abstract: Jet measurements in heavy ion collisions at low jet momentum can provide constraints on the properties of the quark gluon plasma but are overwhelmed by a significant, fluctuating background. We build upon our previous work which demonstrated the ability of the jet multiplicity method to extend jet measurements into the domain of low jet momentum [1, Mengel:2023]. We extend this method to a wide range of jet resolution parameters. We investigate the over-complexity of non-interpretable machine learning used to tackle the problem of jet background subtraction through network optimization. Finally, we show that the resulting shallow neural network is able to learn the underlying relationship between jet multiplicity and background fluctuations, with a lesser complexity, reinforcing the utility of interpretable methods.

Submitted 8 February, 2024; originally announced February 2024.

Comments: 14 pages including references. 10 figures. 1 appendix

arXiv:2401.14973 [pdf, other]

Discovering group dynamics in synchronous time series via hierarchical recurrent switching-state models

Authors: Michael Wojnowicz, Preetish Rath, Eric Miller, Jeffrey Miller, Clifford Hancock, Meghan O'Donovan, Seth Elkin-Frankston, Thaddeus Brunye, Michael C. Hughes

Abstract: We seek to model a collection of time series arising from multiple entities interacting over the same time period. Recent work focused on modeling individual time series is inadequate for our intended applications, where collective system-level behavior influences the trajectories of individual entities. To address such problems, we present a new hierarchical switching-state model that can be trained in an unsupervised fashion to simultaneously explain both system-level and individual-level dynamics. We employ a latent system-level discrete state Markov chain that drives latent entity-level chains which in turn govern the dynamics of each observed time series. Feedback from the observations to the chains at both the entity and system levels improves flexibility via context-dependent state transitions. Our hierarchical switching recurrent dynamical models can be learned via closed-form variational coordinate ascent updates to all latent chains that scale linearly in the number of individual time series. This is asymptotically no more costly than fitting separate models for each entity. Experiments on synthetic and real datasets show that our model can produce better forecasts of future entity behavior than existing methods. Moreover, the availability of latent state chains at both the entity and system level enables interpretation of group dynamics.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.04693 [pdf, other]

Co-Clustering Multi-View Data Using the Latent Block Model

Authors: Joshua Tobin, Michaela Black, James Ng, Debbie Rankin, Jonathan Wallace, Catherine Hughes, Leane Hoey, Adrian Moore, **ling Wang, Geraldine Horigan, Paul Carlin, Helene McNulty, Anne M Molloy, Mimi Zhang

Abstract: The Latent Block Model (LBM) is a prominent model-based co-clustering method, returning parametric representations of each block cluster and allowing the use of well-grounded model selection methods. The LBM, while adapted in literature to handle different feature types, cannot be applied to datasets consisting of multiple disjoint sets of features, termed views, for a common set of observations. In this work, we introduce the multi-view LBM, extending the LBM method to multi-view data, where each view marginally follows an LBM. In the case of two views, the dependence between them is captured by a cluster membership matrix, and we aim to learn the structure of this matrix. We develop a likelihood-based approach in which parameter estimation uses a stochastic EM algorithm integrating a Gibbs sampler, and an ICL criterion is derived to determine the number of row and column clusters in each view. To motivate the application of multi-view methods, we extend recent work developing hypothesis tests for the null hypothesis that clusters of observations in each view are independent of each other. The testing procedure is integrated into the model estimation strategy. Furthermore, we introduce a penalty scheme to generate sparse row clusterings. We verify the performance of the developed algorithm using synthetic datasets, and provide guidance for optimal parameter selection. Finally, the multi-view co-clustering method is applied to a complex genomics dataset, and is shown to provide new insights for high-dimension multi-view problems.

Submitted 9 January, 2024; originally announced January 2024.

arXiv:2311.18025 [pdf, other]

A Probabilistic Method to Predict Classifier Accuracy on Larger Datasets given Small Pilot Data

Authors: Ethan Harvey, Wansu Chen, David M. Kent, Michael C. Hughes

Abstract: Practitioners building classifiers often start with a smaller pilot dataset and plan to grow to larger data in the near future. Such projects need a toolkit for extrapolating how much classifier accuracy may improve from a 2x, 10x, or 50x increase in data size. While existing work has focused on finding a single "best-fit" curve using various functional forms like power laws, we argue that modeling and assessing the uncertainty of predictions is critical yet has seen less attention. In this paper, we propose a Gaussian process model to obtain probabilistic extrapolations of accuracy or similar performance metrics as dataset size increases. We evaluate our approach in terms of error, likelihood, and coverage across six datasets. Though we focus on medical tasks and image modalities, our open source approach generalizes to any kind of classifier.

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2311.05367 [pdf, other]

Reducing Disorder: An Information-Theory Formulation of MEV

Authors: Ciaran Hughes

Abstract: Maximal Extractable Value (MEV) has garnered significant attention in the cryptocurrency community. Such attention is a consequence of the revenue that can be generated from MEV, as well as the risks MEV poses to the fundamental value proposition of the underlying blockchain technology. In this work, we provide an information-theoretic formulation of MEV. With this formulation, we make common statements about MEV mathematically rigorous. For example, we show that i) all non-trivial blockchains and decentralised applications must generate MEV; ii) how MEV can be reduced at the expense of user expressibility; and iii) how MEV can be good or bad from an information theoretic standpoint.

Submitted 9 November, 2023; originally announced November 2023.

Comments: 8 pages

arXiv:2309.14277 [pdf, other]

SINCERE: Supervised Information Noise-Contrastive Estimation REvisited

Authors: Patrick Feeney, Michael C. Hughes

Abstract: The information noise-contrastive estimation (InfoNCE) loss function provides the basis of many self-supervised deep learning methods due to its strong empirical results and theoretic motivation. Previous work suggests a supervised contrastive (SupCon) loss to extend InfoNCE to learn from available class labels. This SupCon loss has been widely-used due to reports of good empirical performance. However, in this work we find that the prior SupCon loss formulation has questionable justification because it can encourage some images from the same class to repel one another in the learned embedding space. This problematic intra-class repulsion gets worse as the number of images sharing one class label increases. We propose the Supervised InfoNCE REvisited (SINCERE) loss as a theoretically-justified supervised extension of InfoNCE that eliminates intra-class repulsion. Experiments show that SINCERE leads to better separation of embeddings from different classes and improves transfer learning classification accuracy. We additionally utilize probabilistic modeling to derive an information-theoretic bound that relates SINCERE loss to the symmetrized KL divergence between data-generating distributions for a target class and all other classes.

Submitted 2 July, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

arXiv:2309.08742 [pdf, other]

RoSSO: A High-Performance Python Package for Robotic Surveillance Strategy Optimization Using JAX

Authors: Yohan John, Connor Hughes, Gilberto Diaz-Garcia, Jason R. Marden, Francesco Bullo

Abstract: To enable the computation of effective randomized patrol routes for single- or multi-robot teams, we present RoSSO, a Python package designed for solving Markov chain optimization problems. We exploit machine-learning techniques such as reverse-mode automatic differentiation and constraint parametrization to achieve superior efficiency compared to general-purpose nonlinear programming solvers. Additionally, we supplement a game-theoretic stochastic surveillance formulation in the literature with a novel greedy algorithm and multi-robot extension. We close with numerical results for a police district in downtown San Francisco that demonstrate RoSSO's capabilities on our new formulations and the prior work.

Submitted 15 September, 2023; originally announced September 2023.

Comments: 7 pages, 4 figures, 3 tables, submitted to the 2024 IEEE International Conference on Robotics and Automation. See https://github.com/conhugh/RoSSO for associated codebase

arXiv:2308.14160 [pdf, other]

A Unified Transformer-based Network for multimodal Emotion Recognition

Authors: Kamran Ali, Charles E. Hughes

Abstract: The development of transformer-based models has resulted in significant advances in addressing various vision and NLP-based research challenges. However, the progress made in transformer-based methods has not been effectively applied to biosensing research. This paper presents a novel Unified Biosensor-Vision Multi-modal Transformer-based (UBVMT) method to classify emotions in an arousal-valence space by combining a 2D representation of an ECG/PPG signal with the face information. To achieve this goal, we first investigate and compare the unimodal emotion recognition performance of three image-based representations of the ECG/PPG signal. We then present our UBVMT network which is trained to perform emotion recognition by combining the 2D image-based representation of the ECG/PPG signal and the facial expression features. Our unified transformer model consists of homogeneous transformer blocks that take as an input the 2D representation of the ECG/PPG signal and the corresponding face frame for emotion representation learning with minimal modality-specific design. Our UBVMT model is trained by reconstructing masked patches of video frames and 2D images of ECG/PPG signals, and contrastive modeling to align face and ECG/PPG data. Extensive experiments on the MAHNOB-HCI and DEAP datasets show that our Unified UBVMT-based model produces comparable results to the state-of-the-art techniques.

Submitted 27 August, 2023; originally announced August 2023.

Comments: 12 pages

arXiv:2307.08919 [pdf, other]

Systematic comparison of semi-supervised and self-supervised learning for medical image classification

Authors: Zhe Huang, Ruijie Jiang, Shuchin Aeron, Michael C. Hughes

Abstract: In typical medical image classification problems, labeled data is scarce while unlabeled data is more available. Semi-supervised learning and self-supervised learning are two different research directions that can improve accuracy by learning from extra unlabeled data. Recent methods from both directions have reported significant gains on traditional benchmarks. Yet past benchmarks do not focus on medical tasks and rarely compare self- and semi- methods together on an equal footing. Furthermore, past benchmarks often handle hyperparameter tuning suboptimally. First, they may not tune hyperparameters at all, leading to underfitting. Second, when tuning does occur, it often unrealistically uses a labeled validation set that is much larger than the training set. Therefore currently published rankings might not always reflect practical utility. This study contributes a systematic evaluation of self- and semi- methods with a unified experimental protocol intended to guide a practitioner with scarce overall labeled data and a limited compute budget. We answer two key questions: Can hyperparameter tuning be effective with realistic-sized validation sets? If so, when all methods are tuned well, which self- or semi-supervised methods achieve the best accuracy? Our study compares 13 representative semi- and self-supervised methods to strong labeled-set-only baselines on 4 medical datasets. From 20000+ GPU hours of computation, we provide valuable best practices to resource-constrained practitioners: hyperparameter tuning is effective, and the semi-supervised method known as MixMatch delivers the most reliable gains across 4 datasets.

Submitted 29 March, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

Comments: CVPR 2024

arXiv:2306.00003 [pdf, other]

Detecting Heart Disease from Multi-View Ultrasound Images via Supervised Attention Multiple Instance Learning

Authors: Zhe Huang, Benjamin S. Wessler, Michael C. Hughes

Abstract: Aortic stenosis (AS) is a degenerative valve condition that causes substantial morbidity and mortality. This condition is under-diagnosed and under-treated. In clinical practice, AS is diagnosed with expert review of transthoracic echocardiography, which produces dozens of ultrasound images of the heart. Only some of these views show the aortic valve. To automate screening for AS, deep networks must learn to mimic a human expert's ability to identify views of the aortic valve then aggregate across these relevant images to produce a study-level diagnosis. We find previous approaches to AS detection yield insufficient accuracy due to relying on inflexible averages across images. We further find that off-the-shelf attention-based multiple instance learning (MIL) performs poorly. We contribute a new end-to-end MIL approach with two key methodological innovations. First, a supervised attention technique guides the learned attention mechanism to favor relevant views. Second, a novel self-supervised pretraining strategy applies contrastive learning on the representation of the whole study instead of individual images as commonly done in prior literature. Experiments on an open-access dataset and an external validation set show that our approach yields higher accuracy while reducing model size.

Submitted 4 April, 2024; v1 submitted 25 May, 2023; originally announced June 2023.

Comments: Echocardiogram; multiple-instance learning; self-supervised learning; semi-supervised learning; medical imaging

Journal ref: MLHC 2023

arXiv:2305.14253 [pdf, other]

A heuristic for discrete mean values of the derivative of the Riemann zeta function

Authors: Christopher Hughes, Greg Martin, Andrew Pearce-Crump

Abstract: Shanks conjectured that $ζ' (ρ)$, where $ρ$ ranges over non-trivial zeros of the Riemann zeta function, is real and positive in the mean. We present a history of this problem, including a generalisation to all higher-order derivatives $ζ^{(n)}(s)$, for which the sign of the mean alternates between positive for odd $n$ and negative for even $n$. Furthermore, we give a simple heuristic that provides the leading term (including its sign) of the asymptotic formula for the average value of $ζ^{(n)}(ρ)$.

Submitted 23 May, 2023; originally announced May 2023.

arXiv:2303.08275 [pdf, other]

doi 10.1103/PhysRevC.108.L021901

Interpretable Machine Learning Methods Applied to Jet Background Subtraction in Heavy Ion Collisions

Authors: Tanner Mengel, Patrick Steffanic, Charles Hughes, Antonio Carlos Oliveira da Silva, Christine Nattrass

Abstract: Jet measurements in heavy ion collisions can provide constraints on the properties of the quark gluon plasma, but the kinematic reach is limited by a large, fluctuating background. We present a novel application of symbolic regression to extract a functional representation of a deep neural network trained to subtract the background for measurements of jets in relativistic heavy ion collisions. We show that the deep neural network is approximately the same as a method using the particle multiplicity in a jet. This demonstrates that interpretable machine learning methods can provide insight into underlying physical processes.

Submitted 23 August, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

Journal ref: PhysRevC.108.L021901(2023)6

arXiv:2302.08687 [pdf, other]

VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs

Authors: Geonhwa Jeong, Sana Damani, Abhimanyu Rajeshkumar Bambhaniya, Eric Qin, Christopher J. Hughes, Sreenivas Subramoney, Hyesoon Kim, Tushar Krishna

Abstract: Deep Learning (DL) acceleration support in CPUs has recently gained a lot of traction, with several companies (Arm, Intel, IBM) announcing products with specialized matrix engines accessible via GEMM instructions. CPUs are pervasive and need to handle diverse requirements across DL workloads running in edge/HPC/cloud platforms. Therefore, as DL workloads embrace sparsity to reduce the computations and memory size of models, it is also imperative for CPUs to add support for sparsity to avoid under-utilization of the dense matrix engine and inefficient usage of the caches and registers. This work presents VEGETA, a set of ISA and microarchitecture extensions over dense matrix engines to support flexible structured sparsity for CPUs, enabling programmable support for diverse DL models with varying degrees of sparsity. Compared to the state-of-the-art (SOTA) dense matrix engine in CPUs, a VEGETA engine provides 1.09x, 2.20x, 3.74x, and 3.28x speed-ups when running 4:4 (dense), 2:4, 1:4, and unstructured (95%) sparse DNN layers.

Submitted 23 February, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

Comments: This paper is accepted to HPCA 2023

arXiv:2301.09148 [pdf, other]

Separating signal from combinatorial jets in a high background environment

Authors: P. Steffanic, C. Hughes, C. Nattrass

Abstract: We study procedures for discriminating combinatorial jets in a high background environment, such as a heavy ion collision, from signal jets arising from a hard-scattering. We investigate a population of jets clustered from a combined PYTHIA+TennGen event, focusing on jets which can unambiguously be classified as signal or combinatorial jets. By selecting jets based on their kinematic properties, we investigate whether it is possible to separate signal and combinatorial jets without biasing the signal population significantly. We find that, after a loose selection on the jet area, surviving combinatorial jets are dominantly imposters, combinatorial jets with properties indistinguishable from signal jets. We also find that, after a loose selection on the leading hadron momentum, surviving combinatorial jets are still dominantly imposters. We use rule extraction, a machine learning technique, to extract an optimal kinematic selection from a random forest trained on our population of jets. In general, this technique found a stricter kinematic selection on the jet's leading hadron momentum to be optimal. We find that it is possible to suppress combinatorial jets significantly using this machine learning based selection, but that some signal is removed as well. Due to this stricter kinematic selection, we find that the surviving signal is biased towards quark-like jets. Since similar selections are used in many measurements, this indicates that those measurements are biased towards quark-like jets as well. These studies should motivate an increased emphasis on assumptions made when suppressing and subtracting combinatorial background and the biases introduced by methods for doing so.

Submitted 31 July, 2023; v1 submitted 22 January, 2023; originally announced January 2023.

Comments: This version of the manuscript contains supplemental appendices

arXiv:2212.00512 [pdf, other]

Hot and Cold QCD White Paper from ALICE-USA: Input for 2023 U.S. Long Range Plan for Nuclear Science

Authors: N. Alizadehvandchali, N. Apadula, M. Arslandok, C. Beattie, R. Bellwied, J. T. Blair, F. Bock, H. Bossi, A. Bylinkin, H. Caines, I. Chakaberia, M. Cherney, T. M. Cormier, R. Cruz-Torres, P. Dhankher, D. U. Dixit, R. J. Ehlers, W. Fan, M. Fasel, F. Flor, A. N. Flores, D. R. Gangadharan, E. Garcia-Solis, A. Gautam, E. Glimos , et al. (58 additional authors not shown)

Abstract: The ALICE-USA collaboration presents its plans for the 2023 U.S. Long Range Plan for Nuclear Science.

Submitted 1 December, 2022; originally announced December 2022.

Comments: 26 pages, 1 figure

arXiv:2210.16129 [pdf, other]

doi 10.1103/PhysRevLett.131.020601

Coherent Control of Trapped Ion Qubits with Localized Electric Fields

Authors: R. Srinivas, C. M. Löschnauer, M. Malinowski, A. C. Hughes, R. Nourshargh, V. Negnevitsky, D. T. C. Allcock, S. A. King, C. Matthiesen, T. P. Harty, C. J. Ballance

Abstract: We present a new method for coherent control of trapped ion qubits in separate interaction regions of a multi-zone trap by simultaneously applying an electric field and a spin-dependent gradient. Both the phase and amplitude of the effective single-qubit rotation depend on the electric field, which can be localised to each zone. We demonstrate this interaction on a single ion using both laser-based and magnetic field gradients in a surface-electrode ion trap, and measure the localisation of the electric field.

Submitted 28 October, 2022; originally announced October 2022.

arXiv:2210.05505 [pdf, other]

doi 10.1051/0004-6361/202244859

Quasar and galaxy classification using Gaia EDR3 and CatWise2020

Authors: Arvind C. N. Hughes, Coryn A. L. Bailer-Jones, Sara Jamal

Abstract: In this work, we assess the combined use of Gaia photometry and astrometry with infrared data from CatWISE in improving the identification of extragalactic sources compared to the classification obtained using Gaia data. We evaluate different input feature configurations and prior functions, with the aim of presenting a classification methodology integrating prior knowledge stemming from realistic class distributions in the universe. In our work, we compare different classifiers, namely Gaussian Mixture Models (GMMs), XGBoost and CatBoost, and classify sources into three classes - star, quasar, and galaxy, with the target quasar and galaxy class labels obtained from SDSS16 and the star label from Gaia EDR3. In our approach, we adjust the posterior probabilities to reflect the intrinsic distribution of extragalactic sources in the universe via a prior function. We introduce two priors, a global prior reflecting the overall rarity of quasars and galaxies, and a mixed prior that incorporates in addition the distribution of these sources as a function of Galactic latitude and magnitude. Our best classification performances, in terms of completeness and purity of the galaxy and quasar classes, are achieved using the mixed prior for sources at high latitudes and in the magnitude range G = 18.5 to 19.5. We apply our identified best-performing classifier to three application datasets from Gaia DR3, and find that the global prior is more conservative in what it considers to be a quasar or a galaxy compared to the mixed prior. In particular, when applied to the pure quasar and galaxy candidates samples, we attain a purity of 97% for quasars and 99.9% for galaxies using the global prior, and purities of 96% and 99% respectively using the mixed prior. We conclude our work by discussing the importance of applying adjusted priors portraying realistic class distributions in the universe.

Submitted 11 October, 2022; originally announced October 2022.

Comments: 21 pages, 23 figures, Accepted for publication in A&A

Journal ref: A&A 668, A99 (2022)

arXiv:2210.05161 [pdf, other]

doi 10.1093/mnras/stad170

Spectroscopic follow-up of statistically selected extremely metal-poor star candidates from GALAH DR3

Authors: G. S. Da Costa, M. S. Bessell, Thomas Nordlander, Arvind C. N. Hughes, Sven Buder, A. D. Mackey, Lee R. Spitler, D. B. Zucker

Abstract: The advent of large-scale stellar spectroscopic surveys naturally leads to the implementation of machine learning techniques to isolate, for example, small sub-samples of potentially interesting stars from the full data set. A recent example is the application of the t-SNE statistical method to $\sim$600,000 stellar spectra from the GALAH survey in order to identify a sample of candidate extremely metal-poor (EMP, [Fe/H] $\leq$ -3) stars. We report the outcome of low-resolution spectroscopic follow-up of 83 GALAH EMP candidates that lack any previous metallicity estimates. Overall, the statistical selection is found to be efficient ($\sim$one-third of the candidates have [Fe/H] $\leq$ -2.75) with low contamination ($<$10% have [Fe/H] $>$ -2), and with a metallicity distribution function that is consistent with previous work. Five stars are found to have [Fe/H] $\leq$ -3.0, one of which is a main sequence turnoff star. Two other stars are revealed as likely carbon-enhanced metal-poor (CEMP) stars of type CEMP-$s$, and a known carbon star is re-identified. The results indicate that the statistical selection approach employed was successful, and therefore it can be applied to forthcoming even larger stellar spectroscopic surveys with the expectation of similar positive outcomes.

Submitted 11 January, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

Comments: Accepted for publication in MNRAS; 8 pages, 6 figures

arXiv:2210.01918 [pdf, other]

doi 10.1109/TSP.2023.330361

Nonparametric and Regularized Dynamical Wasserstein Barycenters for Sequential Observations

Authors: Kevin C. Cheng, Shuchin Aeron, Michael C. Hughes, Eric L. Miller

Abstract: We consider probabilistic models for sequential observations which exhibit gradual transitions among a finite number of states. We are particularly motivated by applications such as human activity analysis where observed accelerometer time series contains segments representing distinct activities, which we call pure states, as well as periods characterized by continuous transition among these pure states. To capture this transitory behavior, the dynamical Wasserstein barycenter (DWB) model of Cheng et al. in 2021 [1] associates with each pure state a data-generating distribution and models the continuous transitions among these states as a Wasserstein barycenter of these distributions with dynamically evolving weights. Focusing on the univariate case where Wasserstein distances and barycenters can be computed in closed form, we extend [1] specifically relaxing the parameterization of the pure states as Gaussian distributions. We highlight issues related to the uniqueness in identifying the model parameters as well as uncertainties induced when estimating a dynamically evolving distribution from a limited number of samples. To ameliorate non-uniqueness, we introduce regularization that imposes temporal smoothness on the dynamics of the barycentric weights. A quantile-based approximation of the pure state distributions yields a finite dimensional estimation problem which we numerically solve using cyclic descent alternating between updates to the pure-state quantile functions and the barycentric weights. We demonstrate the utility of the proposed algorithm in segmenting both simulated and real world human activity time series.

Submitted 21 September, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

Journal ref: IEEE Transactions on Signal Processing (2023), volume 71, pages 3164 - 3178

arXiv:2208.11870 [pdf, other]

Fix-A-Step: Semi-supervised Learning from Uncurated Unlabeled Data

Authors: Zhe Huang, Mary-Joy Sidhom, Benjamin S. Wessler, Michael C. Hughes

Abstract: Semi-supervised learning (SSL) promises improved accuracy compared to training classifiers on small labeled datasets by also training on many unlabeled images. In real applications like medical imaging, unlabeled data will be collected for expediency and thus uncurated: possibly different from the labeled set in classes or features. Unfortunately, modern deep SSL often makes accuracy worse when given uncurated unlabeled data. Recent complex remedies try to detect out-of-distribution unlabeled images and then discard or downweight them. Instead, we introduce Fix-A-Step, a simpler procedure that views all uncurated unlabeled images as potentially helpful. Our first insight is that even uncurated images can yield useful augmentations of labeled data. Second, we modify gradient descent updates to prevent optimizing a multi-task SSL loss from hurting labeled-set accuracy. Fix-A-Step can repair many common deep SSL methods, improving accuracy on CIFAR benchmarks across all tested methods and levels of artificial class mismatch. On a new medical SSL benchmark called Heart2Heart, Fix-A-Step can learn from 353,500 truly uncurated ultrasound images to deliver gains that generalize across hospitals.

Submitted 25 May, 2023; v1 submitted 25 August, 2022; originally announced August 2022.

Comments: AISTATS 2023 (Oral)

arXiv:2207.12231 [pdf, other]

FAT-PIM: Low-Cost Error Detection for Processing-In-Memory

Authors: Kazi Abu Zubair, Sumit Kumar Jha, David Mohaisen, Clayton Hughes, Amro Awad

Abstract: Processing In Memory (PIM) accelerators are a promising architecture that can provide massive parallelization and high efficiency in various applications. Such architectures can instantaneously provide ultra-fast operation over extensive data, allowing real-time performance in data-intensive workloads. For instance, Resistive Memory (ReRAM) based PIM architectures are widely known for their inherent dot-product computation capability. While the performance of such architectures is essential, reliability and accuracy are also important, especially in mission-critical real-time systems. Unfortunately, PIM architectures have a fundamental limitation in guaranteeing error-free operation. As a result, current methods must pay high implementation costs or performance penalties to achieve reliable execution in the PIM accelerator. In this paper, we make a fundamental observation of this reliability limitation of ReRAM based PIM architectures. Accordingly, we propose a novel solution, Fault Tolerant PIM (FAT-PIM), that can improve reliability for such systems significantly at a low cost. Our evaluation shows that we can improve the error tolerance significantly with only 4.9% performance cost and 3.9% storage overhead.

Submitted 25 July, 2022; originally announced July 2022.

Comments: This paper is currently under submission. We arXiv our paper to establish credit for inventing this work

arXiv:2207.11193 [pdf, other]

doi 10.1103/PhysRevA.107.022617

Synthesizing a $\hatσ_z$ spin-dependent force for optical, metastable, and ground state trapped-ion qubits

Authors: O. Băzăvan, S. Saner, M. Minder, A. C. Hughes, R. T. Sutherland, D. M. Lucas, R. Srinivas, C. J. Ballance

Abstract: A single bichromatic field near-resonant to a qubit transition is typically used for $\hatσ_x$ or $\hatσ_y$ Mølmer-Sørensen type interactions in trapped ion systems. Using this field configuration, it is also possible to synthesize a $\hatσ_z$ spin-dependent force by merely adjusting the beat-note frequency. Here, we expand on previous work and present a comprehensive theoretical and experimental investigation of this scheme with a laser near-resonant to a quadrupole transition in $^{88}$Sr$^+$. Further, we characterise its robustness to optical phase and qubit frequency offsets, and demonstrate its versatility by entangling optical, metastable, and ground state qubits.

Submitted 1 December, 2022; v1 submitted 22 July, 2022; originally announced July 2022.

Comments: O. Băzăvan and S. Saner contributed equally to this work

arXiv:2206.15322 [pdf, other]

Score Equivalence for Staged Trees

Authors: Conor Hughes, Peter Strong, Aditi Shenvi

Abstract: Staged trees are a recently developed, powerful family of probabilistic graphical models. An equivalence class of staged trees has now been characterised, and two fundamental statistical operators have been defined to traverse the equivalence class of a given staged tree. Here, two staged trees are said to be statistically equivalent when they represent the same set of distributions. Probabilistic graphical models such as staged trees are increasingly being used for causal analyses. Staged trees which are within the same equivalence class can encode very different causal hypotheses but data alone cannot help us distinguish between these. Therefore, in using score-based methods to learn the model structure and distributions from data for causal analyses, we should expect that a suitable scoring function is one which assigns the same score to statistically equivalent models. No scoring function has yet been proven to have this desirable property for staged trees. In this paper, we present a novel Bayesian Dirichlet scoring function based on path uniformity and mass conservation, and prove that this new scoring function is score-equivalent for staged trees.

Submitted 17 January, 2023; v1 submitted 30 June, 2022; originally announced June 2022.

arXiv:2206.11736 [pdf, other]

NovelCraft: A Dataset for Novelty Detection and Discovery in Open Worlds

Authors: Patrick Feeney, Sarah Schneider, Panagiotis Lymperopoulos, Li-Ping Liu, Matthias Scheutz, Michael C. Hughes

Abstract: In order for artificial agents to successfully perform tasks in changing environments, they must be able to both detect and adapt to novelty. However, visual novelty detection research often only evaluates on repurposed datasets such as CIFAR-10 originally intended for object classification, where images focus on one distinct, well-centered object. New benchmarks are needed to represent the challenges of navigating the complex scenes of an open world. Our new NovelCraft dataset contains multimodal episodic data of the images and symbolic world-states seen by an agent completing a pogo stick assembly task within a modified Minecraft environment. In some episodes, we insert novel objects of varying size within the complex 3D scene that may impact gameplay. Our visual novelty detection benchmark finds that methods that rank best on popular area-under-the-curve metrics may be outperformed by simpler alternatives when controlling false positives matters most. Further multimodal novelty detection experiments suggest that methods that fuse both visual and symbolic information can improve time until detection as well as overall discrimination. Finally, our evaluation of recent generalized category discovery methods suggests that adapting to new imbalanced categories in complex scenes remains an exciting open problem.

Submitted 28 March, 2023; v1 submitted 23 June, 2022; originally announced June 2022.

Comments: Published in Transactions on Machine Learning Research (03/2023)

arXiv:2206.00093 [pdf, other]

Easy Variational Inference for Categorical Models via an Independent Binary Approximation

Authors: Michael T. Wojnowicz, Shuchin Aeron, Eric L. Miller, Michael C. Hughes

Abstract: We pursue tractable Bayesian analysis of generalized linear models (GLMs) for categorical data. Thus far, GLMs are difficult to scale to more than a few dozen categories due to non-conjugacy or strong posterior dependencies when using conjugate auxiliary variable methods. We define a new class of GLMs for categorical data called categorical-from-binary (CB) models. Each CB model has a likelihood that is bounded by the product of binary likelihoods, suggesting a natural posterior approximation. This approximation makes inference straightforward and fast; using well-known auxiliary variables for probit or logistic regression, the product of binary models admits conjugate closed-form variational inference that is embarrassingly parallel across categories and invariant to category ordering. Moreover, an independent binary model simultaneously approximates multiple CB models. Bayesian model averaging over these can improve the quality of the approximation for any given dataset. We show that our approach scales to thousands of categories, outperforming posterior estimation competitors like Automatic Differentiation Variational Inference (ADVI) and No U-Turn Sampling (NUTS) in the time required to achieve fixed prediction quality.

Submitted 31 May, 2022; originally announced June 2022.

Comments: to appear at ICML 2022

arXiv:2205.13066 [pdf, other]

Semi-supervised Drifted Stream Learning with Short Lookback

Authors: Weijieying Ren, Pengyang Wang, Xiaolin Li, Charles E. Hughes, Yanjie Fu

Abstract: In many scenarios, 1) data streams are generated in real time; 2) labeled data are expensive and only limited labels are available in the beginning; 3) real-world data is not always i.i.d. and data drift over time gradually; 4) the storage of historical streams is limited and model updating can only be achieved based on a very short lookback window. This learning setting limits the applicability and availability of many Machine Learning (ML) algorithms. We generalize the learning task under such setting as a semi-supervised drifted stream learning with short lookback problem (SDSL). SDSL imposes two under-addressed challenges on existing methods in semi-supervised learning, continuous learning, and domain adaptation: 1) robust pseudo-labeling under gradual shifts and 2) anti-forgetting adaptation with short lookback. To tackle these challenges, we propose a principled and generic generation-replay framework to solve SDSL. The framework is able to accomplish: 1) robust pseudo-labeling in the generation step; 2) anti-forgetting adaption in the replay step. To achieve robust pseudo-labeling, we develop a novel pseudo-label classification model to leverage supervised knowledge of previously labeled data, unsupervised knowledge of new data, and structure knowledge of invariant label semantics. To achieve adaptive anti-forgetting model replay, we propose to view the anti-forgetting adaptation task as a flat region search problem. We propose a novel minimax game-based replay objective function to solve the flat region search problem and develop an effective optimization solver. Finally, we present extensive experiments to demonstrate our framework can effectively address the task of anti-forgetting learning in drifted streams with short lookback.

Submitted 25 May, 2022; originally announced May 2022.

Comments: To appear in KDD 2022

arXiv:2204.05959 [pdf]

"Smarter" NICs for faster molecular dynamics: a case study

Authors: Sara Karamati, Clayton Hughes, K. Scott Hemmert, Ryan E. Grant, W. Whit Schonbein, Scott Levy, Thomas M. Conte, Jeffrey Young, Richard W. Vuduc

Abstract: This work evaluates the benefits of using a "smart" network interface card (SmartNIC) as a compute accelerator for the example of the MiniMD molecular dynamics proxy application. The accelerator is NVIDIA's BlueField-2 card, which includes an 8-core Arm processor along with a small amount of DRAM and storage. We test the networking and data movement performance of these cards compared to a standard Intel server host using microbenchmarks and MiniMD. In MiniMD, we identify two distinct classes of computation, namely core computation and maintenance computation, which are executed in sequence. We restructure the algorithm and code to weaken this dependence and increase task parallelism, thereby making it possible to increase utilization of the BlueField-2 concurrently with the host. We evaluate our implementation on a cluster consisting of 16 dual-socket Intel Broadwell host nodes with one BlueField-2 per host-node. Our results show that while the overall compute performance of BlueField-2 is limited, using them with a modified MiniMD algorithm allows for up to 20% speedup over the host CPU baseline with no loss in simulation accuracy.

Submitted 12 April, 2022; originally announced April 2022.

arXiv:2204.04575 [pdf, other]

The COHERENT Experimental Program

Authors: D. Akimov, S. Alawabdeh, P. An, A. Arteaga, C. Awe, P. S. Barbeau, C. Barry, B. Becker, V. Belov, I. Bernardi, M. A. Blackston, L. Blokland, C. Bock, B. Bodur, A. Bolozdynya, R. Bouabid, A. Bracho, J. Browning, B. Cabrera-Palmer, N. Chen, D. Chernyak, E. Conley, J. Daughhetee, J. Daughtry, E. Day , et al. (106 additional authors not shown)

Abstract: The COHERENT experiment located in Neutrino Alley at the Spallation Neutron Source (SNS), Oak Ridge National Laboratory (ORNL), has made the world's first two measurements of coherent elastic neutrino-nucleus scattering (CEvNS), on CsI and argon, using neutrinos produced at the SNS. The COHERENT collaboration continues to pursue CEvNS measurements on various targets as well as additional studies of inelastic neutrino-nucleus interactions, searches for accelerator-produced dark matter (DM) and physics beyond the Standard Model, using the uniquely high-quality and high-intensity neutrino source available at the SNS. This white paper describes primarily COHERENT's ongoing and near-future program at the SNS First Target Station (FTS). Opportunities enabled by the SNS Second Target Station (STS) for the study of neutrino physics and development of novel detector technologies are elaborated in a separate white paper.

Submitted 9 April, 2022; originally announced April 2022.

Comments: 38 pages, 24 figures; Snowmass contribution

arXiv:2203.10843 [pdf]

doi 10.3847/1538-4357/ac5fa7

The GALAH Survey: A New Sample of Extremely Metal-Poor Stars Using A Machine Learning Classification Algorithm

Authors: Arvind C. N. Hughes, Lee R. Spitler, Daniel B. Zucker, Thomas Nordlander, Jeffrey Simpson, Gary S. Da Costa, Yuan-Sen Ting, Chengyuan Li, Joss Bland-Hawthorn, Sven Buder, Andrew R. Casey, Gayandhi M. De Silva, Valentina D'Orazi, Ken C. Freeman, Michael R. Hayden, Janez Kos, Geraint F. Lewis, Jane Lin, Karin Lind, Sarah L. Martell, Katharine J. Schlesinger, Sanjib Sharma, Tomaz Zwitter, The GALAH Collaboration

Abstract: Extremely Metal-Poor (EMP) stars provide a valuable probe of early chemical enrichment in the Milky Way. Here we leverage a large sample of $\sim600,000$ high-resolution stellar spectra from the GALAH survey plus a machine learning algorithm to find 54 candidates with estimated [Fe/H] $\leq$ -3.0, 6 of which have [Fe/H] $\leq$ -3.5. Our sample includes $\sim 20 \%$ main sequence EMP candidates, unusually high for EMP surveys. We find the magnitude-limited metallicity distribution function of our sample is consistent with previous work that used more complex selection criteria. The method we present has significant potential for application to the next generation of massive stellar spectroscopic surveys, which will expand the available spectroscopic data well into the millions of stars.

Submitted 8 August, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

Comments: 27 pages, 20 figures, accepted for publication in ApJ, candidate table available at https://github.com/arvhug/GALAH---TSNE_EMP

arXiv:2203.06550 [pdf, other]

Reinforced Imitative Graph Learning for Mobile User Profiling

Authors: Dongjie Wang, Pengyang Wang, Yanjie Fu, Kunpeng Liu, Hui Xiong, Charles E. Hughes

Abstract: Mobile user profiling refers to the efforts of extracting users' characteristics from mobile activities. In order to capture the dynamic varying of user characteristics for generating effective user profiling, we propose an imitation-based mobile user profiling framework. Considering the objective of teaching an autonomous agent to imitate user mobility based on the user's profile, the user profile is the most accurate when the agent can perfectly mimic the user behavior patterns. The profiling framework is formulated into a reinforcement learning task, where an agent is a next-visit planner, an action is a POI that a user will visit next, and the state of the environment is a fused representation of a user and spatial entities. An event in which a user visits a POI will construct a new state, which helps the agent predict users' mobility more accurately. In the framework, we introduce a spatial Knowledge Graph (KG) to characterize the semantics of user visits over connected spatial entities. Additionally, we develop a mutual-updating strategy to quantify the state that evolves over time. Along these lines, we develop a reinforcement imitative graph learning framework for mobile user profiling. Finally, we conduct extensive experiments to demonstrate the superiority of our approach.

Submitted 12 March, 2022; originally announced March 2022.

Comments: TKDE Under Review

arXiv:2201.00896 [pdf, ps, other]

The Inexact Cyclic Block Proximal Gradient Method and Properties of Inexact Proximal Maps

Authors: Leandro Maia, David Huckleberry Gutman, Ryan Christopher Hughes

Abstract: This paper expands the Cyclic Block Proximal Gradient method for block separable composite minimization by allowing for inexactly computed gradients and proximal maps. The resultant algorithm, the Inexact Cyclic Block Proximal Gradient (I-CBPG) method, shares the same convergence rate as its exactly computed analogue provided the allowable errors decrease sufficiently quickly or are pre-selected to be sufficiently small. We provide numerical experiments that showcase the practical computational advantage of I-CBPG for certain fixed tolerances of approximation error and for a dynamically decreasing error tolerance regime in particular. We establish a tight relationship between inexact proximal map evaluations and $\delta$-subgradients in our $\delta$-Second Prox Theorem. This theorem forms the foundation of our convergence analysis and enables us to show that inexact gradient computations and other notions of inexact proximal map computation can be subsumed within a single unifying framework.

Submitted 3 January, 2022; originally announced January 2022.

arXiv:2110.07117 [pdf, other]

doi 10.1098/rsta.2021.0300

FAIR Data Pipeline: provenance-driven data management for traceable scientific workflows

Authors: Sonia Natalie Mitchell, Andrew Lahiff, Nathan Cummings, Jonathan Hollocombe, Bram Boskamp, Ryan Field, Dennis Reddyhoff, Kristian Zarebski, Antony Wilson, Bruno Viola, Martin Burke, Blair Archibald, Paul Bessell, Richard Blackwell, Lisa A Boden, Alys Brett, Sam Brett, Ruth Dundas, Jessica Enright, Alejandra N. Gonzalez-Beltran, Claire Harris, Ian Hinder, Christopher David Hughes, Martin Knight, Vino Mano , et al. (13 additional authors not shown)

Abstract: Modern epidemiological analyses to understand and combat the spread of disease depend critically on access to, and use of, data. Rapidly evolving data, such as data streams changing during a disease outbreak, are particularly challenging. Data management is further complicated by data being imprecisely identified when used. Public trust in policy decisions resulting from such analyses is easily damaged and is often low, with cynicism arising where claims of "following the science" are made without accompanying evidence. Tracing the provenance of such decisions back through open software to primary data would clarify this evidence, enhancing the transparency of the decision-making process. Here, we demonstrate a Findable, Accessible, Interoperable and Reusable (FAIR) data pipeline developed during the COVID-19 pandemic that allows easy annotation of data as they are consumed by analyses, while tracing the provenance of scientific outputs back through the analytical source code to data sources. Such a tool provides a mechanism for the public, and fellow scientists, to better assess the trust that should be placed in scientific evidence, while allowing scientists to support policy-makers in openly justifying their decisions. We believe that tools such as this should be promoted for use across all areas of policy-facing research.

Submitted 4 May, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

arXiv:2110.06741 [pdf, other]

Dynamical Wasserstein Barycenters for Time-series Modeling

Authors: Kevin C. Cheng, Shuchin Aeron, Michael C. Hughes, Eric L. Miller

Abstract: Many time series can be modeled as a sequence of segments representing high-level discrete states, such as running and walking in a human activity application. Flexible models should describe the system state and observations in stationary "pure-state" periods as well as transition periods between adjacent segments, such as a gradual slowdown between running and walking. However, most prior work assumes instantaneous transitions between pure discrete states. We propose a dynamical Wasserstein barycentric (DWB) model that estimates the system state over time as well as the data-generating distributions of pure states in an unsupervised manner. Our model assumes each pure state generates data from a multivariate normal distribution, and characterizes transitions between states via displacement-interpolation specified by the Wasserstein barycenter. The system state is represented by a barycentric weight vector which evolves over time via a random walk on the simplex. Parameter learning leverages the natural Riemannian geometry of Gaussian distributions under the Wasserstein distance, which leads to improved convergence speeds. Experiments on several human activity datasets show that our proposed DWB model accurately learns the generating distribution of pure states while improving state estimation for transition periods compared to the commonly used linear interpolation mixture models.

Submitted 31 October, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

Comments: To appear at Neurips 2021

arXiv:2110.01752 [pdf, other]

RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU

Authors: Geonhwa Jeong, Eric Qin, Ananda Samajdar, Christopher J. Hughes, Sreenivas Subramoney, Hyesoon Kim, Tushar Krishna

Abstract: As AI-based applications become pervasive, CPU vendors are starting to incorporate matrix engines within the datapath to boost efficiency. Systolic arrays have been the premier architectural choice as matrix engines in offload accelerators. However, we demonstrate that incorporating them inside CPUs can introduce under-utilization and stalls due to limited register storage to amortize the fill and drain times of the array. To address this, we propose RASA, Register-Aware Systolic Array. We develop techniques to divide an execution stage into several sub-stages and overlap instructions to hide overheads and run them concurrently. RASA-based designs improve performance significantly with negligible area and power overhead.

Submitted 4 October, 2021; originally announced October 2021.

Comments: This paper is accepted to DAC 2021

arXiv:2109.03601 [pdf, other]

Assessing the Needs of the Quantum Industry

Authors: Ciaran Hughes, Doug Finke, Dan-Adrian German, Celia Merzbacher, Patrick M. Vora, H. J. Lewandowski

Abstract: Quantum information science and technology (QIST) has progressed significantly in the last decade, such that it is no longer solely in the domain of research labs, but is now beginning to be developed for, and applied in, industrial applications and products. With the emergence of this new quantum industry, a new workforce trained in QIST skills and knowledge is needed. To help support education and training of this workforce, universities and colleges require knowledge of the type of jobs available for their students and what skills and degrees are most relevant for those new jobs. Additionally, students need to know how to tailor their degrees to best align with the current needs of the quantum industry. We report on the results from a survey of 57 companies in the quantum industry, with the goal of elucidating the jobs, skills, and degrees that are relevant for this new workforce. We find a range of job opportunities from highly specific jobs, such as quantum algorithm developer and error correction scientist, to broader job categories within the business, software, and hardware sectors. These broader jobs require a range of skills, most of which are not quantum related. Further, except for the highly specific jobs, companies that responded to the survey are looking for a range of degree levels to fill these new positions, from bachelors to masters to PhDs. With this knowledge, students, instructors, and university administrators can make informed decisions about how to address the challenge of increasing the future quantum workforce.

Submitted 25 August, 2021; originally announced September 2021.

arXiv:2108.00080 [pdf, other]

A New Semi-supervised Learning Benchmark for Classifying View and Diagnosing Aortic Stenosis from Echocardiograms

Authors: Zhe Huang, Gary Long, Benjamin Wessler, Michael C. Hughes

Abstract: Semi-supervised image classification has shown substantial progress in learning from limited labeled data, but recent advances remain largely untested for clinical applications. Motivated by the urgent need to improve timely diagnosis of life-threatening heart conditions, especially aortic stenosis, we develop a benchmark dataset to assess semi-supervised approaches to two tasks relevant to cardiac ultrasound (echocardiogram) interpretation: view classification and disease severity classification. We find that a state-of-the-art method called MixMatch achieves promising gains in heldout accuracy on both tasks, learning from a large volume of truly unlabeled images as well as a labeled set collected at great expense to achieve better performance than is possible with the labeled set alone. We further pursue patient-level diagnosis prediction, which requires aggregating across hundreds of images of diverse view types, most of which are irrelevant, to make a coherent prediction. The best patient-level performance is achieved by new methods that prioritize diagnosis predictions from images that are predicted to be clinically-relevant views and transfer knowledge from the view task to the diagnosis task. We hope our released Tufts Medical Echocardiogram Dataset and evaluation framework inspire further improvements in multi-task semi-supervised learning for clinical applications.

Submitted 30 July, 2021; originally announced August 2021.

Comments: To appear in the Proceedings of the Machine Learning for Healthcare (MLHC) conference, 2021. 20 pages (including 7 tables & 3 figures). 13 additional pages of references and supplementary material

arXiv:2107.13379 [pdf, other]

Evaluating the Use of Reconstruction Error for Novelty Localization

Authors: Patrick Feeney, Michael C. Hughes

Abstract: The pixelwise reconstruction error of deep autoencoders is often utilized for image novelty detection and localization under the assumption that pixels with high error indicate which parts of the input image are unfamiliar and therefore likely to be novel. This assumed correlation between pixels with high reconstruction error and novel regions of input images has not been verified and may limit the accuracy of these methods. In this paper we utilize saliency maps to evaluate whether this correlation exists. Saliency maps reveal directly how much a change in each input pixel would affect reconstruction loss, while each pixel's reconstruction error may be attributed to many input pixels when layers are fully connected. We compare saliency maps to reconstruction error maps via qualitative visualizations as well as quantitative correspondence between the top K elements of the maps for both novel and normal images. Our results indicate that reconstruction error maps do not closely correlate with the importance of pixels in the input images, making them insufficient for novelty localization.

Submitted 28 July, 2021; originally announced July 2021.

arXiv:2106.03005 [pdf, other]

A discrete mean-value theorem for the higher derivatives of the Riemann zeta function

Authors: Christopher Hughes, Andrew Pearce-Crump

Abstract: We show that the $n$th derivative of the Riemann zeta function, when summed over the non-trivial zeros of zeta, is real and positive/negative in the mean for $n$ odd/even, respectively. We show this by giving a full asymptotic expansion of these sums.

Submitted 3 March, 2022; v1 submitted 5 June, 2021; originally announced June 2021.

Comments: This version gives a much more detailed explanation for the bound on the error terms, correcting the previous bound. The presentation has been changed in various places to help with the flow and accuracy of our comments

MSC Class: 11M06; 11M26

arXiv:2106.02206 [pdf, other]

Stochastic Iterative Graph Matching

Authors: Linfeng Liu, Michael C. Hughes, Soha Hassoun, Li-Ping Liu

Abstract: Recent works leveraging Graph Neural Networks to approach graph matching tasks have shown promising results. Recent progress in learning discrete distributions poses new opportunities for learning graph matching models. In this work, we propose a new model, Stochastic Iterative Graph MAtching (SIGMA), to address the graph matching problem. Our model defines a distribution of matchings for a graph pair so the model can explore a wide range of possible matchings. We further introduce a novel multi-step matching procedure, which learns how to refine a graph pair's matching results incrementally. The model also includes dummy nodes so that the model does not have to find matchings for nodes without correspondence. We fit this model to data via scalable stochastic optimization. We conduct extensive experiments across synthetic graph datasets as well as biochemistry and computer vision applications. Across all tasks, our results show that SIGMA can produce significantly improved graph matching results compared to state-of-the-art models. Ablation studies verify that each of our components (stochastic training, iterative matching, and dummy nodes) offers noticeable improvement.

Submitted 12 September, 2021; v1 submitted 3 June, 2021; originally announced June 2021.

Comments: ICML 2021

Showing 1–50 of 151 results for author: Hughes, C