Search | arXiv e-print repository

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Authors: Guilherme Penedo, Hynek Kydlíček, Loubna Ben allal, Anton Lozhkov, Margaret Mitchell, Colin Raffel, Leandro Von Werra, Thomas Wolf

Abstract: The performance of a large language model (LLM) depends heavily on the quality and size of its pretraining dataset. However, the pretraining datasets for state-of-the-art open LLMs like Llama 3 and Mixtral are not publicly available and very little is known about how they were created. In this work, we introduce FineWeb, a 15-trillion token dataset derived from 96 Common Crawl snapshots that produces better-performing LLMs than other open pretraining datasets. To advance the understanding of how best to curate high-quality pretraining datasets, we carefully document and ablate all of the design choices used in FineWeb, including in-depth investigations of deduplication and filtering strategies. In addition, we introduce FineWeb-Edu, a 1.3-trillion token collection of educational text filtered from FineWeb. LLMs pretrained on FineWeb-Edu exhibit dramatically better performance on knowledge- and reasoning-intensive benchmarks like MMLU and ARC. Along with our datasets, we publicly release our data curation codebase and all of the models trained during our ablation experiments.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.12974 [pdf, other]

Bounding irrelevant operators in the 3d Gross-Neveu-Yukawa CFTs

Authors: Matthew S. Mitchell, David Poland

Abstract: We perform a numerical bootstrap study of scalar operators in the critical 3d Gross-Neveu-Yukawa models, a family of conformal field theories containing N Majorana fermions in the fundamental representation of an O(N) global symmetry. We compute rigorous bounds on the scaling dimensions of the next-to-lowest parity-even and parity-odd singlet scalars at N = 2, 4, and 8. All of these dimensions have lower bounds greater than 3, implying that there are only two relevant singlet scalars and placing constraints on the RG flow structure of these theories.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 13 pages, 4 figures

arXiv:2406.12951 [pdf]

Reviewing climate change attribution in UK natural hazards and their impacts

Authors: Regan Mudhar, Dann M. Mitchell, Peter A. Stott, Richard A. Betts

Abstract: The field of Detection and Attribution is rapidly moving beyond weather and climate, and towards incorporating hazards and their impacts on natural and human systems. Here, we review the comprehensive literature base relevant for the UK ahead of the next Climate Change Risk Assessment. The current literature highlights a detectable and non-trivial influence of climate change in many UK impact sectors already - notably health, agriculture, and infrastructure. We found that heatwaves were the most studied hazard overall, with a unanimous consensus on a strong attributable signal of human-induced climate change in their increased frequency and intensity over the last century. The most notable gap identified overall was in attributing climate-related impacts to human influence, with a few impact studies for only a handful of the hazards assessed. Furthermore, just under half of the 29 hazards were not found to have any UK-relevant attribution studies, with most of the remainder having three or fewer. This review highlights requirements for and opportunities to develop attribution science to meet the needs of the UK. Diversifying hazards and impacts studied, in conjunction with the techniques and approaches used, will undoubtedly benefit the community.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2405.13974 [pdf, other]

CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models

Authors: Giada Pistilli, Alina Leidinger, Yacine Jernite, Atoosa Kasirzadeh, Alexandra Sasha Luccioni, Margaret Mitchell

Abstract: This paper introduces the "CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset, designed to evaluate the social and cultural variation of Large Language Models (LLMs) across multiple languages and value-sensitive topics. We create a hand-crafted, multilingual dataset of value-laden prompts which address specific socially sensitive topics, including LGBTQI rights, social welfare, immigration, disability rights, and surrogacy. CIVICS is designed to generate responses showing LLMs' encoded and implicit values. Through our dynamic annotation processes, tailored prompt design, and experiments, we investigate how open-weight LLMs respond to value-sensitive issues, exploring their behavior across diverse linguistic and cultural contexts. Using two experimental set-ups based on log-probabilities and long-form responses, we show social and cultural variability across different LLMs. Specifically, experiments involving long-form responses demonstrate that refusals are triggered disparately across models, but consistently and more frequently in English or translated statements. Moreover, specific topics and sources lead to more pronounced differences across model answers, particularly on immigration, LGBTQI rights, and social welfare. As shown by our experiments, the CIVICS dataset aims to serve as a tool for future research, promoting reproducibility and transparency across broader linguistic settings, and furthering the development of AI technologies that respect and reflect global cultural diversities and value pluralism.
The CIVICS dataset and tools will be made available upon publication under open licenses; an anonymized version is currently available at https://huggingface.co/CIVICS-dataset.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.10715 [pdf, other]

Functionalized mm-scale vapor cells for alkali-metal spectroscopy and magnetometry

Authors: Harini Raghavan, Michael C. D. Tayler, Kostas Mouloudakis, Rachel Rae, Sami Lähteenmäki, Rasmus Zetter, Petteri Laine, Jacques Haesler, Laurent Balet, Thomas Overstolz, Sylvain Karlen, Morgan W. Mitchell

Abstract: We describe micro-fabricated rubidium vapor cells with integrated temperature-control functionality and demonstrate their suitability for use in miniaturized ultra-sensitive magnetometers. These functionalized vapor cells (FVCs) embody a dual-chamber design in low-conductivity silicon with anti-permeation coatings and micro-structured thin-film platinum surface traces as resistive heaters and temperature sensors. Thermal tests show our ability to control alkali metal distribution within the FVCs, ensuring a clean sensing chamber for optical measurements. Optical absorption spectroscopy is used to correlate the temperature readings with vapor density and to measure buffer gas pressure, of interest for optimizing sensitivity. Finally, we demonstrate zero-field resonance magnetometry with 18 fT/Hz$^{1/2}$ sensitivity in the 10 Hz to 100 Hz band, limited by laser noise and magnetic shield noise, which indicates that the functionalization does not introduce significant magnetic noise.

Submitted 21 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

Comments: Authors HR and MCDT contributed equally

arXiv:2405.08833 [pdf, other]

GHz-rate optical phase shift in light matter interaction-engineered, silicon-ferroelectric nematic liquid crystals

Authors: Iman Taghavi, Omid Esmaeeli, Sheri Jahan Chowdhury, Matthew Mitchell, Donald Witt, Cory Pecinovsky, Jason Sickler, Nicolas A. F. Jaeger, Sudip Shekhar, Lukas Chrostowski

Abstract: Organic electro-optic (OEO) materials have demonstrated promising performance in developing electro-optic phase shifters (EOPS) and modulators compared to their inorganic counterparts. Integration with other devices in a silicon photonic (SiP) process, simple nanofabrication, and temperature/aging robustness remain to be developed for this class of hybrid material platforms. In particular, electro-optic (EO) polymers need an electro-thermal poling method, which has limited their potential and utilization in large-scale SiP. Devices made of paraelectric nematic liquid crystals (PN-LC), another primary type of OEO material, feature a very efficient but slow phase shift mechanism. We present a general-purpose EOPS that applies to various modulator embodiments to address these concerns. Based on that, we report a GHz-fast phase shift in a newly discovered family of OEO, namely ferroelectric nematic liquid crystals (FN-LC), which finally enables liquid crystals to have significant second-order nonlinear optical coefficients and associated Pockels effect. The new material avoids poling issues associated with EO polymers and can pave the way for hybrid silicon-OEO systems with CMOS-foundry compatibility. Furthermore, we propose a finger-loaded, non-slotted waveguide that enhances light-matter interaction, allowing us to achieve DC and AC modulation efficiencies of $\approx$ 0.25 V.mm and 25.7 V.mm, respectively, an on-chip insertion loss of $\approx$ 4 dB, and an EO bandwidth of f$_{-6dB}$ >4.18 GHz.
The remaining figures of merit for our poling-free EOPS are equivalent to EO polymer-enabled devices with fewer manufacturing difficulties. We demonstrate an electrically and photonically packaged chip that contains >100 silicon-FN-LC modulators to evaluate the large-scale integration of our poling-free phase shifters and modulators.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 21 pages, 5 figures

arXiv:2404.14345 [pdf, other]

Laser-written micro-channel atomic magnetometer

Authors: Andrea Zanoni, Kostas Mouloudakis, Michael C. D. Tayler, Giacomo Corrielli, Roberto Osellame, Morgan W. Mitchell, Vito Giovanni Lucivero

Abstract: We demonstrate a sensitive optically-pumped magnetometer using rubidium vapor and 0.75 amg of nitrogen buffer gas in a sub-mm-width sensing channel excavated by femtosecond laser writing followed by chemical etching. The channel is buried less than 1 mm below the surface of its fused silica host material, which also includes reservoir chambers and micro-strainer connections, to preserve a clean optical environment. Using a zero-field-resonance magnetometry strategy and a sensing volume of 2.25 mm$^3$, we demonstrate a sensitivity of $\approx$ 1 $\mathrm{pT}/\sqrt{\mathrm{Hz}}$ at $10$ Hz. The device can be integrated with photonic structures and microfluidic channels with 3D versatility. Its sensitivity, bandwidth and stand-off distance will enable detection of localized fields from magnetic nanoparticles and \mul NMR samples.

Submitted 22 April, 2024; originally announced April 2024.

Comments: 6 pages, 5 figures

arXiv:2403.18516 [pdf, other]

doi 10.3847/2515-5172/ad382a

Multi-View Deep Learning for Imaging Atmospheric Cherenkov Telescopes

Authors: Hannes Warnhofer, Samuel T. Spencer, Alison M. W. Mitchell

Abstract: This research note concerns the application of deep-learning-based multi-view-imaging techniques to data from the H.E.S.S. Imaging Atmospheric Cherenkov Telescope array. We find that the earlier the fusion of layer information from different views takes place in the neural network, the better our model performs with this data. Our analysis shows that the point in the network where the information from the different views is combined is far more important for the model performance than the method used to combine the information.

Submitted 27 March, 2024; originally announced March 2024.

Comments: Accepted in Research Notes of the American Astronomical Society. 3 Pages, 1 Figure

Journal ref: 2024 Res. Notes AAS 8 91

arXiv:2403.16650 [pdf, other]

Probing Stellar Clusters from Gaia DR2 as Galactic PeVatrons: I -- Expected Gamma-ray and Neutrino Emission

Authors: Alison M. W. Mitchell, Giovanni Morlino, Silvia Celli, Stefano Menchiari, Andreas Specovius

Abstract: Young & massive stellar clusters (SCs) are a potential source of galactic cosmic rays up to very high energies as a result of two possible acceleration scenarios. Collective stellar winds from massive member stars form a wind-blown bubble with a termination shock (TS) at which particle acceleration to PeV energies may occur. Furthermore, shock acceleration may occur at SNRs expanding inside the bubble. By applying a model of CR acceleration at both the wind TS and SNR shocks to catalogues of known SCs derived from Gaia DR2, we identify the most promising targets to search for evidence of PeVatron activity. Predictions for the secondary fluxes of gamma-ray and neutrino emission are derived based on particle acceleration at the collective wind TS and the subsequent hadronic interactions with the surrounding medium. Predictions from our modelling under baseline and optimistic scenarios are compared to data, finding consistent results. We estimate the detection prospects for future gamma-ray and neutrino experiments. We find that degree-scale angular sizes of the wind-blown bubbles are typical, that may pose a challenge for experimental detection. A shortlist of the most promising candidates is provided, with an anticipated flux range. Of order 10 SCs may be detectable with future facilities, and 1-5 could be currently operating as PeVatrons. Of these, three gamma-ray detected SCs have data within our predicted range. Our model can consistently describe gamma-ray measurements of SC emission.
Several further as-yet-undetected SCs offer promising targets for future observations, although the flux range allowed by our model can be large (> factor 10). The large angular size of the wind-blown bubble may lead to low surface brightness emission, worsening the problem of source confusion. Nevertheless, we discuss how further work will help to constrain SCs as PeVatron candidates. (abridged)

Submitted 25 March, 2024; originally announced March 2024.

Comments: Submitted to Astronomy and Astrophysics. 19 pages, 10 figures, 5 tables

arXiv:2403.08674 [pdf, other]

Quantum jump photodetector for narrowband photon counting with a single atom

Authors: Laura Zarraoa, Romain Veyron, Tomas Lamich, Morgan W. Mitchell

Abstract: Using a single neutral \textsuperscript{87}Rb atom held in an optical trap, and "quantum jump" detection of single-photon-initiated state changes, we demonstrate an intrinsically-narrowband single-photon detector, of interest for separating weak signals from strong optical background. Using novel statistical analysis, we measure quantum efficiency of \SI{2.9+-0.2e-3}{}, a record for single-pass quantum jump production, and dark counts of \SI{9+-20e-3}{counts\per\second} during passive accumulation plus \SI{1.8+-0.1e-2}{counts} per readout, orders of magnitude below those of traditional single-photon detectors. The \SI{6}{\mega\hertz} detection bandwidth is orders of magnitude narrower than existing atomic filters. Available methods can substantially improve \QJPDAcronym{} quantum efficiency, dark counts, bandwidth, and tunability.

Submitted 13 March, 2024; originally announced March 2024.

Comments: 7 pages, 3 figures

arXiv:2403.03212 [pdf, other]

Performance of a modular ton-scale pixel-readout liquid argon time projection chamber

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, D. A. Andrade, et al. (1340 additional authors not shown)

Abstract: The Module-0 Demonstrator is a single-phase 600 kg liquid argon time projection chamber operated as a prototype for the DUNE liquid argon near detector. Based on the ArgonCube design concept, Module-0 features a novel 80k-channel pixelated charge readout and advanced high-coverage photon detection system. In this paper, we present an analysis of an eight-day data set consisting of 25 million cosmic ray events collected in the spring of 2021. We use this sample to demonstrate the imaging performance of the charge and light readout systems as well as the signal correlations between the two. We also report argon purity and detector uniformity measurements, and provide comparisons to detector simulations.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 47 pages, 41 figures

Report number: FERMILAB-PUB-24-0073-LBNF

arXiv:2402.10766 [pdf, other]

Live magnetic observation of parahydrogen hyperpolarization dynamics

Authors: James Eills, Morgan W. Mitchell, Irene Marco Rius, Michael C. D. Tayler

Abstract: Hyperpolarized nuclear spins in molecules exhibit high magnetization that is unachievable by classical polarization techniques, making them widely used as sensors in physics, chemistry, and medicine. The state of a hyperpolarized material, however, is typically only studied indirectly and with partial destruction of magnetization, due to the nature of conventional detection by resonant-pickup nuclear magnetic resonance spectroscopy or imaging. Here we establish atomic magnetometers with sub-pT sensitivity as an alternative modality to detect in real time the complex dynamics of hyperpolarized materials without disturbing or interrupting the magnetogenesis process. As an example of dynamics that are impossible to detect in real time by conventional means, we examine parahydrogen-induced $^{1}$H and $^{13}$C magnetization during adiabatic eigenbasis transformations at \si{\micro\tesla}-field avoided crossings. Continuous but nondestructive magnetometry reveals previously unseen spin dynamics, fidelity limits, and magnetization back-action effects. As a second example, we apply magnetometry to observe the chemical-exchange-driven $^{13}$C hyperpolarization of [1--$^{13}$C]-pyruvate -- the most important spin tracer for clinical metabolic imaging. Our approach can be readily combined with other high-sensitivity magnetometers and is applicable to a broader range of general observation scenarios involving production, transport and systems interaction of hyperpolarized compounds.

Submitted 22 May, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

arXiv:2402.10746 [pdf, other]

Spin projection noise and the magnetic sensitivity of optically pumped magnetometers

Authors: K. Mouloudakis, V. Koutrouli, I. K. Kominis, M. W. Mitchell, G. Vasilakis

Abstract: Present protocols for obtaining the ultimate magnetic sensitivity of optically pumped magnetometers (OPMs) utilizing alkali-metal ensembles rely on uncorrelated atoms in stretched states. A new approach for calculating the spin projection noise (SPN)-limited signal to noise ratio (SNR) and the magnetic sensitivity of OPMs is proposed. Our model is based solely on the mean-field density matrix dynamics and in contrast to previous models, it applies to both low and high field regimes, it takes into account the degree of spin polarization, the intra- and interhyperfine correlations, the decoherence processes, the atom-light coupling and the effects of the spin dynamics on the spin-noise spectra. Fine tuning of the probe frequency allows us to explore different hyperfine states and ground-state correlations. Especially in the spin-exchange-relaxation-free (SERF) regime, alongside the magnetic resonance narrowing and the increased number density, hallmarks of SERF magnetometers, we report on a new SERF feature: the reduction of spin-projection noise at the spin precession frequency as a consequence of strongly-correlated hyperfine spins that attenuate and redistribute SPN when properly probed.

Submitted 22 February, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

arXiv:2402.08955 [pdf, other]

Using Counterfactual Tasks to Evaluate the Generality of Analogical Reasoning in Large Language Models

Authors: Martha Lewis, Melanie Mitchell

Abstract: Large language models (LLMs) have performed well on several reasoning benchmarks, including ones that test analogical reasoning abilities. However, it has been debated whether they are actually performing humanlike abstract reasoning or instead employing less general processes that rely on similarity to what has been seen in their training data. Here we investigate the generality of analogy-making abilities previously claimed for LLMs (Webb, Holyoak, & Lu, 2023). We take one set of analogy problems used to evaluate LLMs and create a set of "counterfactual" variants: versions that test the same abstract reasoning abilities but that are likely dissimilar from any pre-training data. We test humans and three GPT models on both the original and counterfactual problems, and show that, while the performance of humans remains high for all the problems, the GPT models' performance declines sharply on the counterfactual set. This work provides evidence that, despite previously reported successes of LLMs on analogical reasoning, these models lack the robustness and generality of human analogy-making.

Submitted 14 February, 2024; originally announced February 2024.

arXiv:2402.01568 [pdf, other]

Doping Liquid Argon with Xenon in ProtoDUNE Single-Phase: Effects on Scintillation Light

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, H. Amar Es-sghir, P. Amedo, J. Anderson, D. A. Andrade, C. Andreopoulos, et al. (1300 additional authors not shown)

Abstract: Doping of liquid argon TPCs (LArTPCs) with a small concentration of xenon is a technique for light-shifting and facilitates the detection of the liquid argon scintillation light. In this paper, we present the results of the first doping test ever performed in a kiloton-scale LArTPC. From February to May 2020, we carried out this special run in the single-phase DUNE Far Detector prototype (ProtoDUNE-SP) at CERN, featuring 770 t of total liquid argon mass with 410 t of fiducial mass. The goal of the run was to measure the light and charge response of the detector to the addition of xenon, up to a concentration of 18.8 ppm. The main purpose was to test the possibility for reduction of non-uniformities in light collection, caused by deployment of photon detectors only within the anode planes. Light collection was analysed as a function of the xenon concentration, by using the pre-existing photon detection system (PDS) of ProtoDUNE-SP and an additional smaller set-up installed specifically for this run. In this paper we first summarize our current understanding of the argon-xenon energy transfer process and the impact of the presence of nitrogen in argon with and without xenon dopant. We then describe the key elements of ProtoDUNE-SP and the injection method deployed. Two dedicated photon detectors were able to collect the light produced by xenon and the total light. The ratio of these components was measured to be about 0.65 as 18.8 ppm of xenon were injected.
We performed studies of the collection efficiency as a function of the distance between tracks and light detectors, demonstrating enhanced uniformity of response for the anode-mounted PDS. We also show that xenon doping can substantially recover light losses due to contamination of the liquid argon by nitrogen.

Submitted 9 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: 35 pages, 20 figures

Report number: CERN-EP-2024-024; FERMILAB-PUB-23-0819-LBNF

arXiv:2401.10376 [pdf, other]

PAC Code Rate-Profile Design Using Search-Constrained Optimization Algorithms

Authors: Mohsen Moradi, David G. M. Mitchell

Abstract: In this paper, we introduce a novel rate-profile design based on search-constrained optimization techniques to assess the performance of polarization-adjusted convolutional (PAC) codes under Fano (sequential) decoding. The results demonstrate that the resulting PAC code offers much reduced computational complexity compared to a construction based on a conventional genetic algorithm without a performance loss in error-correction performance. As the fitness function of our algorithm, we propose an adaptive successive cancellation list decoding algorithm to determine the weight distribution of the rate profiles. The simulation results indicate that, for a PAC(256, 128) code, only 8% of the population requires that their fitness function be evaluated with a large list size. This represents an improvement of almost 92% over a conventional evolutionary algorithm. For a PAC(64, 32) code, this improvement is about 99%. We also plotted the performance of the high-rate PAC(128, 105) and PAC(64, 51) codes, and the results show that they exhibit superior performance compared to other algorithms.

Submitted 18 January, 2024; originally announced January 2024.

arXiv:2401.10284 [pdf, other]

MorpheusNet: Resource efficient sleep stage classifier for embedded on-line systems

Authors: Ali Kavoosi, Morgan P. Mitchell, Raveen Kariyawasam, John E. Fleming, Penny Lewis, Heidi Johansen-Berg, Hayriye Cagnan, Timothy Denison

Abstract: Sleep Stage Classification (SSC) is a labor-intensive task, requiring experts to examine hours of electrophysiological recordings for manual classification. This is a limiting factor when it comes to leveraging sleep stages for therapeutic purposes. With increasing affordability and expansion of wearable devices, automating SSC may enable deployment of sleep-based therapies at scale. Deep Learning has gained increasing attention as a potential method to automate this process. Previous research has shown accuracy comparable to manual expert scores. However, previous approaches require a sizable amount of memory and computational resources. This constrains the ability to classify in real time and deploy models on the edge. To address this gap, we aim to provide a model capable of predicting sleep stages in real-time, without requiring access to external computational sources (e.g., mobile phone, cloud). The algorithm is power efficient to enable use on embedded battery powered systems. Our compact sleep stage classifier can be deployed on most off-the-shelf microcontrollers (MCU) with constrained hardware settings. This is due to the memory footprint of our approach requiring significantly fewer operations. The model was tested on three publicly available data bases and achieved performance comparable to the state of the art, whilst reducing model complexity by orders of magnitude (up to 280 times smaller compared to state of the art). We further optimized the model with quantization of parameters to 8 bits with only an average drop of 0.95% in accuracy.
When implemented in firmware, the quantized model achieves a latency of 1.6 seconds on an Arm CortexM4 processor, allowing its use for on-line SSC-based therapies.

Submitted 14 January, 2024; originally announced January 2024.

Comments: This paper was presented at the 2023 IEEE conference on Systems, Man, and Cybernetics (SMC)

arXiv:2401.05265 [pdf, other]

Stability of superfluids in tilted optical lattices with periodic driving

Authors: Robbie Cruickshank, Andrea Di Carli, Matthew Mitchell, Arthur La Rooij, Stefan Kuhr, Charles E. Creffield, Elmar Haller

Abstract: Tilted lattice potentials with periodic driving play a crucial role in the study of artificial gauge fields and topological phases with ultracold quantum gases. However, driving-induced heating and the growth of phonon modes restrict their use for probing interacting many-body states. Here, we experimentally investigate phonon modes and interaction-driven instabilities of superfluids in the lowest band of a shaken optical lattice. We identify stable and unstable parameter regions and provide a general resonance condition. In contrast to the high-frequency approximation of a Floquet description, we use the superfluids' micromotion to analyze the growth of phonon modes from slow to fast driving frequencies. Our observations enable the prediction of stable parameter regimes for quantum-simulation experiments aimed at studying driven systems with strong interactions over extended time scales.

Submitted 10 January, 2024; originally announced January 2024.

Comments: 12 pages, 9 figures

arXiv:2312.12256 [pdf, other]

Cavity-resonated detection of spin polarization in a microfabricated atomic vapor cell

Authors: María Hernández Ruiz, Yintao Ma, Hana Medhat, Vito Giovanni Lucivero, Morgan W. Mitchell

Abstract: We demonstrate continuous Pound-Drever-Hall (PDH) nondestructive monitoring of the electron spin polarization of an atomic vapor in a microfabricated vapor cell within an optical resonator. The two-chamber silicon and glass cell contains $^{87}$Rb and 1.3 amagat of N$_{2}$ buffer gas, and is placed within a planar optical resonator formed by two mirrors with dichroic dielectric coatings to resonantly enhance the coupling to phase-modulated probe light near the D$_2$ line at 780 nm. We describe the theory of signal generation in this system, including the spin-dependent complex refractive index, cavity optical transfer functions, and PDH signal response to spin polarization. We observe cavity transmission and PDH signals across $\approx 200$ GHz of detuning around the atomic resonance line. By resonant optical pumping on the 795 nm D$_1$ line, we observe spin-dependent cavity line shifts, in good agreement with theory. We use the saturation of the line shift vs. optical pumping power to calibrate the number density and efficiency of the optical pumping. In the unresolved sideband regime, we observe quantum-noise-limited PDH readout of the spin polarization density, with a flat noise floor of $9 \times 10^9$ spins cm$^{-3}$ Hz$^{-1/2}$ for frequencies above 700 Hz. We note possible extensions of the technique.

Submitted 19 December, 2023; originally announced December 2023.

arXiv:2312.09323 [pdf, other]

Perspectives on the State and Future of Deep Learning - 2023

Authors: Micah Goldblum, Anima Anandkumar, Richard Baraniuk, Tom Goldstein, Kyunghyun Cho, Zachary C Lipton, Melanie Mitchell, Preetum Nakkiran, Max Welling, Andrew Gordon Wilson

Abstract: The goal of this series is to chronicle opinions and issues in the field of machine learning as they stand today and as they change over time. The plan is to host this survey periodically until the AI singularity paperclip-frenzy-driven doomsday, keeping an updated list of topical questions and interviewing new community members for each edition. In this issue, we probed people's opinions on interpretable AI, the value of benchmarking in modern NLP, the state of progress towards understanding deep learning, and the future of academia.

Submitted 18 December, 2023; v1 submitted 7 December, 2023; originally announced December 2023.

arXiv:2312.03452 [pdf, other]

Telling different unravelings apart via nonlinear quantum-trajectory averages

Authors: Eloy Piñol, Th. K. Mavrogordatos, Dustin Keys, Romain Veyron, Piotr Sierant, Miguel Angel García-March, Samuele Grandi, Morgan W. Mitchell, Jan Wehr, Maciej Lewenstein

Abstract: The Gorini-Kossakowski-Sudarshan-Lindblad master equation (ME) governs the density matrix of open quantum systems (OQSs). When an OQS is subjected to weak continuous measurement, its state evolves as a stochastic quantum trajectory, whose statistical average solves the ME. The ensemble of such trajectories is termed an unraveling of the ME. We propose a method to operationally distinguish unravelings produced by the same ME in different measurement scenarios, using nonlinear averages of observables over trajectories. We apply the method to the paradigmatic quantum nonlinear system of resonance fluorescence in a two-level atom. We compare the Poisson-type unraveling, induced by direct detection of photons scattered from the two-level emitter, and the Wiener-type unraveling, induced by phase-sensitive detection of the emitted field. We show that a quantum-trajectory-averaged variance is able to distinguish these measurement scenarios. We evaluate the performance of the method, which can be readily extended to more complex OQSs, under a range of realistic experimental conditions.

Submitted 8 May, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

Comments: 5 pages, 3 figures, with supplementary material, revised version

arXiv:2312.03130 [pdf, other]

The DUNE Far Detector Vertical Drift Technology, Technical Design Report

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, H. Amar, P. Amedo, J. Anderson, D. A. Andrade, C. Andreopoulos , et al. (1304 additional authors not shown)

Abstract: DUNE is an international experiment dedicated to addressing some of the questions at the forefront of particle physics and astrophysics, including the mystifying preponderance of matter over antimatter in the early universe. The dual-site experiment will employ an intense neutrino beam focused on a near and a far detector as it aims to determine the neutrino mass hierarchy and to make high-precision measurements of the PMNS matrix parameters, including the CP-violating phase. It will also stand ready to observe supernova neutrino bursts, and seeks to observe nucleon decay as a signature of a grand unified theory underlying the standard model. The DUNE far detector implements liquid argon time-projection chamber (LArTPC) technology, and combines the many tens-of-kiloton fiducial mass necessary for rare event searches with the sub-centimeter spatial resolution required to image those events with high precision. The addition of a photon detection system enhances physics capabilities for all DUNE physics drivers and opens prospects for further physics explorations. Given its size, the far detector will be implemented as a set of modules, with LArTPC designs that differ from one another as newer technologies arise. In the vertical drift LArTPC design, a horizontal cathode bisects the detector, creating two stacked drift volumes in which ionization charges drift towards anodes at either the top or bottom. The anodes are composed of perforated PCB layers with conductive strips, enabling reconstruction in 3D. Light-trap-style photon detection modules are placed both on the cryostat's side walls and on the central cathode where they are optically powered. This Technical Design Report describes in detail the technical implementations of each subsystem of this LArTPC that, together with the other far detector modules and the near detector, will enable DUNE to achieve its physics goals.

Submitted 5 December, 2023; originally announced December 2023.

Comments: 425 pages; 281 figures Central editing team: A. Heavey, S. Kettell, A. Marchionni, S. Palestini, S. Rajogopalan, R. J. Wilson

Report number: Fermilab Report no: TM-2813-LBNF

arXiv:2311.18088 [pdf, other]

NuSTAR Hard X-ray Monitoring of Gravitationally Lensed Quasar RX J1131-1231

Authors: Cora A. DeFrancesco, Xinyu Dai, Mark Mitchell, Abderahmen Zoghbi, Christopher W. Morgan

Abstract: The X-ray emission from active galactic nuclei (AGN) is believed to come from a combination of inverse Compton scattering of photons from the accretion disk and reprocessing of the direct X-ray emission by reflection. We present hard (10-80 keV) and soft (0.5-8 keV) X-ray monitoring of a gravitationally lensed quasar RX J1131-1231 with NuSTAR, Swift, and XMM-Newton between 10 June 2016 and 30 November 2020. Comparing the amplitude of quasar microlensing variability at the hard and soft bands allows a size comparison, where larger sources lead to smaller microlensing variability. During the period between 6 June 2018 and 30 November 2020, where both the hard and soft light curves are available, the hard and soft bands varied by factors of 3.7 and 5.5, respectively, with rms variability of $0.40\pm0.05$ and $0.57\pm0.02$. Both the variability amplitude and rms are moderately smaller for the hard X-ray emission, indicating that the hard X-ray emission is moderately larger than the soft X-ray emission region. We found the reflection fraction from seven joint hard and soft X-ray monitoring epochs is effectively consistent with a constant with low significance variability. After decomposing the total X-ray flux into direct and reprocessed components, we find a smaller variability amplitude for the reprocessed flux compared to the direct emission. The power-law cutoff energy is constrained at 96$^{+47}_{-24}$ keV, which position the system in the allowable parameter space due to the pair production limit.

Submitted 29 November, 2023; originally announced November 2023.

Comments: 12 pages, 6 figures, 3 tables, accepted for publication by the Astrophysical Journal

arXiv:2311.16896 [pdf, other]

65 GOPS/neuron Photonic Tensor Core with Thin-film Lithium Niobate Photonics

Authors: Zhongjin Lin, Bhavin J. Shastri, Shangxuan Yu, Jingxiang Song, Yuntao Zhu, Arman Safarnejadian, Wangning Cai, Yanmei Lin, Wei Ke, Mustafa Hammood, Tianye Wang, Mengyue Xu, Zibo Zheng, Mohammed Al-Qadasi, Omid Esmaeeli, Mohamed Rahim, Grzegorz Pakulski, Jens Schmid, Pedro Barrios, Weihong Jiang, Hugh Morison, Matthew Mitchell, Xiaogang Qiang, Xun Guan, Nicolas A. F. Jaeger, et al. (6 additional authors not shown)

Abstract: Photonics offers a transformative approach to artificial intelligence (AI) and neuromorphic computing by providing low latency, high bandwidth, and energy-efficient computations. Here, we introduce a photonic tensor core processor enabled by time-multiplexed inputs and charge-integrated outputs. This fully integrated processor, comprising only two thin-film lithium niobate (TFLN) modulators, a III-V laser, and a charge-integration photoreceiver, can implement an entire layer of a neural network. It can execute 65 billion operations per second (GOPS) per neuron, including simultaneous weight updates-a hitherto unachieved speed. Our processor stands out from conventional photonic processors, which have static weights set during training, as it supports fast "hardware-in-the-loop" training, and can dynamically adjust the inputs (fan-in) and outputs (fan-out) within a layer, thereby enhancing its versatility. Our processor can perform large-scale dot-product operations with vector dimensions up to 131,072. Furthermore, it successfully classifies (supervised learning) and clusters (unsupervised learning) 112*112-pixel images after "hardware-in-the-loop" training. To handle "hardware-in-the-loop" training for clustering AI tasks, we provide a solution for multiplications involving two negative numbers based on our processor.

Submitted 30 November, 2023; v1 submitted 28 November, 2023; originally announced November 2023.

Comments: 19 pages, 6 figures

MSC Class: 78A05

arXiv:2311.09247 [pdf, other]

Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks

Authors: Melanie Mitchell, Alessandro B. Palmarini, Arseny Moskvichev

Abstract: We explore the abstract reasoning abilities of text-only and multimodal versions of GPT-4, using the ConceptARC benchmark [10], which is designed to evaluate robust understanding and reasoning with core-knowledge concepts. We extend the work of Moskvichev et al. [10] by evaluating GPT-4 on more detailed, one-shot prompting (rather than simple, zero-shot prompts) with text versions of ConceptARC tasks, and by evaluating GPT-4V, the multimodal version of GPT-4, on zero- and one-shot prompts using image versions of the simplest tasks. Our experimental results support the conclusion that neither version of GPT-4 has developed robust abstraction abilities at humanlike levels.

Submitted 11 December, 2023; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: Corrected Figure 3 (extra spaces were replaced by commas, which were lost in original formatting)

Journal ref: Proceedings of the LLM-CP Workshop, AAAI 2024

arXiv:2310.18007 [pdf, other]

LHAASO J2108+5157 as a Molecular Cloud Illuminated by a Supernova Remnant

Authors: A. M. W. Mitchell

Abstract: The search for Galactic PeVatrons - astrophysical accelerators of cosmic rays to PeV energies - has entered a new phase in recent years with the discovery of the first Ultra-High-Energy (UHE, $E>100$ TeV) gamma-ray sources by the HAWC and LHAASO experiments. Establishing whether the emission is leptonic or hadronic in nature, however, requires multiwavelength data and modelling studies. Among the currently known UHE sources, LHAASO J2108+5157 is an enigmatic source without clear association to a plausible accelerator, yet spatially coincident with molecular clouds. We investigate the scenario of a molecular cloud illuminated by cosmic rays accelerated in a nearby supernova remnant (SNR) as an explanation for LHAASO J2108+5157. We aim to constrain the required properties of the SNR as well as which of the clouds identified in the vicinity is the most likely association. We use a model for cosmic ray acceleration in SNRs, their transport through the interstellar medium and subsequent interaction with molecular material, to predict the corresponding gamma-ray emission. The parameter space of SNR properties is explored to find the most plausible parameter combination that can account for the gamma-ray spectrum of LHAASO J2108+5157. In the case that a SNR is illuminating the cloud, we find that it must be young ($<10$ kyr) and located within $40-60$ pc of the cloud. A SN scenario with a low Sedov time is preferred, with a maximum proton energy of 3 PeV assumed. No SNRs matching these properties are currently known, although an as yet undetected SNR remains feasible. The galactic CR sea is insufficient to solely account for the observed flux, such that a PeVatron accelerator must be present in the vicinity.

Submitted 27 October, 2023; originally announced October 2023.

Comments: 7 pages, 4 figures, 3 tables. Accepted for publication in A&A

arXiv:2310.13956 [pdf, other]

Uniaxial compression of 3D printed samples with voids: laboratory measurements compared with predictions from Effective Medium Theory

Authors: Filip P. Adamus, Ashley Stanton-Yonge, Thomas M. Mitchell, David Healy, Philip G. Meredith

Abstract: 3D printing technology offers the possibility of producing synthetic samples with accurately defined microstructures. As indicated by effective medium theory (EMT), the shapes, orientations, and sizes of voids significantly affect the overall elastic response of a solid body. By performing uniaxial compression tests on twenty types of 3D-printed samples containing voids of different geometries, we examine whether the measured effective elasticities are accurately predicted by EMT. To manufacture the sample, we selected printers that use different technologies; fused deposition modelling (FDM), and stereolithography (SLA). We show how printer settings (FDM case) or sample cure time (SLA case) affect the measured properties. We also examine the reproducibility of elasticity tests on identically designed samples. To obtain the range of theoretical predictions, we assume either uniform strain or uniform stress. Our study of over two hundred samples shows that measured effective elastic moduli can fit EMT predictions with an error of less than 5% using both FDM and SLA methods if certain printing specifications and sample design considerations are taken into account. Notably, we find that the pore volume fraction of the designed samples should be above 1% to induce a measurable softening effect, but below 5% to produce accurate EMT estimations that fit the measured elastic properties of the samples. Our results highlight both the strengths of EMT for predicting the effective properties of solids with low pore fraction volume microstructural configurations, and the limitations for high porosity microstructures.

Submitted 21 October, 2023; originally announced October 2023.

Comments: 43 pages, 19 figs, 9 tables

arXiv:2310.07429 [pdf, other]

doi 10.22323/1.444.0690

Hadronic Re-Acceleration at the Crab Pulsar Wind Termination Shock as a Source of PeV Gamma-Rays

Authors: Samuel T. Spencer, Alison M. W. Mitchell, Brian Reville

Abstract: Recent results from LHAASO and Tibet AS$γ$ suggest that the Crab Nebula's gamma-ray spectrum extends to the PeV energy range, however the production mechanisms of this highest energy emission remain unclear. It has been postulated that a secondary component of hadronic emission could explain the highest energy gamma-ray flux points, however the origin and acceleration mechanism for this hadronic population has yet to be explained. We postulate one scenario in which hadrons diffuse over time into the Crab pulsar wind nebula from the surrounding supernova ejecta, and are subsequently re-accelerated by the pulsar wind termination shock. We present results of direct particle transport simulations (including radial evolution) to determine if this scenario is viable over the lifetime of the Crab system.

Submitted 11 October, 2023; originally announced October 2023.

Comments: 7 pages, 2 figures. From proceedings of the 38th International Cosmic Ray Conference (ICRC2023)

Journal ref: PoS(ICRC2023)690

arXiv:2310.04487 [pdf, other]

Fiber-taper collected emission from NV centers in high-$Q/V$ diamond microdisks

Authors: Tamiko Masuda, J. P. E. Hadden, David P. Lake, Matthew Mitchell, Sigurd Flågan, Paul E. Barclay

Abstract: Fiber-coupled microdisks are a promising platform for enhancing the spontaneous emission from color centers in diamond. The measured cavity-enhanced emission from the microdisk is governed by the effective volume ($V$) of each cavity mode, the cavity quality factor ($Q$), and the coupling between the microdisk and the fiber. Here we observe photoluminescence from an ensemble of nitrogen-vacancy centers into high $Q/V$ microdisk modes, which when combined with coherent spectroscopy of the microdisk modes, allows us to elucidate the relative contributions of these factors. The broad emission spectrum acts as an internal light source facilitating mode identification over several cavity free spectral ranges. Analysis of the fiber-taper collected microdisk emission reveals spectral filtering both by the cavity and the fiber-taper, the latter of which we find preferentially couples to higher-order microdisk modes. Coherent mode spectroscopy is used to measure $Q\sim 1\times10^{5}$ -- the highest reported values for diamond microcavities operating at visible wavelengths. With realistic optimization of the microdisk dimensions, we predict that Purcell factors of $\sim 50$ are within reach.

Submitted 6 October, 2023; originally announced October 2023.

Comments: 15 pages, 4 figures

arXiv:2310.01557 [pdf, other]

SmartPlay: A Benchmark for LLMs as Intelligent Agents

Authors: Yue Wu, Xuan Tang, Tom M. Mitchell, Yuanzhi Li

Abstract: Recent large language models (LLMs) have demonstrated great potential toward intelligent agents and next-gen automation, but there currently lacks a systematic benchmark for evaluating LLMs' abilities as agents. We introduce SmartPlay: both a challenging benchmark and a methodology for evaluating LLMs as agents. SmartPlay consists of 6 different games, including Rock-Paper-Scissors, Tower of Hanoi, Minecraft. Each game features a unique setting, providing up to 20 evaluation settings and infinite environment variations. Each game in SmartPlay uniquely challenges a subset of 9 important capabilities of an intelligent LLM agent, including reasoning with object dependencies, planning ahead, spatial reasoning, learning from history, and understanding randomness. The distinction between the set of capabilities each game test allows us to analyze each capability separately. SmartPlay serves not only as a rigorous testing ground for evaluating the overall performance of LLM agents but also as a road-map for identifying gaps in current methodologies. We release our benchmark at github.com/Microsoft/SmartPlay

Submitted 17 March, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

arXiv:2308.16717 [pdf, other]

Joint H.E.S.S. and Fermi-LAT analysis of the region around PSR J1813-1749

Authors: T. Wach, A. M. W. Mitchell, V. Joshi, S. Funk

Abstract: HESS J1813-178 is one of the brightest sources detected during the first HESS Galactic Plane survey. The compact source, also detected by MAGIC, is believed to be a pulsar wind nebula powered by one of the most powerful pulsars known in the Galaxy, PSR J1813-1749 with a spin-down luminosity of $\dot{\mathrm{E}} = 5.6 \cdot 10^{37}\,\mathrm{erg}\,\mathrm{s}^{-1}$. With its extreme physical properties, as well as the pulsar's young age of 5.6 kyrs, the $γ$-rays detected in this region allow us to study the evolution of a highly atypical system. Previous studies of the region in the GeV energy range show emission extended beyond the size of the compact H.E.S.S. source. Using the archival H.E.S.S. data with improved background methods, we perform a detailed morphological and spectral analysis of the region. Additionally to the compact, bright emission component, we find significantly extended emission, whose position is coincident with HESS J1813-178. We reanalyse the region in GeV and derive a joint-model in order to find a continuous description of the emission in the region from GeV to TeV. Using the results derived in this analysis, as well as X-ray and radio data of the region, we perform multi-wavelength spectral modeling. Possible hadronic or leptonic origins of the $γ$-ray emission are investigated, and the diffusion parameters necessary to explain the extended emission are examined.

Submitted 31 August, 2023; originally announced August 2023.

Comments: 8 pages, 4 figures, 1 table, In proceedings of ICRC2023

Journal ref: Proceedings of the 38th International Cosmic Ray Conference, PoS(ICRC2023)589

arXiv:2308.16669 [pdf, other]

Modelling of highly extended Gamma-ray emission around the Geminga Pulsar as detected with H.E.S.S

Authors: A. M. W. Mitchell, S. Caroff

Abstract: Geminga is an enigmatic radio-quiet gamma-ray pulsar located at a mere 250 pc distance from Earth. Extended very-high-energy gamma-ray emission around the pulsar has been detected by multiple water Cherenkov detector based instruments. However, the detection of extended TeV gamma-ray emission around the Geminga pulsar has proven challenging for IACTs due to the angular scale exceeding the typical field-of-view. By detailed studies of background estimation techniques and characterising systematic effects, a detection of highly extended TeV gamma-ray emission could be confirmed by the H.E.S.S. IACT array. Building on the previously announced detection, in this contribution we further characterise the emission and apply an electron diffusion model to the combined gamma-ray data from the H.E.S.S. and HAWC experiments, as well as X-ray data from XMM-Newton.

Submitted 31 August, 2023; originally announced August 2023.

Comments: 8 pages, 5 figures. In proceedings of ICRC2023 (see also arXiv:2304.02631)

Journal ref: Proceedings of the 38th International Cosmic Ray Conference (ICRC2023) PoS(ICRC2023)590

arXiv:2308.14214 [pdf, other]

doi 10.1103/PhysRevApplied.21.034054

Multiparameter quantum sensing and magnetic communications with a hybrid dc and rf optically pumped magnetometer

Authors: Michał Lipka, Aleksandra Sierant, Charikleia Troullinou, Morgan Mitchell

Abstract: We introduce and demonstrate a hybrid optically pumped magnetometer (HOPM) that simultaneously measures one dc field component and one rf field component quadrature with a single atomic spin ensemble. The HOPM achieves sub-pT/$\sqrt{\mathrm{Hz}}$ sensitivity for both dc and rf fields, and is limited in sensitivity by spin projection noise at low frequencies and by photon shot noise at high frequencies. We demonstrate with the HOPM a new application of multiparameter quantum sensing: background-cancelling spread spectrum magnetic communication. We encode a digital message as rf amplitude, spread among sixteen channels from 29 kHz to 33 kHz in a noisy magnetic environment, and observe quantum-noise-limited rf magnetic signal recovery enabled by quantum-noise-limited dc noise cancellation, reaching noise rejection of 15 dB at 100 Hz and more than 20 dB at 60 Hz and below. We measure signal fidelity versus signal strength and extrinsic noise in communication of a short text message. The combination of high sensitivity, quantum-noise-limited performance, and real-world application potential makes the HOPM ideally suited for study of high-performance multiparameter quantum sensing.

Submitted 26 March, 2024; v1 submitted 27 August, 2023; originally announced August 2023.

Comments: 10 pages, 6 figures

Journal ref: Phys. Rev. Applied 21, 034054, 2024

arXiv:2308.13090 [pdf, ps, other]

doi 10.1103/PhysRevA.108.052822

Inter-species spin-noise correlations in hot atomic vapors

Authors: K. Mouloudakis, F. Vouzinas, A. Margaritakis, A. Koutsimpela, G. Mouloudakis, V. Koutrouli, M. Skotiniotis, G. P. Tsironis, M. Loulakis, M. W. Mitchell, G. Vasilakis, I. K. Kominis

Abstract: We report an experimental and theoretical study of spin noise correlations in a $^{87}$Rb-$^{133}$Cs unpolarized alkali-metal vapor dominated by spin-exchange collisions. We observe strong unequal-time inter-species correlations and account for these with a first-principles theoretical model. Since the two atomic species have different spin precession frequencies, the dual-species vapor enables the use of an additional experimental handle, the applied magnetic field, for untangling various sub-types of spin correlations. In particular, the measured cross-correlation and auto-correlation spectra shed light on a number of spin-dynamic effects involving intra-atom, inter-atom, intra-species and inter-species correlations. Cross-correlation coefficients exceeding $60\%$ have been observed at low-magnetic fields, where the two spin species couple strongly via spin-exchange collisions. The understanding of such spontaneously generated correlations can motivate the design of quantum-enhanced measurements with single or multi-species spin-polarized alkali-metal vapors used in quantum sensing applications.

Submitted 15 October, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

arXiv:2308.12933 [pdf, other]

doi 10.1103/PhysRevLett.131.133602

Quantum-enhanced magnetometry at optimal number density

Authors: Charikleia Troullinou, Vito Giovanni Lucivero, Morgan W. Mitchell

Abstract: We study the use of squeezed probe light and evasion of measurement back-action to enhance the sensitivity and measurement bandwidth of an optically-pumped magnetometer (OPM) at sensitivity-optimal atom number density. By experimental observation, and in agreement with quantum noise modeling, a spin-exchange-limited OPM probed with off-resonance laser light is shown to have an optimal sensitivity determined by density-dependent quantum noise contributions. Application of squeezed probe light boosts the OPM sensitivity beyond this laser-light optimum, allowing the OPM to achieve sensitivities that it cannot reach with coherent-state probing at any density. The observed quantum sensitivity enhancement at optimal number density is enabled by measurement back-action evasion.

Submitted 24 August, 2023; originally announced August 2023.

Comments: 5 pages + 3 supplementary, 5 figures

Journal ref: Phys. Rev. Lett. 131, 133602 (2023)

arXiv:2308.09130 [pdf, other]

Feedback Enhanced Phonon Lasing of a Microwave Frequency Resonator

Authors: Peyman Parsa, Prasoon Kumar Shandilya, David P. Lake, Matthew E. Mitchell, Paul E. Barclay

Abstract: The amplitude of self-oscillating mechanical resonators in cavity optomechanical systems is typically limited by nonlinearities arising from the cavity's finite optical bandwidth. We propose and demonstrate a feedback technique for increasing this limit. By modulating the cavity input field with a signal derived from its output intensity, we increase the amplitude of a self-oscillating GHz frequency mechanical resonator by $22\%$ (increase in coherent phonon number of $50\%$) limited only by the achievable optomechanical cooperativity of the system. This technique will advance applications dependent on high dynamic mechanical stress, such as coherent spin-phonon coupling, as well as implementations of sensors based on self-oscillating resonators.

Submitted 17 August, 2023; originally announced August 2023.

arXiv:2308.01509 [pdf, other]

The Influence of Satellite Trails on H.E.S.S. Gamma-Ray Astronomical Observations

Authors: Samuel T. Spencer, Thomas Lang, Alison M. W. Mitchell

Abstract: The number of satellites launched into low earth orbit has almost tripled (to over 4000) in the last three years due to the increasing commercialisation of space. Satellite constellations with a total of over 400,000 satellites are proposed to be launched in the near future. Many of these satellites are highly reflective, resulting in a high optical brightness that affects ground-based astronomical observations across the electromagnetic spectrum. Despite this, the potential effect of these satellites on Imaging Atmospheric Cherenkov Telescopes (IACTs) has so far been assumed to be negligible due to their nanosecond integration times. This has, however, never been verified. We aim to identify satellite trails in data taken by the High Energy Stereoscopic System (H.E.S.S.) IACT array in Namibia, using Night Sky Background (NSB) data from the CT5 camera installed in 2019. We determine which observation times and pointing directions are affected the most, and evaluate the impact on Hillas parameters used for classification and reconstruction of high-energy Extensive Air Shower events. Finally, we predict how future planned satellite launches will affect gamma-ray observations with IACTs.

Submitted 2 August, 2023; originally announced August 2023.

Comments: 8 pages, 5 figures, 1 table. Short summary of a full paper which can be found here: arXiv:2307.13293

Journal ref: Proceedings of the 38th International Cosmic Ray Conference (ICRC2023). PoS(ICRC2023)694

arXiv:2307.16869 [pdf, other]

doi 10.1103/PhysRevA.109.L040802

Anomalous noise spectra in a spin-exchange-relaxation-free alkali-metal vapor

Authors: K. Mouloudakis, J. Kong, A. Sierant, E. Arkin, M. Hernández Ruiz, R. Jiménez-Martínez, M. W. Mitchell

Abstract: We perform spin-noise spectroscopy on an unpolarized $^{87}\mathrm{Rb}$ vapor in the spin-exchange-relaxation-free (SERF) regime. We observe noise spectral distributions that deviate strongly from Lorentzian models that accurately describe lower-density regimes. For example, at magnetic fields of $\sim 1 \mathrm{μT}$ and $^{87}\mathrm{Rb}$ densities $\gtrsim 1 \times 10^{14} \rm{atoms/cm^{3}}$ we observe an asymmetric spin-noise distribution in which the resonance line is depleted by about half its power, with the diverted power becoming a broad spectral component that could be mistaken for optical shot noise. The results are in good agreement with recent models accounting for correlations between the ground hyperfine states. We discuss implications for quantum sensing and absolute noise calibration in spin-squeezing and entanglement detection.

Submitted 28 March, 2024; v1 submitted 31 July, 2023; originally announced July 2023.

arXiv:2307.14347 [pdf, other]

doi 10.3847/2515-5172/ace41f

Photometry of Type II Supernova SN 2023ixf with a Worldwide Citizen Science Network

Authors: Lauren A. Sgro, Thomas M. Esposito, Guillaume Blaclard, Sebastian Gomez, Franck Marchis, Alexei V. Filippenko, Daniel O'Conner Peluso, Stephen S. Lawrence, Aad Verveen, Andreas Wagner, Anouchka Nardi, Barbara Wiart, Benjamin Mirwald, Bill Christensen, Bob Eramia, Bruce Parker, Bruno Guillet, Byungki Kim, Chelsey A. Logan, Christopher C. M. Kyba, Christopher Toulmin, Claudio G. Vantaggiato, Dana Adhis, Dave Gary, Dave Goodey , et al. (66 additional authors not shown)

Abstract: We present highly sampled photometry of the supernova (SN) 2023ixf, a Type II SN in M101, beginning 2 days before its first known detection. To gather these data, we enlisted the global Unistellar Network of citizen scientists. These 252 observations from 115 telescopes show the SN's rising brightness associated with shock emergence followed by gradual decay. We measure a peak $M_{V}$ = -18.18 $\pm$ 0.09 mag at 2023-05-25 21:37 UTC in agreement with previously published analyses.

Submitted 7 July, 2023; originally announced July 2023.

Comments: 4 pages, 1 figure

Journal ref: Res. Notes AAS 7 141 (2023)

arXiv:2307.14213 [pdf, other]

Soft Air Pocket Force Sensors for Large Scale Flexible Robots

Authors: Michael R. Mitchell, Ciera McFarland, Margaret M. Coad

Abstract: Flexible robots have advantages over rigid robots in their ability to conform physically to their environment and to form a wide variety of shapes. Sensing the force applied by or to flexible robots is useful for both navigation and manipulation tasks, but it is challenging due to the need for the sensors to withstand the robots' shape change without encumbering their functionality. Also, for robots with long or large bodies, the number of sensors required to cover the entire surface area of the robot body can be prohibitive due to high cost and complexity. We present a novel soft air pocket force sensor that is highly flexible, lightweight, relatively inexpensive, and easily scalable to various sizes. Our sensor produces a change in internal pressure that is linear with the applied force. We present results of experimental testing of how uncontrollable factors (contact location and contact area) and controllable factors (initial internal pressure, thickness, size, and number of interior seals) affect the sensitivity. We demonstrate our sensor applied to a vine robot (a soft inflatable robot that "grows" from the tip via eversion), and we show that the robot can successfully grow and steer towards an object with which it senses contact.

Submitted 26 July, 2023; originally announced July 2023.

Comments: M. R. Mitchell, C. McFarland, and M. M. Coad, "Soft Air Pocket Force Sensors for Large Scale Flexible Robots," in IEEE International Conference on Soft Robotics, 2023, pp. 1-8. Video: https://youtu.be/2De0htilW74

arXiv:2307.13905 [pdf, other]

Reinforcement Learning for Sequential Decoding of Generalized LDPC Codes

Authors: Salman Habib, David G. M. Mitchell

Abstract: In this work, we propose reinforcement learning (RL) for sequential decoding of moderate length generalized low-density parity-check (GLDPC) codes. Here, sequential decoding refers to scheduling all the generalized constraint nodes (GCNs) and single parity-check nodes (SPCNs) of a GLDPC code serially in each iteration. A GLDPC decoding environment is modeled as a finite Markov decision process (MDP) in which the state-space comprises all possible sequences of hard-decision values of the variable nodes (VNs) connected to the scheduled GCN or SPCN, and the action-space of the MDP consists of all possible actions (GCN and SPCN scheduling). The goal of RL is to determine an optimized scheduling policy, i.e., one that results in a decoded codeword by minimizing the complexity of the belief propagation (BP) decoder. For training, we consider the proportion of correct bits at the output of the GCN or SPCN as a reward once it is scheduled. The expected rewards for scheduling all the GCNs/SPCNs in the code's Tanner graph are earned via BP decoding during the RL phase. The proposed RL-based decoding scheme is shown to significantly outperform the standard BP flooding decoder, as well as a sequential decoder in which the GCNs/SPCNs are scheduled randomly.

Submitted 25 July, 2023; originally announced July 2023.

Comments: accepted for publication at ISTC 2023. arXiv admin note: text overlap with arXiv:2112.13934

arXiv:2307.13293 [pdf, other]

doi 10.1051/0004-6361/202347200

Impact of Satellite Trails on H.E.S.S. Astronomical Observations

Authors: Thomas Lang, Samuel T. Spencer, Alison M. W. Mitchell

Abstract: The number of satellites launched into Earth's orbit has almost tripled in the last three years due to the increasing commercialisation of space. Multiple satellite constellations, consisting of over 400,000 individual satellites, have either been partially launched or are proposed for launch in the near future. Many of these satellites are highly reflective, resulting in a high optical brightness that affects ground-based astronomical observations. Despite this caveat, the potential effect of these satellites on gamma-ray-observing Imaging Atmospheric Cherenkov Telescopes (IACTs) has largely been assumed to be negligible due to their nanosecond-scale integration times. However, this assumption has not been verified to date. As IACTs are sensitive to optical wavelength light, we aim to identify satellite trails in data taken by the High Energy Stereoscopic System (H.E.S.S.) IACT array. In particular, this study is aimed at quantifying the potential effects on data quality and extensive air shower event classification and reconstruction. Using night sky background measurements from H.E.S.S., we determined which observation times and pointing directions are affected most by these satellite trails. We then evaluated their impact on the standard Hillas parameter variables used for event analysis. Due to the brightest trails, false trigger events can occur; however, for most modern analyses, the effect on astronomical results will be minimal. We observe a mild increase in the rate of trail detections over time, which is partially correlated with the number of satellite launches. Overall, the fraction of H.E.S.S. data affected is currently minimal. We note that these trails could still have a non-negligible effect on future Cherenkov Telescope Array observations if advanced analysis techniques designed to lower the energy threshold of the instrument are applied.

Submitted 21 September, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

Comments: 11 pages, 14 figures, 5 tables. Replaced with published version. Reproduced with permission from Astronomy & Astrophysics, © ESO

Journal ref: A&A 677, A141 (2023)

arXiv:2306.09924 [pdf, other]

Cosmic ray processes in galactic ecosystems

Authors: Ellis R. Owen, Kinwah Wu, Yoshiyuki Inoue, H. -Y. Karen Yang, Alison M. W. Mitchell

Abstract: Galaxy evolution is an important topic, and our physical understanding must be complete to establish a correct picture. This includes a thorough treatment of feedback. The effects of thermal-mechanical and radiative feedback have been widely considered; however, cosmic rays (CRs) are also powerful energy carriers in galactic ecosystems. Resolving the capability of CRs to operate as a feedback agent is therefore essential to advance our understanding of the processes regulating galaxies. The effects of CRs are yet to be fully understood, and their complex multi-channel feedback mechanisms operating across the hierarchy of galaxy structures pose a significant technical challenge. This review examines the role of CRs in galaxies, from the scale of molecular clouds to the circum-galactic medium. An overview of their interaction processes, their implications for galaxy evolution, and their observable signatures is provided and their capability to modify the thermal and hydrodynamic configuration of galactic ecosystems is discussed. We present recent advancements in our understanding of CR processes and interpretation of their signatures, and highlight where technical challenges and unresolved questions persist. We discuss how these may be addressed with upcoming opportunities.

Submitted 12 July, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

Comments: 78 pages, 11 figures, Review Article accepted for publication in Galaxies. Special Issue "A Trip across the Universe: Our Present Knowledge and Future Perspectives"

Report number: RIKEN-iTHEMS-Report-23

arXiv:2306.05949 [pdf, other]

Evaluating the Social Impact of Generative AI Systems in Systems and Society

Authors: Irene Solaiman, Zeerak Talat, William Agnew, Lama Ahmad, Dylan Baker, Su Lin Blodgett, Canyu Chen, Hal Daumé III, Jesse Dodge, Isabella Duan, Ellie Evans, Felix Friedrich, Avijit Ghosh, Usman Gohar, Sara Hooker, Yacine Jernite, Ria Kalluri, Alberto Lusoli, Alina Leidinger, Michelle Lin, Xiuzhu Lin, Sasha Luccioni, Jennifer Mickel, Margaret Mitchell, Jessica Newman , et al. (6 additional authors not shown)

Abstract: Generative AI systems across modalities, ranging from text (including code), image, audio, and video, have broad social impacts, but there is no official standard for means of evaluating those impacts or for which impacts should be evaluated. In this paper, we present a guide that moves toward a standard approach in evaluating a base generative AI system for any modality in two overarching categories: what can be evaluated in a base system independent of context and what can be evaluated in a societal context. Importantly, this refers to base systems that have no predetermined application or deployment context, including a model itself, as well as system components, such as training data. Our framework for a base system defines seven categories of social impact: bias, stereotypes, and representational harms; cultural values and sensitive content; disparate performance; privacy and data protection; financial costs; environmental costs; and data and content moderation labor costs. Suggested methods for evaluation apply to listed generative modalities and analyses of the limitations of existing evaluations serve as a starting point for necessary investment in future evaluations. We offer five overarching categories for what can be evaluated in a broader societal context, each with its own subcategories: trustworthiness and autonomy; inequality, marginalization, and violence; concentration of authority; labor and creativity; and ecosystem and environment. Each subcategory includes recommendations for mitigating harm.

Submitted 28 June, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: Forthcoming in Hacker, Engel, Hammer, Mittelstadt (eds), Oxford Handbook on the Foundations and Regulation of Generative AI. Oxford University Press

arXiv:2305.18615 [pdf, other]

doi 10.1145/3593013.3594002

Stronger Together: on the Articulation of Ethical Charters, Legal Tools, and Technical Documentation in ML

Authors: Giada Pistilli, Carlos Munoz Ferrandis, Yacine Jernite, Margaret Mitchell

Abstract: The growing need for accountability of the people behind AI systems can be addressed by leveraging processes in three fields of study: ethics, law, and computer science. While these fields are often considered in isolation, they rely on complementary notions in their interpretation and implementation. In this work, we detail this interdependence and motivate the necessary role of collaborative governance tools in shaping a positive evolution of AI. We first contrast notions of compliance in the ethical, legal, and technical fields; we outline both their differences and where they complement each other, with a particular focus on the roles of ethical charters, licenses, and technical documentation in these interactions. We then focus on the role of values in articulating the synergies between the fields and outline specific mechanisms of interaction between them in practice. We identify how these mechanisms have played out in several open governance fora: an open collaborative workshop, a responsible licensing initiative, and a proposed regulatory framework. By leveraging complementary notions of compliance in these three domains, we can create a more comprehensive framework for governing AI systems that jointly takes into account their technical capabilities, their impact on society, and how technical specifications can inform relevant regulations. Our analysis thus underlines the necessity of joint consideration of the ethical, legal, and technical in AI ethics frameworks to be used on a larger scale to govern AI systems and how the thinking in each of these areas can inform the others.

Submitted 9 May, 2023; originally announced May 2023.

arXiv:2305.07141 [pdf, other]

The ConceptARC Benchmark: Evaluating Understanding and Generalization in the ARC Domain

Authors: Arseny Moskvichev, Victor Vikram Odouard, Melanie Mitchell

Abstract: The abilities to form and abstract concepts are key to human intelligence, but such abilities remain lacking in state-of-the-art AI systems. There has been substantial research on conceptual abstraction in AI, particularly using idealized domains such as Raven's Progressive Matrices and Bongard problems, but even when AI systems succeed on such problems, the systems are rarely evaluated in depth to see if they have actually grasped the concepts they are meant to capture. In this paper we describe an in-depth evaluation benchmark for the Abstraction and Reasoning Corpus (ARC), a collection of few-shot abstraction and analogy problems developed by Chollet [2019]. In particular, we describe ConceptARC, a new, publicly available benchmark in the ARC domain that systematically assesses abstraction and generalization abilities on a number of basic spatial and semantic concepts. ConceptARC differs from the original ARC dataset in that it is specifically organized around "concept groups" -- sets of problems that focus on specific concepts and that vary in complexity and level of abstraction. We report results on testing humans on this benchmark as well as three machine solvers: the top two programs from a 2021 ARC competition and OpenAI's GPT-4. Our results show that humans substantially outperform the machine solvers on this benchmark, showing abilities to abstract and generalize concepts that are not yet captured by AI systems. We believe that this benchmark will spur improvements in the development of AI systems for conceptual abstraction and in the effective evaluation of such systems.

Submitted 11 May, 2023; originally announced May 2023.

Journal ref: Transactions on Machine Learning Research, 8/2023

arXiv:2304.13626 [pdf, other]

The Roles of Symbols in Neural-based AI: They are Not What You Think!

Authors: Daniel L. Silver, Tom M. Mitchell

Abstract: We propose that symbols are first and foremost external communication tools used between intelligent agents that allow knowledge to be transferred in a more efficient and effective manner than having to experience the world directly. But, they are also used internally within an agent through a form of self-communication to help formulate, describe and justify subsymbolic patterns of neural activity that truly implement thinking. Symbols, and our languages that make use of them, not only allow us to explain our thinking to others and ourselves, but also provide beneficial constraints (inductive bias) on learning about the world. In this paper we present relevant insights from neuroscience and cognitive science, about how the human brain represents symbols and the concepts they refer to, and how today's artificial neural networks can do the same. We then present a novel neuro-symbolic hypothesis and a plausible architecture for intelligent agents that combines subsymbolic representations for symbols and concepts for learning and reasoning. Our hypothesis and associated architecture imply that symbols will remain critical to the future of intelligent systems NOT because they are the fundamental building blocks of thought, but because they are characterizations of subsymbolic processes that constitute thought.

Submitted 26 April, 2023; originally announced April 2023.

Comments: 28 pages

arXiv:2303.17853 [pdf, other]

Can AI Put Gamma-Ray Astrophysicists Out of a Job?

Authors: Samuel T. Spencer, Vikas Joshi, Alison M. W. Mitchell

Abstract: In what will likely be a litany of generative-model-themed arXiv submissions celebrating April the 1st, we evaluate the capacity of state-of-the-art transformer models to create a paper detailing the detection of a Pulsar Wind Nebula with a non-existent Imaging Atmospheric Cherenkov Telescope (IACT) Array. We do this to evaluate the ability of such models to interpret astronomical observations and sources based on language information alone, and to assess potential means by which fraudulently generated scientific papers could be identified during peer review (given that reliable generative model watermarking has yet to be deployed for these tools). We conclude that our jobs as astronomers are safe for the time being. From this point on, prompts given to ChatGPT and Stable Diffusion are shown in orange, text generated by ChatGPT is shown in black, whereas analysis by the (human) authors is in blue.

Submitted 4 April, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

arXiv:2303.17007 [pdf]

doi 10.1103/PhysRevD.107.112012

Impact of cross-section uncertainties on supernova neutrino spectral parameter fitting in the Deep Underground Neutrino Experiment

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, Z. Ahmad, J. Ahmed, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, P. Amedo, J. Anderson, D. A. Andrade , et al. (1294 additional authors not shown)

Abstract: A primary goal of the upcoming Deep Underground Neutrino Experiment (DUNE) is to measure the $\mathcal{O}(10)$ MeV neutrinos produced by a Galactic core-collapse supernova if one should occur during the lifetime of the experiment. The liquid-argon-based detectors planned for DUNE are expected to be uniquely sensitive to the $ν_e$ component of the supernova flux, enabling a wide variety of physics and astrophysics measurements. A key requirement for a correct interpretation of these measurements is a good understanding of the energy-dependent total cross section $σ(E_ν)$ for charged-current $ν_e$ absorption on argon. In the context of a simulated extraction of supernova $ν_e$ spectral parameters from a toy analysis, we investigate the impact of $σ(E_ν)$ modeling uncertainties on DUNE's supernova neutrino physics sensitivity for the first time. We find that the currently large theoretical uncertainties on $σ(E_ν)$ must be substantially reduced before the $ν_e$ flux parameters can be extracted reliably: in the absence of external constraints, a measurement of the integrated neutrino luminosity with less than 10% bias with DUNE requires $σ(E_ν)$ to be known to about 5%. The neutrino spectral shape parameters can be known to better than 10% for a 20% uncertainty on the cross-section scale, although they will be sensitive to uncertainties on the shape of $σ(E_ν)$. A direct measurement of low-energy $ν_e$-argon scattering would be invaluable for improving the theoretical precision to the needed level.

Submitted 7 July, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

Comments: 25 pages, 21 figures

Report number: FERMILAB-PUB-23-132-CSAID-LBNF-ND-T

Journal ref: Phys. Rev. D 107, 112012 (2023)

arXiv:2303.11408 [pdf, other]

Stable Bias: Analyzing Societal Representations in Diffusion Models

Authors: Alexandra Sasha Luccioni, Christopher Akiki, Margaret Mitchell, Yacine Jernite

Abstract: As machine learning-enabled Text-to-Image (TTI) systems are becoming increasingly prevalent and seeing growing adoption as commercial services, characterizing the social biases they exhibit is a necessary first step to lowering their risk of discriminatory outcomes. This evaluation, however, is made more difficult by the synthetic nature of these systems' outputs: common definitions of diversity are grounded in social categories of people living in the world, whereas the artificial depictions of fictive humans created by these systems have no inherent gender or ethnicity. To address this need, we propose a new method for exploring the social biases in TTI systems. Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts, and comparing it to the variation engendered by spanning different professions. This allows us to (1) identify specific bias trends, (2) provide targeted scores to directly compare models in terms of diversity and representation, and (3) jointly model interdependent social variables to support a multidimensional analysis. We leverage this method to analyze images generated by 3 popular TTI systems (Dall-E 2, Stable Diffusion v 1.4 and 2) and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents. We also release the datasets and low-code interactive bias exploration platforms developed for this work, as well as the necessary tools to similarly evaluate additional TTI systems.

Submitted 9 November, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

Comments: Accepted to NeurIPS Datasets and Benchmarks 2023 (spotlight)

Showing 1–50 of 408 results for author: Mitchell, M