AtLAST Science Overview Report

Authors: Mark Booth, Pamela Klaassen, Claudia Cicone, Tony Mroczkowski, Martin A. Cordiner, Luca Di Mascolo, Doug Johnstone, Eelco van Kampen, Minju M. Lee, Daizhong Liu, John Orlowski-Scherer, Amélie Saintonge, Matthew W. L. Smith, Alexander Thelen, Sven Wedemeyer, Kazunori Akiyama, Stefano Andreon, Doris Arzoumanian, Tom J. L. C. Bakx, Caroline Bot, Geoffrey Bower, Roman Brajša, Chian-Chou Chen, Elisabete da Cunha, David Eden, et al. (59 additional authors not shown)

Abstract: Submillimeter and millimeter wavelengths provide a unique view of the Universe, from the gas and dust that fills and surrounds galaxies to the chromosphere of our own Sun. Current single-dish facilities have presented a tantalising view of the brightest (sub-)mm sources, and interferometers have provided the exquisite resolution necessary to analyse the details in small fields, but there are still many open questions that cannot be answered with current facilities. In this report we summarise the science that is guiding the design of the Atacama Large Aperture Submillimeter Telescope (AtLAST). We demonstrate how transformational advances in topics including star formation in high redshift galaxies, the diffuse circumgalactic medium, Galactic ecology, cometary compositions and solar flares motivate the need for a 50m, single-dish telescope with a 1-2 degree field of view and a new generation of highly multiplexed continuum and spectral cameras. AtLAST will have the resolution to drastically lower the confusion limit compared to current single-dish facilities, whilst also being able to rapidly map large areas of the sky and detect extended, diffuse structures. Its high sensitivity and large field of view will open up the field of submillimeter transient science by increasing the probability of serendipitous detections. Finally, the science cases listed here motivate the need for a highly flexible operations model capable of short observations of individual targets, large surveys, monitoring programmes, target of opportunity observations and coordinated observations with other observatories. AtLAST aims to be a sustainable, upgradeable, multipurpose facility that will deliver orders of magnitude increases in sensitivity and mapping speeds over current and planned submillimeter observatories.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 47 pages, 12 figures. For further details on AtLAST see https://atlast.uio.no

arXiv:2406.16241 [pdf, other]

Position: Benchmarking is Limited in Reinforcement Learning Research

Authors: Scott M. Jordan, Adam White, Bruno Castro da Silva, Martha White, Philip S. Thomas

Abstract: Novel reinforcement learning algorithms, or improvements on existing ones, are commonly justified by evaluating their performance on benchmark environments and are compared to an ever-changing set of standard algorithms. However, despite numerous calls for improvements, experimental practices continue to produce misleading or unsupported claims. One reason for the ongoing substandard practices is that conducting rigorous benchmarking experiments requires substantial computational time. This work investigates the sources of increased computation costs in rigorous experiment designs. We show that conducting rigorous performance benchmarks will likely have computational costs that are often prohibitive. As a result, we argue for using an additional experimentation paradigm to overcome the limitations of benchmarking.

Submitted 23 June, 2024; originally announced June 2024.

Comments: 19 pages, 13 figures, The Forty-first International Conference on Machine Learning (ICML 2024)

arXiv:2406.16146 [pdf, other]

Optical Tweezer Arrays of Erbium Atoms

Authors: D. S. Grün, S. J. M. White, A. Ortu, A. Di Carli, H. Edri, M. Lepers, M. J. Mark, F. Ferlaino

Abstract: We present the first successful trapping of single erbium atoms in an array of optical tweezers. Using a single narrow-line optical transition, we achieve deep cooling for direct tweezer loading, pairwise ejection, and continuous imaging without additional recoil suppression techniques. Our tweezer wavelength choice enables us to reach the magic trapping condition by tuning the ellipticity of the trapping light. Additionally, we implement an ultrafast high-fidelity fluorescence imaging scheme using a broad transition, allowing time-resolved study of the tweezer population dynamics from many to single atoms during light-assisted collisions. In particular, we extract a pair-ejection rate that qualitatively agrees with the semiclassical predictions by the Gallagher-Pritchard model. This work represents a promising starting point for the exploration of erbium as a powerful resource for quantum simulation in optical tweezers.

Submitted 23 June, 2024; originally announced June 2024.

arXiv:2406.14638 [pdf, other]

Giant post-flare loops in active regions with extremely strong coronal magnetic fields

Authors: Costas E. Alissandrakis, Gregory D. Fleishman, Viktor V. Fedenev, Stephen M. White, Alexander T. Altyntsev

Abstract: We report for the first time the detection of thermal free-free emission from post-flare loops at 34GHz in images from the Nobeyama Radioheliograph (NoRH). We studied 8 loops, 7 of which were from regions with extremely strong coronal magnetic field reported by Fedenev et al. (2023). Loop emission was observed in a wide range of wavelength bands, up to soft X-rays, confirming their multi-temperature structure and was associated with noise storm emission in metric wavelengths. The comparison of the 17GHz emission with that at 34GHz, after a calibration correction of the latter, showed that the emission was optically thin at both frequencies. We describe the structure and evolution of the loops and we computed their density, obtaining values for the top of the loops between 1 and 6 x 10^10 cm^-3, noticeably varying from one loop to another and in the course of the evolution of the same loop system; these values have only a weak dependence on the assumed temperature, 2 x 10^6 K in our case, as we are in the optically thin regime. Our density values are above those reported from EUV observations, which go up to about 10^10 cm^-3. This difference could be due to the fact that different emitting regions are sampled in the two domains and/or due to the more accurate diagnostics in the radio range, which do not suffer from inherent uncertainties arising from abundances and non-LTE excitation/ionization equilibria. We also estimated the magnetic field in the loop tops to be in the range of 10 to 30G.

Submitted 20 June, 2024; originally announced June 2024.

Comments: Accepted for publication in the Astrophysical Journal

arXiv:2406.13079 [pdf, other]

Predicting the 21 cm field with a Hybrid Effective Field Theory approach

Authors: Danial Baradaran, Boryana Hadzhiyska, Martin J. White, Noah Sailer

Abstract: A detection of the 21 cm signal can provide a unique window of opportunity for uncovering complex astrophysical phenomena at the epoch of reionization and placing constraints on cosmology at high redshifts, which are usually elusive to large-scale structure surveys. In this work, we provide a theoretical model based on a quadratic bias expansion capable of recovering the 21 cm power spectrum with high accuracy sufficient for upcoming ground-based radio interferometer experiments. In particular, we develop a hybrid effective field theory (HEFT) model in redshift space that leverages the accuracy of $N$-body simulations with the predictive power of analytical bias expansion models, and test it against the Thesan suite of radiative transfer hydrodynamical simulations. We make predictions of the 21 cm brightness temperature field at several distinct redshifts, ranging between $z = 6.5$ and 11, thus probing a large fraction of the reionization history of the Universe ($x_{\rm HI} = 0.3 \sim 0.9$), and compare our model to the `true' 21 cm brightness in terms of the correlation coefficient, power spectrum and modeling error. We find percent-level agreement at large and intermediate scales, $k \lesssim 0.5 h/{\rm Mpc}$, and favorable behavior down to small scales, $k \sim 1 h/{\rm Mpc}$, outperforming pure perturbation-theory-based models. To put our findings into context, we show that even in the absence of any foreground contamination the thermal noise of a futuristic HERA-like experiment is comparable with the theoretical uncertainty in our model in the allowed `wedge' of observations, providing further evidence in support of using HEFT-based models to approximate a range of cosmological observables.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 14 pages, 11 figures, submitted to PRD

arXiv:2406.12284 [pdf, other]

Demystifying the Recency Heuristic in Temporal-Difference Learning

Authors: Brett Daley, Marlos C. Machado, Martha White

Abstract: The recency heuristic in reinforcement learning is the assumption that stimuli that occurred closer in time to an acquired reward should be more heavily reinforced. The recency heuristic is one of the key assumptions made by TD($λ$), which reinforces recent experiences according to an exponentially decaying weighting. In fact, all other widely used return estimators for TD learning, such as $n$-step returns, satisfy a weaker (i.e., non-monotonic) recency heuristic. Why is the recency heuristic effective for temporal credit assignment? What happens when credit is assigned in a way that violates this heuristic? In this paper, we analyze the specific mathematical implications of adopting the recency heuristic in TD learning. We prove that any return estimator satisfying this heuristic: 1) is guaranteed to converge to the correct value function, 2) has a relatively fast contraction rate, and 3) has a long window of effective credit assignment, yet bounded worst-case variance. We also give a counterexample where on-policy, tabular TD methods violating the recency heuristic diverge. Our results offer some of the first theoretical evidence that credit assignment based on the recency heuristic facilitates learning.

Submitted 18 June, 2024; originally announced June 2024.

Comments: RLC 2024. 18 pages, 8 figures, 1 table
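
Illustrative note on the entry above: the exponentially decaying weighting mentioned in the abstract can be sketched with the textbook forward-view λ-return, which places weight (1 - λ) λ^(n-1) on the n-step return. This minimal Python sketch is not taken from the paper; the function name `lambda_return_weights`, the value λ = 0.9 and the 50-step horizon are assumptions chosen only for illustration.

```python
def lambda_return_weights(lam, horizon):
    """Weights the textbook lambda-return puts on n-step returns, n = 1..horizon.

    Hypothetical helper for illustration; not code from the paper.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    # Weight on the n-step return is (1 - lam) * lam**(n - 1).
    return [(1.0 - lam) * lam ** (n - 1) for n in range(1, horizon + 1)]


# Assumed illustrative values: lam = 0.9, truncated at 50 steps.
weights = lambda_return_weights(0.9, 50)

# Each weight is smaller than the one before it, so shorter (more recent)
# returns always receive more credit than longer ones.
is_monotone = all(a > b for a, b in zip(weights, weights[1:]))
print(is_monotone, sum(weights))
```

The truncated weights sum to 1 - λ^horizon, so the remaining mass λ^horizon would fall on longer returns in the untruncated series.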

arXiv:2406.08807 [pdf, other]

doi 10.1017/pasa.2024.48

Optimising an Array of Cherenkov Telescopes in Australia for the Detection of TeV Gamma-Ray Transients

Authors: Simon Lee, Sabrina Einecke, Gavin Rowell, Csaba Balazs, Jose A. Bellido, Shi Dai, Miroslav Filipović, Violet M. Harvey, Padric McGee, Peter Marinos, Nicholas Tothill, Martin White

Abstract: As TeV gamma-ray astronomy progresses into the era of the Cherenkov Telescope Array (CTA), instantaneously following up on gamma-ray transients is becoming more important than ever. To this end, a worldwide network of Imaging Atmospheric Cherenkov Telescopes has been proposed. Australia is ideally suited to provide coverage of part of the Southern Hemisphere sky inaccessible to H.E.S.S. in Namibia and the upcoming CTA-South in Chile. This study assesses the sources detectable by a small, transient-focused array in Australia based on CTA telescope designs. The TeV emission of extragalactic sources (including the majority of gamma-ray transients) can suffer significant absorption by the extragalactic background light. As such, we explored the improvements possible by implementing stereoscopic and topological triggers, as well as lowered image cleaning thresholds, to access lower energies. We modelled flaring gamma-ray sources based on past measurements from the satellite-based gamma-ray telescope Fermi-LAT. We estimate that an array of four Medium-Sized Telescopes (MSTs) would detect $\sim$24 active galactic nucleus flares >5$σ$ per year, up to a redshift of $z\approx1.5$. Two MSTs achieved $\sim$80-90% of the detections of four MSTs. The modelled Galactic transients were detectable within the observation time of one night, 11 of the 21 modelled gamma-ray bursts were detectable, as were $\sim$10% of unidentified transients. An array of MST-class telescopes would thus be a valuable complementary telescope array for transient TeV gamma-ray astronomy.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 13 pages, 13 figures, 4 tables, accepted for publication in PASA

arXiv:2406.08540 [pdf, other]

Ray-tracing vs. Born approximation in full-sky weak lensing simulations of the MillenniumTNG project

Authors: Fulvio Ferlito, Christopher T. Davies, Volker Springel, Martin Reinecke, Alessandro Greco, Ana Maria Delgado, Simon D. M. White, César Hernández-Aguayo, Sownak Bose, Lars Hernquist

Abstract: Weak gravitational lensing is a powerful tool for precision tests of cosmology. As the expected deflection angles are small, predictions based on non-linear N-body simulations are commonly computed with the Born approximation. Here we examine this assumption using ${\small DORIAN}$, a newly developed full-sky ray-tracing scheme applied to high-resolution mass-shell outputs of the two largest simulations in the MillenniumTNG suite, each with a 3000 Mpc box containing almost 1.1 trillion cold dark matter particles in addition to 16.7 billion particles representing massive neutrinos. We examine simple two-point statistics like the angular power spectrum of the convergence field, as well as statistics sensitive to higher order correlations such as peak and minimum statistics, void statistics, and Minkowski functionals of the convergence maps. Overall, we find only small differences between the Born approximation and a full ray-tracing treatment. While these are negligibly small at power-spectrum level, some higher order statistics show more sizable effects; ray-tracing is necessary to achieve percent level precision. At the resolution reached here, full-sky maps with 0.8 billion pixels and an angular resolution of 0.43 arcmin, we find that interpolation accuracy can introduce appreciable errors in ray-tracing results. We therefore implemented an interpolation method based on nonuniform fast Fourier transforms (NUFFT) along with more traditional methods. Bilinear interpolation introduces significant smoothing, while nearest grid point sampling agrees well with NUFFT, at least for our fiducial source redshift, $z_s=1.0$, and for the 1 arcmin smoothing we use for higher-order statistics.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 13 pages, 7 figures, submitted to MNRAS

arXiv:2406.07321 [pdf, other]

The magic of entangled top quarks

Authors: Chris D. White, Martin J. White

Abstract: Recent years have seen an increasing body of work examining how quantum entanglement can be measured at high energy particle physics experiments, thereby complementing traditional table-top experiments. This raises the question of whether more concepts from quantum computation can be examined at colliders, and we here consider the property of magic, which distinguishes those quantum states which have a genuine computational advantage over classical states. We examine top anti-top pair production at the LHC, showing that nature chooses to produce magic tops, where the amount of magic varies with the kinematics of the final state. We compare results for individual partonic channels and at proton-level, showing that averaging over final states typically increases magic. This is in contrast to entanglement measures, such as the concurrence, which typically decrease. Our results create new links between the quantum information and particle physics literatures, providing practical insights for further study.

Submitted 12 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

Comments: 22 pages, 6 figures

Report number: ADP-24-10/T1249

arXiv:2406.05050 [pdf, other]

The Dark Energy Survey Supernova Program: Slow supernovae show cosmological time dilation out to $z \sim 1$

Authors: Ryan M. T. White, Tamara M. Davis, Geraint F. Lewis, Christopher Lidman, Paul Shah, T. M. C. Abbott, M. Aguena, S. Allam, F. Andrade-Oliveira, J. Asorey, D. Bacon, S. Bocquet, D. Brooks, D. Brout, E. Buckley-Geer, D. L. Burke, A. Carnero Rosell, D. Carollo, J. Carretero, L. N. da Costa, M. E. S. Pereira, J. De Vicente, S. Desai, H. T. Diehl, S. Everett, et al. (42 additional authors not shown)

Abstract: We present a precise measurement of cosmological time dilation using the light curves of 1504 type Ia supernovae from the Dark Energy Survey spanning a redshift range $0.1\lesssim z\lesssim 1.2$. We find that the width of supernova light curves is proportional to $(1+z)$, as expected for time dilation due to the expansion of the Universe. Assuming type Ia supernovae light curves are emitted with a consistent duration $Δt_{\rm em}$, and parameterising the observed duration as $Δt_{\rm obs}=Δt_{\rm em}(1+z)^b$, we fit for the form of time dilation using two methods. Firstly, we find that a power of $b \approx 1$ minimises the flux scatter in stacked subsamples of light curves across different redshifts. Secondly, we fit each target supernova to a stacked light curve (stacking all supernovae with observed bandpasses matching that of the target light curve) and find $b=1.003\pm0.005$ (stat) $\pm\,0.010$ (sys). Thanks to the large number of supernovae and large redshift-range of the sample, this analysis gives the most precise measurement of cosmological time dilation to date, ruling out any non-time-dilating cosmological models at very high significance.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 14 pages, 13 figures

Report number: FERMILAB-PUB-24-0293-PPD, DES-2024-0831
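
Illustrative note on the entry above: the parameterisation quoted in the abstract, $Δt_{\rm obs}=Δt_{\rm em}(1+z)^b$, can be evaluated directly. The following minimal Python sketch is not the authors' fitting procedure; the function names and the 20-day emitted duration are assumptions used only to show how the formula behaves.

```python
import math


def observed_duration(dt_em, z, b=1.0):
    """Observed duration dt_obs = dt_em * (1 + z)**b for emitted duration dt_em."""
    return dt_em * (1.0 + z) ** b


def exponent_from_durations(dt_obs, dt_em, z):
    """Invert the same relation for b, given one pair of durations at redshift z > 0.

    Hypothetical helper: a single-point inversion, not a fit to light-curve data.
    """
    if z <= 0.0:
        raise ValueError("z must be positive to solve for b")
    return math.log(dt_obs / dt_em) / math.log(1.0 + z)


# Assumed illustrative numbers: a 20-day emitted duration seen at z = 1.
dt_obs = observed_duration(20.0, 1.0)                # b = 1: standard time dilation
b_recovered = exponent_from_durations(dt_obs, 20.0, 1.0)
print(dt_obs, b_recovered)
```

With b = 0 the function returns the emitted duration unchanged, which corresponds to the non-time-dilating case that the abstract reports as ruled out.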

arXiv:2406.04804 [pdf, other]

Mitigation of DESI fiber assignment incompleteness effect on two-point clustering with small angular scale truncated estimators

Authors: M. Pinon, A. de Mattia, P. McDonald, E. Burtin, V. Ruhlmann-Kleider, M. White, D. Bianchi, A. J. Ross, J. Aguilar, S. Ahlen, D. Brooks, E. Chaussidon, T. Claybaugh, S. Cole, A. de la Macorra, B. Dey, P. Doel, K. Fanning, J. E. Forero-Romero, E. Gaztañaga, S. Gontcho A Gontcho, C. Howlett, D. Kirkby, T. Kisner, A. Kremin, et al. (27 additional authors not shown)

Abstract: We present a method to mitigate the effects of fiber assignment incompleteness in two-point power spectrum and correlation function measurements from galaxy spectroscopic surveys, by truncating small angular scales from estimators. We derive the corresponding modified correlation function and power spectrum windows to account for the small angular scale truncation in the theory prediction. We validate this approach on simulations reproducing the Dark Energy Spectroscopic Instrument (DESI) Data Release 1 (DR1) with and without fiber assignment. We show that we recover unbiased cosmological constraints using small angular scale truncated estimators from simulations with fiber assignment incompleteness, with respect to standard estimators from complete simulations. Additionally, we present an approach to remove the sensitivity of the fits to high $k$ modes in the theoretical power spectrum, by applying a transformation to the data vector and window matrix. We find that our method efficiently mitigates the effect of fiber assignment incompleteness in two-point correlation function and power spectrum measurements, at low computational cost and with little statistical loss.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 32 pages, 22 figures

arXiv:2406.02228 [pdf, other]

Constrained cosmological simulations of the Local Group using Bayesian hierarchical field-level inference

Authors: Ewoud Wempe, Guilhem Lavaux, Simon D. M. White, Amina Helmi, Jens Jasche, Stephen Stopyra

Abstract: We present a novel approach based on Bayesian field-level inference capable of resolving individual galaxies within the Local Group (LG), enabling detailed studies of its structure and formation via posterior simulations. We extend the Bayesian Origin Reconstruction from Galaxies (BORG) algorithm with a multi-resolution approach, allowing us to reach smaller mass scales and apply observational constraints based on LG galaxies. Our updated data model simultaneously accounts for observations of mass tracers within the dark haloes of the Milky Way (MW) and M31, their observed separation and relative velocity, and the quiet surrounding Hubble flow represented through the positions and velocities of galaxies at distances from one to four Mpc. Our approach delivers representative posterior samples of $Λ$CDM realisations that are statistically and simultaneously consistent with all these observations, leading to significantly tighter mass constraints than found if the individual datasets are considered separately. In particular, we estimate the virial masses of the MW and M31 to be $\log_{10}(M_{200c}/M_\odot) = 12.07\pm0.08$ and $12.33\pm0.10$, respectively, their sum to be $\log_{10}(ΣM_{200c}/M_\odot)= 12.52\pm0.07$, and the enclosed mass within spheres of radius $R$ to be $\log_{10}(M(R)/M_\odot)= 12.71\pm0.06$ and $12.96\pm0.08$ for $R=1$ Mpc and 3 Mpc, respectively. The M31-MW orbit is nearly radial for most of our $Λ$CDM LG's, and most lie in a dark matter sheet that aligns approximately with the Supergalactic Plane, even though the surrounding density field was not used explicitly as a constraint. The approximate simulations employed in our inference are accurately reproduced by high-fidelity structure formation simulations, demonstrating the potential for future high-resolution, full-physics $Λ$CDM posterior simulations of LG look-alikes.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 23 pages, 17 figures, 4 tables, submitted to A&A. Comments welcome!

arXiv:2406.01803 [pdf, other]

The clustering of Lyman Alpha Emitting galaxies at z=2-3

Authors: M. White, A. Raichoor, Arjun Dey, Lehman H. Garrison, Eric Gawiser, D. Lang, Kyoung-soo Lee, A. D. Myers, D. Schlegel, F. Valdes, J. Aguilar, S. Ahlen, D. Brooks, E. Chaussidon, T. Claybaugh, K. Dawson, A. de la Macorra, Biprateep Dey, P. Doel, K. Fanning, A. Font-Ribera, J. E. Forero-Romero, S. Gontcho A Gontcho, G. Gutierrez, J. Guy, et al. (30 additional authors not shown)

Abstract: We measure the clustering of Lyman Alpha Emitting galaxies (LAEs) selected from the One-hundred-square-degree DECam Imaging in Narrowbands (ODIN) survey, with spectroscopic follow-up from Dark Energy Spectroscopic Instrument (DESI). We use DESI spectroscopy to optimize our selection and to constrain the interloper fraction and redshift distribution of our narrow-band selected sources. We select samples at z=2.45 and 3.1 in the COSMOS field with median Ly-alpha fluxes of 10^{-16}erg/s/cm2. Covariances and cosmological inferences are obtained from a series of mock catalogs built upon high-resolution N-body simulations that match the footprint, number density, redshift distribution and observed clustering of the sample. We find that both samples have a correlation length of r_0=3.0+/-0.2 Mpc/h. Within our fiducial cosmology these correspond to 3D number densities of 10^{-3}h3/Mpc3 and, from our mock catalogs, biases of 1.7 and 2.0 at z=2.45 and 3.1, respectively. We discuss the implications of these measurements for the use of LAEs as large-scale structure tracers for high-redshift cosmology.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 26 pages, 12 figures

arXiv:2406.01562 [pdf, other]

A New View on Planning in Online Reinforcement Learning

Authors: Kevin Roice, Parham Mohammad Panahi, Scott M. Jordan, Adam White, Martha White

Abstract: This paper investigates a new approach to model-based reinforcement learning using background planning: mixing (approximate) dynamic programming updates and model-free updates, similar to the Dyna architecture. Background planning with learned models is often worse than model-free alternatives, such as Double DQN, even though the former uses significantly more memory and computation. The fundamental problem is that learned models can be inaccurate and often generate invalid states, especially when iterated many steps. In this paper, we avoid this limitation by constraining background planning to a set of (abstract) subgoals and learning only local, subgoal-conditioned models. This goal-space planning (GSP) approach is more computationally efficient, naturally incorporates temporal abstraction for faster long-horizon planning and avoids learning the transition dynamics entirely. We show that our GSP algorithm can propagate value from an abstract space in a manner that helps a variety of base learners learn significantly faster in different domains.

Submitted 3 June, 2024; originally announced June 2024.

Comments: Published in the Planning and Reinforcement Learning Workshop at ICAPS 2024. arXiv admin note: text overlap with arXiv:2206.02902

arXiv:2406.00103 [pdf, other]

The Bispectrum in Lagrangian Perturbation Theory

Authors: Shi-Fan Chen, Zvonimir Vlah, Martin White

Abstract: We study the bispectrum in Lagrangian perturbation theory. Extending past results for the power spectrum, we describe a method to efficiently compute the bispectrum in LPT, focusing on the Zeldovich approximation, in which contributions due to linear displacements are captured to all orders in a manifestly infrared (IR) safe way. We then isolate the effects of these linear displacements on oscillatory components of the power spectrum like baryon acoustic oscillations or inflationary primordial features and show that the Eulerian perturbation theory (EPT) prescription wherein their effects are resummed by a Gaussian damping of the oscillations arises as a saddle-point approximation of our calculation. These two methods of IR resummation are in excellent agreement at 1-loop in the bispectrum. At tree level, resummed EPT does less well to capture the nonlinear damping of the oscillations, and the LPT calculation does not require an artificial split of the power spectrum into smooth and oscillatory components, making the latter particularly useful for modeling exotic features. We finish by extending our analysis of IR resummation in LPT to N-point functions of arbitrary order.

Submitted 31 May, 2024; originally announced June 2024.

Comments: 45 pages, 13 figures, to be submitted to JCAP

arXiv:2405.18456 [pdf]

doi 10.3389/fonc.2024.1343070

The association between neighborhood obesogenic factors and prostate cancer risk and mortality: the Southern Community Cohort Study

Authors: Fekede Asefa Kumsa, Jay H. Fowke, Soheil Hashtarkhani, Brianna M. White, Martha J. Shrubsole, Arash Shaban-Nejad

Abstract: Prostate cancer is one of the leading causes of cancer-related mortality among men in the U.S. We examined the role of neighborhood obesogenic attributes on prostate cancer risk and mortality in the Southern Community Cohort Study (SCCS). From 34,166 SCCS male participants, 28,356 were included in the analysis. We assessed relationship between neighborhood socioeconomic status (nSES) and neighborhood obesogenic environment indices including restaurant environment index, retail food environment index, parks, recreational facilities, and businesses and prostate cancer risk and mortality by controlling for individual-level factors using a multivariable Cox proportional hazards model. We further stratified prostate cancer risk analysis by race and body mass index (BMI). Median follow-up time was 133 months, and mean age was 51.62 years. There were 1,524 (5.37%) prostate cancer diagnoses and 98 (6.43%) prostate cancer deaths during follow-up. Compared to participants residing in wealthiest quintile, those residing in the poorest quintile had a higher risk of prostate cancer, particularly among non-obese men with a BMI less than 30. The restaurant environment index was associated with a higher prostate cancer risk in overweight (BMI equal or greater 25) White men. Obese Black individuals without any neighborhood recreational facilities had a 42% higher risk compared to those with any access. Compared to residents in wealthiest quintile and most walkable area, those residing within the poorest quintile or the least walkable area had a higher risk of prostate cancer death.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 18 Pages, 1 Figure, 8 Tables

MSC Class: 62Q05

Journal ref: Front Oncol Frontiers in Oncology, 2024 Apr 9:14:1343070

arXiv:2405.04324 [pdf, other]

Granite Code Models: A Family of Open Foundation Models for Code Intelligence

Authors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, et al. (21 additional authors not shown)

Abstract: Large Language Models (LLMs) trained on code are revolutionizing the software development process. Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously. Realizing the full potential of code LLMs requires a wide range of capabilities, including code generation, fixing bugs, explaining and documenting code, maintaining repositories, and more. In this work, we introduce the Granite series of decoder-only code models for code generative tasks, trained with code written in 116 programming languages. The Granite Code models family consists of models ranging in size from 3 to 34 billion parameters, suitable for applications ranging from complex application modernization tasks to on-device memory-constrained use cases. Evaluation on a comprehensive set of tasks demonstrates that Granite Code models consistently reaches state-of-the-art performance among available open-source code LLMs. The Granite Code model family was optimized for enterprise software development workflows and performs well across a range of coding tasks (e.g. code generation, fixing and explanation), making it a versatile all around code model. We release all our Granite Code models under an Apache 2.0 license for both research and commercial use.

Submitted 7 May, 2024; originally announced May 2024.

Comments: Corresponding Authors: Rameswar Panda, Ruchir Puri; Equal Contributors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang

arXiv:2405.02912 [pdf, other]

Close Encounters of Wide Binaries Induced by the Galactic Tide: Implications for Stellar Mergers and Gravitational-Wave Sources

Authors: Jakob Stegmann, Alejandro Vigna-Gómez, Antti Rantala, Tom Wagg, Lorenz Zwick, Mathieu Renzo, Lieke A. C. van Son, Selma E. de Mink, Simon D. M. White

Abstract: A substantial fraction of stars can be found in wide binaries with projected separations between $\sim10^2$ and $10^5\,\rm AU$. In the standard lore of binary physics, these would evolve as effectively single stars that remotely orbit one another on stationary Keplerian ellipses. However, embedded in their Galactic environment their low binding energy makes them exceptionally prone to perturbations from the gravitational potential of the Milky Way and encounters with passing stars. Employing a fully relativistic $N$-body integration scheme, we study the impact of these perturbations on the orbital evolution of wide binaries along their trajectory through the Milky Way. Our analysis reveals that the torques exerted by the Galaxy can cause large-amplitude oscillations of the binary eccentricity to $1-e\lesssim10^{-8}$. As a consequence, the wide binary members pass close to each other at periapsis, which, depending on the type of binary, potentially leads to a mass transfer or collision of stars or to an inspiral and subsequent merger of compact remnants due to gravitational-wave radiation. Based on a simulation of $10^5$ wide binaries across the Galactic field, we find that this mechanism could significantly contribute to the rate of stellar collisions and binary black hole mergers as inferred from observations of Luminous Red Novae and gravitational-wave events by LIGO/Virgo/Kagra. We conclude that the dynamics of wide binaries, despite their large mean separation, can give rise to extreme interactions between stars and compact remnants.

Submitted 5 May, 2024; originally announced May 2024.

Comments: 20 pages, 9 figures

arXiv:2405.01755 [pdf, ps, other]

Electron Cyclotron Maser Emission and the Brightest Solar Radio Bursts

Authors: Stephen M. White, Masumi Shimojo, Kazumasa Iwai, Timothy S. Bastian, Gregory D. Fleishman, Dale E. Gary, Jasmina Magdalenic, Angelos Vourlidas

Abstract: This paper investigates the incidence of coherent emission in solar radio bursts, using a revised catalog of 3800 solar radio bursts observed by the Nobeyama Radio Polarimeters from 1988 to 2023. We focus on the 1.0 and 2.0 GHz data, where radio fluxes of order 10 billion Jansky have been observed. Previous work has suggested that these bursts are due to electron cyclotron maser (ECM) emission. In at least one well studied case, the bright emission at 1 GHz consists of narrowband spikes of millisecond duration. Coherent emission at 1 GHz can be distinguished from traditional incoherent gyrosynchrotron flare emission based on the radio spectrum: gyrosynchrotron emission at 1 GHz usually has a spectrum rising with frequency, so bursts in which 1 GHz is stronger than higher frequency measurements are unlikely to be incoherent gyrosynchrotron. Based on this criterion it is found that, for bursts exceeding 100 sfu, three-quarters of all bursts at 1 GHz and half of all 2 GHz bursts have a dominant coherent emission component, assumed to be ECM. The majority of the very bright bursts at 1 GHz are highly circularly polarized, consistent with a coherent emission mechanism, but not always 100% polarized. The frequency range from 1 to 2 GHz is heavily utilized for terrestrial applications, and these results are relevant for understanding the extreme flux levels that may impact such applications. Further, they provide a reference for comparison with the study of ECM emission from other stars and potentially exoplanets.

Submitted 2 May, 2024; originally announced May 2024.

Comments: 24 pages, 6 figures, in press

Journal ref: Astrophysical Journal, 2024

arXiv:2405.00959 [pdf]

Solar Radio Bursts and Space Weather

Authors: Stephen M. White

Abstract: Space Weather is the study of the conditions in the solar wind that can affect life on the surface of the Earth, particularly the increasingly technologically sophisticated devices that are part of modern life. Solar radio observations are relevant to such phenomena because they generally originate as events in the solar atmosphere, including flares, coronal mass ejections and shocks, that produce electromagnetic and particle radiations that impact the Earth. Low frequency solar radio emission arises in the solar atmosphere at the levels where these events occur: we can use frequency as a direct measure of density, and an indirect measure of height, in the atmosphere. The main radio burst types are described and illustrated using data from the Green Bank Solar Radio Burst Spectrometer, and their potential use as diagnostics of Space Weather is discussed.

Submitted 1 May, 2024; originally announced May 2024.

Comments: 18 pages, 12 figures. Added to arxiv to provide appropriate reference

Journal ref: Asian Journal of Physics, 16, 189-207, 2007

arXiv:2404.07312 [pdf, other]

An analysis of parameter compression and full-modeling techniques with Velocileptors for DESI 2024 and beyond

Authors: M. Maus, S. Chen, M. White, J. Aguilar, S. Ahlen, A. Aviles, S. Brieden, D. Brooks, T. Claybaugh, S. Cole, A. de la Macorra, Arjun Dey, P. Doel, S. Ferraro, N. Findlay, J. E. Forero-Romero, E. Gaztañaga, H. Gil-Marín, S. Gontcho A Gontcho, C. Hahn, K. Honscheid, C. Howlett, M. Ishak, S. Juneau, A. Kremin, et al. (30 additional authors not shown)

Abstract: In anticipation of forthcoming data releases of current and future spectroscopic surveys, we present the validation tests and analysis of systematic effects within \texttt{velocileptors} modeling pipeline when fitting mock data from the \texttt{AbacusSummit} N-body simulations. We compare the constraints obtained from parameter compression methods to the direct fitting (Full-Modeling) approaches of modeling the galaxy power spectra, and show that the ShapeFit extension to the traditional template method is consistent with the Full-Modeling method within the standard $Λ$CDM parameter space. We show the dependence on scale cuts when fitting the different redshift bins using the ShapeFit and Full-Modeling methods. We test the ability to jointly fit data from multiple redshift bins as well as joint analysis of the pre-reconstruction power spectrum with the post-reconstruction BAO correlation function signal. We further demonstrate the behavior of the model when opening up the parameter space beyond $Λ$CDM and also when combining likelihoods with external datasets, namely the Planck CMB priors. Finally, we describe different parametrization options for the galaxy bias, counterterm, and stochastic parameters, and employ the halo model in order to physically motivate suitable priors that are necessary to ensure the stability of the perturbation theory.

Submitted 17 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

Comments: 56 pages, 23 figures. Supporting publication of DESI 2024 V: Analysis of the full shape of two-point clustering statistics from galaxies and quasars

arXiv:2404.07272 [pdf, other]

A comparison of effective field theory models of redshift space galaxy power spectra for DESI 2024 and future surveys

Authors: M. Maus, Y. Lai, H. E. Noriega, S. Ramirez-Solano, A. Aviles, S. Chen, S. Fromenteau, H. Gil-Marín, C. Howlett, M. Vargas-Magaña, M. White, P. Zarrouk, J. Aguilar, S. Ahlen, O. Alves, S. Brieden, D. Brooks, E. Burtin, T. Claybaugh, S. Cole, K. Dawson, M. Icaza-Lizaola, A. de la Macorra, A. de Mattia, P. Doel, et al. (32 additional authors not shown)

Abstract: In preparation for the next generation of galaxy redshift surveys, and in particular the year-one data release from the Dark Energy Spectroscopic Instrument (DESI), we investigate the consistency of a variety of effective field theory models that describe the galaxy-galaxy power spectra in redshift space into the quasi-linear regime using 1-loop perturbation theory. These models are employed in the pipelines \texttt{velocileptors}, \texttt{PyBird}, and \texttt{Folps$ν$}. While these models have been validated independently, a detailed comparison with consistent choices has not been attempted. After briefly discussing the theoretical differences between the models we describe how to provide a more apples-to-apples comparison between them. We present the results of fitting mock spectra from the \texttt{AbacusSummit} suite of N-body simulations provided in three redshift bins to mimic the types of dark time tracers targeted by the DESI survey. We show that the theories behave similarly and give consistent constraints in both the forward-modeling and ShapeFit compressed fitting approaches. We additionally generate (noiseless) synthetic data from each pipeline to be fit by the others, varying the scale cuts in order to show that the models agree within the range of scales for which we expect 1-loop perturbation theory to be applicable. This work lays the foundation of Full-Shape analysis with DESI Y1 galaxy samples where in the tests we performed, we found no systematic error associated with the modeling of the galaxy redshift space power spectrum for this volume.

Submitted 6 June, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

Comments: 33 pages, 12 figures. Supporting publication of DESI 2024 V: Analysis of the full shape of two-point clustering statistics from galaxies and quasars

arXiv:2404.03569 [pdf, other]

High redshift LBGs from deep broadband imaging for future spectroscopic surveys

Authors: Vanina Ruhlmann-Kleider, Christophe Yèche, Christophe Magneville, Henri Coquinot, Eric Armengaud, Nathalie Palanque-Delabrouille, Anand Raichoor, Jessica Nicole Aguilar, Steven Ahlen, Stéphane Arnouts, David Brooks, Edmond Chaussidon, Todd Claybaugh, Kyle Dawson, Axel de la Macorra, Arjun Dey, Biprateep Dey, Peter Doel, Kevin Fanning, Simone Ferraro, Jaime E. Forero-Romero, Satya Gontcho A Gontcho, Gaston Gutierrez, Stephen Gwyn, Klaus Honscheid, et al. (38 additional authors not shown)

Abstract: Lyman break galaxies (LBGs) are promising probes for clustering measurements at high redshift, $z>2$, a region only covered so far by Lyman-$α$ forest measurements. In this paper, we investigate the feasibility of selecting LBGs by exploiting the existence of a strong deficit of flux shortward of the Lyman limit, due to various absorption processes along the line of sight. The target selection relies on deep imaging data from the HSC and CLAUDS surveys in the $g,r,z$ and $u$ bands, respectively, with median depths reaching 27 AB in all bands. The selections were validated by several dedicated spectroscopic observation campaigns with DESI. Visual inspection of spectra has enabled us to develop an automated spectroscopic typing and redshift estimation algorithm specific to LBGs. Based on these data and tools, we assess the efficiency and purity of target selections optimised for different purposes. Selections providing a wide redshift coverage retain $57\%$ of the observed targets after spectroscopic confirmation with DESI, and provide an efficiency for LBGs of $83\pm3\%$, for a purity of the selected LBG sample of $90\pm2\%$. This would deliver a confirmed LBG density of $\sim 620$ deg$^{-2}$ in the range $2.3<z<3.5$ for a $r$-band limiting magnitude $r<24.2$. Selections optimised for high redshift efficiency retain $73\%$ of the observed targets after spectroscopic confirmation, with $89\pm4\%$ efficiency for $97\pm2\%$ purity. This would provide a confirmed LBG density of $\sim 470$ deg$^{-2}$ in the range $2.8<z<3.5$ for a $r$-band limiting magnitude $r<24.5$. A preliminary study of the LBG sample 3d-clustering properties is also presented and used to estimate the LBG linear bias. A value of $b_{LBG} = 3.3 \pm 0.2 (stat.)$ is obtained for a mean redshift of 2.9 and a limiting magnitude in $r$ of 24.2, in agreement with results reported in the literature.

Submitted 26 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: 45 pages, 29 figures, submitted to JCAP

arXiv:2404.03002 [pdf, other]

DESI 2024 VI: Cosmological Constraints from the Measurements of Baryon Acoustic Oscillations

Authors: DESI Collaboration, A. G. Adame, J. Aguilar, S. Ahlen, S. Alam, D. M. Alexander, M. Alvarez, O. Alves, A. Anand, U. Andrade, E. Armengaud, S. Avila, A. Aviles, H. Awan, B. Bahr-Kalus, S. Bailey, C. Baltay, A. Bault, J. Behera, S. BenZvi, A. Bera, F. Beutler, D. Bianchi, C. Blake, R. Blum, et al. (178 additional authors not shown)

Abstract: We present cosmological results from the measurement of baryon acoustic oscillations (BAO) in galaxy, quasar and Lyman-$α$ forest tracers from the first year of observations from the Dark Energy Spectroscopic Instrument (DESI), to be released in the DESI Data Release 1. DESI BAO provide robust measurements of the transverse comoving distance and Hubble rate, or their combination, relative to the sound horizon, in seven redshift bins from over 6 million extragalactic objects in the redshift range $0.1<z<4.2$. DESI BAO data alone are consistent with the standard flat $Λ$CDM cosmological model with a matter density $Ω_\mathrm{m}=0.295\pm 0.015$. Paired with a BBN prior and the robustly measured acoustic angular scale from the CMB, DESI requires $H_0=(68.52\pm0.62)$ km/s/Mpc. In conjunction with CMB anisotropies from Planck and CMB lensing data from Planck and ACT, we find $Ω_\mathrm{m}=0.307\pm 0.005$ and $H_0=(67.97\pm0.38)$ km/s/Mpc. Extending the baseline model with a constant dark energy equation of state parameter $w$, DESI BAO alone require $w=-0.99^{+0.15}_{-0.13}$. In models with a time-varying dark energy equation of state parametrized by $w_0$ and $w_a$, combinations of DESI with CMB or with SN~Ia individually prefer $w_0>-1$ and $w_a<0$. This preference is 2.6$σ$ for the DESI+CMB combination, and persists or grows when SN~Ia are added in, giving results discrepant with the $Λ$CDM model at the $2.5σ$, $3.5σ$ or $3.9σ$ levels for the addition of Pantheon+, Union3, or DES-SN5YR datasets respectively. For the flat $Λ$CDM model with the sum of neutrino mass $\sum m_ν$ free, combining the DESI and CMB data yields an upper limit $\sum m_ν< 0.072$ $(0.113)$ eV at 95% confidence for a $\sum m_ν>0$ $(\sum m_ν>0.059)$ eV prior. These neutrino-mass constraints are substantially relaxed in models beyond $Λ$CDM. [Abridged.]

Submitted 24 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

Comments: This DESI Collaboration Key Publication is part of the 2024 publication series using the first year of observations (see https://data.desi.lbl.gov/doc/papers). Typos corrected and a new figure and discussion added to Appendix A

arXiv:2404.03001 [pdf, other]

DESI 2024 IV: Baryon Acoustic Oscillations from the Lyman Alpha Forest

Authors: DESI Collaboration, A. G. Adame, J. Aguilar, S. Ahlen, S. Alam, D. M. Alexander, M. Alvarez, O. Alves, A. Anand, U. Andrade, E. Armengaud, S. Avila, A. Aviles, H. Awan, S. Bailey, C. Baltay, A. Bault, J. Bautista, J. Behera, S. BenZvi, F. Beutler, D. Bianchi, C. Blake, R. Blum, S. Brieden, et al. (174 additional authors not shown)

Abstract: We present the measurement of Baryon Acoustic Oscillations (BAO) from the Lyman-$α$ (Ly$α$) forest of high-redshift quasars with the first-year dataset of the Dark Energy Spectroscopic Instrument (DESI). Our analysis uses over $420\,000$ Ly$α$ forest spectra and their correlation with the spatial distribution of more than $700\,000$ quasars. An essential facet of this work is the development of a new analysis methodology on a blinded dataset. We conducted rigorous tests using synthetic data to ensure the reliability of our methodology and findings before unblinding. Additionally, we conducted multiple data splits to assess the consistency of the results and scrutinized various analysis approaches to confirm their robustness. For a given value of the sound horizon ($r_d$), we measure the expansion at $z_{\rm eff}=2.33$ with 2\% precision, $H(z_{\rm eff}) = (239.2 \pm 4.8) (147.09~{\rm Mpc} /r_d)$ km/s/Mpc. Similarly, we present a 2.4\% measurement of the transverse comoving distance to the same redshift, $D_M(z_{\rm eff}) = (5.84 \pm 0.14) (r_d/147.09~{\rm Mpc})$ Gpc. Together with other DESI BAO measurements at lower redshifts, these results are used in a companion paper to constrain cosmological parameters.

Submitted 12 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

Comments: This DESI Collaboration Key Publication is part of the 2024 publication series using the first year of observations (see https://data.desi.lbl.gov/doc/papers)

arXiv:2404.03000 [pdf, other]

DESI 2024 III: Baryon Acoustic Oscillations from Galaxies and Quasars

Authors: DESI Collaboration, A. G. Adame, J. Aguilar, S. Ahlen, S. Alam, D. M. Alexander, M. Alvarez, O. Alves, A. Anand, U. Andrade, E. Armengaud, S. Avila, A. Aviles, H. Awan, S. Bailey, C. Baltay, A. Bault, J. Behera, S. BenZvi, F. Beutler, D. Bianchi, C. Blake, R. Blum, S. Brieden, A. Brodzeller, et al. (171 additional authors not shown)

Abstract: We present the DESI 2024 galaxy and quasar baryon acoustic oscillations (BAO) measurements using over 5.7 million unique galaxy and quasar redshifts in the range 0.1<z<2.1. Divided by tracer type, we utilize 300,017 galaxies from the magnitude-limited Bright Galaxy Survey with 0.1<z<0.4, 2,138,600 Luminous Red Galaxies with 0.4<z<1.1, 2,432,022 Emission Line Galaxies with 0.8<z<1.6, and 856,652 quasars with 0.8<z<2.1, over a ~7,500 square degree footprint. The analysis was blinded at the catalog-level to avoid confirmation bias. All fiducial choices of the BAO fitting and reconstruction methodology, as well as the size of the systematic errors, were determined on the basis of the tests with mock catalogs and the blinded data catalogs. We present several improvements to the BAO analysis pipeline, including enhancing the BAO fitting and reconstruction methods in a more physically-motivated direction, and also present results using combinations of tracers. We present a re-analysis of SDSS BOSS and eBOSS results applying the improved DESI methodology and find scatter consistent with the level of the quoted SDSS theoretical systematic uncertainties. With the total effective survey volume of ~ 18 Gpc$^3$, the combined precision of the BAO measurements across the six different redshift bins is ~0.52%, marking a 1.2-fold improvement over the previous state-of-the-art results using only first-year data. We detect the BAO in all of these six redshift bins. The highest significance of BAO detection is $9.1σ$ at the effective redshift of 0.93, with a constraint of 0.86% placed on the BAO scale. We find our measurements are systematically larger than the prediction of Planck-2018 LCDM model at z<0.8. We translate the results into transverse comoving distance and radial Hubble distance measurements, which are used to constrain cosmological models in our companion paper [abridged].

Submitted 3 April, 2024; originally announced April 2024.

Comments: This DESI Collaboration Key Publication is part of the 2024 publication series using the first year of observations (see https://data.desi.lbl.gov/doc/papers)

arXiv:2404.02113 [pdf, other]

K-percent Evaluation for Lifelong RL

Authors: Golnaz Mesbahi, Parham Mohammad Panahi, Olya Mastikhina, Martha White, Adam White

Abstract: In continual or lifelong reinforcement learning, access to the environment should be limited. If we aspire to design algorithms that can run for long periods, continually adapting to new, unexpected situations, then we must be willing to deploy our agents without tuning their hyperparameters over the agent's entire lifetime. The standard practice in deep RL, and even continual RL, is to assume unfettered access to the deployment environment for the full lifetime of the agent. In this paper, we propose a new approach for evaluating lifelong RL agents where only k percent of the experiment data can be used for hyperparameter tuning. We then conduct an empirical study of DQN and SAC across a variety of continuing and non-stationary domains. We find agents generally perform poorly when restricted to k-percent tuning, whereas several algorithmic mitigations designed to maintain network plasticity perform surprisingly well.

Submitted 25 May, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

arXiv:2403.17044 [pdf, other]

The Influence of Baryons on Low-mass Haloes

Authors: Haonan Zheng, Sownak Bose, Carlos S. Frenk, Liang Gao, Adrian Jenkins, Shihong Liao, Volker Springel, Jie Wang, Simon D. M. White

Abstract: The Voids-within-Voids-within-Voids (VVV) project used dark-matter-only simulations to study the abundance and structure of dark matter haloes over the full mass range populated in the standard $Λ\mathrm{CDM}$ cosmology. Here we explore how baryonic effects modify these results for $z=0$ halo masses in the range $10^4$ to $10^7~\mathrm{M_\odot}$, below the threshold for galaxy formation. Our main study focuses on three simulations from identical initial conditions at $z=127$, one following dark matter only, one including non-radiative gas, and one additionally including the baryonic physics relevant in this halo mass range (cooling and photoheating). In the non-radiative simulation, above $10^{5.5}~\mathrm{M_\odot}$, halo abundance and internal structure are very similar to the dark-matter-only simulation, and the baryon to dark matter ratio is everywhere close to the cosmic value. At lower mass, this ratio drops and haloes are less concentrated and less massive in the non-radiative case. Test simulations at higher resolution show this to be mainly a resolution effect; the expected drop in baryon content due to residual pressure effects only becomes substantial for $z=0$ haloes below $\sim 10^{2.7}~\mathrm{M_\odot}$. However, gas is heated by reionization at $z=6$ in our ``full physics'' run, and this results in almost complete expulsion of gas from all haloes in our simulated mass range. This suppresses the halo mass function by $\sim 30 \%$, lowers halo concentration, and consequently weakens the dark matter annihilation signal by $\sim 40-60 \%$.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 12+2 pages, 9 figures

arXiv:2403.13784 [pdf, ps, other]

The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency, and Usability in Artificial Intelligence

Authors: Matt White, Ibrahim Haddad, Cailean Osborne, Xiao-Yang Liu Yanglet, Ahmed Abdelmonsef, Sachin Varghese

Abstract: Generative AI (GAI) offers unprecedented opportunities for research and innovation, but its commercialization has raised concerns about transparency, reproducibility, and safety. Many open GAI models lack the necessary components for full understanding and reproducibility, and some use restrictive licenses whilst claiming to be ``open-source''. To address these concerns, we propose the Model Openness Framework (MOF), a ranked classification system that rates machine learning models based on their completeness and openness, following principles of open science, open source, open data, and open access. The MOF requires specific components of the model development lifecycle to be included and released under appropriate open licenses. This framework aims to prevent misrepresentation of models claiming to be open, guide researchers and developers in providing all model components under permissive licenses, and help individuals and organizations identify models that can be safely adopted without restrictions. By promoting transparency and reproducibility, the MOF combats ``openwashing'' practices and establishes completeness and openness as primary criteria alongside the core tenets of responsible AI. Wide adoption of the MOF will foster a more open AI ecosystem, benefiting research, innovation, and adoption of state-of-the-art models.

Submitted 3 June, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

Comments: 22 pages

arXiv:2403.08117 [pdf, other]

On the resolution of the sign of gluon polarization in the proton

Authors: N. T. Hunt-Smith, C. Cocuzza, W. Melnitchouk, N. Sato, A. W Thomas, M. J. White

Abstract: Recently the possible existence of negative gluon helicity, $Δg$, has been observed to be compatible with existing empirical constraints, including from jet production in polarized proton-proton collisions at RHIC, and lattice QCD data on polarized gluon Ioffe time distributions. We perform a new global analysis of polarized parton distributions in the proton with new constraints from the high-$x$ region of deep-inelastic scattering (DIS). A dramatic reduction in the quality of the fit for the negative $Δg$ replicas compared to those with positive $Δg$ suggests that the negative $Δg$ solution cannot simultaneously account for high-$x$ polarized DIS data along with lattice and polarized jet data.

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2403.04857 [pdf, other]

Dark Matter Line Searches with the Cherenkov Telescope Array

Authors: S. Abe, J. Abhir, A. Abhishek, F. Acero, A. Acharyya, R. Adam, A. Aguasca-Cabot, I. Agudo, A. Aguirre-Santaella, J. Alfaro, R. Alfaro, N. Alvarez-Crespo, R. Alves Batista, J. -P. Amans, E. Amato, G. Ambrosi, L. Angel, C. Aramo, C. Arcaro, T. T. H. Arnesen, L. Arrabito, K. Asano, Y. Ascasibar, J. Aschersleben, H. Ashkar, et al. (540 additional authors not shown)

Abstract: Monochromatic gamma-ray signals constitute a potential smoking gun signature for annihilating or decaying dark matter particles that could relatively easily be distinguished from astrophysical or instrumental backgrounds. We provide an updated assessment of the sensitivity of the Cherenkov Telescope Array (CTA) to such signals, based on observations of the Galactic centre region as well as of selected dwarf spheroidal galaxies. We find that current limits and detection prospects for dark matter masses above 300 GeV will be significantly improved, by up to an order of magnitude in the multi-TeV range. This demonstrates that CTA will set a new standard for gamma-ray astronomy also in this respect, as the world's largest and most sensitive high-energy gamma-ray observatory, in particular due to its exquisite energy resolution at TeV energies and the adopted observational strategy focussing on regions with large dark matter densities. Throughout our analysis, we use up-to-date instrument response functions, and we thoroughly model the effect of instrumental systematic uncertainties in our statistical treatment. We further present results for other potential signatures with sharp spectral features, e.g.~box-shaped spectra, that would likewise very clearly point to a particle dark matter origin.

Submitted 7 March, 2024; originally announced March 2024.

Comments: 43 pages JCAP style (excluding author list and references), 19 figures

arXiv:2403.02414 [pdf, other]

Examining Lyman-alpha Emitters through MillenniumTNG in anticipation of DESI-II

Authors: Jyotsna Ravi, Boryana Hadzhiyska, Martin White, Lars Hernquist, Sownak Bose

Abstract: The goal of this study is to conduct a timely analysis of the high-redshift star-forming galaxy populations, which will be informative in designing next-generation experiments and their extragalactic targets. We use the hydrodynamical simulation MillenniumTNG (MTNG) to model Lyman-alpha Emitting (LAE) galaxies to extract key properties such as their clustering and occupation statistics. We define LAEs through an empirical relation between star formation rate (SFR) and Lyman-alpha flux. We also explore two other definitions, finding that imposing an additional cut on the maximum stellar mass of the galaxy sample, which approximates the effect of a low escape fraction at high halo mass, leads to a 5-10\% decrease of the linear bias of the population. As expected, we find that the HOD mass parameters rapidly decrease with increasing number density. Additionally, the HOD parameter $σ$ also decreases with number density, implying that the SFR-halo mass relationship becomes tighter for low-luminosity objects. Surprisingly, the non-linear clustering, estimated by the parameter $r_0$, is fixed at fixed number density, whereas the linear bias parameter varies with redshift as $b(z) \propto (1 + z)$, suggesting that our LAE samples are relatively stable and long-lived. Finally, we study the amount of galaxy assembly bias present at $z = 2, \ 3$ and find that while at $z = 2$ it is roughly $\lesssim$10\%, at $z = 3$ it decreases significantly to $\lesssim$5\%.
This suggests that assembly bias effects become less important at high $z$ likely due to the lower number of cumulative two-halo interactions (mergers, splashback, stripping, etc.). While our study is based on a single full-physics simulation, we expect our results to reflect the properties of LAEs in the Universe. We demonstrate that our findings are in good agreement with previous results using both observations and simulations.

Submitted 4 March, 2024; originally announced March 2024.

Comments: 19 pages, 10 figures, 1 table

arXiv:2402.14070 [pdf, other]

Baryon Acoustic Oscillation Theory and Modelling Systematics for the DESI 2024 results

Authors: Shi-Fan Chen, Cullan Howlett, Martin White, Patrick McDonald, Ashley J. Ross, Hee-Jong Seo, Nikhil Padmanabhan, J. Aguilar, S. Ahlen, S. Alam, O. Alves, R. Blum, D. Brooks, X. Chen, S. Cole, T. M. Davis, K. Dawson, A. de la Macorra, Arjun Dey, Z. Ding, P. Doel, S. Ferraro, A. Font-Ribera, D. Forero-Sánchez, J. E. Forero-Romero, et al. (33 additional authors not shown)

Abstract: This paper provides a comprehensive overview of how fitting of Baryon Acoustic Oscillations (BAO) is carried out within the upcoming Dark Energy Spectroscopic Instrument's (DESI) 2024 results using its DR1 dataset, and the associated systematic error budget from theory and modelling of the BAO. We derive new results showing how non-linearities in the clustering of galaxies can cause potential biases in measurements of the isotropic ($α_{\mathrm{iso}}$) and anisotropic ($α_{\mathrm{ap}}$) BAO distance scales, and how these can be effectively removed with an appropriate choice of reconstruction algorithm. We then demonstrate how theory leads to a clear choice for how to model the BAO and develop, implement and validate a new model for the remaining smooth-broadband (i.e., without BAO) component of the galaxy clustering. Finally, we explore the impact of all remaining modelling choices on the BAO constraints from DESI using a suite of high-precision simulations, arriving at a set of best-practices for DESI BAO fits, and an associated theory and modelling systematic error. Overall, our results demonstrate the remarkable robustness of the BAO to all our modelling choices and motivate a combined theory and modelling systematic error contribution to the post-reconstruction DESI BAO measurements of no more than $0.1\%$ ($0.2\%$) for its isotropic (anisotropic) distance measurements. We expect the theory and best-practices laid out here to be applicable to other BAO experiments in the era of DESI and beyond.

Submitted 21 February, 2024; originally announced February 2024.

Comments: 29 pages, 18 figures, 1 table, submitted to MNRAS

arXiv:2402.13425 [pdf, other]

Investigating the Histogram Loss in Regression

Authors: Ehsan Imani, Kai Luedemann, Sam Scholnick-Hughes, Esraa Elelimy, Martha White

Abstract: It is becoming increasingly common in regression to train neural networks that model the entire distribution even if only the mean is required for prediction. This additional modeling often comes with performance gain and the reasons behind the improvement are not fully known. This paper investigates a recent approach to regression, the Histogram Loss, which involves learning the conditional distribution of the target variable by minimizing the cross-entropy between a target distribution and a flexible histogram prediction. We design theoretical and empirical analyses to determine why and when this performance gain appears, and how different components of the loss contribute to it. Our results suggest that the benefits of learning distributions in this setup come from improvements in optimization rather than learning a better representation. We then demonstrate the viability of the Histogram Loss in common deep learning applications without a need for costly hyperparameter tuning.

Submitted 20 February, 2024; originally announced February 2024.

Comments: 50 pages

arXiv:2402.13205 [pdf, other]

Momentum-space Observation of Optically Excited Non-Thermal Electrons in Graphene with Persistent Pseudospin Polarization

Authors: Jin Bakalis, Sergii Chernov, Ziling Li, Alice Kunin, Zachary H. Withers, Shuyu Cheng, Alexander Adler, Peng Zhao, Christopher Corder, Michael G. White, Gerd Schönhense, Xu Du, Roland Kawkami, Thomas K. Allison

Abstract: The unique optical properties of graphene, with broadband absorption and ultrafast response, make it a critical component of optoelectronic and spintronic devices. Using time-resolved momentum microscopy with high data rate and high dynamic range, we report momentum-space measurements of electrons promoted to the graphene conduction band with visible light, and their subsequent relaxation. We observe a pronounced non-thermal distribution of nascent photoexcited electrons with lattice pseudospin polarization in remarkable agreement with results of simple tight-binding theory. By varying the excitation fluence, we vary the relative importance of electron-electron vs. electron-phonon scattering in the relaxation of the initial distribution. Increasing the excitation fluence results in increased noncollinear electron-electron scattering and reduced pseudospin polarization, although up-scattered electrons retain a degree of polarization. These detailed momentum-resolved electron dynamics in graphene demonstrate the capabilities of high-performance time-resolved momentum microscopy in the study of 2D materials and can inform the design of graphene devices.

Submitted 20 February, 2024; originally announced February 2024.

Comments: 19 pages, 5 figures

arXiv:2402.10890 [pdf, other]

When is Tree Search Useful for LLM Planning? It Depends on the Discriminator

Authors: Ziru Chen, Michael White, Raymond Mooney, Ali Payani, Yu Su, Huan Sun

Abstract: In this paper, we examine how large language models (LLMs) solve multi-step problems under a language agent framework with three components: a generator, a discriminator, and a planning method. We investigate the practical utility of two advanced planning methods, iterative correction and tree search. We present a comprehensive analysis of how discrimination accuracy affects the overall performance of agents when using these two methods or a simpler method, re-ranking. Experiments on two tasks, text-to-SQL parsing and mathematical reasoning, show that: (1) advanced planning methods demand discriminators with at least 90% accuracy to achieve significant improvements over re-ranking; (2) current LLMs' discrimination abilities have not met the needs of advanced planning methods to achieve such improvements; (3) with LLM-based discriminators, advanced planning methods may not adequately balance accuracy and efficiency. For example, compared to the other two methods, tree search is at least 10--20 times slower but leads to negligible performance gains, which hinders its real-world applications. Code and data are available at https://github.com/OSU-NLP-Group/llm-planning-eval.

Submitted 6 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: ACL 2024 main

arXiv:2402.10339 [pdf, other]

What to Do When Your Discrete Optimization Is the Size of a Neural Network?

Authors: Hugo Silva, Martha White

Abstract: Oftentimes, machine learning applications using neural networks involve solving discrete optimization problems, such as in pruning, parameter-isolation-based continual learning and training of binary networks. Still, these discrete problems are combinatorial in nature and are also not amenable to gradient-based optimization. Additionally, classical approaches used in discrete settings do not scale well to large neural networks, forcing scientists and empiricists to rely on alternative methods. Among these, two main distinct sources of top-down information can be used to lead the model to good solutions: (1) extrapolating gradient information from points outside of the solution set and (2) comparing evaluations between members of a subset of the valid solutions. We take continuation path (CP) methods to represent using purely the former and Monte Carlo (MC) methods to represent the latter, while also noting that some hybrid methods combine the two. The main goal of this work is to compare both approaches. For that purpose, we first overview the two classes while also discussing some of their drawbacks analytically. Then, in the experimental section, we compare their performance, starting with smaller microworld experiments, which allow more fine-grained control of problem variables, and gradually moving towards larger problems, including neural network regression and neural network pruning for image classification, where we additionally compare against magnitude-based pruning.

Submitted 15 February, 2024; originally announced February 2024.

Comments: Submitted to JMLR

arXiv:2402.03903 [pdf, other]

Averaging $n$-step Returns Reduces Variance in Reinforcement Learning

Authors: Brett Daley, Martha White, Marlos C. Machado

Abstract: Multistep returns, such as $n$-step returns and $λ$-returns, are commonly used to improve the sample efficiency of reinforcement learning (RL) methods. The variance of the multistep returns becomes the limiting factor in their length; looking too far into the future increases variance and reverses the benefits of multistep learning. In our work, we demonstrate the ability of compound returns -- weighted averages of $n$-step returns -- to reduce variance. We prove for the first time that any compound return with the same contraction modulus as a given $n$-step return has strictly lower variance. We additionally prove that this variance-reduction property improves the finite-sample complexity of temporal-difference learning under linear function approximation. Because general compound returns can be expensive to implement, we introduce two-bootstrap returns which reduce variance while remaining efficient, even when using minibatched experience replay. We conduct experiments showing that compound returns often increase the sample efficiency of $n$-step deep RL agents like DQN and PPO.

Submitted 5 June, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

Comments: ICML 2024. 27 pages, 7 figures, 3 tables

arXiv:2402.02015 [pdf, other]

SkyMapper Southern Survey: Data Release 4

Authors: Christopher A. Onken, Christian Wolf, Michael S. Bessell, Seo-Won Chang, Lance C. Luvaul, John L. Tonry, Marc C. White, Gary S. Da Costa

Abstract: We present the fourth data release (DR4) of the SkyMapper Southern Survey (SMSS), the last major step in our hemispheric survey with six optical filters: u, v, g, r, i, z. SMSS DR4 covers 26,000 sq.deg from over 400,000 images acquired by the 1.3m SkyMapper telescope between 2014-03 and 2021-09. The 6-band sky coverage extends from the South Celestial Pole to Dec = +16deg, with some images reaching Dec ~ +28deg. In contrast to previous DRs, we include all good-quality images from the facility taken during that time span, not only those explicitly taken for the public Survey. From the image dataset, we produce a catalogue of nearly 13 billion detections made from ~700 million unique astrophysical objects. The typical 10sigma depths for each field range between 18.5 and 20.5 mag, depending on the filter, but certain sky regions include longer exposures that reach as deep as 22 mag in some filters. As with previous SMSS catalogues, we have cross-matched with a host of other imaging and spectroscopic datasets to facilitate additional science outcomes. SMSS DR4 is now available to the worldwide astronomical community.

Submitted 2 February, 2024; originally announced February 2024.

Comments: 32 pages. Submitted to the Publications of the Astronomical Society of Australia

arXiv:2401.13166 [pdf, other]

Cosmology before noon with multiple galaxy populations

Authors: Haruki Ebina, Martin White

Abstract: Near-future facilities observing the high-redshift universe ($2<z<5$) will have an opportunity to take advantage of "multi-tracer" cosmology by observing multiple tracers of the matter density field: Lyman alpha emitters (LAE), Lyman break galaxies (LBG), and CMB lensing $κ$. In this work we use Fisher forecasts to investigate the effect of multi-tracers on next-generation facilities. In agreement with previous work, we show that multiple tracers improve constraints primarily from degeneracy breaking, instead of the traditional intuition of sample variance cancellation. Then, we forecast that for both BBN and CMB primary priors, the addition of lensing and LAEs onto a LBG-only sample will gain 25\% or more in many parameters, with the largest gains being a factor of $\sim10$ improvement for $f_{\rm EDE}$. We include a preliminary approach towards modelling the impact of radiative transfer (RT) on forecasts involving LAEs by introducing a simplified model at linear theory level. Our results, albeit preliminary, show that while RT influences LAE-only forecasts strongly, its effect on composite multi-tracer forecasts is limited.

Submitted 23 January, 2024; originally announced January 2024.

Comments: 37 pages, 16 figures

arXiv:2401.11967 [pdf, other]

Exploring descriptors for titanium microstructure via digital fingerprints from variational autoencoders

Authors: Michael D. White, Gowtham Nimmal Haribabu, Jeyapriya Thimukonda Jegadeesan, Bikramjit Basu, Philip J. Withers, Chris P. Race

Abstract: Microstructure is key to controlling and understanding the properties of metallic materials, but traditional approaches to describing microstructure capture only a small number of features. To enable data-centric approaches to materials discovery, allow efficient storage of microstructural data and assist in quality control in metals processing, we require more complete descriptors of microstructure. The concept of microstructural fingerprinting, using machine learning (ML) to develop quantitative, low-dimensional descriptors of microstructures, has recently attracted significant attention. However, it is difficult to interpret conclusions drawn by ML algorithms, which are commonly referred to as "black boxes". Here we explore variational autoencoders (VAEs), which can be trained to produce microstructural fingerprints in a continuous latent space. VAEs enable the reconstruction of images from fingerprints, allowing us to explore how key features of microstructure are encoded. We develop a VAE architecture based on ResNet18 and train it on Ti-6Al-4V optical micrographs as an example of an industrially important alloy where microstructural control is critical to performance. The latent space is explored in several ways, including by supplying interpolated and randomly perturbed fingerprints to the trained decoder and via dimensionality reduction to explore the distribution of microstructural features within the latent space of fingerprints.
We show that the VAE fingerprints exhibit smooth, interpolable behaviour with stability to local perturbations, supporting their suitability as general purpose descriptors for microstructure. We also show that key properties of the microstructures are strongly correlated with position in the latent space, supporting the use of VAE fingerprints for quantitative exploration of process-structure-property relationships.

Submitted 22 January, 2024; originally announced January 2024.

arXiv:2312.17493 [pdf, other]

Differentially Private Low-Rank Adaptation of Large Language Model Using Federated Learning

Authors: Xiao-Yang Liu, Rongyi Zhu, Daochen Zha, Jiechao Gao, Shan Zhong, Matt White, Meikang Qiu

Abstract: The surge in interest and application of large language models (LLMs) has sparked a drive to fine-tune these models to suit specific applications, such as finance and medical science. However, concerns regarding data privacy have emerged, especially when multiple stakeholders aim to collaboratively enhance LLMs using sensitive data. In this scenario, federated learning becomes a natural choice, allowing decentralized fine-tuning without exposing raw data to central servers. Motivated by this, we investigate how data privacy can be ensured in LLM fine-tuning through practical federated learning approaches, enabling secure contributions from multiple parties to enhance LLMs. Yet, challenges arise: 1) despite avoiding raw data exposure, there is a risk of inferring sensitive information from model outputs, and 2) federated learning for LLMs incurs notable communication overhead. To address these challenges, this article introduces DP-LoRA, a novel federated learning algorithm tailored for LLMs. DP-LoRA preserves data privacy by employing a Gaussian mechanism that adds noise in weight updates, maintaining individual data privacy while facilitating collaborative model training. Moreover, DP-LoRA optimizes communication efficiency via low-rank adaptation, minimizing the transmission of updated weights during distributed training. The experimental results across medical, financial, and general datasets using various LLMs demonstrate that DP-LoRA effectively ensures strict privacy constraints while minimizing communication overhead.

Submitted 2 June, 2024; v1 submitted 29 December, 2023; originally announced December 2023.

Comments: 21 pages, 1 figure, 19 tables

arXiv:2312.12285 [pdf, other]

Harmonic analysis of discrete tracers of large-scale structure

Authors: Antón Baleato Lizancos, Martin White

Abstract: It is commonplace in cosmology to analyze fields projected onto the celestial sphere, and in particular density fields that are defined by a set of points, e.g. galaxies. When performing a harmonic-space analysis of such data (e.g. an angular power spectrum) using a pixelized map one has to deal with aliasing of small-scale power and pixel window functions. We compare and contrast the approaches to this problem taken in the cosmic microwave background and large-scale structure communities, and advocate for a direct approach that avoids pixelization. We describe a method for performing a pseudo-spectrum analysis of a galaxy data set and show that it can be implemented efficiently using well-known algorithms for special functions that are suited to acceleration by graphics processing units (GPUs). The method returns the same spectra as the more traditional map-based approach if in the latter the number of pixels is taken to be sufficiently large and the mask is well sampled. The method is readily generalizable to cross-spectra and higher-order functions. It also provides a convenient route for distributing the information in a galaxy catalog directly in harmonic space, as a complement to releasing the configuration-space positions and weights. We make public a code enabling the application of our method to existing and upcoming datasets.

Submitted 1 April, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

Comments: 11 pages + appendices & bibliography. 7 figures. Matches version published in JCAP. Public code (DirectSHT) available at https://github.com/martinjameswhite/directsht/tree/main

arXiv:2312.06027 [pdf, other]

Extending Global Fits of 4D Composite Higgs Models with Partially Composite Leptons

Authors: Ethan Carragher, Kenn Shern Goh, Wei Su, Martin White, Anthony G. Williams

Abstract: We perform the first convergent Bayesian global fits of 4D Composite Higgs Models with partially-composite third generation quarks and leptons based on the minimal $SO(5) \rightarrow SO(4)$ symmetry breaking pattern. We consider two models with the $τ$ lepton and its associated neutrino in different representations of $SO(5)$. Fitting each model with a wide array of experimental constraints allows us to analyse the Bayesian evidence and currently-observed fine-tuning of each model by calculating the Kullback-Leibler divergence between their respective priors and posteriors. Notably both models are found to be capable of satisfying all constraints simultaneously at the $3σ$ level at scales of $< 5$ TeV. From a Bayesian viewpoint of naturalness the model with leptons in the $\mathbf{14}$ and $\mathbf{10}$ representations is preferred over those in the $\mathbf{5}$ representation due to its lower fine-tuning. Finally, we consider the experimental signatures for the preferred parameters in these models, including lepton partner decay signatures and gluon-fusion produced Higgs signal strengths, and discuss their potential phenomenology at future high-luminosity LHC runs.

Submitted 30 June, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

Comments: 32+12 pages, 13+8 figures; version 2 with improved content following comments from referee; ver 3 improved clarity

arXiv:2312.02953 [pdf]

Longitudinal Assessment of Seasonal Impacts and Depression Associations on Circadian Rhythm Using Multimodal Wearable Sensing

Authors: Yuezhou Zhang, Amos A Folarin, Shaoxiong Sun, Nicholas Cummins, Yatharth Ranjan, Zulqarnain Rashid, Callum Stewart, Pauline Conde, Heet Sankesara, Petroula Laiou, Faith Matcham, Katie M White, Carolin Oetzmann, Femke Lamers, Sara Siddi, Sara Simblett, Srinivasan Vairavan, Inez Myin-Germeys, David C. Mohr, Til Wykes, Josep Maria Haro, Peter Annas, Brenda WJH Penninx, Vaibhav A Narayan, Matthew Hotopf, et al. (2 additional authors not shown)

Abstract: Objective: This study aimed to explore the associations between depression severity and wearable-measured circadian rhythms, accounting for seasonal impacts and quantifying seasonal changes in circadian rhythms. Materials and Methods: Data used in this study came from a large longitudinal mobile health study. Depression severity (measured biweekly using the 8-item Patient Health Questionnaire [PHQ-8]) and behaviors (monitored by Fitbit) were tracked for up to two years. Twelve features were extracted from Fitbit recordings to approximate circadian rhythms. Three nested linear mixed-effects models were employed for each feature: (1) incorporating the PHQ-8 score as an independent variable; (2) adding the season variable; and (3) adding an interaction term between season and the PHQ-8 score. Results: This study analyzed 10,018 PHQ-8 records with Fitbit data from 543 participants. Upon adjusting for seasonal effects, higher PHQ-8 scores were associated with reduced activity, irregular behaviors, and delayed rhythms. Notably, the negative association with daily step counts was stronger in summer and spring than in winter, and the positive association with the onset of the most active continuous 10-hour period was significant only during summer. Furthermore, participants had shorter and later sleep, more activity, and delayed circadian rhythms in summer compared to winter. Discussion and Conclusions: Our findings underscore the significant seasonal impacts on human circadian rhythms and their associations with depression and indicate that wearable-measured circadian rhythms have the potential to be the digital biomarkers of depression.

Submitted 5 December, 2023; originally announced December 2023.

arXiv:2312.02355 [pdf, other]

When is Offline Policy Selection Sample Efficient for Reinforcement Learning?

Authors: Vincent Liu, Prabhat Nagarajan, Andrew Patterson, Martha White

Abstract: Offline reinforcement learning algorithms often require careful hyperparameter tuning. Consequently, before deployment, we need to select amongst a set of candidate policies. As yet, however, there is little understanding about the fundamental limits of this offline policy selection (OPS) problem. In this work we aim to provide clarity on when sample efficient OPS is possible, primarily by connecting OPS to off-policy policy evaluation (OPE) and Bellman error (BE) estimation. We first show a hardness result, that in the worst case, OPS is just as hard as OPE, by proving a reduction of OPE to OPS. As a result, no OPS method can be more sample efficient than OPE in the worst case. We then propose a BE method for OPS, called Identifiable BE Selection (IBES), that has a straightforward method for selecting its own hyperparameters. We highlight that using IBES for OPS generally has more requirements than OPE methods, but if satisfied, can be more sample efficient. We conclude with an empirical study comparing OPE and IBES, and by showing the difficulty of OPS on an offline Atari benchmark dataset.

Submitted 4 December, 2023; originally announced December 2023.

arXiv:2312.01624 [pdf, other]

GVFs in the Real World: Making Predictions Online for Water Treatment

Authors: Muhammad Kamran Janjua, Haseeb Shah, Martha White, Erfan Miahi, Marlos C. Machado, Adam White

Abstract: In this paper we investigate the use of reinforcement-learning based prediction approaches for a real drinking-water treatment plant. Developing such a prediction system is a critical step on the path to optimizing and automating water treatment. Before that, there are many questions to answer about the predictability of the data, suitable neural network architectures, how to overcome partial observability and more. We first describe this dataset, and highlight challenges with seasonality, nonstationarity, partial observability, and heterogeneity across sensors and operation modes of the plant. We then describe General Value Function (GVF) predictions -- discounted cumulative sums of observations -- and highlight why they might be preferable to classical n-step predictions common in time series prediction. We discuss how to use offline data to appropriately pre-train our temporal difference learning (TD) agents that learn these GVF predictions, including how to select hyperparameters for online fine-tuning in deployment. We find that the TD-prediction agent obtains an overall lower normalized mean-squared error than the n-step prediction agent. Finally, we show the importance of learning in deployment, by comparing a TD agent trained purely offline with no online updating to a TD agent that learns online. This final result is one of the first to motivate the importance of adapting predictions in real-time, for non-stationary high-volume systems in the real world.

Submitted 3 December, 2023; originally announced December 2023.

Comments: Published in Machine Learning (2023)

Journal ref: Machine Learning (2023): 1-31

arXiv:2311.16162 [pdf, other]

Leveraging Artificial Intelligence Technology for Mapping Research to Sustainable Development Goals: A Case Study

Authors: Hui Yin, Amir Aryani, Gavin Lambert, Marcus White, Luis Salvador-Carulla, Shazia Sadiq, Elvira Sojli, Jennifer Boddy, Greg Murray, Wing Wah Tham

Abstract: The number of publications related to the Sustainable Development Goals (SDGs) continues to grow. These publications cover a diverse spectrum of research, from humanities and social sciences to engineering and health. Given the imperative of funding bodies to monitor outcomes and impacts, linking publications to relevant SDGs is critical but remains time-consuming and difficult given the breadth and complexity of the SDGs. A publication may relate to several goals (interconnection feature of goals), and therefore require multidisciplinary knowledge to tag accurately. Machine learning approaches are promising and have proven particularly valuable for tasks such as manual data labeling and text classification. In this study, we employed over 82,000 publications from an Australian university as a case study. We utilized a similarity measure to map these publications onto Sustainable Development Goals (SDGs). Additionally, we leveraged the OpenAI GPT model to conduct the same task, facilitating a comparative analysis between the two approaches. Experimental results show that about 82.89% of the results obtained by the similarity measure overlap (at least one tag) with the outputs of the GPT model. The adopted model (similarity measure) can complement GPT model for SDG classification. Furthermore, deep learning methods, which include the similarity measure used here, are more accessible and trusted for dealing with sensitive data without the use of commercial AI services or the deployment of expensive computing resources to operate large language models. Our study demonstrates how a crafted combination of the two methods can achieve reliable results for mapping research to the SDGs.

Submitted 9 November, 2023; originally announced November 2023.

ACM Class: I.2.7

arXiv:2311.05076 [pdf, other]

Evaluating diversion and treatment policies for opioid use disorder

Authors: Veronica M. White, Laura A. Albert

Abstract: The United States opioid crisis contributed to 80,411 fatalities in 2021. It has strained hospitals, treatment facilities, and law enforcement agencies due to the enormous resources and procedures needed to respond to the crisis. As a result, many individuals who use opioids never receive or finish the treatment they need and instead have many interactions with hospitals or the criminal justice system. This paper introduces a discrete event simulation model that evaluates three opioid use disorder treatment policies: arrest diversion, re-entry case management, and overdose diversion. Publicly available data from 2011 to 2019 in Dane County, Wisconsin, was used to forecast opioid-related outcomes through 2032. Through analyzing a variety of policy-mix implementations, the study offers a versatile framework for evaluating policies at various implementation levels. The results demonstrate that treatment policies that create new pathways and programming by utilizing treatment services and successfully divert at least 20% of eligible individuals can lead to more opioid-resilient communities. The benefits increase when more policies are enacted and/or are offered to more individuals. We assume communities invest in increasing treatment capacity to meet increased treatment demand. In policy-mixes where societal savings from decreased opioid use, hospital encounters, and opioid-related arrests outweigh the costs of opioid use disorder treatment, the 2032 total savings range from $7.04 to $29.73 million. To reverse the opioid crisis within a community, treatment policies may need to be combined with other strategies, such as harm reduction, supply reduction, and use prevention.

Submitted 1 December, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

arXiv:2309.15295 [pdf, other]

Scaling solutions as Early Dark Energy resolutions to the Hubble tension

Authors: Edmund J. Copeland, Adam Moss, Sergio Sevillano Muñoz, Jade M. M. White

Abstract: A wide class of scalar field models including Quintessence and K-essence have the attractive property of tracker regimes, where the energy density stored in the field evolves so as to mimic that of the dominant background component for a period of time. During this evolution, for a brief period of time there is an increase in the energy density of the field as it spirals in towards its attractor solution. We show that when the peak of this energy density occurs around the epoch of equality, we can address a key requirement of early dark energy (EDE), postulated as a solution to the Hubble tension. In particular we demonstrate how this can occur in a wide class of Quintessence, axion and K-essence models, before showing that the Quintessence models suffer in that they generally lead to sound speeds incompatible with the requirements of EDE, whereas the K-essence and axion models can do a better job of fitting the data.

Submitted 26 September, 2023; originally announced September 2023.

Comments: 26 pages, 12 figures

Showing 1–50 of 1,201 results for author: White, M