-
Unveiling the internal structure and formation history of the three planets transiting HIP 29442 (TOI-469) with CHEOPS
Authors:
J. A. Egger,
H. P. Osborn,
D. Kubyshkina,
C. Mordasini,
Y. Alibert,
M. N. Günther,
M. Lendl,
A. Brandeker,
A. Heitzmann,
A. Leleu,
M. Damasso,
A. Bonfanti,
T. G. Wilson,
S. G. Sousa,
J. Haldemann,
L. Delrez,
M. J. Hooton,
T. Zingales,
R. Luque,
R. Alonso,
J. Asquier,
T. Bárczy,
D. Barrado Navascues,
S. C. C. Barros,
W. Baumjohann
, et al. (69 additional authors not shown)
Abstract:
Multiplanetary systems spanning the radius valley are ideal testing grounds for exploring the proposed explanations for the observed bimodality in the radius distribution of close-in exoplanets. One such system is HIP 29442 (TOI-469), an evolved K0V star hosting two super-Earths and a sub-Neptune. We observe HIP 29442 with CHEOPS for a total of 9.6 days, which we model jointly with 2 sectors of TESS data to derive planetary radii of $3.410\pm0.046$, $1.551\pm0.045$ and $1.538\pm0.049$ R$_\oplus$ for planets b, c and d, which orbit HIP 29442 with periods of 13.6, 3.5 and 6.4 days. For planet d, this value deviates by more than 3 sigma from the median value reported in the discovery paper, leading us to conclude that caution is required when using TESS photometry to determine the radii of small planets with low per-transit S/N and large gaps between observations. Given the high precision of these new radii, combining them with published RVs from ESPRESSO and HIRES provides us with ideal conditions to investigate the internal structure and formation pathways of the planets in the system. We introduce the publicly available code plaNETic, a fast and robust neural network-based Bayesian internal structure modelling framework. We then apply hydrodynamic models to explore the upper atmospheric properties of these inferred structures. Finally, we identify planetary system analogues in a synthetic population generated with the Bern model for planet formation and evolution. Based on this analysis, we find that the planets likely formed on opposing sides of the water iceline from a protoplanetary disk with an intermediate solid mass. We finally report that the observed parameters of the HIP 29442 system are compatible with both a scenario where the second peak in the bimodal radius distribution corresponds to sub-Neptunes with a pure H/He envelope as well as a scenario with water-rich sub-Neptunes.
Submitted 26 June, 2024;
originally announced June 2024.
-
Dynamical mass determination and partial eclipses of the heartbeat star HD 181793
Authors:
Laura E. Uronen,
Andrew Collier Cameron,
Thomas G. Wilson
Abstract:
We identify the bright Am-type star HD 181793 to be a previously unknown eclipsing, chemically peculiar heartbeat binary, the second of its kind known. The system has an orbital period of $P = 11.47578275 \pm 0.00000055$ days. We use TESS photometry and LCOGT NRES radial velocity data to build a self-consistent orbital model and determine the fundamental stellar characteristics of the primary. We use a spectral separation method to unveil the secondary and measure the masses of both stars. The radial velocity amplitude of the primary, $K_1 = 47.41^{+0.13}_{-0.12}$ km s$^{-1}$, gives a mass $M_1 = 1.57 \pm 0.01$ M$_\odot$. The secondary radial velocity amplitude $K_2 = 84.95^{+0.12}_{-0.09}$ km s$^{-1}$ yields a mass ratio $q = 0.558 \pm 0.002$ and a secondary mass $M_2 = 0.87 \pm 0.01$ M$_\odot$. From the spectral energy distribution and Gaia parallax we find a radius $R_1 = 2.04 \pm 0.05$ R$_\odot$. The grazing transit profile and spectroscopic luminosity ratio indicate $R_2 = 1.04^{+0.15}_{-0.10}$ R$_\odot$, suggesting an early-K spectral type. We show that the heartbeat feature in the TESS light curve can be explained by time-varying ellipsoidal variation, driven by the orbital eccentricity of $e = 0.3056^{+0.0024}_{-0.0026}$, and relativistic beaming of the light of the primary. We find no evidence of tidally excited oscillations.
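As a quick consistency check on the quoted values, the mass ratio of a double-lined binary follows from the ratio of the radial velocity amplitudes, $q = M_2/M_1 = K_1/K_2$. The short sketch below uses only the central values from the abstract; the variable names are illustrative, and the uncertainties are not propagated.

```python
# Central values quoted in the abstract (uncertainties ignored in this sketch).
K1 = 47.41  # primary RV semi-amplitude, km/s
K2 = 84.95  # secondary RV semi-amplitude, km/s
M1 = 1.57   # primary mass, solar masses

# For a double-lined spectroscopic binary, M2 / M1 = K1 / K2.
q = K1 / K2
M2 = q * M1

print(f"q  = {q:.3f}")
print(f"M2 = {M2:.2f} Msun")
```

The result agrees with the quoted $q = 0.558 \pm 0.002$ and is within the stated uncertainty of $M_2 = 0.87 \pm 0.01$ M$_\odot$.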
Submitted 18 June, 2024;
originally announced June 2024.
-
Just How Flexible are Neural Networks in Practice?
Authors:
Ravid Shwartz-Ziv,
Micah Goldblum,
Arpit Bansal,
C. Bayan Bruss,
Yann LeCun,
Andrew Gordon Wilson
Abstract:
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters, underpinning notions of overparameterized and underparameterized models. In practice, however, we only find solutions accessible via our training procedure, including the optimizer and regularizers, limiting flexibility. Moreover, the exact parameterization of the function class, built into an architecture, shapes its loss surface and impacts the minima we find. In this work, we examine the ability of neural networks to fit data in practice. Our findings indicate that: (1) standard optimizers find minima where the model can only fit training sets with significantly fewer samples than it has parameters; (2) convolutional networks are more parameter-efficient than MLPs and ViTs, even on randomly labeled data; (3) while stochastic training is thought to have a regularizing effect, SGD actually finds minima that fit more training data than full-batch gradient descent; (4) the difference in capacity to fit correctly labeled and incorrectly labeled samples can be predictive of generalization; (5) ReLU activation functions result in finding minima that fit more data despite being designed to avoid vanishing and exploding gradients in deep architectures.
Submitted 17 June, 2024;
originally announced June 2024.
-
Exploration by Learning Diverse Skills through Successor State Measures
Authors:
Paul-Antoine Le Tolguenec,
Yann Besse,
Florent Teichteil-Konigsbuch,
Dennis G. Wilson,
Emmanuel Rachelson
Abstract:
The ability to perform different skills can encourage agents to explore. In this work, we aim to construct a set of diverse skills which uniformly cover the state space. We propose a formalization of this search for diverse skills, building on a previous definition based on the mutual information between states and skills. We consider the distribution of states reached by a policy conditioned on each skill and leverage the successor state measure to maximize the difference between these skill distributions. We call this approach LEADS: Learning Diverse Skills through Successor States. We demonstrate our approach on a set of maze navigation and robotic control tasks which show that our method is capable of constructing a diverse set of skills which exhaustively cover the state space without relying on reward or exploration bonuses. Our findings demonstrate that this new formalization promotes more robust and efficient exploration by combining mutual information maximization and exploration bonuses.
Submitted 14 June, 2024;
originally announced June 2024.
-
Scalable and Flexible Causal Discovery with an Efficient Test for Adjacency
Authors:
Alan Nawzad Amin,
Andrew Gordon Wilson
Abstract:
To make accurate predictions, understand mechanisms, and design interventions in systems of many variables, we wish to learn causal graphs from large scale data. Unfortunately the space of all possible causal graphs is enormous so scalably and accurately searching for the best fit to the data is a challenge. In principle we could substantially decrease the search space, or learn the graph entirely, by testing the conditional independence of variables. However, deciding if two variables are adjacent in a causal graph may require an exponential number of tests. Here we build a scalable and flexible method to evaluate if two variables are adjacent in a causal graph, the Differentiable Adjacency Test (DAT). DAT replaces an exponential number of tests with a provably equivalent relaxed problem. It then solves this problem by training two neural networks. We build a graph learning method based on DAT, DAT-Graph, that can also learn from data with interventions. DAT-Graph can learn graphs of 1000 variables with state of the art accuracy. Using the graph learned by DAT-Graph, we also build models that make much more accurate predictions of the effects of interventions on large scale RNA sequencing data.
Submitted 18 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
Large Language Models Must Be Taught to Know What They Don't Know
Authors:
Sanyam Kapoor,
Nate Gruver,
Manley Roberts,
Katherine Collins,
Arka Pal,
Umang Bhatt,
Adrian Weller,
Samuel Dooley,
Micah Goldblum,
Andrew Gordon Wilson
Abstract:
When using large language models (LLMs) in high-stakes applications, we need to know when we can trust their predictions. Some works argue that prompting high-performance LLMs is sufficient to produce calibrated uncertainties, while others introduce sampling methods that can be prohibitively expensive. In this work, we first argue that prompting on its own is insufficient to achieve good calibration and then show that fine-tuning on a small dataset of correct and incorrect answers can create an uncertainty estimate with good generalization and small computational overhead. We show that a thousand graded examples are sufficient to outperform baseline methods and that training through the features of a model is necessary for good performance and tractable for large open-source models when using LoRA. We also investigate the mechanisms that enable reliable LLM uncertainty estimation, finding that many models can be used as general-purpose uncertainty estimators, applicable not just to their own uncertainties but also the uncertainty of other models. Lastly, we show that uncertainty estimates inform human use of LLMs in human-AI collaborative settings through a user study.
Submitted 12 June, 2024;
originally announced June 2024.
-
Transferring Knowledge from Large Foundation Models to Small Downstream Models
Authors:
Shikai Qiu,
Boran Han,
Danielle C. Maddix,
Shuai Zhang,
Yuyang Wang,
Andrew Gordon Wilson
Abstract:
How do we transfer the relevant knowledge from ever larger foundation models into small, task-specific downstream models that can run at much lower costs? Standard transfer learning using pre-trained weights as the initialization transfers limited information and commits us to often massive pre-trained architectures. This procedure also precludes combining multiple pre-trained models that learn complementary information. To address these shortcomings, we introduce Adaptive Feature Transfer (AFT). Instead of transferring weights, AFT operates purely on features, thereby decoupling the choice of the pre-trained model from the smaller downstream model. Rather than indiscriminately compressing all pre-trained features, AFT adaptively transfers pre-trained features that are most useful for performing the downstream task, using a simple regularization that adds minimal overhead. Across multiple vision, language, and multi-modal datasets, AFT achieves significantly better downstream performance compared to alternatives with a similar computational cost. Furthermore, AFT reliably translates improvement in pre-trained models into improvement in downstream performance, even if the downstream model is over $50\times$ smaller, and can effectively transfer complementary information learned by multiple pre-trained models.
Submitted 11 June, 2024;
originally announced June 2024.
-
All-sky three-dimensional dust density and extinction Maps of the Milky Way out to 2.8 kpc
Authors:
T. E. Dharmawardena,
C. A. L. Bailer-Jones,
M. Fouesneau,
D. Foreman-Mackey,
P. Coronica,
T. Colnaghi,
T. Müller,
A. G. Wilson
Abstract:
Three-dimensional dust density maps are crucial for understanding the structure of the interstellar medium of the Milky Way and the processes that shape it. However, constructing these maps requires large datasets and the methods used to analyse them are computationally expensive and difficult to scale up. As a result it has only recently become possible to map kiloparsec-scale regions of our Galaxy at parsec-scale grid sampling. We present all-sky three-dimensional dust density and extinction maps of the Milky Way out to 2.8~kpc in distance from the Sun using the fast and scalable Gaussian Process algorithm \DustT. The sampling of the three-dimensional map is $l,b,d = 1^{\circ} \times 1^{\circ} \times 1.7$~pc. The input extinction and distance catalogue contains 120 million stars with photometry and astrometry from Gaia DR2, 2MASS and AllWISE. This combines the strengths of optical and infrared data to probe deeper into the dusty regions of the Milky Way. We compare our maps with other published 3D dust maps. All maps quantitatively agree at the $0.001$~mag~pc$^{-1}$ scale with many qualitatively similar features, although each map also has its own features. We recover Galactic features previously identified in the literature. Moreover, we also see a large under-density that may correspond to an inter-arm or -spur gap towards the Galactic Centre.
Submitted 10 June, 2024;
originally announced June 2024.
-
Compute Better Spent: Replacing Dense Layers with Structured Matrices
Authors:
Shikai Qiu,
Andres Potapczynski,
Marc Finzi,
Micah Goldblum,
Andrew Gordon Wilson
Abstract:
Dense linear layers are the dominant computational bottleneck in foundation models. Identifying more efficient alternatives to dense matrices has enormous potential for building more compute-efficient models, as exemplified by the success of convolutional networks in the image domain. In this work, we systematically explore structured matrices as replacements for dense matrices. We show that different structures often require drastically different initialization scales and learning rates, which are crucial to performance, especially as models scale. Using insights from the Maximal Update Parameterization, we determine the optimal scaling for initialization and learning rates of these unconventional layers. Finally, we measure the scaling laws of different structures to compare how quickly their performance improves with compute. We propose a novel matrix family containing Monarch matrices, the Block Tensor-Train (BTT), which we show performs better than dense matrices for the same compute on multiple tasks. On CIFAR-10/100 with augmentation, BTT achieves exponentially lower training loss than dense when training MLPs and ViTs. BTT matches dense ViT-S/32 performance on ImageNet-1k with 3.8 times less compute and is more efficient than dense for training small GPT-2 language models.
Submitted 10 June, 2024;
originally announced June 2024.
-
CHEOPS in-flight performance: A comprehensive look at the first 3.5 years of operations
Authors:
A. Fortier,
A. E. Simon,
C. Broeg,
G. Olofsson,
A. Deline,
T. G. Wilson,
P. F. L. Maxted,
A. Brandeker,
A. Collier Cameron,
M. Beck,
A. Bekkelien,
N. Billot,
A. Bonfanti,
G. Bruno,
J. Cabrera,
L. Delrez,
B. -O. Demory,
D. Futyan,
H. -G. Florén,
M. N. Günther,
A. Heitzmann,
S. Hoyer,
K. G. Isaak,
S. G. Sousa,
M. Stalport
, et al. (106 additional authors not shown)
Abstract:
CHEOPS is a space telescope specifically designed to monitor transiting exoplanets orbiting bright stars. In September 2023, CHEOPS completed its nominal mission and remains in excellent operational conditions. The mission has been extended until the end of 2026. Scientific and instrumental data have been collected throughout in-orbit commissioning and nominal operations, enabling a comprehensive analysis of the mission's performance. In this article, we present the results of this analysis with a twofold goal. First, we aim to inform the scientific community about the present status of the mission and what can be expected as the instrument ages. Secondly, we intend for this publication to serve as a legacy document for future missions, providing insights and lessons learned from the successful operation of CHEOPS. To evaluate the instrument performance in flight, we developed a comprehensive monitoring and characterisation programme. It consists of dedicated observations that allow us to characterise the instrument's response. In addition to the standard collection of nominal science and housekeeping data, these observations provide input for detecting, modelling, and correcting instrument systematics, discovering and addressing anomalies, and comparing the instrument's actual performance with expectations. The precision of the CHEOPS measurements has enabled the mission objectives to be met and exceeded. Careful modelling of the instrumental systematics allows the data quality to be significantly improved during the light curve analysis phase, resulting in more precise scientific measurements. CHEOPS is compliant with the driving scientific requirements of the mission. Although visible, the ageing of the instrument has not affected the mission's performance.
Submitted 3 June, 2024;
originally announced June 2024.
-
HIP 41378 observed by CHEOPS: Where is planet d?
Authors:
S. Sulis,
L. Borsato,
S. Grouffal,
H. P. Osborn,
A. Santerne,
A. Brandeker,
M. N. Günther,
A. Heitzmann,
M. Lendl,
M. Fridlund,
D. Gandolfi,
Y. Alibert,
R. Alonso,
T. Bárczy,
D. Barrado Navascues,
S. C. Barros,
W. Baumjohann,
T. Beck,
W. Benz,
M. Bergomi,
N. Billot,
A. Bonfanti,
C. Broeg,
A. Collier Cameron,
C. Corral van Damme
, et al. (62 additional authors not shown)
Abstract:
HIP 41378 d is a long-period planet that has only been observed to transit twice, three years apart, with K2. According to stability considerations and a partial detection of the Rossiter-McLaughlin effect, $P_\mathrm{d} = 278.36$ d has been determined to be the most likely orbital period. We targeted HIP 41378 d with CHEOPS at the predicted transit timing based on $P_\mathrm{d}= 278.36$ d, but the observations show no transit. We find that large ($>22.4$ hours) transit timing variations (TTVs) could explain this non-detection during the CHEOPS observation window. We also investigated the possibility of an incorrect orbital solution, which would have major implications for our knowledge of this system. If $P_\mathrm{d} \neq 278.36$ d, the periods that minimize the eccentricity would be $101.22$ d and $371.14$ d. The shortest orbital period will be tested by TESS, which will observe HIP 41378 in Sector 88 starting in January 2025. Our study shows the importance of a mission like CHEOPS, which today is the only mission able to make long observations (i.e., from space) to track the ephemeris of long-period planets possibly affected by large TTVs.
Submitted 30 May, 2024;
originally announced May 2024.
-
Environmental Effects on the Stellar Mass Function in a z~3.3 Overdensity of Galaxies in the COSMOS Field
Authors:
Ben Forrest,
Brian C. Lemaux,
Ekta A. Shah,
Priti Staab,
Roy R. Gal,
Lori M. Lubin,
M. C. Cooper,
Olga Cucciati,
Denise Hung,
Ian McConachie,
Adam Muzzin,
Gillian Wilson,
Sandro Bardelli,
Letizia P. Cassarà,
Wenjun Chang,
Finn Giddings,
Emmet Golden-Marx,
Nimish Hathi,
Stephanie M. Urbano Stawinski,
Elena Zucca
Abstract:
We present an analysis of the number density of galaxies as a function of stellar mass (i.e., the stellar mass function, SMF) in the COSMOS field at z~3.3, making a comparison between the SMF in overdense environments and the SMF in the coeval field. In particular, this region contains the Elentári proto-supercluster, a system of 6 extended overdensities spanning ~70 cMpc on a side. A clear difference is seen in the high-mass slope of these SMFs, with overdense regions showing an increase in the ratio of high-mass galaxies to low-mass galaxies relative to the field, indicating a more rapid build-up of stellar mass in overdense environments. This result qualitatively agrees with analyses of clusters at z~1, though the differences between protocluster and field SMFs at z~3.3 are smaller. While this is consistent with overdensities enhancing the evolution of their member galaxies, potentially through increased merger rates, whether this enhancement begins in protocluster environments or even earlier in group environments is still unclear. Though the measured fractions of quiescent galaxies between the field and overdense environments do not vary significantly, implying that this stellar mass enhancement is ongoing and any starbursts triggered by merger activity have not yet quenched, we note that spectroscopic observations are biased towards star-forming populations, particularly for low-mass galaxies. If mergers are indeed responsible, high resolution imaging of Elentári and similar structures at these early epochs should then reveal increased merger rates relative to the field. Larger samples of well-characterized overdensities are necessary to draw broader conclusions in these areas.
Submitted 3 June, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
-
Photo-dynamical characterisation of the TOI-178 resonant chain
Authors:
A. Leleu,
J. -B. Delisle,
L. Delrez,
E. M. Bryant,
A. Brandeker,
H. P. Osborn,
N. Hara,
T. G. Wilson,
N. Billot,
M. Lendl,
D. Ehrenreich,
H. Chakraborty,
M. N. Günther,
M. J. Hooton,
Y. Alibert,
R. Alonso,
D. R. Alves,
D. R. Anderson,
I. Apergis,
D. Armstrong,
T. Bárczy,
D. Barrado Navascues,
S. C. C. Barros,
M. P. Battley,
W. Baumjohann
, et al. (82 additional authors not shown)
Abstract:
The TOI-178 system consists of a nearby late K-dwarf transited by six planets in the super-Earth to mini-Neptune regime, with radii ranging from 1.2 to 2.9 Earth radii and orbital periods between 1.9 and 20.7 days. All planets but the innermost one form a chain of Laplace resonances. The fine-tuning and fragility of such orbital configurations ensure that no significant scattering or collision event has taken place since the formation and migration of the planets in the protoplanetary disc, hence providing important anchors for planet formation models. We aim to improve the characterisation of the architecture of this key system, and in particular the masses and radii of its planets. In addition, since this system is one of the few resonant chains that can be characterised by both photometry and radial velocities, we aim to use it as a test bench for the robustness of the planetary mass determination with each technique. We perform a global analysis of all available photometry and radial velocity data. We also try different sets of priors on the masses and eccentricities, as well as different stellar activity models, to study their effects on the masses estimated by each method. We show how stellar activity prevents us from obtaining a robust mass estimate for the three outer planets using radial velocity data alone. We also show that our joint photo-dynamical and radial velocity analysis results in a robust mass determination for planets c to g, with a precision of 12% for the mass of planet c, and better than 10% for planets d to g. The new precisions on the radii range from 2 to 3%. The understanding of this synergy between photometric and radial velocity measurements will be valuable during the PLATO mission. We also show that TOI-178 is indeed currently locked in the resonant configuration, librating around an equilibrium of the chain.
Submitted 22 May, 2024;
originally announced May 2024.
-
Gliese 12 b, A Temperate Earth-sized Planet at 12 Parsecs Discovered with TESS and CHEOPS
Authors:
Shishir Dholakia,
Larissa Palethorpe,
Alexander Venner,
Annelies Mortier,
Thomas G. Wilson,
Chelsea X. Huang,
Ken Rice,
Vincent Van Eylen,
Emma Nabbie,
Ryan Cloutier,
Walter Boschin,
David Ciardi,
Laetitia Delrez,
Georgina Dransfield,
Elsa Ducrot,
Zahra Essack,
Mark E. Everett,
Michaël Gillon,
Matthew J. Hooton,
Michelle Kunimoto,
David W. Latham,
Mercedes López-Morales,
Bin Li,
Fan Li,
Scott McDermott
, et al. (11 additional authors not shown)
Abstract:
We report on the discovery of Gliese 12 b, the nearest transiting temperate, Earth-sized planet found to date. Gliese 12 is a bright ($V=12.6$ mag, $K=7.8$ mag) metal-poor M4V star only $12.162\pm0.005$ pc away from the Solar System with one of the lowest stellar activity levels known for an M-dwarf. A planet candidate was detected by TESS based on only 3 transits in sectors 42, 43, and 57, with an ambiguity in the orbital period due to observational gaps. We performed follow-up transit observations with CHEOPS and ground-based photometry with MINERVA-Australis, SPECULOOS, and Purple Mountain Observatory, as well as further TESS observations in sector 70. We statistically validate Gliese 12 b as a planet with an orbital period of $12.76144\pm0.00006$ days and a radius of $1.0\pm{0.1}$ R$_\oplus$, resulting in an equilibrium temperature of $\sim$315K. Gliese 12 b has excellent future prospects for precise mass measurement, which may inform how planetary internal structure is affected by the stellar compositional environment. Gliese 12 b also represents one of the best targets to study whether Earth-like planets orbiting cool stars can retain their atmospheres, a crucial step to advance our understanding of habitability on Earth and across the Galaxy.
Submitted 21 May, 2024;
originally announced May 2024.
-
Genetic Drift Regularization: on preventing Actor Injection from breaking Evolution Strategies
Authors:
Paul Templier,
Emmanuel Rachelson,
Antoine Cully,
Dennis G. Wilson
Abstract:
Evolutionary Algorithms (EA) have been successfully used for the optimization of neural networks for policy search, but they still remain sample inefficient and underperforming in some cases compared to gradient-based reinforcement learning (RL). Various methods combine the two approaches, many of them training an RL algorithm on data from EA evaluations and injecting the RL actor into the EA population. However, when using Evolution Strategies (ES) as the EA, the RL actor can drift genetically far from the ES distribution and injection can cause a collapse of the ES performance. Here, we highlight the phenomenon of genetic drift where the actor genome and the ES population distribution progressively drift apart, leading to injection having a negative impact on the ES. We introduce Genetic Drift Regularization (GDR), a simple regularization method in the actor training loss that prevents the actor genome from drifting away from the ES. We show that GDR can improve ES convergence on problems where RL learns well, but also helps RL training on other tasks, and fixes the injection issues better than previous controlled injection methods.
Submitted 7 May, 2024;
originally announced May 2024.
-
Quality with Just Enough Diversity in Evolutionary Policy Search
Authors:
Paul Templier,
Luca Grillotti,
Emmanuel Rachelson,
Dennis G. Wilson,
Antoine Cully
Abstract:
Evolution Strategies (ES) are effective gradient-free optimization methods that can be competitive with gradient-based approaches for policy search. ES only rely on the total episodic scores of solutions in their population, from which they estimate fitness gradients for their update with no access to true gradient information. However this makes them sensitive to deceptive fitness landscapes, and they tend to only explore one way to solve a problem. Quality-Diversity methods such as MAP-Elites introduced additional information with behavior descriptors (BD) to return a population of diverse solutions, which helps exploration but leads to a large part of the evaluation budget not being focused on finding the best performing solution. Here we show that behavior information can also be leveraged to find the best policy by identifying promising search areas which can then be efficiently explored with ES. We introduce the framework of Quality with Just Enough Diversity (JEDi) which learns the relationship between behavior and fitness to focus evaluations on solutions that matter. When trying to reach higher fitness values, JEDi outperforms both QD and ES methods on hard exploration tasks like mazes and on complex control problems with large policies.
Submitted 7 May, 2024;
originally announced May 2024.
-
Uncovering implementable dormant pruning decisions from three different stakeholder perspectives
Authors:
Deanna Flynn,
Abhinav Jain,
Heather Knight,
Cristina G. Wilson,
Cindy Grimm
Abstract:
Dormant pruning, or the removal of unproductive portions of a tree while a tree is not actively growing, is an important orchard task to help maintain yield, requiring years to build expertise. Because of long training periods and an increasing labor shortage in agricultural jobs, pruning could benefit from robotic automation. However, to program robots to prune branches, we first need to understand how pruning decisions are made, and what variables in the environment (e.g., branch size and thickness) we need to capture. Working directly with three pruning stakeholders -- horticulturists, growers, and pruners -- we find that each group of human experts approaches pruning decision-making differently. To capture this knowledge, we present three studies and two extracted pruning protocols from field work conducted in Prosser, Washington in January 2022 and 2023. We interviewed six stakeholders (two in each group) and observed pruning across three cultivars -- Bing Cherries, Envy Apples, and Jazz Apples -- and two tree architectures -- Upright Fruiting Offshoot and V-Trellis. Leveraging participant interviews and video data, this analysis uses grounded coding to extract pruning terminology, discover horticultural contexts that influence pruning decisions, and find implementable pruning heuristics for autonomous systems. The results include a validated terminology set, which we offer for use by both pruning stakeholders and roboticists, to communicate general pruning concepts and heuristics. The results also highlight seven pruning heuristics utilizing this terminology set that would be relevant for use by future autonomous robot pruning systems, and characterize three discovered horticultural contexts (i.e., environmental management, crop-load management, and replacement wood) across all three cultivars.
Submitted 7 May, 2024;
originally announced May 2024.
-
It Will Never Work in Theory
Authors:
Greg Wilson,
Jorge Aranda,
Michael Hoye,
Brittany Johnson
Abstract:
We have been trying to get software engineering researchers and practitioners to talk to one another for over a decade. This paper describes what we have done, assesses our impact, and recommends an approach that we hope will have greater success.
Submitted 16 February, 2024;
originally announced May 2024.
-
Modeling Caption Diversity in Contrastive Vision-Language Pretraining
Authors:
Samuel Lavoie,
Polina Kirichenko,
Mark Ibrahim,
Mahmoud Assran,
Andrew Gordon Wilson,
Aaron Courville,
Nicolas Ballas
Abstract:
There are a thousand ways to caption an image. Contrastive Language-Image Pretraining (CLIP), on the other hand, works by mapping an image and its caption to a single vector -- limiting how well CLIP-like models can represent the diverse ways to describe an image. In this work, we introduce Llip, Latent Language Image Pretraining, which models the diversity of captions that could match an image. Llip's vision encoder outputs a set of visual features that are mixed into a final representation by conditioning on information derived from the text. We show that Llip outperforms non-contextualized baselines like CLIP and SigLIP on a variety of tasks even with large-scale encoders. Llip improves zero-shot classification by an average of 2.9% across benchmarks with a ViT-G/14 encoder. Specifically, Llip attains a zero-shot top-1 accuracy of 83.5% on ImageNet, outperforming a similarly sized CLIP by 1.4%. We also demonstrate improvement on zero-shot retrieval on MS-COCO by 6.0%. We provide a comprehensive analysis of the components introduced by the method and demonstrate that Llip leads to richer visual representations.
Submitted 14 May, 2024; v1 submitted 29 April, 2024;
originally announced May 2024.
-
MAGAZ3NE: Massive, Extremely Dusty Galaxies at $z\sim2$ Lead to Photometric Overestimation of Number Densities of the Most Massive Galaxies at $3<z<4$
Authors:
Ben Forrest,
M. C. Cooper,
Adam Muzzin,
Gillian Wilson,
Danilo Marchesini,
Ian McConachie,
Percy Gomez,
Marianna Annunziatella,
Z. Cemile Marsan,
Joey Braspenning,
Wenjun Chang,
Gabriella de Lucia,
Fabio Fontanot,
Michaela Hirschmann,
Dylan Nelson,
Annalisa Pillepich,
Joop Schaye,
Stephanie M. Urbano Stawinski,
Mauro Stefanon,
Lizhi Xie
Abstract:
We present rest-frame optical spectra from Keck/MOSFIRE and Keck/NIRES of 16 candidate ultramassive galaxies targeted as part of the Massive Ancient Galaxies at $z>3$ Near-Infrared (MAGAZ3NE) Survey. These candidates were selected to have photometric redshifts $3\lesssim z_{\rm phot}<4$, photometric stellar masses log($M$/M$_\odot$)$>11.7$, and well-sampled photometric spectral energy distributions (SEDs) from the UltraVISTA and VIDEO surveys. In contrast to previous spectroscopic observations of blue star-forming and post-starburst ultramassive galaxies, candidates in this sample have very red SEDs implying significant dust attenuation, old stellar ages, and/or active galactic nuclei (AGN). Of these galaxies, eight are revealed to be heavily dust-obscured $2.0<z<2.7$ galaxies with strong emission lines, some showing broad features indicative of AGN, three are Type I AGN hosts at $z>3$, one is a $z\sim1.2$ dusty galaxy, and four galaxies do not have a confirmed spectroscopic redshift. In fact, none of the sample has |$z_{\rm spec}-z_{\rm phot}$|$<0.5$, suggesting difficulties for photometric redshift programs in fitting similarly red SEDs. The prevalence of these red interloper galaxies suggests that the number densities of high-mass galaxies are overestimated at $z\gtrsim3$ in large photometric surveys, helping to resolve the `impossibly early galaxy problem' and leading to much better agreement with cosmological galaxy simulations. A more complete spectroscopic survey of ultramassive galaxies is required to pin down the uncertainties on their number densities in the early universe.
Submitted 29 April, 2024;
originally announced April 2024.
-
Spectroscopic Confirmation of an Ultra-Massive Galaxy in a Protocluster at $z \sim 4.9$
Authors:
Stephanie M. Urbano Stawinski,
M. C. Cooper,
Ben Forrest,
Adam Muzzin,
Danilo Marchesini,
Gillian Wilson,
Percy Gomez,
Ian McConachie,
Z. Cemile Marsan,
Marianna Annunziatella,
Wenjun Chang
Abstract:
We present spectroscopic confirmation of an ultra-massive galaxy (UMG) with $\log(M_\star/M_\odot) = 10.98 \pm 0.07$ at $z_\mathrm{spec} = 4.8947$ in the Extended Groth Strip (EGS), based on deep observations of Ly$α$ emission with Keck/DEIMOS. The ultra-massive galaxy (UMG-28740) is the most massive member in one of the most significant overdensities in the EGS, with four additional photometric members with $\log(M_\star/M_\odot) > 10.5$ within $R_\mathrm{proj} \sim 1$ cMpc. The Ly$α$ profile is highly asymmetric ($A_f = 3.56$), suggesting the presence of neutral gas within the interstellar medium, circumgalactic medium, or via AGN-driven outflows. Spectral energy distribution (SED) fitting using a large suite of star formation histories and two sets of high-quality photometry from ground- and space-based facilities consistently estimates the stellar mass of UMG-28740 to be $\log(M_\star/M_\odot) \sim 11$ with a small standard deviation between measurements ($σ= 0.07$). While the best-fit SED models agree on stellar mass, we find discrepancies in the estimated star formation rate for UMG-28740, resulting in either a star-forming or quiescent system. JWST/NIRCam photometry of UMG-28740 strongly favors a quiescent scenario, demonstrating the need for high-quality mid-IR observations. Assuming the galaxy to be quiescent, UMG-28740 formed the bulk of its stars at $z > 10$ and is quenching at $z \sim 8$, resulting in a high star formation efficiency at high redshift ($ε\sim 0.2$ at $z \sim 5$ and $ε\gtrsim 1$ at $z \gtrsim 8$). As the most massive galaxy in its protocluster environment, UMG-28740 is a unique example of the impossibly early galaxy problem.
Submitted 12 June, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
-
Characterisation of the TOI-421 planetary system using CHEOPS, TESS, and archival radial velocity data
Authors:
A. F. Krenn,
D. Kubyshkina,
L. Fossati,
J. A. Egger,
A. Bonfanti,
A. Deline,
D. Ehrenreich,
M. Beck,
W. Benz,
J. Cabrera,
T. G. Wilson,
A. Leleu,
S. G. Sousa,
V. Adibekyan,
A. C. M. Correira,
Y. Alibert,
L. Delrez,
M. Lendl,
J. A. Patel,
J. Venturini,
R. Alonso,
G. Anglada,
J. Asquier,
T. Bárczy,
D. Barrado Navascues
, et al. (66 additional authors not shown)
Abstract:
The TOI-421 planetary system contains two sub-Neptune-type planets and is a prime target to study the formation and evolution of planets and their atmospheres. The inner planet is especially interesting as the existence of a hydrogen-dominated atmosphere at its orbital separation cannot be explained by current formation models without previous orbital migration. We jointly analysed photometric data of three TESS sectors and six CHEOPS visits as well as 156 radial velocity data points to retrieve improved planetary parameters. We also searched for TTVs and modelled the interior structure of the planets. Finally, we simulated the evolution of the primordial H-He atmospheres of the planets using two different modelling frameworks. We determine the planetary radii and masses of TOI-421 b and c to be $R_{\rm b} = 2.64 \pm 0.08 \, R_{\oplus}$, $M_{\rm b} = 6.7 \pm 0.6 \, M_{\oplus}$, $R_{\rm c} = 5.09 \pm 0.07 \, R_{\oplus}$, and $M_{\rm c} = 14.1 \pm 1.4 \, M_{\oplus}$. We do not detect any statistically significant TTV signals. Assuming the presence of a hydrogen-dominated atmosphere, the interior structure modelling results in both planets having extensive envelopes. While the modelling of the atmospheric evolution predicts that TOI-421 b has lost any primordial atmosphere that it could have accreted at its current orbital position, TOI-421 c could have started out with an initial atmospheric mass fraction somewhere between 10 and 35%. We conclude that the low observed mean density of TOI-421 b can only be explained by a bias in the measured planetary parameters (e.g. driven by high-altitude clouds) and/or by orbital migration. We also find that the results of atmospheric evolution models are strongly dependent on the employed planetary structure model.
Submitted 17 April, 2024;
originally announced April 2024.
-
Resolved UV and optical color gradients reveal environmental influence on galaxy evolution at redshift z$\sim$1.6
Authors:
William J. Cramer,
A. G. Noble,
G. Rudnick,
A. Pigarelli,
G. Wilson,
Y. M. Bahé,
M. C. Cooper,
R. Demarco,
J. Matharu,
T. B. Miller,
A. Muzzin,
J. Nantais,
W. Sportsman,
E. van Kampen,
T. M. A. Webb,
H. K. C. Yee
Abstract:
The changes in colors across a galaxy are intimately connected to the galaxy's formation, growth, quenching history, and dust content. A particularly important epoch in the growth of galaxies is near $z \sim 2$, often referred to as 'cosmic noon', where galaxies on average reach the peak of their star formation. We study a population of 125 cluster galaxies at $z \sim 1.6$ in three Hubble Space Telescope (HST) filters, F475W, F625W, and F160W, roughly corresponding to the rest-frame FUV, NUV, and r band, respectively. By comparing to a control sample of 200 field galaxies at similar redshift, we reveal clear, statistically significant differences in the overall spatially resolved colors and color gradients in galaxies across these two different environments. On average, cluster galaxies have redder UV colors in both the inner and outer regions bounded by $r_{\mathrm{50}}$, as well as an overall wider dispersion of outside-in color gradients. The presence of these observed differences, along with evidence from ancillary data from previous studies, strongly suggests that the environment drives these population-level color differences, by affecting the stellar populations and/or dust content.
Submitted 10 April, 2024;
originally announced April 2024.
-
Evolution of the nuclear spin-orbit splitting explored via the $^{32}$Si($d$,$p$)$^{33}$Si reaction using SOLARIS
Authors:
J. Chen,
B. P. Kay,
C. R. Hoffman,
T. L. Tang,
I. A. Tolstukhin,
D. Bazin,
R. S. Lubna,
Y. Ayyad,
S. Beceiro-Novo,
B. J. Coombes,
S. J. Freeman,
L. P. Gaffney,
R. Garg,
H. Jayatissa,
A. N. Kuchera,
P. MacGregor,
A. J. Mitchell,
W. Mittig,
B. Monteagudo,
A. Munoz-Ramos,
C. Müller-Gatermann,
F. Recchia,
N. Rijal,
C. Santamaria,
M. Z. Serikow
, et al. (8 additional authors not shown)
Abstract:
The spin-orbit splitting between neutron 1$p$ orbitals at $^{33}$Si has been deduced using the single-neutron-adding ($d$,$p$) reaction in inverse kinematics with a beam of $^{32}$Si, a long-lived radioisotope. Reaction products were analyzed by the newly implemented SOLARIS spectrometer at the reaccelerated-beam facility at the National Superconducting Cyclotron Laboratory. The measurements show reasonable agreement with shell-model calculations that incorporate modern cross-shell interactions, but they contradict the prediction of proton density depletion based on relativistic mean-field theory. The evolution of the neutron 1$p$-shell orbitals is systematically studied using the present and existing data in the isotonic chains of $N=17$, 19, and 21. In each case, a smooth decrease in the separation of the $1p_{3/2}$-$1p_{1/2}$ orbitals is seen as the respective $p$-orbitals approach zero binding, suggesting that the finite nuclear potential strongly influences the evolution of nuclear structure in this region.
Submitted 8 April, 2024;
originally announced April 2024.
-
Improved Constraints on Mergers with SZ, Hydrodynamical simulations, Optical, and X-ray (ICM-SHOX). Paper II: Galaxy cluster sample overview
Authors:
Emily M. Silich,
Elena Bellomi,
Jack Sayers,
John ZuHone,
Urmila Chadayammuri,
Sunil Golwala,
David Hughes,
Alfredo Montaña,
Tony Mroczkowski,
Daisuke Nagai,
David Sánchez,
S. A. Stanford,
Grant Wilson,
Michael Zemcov,
Adi Zitrin
Abstract:
Galaxy cluster mergers are representative of a wide range of physics, making them an excellent probe of the properties of dark matter and the ionized plasma of the intracluster medium. To date, most studies have focused on mergers occurring in the plane of the sky, where morphological features can be readily identified. To allow study of mergers with arbitrary orientation, we have assembled multi-probe data for the eight-cluster ICM-SHOX sample sensitive to both morphology and line of sight velocity. The first ICM-SHOX paper (Silich+2023) provided an overview of our methodology applied to one member of the sample, MACS J0018.5+1626, in order to constrain its merger geometry. That work resulted in an exciting new discovery of a velocity space decoupling of its gas and dark matter distributions. In this work, we describe the availability and quality of multi-probe data for the full ICM-SHOX galaxy cluster sample. These datasets will form the observational basis of an upcoming full ICM-SHOX galaxy cluster sample analysis.
Submitted 5 April, 2024;
originally announced April 2024.
-
Detailed cool star flare morphology with CHEOPS and TESS
Authors:
G. Bruno,
I. Pagano,
G. Scandariato,
H. -G. Florén,
A. Brandeker,
G. Olofsson,
P. F. L. Maxted,
A. Fortier,
S. G. Sousa,
S. Sulis,
V. Van Grootel,
Z. Garai,
A. Boldog,
L. Kriskovics,
M. Gy. Szabó,
D. Gandolfi,
Y. Alibert,
R. Alonso,
T. Bárczy,
D. Barrado Navascues,
S. C. C. Barros,
W. Baumjohann,
M. Beck,
T. Beck,
W. Benz
, et al. (57 additional authors not shown)
Abstract:
Context. White-light stellar flares are proxies for some of the most energetic types of flares, but their triggering mechanism is still poorly understood. As they are associated with strong X and UV emission, their study is particularly relevant to estimate the amount of high-energy irradiation onto the atmospheres of exoplanets, especially those in their stars' habitable zone. Aims. We used the high-cadence, high-photometric capabilities of the CHEOPS and TESS space telescopes to study the detailed morphology of white-light flares occurring in a sample of 130 late-K and M stars, and compared our findings with results obtained at a lower cadence. We developed dedicated software for this purpose. Results. Multi-peak flares represent a significant percentage ($\gtrsim 30$\%) of the detected outburst events. Our findings suggest that high-impulse flares are more frequent than suspected from lower-cadence data, so that the most impactful flux levels that hit close-in exoplanets might be more time-limited than expected. We found significant differences in the duration distributions of single-peak and complex flare components, but not in their peak luminosity. A statistical analysis of the flare parameter distributions provides marginal support for their description with a log-normal instead of a power-law function, leaving the door open to several flare formation scenarios. We tentatively confirmed previous results about quasi-periodic pulsations in high-cadence photometry, report the possible detection of a pre-flare dip, and did not find hints of photometric variability due to an undetected flare background. Conclusions. The high-cadence study of stellar hosts might be crucial to evaluate the impact of their flares on close-in exoplanets, as their impulsive phase emission might otherwise be incorrectly estimated. Future telescopes such as PLATO and Ariel will help in this respect.
Submitted 25 March, 2024;
originally announced March 2024.
-
Precise characterisation of HD 15337 with CHEOPS: a laboratory for planet formation and evolution
Authors:
N. M. Rosário,
O. D. S. Demangeon,
S. C. C. Barros,
D. Gandolfi,
J. A. Egger,
L. M. Serrano,
H. P. Osborn,
M. Beck,
W. Benz,
H. -G. Florén,
P. Guterman,
T. G. Wilson,
Y. Alibert,
L. Fossati,
M. J. Hooton,
L. Delrez,
N. C. Santos,
S. G. Sousa,
A. Bonfanti,
S. Salmon,
V. Adibekyan,
A. Nigioni,
J. Venturini,
R. Alonso,
G. Anglada
, et al. (68 additional authors not shown)
Abstract:
We aim to constrain the internal structure and composition of HD 15337 b and c, two short-period planets situated on opposite sides of the radius valley, using new transit photometry and radial velocity data. We acquire 6 new transit visits with the CHaracterising ExOPlanet Satellite (CHEOPS) and 32 new radial velocity measurements from the High Accuracy Radial Velocity Planet Searcher (HARPS) to improve the accuracy of the mass and radius estimates for both planets. We reanalyse light curves from TESS sectors 3 and 4 and analyse new data from sector 30, correcting for long-term stellar activity. Subsequently, we perform a joint fit of the TESS and CHEOPS light curves, and all available RV data from HARPS and the Planet Finder Spectrograph (PFS). Our model fits the planetary signals, the stellar activity signal and the instrumental decorrelation model for the CHEOPS data simultaneously. The stellar activity was modelled using a Gaussian-process regression on both the RV and activity indicators. We finally employ a Bayesian retrieval code to determine the internal composition and structure of the planets. We derive updated and highly precise parameters for the HD 15337 system. Our improved precision on the planetary parameters makes HD 15337 b one of the most precisely characterised rocky exoplanets, with radius and mass measurements achieving a precision better than 2\% and 7\%, respectively. We are able to improve the precision of the radius measurement of HD 15337 c to 3\%. Our results imply that the composition of HD 15337 b is predominantly rocky, while HD 15337 c exhibits a gas envelope with a mass of at least $0.01\ M_\oplus$. Our results lay the groundwork for future studies, which can further unravel the atmospheric evolution of these exoplanets and give new insights into their composition and formation history and the causes behind the radius gap.
Submitted 25 March, 2024;
originally announced March 2024.
-
Generating Potent Poisons and Backdoors from Scratch with Guided Diffusion
Authors:
Hossein Souri,
Arpit Bansal,
Hamid Kazemi,
Liam Fowl,
Aniruddha Saha,
Jonas Geiping,
Andrew Gordon Wilson,
Rama Chellappa,
Tom Goldstein,
Micah Goldblum
Abstract:
Modern neural networks are often trained on massive datasets that are web scraped with minimal human inspection. As a result of this insecure curation pipeline, an adversary can poison or backdoor the resulting model by uploading malicious data to the internet and waiting for a victim to scrape and train on it. Existing approaches for creating poisons and backdoors start with randomly sampled clean data, called base samples, and then modify those samples to craft poisons. However, some base samples may be significantly more amenable to poisoning than others. As a result, we may be able to craft more potent poisons by carefully choosing the base samples. In this work, we use guided diffusion to synthesize base samples from scratch that lead to significantly more potent poisons and backdoors than previous state-of-the-art attacks. Our Guided Diffusion Poisoning (GDP) base samples can be combined with any downstream poisoning or backdoor attack to boost its effectiveness. Our implementation code is publicly available at: https://github.com/hsouri/GDP .
Submitted 24 March, 2024;
originally announced March 2024.
-
Searching Search Spaces: Meta-evolving a Geometric Encoding for Neural Networks
Authors:
Tarek Kunze,
Paul Templier,
Dennis G Wilson
Abstract:
In evolutionary policy search, neural networks are usually represented using a direct mapping: each gene encodes one network weight. Indirect encoding methods, where each gene can encode for multiple weights, shorten the genome to reduce the dimensions of the search space and better exploit permutations and symmetries. The Geometric Encoding for Neural network Evolution (GENE) introduced an indirect encoding where the weight of a connection is computed as the (pseudo-)distance between the two linked neurons, leading to a genome size growing linearly with the number of genes instead of quadratically in direct encoding. However GENE still relies on hand-crafted distance functions with no prior optimization. Here we show that better performing distance functions can be found for GENE using Cartesian Genetic Programming (CGP) in a meta-evolution approach, hence optimizing the encoding to create a search space that is easier to exploit. We show that GENE with a learned function can outperform both direct encoding and the hand-crafted distances, generalizing on unseen problems, and we study how the encoding impacts neural network properties.
Submitted 20 March, 2024;
originally announced March 2024.
-
Mind the GAP: Improving Robustness to Subpopulation Shifts with Group-Aware Priors
Authors:
Tim G. J. Rudner,
Ya Shi Zhang,
Andrew Gordon Wilson,
Julia Kempe
Abstract:
Machine learning models often perform poorly under subpopulation shifts in the data distribution. Developing methods that allow machine learning models to better generalize to such shifts is crucial for safe deployment in real-world settings. In this paper, we develop a family of group-aware prior (GAP) distributions over neural network parameters that explicitly favor models that generalize well under subpopulation shifts. We design a simple group-aware prior that only requires access to a small set of data with group information and demonstrate that training with this prior yields state-of-the-art performance -- even when only retraining the final layer of a previously trained non-robust model. Group-aware priors are conceptually simple, complementary to existing approaches, such as attribute pseudo labeling and data reweighting, and open up promising new avenues for harnessing Bayesian inference to enable robustness to subpopulation shifts.
Submitted 14 March, 2024;
originally announced March 2024.
-
Chronos: Learning the Language of Time Series
Authors:
Abdul Fatir Ansari,
Lorenzo Stella,
Caner Turkmen,
Xiyuan Zhang,
Pedro Mercado,
Huibin Shen,
Oleksandr Shchur,
Syama Sundar Rangapuram,
Sebastian Pineda Arango,
Shubham Kapoor,
Jasper Zschiegner,
Danielle C. Maddix,
Hao Wang,
Michael W. Mahoney,
Kari Torkkola,
Andrew Gordon Wilson,
Michael Bohlke-Schneider,
Yuyang Wang
Abstract:
We introduce Chronos, a simple yet effective framework for pretrained probabilistic time series models. Chronos tokenizes time series values using scaling and quantization into a fixed vocabulary and trains existing transformer-based language model architectures on these tokenized time series via the cross-entropy loss. We pretrained Chronos models based on the T5 family (ranging from 20M to 710M parameters) on a large collection of publicly available datasets, complemented by a synthetic dataset that we generated via Gaussian processes to improve generalization. In a comprehensive benchmark consisting of 42 datasets, and comprising both classical local models and deep learning methods, we show that Chronos models: (a) significantly outperform other methods on datasets that were part of the training corpus; and (b) have comparable and occasionally superior zero-shot performance on new datasets, relative to methods that were trained specifically on them. Our results demonstrate that Chronos models can leverage time series data from diverse domains to improve zero-shot accuracy on unseen forecasting tasks, positioning pretrained models as a viable tool to greatly simplify forecasting pipelines.
Submitted 2 May, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
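The Chronos abstract above describes turning real-valued series into tokens through scaling and quantization. The following is a minimal sketch of that idea, not the authors' implementation: the mean-absolute scaling, the equal-width bins, the bin count, the bin range, and the function names `bin_edges`, `tokenize`, and `detokenize` are all assumptions made for illustration.

```python
from bisect import bisect_right

def bin_edges(num_bins, low, high):
    """Interior edges of num_bins equal-width bins spanning [low, high]."""
    step = (high - low) / num_bins
    return [low + i * step for i in range(1, num_bins)]

def tokenize(series, num_bins=10, low=-5.0, high=5.0):
    """Scale by the mean absolute value (assumed), then map values to bin indices."""
    scale = sum(abs(x) for x in series) / len(series)
    if scale == 0:
        scale = 1.0  # avoid dividing by zero for an all-zero series
    edges = bin_edges(num_bins, low, high)
    # Values outside [low, high] fall into the first or last bin.
    tokens = [bisect_right(edges, x / scale) for x in series]
    return tokens, scale

def detokenize(tokens, scale, num_bins=10, low=-5.0, high=5.0):
    """Map each token back to its bin centre and undo the scaling."""
    step = (high - low) / num_bins
    return [(low + (t + 0.5) * step) * scale for t in tokens]

tokens, scale = tokenize([1.0, 2.0, 3.0])
approx = detokenize(tokens, scale)
```

Because every token is an integer in a fixed range, a sequence of them can be fed to a language model trained with cross-entropy loss, which is the property the abstract relies on. The round trip is lossy: values are recovered only to the resolution of the bins.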
-
Homotopy type of shellable $q$-complexes and their homology groups
Authors:
Sudhir R. Ghorpade,
Rakhi Pratihar,
Tovohery H. Randrianarisoa,
Hugues Verdure,
Glen Wilson
Abstract:
The theory of shellable simplicial complexes brings together combinatorics, algebra, and topology in a remarkable way. Initially introduced by Alder for $q$-simplicial complexes, recent work of Ghorpade, Pratihar, and Randrianarisoa extends the study of shellability to $q$-matroid complexes and determines singular homology groups for a subclass of these $q$-simplicial complexes. In this paper, we determine the homotopy type of shellable $q$-simplicial complexes. Moreover, we establish the shellability of order complexes from lexicographically shellable $q$-simplicial complexes, that include the $q$-matroid complexes. This results in a comprehensive determination of the homology groups for any lexicographically shellable $q$-complexes.
Submitted 11 March, 2024;
originally announced March 2024.
-
A Cross-Modal Approach to Silent Speech with LLM-Enhanced Recognition
Authors:
Tyler Benster,
Guy Wilson,
Reshef Elisha,
Francis R Willett,
Shaul Druckmann
Abstract:
Silent Speech Interfaces (SSIs) offer a noninvasive alternative to brain-computer interfaces for soundless verbal communication. We introduce Multimodal Orofacial Neural Audio (MONA), a system that leverages cross-modal alignment through novel loss functions--cross-contrast (crossCon) and supervised temporal contrast (supTcon)--to train a multimodal model with a shared latent representation. This architecture enables the use of audio-only datasets like LibriSpeech to improve silent speech recognition. Additionally, our introduction of Large Language Model (LLM) Integrated Scoring Adjustment (LISA) significantly improves recognition accuracy. Together, MONA LISA reduces the state-of-the-art word error rate (WER) from 28.8% to 12.2% in the Gaddy (2020) benchmark dataset for silent speech on an open vocabulary. For vocal EMG recordings, our method improves the state-of-the-art from 23.3% to 3.7% WER. In the Brain-to-Text 2024 competition, LISA performs best, improving the top WER from 9.8% to 8.9%. To the best of our knowledge, this work represents the first instance where noninvasive silent speech recognition on an open vocabulary has cleared the threshold of 15% WER, demonstrating that SSIs can be a viable alternative to automatic speech recognition (ASR). Our work not only narrows the performance gap between silent and vocalized speech but also opens new possibilities in human-computer interaction, demonstrating the potential of cross-modal approaches in noisy and data-limited regimes.
Submitted 2 March, 2024;
originally announced March 2024.
-
Confronting compositional confusion through the characterisation of the sub-Neptune orbiting HD 77946
Authors:
L. Palethorpe,
A. Anna John,
A. Mortier,
J. Davoult,
T. G. Wilson,
K. Rice,
A. C. Cameron,
Y. Alibert,
L. A. Buchhave,
L. Malavolta,
J. Cadman,
M. López-Morales,
X. Dumusque,
A. M. Silva,
S. N. Quinn,
V. Van Eylen,
S. Vissapragada,
L. Affer,
D. Charbonneau,
R. Cosentino,
A. Ghedina,
R. D. Haywood,
D. W. Latham,
F. Lienhard,
A. F. Martínez Fiorenzano
, et al. (7 additional authors not shown)
Abstract:
We report on the detailed characterization of the HD 77946 planetary system. HD 77946 is an F5 ($M_*$ = 1.17 M$_{\odot}$, $R_*$ = 1.31 R$_{\odot}$) star, which hosts a transiting planet recently discovered by NASA's Transiting Exoplanet Survey Satellite (TESS), classified as TOI-1778 b. Using TESS photometry, high-resolution spectroscopic data from HARPS-N, and photometry from CHEOPS, we measure the radius and mass from the transit and RV observations, and find that the planet, HD 77946 b, orbits with period $P_{\rm b}$ = $6.527282_{-0.000020}^{+0.000015}$ d, has a mass of $M_{\rm b} = 8.38\pm{1.32}$M$_\oplus$, and a radius of $R_{\rm b} = 2.705_{-0.081}^{+0.086}$R$_\oplus$. From the combination of mass and radius measurements, and the stellar chemical composition, the planet properties suggest that HD 77946 b is a sub-Neptune with a $\sim$1\% H/He atmosphere. However, a degeneracy still exists between water-world and silicate/iron-hydrogen models, and even though interior structure modelling of this planet favours a sub-Neptune with a H/He layer that makes up a significant fraction of its radius, a water-world composition cannot be ruled out, as with $T_{\rm eq} = 1248^{+40}_{-38}~$K, water may be in a supercritical state. The characterisation of HD 77946 b, adding to the small sample of well-characterised sub-Neptunes, is an important step forwards on our journey to understanding planetary formation and evolution pathways. Furthermore, HD 77946 b has one of the highest transmission spectroscopic metrics for small planets orbiting hot stars, thus transmission spectroscopy of this key planet could prove vital for constraining the compositional confusion that currently surrounds small exoplanets.
Submitted 1 May, 2024; v1 submitted 7 March, 2024;
originally announced March 2024.
-
Controllable Prompt Tuning For Balancing Group Distributional Robustness
Authors:
Hoang Phan,
Andrew Gordon Wilson,
Qi Lei
Abstract:
Models trained on data composed of different groups or domains can suffer from severe performance degradation under distribution shifts. While recent methods have largely focused on optimizing the worst-group objective, this often comes at the expense of good performance on other groups. To address this problem, we introduce an optimization scheme to achieve good performance across groups and find a good solution for all without severely sacrificing performance on any of them. However, directly applying such optimization involves updating the parameters of the entire network, making it both computationally expensive and challenging. Thus, we introduce Controllable Prompt Tuning (CPT), which couples our approach with prompt-tuning techniques. On spurious correlation benchmarks, our procedures achieve state-of-the-art results across both transformer and non-transformer architectures, as well as unimodal and multimodal data, while requiring only 0.4% tunable parameters.
Submitted 4 June, 2024; v1 submitted 5 March, 2024;
originally announced March 2024.
-
The tidal deformation and atmosphere of WASP-12b from its phase curve
Authors:
B. Akinsanmi,
S. C. C. Barros,
M. Lendl,
L. Carone,
P. E. Cubillos,
A. Bekkelien,
A. Fortier,
H. -G. Florén,
A. Collier Cameron,
G. Boué,
G. Bruno,
B. -O. Demory,
A. Brandeker,
S. G. Sousa,
T. G. Wilson,
A. Deline,
A. Bonfanti,
G. Scandariato,
M. J. Hooton,
A. C. M. Correia,
O. D. S. Demangeon,
A. M. S. Smith,
V. Singh,
Y. Alibert,
R. Alonso
, et al. (63 additional authors not shown)
Abstract:
Ultra-hot Jupiters present a unique opportunity to understand the physics and chemistry of planets at extreme conditions. WASP-12b stands out as an archetype of this class of exoplanets. We performed comprehensive analyses of the transits, occultations, and phase curves of WASP-12b by combining new CHEOPS observations with previous TESS and Spitzer data to measure the planet's tidal deformation, atmospheric properties, and orbital decay rate. The planet was modeled as a triaxial ellipsoid parameterized by the second-order fluid Love number, $h_2$, which quantifies its radial deformation and provides insight into the interior structure. We measured the tidal deformation of WASP-12b and estimated a Love number of $h_2=1.55_{-0.49}^{+0.45}$ (at 3.2$σ$) from its phase curve. We measured occultation depths of $333\pm24$ ppm and $493\pm29$ ppm in the CHEOPS and TESS bands, respectively, while the dayside emission spectrum indicates that CHEOPS and TESS probe similar pressure levels in the atmosphere at a temperature of 2900 K. We also estimated low geometric albedos of $0.086\pm0.017$ and $0.01\pm0.023$ in the CHEOPS and TESS passbands, respectively, suggesting the absence of reflective clouds on the dayside of WASP-12b. The CHEOPS occultations do not show strong evidence for variability in the dayside atmosphere of the planet. Finally, we refine the orbital decay rate by 12% to a value of -30.23$\pm$0.82 ms/yr.
WASP-12b becomes the second exoplanet, after WASP-103b, for which the Love number has been measured (at 3$σ$) from the effect of tidal deformation in the light curve. However, constraining the core mass fraction of the planet requires measuring $h_2$ with a higher precision. This can be achieved with high signal-to-noise observations with JWST since the phase curve amplitude, and consequently the induced tidal deformation effect, is higher in the infrared.
Submitted 20 February, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
LoVoCCS -- II. Weak Lensing Mass Distributions, Red-Sequence Galaxy Distributions, and Their Alignment with the Brightest Cluster Galaxy in 58 Nearby X-ray-Luminous Galaxy Clusters
Authors:
Shenming Fu,
Ian Dell'Antonio,
Zacharias Escalante,
Jessica Nelson,
Anthony Englert,
Søren Helhoski,
Rahul Shinde,
Julia Brockland,
Philip LaDuca,
Christelyn Larkin,
Lucca Paris,
Shane Weiner,
William K. Black,
Ranga-Ram Chary,
Douglas Clowe,
M. C. Cooper,
Megan Donahue,
August Evrard,
Mark Lacy,
Tod Lauer,
Binyang Liu,
Jacqueline McCleary,
Massimo Meneghetti,
Hironao Miyatake,
Mireia Montes
, et al. (9 additional authors not shown)
Abstract:
The Local Volume Complete Cluster Survey (LoVoCCS) is an on-going program to observe nearly a hundred low-redshift X-ray-luminous galaxy clusters (redshifts $0.03<z<0.12$ and X-ray luminosities in the 0.1-2.4 keV band $L_{X500c}>10^{44}$ erg/s) with the Dark Energy Camera (DECam), capturing data in $u,g,r,i,z$ bands with a $5σ$ point source depth of approximately 25-26th AB magnitudes. Here, we map the aperture masses in 58 galaxy cluster fields using weak gravitational lensing. These clusters span a variety of dynamical states, from nearly relaxed to merging systems, and approximately half of them have not been subject to detailed weak lensing analysis before. In each cluster field, we analyze the alignment between the 2D mass distribution described by the aperture mass map, the 2D red-sequence (RS) galaxy distribution, and the brightest cluster galaxy (BCG). We find that the orientations of the BCG and the RS distribution are strongly aligned throughout the interiors of the clusters: the median misalignment angle is 19 deg within 2 Mpc. We also observe the alignment between the orientations of the RS distribution and the overall cluster mass distribution (by a median difference of 32 deg within 1 Mpc), although this is constrained by galaxy shape noise and the limitations of our cluster sample size. These types of alignment suggest long-term dynamical evolution within the clusters over cosmic timescales.
Submitted 15 February, 2024;
originally announced February 2024.
-
Fine-Tuned Language Models Generate Stable Inorganic Materials as Text
Authors:
Nate Gruver,
Anuroop Sriram,
Andrea Madotto,
Andrew Gordon Wilson,
C. Lawrence Zitnick,
Zachary Ulissi
Abstract:
We propose fine-tuning large language models for generation of stable materials. While unorthodox, fine-tuning large language models on text-encoded atomistic data is simple to implement yet reliable, with around 90% of sampled structures obeying physical constraints on atom positions and charges. Using energy above hull calculations from both learned ML potentials and gold-standard DFT calculations, we show that our strongest model (fine-tuned LLaMA-2 70B) can generate materials predicted to be metastable at about twice the rate (49% vs 28%) of CDVAE, a competing diffusion model. Because of text prompting's inherent flexibility, our models can simultaneously be used for unconditional generation of stable material, infilling of partial structures and text-conditional generation. Finally, we show that language models' ability to capture key symmetries of crystal structures improves with model scale, suggesting that the biases of pretrained LLMs are surprisingly well-suited for atomistic data.
Submitted 6 February, 2024;
originally announced February 2024.
-
Direct cross-section measurement of the weak r-process 88Sr(α,n)91Zr reaction in ν-driven winds of core collapse supernovae
Authors:
C. Fougères,
M. L. Avila,
H. Jayatissa,
D. Santiago-Gonzalez,
K. Brandenburg,
Z. Meisel,
P. Mohr,
F. Montes,
C. Műller-Gatermann,
D. Neto,
W. -J. Ong,
J. Pereira,
K. E. Rehm,
T. L. Tang,
I. A. Tolstukhin,
L. Varriano,
G. Wilson,
J. Wu
Abstract:
About half of the heavy elements beyond iron are known to be produced by the rapid neutron capture process, known as the r-process. However, the astrophysical site producing the r-process is still uncertain. Chemical abundances observed in several cosmic sites indicate that different mechanisms should be at play. For instance, the abundances around silver measured in a subset of metal-poor stars indicate the presence of a weak r-process. This process may be active in neutrino-driven winds of core collapse supernovae where ($α$,n) reactions dominate the synthesis of Z ~ 40 elements in the expelled materials. Scarcely measured, the rates of ($α$,n) reactions are determined from statistical Hauser-Feshbach calculations with $α$-optical-model potentials, which are still poorly constrained. The uncertainties of the ($α$,n) reaction rates therefore make a significant contribution to the uncertainties of the abundances determined from stellar modeling. In this work, the $^{88}$Sr($α$,n)$^{91}$Zr reaction, which impacts the weak r-process abundances, has been probed at astrophysical energies for the first time, with the total cross sections measured directly at 8.37 - 13.09 MeV in the center of mass (3.8 - 7.5 GK). Two measurements were performed at ATLAS with the electrically-segmented ionization chamber MUSIC, in inverse kinematics, while following the active target technique. The cross sections of this $α$-induced reaction on $^{88}$Sr, located at the shell closure N = 50, have been found to be lower than expected, by a factor of 3, despite recent statistical calculations validated by measurements on neighboring nuclei. This result encourages more experimental investigations of ($α$,n) reactions, at N = 50 and towards the neutron-rich side, to further test the predictive power and reliability of such calculations.
Submitted 2 February, 2024;
originally announced February 2024.
-
The first quenched galaxies, when and how?
Authors:
Lizhi Xie,
Gabriella De Lucia,
Fabio Fontanot,
Michaela Hirschmann,
Yannick M Bahé,
Michael L. Balogh,
Adam Muzzin,
Benedetta Vulcani,
Devontae C. Baxter,
Ben Forrest,
Gillian Wilson,
Gregory H. Rudnick,
M. C. Cooper,
Umberto Rescigno
Abstract:
Many quiescent galaxies discovered in the early Universe by \textit{JWST} raise fundamental questions on when and how these galaxies became and stayed quenched. Making use of the latest version of the semi-analytic model GAEA that provides good agreement with the observed quenched fractions up to $z\sim 3$, we make predictions for the expected fractions of quiescent galaxies up to $z\sim 7$ and analyze the main quenching mechanism. We find that in a simulated box of $685~{\rm Mpc}$ on a side, the first quenched massive ($M_{\star} \sim 10^{11} {\rm M}_{\odot}$), Milky Way mass, and low mass ($M_{\star} \sim 10^{9.5} {\rm M}_{\odot}$ ) galaxies appear at $z\sim 4.5$, $z\sim 6.2$, and before $z = 7$. Most quenched galaxies identified at early redshifts remain quenched for more than 1 Gyr. Independently of galaxy stellar mass, the dominant quenching mechanism at high redshift is accretion disk feedback (quasar winds) from a central massive black hole, which is triggered by mergers in massive and MW-mass galaxies, and by disk instabilities in low-mass galaxies. Environmental stripping becomes increasingly important at lower redshift.
Submitted 1 April, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI
Authors:
Theodore Papamarkou,
Maria Skoularidou,
Konstantina Palla,
Laurence Aitchison,
Julyan Arbel,
David Dunson,
Maurizio Filippone,
Vincent Fortuin,
Philipp Hennig,
José Miguel Hernández-Lobato,
Aliaksandr Hubin,
Alexander Immer,
Theofanis Karaletsos,
Mohammad Emtiyaz Khan,
Agustinus Kristiadi,
Yingzhen Li,
Stephan Mandt,
Christopher Nemeth,
Michael A. Osborne,
Tim G. J. Rudner,
David Rügamer,
Yee Whye Teh,
Max Welling,
Andrew Gordon Wilson,
Ruqi Zhang
Abstract:
In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learning (BDL) constitutes a promising avenue, offering advantages across these diverse settings. This paper posits that BDL can elevate the capabilities of deep learning. It revisits the strengths of BDL, acknowledges existing challenges, and highlights some exciting research avenues aimed at addressing these obstacles. Looking ahead, the discussion focuses on possible ways to combine large-scale foundation models with BDL to unlock their full potential.
Submitted 2 June, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
Discovery of two warm mini-Neptunes with contrasting densities orbiting the young K3V star TOI-815
Authors:
Angelica Psaridi,
Hugh Osborn,
François Bouchy,
Monika Lendl,
Léna Parc,
Nicolas Billot,
Christopher Broeg,
Sérgio G. Sousa,
Vardan Adibekyan,
Omar Attia,
Andrea Bonfanti,
Hritam Chakraborty,
Karen A. Collins,
Jeanne Davoult,
Elisa Delgado-Mena,
Nolan Grieves,
Tristan Guillot,
Alexis Heitzmann,
Ravit Helled,
Coel Hellier,
Jon M. Jenkins,
Henrik Knierim,
Andreas Krenn,
Jack J. Lissauer,
Rafael Luque
, et al. (108 additional authors not shown)
Abstract:
We present the discovery and characterization of two warm mini-Neptunes transiting the K3V star TOI-815 in a K-M binary system. Analysis of the spectra and rotation period reveals it to be a young star with an age of $200^{+400}_{-200}$ Myr. TOI-815b has an 11.2-day period and a radius of 2.94$\pm$0.05 $\it{R_{\rm\mathrm{\oplus}}}$ with transits observed by TESS, CHEOPS, ASTEP, and LCOGT. The outer planet, TOI-815c, has a radius of 2.62$\pm$0.10 $\it{R_{\rm\mathrm{\oplus}}}$, based on observations of three non-consecutive transits with TESS, while targeted CHEOPS photometry and radial velocity follow-up with ESPRESSO were required to confirm the 35-day period. ESPRESSO confirmed the planetary nature of both planets and measured masses of 7.6$\pm$1.5 $\it{M_{\rm \mathrm{\oplus}}}$ ($ρ_\mathrm{P}$=1.64$^{+0.33}_{-0.31}$ g cm$^{-3}$) and 23.5$\pm$2.4 $\it{M_{\rm\mathrm{\oplus}}}$ ($ρ_\mathrm{P}$=7.2$^{+1.1}_{-1.0}$ g cm$^{-3}$) respectively. Thus, the planets have very different masses, unlike the usual similarity of masses in compact multi-planet systems. Moreover, our statistical analysis of mini-Neptunes orbiting FGK stars suggests that weakly irradiated planets tend to have higher bulk densities compared to those suffering strong irradiation. This could be ascribed to their cooler atmospheres, which are more compressed and denser. Internal structure modeling of TOI-815b suggests it likely has a H-He atmosphere constituting a few percent of the total planet mass, or higher if the planet is assumed to have no water. In contrast, the measured mass and radius of TOI-815c can be explained without invoking any atmosphere, challenging planetary formation theories. Finally, we infer from our measurements that the star is viewed close to pole-on, which implies a spin-orbit misalignment at the 3$σ$ level.
Submitted 30 January, 2024; v1 submitted 28 January, 2024;
originally announced January 2024.
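The bulk densities quoted in the TOI-815 abstract above can be cross-checked from the quoted masses and radii, since density scales as mass over radius cubed. The sketch below is only a point-estimate check under stated assumptions: the Earth mean density of 5.51 g cm$^{-3}$ is a rounded reference value, the function name `bulk_density` is hypothetical, and the published values are posterior medians that need not match a calculation from central values exactly.

```python
EARTH_DENSITY = 5.51  # g/cm^3, rounded mean density of Earth (assumed reference value)

def bulk_density(mass_earth, radius_earth):
    """Bulk density in g/cm^3 for a planet given in Earth masses and Earth radii."""
    return EARTH_DENSITY * mass_earth / radius_earth ** 3

# Central values quoted in the abstract.
rho_b = bulk_density(7.6, 2.94)   # TOI-815b
rho_c = bulk_density(23.5, 2.62)  # TOI-815c
print(f"TOI-815b: {rho_b:.2f} g/cm^3, TOI-815c: {rho_c:.2f} g/cm^3")
```

Both results land within a few hundredths of a g cm$^{-3}$ of the quoted 1.64 and 7.2, which illustrates the contrast the title refers to: the outer planet is smaller in radius but roughly three times more massive.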
-
Focus topics for the ECFA study on Higgs / Top / EW factories
Authors:
Jorge de Blas,
Patrick Koppenburg,
Jenny List,
Fabio Maltoni,
Juan Alcaraz Maestre,
Juliette Alimena,
John Alison,
Patrizia Azzi,
Paolo Azzurri,
Emanuele Bagnaschi,
Timothy Barklow,
Matthew J. Basso,
Josh Bendavid,
Martin Beneke,
Eli Ben-Haim,
Mikael Berggren,
Marzia Bordone,
Ivanka Bozovic,
Valentina Cairo,
Nuno Filipe Castro,
Marina Cobal,
Paula Collins,
Mogens Dam,
Valerio Dao,
Matteo Defranchis
, et al. (83 additional authors not shown)
Abstract:
In order to stimulate new engagement and trigger some concrete studies in areas where further work would be beneficial towards fully understanding the physics potential of an $e^+e^-$ Higgs / Top / Electroweak factory, we propose to define a set of focus topics. The general reasoning and the proposed topics are described in this document.
Submitted 18 January, 2024; v1 submitted 15 January, 2024;
originally announced January 2024.
-
$Spitzer$-selected $z > 1.3$ protocluster candidates in the LSST Deep Drilling Fields
Authors:
Harry Gully,
Nina Hatch,
Yannick Bahé,
Michael Balogh,
Micol Bolzonella,
M. C. Cooper,
Adam Muzzin,
Lucia Pozzetti,
Gregory Rudnick,
Benedetta Vulcani,
Gillian Wilson
Abstract:
We have identified 189 candidate $z > 1.3$ protoclusters and clusters in the LSST Deep Drilling Fields. This sample will enable the measurement of the metal enrichment and star formation history of clusters during their early assembly period through the direct measurement of the rate of supernovae identified through the LSST. The protocluster sample was selected from galaxy overdensities in a $Spitzer$/IRAC colour-selected sample using criteria that were optimised for protocluster purity using a realistic lightcone. Our tests reveal that $60-80\%$ of the identified candidates are likely to be genuine protoclusters or clusters, which is corroborated by a $\sim4σ$ stacked X-ray signal from these structures. We provide photometric redshift estimates for 47 candidates which exhibit strong peaks in the photo-$z$ distribution of their candidate members. However, the lack of a photo-$z$ peak does not mean a candidate is not genuine, since we find a stacked X-ray signal of similar significance from both the candidates that exhibit photo-$z$ peaks and those that do not. Tests on the lightcone reveal that our pursuit of a pure sample of protoclusters results in that sample being highly incomplete ($\sim4\%$) and heavily biased towards larger, richer, more massive, and more centrally concentrated protoclusters than the total protocluster population. Most ($\sim75\%$) of the selected protoclusters are likely to have a maximum collapsed halo mass of between $10^{13}-10^{14}$ M$_{\odot}$, with only $\sim25\%$ likely to be collapsed clusters above $10^{14}$ M$_{\odot}$. However, the aforementioned bias ensures our sample is $\sim50\%$ complete for structures that have already collapsed into clusters more massive than $10^{14}$ M$_{\odot}$.
Submitted 10 January, 2024;
originally announced January 2024.
-
Understanding the Detrimental Class-level Effects of Data Augmentation
Authors:
Polina Kirichenko,
Mark Ibrahim,
Randall Balestriero,
Diane Bouchacourt,
Ramakrishna Vedantam,
Hamed Firooz,
Andrew Gordon Wilson
Abstract:
Data augmentation (DA) encodes invariance and provides implicit regularization critical to a model's performance in image classification tasks. However, while DA improves average accuracy, recent studies have shown that its impact can be highly class dependent: achieving optimal average accuracy comes at the cost of significantly hurting individual class accuracy by as much as 20% on ImageNet. There has been little progress in resolving class-level accuracy drops due to a limited understanding of these effects. In this work, we present a framework for understanding how DA interacts with class-level learning dynamics. Using higher-quality multi-label annotations on ImageNet, we systematically categorize the affected classes and find that the majority are inherently ambiguous, co-occur, or involve fine-grained distinctions, while DA controls the model's bias towards one of the closely related classes. While many of the previously reported performance drops are explained by multi-label annotations, our analysis of class confusions reveals other sources of accuracy degradation. We show that simple class-conditional augmentation strategies informed by our framework improve performance on the negatively affected classes.
Submitted 7 December, 2023;
originally announced January 2024.
-
Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution
Authors:
Ying Wang,
Tim G. J. Rudner,
Andrew Gordon Wilson
Abstract:
Vision-language pretrained models have seen remarkable success, but their application to safety-critical settings is limited by their lack of interpretability. To improve the interpretability of vision-language models such as CLIP, we propose a multi-modal information bottleneck (M2IB) approach that learns latent representations that compress irrelevant information while preserving relevant visual and textual features. We demonstrate how M2IB can be applied to attribution analysis of vision-language pretrained models, increasing attribution accuracy and improving the interpretability of such models when applied to safety-critical domains such as healthcare. Crucially, unlike commonly used unimodal attribution methods, M2IB does not require ground truth labels, making it possible to audit representations of vision-language pretrained models when multiple modalities but no ground-truth data is available. Using CLIP as an example, we demonstrate the effectiveness of M2IB attribution and show that it outperforms gradient-based, perturbation-based, and attention-based attribution methods both qualitatively and quantitatively.
Submitted 22 June, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
-
Non-Vacuous Generalization Bounds for Large Language Models
Authors:
Sanae Lotfi,
Marc Finzi,
Yilun Kuang,
Tim G. J. Rudner,
Micah Goldblum,
Andrew Gordon Wilson
Abstract:
Modern language models can contain billions of parameters, raising the question of whether they can generalize beyond the training data or simply regurgitate their training corpora. We provide the first non-vacuous generalization bounds for pretrained large language models (LLMs), indicating that language models are capable of discovering regularities that generalize to unseen data. In particular, we derive a compression bound that is valid for the unbounded log-likelihood loss using prediction smoothing, and we extend the bound to handle subsampling, accelerating bound computation on massive datasets. To achieve the extreme level of compression required for non-vacuous generalization bounds, we devise SubLoRA, a low-dimensional non-linear parameterization. Using this approach, we find that larger models have better generalization bounds and are more compressible than smaller models.
Submitted 12 February, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
-
Function-Space Regularization in Neural Networks: A Probabilistic Perspective
Authors:
Tim G. J. Rudner,
Sanyam Kapoor,
Shikai Qiu,
Andrew Gordon Wilson
Abstract:
Parameter-space regularization in neural network optimization is a fundamental tool for improving generalization. However, standard parameter-space regularization methods make it challenging to encode explicit preferences about desired predictive functions into neural network training. In this work, we approach regularization in neural networks from a probabilistic perspective and show that by viewing parameter-space regularization as specifying an empirical prior distribution over the model parameters, we can derive a probabilistically well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training. This method -- which we refer to as function-space empirical Bayes (FSEB) -- includes both parameter- and function-space regularization, is mathematically simple, easy to implement, and incurs only minimal computational overhead compared to standard regularization techniques. We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection, highly-calibrated predictive uncertainty estimates, successful task adaption from pre-trained models, and improved generalization under covariate shift.
Submitted 28 December, 2023;
originally announced December 2023.
-
The Correlation Function and Detection of Baryon Acoustic Oscillation Peak from the Spectroscopic SDSS GalWCat Galaxy Cluster Catalogue
Authors:
Mohamed H. Abdullah,
Anatoly Klypin,
Francisco Prada,
Gillian Wilson,
Tomoaki Ishiyama,
Julia Ereza
Abstract:
We measure the two-point correlation function (CF) of 1357 galaxy clusters with a mass of $\log_{10}{M_{200}}\geq 13.6$ $h^{-1}\,M_{\odot}$ and at a redshift of $z \leq 0.125$. This work differs from previous analyses in that it utilizes a spectroscopic cluster catalogue, $\mathtt{SDSS-GalWCat}$, to measure the CF and detect the baryon acoustic oscillation (BAO) signal. Unlike previous studies which use statistical techniques, we compute covariance errors directly by generating a set of 1086 galaxy cluster lightcones from the GLAM $N$-body simulation. Fitting the CF with a power-law model of the form $ξ(s) = (s/s_0)^{-γ}$, we determine the best-fit correlation length and power-law index at three mass thresholds. We find that the correlation length increases with the mass threshold while the power-law index is almost constant. For $\log_{10}{M_{200}}\geq 13.6$ $h^{-1}\,M_{\odot}$, we find $s_0 = 14.54\pm0.87$ $h^{-1}$ Mpc and $γ=1.97\pm0.11$. We detect the BAO signal at $s = 100$ $h^{-1}$ Mpc with a significance of $1.60 σ$. Fitting the CF with a $Λ$CDM model, we find $D_\mathrm{V}(z = 0.089)\mathrm{r}^{fid}_d/\mathrm{r}_d = 267.62 \pm 26$ $h^{-1}$ Mpc, consistent with Planck 2015 cosmology. We present a set of 108 high-fidelity simulated galaxy cluster lightcones from the high-resolution Uchuu $N$-body simulation, employed for methodological validation. We find $D_\mathrm{V}(z = 0.089)/r_d = 2.666 \pm 0.129$, indicating that our method does not introduce any bias in the parameter estimation for this small sample of galaxy clusters.
△ Less
Submitted 20 December, 2023;
originally announced December 2023.
-
The stellar mass function of quiescent galaxies in 2 < z < 2.5 protoclusters
Authors:
Adit H. Edward,
Michael L. Balogh,
Yannick M. Bahe,
Michael C. Cooper,
Nina A. Hatch,
Justin Marchioni,
Adam Muzzin,
Allison Noble,
Gregory H. Rednick,
Benedetta Vulcani,
Gillian Wilson,
Gabriella De Lucia,
Ricardo Demarco,
Ben Forrest,
Michaela Hirschmann,
Gianluca Castignani,
Pierluigi Cerulo,
Rose A. Finn,
Guillaume Hewitt,
Pascale Jablonka,
Yadayuki Kodama,
Sophie Maurogordato,
Julie Nantais,
Lizhi Xie
Abstract:
We present an analysis of the galaxy stellar mass function (SMF) of 14 known protoclusters between $2.0 < z < 2.5$ in the COSMOS field, down to a mass limit of $10^{9.5}$ M$_{\odot}$. We use existing photometric redshifts with a statistical background subtraction, and consider star-forming and quiescent galaxies identified from $(NUV - r)$ and $(r - J)$ colours separately. Our fiducial sample includes galaxies within 1 Mpc of the cluster centres. The shape of the protocluster SMF of star-forming galaxies is indistinguishable from that of the general field at this redshift. Quiescent galaxies, however, show a flatter SMF than in the field, with an upturn at low mass, though this is only significant at $\sim 2σ$. There is no strong evidence for a dominant population of quiescent galaxies at any mass, with a fraction of $< 15\%$ at $1σ$ confidence for galaxies with log$M_{\ast}/M_{\odot} < 10.5$. We compare our results with a sample of galaxy groups at $1 < z < 1.5$, and demonstrate that a significant amount of environmental quenching must take place between these epochs, increasing the relative abundance of high-mass ($\rm M > 10^{10.5} M_{\odot}$) quiescent galaxies by a factor of $\gtrsim$ 2. However, we find that at lower masses ($\rm M < 10^{10.5} M_{\odot}$), no additional environmental quenching is required.
Submitted 19 December, 2023;
originally announced December 2023.