-
Weak-Lensing Characterization of the Dark Matter in 29 Merging Clusters that Exhibit Radio Relics
Authors:
Kyle Finner,
M. James Jee,
Hyejeon Cho,
Kim Hyeonghan,
Wonki Lee,
Reinout J. van Weeren,
David Wittman,
Mijin Yoon
Abstract:
We present a multiwavelength analysis of 29 merging galaxy clusters that exhibit radio relics. For each merging system, we perform a weak-lensing analysis on Subaru optical imaging. We generate high-resolution mass maps of the dark matter distributions, which are critical for discerning the merging constituents. Combining the weak-lensing detections with X-ray emission, radio emission, and galaxy redshifts, we discuss the formation of radio relics from the past collision. For each subcluster, we obtain mass estimates by fitting a multi-component NFW model with and without a concentration-mass relation. Comparing the two mass estimate techniques, we find that the concentration-mass relation underestimates (overestimates) the mass relative to fitting both parameters for high- (low-) mass subclusters. We compare the mass estimates of each subcluster to their velocity dispersion measurements and find that they preferentially lie below the expected velocity dispersion scaling relation, especially at the low-mass end (~$10^{14}\ M_\odot$). We show that the majority of the clusters that exhibit radio relics are in major mergers with a mass ratio below 1:4. We investigate the position of the mass peak relative to the galaxy luminosity peak, number density peak, and BCG locations and find that the BCG tends to better trace the mass peak position. Finally, we update a golden sample of 8 galaxy clusters that have the simplest geometries and can provide the cleanest picture of the past merger, which we recommend for further investigation to constrain the nature of dark matter and the acceleration process that leads to radio relics.
Submitted 2 July, 2024;
originally announced July 2024.
-
Investigating the Segment Anything Foundation Model for Mapping Smallholder Agriculture Field Boundaries Without Training Labels
Authors:
Pratyush Tripathy,
Kathy Baylis,
Kyle Wu,
Jyles Watson,
Ruizhe Jiang
Abstract:
Accurate mapping of agricultural field boundaries is crucial for enhancing outcomes like precision agriculture, crop monitoring, and yield estimation. However, extracting these boundaries from satellite images is challenging, especially for smallholder farms and data-scarce environments. This study explores the Segment Anything Model (SAM) to delineate agricultural field boundaries in Bihar, India, using 2-meter resolution SkySat imagery without additional training. We evaluate SAM's performance across three model checkpoints, various input sizes, multi-date satellite images, and edge-enhanced imagery. Our results show that SAM correctly identifies about 58% of field boundaries, comparable to other approaches requiring extensive training data. Using different input image sizes improves accuracy, with the most significant improvement observed when using multi-date satellite images. This work establishes proof of concept for using SAM and maximizing its potential in agricultural field boundary mapping. Our work highlights SAM's potential in delineating agricultural field boundaries in training-data-scarce settings to enable a wide range of agriculture-related analyses.
Submitted 1 July, 2024;
originally announced July 2024.
-
Object Proxy Patterns for Accelerating Distributed Applications
Authors:
J. Gregory Pauloski,
Valerie Hayot-Sasson,
Logan Ward,
Alexander Brace,
André Bauer,
Kyle Chard,
Ian Foster
Abstract:
Workflow and serverless frameworks have empowered new approaches to distributed application design by abstracting compute resources. However, their typically limited or one-size-fits-all support for advanced data flow patterns leaves optimization to the application programmer -- optimization that becomes more difficult as data become larger. The transparent object proxy, which provides wide-area references that can resolve to data regardless of location, has been demonstrated as an effective low-level building block in such situations. Here we propose three high-level proxy-based programming patterns -- distributed futures, streaming, and ownership -- that make the power of the proxy pattern usable for more complex and dynamic distributed program structures. We motivate these patterns via careful review of application requirements and describe implementations of each pattern. We evaluate our implementations through a suite of benchmarks and by applying them in three substantial scientific applications, in which we demonstrate substantial improvements in runtime, throughput, and memory usage.
Submitted 1 July, 2024;
originally announced July 2024.
-
How We Built Cedar: A Verification-Guided Approach
Authors:
Craig Disselkoen,
Aaron Eline,
Shaobo He,
Kyle Headley,
Michael Hicks,
Kesha Hietala,
John Kastner,
Anwar Mamat,
Matt McCutchen,
Neha Rungta,
Bhakti Shah,
Emina Torlak,
Andrew Wells
Abstract:
This paper presents verification-guided development (VGD), a software engineering process we used to build Cedar, a new policy language for expressive, fast, safe, and analyzable authorization. Developing a system with VGD involves writing an executable model of the system and mechanically proving properties about the model; writing production code for the system and using differential random testing (DRT) to check that the production code matches the model; and using property-based testing (PBT) to check properties of unmodeled parts of the production code. Using VGD for Cedar, we can build fast, idiomatic production code, prove our model correct, and find and fix subtle implementation bugs that evade code reviews and unit testing. While carrying out proofs, we found and fixed 4 bugs in Cedar's policy validator, and DRT and PBT helped us find and fix 21 additional bugs in various parts of Cedar.
Submitted 1 July, 2024;
originally announced July 2024.
-
Unifying thermophotovoltaic performance metrics with technoeconomics
Authors:
Shomik Verma,
Kyle Buznitsky,
Asegun Henry
Abstract:
Thermophotovoltaics (TPV) are becoming a promising new heat engine with rapid recent gains in performance. Their performance is characterized by two metrics: efficiency and power density. As we bridge the gap between lab-scale and system-scale devices, we need to understand how each of these metrics impacts the technoeconomics of a TPV system. In this work, we develop a technoeconomic metric based on the levelized cost of electricity (LCOE) to understand how the metrics should be weighted relative to each other in terms of importance. We find that systems with high infrastructure and fuel costs should prioritize TPV efficiency, while systems where the TPV cell cost dominates should prioritize power density. We then evaluate how concrete cell improvements could improve the technoeconomics of five example systems, identifying the most impactful specific properties. Namely, improving spectral control with sub-bandgap reflectance is the most effective at reducing LCOE in systems with high infrastructure cost, while increasing view factor and reducing series resistance are most critical in systems with high TPV cell cost. Improving just 1-2 of these properties can reduce the LCOE by 30-50%. This study therefore helps researchers understand which performance metric is more important for their application and how to achieve high values of this performance metric.
Submitted 30 June, 2024;
originally announced July 2024.
-
Imaging of single barium atoms in a second matrix site in solid xenon for barium tagging in a $^{136}$Xe double beta decay experiment
Authors:
M. Yvaine,
D. Fairbank,
J. Soderstrom,
C. Taylor,
J. Stanley,
T. Walton,
C. Chambers,
A. Iverson,
W. Fairbank,
S. Al Kharusi,
A. Amy,
E. Angelico,
A. Anker,
I. J. Arnquist,
A. Atencio,
J. Bane,
V. Belov,
E. P. Bernard,
T. Bhatta,
A. Bolotnikov,
J. Breslin,
P. A. Breur,
J. P. Brodsky,
E. Brown,
T. Brunner,
et al. (112 additional authors not shown)
Abstract:
Neutrinoless double beta decay is one of the most sensitive probes for new physics beyond the Standard Model of particle physics. One of the isotopes under investigation is $^{136}$Xe, which would double beta decay into $^{136}$Ba. Detecting the single $^{136}$Ba daughter provides a sort of ultimate tool in the discrimination against backgrounds. Previous work demonstrated the ability to perform single atom imaging of Ba atoms in a single-vacancy site of a solid xenon matrix. In this paper, the effort to identify signal from individual barium atoms is extended to Ba atoms in a hexa-vacancy site in the matrix and is achieved despite increased photobleaching in this site. Abrupt fluorescence turn-off of a single Ba atom is also observed. Significant recovery of fluorescence signal lost through photobleaching is demonstrated upon annealing of Ba deposits in the Xe ice. Following annealing, it is observed that Ba atoms in the hexa-vacancy site exhibit antibleaching while Ba atoms in the tetra-vacancy site exhibit bleaching. This may be evidence for a matrix site transfer upon laser excitation. Our findings offer a path of continued research toward tagging of Ba daughters in all significant sites in solid xenon.
Submitted 28 June, 2024;
originally announced July 2024.
-
Formation of Wind-Fed Black Hole High-mass X-ray Binaries: The Role of Roche-lobe-Overflow Post Black-Hole Formation
Authors:
Zepei Xing,
Tassos Fragos,
Emmanouil Zapartas,
Tom M. Kwan,
Lixin Dai,
Ilya Mandel,
Matthias U. Kruckow,
Max Briel,
Jeff J. Andrews,
Simone S. Bavera,
Seth Gossage,
Konstantinos Kovlakas,
Kyle A. Rocha,
Meng Sun,
Philipp M. Srivastava
Abstract:
The three dynamically confirmed wind-fed black hole high-mass X-ray binaries (BH-HMXBs) are suggested to all contain a highly spinning black hole (BH). However, based on the theories of efficient angular momentum transport inside the stars, we expect that the first-born BHs in binary systems should have low spins, which is consistent with gravitational-wave observations. As a result, the origin of the high BH spins measured in wind-fed BH-HMXBs remains a mystery. In this paper, we conduct a binary population synthesis study on wind-fed BH-HMXBs at solar metallicity with the use of the newly developed code POSYDON, considering three scenarios for BH accretion: Eddington-limited, moderately super-Eddington, and fully conservative accretion. Taking into account the conditions for accretion-disk formation, we find that regardless of the accretion model, these systems are more likely to have already experienced a phase of Roche-lobe overflow after the BH formation. To account for the extreme BH spins, highly conservative accretion onto BHs is required, when assuming the accreted material carries the specific angular momentum at the innermost stable orbit. Besides, in our simulations we found that the systems with donor stars within the mass range of $10-20\,M_{\odot}$ are prevalent, posing a challenge in explaining simultaneously all observed properties of the BH-HMXB in our Galaxy, Cygnus X-1, and potentially hinting that the accretion efficiency onto non-degenerate stars, before the formation of the BH, is also more conservative than assumed in our simulations.
Submitted 28 June, 2024;
originally announced July 2024.
-
Understanding and Modeling the Dynamics of Storm-time Atmospheric Neutral Density using Random Forests
Authors:
Kyle R. Murphy,
Alexa J. Halford,
Vivian Liu,
Jeffery Klenzing,
Jonathon Smith,
Katherine Garcia-Sage,
Joshua Pettit,
I. Jonathan Rae
Abstract:
Atmospheric neutral density is a crucial component to accurately predict and track the motion of satellites. During periods of elevated solar and geomagnetic activity, atmospheric neutral density becomes highly variable and dynamic. This variability and enhanced dynamics make it difficult to accurately model neutral density, leading to increased errors which propagate from neutral density models through to orbit propagation models. In this paper we investigate the dynamics of neutral density during geomagnetic storms. We use a combination of solar and geomagnetic variables to develop three Random Forest machine learning models of neutral density. These models are based on (1) slow solar indices, (2) high-cadence solar irradiance, and (3) combined high-cadence solar irradiance and geomagnetic indices. Each model is validated using an out-of-sample dataset using analysis of residuals and typical metrics. During quiet times, all three models perform well; however, during geomagnetic storms, the combined high-cadence solar irradiance/geomagnetic model performs significantly better than the models based solely on solar activity. The combined model captures an additional 10% of the variability in density and has an error up to six times smaller during geomagnetic storms than the solar models. Overall, this work demonstrates the importance of including geomagnetic activity in the modeling of atmospheric density and serves as a proof of concept for using machine learning algorithms to model, and in the future forecast, atmospheric density for operational use.
Submitted 28 June, 2024;
originally announced July 2024.
-
The space coronagraph optical bench (SCoOB): 3. Mueller matrix polarimetry of a coronagraphic exit pupil
Authors:
Jaren N. Ashcraft,
Ewan S. Douglas,
Ramya M. Anche,
Kyle Van Gorkom,
Emory Jenkins,
William Melby,
Maxwell A. Millar-Blanchaer
Abstract:
High-contrast imaging in the next decade aims to image exoplanets at smaller angular separations and deeper contrasts than ever before. A problem that has recently garnered attention for telescopes equipped with high-contrast coronagraphs is polarization aberration arising from the optics. These aberrations manifest as low-order aberrations of different magnitudes for orthogonal polarization states and spread light into the dark hole of the coronagraph that cannot be fully corrected. The origin of polarization aberrations has been modeled at the telescope level. However, we don't fully understand how polarization aberrations arise at the instrument level. To directly measure this effect, we construct a dual-rotating-retarder polarimeter around the SCoOB high-contrast imaging testbed to measure its Mueller matrix. With this matrix, we directly characterize the diattenuation, retardance, and depolarization of the instrument as a function of position in the exit pupil. We measure the polarization aberrations in the Lyot plane to understand how polarization couples into high-contrast imaging residuals.
Submitted 27 June, 2024;
originally announced June 2024.
-
Binary neutron star mergers using a discontinuous Galerkin-finite difference hybrid method
Authors:
Nils Deppe,
Francois Foucart,
Marceline S. Bonilla,
Michael Boyle,
Nicholas J. Corso,
Matthew D. Duez,
Matthew Giesler,
François Hébert,
Lawrence E. Kidder,
Yoonsoo Kim,
Prayush Kumar,
Isaac Legred,
Geoffrey Lovelace,
Elias R. Most,
Jordan Moxon,
Kyle C. Nelli,
Harald P. Pfeiffer,
Mark A. Scheel,
Saul A. Teukolsky,
William Throwe,
Nils L. Vu
Abstract:
We present a discontinuous Galerkin-finite difference hybrid scheme that allows high-order shock capturing with the discontinuous Galerkin method for general relativistic magnetohydrodynamics in dynamical spacetimes. We present several optimizations and stability improvements to our algorithm that allow the hybrid method to successfully simulate single, rotating, and binary neutron stars. The hybrid method achieves the efficiency of discontinuous Galerkin methods throughout almost the entire spacetime during the inspiral phase, while being able to robustly capture shocks and resolve the stellar surfaces. We also use Cauchy-Characteristic evolution to compute the first gravitational waveforms at future null infinity from binary neutron star mergers. The simulations presented here are the first successful binary neutron star inspiral and merger simulations using discontinuous Galerkin methods.
Submitted 27 June, 2024;
originally announced June 2024.
-
Black Silicon BRDF and Polarization for Coronagraphic Pupil Masks
Authors:
Emory L. Jenkins,
Ramya M. Anche,
Kyle J. Van Gorkom,
A. J. Eldorado Riggs,
Ewan S. Douglas
Abstract:
Future space observatories will likely have segmented primaries, causing diffraction effects that reduce coronagraph performance. Reflective binary pupil apodizer masks can mitigate these, with the metamaterial black silicon (BSi) showing promise as a strong absorber. To bring contrast ratios to the $10^{-10}$ level as needed to observe Earth-like exoplanets, feature sizes on these BSi masks will need to be less than $5$ microns when paired with MEMS (micro-electromechanical systems) deformable mirrors. As scalar diffraction cannot reliably model this feature size, we developed a Finite-Difference Time-Domain (FDTD) model of BSi masks using Meep software. We characterize the FDTD-derived polarization-dependent bidirectional reflectance distribution function of BSi and discuss the model's shortcomings.
Submitted 27 June, 2024;
originally announced June 2024.
-
The Space Coronagraph Optical Bench (SCoOB): 5. End-to-end simulations of polarization aberrations
Authors:
Ramya M Anche,
Kyle J. Van Gorkom,
Jaren N. Ashcraft,
Ewan Douglas,
Emory L Jenkins,
Sebastiaan Y. Haffert,
Maxwell A. Millar-Blanchaer
Abstract:
Polarization aberrations originating from the telescope and high-contrast imaging instrument optics introduce polarization-dependent speckles and associated errors in the image plane, affecting the measured exoplanet signal. Understanding this effect is critical for future space-based high-contrast imaging instruments that aim to image the Earth analogs with 1e-10 raw contrast and characterize their atmospheres. We present end-to-end modeling of the polarization aberrations for a high-contrast imaging testbed, SCoOB. We use a vector vortex coronagraph (VVC) as the focal plane mask, incorporate polarization filtering, and estimate the peak contrast in the dark hole region. The dominant polarization aberrations in the system are retardance defocus and tilt due to the OAPs and fold mirrors. Although the mean contrast in the dark hole region remains unaffected by the polarization aberrations, we see brighter speckles limiting the contrast to 1e-9 at smaller inner working angles. We extend the simulations using the measured retardance maps for the VVC. We find that the mean contrast in SCoOB is more sensitive to the VVC and the QWP retardance errors than the polarization aberrations.
Submitted 27 June, 2024;
originally announced June 2024.
-
The space coronagraph optical bench (SCoOB): 4. vacuum performance of a high contrast imaging testbed
Authors:
Kyle Van Gorkom,
Ewan S Douglas,
Kian Milani,
Jaren N Ashcraft,
Ramya M Anche,
Emory Jenkins,
Patrick Ingraham,
Sebastiaan Haffert,
Daewook Kim,
Heejoo Choi,
Olivier Durney
Abstract:
The Space Coronagraph Optical Bench (SCoOB) is a high-contrast imaging testbed built to demonstrate starlight suppression techniques at visible wavelengths in a space-like vacuum environment. The testbed is designed to achieve ${<}10^{-8}$ contrast from $3-10λ/D$ in a one-sided dark hole using a liquid crystal vector vortex waveplate and a 952-actuator Kilo-C deformable mirror (DM) from Boston Micromachines (BMC). We have recently expanded the testbed to include a field stop for mitigation of stray/scattered light, a precision-fabricated pinhole in the source simulator, a Minus K passive vibration isolation table for jitter reduction, and a low-noise vacuum-compatible CMOS sensor. We report the latest contrast performance achieved using implicit electric field conjugation (iEFC) at a vacuum of ${\sim}10^{-6}$ Torr and over a range of bandpasses with central wavelengths from 500 to 650 nm and bandwidths (BW) from $\ll 1\%$ to 15%. Our jitter in vacuum is $<3\times10^{-3} λ/D$, and the best contrast performance to date in a half-sided D-shaped dark hole is $2.2\times10^{-9}$ in a $\ll 1\%$ BW, $4\times10^{-9}$ in a 2% BW, and $2.5\times10^{-8}$ in a 15% BW.
Submitted 27 June, 2024;
originally announced June 2024.
-
Foundational Models for Pathology and Endoscopy Images: Application for Gastric Inflammation
Authors:
Hamideh Kerdegari,
Kyle Higgins,
Dennis Veselkov,
Ivan Laponogov,
Inese Polaka,
Miguel Coimbra,
Junior Andrea Pescino,
Marcis Leja,
Mario Dinis-Ribeiro,
Tania Fleitas Kanonnikoff,
Kirill Veselkov
Abstract:
The integration of artificial intelligence (AI) in medical diagnostics represents a significant advancement in managing upper gastrointestinal (GI) cancer, a major cause of global cancer mortality. Specifically for gastric cancer (GC), chronic inflammation causes changes in the mucosa such as atrophy, intestinal metaplasia (IM), dysplasia and ultimately cancer. Early detection through regular endoscopic surveillance is essential for better outcomes. Foundation models (FM), which are machine or deep learning models trained on diverse data and applicable to broad use cases, offer a promising solution to enhance the accuracy of endoscopy and its subsequent pathology image analysis. This review explores the recent advancements, applications, and challenges associated with FM in endoscopy and pathology imaging. We start by elucidating the core principles and architectures underlying these models, including their training methodologies and the pivotal role of large-scale data in developing their predictive capabilities. Moreover, this work discusses emerging trends and future research directions, emphasizing the integration of multimodal data, the development of more robust and equitable models, and the potential for real-time diagnostic support. This review aims to provide a roadmap for researchers and practitioners in navigating the complexities of incorporating FM into clinical practice for prevention/management of GC cases, thereby improving patient outcomes.
Submitted 26 June, 2024;
originally announced June 2024.
-
Do they mean 'us'? Interpreting Referring Expressions in Intergroup Bias
Authors:
Venkata S Govindarajan,
Matianyu Zang,
Kyle Mahowald,
David Beaver,
Junyi Jessy Li
Abstract:
The variations between in-group and out-group speech (intergroup bias) are subtle and could underlie many social phenomena like stereotype perpetuation and implicit bias. In this paper, we model the intergroup bias as a tagging task on English sports comments from forums dedicated to fandom for NFL teams. We curate a unique dataset of over 6 million game-time comments from opposing perspectives (the teams in the game), each comment grounded in a non-linguistic description of the events that precipitated these comments (live win probabilities for each team). Expert and crowd annotations justify modeling the bias through tagging of implicit and explicit referring expressions and reveal the rich, contextual understanding of language and the world required for this task. For large-scale analysis of intergroup variation, we use LLMs for automated tagging, and discover that some LLMs perform best when prompted with linguistic descriptions of the win probability at the time of the comment, rather than numerical probability. Further, large-scale tagging of comments using LLMs uncovers linear variations in the form of referent across win probabilities that distinguish in-group and out-group utterances. Code and data are available at https://github.com/venkatasg/intergroup-nfl .
Submitted 25 June, 2024;
originally announced June 2024.
-
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon
Authors:
USVSN Sai Prashanth,
Alvin Deng,
Kyle O'Brien,
Jyothir S V,
Mohammad Aflah Khan,
Jaydeep Borkar,
Christopher A. Choquette-Choo,
Jacob Ray Fuehne,
Stella Biderman,
Tracy Ke,
Katherine Lee,
Naomi Saphra
Abstract:
Memorization in language models is typically treated as a homogenous phenomenon, neglecting the specifics of the memorized data. We instead model memorization as the effect of a set of complex factors that describe each sample and relate it to the model and corpus. To build intuition around these factors, we break memorization down into a taxonomy: recitation of highly duplicated sequences, reconstruction of inherently predictable sequences, and recollection of sequences that are neither. We demonstrate the usefulness of our taxonomy by using it to construct a predictive model for memorization. By analyzing dependencies and inspecting the weights of the predictive model, we find that different factors influence the likelihood of memorization differently depending on the taxonomic category.
Submitted 25 June, 2024;
originally announced June 2024.
-
GreenFaaS: Maximizing Energy Efficiency of HPC Workloads with FaaS
Authors:
Alok Kamatar,
Valerie Hayot-Sasson,
Yadu Babuji,
Andre Bauer,
Gourav Rattihalli,
Ninad Hogade,
Dejan Milojicic,
Kyle Chard,
Ian Foster
Abstract:
Application energy efficiency can be improved by executing each application component on the compute element that consumes the least energy while also satisfying time constraints. In principle, the function as a service (FaaS) paradigm should simplify such optimizations by abstracting away compute location, but existing FaaS systems do not provide for user transparency over application energy consumption or task placement. Here we present GreenFaaS, a novel open source framework that bridges this gap between energy-efficient applications and FaaS platforms. GreenFaaS can be deployed by end users or providers across systems to monitor energy use, provide task-specific feedback, and schedule tasks in an energy-aware manner. We demonstrate that intelligent placement of tasks can both reduce energy consumption and improve performance. For a synthetic workload, GreenFaaS reduces the energy-delay product by 45% compared to alternatives. Furthermore, running a molecular design application through GreenFaaS can reduce energy consumption by 21% and runtime by 63% by better matching tasks with machines.
Submitted 25 June, 2024;
originally announced June 2024.
-
Hierarchical Framework for Optimizing Wildfire Surveillance and Suppression using Human-Autonomous Teaming
Authors:
Mahdi Al-Husseini,
Kyle Wray,
Mykel Kochenderfer
Abstract:
The integration of manned and unmanned aircraft can help improve wildfire response. Wildfire containment failures occur when resources available to first responders, who execute the initial stages of wildfire management referred to as the initial attack, are ineffective or insufficient. Initial attack surveillance and suppression models have linked action spaces and objectives, making their optimization computationally challenging. The initial attack may be formulated as a multi-agent partially observable Markov decision process (MPOMDP). We divide the initial attack MPOMDP into surveillance and suppression processes with their respective planners operating on different, but constant, time scales. A hierarchical framework iterates between surveillance and suppression planners while also providing collision avoidance. This framework is exemplified by a set of multi-rotor unmanned aircraft surveying an initial attack fire while a manned helicopter suppresses the fire with a water bucket. Wildfire-specific solver extensions are formulated to reduce the otherwise vast action spaces. The hierarchical framework outperforms firefighting techniques and a myopic baseline by up to 242% for moderate wildfires and 60% for rapid wildfires when simulated in abstracted and actual case studies. We also validate the early dispatching of additional suppression assets using regression models to ensure wildfire containment to thresholds established by wildfire agencies.
Submitted 24 June, 2024;
originally announced June 2024.
-
SRViT: Vision Transformers for Estimating Radar Reflectivity from Satellite Observations at Scale
Authors:
Jason Stock,
Kyle Hilburn,
Imme Ebert-Uphoff,
Charles Anderson
Abstract:
We introduce a transformer-based neural network to generate high-resolution (3km) synthetic radar reflectivity fields at scale from geostationary satellite imagery. This work aims to enhance short-term convective-scale forecasts of high-impact weather events and aid in data assimilation for numerical weather prediction over the United States. Compared to convolutional approaches, which have limited receptive fields, our results show improved sharpness and higher accuracy across various composite reflectivity thresholds. Additional case studies over specific atmospheric phenomena support our quantitative findings, while a novel attribution method is introduced to guide domain experts in understanding model outputs.
Submitted 28 June, 2024; v1 submitted 20 June, 2024;
originally announced June 2024.
-
The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources
Authors:
Shayne Longpre,
Stella Biderman,
Alon Albalak,
Hailey Schoelkopf,
Daniel McDuff,
Sayash Kapoor,
Kevin Klyman,
Kyle Lo,
Gabriel Ilharco,
Nay San,
Maribeth Rauh,
Aviya Skowron,
Bertie Vidgen,
Laura Weidinger,
Arvind Narayanan,
Victor Sanh,
David Adelani,
Percy Liang,
Rishi Bommasani,
Peter Henderson,
Sasha Luccioni,
Yacine Jernite,
Luca Soldaini
Abstract:
Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications. To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet: a growing collection of 250+ tools and resources spanning text, vision, and speech modalities. We draw on a large body of prior work to survey resources (e.g. software, documentation, frameworks, guides, and practical tools) that support informed data selection, processing, and understanding, precise and limitation-aware artifact documentation, efficient model training, advance awareness of the environmental impact from training, careful model evaluation of capabilities, risks, and claims, as well as responsible model release, licensing and deployment practices. We hope this curated collection of resources helps guide more responsible development. The process of curating this list enabled us to review the AI development ecosystem, revealing what tools are critically missing, misused, or over-used in existing practices. We find that (i) tools for data sourcing, model evaluation, and monitoring are critically under-serving ethical and real-world needs, (ii) evaluations for model safety, capabilities, and environmental impact all lack reproducibility and transparency, (iii) text and particularly English-centric analyses continue to dominate over multilingual and multi-modal analyses, and (iv) evaluation of systems, rather than just models, is needed so that capabilities and impact are assessed in context.
Submitted 25 June, 2024; v1 submitted 24 June, 2024;
originally announced June 2024.
-
One Thousand and One Pairs: A "novel" challenge for long-context language models
Authors:
Marzena Karpinska,
Katherine Thai,
Kyle Lo,
Tanya Goyal,
Mohit Iyyer
Abstract:
Synthetic long-context LLM benchmarks (e.g., "needle-in-the-haystack") test only surface-level retrieval capabilities, but how well can long-context LLMs retrieve, synthesize, and reason over information across book-length inputs? We address this question by creating NoCha, a dataset of 1,001 minimally different pairs of true and false claims about 67 recently-published English fictional books, written by human readers of those books. In contrast to existing long-context benchmarks, our annotators confirm that the largest share of pairs in NoCha require global reasoning over the entire book to verify. Our experiments show that while human readers easily perform this task, it is enormously challenging for all ten long-context LLMs that we evaluate: no open-weight model performs above random chance (despite their strong performance on synthetic benchmarks), while GPT-4o achieves the highest accuracy at 55.8%. Further analysis reveals that (1) on average, models perform much better on pairs that require only sentence-level retrieval vs. global reasoning; (2) model-generated explanations for their decisions are often inaccurate even for correctly-labeled claims; and (3) models perform substantially worse on speculative fiction books that contain extensive world-building. The methodology proposed in NoCha allows for the evolution of the benchmark dataset and the easy analysis of future models.
Submitted 23 June, 2024;
originally announced June 2024.
-
Beyond Accidents and Misuse: Decoding the Structural Risk Dynamics of Artificial Intelligence
Authors:
Kyle A Kilian
Abstract:
The integration of artificial intelligence (AI) across contemporary industries is not just a technological upgrade but a transformation with profound structural implications. This paper explores the concept of structural risks associated with the rapid integration of advanced AI systems across social, economic, and political systems. This framework challenges the conventional perspectives that primarily focus on direct AI threats such as accidents and misuse and suggests that these more proximate risks are interconnected and influenced by a larger sociotechnical system. By analyzing the interactions between technological advancements and social dynamics, this study isolates three primary categories of structural risk: antecedent structural causes, antecedent system causes, and deleterious feedback loops. We present a comprehensive framework to understand the causal chains that drive these risks, highlighting the interdependence between structural forces and the more proximate risks of misuse and system failures. The paper articulates how unchecked AI advancement can reshape power dynamics, trust, and incentive structures, leading to profound and often unpredictable shifts. We introduce a methodological research agenda for mapping, simulating, and gaming these dynamics aimed at preparing policymakers and national security officials for the challenges posed by next-generation AI technologies. The paper concludes with policy recommendations.
Submitted 21 June, 2024;
originally announced June 2024.
-
DistiLRR: Transferring Code Repair for Low-Resource Programming Languages
Authors:
Kyle Wong,
Alfonso Amayuelas,
Liangming Pan,
William Yang Wang
Abstract:
Large language models (LLMs) have shown remarkable performance on code generation tasks. A recent application of LLMs for code generation is iterative code repair, where a model fixes an incorrect program by rationalizing about errors and generating a new program. However, code repair is primarily studied on high-resource languages like Python, and the framework's efficacy is under-explored on low-resource languages. To apply code repair for low-resource languages, we propose Distilling Low-Resource Repairs (DistiLRR), an approach that transfers the reasoning and code generation ability from a teacher model to a student model. Our results show that DistiLRR consistently outperforms baselines on low-resource languages, but has similar performance on high-resource languages. To investigate this behavior, we perform a further analysis and find that the correlation between rationale quality and code correctness is weaker than previously perceived. We hypothesize this weakness is magnified in low-resource settings where base models lack deep knowledge of a programming language, leading to wavering benefits of code repair between high-resource and low-resource languages.
Submitted 21 June, 2024;
originally announced June 2024.
-
The Unipotent Tropical Fundamental Group
Authors:
Kyle Binder,
Eric Katz
Abstract:
We define the unipotent tropical fundamental group of a polyhedral complex in $\mathbb{R}^n$ as the Tannakian fundamental group of the category of unipotent tropical vector bundles with integrable connection. We show that it is computable in that it satisfies a Seifert--Van Kampen theorem and has a description for fans in terms of a bar complex. We then review an analogous classical object, the unipotent de Rham fundamental group of a schön subvariety of a toric variety. Our main result is a correspondence theorem between classical and tropical unipotent fundamental groups: there is an isomorphism between the unipotent completion of the fundamental group of a generic fiber of a tropically smooth family over a disc and the tropical unipotent fundamental group of the family's tropicalization. This theorem is established using Kato--Nakayama spaces and a descent argument. It requires a slight enlargement of the relevant categories, making use of enriched structures and partial compactifications.
Submitted 19 June, 2024;
originally announced June 2024.
-
Dark Matter and General Relativistic Instability in Supermassive Stars
Authors:
Kyle S. Kehrer,
George M. Fuller
Abstract:
We calculate the extent to which collisionless dark matter impacts the stability of supermassive stars $(M\gtrsim10^4\,M_\odot)$. We find that, depending on the star's mass, a dark matter content in excess of ${\sim}1\%$ by mass throughout the entire star can raise the critical central density for the onset of general relativistic instability, in some cases by orders of magnitude. We consider implications of this effect for the onset of nuclear burning and significant neutrino energy losses.
Submitted 26 June, 2024; v1 submitted 19 June, 2024;
originally announced June 2024.
-
Clock-line-mediated Sisyphus Cooling
Authors:
Chun-Chia Chen,
Jacob L. Siegel,
Benjamin D. Hunt,
Tanner Grogan,
Youssef S. Hassan,
Kyle Beloy,
Kurt Gibble,
Roger C. Brown,
Andrew D. Ludlow
Abstract:
We demonstrate sub-recoil Sisyphus cooling using the long-lived $^{3}\mathrm{P}_{0}$ clock state in alkaline-earth-like ytterbium. A 1388 nm optical standing wave nearly resonant with the $^{3}\textrm{P}_{0}$$\,\rightarrow$$\,^{3}\textrm{D}_{1}$ transition creates a spatially periodic light shift of the $^{3}\textrm{P}_{0}$ clock state. Following excitation on the ultranarrow clock transition, we observe Sisyphus cooling in this potential, as the light shift is correlated with excitation to $^{3}\textrm{D}_{1}$ and subsequent spontaneous decay to the $^{1}\textrm{S}_{0}$ ground state. We observe that cooling enhances the loading efficiency of atoms into a 759 nm magic-wavelength one-dimensional (1D) optical lattice, as compared to standard Doppler cooling on the $^{1}\textrm{S}_{0}$$\,\rightarrow\,$$^{3}\textrm{P}_{1}$ transition. Sisyphus cooling yields temperatures below 200 nK in the weakly confined, transverse dimensions of the 1D optical lattice. These lower temperatures improve optical lattice clocks by facilitating the use of shallow lattices with reduced light shifts, while retaining large atom numbers to reduce the quantum projection noise. This Sisyphus cooling can be pulsed or continuous and is applicable to a range of quantum metrology applications.
Submitted 19 June, 2024;
originally announced June 2024.
-
Kinetic Inductance, Quantum Geometry, and Superconductivity in Magic-Angle Twisted Bilayer Graphene
Authors:
Miuko Tanaka,
Joel Î-j. Wang,
Thao H. Dinh,
Daniel Rodan-Legrain,
Sameia Zaman,
Max Hays,
Bharath Kannan,
Aziza Almanakly,
David K. Kim,
Bethany M. Niedzielski,
Kyle Serniak,
Mollie E. Schwartz,
Kenji Watanabe,
Takashi Taniguchi,
Jeffrey A. Grover,
Terry P. Orlando,
Simon Gustavsson,
Pablo Jarillo-Herrero,
William D. Oliver
Abstract:
The physics of superconductivity in magic-angle twisted bilayer graphene (MATBG) is a topic of keen interest in moiré systems research, and it may provide insight into the pairing mechanism of other strongly correlated materials such as high-$T_{\mathrm{c}}$ superconductors. Here, we use DC-transport and microwave circuit quantum electrodynamics (cQED) to measure directly the superfluid stiffness of superconducting MATBG via its kinetic inductance. We find the superfluid stiffness to be much larger than expected from conventional single-band Fermi liquid theory; rather, it aligns well with theory involving quantum geometric effects that are dominant at the magic angle. The temperature dependence of the superfluid stiffness exhibits a power-law behavior, which contraindicates an isotropic BCS model; instead, the extracted power-law exponents indicate an anisotropic superconducting gap, whether interpreted using the conventional anisotropic BCS model or a quantum geometric theory of flat-band superconductivity. Moreover, the quadratic dependence of the stiffness on both DC and microwave current is consistent with Ginzburg-Landau theory. Taken together, these findings strongly suggest a connection between quantum geometry, superfluid stiffness, and unconventional superconductivity in MATBG. Finally, the combined DC-microwave measurement platform used here is applicable to the investigation of other atomically thin superconductors.
Submitted 19 June, 2024;
originally announced June 2024.
-
Tidal features and disc thicknesses of edge-on galaxies in the SDSS Stripe 82
Authors:
Maria N. Skryabina,
Kyle R. Adams,
Aleksandr V. Mosenkov
Abstract:
We examine deep optical images of edge-on galaxies selected from the Sloan Digital Sky Survey (SDSS) Stripe\,82. The entire sample consists of over 800 genuine edge-on galaxies with spectroscopic redshifts out to $z\sim0.2$. To discern the faintest details around the galaxies, we use three different data sources with a photometric depth of down to 30 mag\,arcsec$^{-2}$ in the $r$ band: SDSS Stripe\,82, Hyper Suprime-Cam Strategic Program, and DESI Legacy Imaging Surveys. Our analysis of the deep images reveals a variety of low surface brightness features. 49 galaxies exhibit prominent tidal structures, including tidal tails, stellar streams, bridges, and diffuse shells. Additionally, 56 galaxies demonstrate peculiar structural features such as lopsided discs, faint warps, and dim polar rings. Overall, we detect low surface brightness structures in 94 galaxies out of 838, accounting for 11\% of the sample. Notably, the fraction of tidal structures is only 5.8\%, which is significantly lower than that obtained in modern cosmological simulations and observations. Previous studies have shown that strongly interacting galaxies have stellar discs about 1.5--2 times thicker than those without apparent interactions. In an analysis where tidal features are carefully masked for precise disc axis ratio measurements, we show that discs of galaxies with tidal features are 1.33 times thicker, on average, than control galaxies that do not have visible tidal features. Furthermore, we find that edge-on galaxies with tidal structures tend to have a higher fraction of oval and boxy discs than galaxies without tidal features.
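The percentages quoted above can be checked directly from the stated counts. The short sketch below redoes that arithmetic. The variable names are illustrative, and the overlap estimate assumes that the 94 galaxies are the union of the tidal and peculiar groups, which the abstract implies but does not state outright.

```python
# Counts taken from the abstract.
total_galaxies = 838
with_tidal_structures = 49
with_peculiar_features = 56
with_any_lsb_structure = 94

# Fractions of the full sample, in percent.
tidal_pct = 100.0 * with_tidal_structures / total_galaxies
any_lsb_pct = 100.0 * with_any_lsb_structure / total_galaxies

# Assumption: the 94 galaxies are the union of the two groups, so the
# number counted in both follows from inclusion-exclusion.
in_both_groups = (with_tidal_structures + with_peculiar_features
                  - with_any_lsb_structure)

print(f"tidal structures: {tidal_pct:.2f}% of the sample")
print(f"any low surface brightness structure: {any_lsb_pct:.2f}% of the sample")
print(f"galaxies in both groups (under the union assumption): {in_both_groups}")
```

The computed fractions are consistent with the quoted 5.8% and 11%. Under the union assumption, 11 galaxies show both tidal and peculiar features.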
Submitted 19 June, 2024;
originally announced June 2024.
-
Towards Unlocking Insights from Logbooks Using AI
Authors:
Antonin Sulc,
Alex Bien,
Annika Eichler,
Daniel Ratner,
Florian Rehm,
Frank Mayet,
Gregor Hartmann,
Hayden Hoschouer,
Henrik Tuennermann,
Jan Kaiser,
Jason St. John,
Jennefer Maldonado,
Kyle Hazelwood,
Raimund Kammering,
Thorsten Hellert,
Tim Wilksen,
Verena Kain,
Wan-Lin Hu
Abstract:
Electronic logbooks contain valuable information about activities and events concerning their associated particle accelerator facilities. However, the highly technical nature of logbook entries can hinder their usability and automation. As natural language processing (NLP) continues advancing, it offers opportunities to address various challenges that logbooks present. This work explores jointly testing a tailored Retrieval Augmented Generation (RAG) model for enhancing the usability of particle accelerator logbooks at institutes like DESY, BESSY, Fermilab, BNL, SLAC, LBNL, and CERN. The RAG model uses a corpus built on logbook contributions and aims to unlock insights from these logbooks by leveraging retrieval over facility datasets, including discussion about potential multimodal sources. Our goals are to increase the FAIR-ness (findability, accessibility, interoperability, and reusability) of logbooks by exploiting their information content to streamline everyday use, enable macro-analysis for root cause analysis, and facilitate problem-solving automation.
Submitted 25 May, 2024;
originally announced June 2024.
-
Thrombogenic Risk Assessment of Transcatheter Prosthetic Heart Valves Using a Fluid-Structure Interaction Approach
Authors:
Kyle Baylous,
Brandon Kovarovic,
Salwa Anam,
Ryan Helbock,
Marvin Slepian,
Danny Bluestein
Abstract:
Prosthetic heart valve interventions such as TAVR have surged over the past decade, but the associated complication of long-term, life-threatening thrombotic events continues to undermine patient outcomes. Thus, improving thrombogenic risk analysis of TAVR devices is crucial. In vitro studies for thrombogenicity are typically difficult to perform. However, revised ISO testing standards include computational testing for thrombogenic risk assessment of cardiovascular implants. We present a fluid-structure interaction (FSI) approach for assessing thrombogenic risk of prosthetic heart valves.
An FSI framework was implemented via the incompressible computational fluid dynamics multi-physics solver of the Ansys LS-DYNA software. The numerical modeling approach for flow analysis was validated by comparing the derived flow rate of the 29-mm CoreValve device from benchtop testing and orifice areas of commercial TAVR valves in the literature to in silico results. Thrombogenic risk was analyzed by computing stress accumulation (SA) on virtual platelets seeded in the flow fields via Ansys EnSight. The integrated FSI-thrombogenicity methodology was subsequently employed to examine hemodynamics and thrombogenic risk of TAVR devices with two approaches: 1) engineering optimization and 2) clinical assessment.
Our methodology can be used to improve the thromboresistance of prosthetic valves from the initial design stage to the clinic. It allows for unparalleled optimization of devices, uncovering key TAVR leaflet design parameters that can be used to mitigate thrombogenic risk, in addition to patient-specific modeling to evaluate device performance. This work demonstrates the utility of advanced in silico analysis of TAVR devices that can be utilized for thrombogenic risk assessment of other blood recirculating devices.
Submitted 17 June, 2024;
originally announced June 2024.
-
Controlled Erasure as a Building Block for Universal Thermodynamically-Robust Superconducting Computing
Authors:
Christian Z. Pratt,
Kyle J. Ray,
James P. Crutchfield
Abstract:
Reducing the energy inefficiency of conventional CMOS-based computing devices -- which rely on logically irreversible gates to process information -- remains both a fundamental engineering challenge and a practical social challenge of increasing importance. We extend an alternative computing paradigm that manipulates microstate distributions to store information in the metastable minima determined by an effective potential energy landscape. These minima serve as mesoscopic memories that are manipulated by a dynamic landscape to perform information processing. Central to our results is the control erase (CE) protocol that controls the landscape's metastable minima to determine whether information is preserved or erased. Importantly, successive protocol executions can implement a NAND gate -- a logically-irreversible universal logic gate. We show how to practically implement this in a device created by two inductively-coupled superconducting quantum interference devices (SQUIDs). We identify circuit parameter ranges that give rise to effective CEs and establish the device's robustness against logical errors. These SQUID-based logical devices are capable of operating above GHz frequencies and at the $k_\text{B} T$ energy scale. Due to this, optimized devices and associated protocols provide a universal-computation substrate that is both computationally fast and energy efficient.
Submitted 30 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Spatial and Spectral Characterization of the Gravitational-wave Background with the PTA Optimal Statistic
Authors:
Kyle A. Gersbach,
Stephen R. Taylor,
Patrick M. Meyers,
Joseph D. Romano
Abstract:
Pulsar timing arrays (PTAs) have made tremendous progress and are now showing strong evidence for the gravitational-wave background (GWB). Further probing the origin and characteristics of the GWB will require more generalized analysis techniques. Bayesian methods are most often used but can be computationally expensive. On the other hand, frequentist methods, like the PTA Optimal Statistic (OS), are more computationally efficient and can produce results that are complementary to Bayesian methods, allowing for stronger statistical cases to be built from a confluence of different approaches. In this work we expand the capabilities of the OS through a technique we call the Per-Frequency Optimal Statistic (PFOS). The PFOS removes the underlying power-law assumption inherent in previous implementations of the OS, and allows one to estimate the GWB spectrum in a frequency-by-frequency manner. We have also adapted a recent generalization from the OS pipeline into the PFOS, making it capable of accurately characterizing the spectrum in the intermediate and strong GW signal regimes using only a small fraction of the necessary computational resources when compared with fully-correlated Bayesian methods, while also empowering many new types of analyses not possible before. We find that even in the strong GW signal regime, where the GWB dominates over noise in all frequencies, the injected value of the signal lies within the 50th-percentile of the PFOS uncertainty distribution in 41-45% of simulations, remaining 3$σ$-consistent with unbiased estimation.
Submitted 17 June, 2024;
originally announced June 2024.
-
Run Time Assured Reinforcement Learning for Six Degree-of-Freedom Spacecraft Inspection
Authors:
Kyle Dunlap,
Kochise Bennett,
David van Wijk,
Nathaniel Hamilton,
Kerianne Hobbs
Abstract:
The trial and error approach of reinforcement learning (RL) results in high performance across many complex tasks, but it can also lead to unsafe behavior. Run time assurance (RTA) approaches can be used to assure safety of the agent during training, allowing it to safely explore the environment. This paper investigates the application of RTA during RL training for a 6-Degree-of-Freedom spacecraft inspection task, where the agent must control its translational motion and attitude to inspect a passive chief spacecraft. Several safety constraints are developed based on position, velocity, attitude, temperature, and power of the spacecraft, and are all enforced simultaneously during training through the use of control barrier functions. This paper also explores simulating the RL agent and RTA at different frequencies to best balance training performance and safety assurance. The agent is trained with and without RTA, and the performance is compared across several metrics including inspection percentage and fuel usage.
Submitted 17 June, 2024;
originally announced June 2024.
-
DataComp-LM: In search of the next generation of training sets for language models
Authors:
Jeffrey Li,
Alex Fang,
Georgios Smyrnis,
Maor Ivgi,
Matt Jordan,
Samir Gadre,
Hritik Bansal,
Etash Guha,
Sedrick Keh,
Kushal Arora,
Saurabh Garg,
Rui Xin,
Niklas Muennighoff,
Reinhard Heckel,
Jean Mercat,
Mayee Chen,
Suchin Gururangan,
Mitchell Wortsman,
Alon Albalak,
Yonatan Bitton,
Marianna Nezhurina,
Amro Abbas,
Cheng-Yu Hsieh,
Dhruba Ghosh,
Josh Gardner
, et al. (34 additional authors not shown)
Abstract:
We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants in the DCLM benchmark can experiment with data curation strategies such as deduplication, filtering, and data mixing at model scales ranging from 412M to 7B parameters. As a baseline for DCLM, we conduct extensive experiments and find that model-based filtering is key to assembling a high-quality training set. The resulting dataset, DCLM-Baseline, enables training a 7B parameter language model from scratch to 64% 5-shot accuracy on MMLU with 2.6T training tokens. Compared to MAP-Neo, the previous state-of-the-art in open-data language models, DCLM-Baseline represents a 6.6 percentage point improvement on MMLU while being trained with 40% less compute. Our baseline model is also comparable to Mistral-7B-v0.3 and Llama 3 8B on MMLU (63% & 66%), and performs similarly on an average of 53 natural language understanding tasks while being trained with 6.6x less compute than Llama 3 8B. Our results highlight the importance of dataset design for training language models and offer a starting point for further research on data curation.
Submitted 20 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Photoinduced Patterning of Oxygen Vacancies to Promote the Ferroelectric Phase of $\mathrm{Hf_{0.5}Zr_{0.5}O_2}$
Authors:
Thomas E Beechem,
Fernando Vega,
Samantha T. Jaszewski,
Benjamin L. Aronson,
Kyle P. Kelley,
Jon F. Ihlefeld
Abstract:
Photoinduced reductions in the oxygen vacancy concentration were leveraged to increase the ferroelectric phase fraction of $\mathrm{Hf_{0.5}Zr_{0.5}O_2}$ (HZO) thin films. Modest ($\sim 0.02-0.77~\mathrm{mJ/μm^2}$) laser doses of visible light (488 nm, 2.54 eV) spatially patterned the concentration of oxygen vacancies as monitored by photoluminescence imaging. Local, tip-based, near-field, nanoFTIR measurements showed that the photoinduced oxygen vacancy concentration reduction promoted formation of the ferroelectric phase (space group $Pca2_1$) resulting in an increase in the piezoelectric response measured by piezoresponse force microscopy. Photoinduced vacancy tailoring provides, therefore, a spatially prescriptive, post-synthesis, and low-entry method to modify phase in $\mathrm{HfO_2}$-based materials.
Submitted 17 June, 2024;
originally announced June 2024.
-
The Base-Rate Effect on LLM Benchmark Performance: Disambiguating Test-Taking Strategies from Benchmark Performance
Authors:
Kyle Moore,
Jesse Roberts,
Thao Pham,
Oseremhen Ewaleifoh,
Doug Fisher
Abstract:
Cloze testing is a common method for measuring the behavior of large language models on a number of benchmark tasks. Using the MMLU dataset, we show that the base-rate probability (BRP) differences across answer tokens are significant and affect task performance, i.e., guess A if uncertain. We find that counterfactual prompting does sufficiently mitigate the BRP effect. The BRP effect is found to have a similar effect to test-taking strategies employed by humans, leading to the conflation of task performance and test-taking ability. We propose the Nvr-X-MMLU task, a variation of MMLU, which helps to disambiguate test-taking ability from task performance and reports the latter.
Submitted 17 June, 2024;
originally announced June 2024.
-
A Comprehensive Survey of Foundation Models in Medicine
Authors:
Wasif Khan,
Seowung Leem,
Kyle B. See,
Joshua K. Wong,
Shaoting Zhang,
Ruogu Fang
Abstract:
Foundation models (FMs) are large-scale deep-learning models trained on extensive datasets using self-supervised techniques. These models serve as a base for various downstream tasks, including healthcare. FMs have been adopted with great success across various domains within healthcare, including natural language processing (NLP), computer vision, graph learning, biology, and omics. Existing healthcare-based surveys have not yet included all of these domains. Therefore, this survey provides a comprehensive overview of FMs in healthcare. We focus on the history, learning strategies, flagship models, applications, and challenges of FMs. We explore how FMs such as the BERT and GPT families are reshaping various healthcare domains, including clinical large language models, medical image analysis, and omics data. Furthermore, we provide a detailed taxonomy of healthcare applications facilitated by FMs, such as clinical NLP, medical computer vision, graph learning, and other biology-related tasks. Despite the promising opportunities FMs provide, they also have several associated challenges, which are explained in detail. We also outline potential future directions to provide researchers and practitioners with insights into the potential and limitations of FMs in healthcare to advance their deployment and mitigate associated risks.
Submitted 15 June, 2024;
originally announced June 2024.
-
BrainFounder: Towards Brain Foundation Models for Neuroimage Analysis
Authors:
Joseph Cox,
Peng Liu,
Skylar E. Stolte,
Yunchao Yang,
Kang Liu,
Kyle B. See,
Huiwen Ju,
Ruogu Fang
Abstract:
The burgeoning field of brain health research increasingly leverages artificial intelligence (AI) to interpret and analyze neurological data. This study introduces a novel approach towards the creation of medical foundation models by integrating a large-scale multi-modal magnetic resonance imaging (MRI) dataset derived from 41,400 participants in its own. Our method involves a novel two-stage pretraining approach using vision transformers. The first stage is dedicated to encoding anatomical structures in generally healthy brains, identifying key features such as shapes and sizes of different brain regions. The second stage concentrates on spatial information, encompassing aspects like location and the relative positioning of brain structures. We rigorously evaluate our model, BrainFounder, using the Brain Tumor Segmentation (BraTS) challenge and Anatomical Tracings of Lesions After Stroke v2.0 (ATLAS v2.0) datasets. BrainFounder demonstrates a significant performance gain, surpassing the achievements of the previous winning solutions using fully supervised learning. Our findings underscore the impact of scaling up both the complexity of the model and the volume of unlabeled training data derived from generally healthy brains, which enhances the accuracy and predictive capabilities of the model in complex neuroimaging tasks with MRI. The implications of this research provide transformative insights and practical applications in healthcare and make substantial steps towards the creation of foundation models for Medical AI. Our pretrained models and training code can be found at https://github.com/lab-smile/GatorBrain.
Submitted 14 June, 2024;
originally announced June 2024.
-
Diffusion Synthesizer for Efficient Multilingual Speech to Speech Translation
Authors:
Nameer Hirschkind,
Xiao Yu,
Mahesh Kumar Nandwana,
Joseph Liu,
Eloi DuBois,
Dao Le,
Nicolas Thiebaut,
Colin Sinclair,
Kyle Spence,
Charles Shang,
Zoe Abrams,
Morgan McGuire
Abstract:
We introduce DiffuseST, a low-latency, direct speech-to-speech translation system capable of preserving the input speaker's voice zero-shot while translating from multiple source languages into English. We experiment with the synthesizer component of the architecture, comparing a Tacotron-based synthesizer to a novel diffusion-based synthesizer. We find the diffusion-based synthesizer to improve MOS and PESQ audio quality metrics by 23\% each and speaker similarity by 5\% while maintaining comparable BLEU scores. Despite having more than double the parameter count, the diffusion synthesizer has lower latency, allowing the entire model to run more than 5$\times$ faster than real-time.
Submitted 14 June, 2024;
originally announced June 2024.
-
JWST/NIRCam 4-5 $μ$m Imaging of the Giant Planet AF Lep b
Authors:
Kyle Franson,
William O. Balmer,
Brendan P. Bowler,
Laurent Pueyo,
Yifan Zhou,
Emily Rickman,
Zhoujian Zhang,
Sagnick Mukherjee,
Tim D. Pearce,
Daniella C. Bardalez Gagliuffi,
Lauren I. Biddle,
Timothy D. Brandt,
Rachel Bowens-Rubin,
Justin R. Crepp,
James W. Davidson, Jr.,
Jacqueline Faherty,
Christian Ginski,
Elliott P. Horch,
Marvin Morgan,
Caroline V. Morley,
Marshall D. Perrin,
Aniket Sanghi,
Maissa Salama,
Christopher A. Theissen,
Quang H. Tran
, et al. (1 additional authors not shown)
Abstract:
With a dynamical mass of $3 \, M_\mathrm{Jup}$, the recently discovered giant planet AF Lep b is the lowest-mass imaged planet with a direct mass measurement. Its youth and spectral type near the L/T transition make it a promising target to study the impact of clouds and atmospheric chemistry at low surface gravities. In this work, we present JWST/NIRCam imaging of AF Lep b. Across two epochs, we detect AF Lep b in F444W ($4.4 \, \mathrm{μm}$) with S/N ratios of 9.6 and 8.7, respectively. At the planet's separation of $320 \, \mathrm{mas}$ during the observations, the coronagraphic throughput is ${\approx}7\%$, demonstrating that NIRCam's excellent sensitivity persists down to small separations. The F444W photometry of AF Lep b affirms the presence of disequilibrium carbon chemistry and enhanced atmospheric metallicity. These observations also place deep limits on wider-separation planets in the system, ruling out $1.1 \, M_\mathrm{Jup}$ planets beyond $15.6 \, \mathrm{au}$ (0.58 arcsec), $1.1 \, M_\mathrm{Sat}$ planets beyond $27 \, \mathrm{au}$ (1 arcsec), and $2.8 \, M_\mathrm{Nep}$ planets beyond $67 \, \mathrm{au}$ (2.5 arcsec). We also present new Keck/NIRC2 $L'$ imaging of AF Lep b; combining this with the two epochs of F444W photometry and previous Keck $L'$ photometry provides limits on the long-term 3-$5 \, \mathrm{μm}$ variability of AF Lep b on months-to-years timescales. AF Lep b is the closest-separation planet imaged with JWST to date, demonstrating that planets can be recovered well inside the nominal (50% throughput) NIRCam coronagraph inner working angle.
Submitted 13 June, 2024;
originally announced June 2024.
-
SViTT-Ego: A Sparse Video-Text Transformer for Egocentric Video
Authors:
Hector A. Valdez,
Kyle Min,
Subarna Tripathi
Abstract:
Pretraining egocentric vision-language models has become essential to improving downstream egocentric video-text tasks. These egocentric foundation models commonly use the transformer architecture. The memory footprint of these models during pretraining can be substantial. Therefore, we pretrain SViTT-Ego, the first sparse egocentric video-text transformer model integrating edge and node sparsification. We pretrain on the EgoClip dataset and incorporate the egocentric-friendly objective EgoNCE, instead of the frequently used InfoNCE. Most notably, SViTT-Ego obtains a +2.8% gain on EgoMCQ (intra-video) accuracy compared to LAVILA large, with no additional data augmentation techniques other than standard image augmentations, yet pretrainable on memory-limited devices.
Submitted 12 June, 2024;
originally announced June 2024.
-
AuriDESI: Mock Catalogues for the DESI Milky Way Survey
Authors:
Namitha Kizhuprakkat,
Andrew P. Cooper,
Alexander H. Riley,
Sergey E. Koposov,
Jessica Nicole Aguilar,
Steven Ahlen,
Carlos Allende Prieto,
David Brooks,
Todd Claybaugh,
Kyle Dawson,
Axel de la Macorra,
Peter Doel,
Jaime E. Forero-Romero,
Carlos Frenk,
Enrique Gaztañaga,
Oleg Y. Gnedin,
Robert J. J. Grand,
Satya Gontcho A Gontcho,
Klaus Honscheid,
Robert Kehoe,
Martin Landriau,
Marc Manera,
Aaron Meisner,
Ramon Miquel,
Jundan Nie
, et al. (9 additional authors not shown)
Abstract:
The Dark Energy Spectroscopic Instrument Milky Way Survey (DESI MWS) will explore the assembly history of the Milky Way by characterising remnants of ancient dwarf galaxy accretion events and improving constraints on the distribution of dark matter in the outer halo. We present mock catalogues that reproduce the selection criteria of MWS and the format of the final MWS data set. These catalogues can be used to test methods for quantifying the properties of stellar halo substructure and reconstructing the Milky Way's accretion history with the MWS data, including the effects of halo-to-halo variance. The mock catalogues are based on a phase-space kernel expansion technique applied to star particles in the Auriga suite of six high-resolution $Λ$CDM magneto-hydrodynamic zoom-in simulations. They include photometric properties (and associated errors) used in DESI target selection and the outputs of the MWS spectral analysis pipeline (radial velocity, metallicity, surface gravity, and temperature). They also include information from the underlying simulation, such as the total gravitational potential and information on the progenitors of accreted halo stars. We discuss how the subset of halo stars observable by MWS in these simulations corresponds to their true content and properties. These mock Milky Ways have rich accretion histories, resulting in a large number of substructures that span the whole stellar halo out to large distances and have substantial overlap in the space of orbital energy and angular momentum.
Submitted 13 June, 2024;
originally announced June 2024.
-
An Approach to Build Zero-Shot Slot-Filling System for Industry-Grade Conversational Assistants
Authors:
G P Shrivatsa Bhargav,
Sumit Neelam,
Udit Sharma,
Shajith Ikbal,
Dheeraj Sreedhar,
Hima Karanam,
Sachindra Joshi,
Pankaj Dhoolia,
Dinesh Garg,
Kyle Croutwater,
Haode Qi,
Eric Wayne,
J William Murdock
Abstract:
We present an approach to build a Large Language Model (LLM)-based slot-filling system to perform Dialogue State Tracking in conversational assistants serving across a wide variety of industry-grade applications. Key requirements of this system include: 1) usage of smaller-sized models to meet low-latency requirements and to enable convenient and cost-effective cloud and customer premise deployments, and 2) zero-shot capabilities to serve across a wide variety of domains, slot types and conversational scenarios. We adopt a fine-tuning approach where a pre-trained LLM is fine-tuned into a slot-filling model using task-specific data. The fine-tuning data is prepared carefully to cover a wide variety of slot-filling task scenarios that the model is expected to face across various domains. We give details of the data preparation and model building process. We also give a detailed analysis of the results of our experimental evaluations. Results show that our prescribed approach for slot-filling model building has resulted in a 6.9% relative improvement in the F1 metric over the best baseline on a realistic benchmark, while at the same time reducing the latency by 57%. Moreover, the data we prepared has helped improve F1 by 4.2% relative on average across various slot types.
Submitted 13 June, 2024;
originally announced June 2024.
-
Jet modification via $π^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV
Authors:
PHENIX Collaboration,
N. J. Abdulameer,
U. Acharya,
A. Adare,
S. Afanasiev,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
H. Al-Bataineh,
J. Alexander,
M. Alfred,
K. Aoki,
N. Apadula,
L. Aphecetche,
J. Asai,
H. Asano,
E. T. Atomssa,
R. Averbeck,
T. C. Awes,
B. Azmoun,
V. Babintsev,
M. Bai,
G. Baksay,
L. Baksay,
A. Baldisseri
, et al. (510 additional authors not shown)
Abstract:
High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is observed in the yield of high-momentum jet fragments opposite the trigger particle, which indicates jet suppression stemming from in-medium partonic energy loss, while enhancement is observed for low-momentum particles. The ratio and differences between the yield in Au$+$Au collisions and $p$$+$$p$ collisions, $I_{AA}$ and $Δ_{AA}$, as a function of the trigger-hadron azimuthal separation, $Δφ$, are measured for the first time at the Relativistic Heavy Ion Collider. These results better quantify how the yield of low-$p_T$ associated hadrons is enhanced at wide angle, which is crucial for studying energy loss as well as medium-response effects.
Submitted 12 June, 2024;
originally announced June 2024.
-
Suppressing Counter-Rotating Errors for Fast Single-Qubit Gates with Fluxonium
Authors:
David A. Rower,
Leon Ding,
Helin Zhang,
Max Hays,
Junyoung An,
Patrick M. Harrington,
Ilan T. Rosen,
Jeffrey M. Gertler,
Thomas M. Hazard,
Bethany M. Niedzielski,
Mollie E. Schwartz,
Simon Gustavsson,
Kyle Serniak,
Jeffrey A. Grover,
William D. Oliver
Abstract:
Qubit decoherence unavoidably degrades the fidelity of quantum logic gates. Accordingly, realizing gates that are as fast as possible is a guiding principle for qubit control, necessitating protocols for mitigating error channels that become significant as gate time is decreased. One such error channel arises from the counter-rotating component of strong, linearly polarized drives. This error channel is particularly important when gate times approach the qubit Larmor period and represents the dominant source of infidelity for sufficiently fast single-qubit gates with low-frequency qubits such as fluxonium. In this work, we develop and demonstrate two complementary protocols for mitigating this error channel. The first protocol realizes circularly polarized driving in circuit quantum electrodynamics (QED) through simultaneous charge and flux control. The second protocol -- commensurate pulses -- leverages the coherent and periodic nature of counter-rotating fields to regularize their contributions to gates, enabling single-qubit gate fidelities reliably exceeding $99.997\%$. This protocol is platform independent and requires no additional calibration overhead. This work establishes straightforward strategies for mitigating counter-rotating effects from strong drives in circuit QED and other platforms, which we expect to be helpful in the effort to realize high-fidelity control for fault-tolerant quantum computing.
Submitted 12 June, 2024;
originally announced June 2024.
-
SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature
Authors:
David Wadden,
Kejian Shi,
Jacob Morrison,
Aakanksha Naik,
Shruti Singh,
Nitzan Barzilay,
Kyle Lo,
Tom Hope,
Luca Soldaini,
Shannon Zejiang Shen,
Doug Downey,
Hannaneh Hajishirzi,
Arman Cohan
Abstract:
We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks covering five essential scientific literature understanding capabilities: information extraction, summarization, question answering, claim verification, and classification. SciRIFF demonstrations are notable for their long input contexts, detailed task specifications, and complex structured outputs. While instruction-following resources are available in specific domains such as clinical medicine and chemistry, SciRIFF is the first dataset focused on extracting and synthesizing information from research literature across a wide range of scientific fields. To demonstrate the utility of SciRIFF, we develop a sample-efficient strategy to adapt a general instruction-following model for science by performing additional finetuning on a mix of general-domain and SciRIFF demonstrations. In evaluations on nine held-out scientific tasks, our model -- called SciTulu -- improves over a strong LLM baseline by 28.1% and 6.5% at the 7B and 70B scales respectively, while maintaining general instruction-following performance within 2% of the baseline. We are optimistic that SciRIFF will facilitate the development and evaluation of LLMs to help researchers navigate the ever-growing body of scientific literature. We release our dataset, model checkpoints, and data processing and evaluation code to enable further research.
Submitted 18 June, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Personalized Product Assortment with Real-time 3D Perception and Bayesian Payoff Estimation
Authors:
Porter Jenkins,
Michael Selander,
J. Stockton Jenkins,
Andrew Merrill,
Kyle Armstrong
Abstract:
Product assortment selection is a critical challenge facing physical retailers. Effectively aligning inventory with the preferences of shoppers can increase sales and decrease out-of-stocks. However, in real-world settings the problem is challenging due to the combinatorial explosion of product assortment possibilities. Consumer preferences are typically heterogeneous across space and time, making inventory-preference alignment challenging. Additionally, existing strategies rely on syndicated data, which tends to be aggregated, low resolution, and suffer from high latency. To solve these challenges, we introduce a real-time recommendation system, which we call EdgeRec3D. Our system utilizes recent advances in 3D computer vision for perception and automatic, fine grained sales estimation. These perceptual components run on the edge of the network and facilitate real-time reward signals. Additionally, we develop a Bayesian payoff model to account for noisy estimates from 3D LIDAR data. We rely on spatial clustering to allow the system to adapt to heterogeneous consumer preferences, and a graph-based candidate generation algorithm to address the combinatorial search problem. We test our system in real-world stores across two, 6-8 week A/B tests with beverage products and demonstrate a 35% and 27% increase in sales respectively. Finally, we monitor the deployed system for a period of 28 weeks with an observational study and show a 9.4% increase in sales.
Submitted 13 June, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
Understanding Visual Concepts Across Models
Authors:
Brandon Trabucco,
Max Gurinas,
Kyle Doherty,
Ruslan Salakhutdinov
Abstract:
Large multimodal models such as Stable Diffusion can generate, detect, and classify new visual concepts after fine-tuning just a single word embedding. Do models learn similar words for the same concepts (i.e. <orange-cat> = orange + cat)? We conduct a large-scale analysis on three state-of-the-art models in text-to-image generation, open-set object detection, and zero-shot classification, and find that new word embeddings are model-specific and non-transferable. Across 4,800 new embeddings trained for 40 diverse visual concepts on four standard datasets, we find perturbations within an $ε$-ball to any prior embedding that generate, detect, and classify an arbitrary concept. When these new embeddings are spliced into new models, fine-tuning that targets the original model is lost. We show popular soft prompt-tuning approaches find these perturbative solutions when applied to visual concept learning tasks, and embeddings for visual concepts are not transferable. Code for reproducing our work is available at: https://visual-words.github.io.
Submitted 11 June, 2024;
originally announced June 2024.
-
SYM3D: Learning Symmetric Triplanes for Better 3D-Awareness of GANs
Authors:
Jing Yang,
Kyle Fogarty,
Fangcheng Zhong,
Cengiz Oztireli
Abstract:
Despite the growing success of 3D-aware GANs, which can be trained on 2D images to generate high-quality 3D assets, they still rely on multi-view images with camera annotations to synthesize sufficient details from all viewing directions. However, the scarce availability of calibrated multi-view image datasets, especially in comparison to single-view images, has limited the potential of 3D GANs. Moreover, while bypassing camera pose annotations with a camera distribution constraint reduces dependence on exact camera parameters, it still struggles to generate a consistent orientation of 3D assets. To this end, we propose SYM3D, a novel 3D-aware GAN designed to leverage the prevalent reflectional symmetry structure found in natural and man-made objects, alongside a proposed view-aware spatial attention mechanism in learning the 3D representation. We evaluate SYM3D on both synthetic (ShapeNet Chairs, Cars, and Airplanes) and real-world datasets (ABO-Chair), demonstrating its superior performance in capturing detailed geometry and texture, even when trained on only single-view images. Finally, we demonstrate the effectiveness of incorporating symmetry regularization in helping reduce artifacts in the modeling of 3D assets in the text-to-3D task.
Submitted 10 June, 2024;
originally announced June 2024.
-
Implications for Governance in Public Perceptions of Societal-scale AI Risks
Authors:
Ross Gruetzemacher,
Toby D. Pilditch,
Huigang Liang,
Christy Manning,
Vael Gates,
David Moss,
James W. B. Elsey,
Willem W. A. Sleegers,
Kyle Kilian
Abstract:
Amid growing concerns over AI's societal risks--ranging from civilizational collapse to misinformation and systemic bias--this study explores the perceptions of AI experts and the general US registered voters on the likelihood and impact of 18 specific AI risks, alongside their policy preferences for managing these risks. While both groups favor international oversight over national or corporate governance, our survey reveals a discrepancy: voters perceive AI risks as both more likely and more impactful than experts, and also advocate for slower AI development. Specifically, our findings indicate that policy interventions may best assuage collective concerns if they attempt to more carefully balance mitigation efforts across all classes of societal-scale risks, effectively nullifying the near-vs-long-term debate over AI risks. More broadly, our results will serve not only to enable more substantive policy discussions for preventing and mitigating AI risks, but also to underscore the challenge of consensus building for effective policy implementation.
Submitted 10 June, 2024;
originally announced June 2024.