-
MetaFood CVPR 2024 Challenge on Physically Informed 3D Food Reconstruction: Methods and Results
Authors:
Jiangpeng He,
Yuhao Chen,
Gautham Vinod,
Talha Ibn Mahmud,
Fengqing Zhu,
Edward Delp,
Alexander Wong,
Pengcheng Xi,
Ahmad AlMughrabi,
Umair Haroon,
Ricardo Marques,
Petia Radeva,
Jiadong Tang,
Dianyi Yang,
Yu Gao,
Zhaoxiang Liang,
Yawei Jueluo,
Chengyu Shi,
Pengyu Wang
Abstract:
The increasing interest in computer vision applications for nutrition and dietary monitoring has led to the development of advanced 3D reconstruction techniques for food items. However, the scarcity of high-quality data and limited collaboration between industry and academia have constrained progress in this field. Building on recent advancements in 3D reconstruction, we host the MetaFood Workshop and its challenge for Physically Informed 3D Food Reconstruction. This challenge focuses on reconstructing volume-accurate 3D models of food items from 2D images, using a visible checkerboard as a size reference. Participants were tasked with reconstructing 3D models for 20 selected food items of varying difficulty levels: easy, medium, and hard. The easy level provides 200 images, the medium level provides 30 images, and the hard level provides only 1 image for reconstruction. In total, 16 teams submitted results in the final testing phase. The solutions developed in this challenge achieved promising results in 3D food reconstruction, with significant potential for improving portion estimation for dietary assessment and nutritional monitoring. More details about this workshop challenge and access to the dataset can be found at https://sites.google.com/view/cvpr-metafood-2024.
Submitted 12 July, 2024;
originally announced July 2024.
-
An impulsive geomagnetic effect from an early-impulsive flare
Authors:
Hugh S. Hudson,
Edward W. Cliver,
Lyndsay Fletcher,
Declan A. Diver,
Peter T. Gallagher,
Ying Li,
Christopher M. J. Osborne,
Craig Stark,
Yang Su
Abstract:
The geomagnetic "solar flare effect" (SFE) results from excess ionization in the Earth's ionosphere, famously first detected at the time of the Carrington flare in 1859. This indirect detection of a flare constituted one of the first cases of "multimessenger astronomy," whereby solar ionizing radiation stimulates ionospheric currents. Well-observed SFEs have few-minute time scales and perturbations of >10 nT, with the greatest events reaching above 100 nT. In previously reported cases the SFE time profiles tend to resemble those of solar soft X-ray emission, which ionizes the D-region; there is also a less-well-studied contribution from Lyman-alpha. We report here a specific case, from flare SOL2024-03-10 (M7.4), in which an impulsive SFE deviated from this pattern. This flare contained an "early impulsive" component of exceptionally hard radiation, extending up to gamma-ray energies above 1 MeV, distinctly before the bulk of the flare soft X-ray emission. We can characterize the spectral distribution of this early-impulsive component in detail, thanks to the modern extensive wavelength coverage. A more typical gradual SFE occurred during the flare's main phase. We suggest that events of this type warrant exploration of the solar physics in the "impulse response" limit of very short time scales.
Submitted 12 July, 2024;
originally announced July 2024.
-
Gauge model of the fifth force
Authors:
G. A. Sardanashvily,
E. G. Timoshenko
Abstract:
The Lagrangian and equations of the gauge model of the fifth fundamental interaction are constructed and the corresponding corrections to the Newtonian potential are obtained.
Submitted 12 July, 2024;
originally announced July 2024.
-
Design and characterization of a 60-cm reflective half-wave plate for the CLASS 90 GHz band telescope
Authors:
Rui Shi,
Michael K. Brewer,
Carol Yan Yan Chan,
David T. Chuss,
Jullianna Denes Couto,
Joseph R. Eimer,
John Karakla,
Koji Shukawa,
Deniz A. N. Valle,
John W. Appel,
Charles L. Bennett,
Sumit Dahal,
Thomas Essinger-Hileman,
Tobias A. Marriage,
Matthew A. Petroff,
Karwan Rostem,
Edward J. Wollack
Abstract:
Front-end polarization modulation enables improved polarization measurement stability by modulating the targeted signal above the low-frequency $1/f$ drifts associated with atmospheric and instrumental instabilities and diminishes the impact of instrumental polarization. In this work, we present the design and characterization of a new 60-cm diameter Reflective Half-Wave Plate (RHWP) polarization modulator for the 90 GHz band telescope of the Cosmology Large Angular Scale Surveyor (CLASS) project. The RHWP consists of an array of parallel wires (diameter $50~\mathrm{μm}$, $175~\mathrm{μm}$ pitch) positioned $0.88~\mathrm{mm}$ from an aluminum mirror. In lab tests, it was confirmed that the wire resonance frequency ($f_\mathrm{res}$) profile is consistent with the target, $139~\mathrm{Hz}<f_\mathrm{res}<154~\mathrm{Hz}$ in the optically active region (diameter smaller than $150~\mathrm{mm}$), preventing the wire vibration during operation and reducing the RHWP deformation under the wire tension. The mirror tilt relative to the rotating axis was controlled to be $<15''$, corresponding to an increase in beam width due to beam smearing of $<0.6''$, negligible compared to the beam's full-width half-maximum of $36'$. The median and 16/84th percentile of the wire--mirror separation residual was $0.048^{+0.013}_{-0.014}~\mathrm{mm}$ in the optically active region, achieving a modulation efficiency $ε=96.2^{+0.5}_{-0.4}\%$ with an estimated bandpass of 34 GHz. The angular velocity of the RHWP was maintained to an accuracy of within $0.005\%$ at the nominal rotation frequency ($2.5~\mathrm{Hz}$). The RHWP has been successfully integrated into the CLASS 90 GHz telescope and started taking data in June 2024, replacing the previous modulator that has been in operation since June 2018.
Submitted 11 July, 2024;
originally announced July 2024.
-
"Best" iterative coupled-cluster triples model: More evidence for 3CC
Authors:
Nakul Teke,
Ajay Melekamburath,
Bimal Gaudel,
Edward F. Valeev
Abstract:
To follow up on the unexpectedly good performance of several coupled-cluster models with approximate inclusion of 3-body clusters [J. Chem. Phys. 151, 064102 (2019)] we performed a more complete assessment of the 3CC method [J. Chem. Phys. 125, 204105 (2006)] for accurate computational thermochemistry in the standard HEAT framework. A new spin-integrated implementation of the 3CC method applicable to closed- and open-shell systems utilizes a new automated toolchain for derivation, optimization, and evaluation of operator algebra in many-body electronic structure. We found that with a double-zeta basis set the 3CC correlation energies and their atomization energy contributions are almost always more accurate (with respect to the CCSDTQ reference) than the CCSDT model as well as the standard CCSD(T) model. The mean absolute errors in cc-pVDZ {3CC, CCSDT, and CCSD(T)} electronic (per valence electron) and atomization energies relative to the CCSDTQ reference for the HEAT dataset [J. Chem. Phys. 121, 11599 (2004)] were {24, 70, 122} $μE_h/e$ and {0.46, 2.00, 2.58} kJ/mol, respectively. The mean absolute errors in the complete-basis-set limit {3CC, CCSDT, and CCSD(T)} atomization energies relative to the HEAT model reference were {0.52, 2.00, and 1.07} kJ/mol. The significant and systematic reduction of the error by the 3CC method and its lower cost than CCSDT suggest it as a viable candidate for post-CCSD(T) thermochemistry applications, as well as the preferred alternative to CCSDT in general.
Submitted 11 July, 2024;
originally announced July 2024.
-
Real-Time Anomaly Detection and Reactive Planning with Large Language Models
Authors:
Rohan Sinha,
Amine Elhafsi,
Christopher Agia,
Matthew Foutter,
Edward Schmerling,
Marco Pavone
Abstract:
Foundation models, e.g., large language models (LLMs), trained on internet-scale data possess zero-shot generalization capabilities that make them a promising technology towards detecting and mitigating out-of-distribution failure modes of robotic systems. Fully realizing this promise, however, poses two challenges: (i) mitigating the considerable computational expense of these models such that they may be applied online, and (ii) incorporating their judgement regarding potential anomalies into a safe control framework. In this work, we present a two-stage reasoning framework: First is a fast binary anomaly classifier that analyzes observations in an LLM embedding space, which may then trigger a slower fallback selection stage that utilizes the reasoning capabilities of generative LLMs. These stages correspond to branch points in a model predictive control strategy that maintains the joint feasibility of continuing along various fallback plans to account for the slow reasoner's latency as soon as an anomaly is detected, thus ensuring safety. We show that our fast anomaly classifier outperforms autoregressive reasoning with state-of-the-art GPT models, even when instantiated with relatively small language models. This enables our runtime monitor to improve the trustworthiness of dynamic robotic systems, such as quadrotors or autonomous vehicles, under resource and time constraints. Videos illustrating our approach in both simulation and real-world experiments are available on this project page: https://sites.google.com/view/aesop-llm.
Submitted 11 July, 2024;
originally announced July 2024.
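
The two-stage idea in the abstract above (a cheap check in embedding space that only occasionally hands off to a slower reasoner) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' method: the cosine-similarity scoring rule, the threshold, the toy 2-D "embeddings", and the names `anomaly_score`, `monitor`, and `slow_reasoner` are all hypothetical. Embeddings are assumed to be non-zero vectors.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def anomaly_score(embedding: Vector, nominal: Sequence[Vector]) -> float:
    """Hypothetical fast score: 1 minus the best similarity to any nominal embedding."""
    return 1.0 - max(cosine_similarity(embedding, n) for n in nominal)


def monitor(
    embedding: Vector,
    nominal: Sequence[Vector],
    threshold: float,
    slow_reasoner: Callable[[Vector], str],
) -> str:
    """Stage 1 runs on every observation; stage 2 is called only when stage 1 flags it."""
    if anomaly_score(embedding, nominal) <= threshold:
        return "continue"
    return slow_reasoner(embedding)


# Toy data: two nominal embeddings and a stand-in for the slow fallback selector.
nominal_set = [(1.0, 0.0), (0.9, 0.1)]
fallback = lambda _embedding: "stop"

print(monitor((1.0, 0.05), nominal_set, 0.2, fallback))  # close to nominal data
print(monitor((0.0, 1.0), nominal_set, 0.2, fallback))   # far from nominal data
```

The design point the sketch tries to capture is only the gating: the expensive call sits behind a cheap test, so its latency is paid only on flagged observations.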
-
Latent Spaces Enable Transformer-Based Dose Prediction in Complex Radiotherapy Plans
Authors:
Edward Wang,
Ryan Au,
Pencilla Lang,
Sarah A. Mattonen
Abstract:
Evidence is accumulating in favour of using stereotactic ablative body radiotherapy (SABR) to treat multiple cancer lesions in the lung. Multi-lesion lung SABR plans are complex and require significant resources to create. In this work, we propose a novel two-stage latent transformer framework (LDFormer) for dose prediction of lung SABR plans with varying numbers of lesions. In the first stage, patient anatomical information and the dose distribution are encoded into a latent space. In the second stage, a transformer learns to predict the dose latent from the anatomical latents. Causal attention is modified to adapt to different numbers of lesions. LDFormer outperforms a state-of-the-art generative adversarial network on dose conformality in and around lesions, and the performance gap widens when considering overlapping lesions. LDFormer generates predictions of 3-D dose distributions in under 30s on consumer hardware, and has the potential to assist physicians with clinical decision making, reduce resource costs, and accelerate treatment planning.
Submitted 11 July, 2024;
originally announced July 2024.
-
Masses of Sunyaev-Zel'dovich Galaxy Clusters Detected by The Atacama Cosmology Telescope: Stacked Lensing Measurements with Subaru HSC Year 3 data
Authors:
Masato Shirasaki,
Cristóbal Sifón,
Hironao Miyatake,
Erwin Lau,
Zhuowen Zhang,
Neta Bahcall,
Mark Devlin,
Jo Dunkley,
Arya Farahi,
Matt Hilton,
Yen-Ting Lin,
Daisuke Nagai,
Suzanne T. Staggs,
Tomomi Sunayama,
David Spergel,
Edward J. Wollack
Abstract:
We present a stacked lensing analysis of 96 galaxy clusters selected by the thermal Sunyaev-Zel'dovich (SZ) effect in maps of the cosmic microwave background (CMB). We select foreground galaxy clusters with a $5σ$-level SZ threshold in CMB observations from the Atacama Cosmology Telescope, while we define background source galaxies for the lensing analysis with secure photometric redshift cuts in Year 3 data of the Subaru Hyper Suprime Cam survey. We detect the stacked lensing signal in the range of $0.1 < R\, [h^{-1}\mathrm{Mpc}] < 100$ in each of three cluster redshift bins, $0.092<z\le0.445$, $0.445<z\le0.695$, and $0.695<z\le1.180$, with 32 galaxy clusters in each bin. The cumulative signal-to-noise ratios of the lensing signal are $14.6$, $12.0$, and $6.6$, respectively. Using a halo-based forward model, we then constrain statistical relationships between the mass inferred from the SZ observation (i.e. SZ mass) and the total mass derived from our stacked lensing measurements. At the average SZ mass in the cluster sample ($2.1-2.4\times10^{14}\, h^{-1}M_\odot$), our likelihood analysis shows that the average total mass differs from the SZ counterpart by a factor of $1.3 \pm 0.2$, $1.6 \pm 0.2$, and $1.6 \pm 0.3$ ($68\%$) in the aforementioned redshift ranges, respectively. Our limits are consistent with previous lensing measurements, and we find that the cluster modeling choices can introduce a $1σ$-level difference in our parameter inferences.
Submitted 12 July, 2024; v1 submitted 11 July, 2024;
originally announced July 2024.
-
Correcting Turbulence-induced Errors in Fiber Positioning for the Dark Energy Spectroscopic Instrument
Authors:
E. F. Schlafly,
J. Guy,
K. Honscheid,
S. Kent,
S. E. Koposov,
J. Aguilar,
S. Ahlen,
S. Bailey,
D. Brooks,
T. Claybaugh,
K. Dawson,
P. Doel,
K. Fanning,
D. P. Finkbeiner,
A. Font-Ribera,
J. E. Forero-Romero,
S. Gontcho A Gontcho,
G. Gutierrez,
D. Kirkby,
T. Kisner,
A. Kremin,
J. Lasker,
M. Landriau,
L. Le Guillou,
M. E. Levi
, et al. (15 additional authors not shown)
Abstract:
Highly-multiplexed, robotic, fiber-fed spectroscopic surveys are observing tens of millions of stars and galaxies. For many systems, accurate positioning relies on imaging the fibers in the focal plane and feeding that information back to the robotic positioners to correct their positions. Inhomogeneities and turbulence in the air between the focal plane and the imaging camera can affect the measured positions of fibers, limiting the accuracy with which fibers can be placed on targets. For the Dark Energy Spectroscopic Instrument, we dramatically reduced the effect of turbulence on measurements of positioner locations in the focal plane by taking advantage of stationary positioners and the correlation function of the turbulence. We were able to reduce positioning errors from 7.3 microns to 3.5 microns, speeding the survey by 1.6% under typical conditions.
Submitted 10 July, 2024;
originally announced July 2024.
-
Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs
Authors:
Hao-Tien Lewis Chiang,
Zhuo Xu,
Zipeng Fu,
Mithun George Jacob,
Tingnan Zhang,
Tsang-Wei Edward Lee,
Wenhao Yu,
Connor Schenck,
David Rendleman,
Dhruv Shah,
Fei Xia,
Jasmine Hsu,
Jonathan Hoech,
Pete Florence,
Sean Kirmani,
Sumeet Singh,
Vikas Sindhwani,
Carolina Parada,
Chelsea Finn,
Peng Xu,
Sergey Levine,
Jie Tan
Abstract:
An elusive goal in navigation research is to build an intelligent agent that can understand multimodal instructions, including natural language and images, and perform useful navigation. To achieve this, we study a widely useful category of navigation tasks we call Multimodal Instruction Navigation with demonstration Tours (MINT), in which the environment prior is provided through a previously recorded demonstration video. Recent advances in Vision Language Models (VLMs) have shown a promising path toward achieving this goal, as they demonstrate capabilities in perceiving and reasoning about multimodal inputs. However, VLMs are typically trained to predict textual output, and how best to utilize them in navigation remains an open research question. To solve MINT, we present Mobility VLA, a hierarchical Vision-Language-Action (VLA) navigation policy that combines the environment understanding and common-sense reasoning power of long-context VLMs and a robust low-level navigation policy based on topological graphs. The high-level policy consists of a long-context VLM that takes the demonstration tour video and the multimodal user instruction as input to find the goal frame in the tour video. Next, a low-level policy uses the goal frame and an offline-constructed topological graph to generate robot actions at every timestep. We evaluated Mobility VLA in an 836 m^2 real-world environment and show that it achieves high end-to-end success rates on previously unsolved multimodal instructions such as "Where should I return this?" while holding a plastic bin. A video demonstrating Mobility VLA can be found here: https://youtu.be/-Tof__Q8_5s
Submitted 12 July, 2024; v1 submitted 10 July, 2024;
originally announced July 2024.
-
TOI 762 A b and TIC 46432937 b: Two Giant Planets Transiting M Dwarf Stars
Authors:
Joel D. Hartman,
Daniel Bayliss,
Rafael Brahm,
Edward M. Bryant,
Andrés Jordán,
Gáspár Á. Bakos,
Melissa J. Hobson,
Elyar Sedaghati,
Xavier Bonfils,
Marion Cointepas,
Jose Manuel Almenara,
Khalid Barkaoui,
Mathilde Timmermans,
George Dransfield,
Elsa Ducrot,
Sebastián Zúñiga-Fernández,
Matthew J. Hooton,
Peter Pihlmann Pedersen,
Francisco J. Pozuelos,
Amaury H. M. J. Triaud,
Michaël Gillon,
Emmanuel Jehin,
William C. Waalkes,
Zachory K. Berta-Thompson,
Steve B. Howell
, et al. (11 additional authors not shown)
Abstract:
We present the discovery of TOI 762 A b and TIC 46432937 b, two giant planets transiting M dwarf stars. Transits of both systems were first detected from observations by the NASA TESS mission, and the transiting objects are confirmed as planets through high-precision radial velocity (RV) observations carried out with VLT/ESPRESSO. TOI 762 A b is a warm sub-Saturn with a mass of 0.251 +- 0.042 M_J, a radius of 0.744 +- 0.017 R_J, and an orbital period of 3.4717 d. It transits a mid-M dwarf star with a mass of 0.442 +- 0.025 M_S and a radius of 0.4250 +- 0.0091 R_S. The star TOI 762 A has a resolved binary star companion TOI 762 B that is separated from TOI 762 A by 3.2" (~ 319 AU) and has an estimated mass of 0.227 +- 0.010 M_S. The planet TIC 46432937 b is a warm Super-Jupiter with a mass of 3.20 +- 0.11 M_J and radius of 1.188 +- 0.030 R_J. The planet's orbital period is P = 1.4404 d, and it undergoes grazing transits of its early M dwarf host star, which has a mass of 0.563 +- 0.029 M_S and a radius of 0.5299 +- 0.0091 R_S. TIC 46432937 b is one of the highest mass planets found to date transiting an M dwarf star. TIC 46432937 b is also a promising target for atmospheric observations, having the highest Transmission Spectroscopy Metric or Emission Spectroscopy Metric value of any known warm Super-Jupiter (mass greater than 3.0 M_J, equilibrium temperature below 1000 K).
Submitted 9 July, 2024;
originally announced July 2024.
-
High-Dimensional Distributed Sparse Classification with Scalable Communication-Efficient Global Updates
Authors:
Fred Lu,
Ryan R. Curtin,
Edward Raff,
Francis Ferraro,
James Holt
Abstract:
As the size of datasets used in statistical learning continues to grow, distributed training of models has attracted increasing attention. These methods partition the data and exploit parallelism to reduce memory and runtime, but suffer increasingly from communication costs as the data size or the number of iterations grows. Recent work on linear models has shown that a surrogate likelihood can be optimized locally to iteratively improve on an initial solution in a communication-efficient manner. However, existing versions of these methods suffer from multiple shortcomings as the data size becomes massive, including diverging updates and difficulty handling sparsity efficiently. In this work we develop solutions to these problems which enable us to learn a communication-efficient distributed logistic regression model even beyond millions of features. In our experiments we demonstrate a large improvement in accuracy over distributed algorithms with only a few distributed update steps needed, and similar or faster runtimes. Our code is available at https://github.com/FutureComputing4AI/ProxCSL.
Submitted 8 July, 2024;
originally announced July 2024.
-
The Impact of an XAI-Augmented Approach on Binary Classification with Scarce Data
Authors:
Ximing Wen,
Rosina O. Weber,
Anik Sen,
Darryl Hannan,
Steven C. Nesbit,
Vincent Chan,
Alberto Goffi,
Michael Morris,
John C. Hunninghake,
Nicholas E. Villalobos,
Edward Kim,
Christopher J. MacLellan
Abstract:
Point-of-Care Ultrasound (POCUS) is the practice of clinicians conducting and interpreting ultrasound scans right at the patient's bedside. However, the expertise needed to interpret these images is considerable and may not always be present in emergency situations. This reality makes algorithms such as machine learning classifiers extremely valuable to augment human decisions. POCUS devices are becoming available at a reasonable cost and in the size of a mobile phone. The challenge of turning POCUS devices into life-saving tools is that interpretation of ultrasound images requires specialist training and experience. Unfortunately, the difficulty of obtaining positive training images is an important obstacle to building efficient and accurate classifiers. Hence, we investigate strategies to increase the accuracy of classifiers trained with scarce data. We hypothesize that training with a few data instances may not suffice for classifiers to generalize, causing them to overfit. We use an Explainable AI-augmented approach to help the algorithm learn more from less and potentially help the classifier better generalize.
Submitted 1 July, 2024;
originally announced July 2024.
-
LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts
Authors:
Yijia Xiao,
Edward Sun,
Tianyu Liu,
Wei Wang
Abstract:
We propose LogicVista, an evaluation benchmark that assesses the integrated logical reasoning capabilities of multimodal large language models (MLLMs) in visual contexts. Recent advancements in MLLMs have demonstrated various fascinating abilities, from crafting poetry based on an image to performing mathematical reasoning. However, there is still a lack of systematic evaluation of MLLMs' proficiency in logical reasoning tasks, which are essential for activities like navigation and puzzle-solving. Thus we evaluate general logical cognition abilities across 5 logical reasoning tasks encompassing 9 different capabilities, using a sample of 448 multiple-choice questions. Each question is annotated with the correct answer and the human-written reasoning behind the selection, enabling both open-ended and multiple-choice evaluation. A total of 8 MLLMs are comprehensively evaluated using LogicVista. Code and data are available at https://github.com/Yijia-Xiao/LogicVista.
Submitted 6 July, 2024;
originally announced July 2024.
-
Cosmological constraints from the cross-correlation of DESI Luminous Red Galaxies with CMB lensing from Planck PR4 and ACT DR6
Authors:
Noah Sailer,
Joshua Kim,
Simone Ferraro,
Mathew S. Madhavacheril,
Martin White,
Irene Abril-Cabezas,
Jessica Nicole Aguilar,
Steven Ahlen,
J. Richard Bond,
David Brooks,
Etienne Burtin,
Erminia Calabrese,
Shi-Fan Chen,
Steve K. Choi,
Todd Claybaugh,
Kyle Dawson,
Axel de la Macorra,
Joseph DeRose,
Arjun Dey,
Biprateep Dey,
Peter Doel,
Jo Dunkley,
Carmen Embil-Villagra,
Gerrit S. Farren,
Andreu Font-Ribera
, et al. (41 additional authors not shown)
Abstract:
We infer the growth of large scale structure over the redshift range $0.4\lesssim z \lesssim 1$ from the cross-correlation of spectroscopically calibrated Luminous Red Galaxies (LRGs) selected from the Dark Energy Spectroscopic Instrument (DESI) legacy imaging survey with CMB lensing maps reconstructed from the latest Planck and ACT data. We adopt a hybrid effective field theory (HEFT) model that robustly regulates the cosmological information obtainable from smaller scales, such that our cosmological constraints are reliably derived from the (predominantly) linear regime. We perform an extensive set of bandpower- and parameter-level systematics checks to ensure the robustness of our results and to characterize the uniformity of the LRG sample. We demonstrate that our results are stable to a wide range of modeling assumptions, finding excellent agreement with a linear theory analysis performed on a restricted range of scales. From a tomographic analysis of the four LRG photometric redshift bins we find that the rate of structure growth is consistent with $Λ$CDM with an overall amplitude that is $\simeq5-7\%$ lower than predicted by primary CMB measurements with modest $(\sim2σ)$ statistical significance. From the combined analysis of all four bins and their cross-correlations with Planck we obtain $S_8 = 0.765\pm0.023$, which is less discrepant with primary CMB measurements than previous DESI LRG cross Planck CMB lensing results. From the cross-correlation with ACT we obtain $S_8 = 0.790^{+0.024}_{-0.027}$, while when jointly analyzing Planck and ACT we find $S_8 = 0.775^{+0.019}_{-0.022}$ from our data alone and $σ_8 = 0.772^{+0.020}_{-0.023}$ with the addition of BAO data. These constraints are consistent with the latest Planck primary CMB analyses at the $\simeq 1.6-2.2σ$ level, and are in excellent agreement with galaxy lensing surveys.
Submitted 5 July, 2024;
originally announced July 2024.
-
The Atacama Cosmology Telescope DR6 and DESI: Structure formation over cosmic time with a measurement of the cross-correlation of CMB Lensing and Luminous Red Galaxies
Authors:
Joshua Kim,
Noah Sailer,
Mathew S. Madhavacheril,
Simone Ferraro,
Irene Abril-Cabezas,
Jessica Nicole Aguilar,
Steven Ahlen,
J. Richard Bond,
David Brooks,
Etienne Burtin,
Erminia Calabrese,
Shi-Fan Chen,
Steve K. Choi,
Todd Claybaugh,
Omar Darwish,
Axel de la Macorra,
Joseph DeRose,
Mark Devlin,
Arjun Dey,
Peter Doel,
Jo Dunkley,
Carmen Embil-Villagra,
Gerrit S. Farren,
Andreu Font-Ribera,
Jaime E. Forero-Romero
, et al. (48 additional authors not shown)
Abstract:
We present a high-significance cross-correlation of CMB lensing maps from the Atacama Cosmology Telescope (ACT) Data Release 6 (DR6) with spectroscopically calibrated luminous red galaxies (LRGs) from the Dark Energy Spectroscopic Instrument (DESI). We detect this cross-correlation at a significance of 38$σ$; combining our measurement with the Planck Public Release 4 (PR4) lensing map, we detect the cross-correlation at 50$σ$. Fitting this jointly with the galaxy auto-correlation power spectrum to break the galaxy bias degeneracy with $σ_8$, we perform a tomographic analysis in four LRG redshift bins spanning $0.4 \le z \le 1.0$ to constrain the amplitude of matter density fluctuations through the parameter combination $S_8^\times = σ_8 \left(Ω_m / 0.3\right)^{0.4}$. Prior to unblinding, we confirm with extragalactic simulations that foreground biases are negligible and carry out a comprehensive suite of null and consistency tests. Using a hybrid effective field theory (HEFT) model that allows scales as small as $k_{\rm max}=0.6$ $h/{\rm Mpc}$, we obtain a 3.3% constraint on $S_8^\times = σ_8 \left(Ω_m / 0.3\right)^{0.4} = 0.792^{+0.024}_{-0.028}$ from ACT data, as well as constraints on $S_8^\times(z)$ that probe structure formation over cosmic time. Our result is consistent with the early-universe extrapolation from primary CMB anisotropies measured by Planck PR4 within 1.2$σ$. Jointly fitting ACT and Planck lensing cross-correlations we obtain a 2.7% constraint of $S_8^\times = 0.776^{+0.019}_{-0.021}$, which is consistent with the Planck early-universe extrapolation within 2.1$σ$, with the lowest redshift bin showing the largest difference in mean. The latter may motivate further CMB lensing tomography analyses at $z<0.6$ to assess the impact of potential systematics or the consistency of the $Λ$CDM model over cosmic time.
Submitted 5 July, 2024;
originally announced July 2024.
-
AGN STORM 2: VIII. Investigating the Narrow Absorption Lines in Mrk 817 Using HST-COS Observations
Authors:
Maryam Dehghanian,
Nahum Arav,
Gerard A. Kriss,
Missagh Mehdipour,
Doyee Byun,
Gwen Walker,
Mayank Sharma,
Aaron J. Barth,
Misty C. Bentz,
Benjamin D. Boizelle,
Michael S. Brotherton,
Edward M. Cackett,
Elena Dalla Bonta,
Gisella De Rosa,
Gary J. Ferland,
Carina Fian,
Alexei V. Filippenko,
Jonathan Gelbord,
Michael R. Goad,
Keith Horne,
Yasaman Homayouni,
Dragana Ilic,
Michael D. Joner,
Erin A. Kara,
Shai Kaspi
, et al. (17 additional authors not shown)
Abstract:
We observed the Seyfert 1 galaxy Mrk 817 during an intensive multi-wavelength reverberation mapping campaign for 16 months. Here, we examine the behavior of narrow UV absorption lines seen in HST/COS spectra, both during the campaign and in other epochs extending over 14 years. We conclude that while the narrow absorption outflow system (at -3750 km/s with FWHM=177 km/s) responds to the variations of the UV continuum as modified by the X-ray obscurer, its total column density (logNH =19.5 cm-2) did not change across all epochs. The adjusted ionization parameter (scaled with respect to the variations in the Hydrogen ionizing continuum flux) is log UH =-1.0. The outflow is located at a distance smaller than 38 parsecs from the central source, which implies a hydrogen density of nH > 3000 cm-3. The absorption outflow system only covers the continuum emission source and not the broad emission line region, which suggests that its transverse size is small (< 1e16 cm), with potential cloud geometries ranging from spherical to elongated along the line of sight.
Submitted 8 July, 2024; v1 submitted 4 July, 2024;
originally announced July 2024.
-
Online Bayesian changepoint detection for network Poisson processes with community structure
Authors:
Joshua Corneck,
Edward A. K. Cohen,
James S. Martin,
Francesco Sanna Passino
Abstract:
Network point processes often exhibit latent structure that governs the behaviour of the sub-processes. It is not always reasonable to assume that this latent structure is static, and detecting when and how this driving structure changes is often of interest. In this paper, we introduce a novel online methodology for detecting changes within the latent structure of a network point process. We focus on block-homogeneous Poisson processes, where latent node memberships determine the rates of the edge processes. We propose a scalable variational procedure which can be applied on large networks in an online fashion via a Bayesian forgetting factor applied to sequential variational approximations to the posterior distribution. The proposed framework is tested on simulated and real-world data, and it rapidly and accurately detects changes to the latent edge process rates, and to the latent node group memberships, both in an online manner. In particular, in an application on the Santander Cycles bike-sharing network in central London, we detect changes within the network related to holiday periods and lockdown restrictions between 2019 and 2020.
Submitted 4 July, 2024;
originally announced July 2024.
-
Population Size Estimation with Many Lists and Heterogeneity: A Conditional Log-Linear Model Among the Unobserved
Authors:
Mateo Dulce Rubio,
Edward Kennedy
Abstract:
We contribute a general and flexible framework to estimate the size of a closed population in the presence of $K$ capture-recapture lists and heterogeneous capture probabilities. Our novel identifying strategy leverages the fact that it is sufficient for identification that a subset of the $K$ lists are not arbitrarily dependent \textit{within the subset of the population unobserved by the remaining lists}, conditional on covariates. This identification approach is interpretable and actionable, interpolating between the two predominant approaches in the literature as special cases: (conditional) independence across lists and log-linear models with no highest-order interaction. We derive nonparametric doubly-robust estimators for the resulting identification expression that are nearly optimal and approximately normal for any finite sample size, even when the heterogeneous capture probabilities are estimated nonparametrically using machine learning methods. Additionally, we devise a sensitivity analysis to show how deviations from the identification assumptions affect the resulting population size estimates, allowing for the integration of domain-specific knowledge into the identification and estimation processes more transparently. We empirically demonstrate the advantages of our method using both synthetic data and real data from the Peruvian internal armed conflict to estimate the number of casualties. The proposed methodology addresses recent critiques of capture-recapture models by allowing for a weaker and more interpretable identifying assumption and accommodating complex heterogeneous capture probabilities depending on high-dimensional or continuous covariates.
Submitted 3 July, 2024;
originally announced July 2024.
-
Accelerated Inference for Partially Observed Markov Processes using Automatic Differentiation
Authors:
Kevin Tan,
Giles Hooker,
Edward L. Ionides
Abstract:
Automatic differentiation (AD) has driven recent advances in machine learning, including deep neural networks and Hamiltonian Markov Chain Monte Carlo methods. Partially observed nonlinear stochastic dynamical systems have proved resistant to AD techniques because widely used particle filter algorithms yield an estimated likelihood function that is discontinuous as a function of the model parameters. We show how to embed two existing AD particle filter methods in a theoretical framework that provides an extension to a new class of algorithms. This new class permits a bias/variance tradeoff and hence a mean squared error substantially lower than the existing algorithms. We develop likelihood maximization algorithms suited to the Monte Carlo properties of the AD gradient estimate. Our algorithms require only a differentiable simulator for the latent dynamic system; by contrast, most previous approaches to AD likelihood maximization for particle filters require access to the system's transition probabilities. Numerical results indicate that a hybrid algorithm that uses AD to refine a coarse solution from an iterated filtering algorithm shows substantial improvement on current state-of-the-art methods for a challenging scientific benchmark problem.
Submitted 3 July, 2024;
originally announced July 2024.
-
Distinguishing Carrier Transport and Interfacial Recombination at Perovskite-Transport Layer Interfaces Using Ultrafast Spectroscopy and Numerical Simulation
Authors:
Edward Butler-Caddle,
K. D. G. Imalka Jayawardena,
Anjana Wijesakara,
Rebecca L Milot,
James Lloyd-Hughes
Abstract:
In perovskite solar cells, photovoltaic action is created by charge transport layers (CTLs) either side of the light-absorbing metal halide perovskite semiconductor. Hence, the rates for desirable charge extraction and unwanted interfacial recombination at the perovskite-CTL interfaces play a critical role for device efficiency. Here, the electrical properties of perovskite-CTL bilayer heterostructures are obtained using ultrafast THz and optical studies of the charge carrier dynamics after pulsed photoexcitation, combined with a physical model of charge carrier transport that includes the prominent Coulombic forces that arise after selective charge extraction into a CTL, and cross-interfacial recombination. The charge extraction velocity at the interface and the ambipolar diffusion coefficient within the perovskite are determined from the experimental decay profiles for heterostructures with three of the highest performing CTLs, namely C$_{60}$, PCBM and Spiro-OMeTAD. Definitive targets for the further improvement of devices are deduced: fullerenes deliver fast electron extraction, but suffer from a large rate constant for cross-interface recombination or hole extraction. Conversely, Spiro-OMeTAD exhibits slow hole extraction but does not increase the perovskite's surface recombination rate, likely contributing to its success in solar cell devices.
Submitted 3 July, 2024;
originally announced July 2024.
-
STRIDE: Simple Type Recognition In Decompiled Executables
Authors:
Harrison Green,
Edward J. Schwartz,
Claire Le Goues,
Bogdan Vasilescu
Abstract:
Decompilers are widely used by security researchers and developers to reverse engineer executable code. While modern decompilers are adept at recovering instructions, control flow, and function boundaries, some useful information from the original source code, such as variable types and names, is lost during the compilation process. Our work aims to predict these variable types and names from the remaining information.
We propose STRIDE, a lightweight technique that predicts variable names and types by matching sequences of decompiler tokens to those found in training data. We evaluate it on three benchmark datasets and find that STRIDE achieves comparable performance to state-of-the-art machine learning models for both variable retyping and renaming while being much simpler and faster. We perform a detailed comparison with two recent SOTA transformer-based models in order to understand the specific factors that make our technique effective. We implemented STRIDE in fewer than 1000 lines of Python and have open-sourced it under a permissive license at https://github.com/hgarrereyn/STRIDE.
Submitted 2 July, 2024;
originally announced July 2024.
-
Searching for a dark matter induced galactic axion gradient
Authors:
Edward Hardy,
Mario Reig,
Juri Smirnov
Abstract:
An ultra-light axion with CP violating interactions with a dark sector and CP preserving interactions with the visible sector can act as a novel portal between dark matter and the Standard Model. In such theories, dark matter sources an axion field extending over the entire galaxy, the gradient of which can be searched for with precise spin precession experiments. A reinterpretation of existing co-magnetometer data already constrains theories that are consistent with astrophysical bounds, and near-future experiments will begin probing well-motivated models. The required interactions can arise from a confining hidden sector without necessitating fine-tuning of the axion's mass.
Submitted 2 July, 2024;
originally announced July 2024.
-
Effect of Burn Parameters on PAH Emissions at Conditions Relevant for Prescribed Fires
Authors:
Karl Töpperwien,
Guillaume Vignat,
Alexandra J. Feinberg,
Conner Daube,
Mitchell W. Alton,
Edward C. Fortner,
Manjula R. Canagaratna,
Matthias F. Kling,
Mary Johnson,
Kari Nadeau,
Scott Herndon,
John T. Jayne,
Matthias Ihme
Abstract:
Wildfire smoke is a health hazard as it contains a mixture of carcinogenic volatile compounds and fine particulate matter. In particular, exposure to polycyclic aromatic hydrocarbons (PAHs) is a major concern, since these compounds have been recognized as important contributors to the overall carcinogenic risk of smoke exposure. In this work, gas and particle-phase PAH emissions from the combustion of Eastern White Pine (Pinus strobus) were quantified using time-of-flight mass spectrometry over a range of burn conditions representative of wildfires and prescribed fires. These experiments allow for controlling conditions of fuel moisture, heat flux, and oxygen concentration to understand their impact on PAH emissions. We find that optimal conditions for fuel moisture content of 20 - 30%, heat load onto the sample of 60 - 70 kW/m$^2$, and oxygen concentrations of the burn environment of 5 - 15% can reduce the emissions of the heavy molar weight PAHs by up to 77%. Our analysis shows that the relative carcinogenic risk can be reduced by more than 50% under optimal conditions, offering a way to reduce emission exposure from forest treatment activities.
Submitted 2 July, 2024;
originally announced July 2024.
-
Tokenize the World into Object-level Knowledge to Address Long-tail Events in Autonomous Driving
Authors:
Ran Tian,
Boyi Li,
Xinshuo Weng,
Yuxiao Chen,
Edward Schmerling,
Yue Wang,
Boris Ivanovic,
Marco Pavone
Abstract:
The autonomous driving industry is increasingly adopting end-to-end learning from sensory inputs to minimize human biases in system design. Traditional end-to-end driving models, however, suffer from long-tail events due to rare or unseen inputs within their training distributions. To address this, we propose TOKEN, a novel Multi-Modal Large Language Model (MM-LLM) that tokenizes the world into object-level knowledge, enabling better utilization of LLM's reasoning capabilities to enhance autonomous vehicle planning in long-tail scenarios. TOKEN effectively alleviates data scarcity and inefficient tokenization by leveraging a traditional end-to-end driving model to produce condensed and semantically enriched representations of the scene, which are optimized for LLM planning compatibility through deliberate representation and reasoning alignment training stages. Our results demonstrate that TOKEN excels in grounding, reasoning, and planning capabilities, outperforming existing frameworks with a 27% reduction in trajectory L2 error and a 39% decrease in collision rates in long-tail scenarios. Additionally, our work highlights the importance of representation alignment and structured reasoning in sparking the common-sense reasoning capabilities of MM-LLMs for effective planning.
Submitted 1 July, 2024;
originally announced July 2024.
-
Cantor Set Structure of the Weak Stability Boundary for Infinitely Many Cycles in the Restricted Three-Body Problem
Authors:
Edward Belbruno
Abstract:
The geometry of the weak stability boundary region for the planar restricted three-body problem about the secondary mass point has been an open problem. Previous studies have conjectured that it may have a fractal structure. In this paper, this region is studied for infinitely many cycles about the secondary mass point, instead of a finite number studied previously. It is shown that in this case the boundary consists of a family of infinitely many Cantor sets and is thus fractal in nature. It is also shown that on two-dimensional surfaces of section, it is the boundary of a region only having bounded cycling motion for infinitely many cycles, while the complement of this region generally has unbounded motion. It is shown that this shares many properties of a Mandelbrot set. Its relationship to the non-existence of KAM tori is described, among many other properties. Applications are discussed.
Submitted 30 June, 2024;
originally announced July 2024.
-
Test Case Features as Hyper-heuristics for Inductive Programming
Authors:
Edward McDaid,
Sarah McDaid
Abstract:
Instruction subsets are heuristics that can reduce the size of the inductive programming search space by tens of orders of magnitude. Comprising many overlapping subsets of different sizes, they serve as predictions of the instructions required to code a solution for any problem. Currently, this approach employs a single, large family of subsets, meaning that some problems can search thousands of subsets before a solution is found. In this paper we introduce the use of test case type signatures as hyper-heuristics to select one of many, smaller families of instruction subsets. The type signature for any set of test cases maps directly to a single family and smaller families mean that fewer subsets need to be considered for most problems. Having many families also permits subsets to be reordered to better reflect their relative occurrence in human code - again reducing the search space size for many problems. Overall the new approach can further reduce the size of the inductive programming search space by between 1 and 3 orders of magnitude, depending on the type signature. Larger and more consistent reductions are possible through the use of more sophisticated type systems. The potential use of additional test case features as hyper-heuristics and some other possible future work is also briefly discussed.
Submitted 29 June, 2024;
originally announced July 2024.
-
Analyzing Quality, Bias, and Performance in Text-to-Image Generative Models
Authors:
Nila Masrourisaadat,
Nazanin Sedaghatkish,
Fatemeh Sarshartehrani,
Edward A. Fox
Abstract:
Advances in generative models have led to significant interest in image synthesis, demonstrating the ability to generate high-quality images for a diverse range of text prompts. Despite this progress, most studies ignore the presence of bias. In this paper, we examine several text-to-image models not only by qualitatively assessing their performance in generating accurate images of human faces, groups, and specified numbers of objects but also by presenting a social bias analysis. As expected, models with larger capacity generate higher-quality images. However, we also document the inherent gender or social biases these models possess, offering a more complete understanding of their impact and limitations.
Submitted 28 June, 2024;
originally announced July 2024.
-
BMW Agents -- A Framework For Task Automation Through Multi-Agent Collaboration
Authors:
Noel Crawford,
Edward B. Duffy,
Iman Evazzade,
Torsten Foehr,
Gregory Robbins,
Debbrata Kumar Saha,
Jiya Varma,
Marcin Ziolkowski
Abstract:
Autonomous agents driven by Large Language Models (LLMs) offer enormous potential for automation. Early proof of this technology can be found in various demonstrations of agents solving complex tasks, interacting with external systems to augment their knowledge, and triggering actions. In particular, workflows involving multiple agents solving complex tasks in a collaborative fashion exemplify their capacity to operate in less strict and less well-defined environments. Thus, a multi-agent approach has great potential for serving as a backbone in many industrial applications, ranging from complex knowledge retrieval systems to next generation robotic process automation. Given the reasoning abilities within the current generation of LLMs, complex processes require a multi-step approach that includes a plan of well-defined and modular tasks. Depending on the level of complexity, these tasks can be executed either by a single agent or a group of agents. In this work, we focus on designing a flexible agent engineering framework with careful attention to planning and execution, capable of handling complex use case applications across various domains. The proposed framework provides reliability in industrial applications and presents techniques to ensure a scalable, flexible, and collaborative workflow for multiple autonomous agents working together towards solving tasks.
Submitted 2 July, 2024; v1 submitted 28 June, 2024;
originally announced June 2024.
-
An Understanding of Principal Differential Analysis
Authors:
Edward Gunning,
Giles Hooker
Abstract:
In functional data analysis, replicate observations of a smooth functional process and its derivatives offer a unique opportunity to flexibly estimate continuous-time ordinary differential equation models. Ramsay (1996) first proposed to estimate a linear ordinary differential equation from functional data in a technique called Principal Differential Analysis, by formulating a functional regression in which the highest-order derivative of a function is modelled as a time-varying linear combination of its lower-order derivatives. Principal Differential Analysis was introduced as a technique for data reduction and representation, using solutions of the estimated differential equation as a basis to represent the functional data. In this work, we re-formulate PDA as a generative statistical model in which functional observations arise as solutions of a deterministic ODE that is forced by a smooth random error process. This viewpoint defines a flexible class of functional models based on differential equations and leads to an improved understanding and characterisation of the sources of variability in Principal Differential Analysis. It does, however, result in parameter estimates that can be heavily biased under the standard estimation approach of PDA. Therefore, we introduce an iterative bias-reduction algorithm that can be applied to improve parameter estimates. We also examine the utility of our approach when the form of the deterministic part of the differential equation is unknown and possibly non-linear, where Principal Differential Analysis is treated as an approximate model based on time-varying linearisation. We demonstrate our approach on simulated data from linear and non-linear differential equations and on real data from human movement biomechanics. Supplementary R code for this manuscript is available at https://github.com/edwardgunning/UnderstandingOfPDAManuscript.
Submitted 26 June, 2024;
originally announced June 2024.
-
Single atom chemical identification of TMD defects in ambient conditions
Authors:
Edward Juan Dunn,
Robert James Young,
Samuel Paul Jarvis
Abstract:
The presence of defects in transition metal dichalcogenides (TMDs) can lead to dramatic local changes in their properties which are of interest for a range of technologies including quantum security devices, hydrogen production, and energy storage. It is therefore essential to be able to study these materials in their native environments, including ambient conditions. Here we report single atom resolution imaging of atomic defects in MoS2, WSe2 and WS2 monolayers carried out in ambient conditions using conductive atomic force microscopy (C-AFM). By comparing measurements from a range of TMDs we use C-AFM to chemically identify the most likely atomic species for the defects observed and quantify their prevalence on each material, identifying oxygen chalcogen substitutions and transition metal substitutions as the most likely, and most common, defect types. Moreover, we demonstrate that C-AFM operated in ambient environments can resolve subtle changes in electronic structure with atomic resolution, which we apply to WSe2 monolayers doped using a nitrogen plasma, demonstrating the capability of C-AFM to resolve electronic, and chemical-specific, details at the atomic scale.
Submitted 26 June, 2024;
originally announced June 2024.
-
The hyperplane of early-type galaxies: using stellar population properties to increase the precision and accuracy of the fundamental plane as a distance indicator
Authors:
Francesco D'Eugenio,
Matthew Colless,
Arjen van der Wel,
Sam P. Vaughan,
Khaled Said,
Jesse van de Sande,
Joss Bland-Hawthorn,
Julia J. Bryant,
Scott M. Croom,
Angel R. Lopez-Sanchez,
Nuria P. F. Lorente,
Roberto Maiolino,
Edward N. Taylor
Abstract:
We use deep spectroscopy from the SAMI Galaxy Survey to explore the precision of the fundamental plane of early-type galaxies (FP) as a distance indicator for future single-fibre spectroscopy surveys. We study the optimal trade-off between sample size and signal-to-noise ratio (SNR), and investigate which additional observables can be used to construct hyperplanes with smaller intrinsic scatter than the FP. We add increasing levels of random noise (parametrised as effective exposure time) to the SAMI spectra to study the effect of increasing measurement uncertainties on the FP- and hyperplane-inferred distances. We find that, using direct-fit methods, the values of the FP and hyperplane best-fit coefficients depend on the spectral SNR, and reach asymptotic values for a mean SNR=40 Å$^{-1}$. As additional variables for the FP we consider three stellar-population observables: light-weighted age, stellar mass-to-light ratio and a novel combination of Lick indices (I$_{\rm age}$). For a SNR=45 Å$^{-1}$ (equivalent to 1-hour exposure on a 4-m telescope), all three hyperplanes outperform the FP as distance indicators. Being an empirical spectral index, I$_{\rm age}$ avoids the model-dependent uncertainties and bias underlying age and mass-to-light ratio measurements, yet yields a 10 per cent reduction of the median distance uncertainty compared to the FP. We also find that, as a by-product, the I$_{\rm age}$ hyperplane removes most of the reported environment bias of the FP. After accounting for the different signal-to-noise ratio, these conclusions also apply to a 50 times larger sample from SDSS-III. However, in this case, only age removes the environment bias.
Submitted 25 June, 2024;
originally announced June 2024.
-
Leveraging Reinforcement Learning in Red Teaming for Advanced Ransomware Attack Simulations
Authors:
Cheng Wang,
Christopher Redino,
Ryan Clark,
Abdul Rahman,
Sal Aguinaga,
Sathvik Murli,
Dhruv Nandakumar,
Roland Rao,
Lanxiao Huang,
Daniel Radke,
Edward Bowen
Abstract:
Ransomware presents a significant and increasing threat to individuals and organizations by encrypting their systems and not releasing them until a large fee has been extracted. To bolster preparedness against potential attacks, organizations commonly conduct red teaming exercises, which involve simulated attacks to assess existing security measures. This paper proposes a novel approach utilizing reinforcement learning (RL) to simulate ransomware attacks. By training an RL agent in a simulated environment mirroring real-world networks, effective attack strategies can be learned quickly, significantly streamlining traditional, manual penetration testing processes. The attack pathways revealed by the RL agent can provide valuable insights to the defense team, helping them identify network weak points and develop more resilient defensive measures. Experimental results on a 152-host example network confirm the effectiveness of the proposed approach, demonstrating the RL agent's capability to discover and orchestrate attacks on high-value targets while evading honeyfiles (decoy files strategically placed to detect unauthorized access).
Submitted 25 June, 2024;
originally announced June 2024.
-
Smart Casual Verification of CCF's Distributed Consensus and Consistency Protocols
Authors:
Heidi Howard,
Markus A. Kuppe,
Edward Ashton,
Amaury Chamayou,
Natacha Crooks
Abstract:
The Confidential Consortium Framework (CCF) is an open-source platform for developing trustworthy and reliable cloud applications. CCF powers Microsoft's Azure Confidential Ledger service and as such it is vital to build confidence in the correctness of CCF's design and implementation. This paper reports our experiences applying smart casual verification to validate the correctness of CCF's novel distributed protocols, focusing on its unique distributed consensus protocol and its custom client consistency model. We use the term smart casual verification to describe our hybrid approach, which combines the rigor of formal specification and model checking with the pragmatism of automated testing, in our case binding the formal specification in TLA+ to the C++ implementation. While traditional formal methods approaches require substantial buy-in and are often one-off efforts by domain experts, we have integrated our smart casual verification approach into CCF's continuous integration pipeline, allowing contributors to continuously validate CCF as it evolves. We describe the challenges we faced in applying smart casual verification to a complex existing codebase and how we overcame them to find subtle bugs in the design and implementation before they could impact production.
Submitted 25 June, 2024;
originally announced June 2024.
-
Detecting Frames in News Headlines and Lead Images in U.S. Gun Violence Coverage
Authors:
Isidora Chara Tourni,
Lei Guo,
Hengchang Hu,
Edward Halim,
Prakash Ishwar,
Taufiq Daryanto,
Mona Jalal,
Boqi Chen,
Margrit Betke,
Fabian Zhafransyah,
Sha Lai,
Derry Tanti Wijaya
Abstract:
News media structure their reporting of events or issues using certain perspectives.
When describing an incident involving gun violence, for example, some journalists may focus on mental health or gun regulation, while others may emphasize the discussion of gun rights. Such perspectives are called "frames" in communication research. We study, for the first time, the value of combining lead images and their contextual information with text to identify the frame of a given news article. We observe that using multiple modes of information (article- and image-derived features) improves prediction of news frames over any single mode of information when the images are relevant to the frames of the headlines. We also observe that frame image relevance is related to the ease of conveying frames via images, which we call frame concreteness. Additionally, we release the first multimodal news framing dataset related to gun violence in the U.S., curated and annotated by communication researchers. The dataset will allow researchers to further examine the use of multiple information modalities for studying media framing.
Submitted 24 June, 2024;
originally announced June 2024.
-
Upgrading the Submillimeter Array: wSMA and beyond
Authors:
Paul K. Grimes,
Garrett K. Keating,
Raymond Blundell,
Robert D. Christensen,
Mark Gurwell,
Attila Kovacs,
Timothy Norton,
Scott N. Paine,
Ramprasad Rao,
Edward C. -Y. Tong,
Jonathan Weintroub,
David Wilner,
Robert W. Wilson,
Lingzhen Zeng,
Qizhou Zhang
Abstract:
The Submillimeter Array (SMA) is an array of 8 antennas operating at millimeter and submillimeter wavelengths on Maunakea, Hawaii, operated by the Smithsonian Astrophysical Observatory and Academia Sinica Institute of Astronomy and Astrophysics, Taiwan. Over the past several years, we have been preparing a major upgrade to the SMA that will replace the aging original receiver cryostats and receiver cartridges with all new cryostats and new 230 and 345 GHz receiver designs. This wideband upgrade (wSMA) will also include significantly increased instantaneous bandwidth, improved sensitivity, and greater capabilities for dual frequency observations. In this paper, we will describe the wSMA receiver upgrade and status, as well as the future upgrades that will be enabled by the deployment of the wSMA receivers.
Submitted 24 June, 2024;
originally announced June 2024.
-
AGN STORM 2: IX. Studying the Dynamics of the Ionized Obscurer in Mrk 817 with High-resolution X-ray Spectroscopy
Authors:
Fatima Zaidouni,
Erin Kara,
Peter Kosec,
Missagh Mehdipour,
Daniele Rogantini,
Gerard A. Kriss,
Ehud Behar,
Jelle Kaastra,
Aaron J. Barth,
Edward M. Cackett,
Gisella De Rosa,
Yasaman Homayouni,
Keith Horne,
Hermine Landt,
Nahum Arav,
Misty C. Bentz,
Michael S. Brotherton,
Elena Dalla Bontà,
Maryam Dehghanian,
Gary J. Ferland,
Carina Fian,
Jonathan Gelbord,
Michael R. Goad,
Diego H. González Buitrago,
Catherine J. Grier
, et al. (23 additional authors not shown)
Abstract:
We present the results of the XMM-Newton and NuSTAR observations taken as part of the ongoing, intensive multi-wavelength monitoring program of the Seyfert 1 galaxy Mrk 817 by the AGN Space Telescope and Optical Reverberation Mapping 2 (AGN STORM 2) Project. The campaign revealed an unexpected and transient obscuring outflow, never before seen in this source. Of our four XMM-Newton/NuSTAR epochs, one fortuitously taken during a bright X-ray state has strong narrow absorption lines in the high-resolution grating spectra. From these absorption features, we determine that the obscurer is in fact a multi-phase ionized wind with an outflow velocity of $\sim$5200 km s$^{-1}$, and for the first time find evidence for a lower ionization component with the same velocity observed in absorption features in the contemporaneous HST spectra. This indicates that the UV absorption troughs may be due to dense clumps embedded in diffuse, higher ionization gas responsible for the X-ray absorption lines of the same velocity. We observe variability in the shape of the absorption lines on timescales of hours, placing the variable component at roughly 1000 $R_g$ if attributed to transverse motion along the line of sight. This estimate aligns with independent UV measurements of the distance to the obscurer suggesting an accretion disk wind at the inner broad line region. We estimate that it takes roughly 200 days for the outflow to travel from the disk to our line of sight, consistent with the timescale of the outflow's column density variations throughout the campaign.
Submitted 24 June, 2024;
originally announced June 2024.
-
EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records
Authors:
Yeonsu Kwon,
Jiho Kim,
Gyubok Lee,
Seongsu Bae,
Daeun Kyung,
Wonchul Cha,
Tom Pollard,
Alistair Johnson,
Edward Choi
Abstract:
Electronic Health Records (EHRs) are integral for storing comprehensive patient medical records, combining structured data (e.g., medications) with detailed clinical notes (e.g., physician notes). These elements are essential for straightforward data retrieval and provide deep, contextual insights into patient care. However, they often suffer from discrepancies due to unintuitive EHR system designs and human errors, posing serious risks to patient safety. To address this, we developed EHRCon, a new dataset and task specifically designed to ensure data consistency between structured tables and unstructured notes in EHRs. EHRCon was crafted in collaboration with healthcare professionals using the MIMIC-III EHR dataset, and includes manual annotations of 3,943 entities across 105 clinical notes checked against database entries for consistency. EHRCon has two versions, one using the original MIMIC-III schema, and another using the OMOP CDM schema, in order to increase its applicability and generalizability. Furthermore, leveraging the capabilities of large language models, we introduce CheckEHR, a novel framework for verifying the consistency between clinical notes and database tables. CheckEHR utilizes an eight-stage process and shows promising results in both few-shot and zero-shot settings. The code is available at https://github.com/dustn1259/EHRCon.
Submitted 24 June, 2024;
originally announced June 2024.
-
Predicting electronic screening for fast Koopmans spectral functional calculations
Authors:
Yannick Schubert,
Sandra Luber,
Nicola Marzari,
Edward Linscott
Abstract:
Koopmans spectral functionals represent a powerful extension of Kohn-Sham density-functional theory (DFT), enabling accurate predictions of spectral properties with state-of-the-art accuracy. The success of these functionals relies on capturing the effects of electronic screening through scalar, orbital-dependent parameters. These parameters have to be computed for every calculation, making Koopmans spectral functionals more expensive than their DFT counterparts. In this work, we present a machine-learning model that -- with minimal training -- can predict these screening parameters directly from orbital densities calculated at the DFT level. We show on two prototypical use cases that using the screening parameters predicted by this model, instead of those calculated from linear response, leads to orbital energies that differ by less than 20 meV on average. Since this approach dramatically reduces run-times with minimal loss of accuracy, it will enable the application of Koopmans spectral functionals to classes of problems that previously would have been prohibitively expensive, such as the prediction of temperature-dependent spectral properties.
Submitted 21 June, 2024;
originally announced June 2024.
-
Symmetries in 3D photoelectron momentum spectroscopy as precursory methods for dichroic and enantiosensitive measurements
Authors:
Michael Davino,
Edward McManus,
Tobias Saule,
Phi-Hung Tran,
Andrés F. Ordóñez,
George Gibson,
Anh-Thu Le,
Carlos A. Trallero-Herrero
Abstract:
3D photoelectron angular distributions (PADs) are measured from an atomic target ionized by ultrafast, elliptical fields of opposite handedness. Comparing these PADs to one another and to numeric simulations, a difficult to avoid systematic error in their orientation is identified and subsequently corrected by imposing the dichroic symmetry by which they are necessarily related. We show that this correction can be directly applied to molecular targets in the same fields. This paves the way for measurement of enantiosensitive information which has yet to be accessed experimentally.
Submitted 20 June, 2024;
originally announced June 2024.
-
Transferable Tactile Transformers for Representation Learning Across Diverse Sensors and Tasks
Authors:
Jialiang Zhao,
Yuxiang Ma,
Lirui Wang,
Edward H. Adelson
Abstract:
This paper presents T3: Transferable Tactile Transformers, a framework for tactile representation learning that scales across multi-sensors and multi-tasks. T3 is designed to overcome the contemporary issue that camera-based tactile sensing is extremely heterogeneous, i.e. sensors are built into different form factors, and existing datasets were collected for disparate tasks. T3 captures the shared latent information across different sensor-task pairings by constructing a shared trunk transformer with sensor-specific encoders and task-specific decoders. The pre-training of T3 utilizes a novel Foundation Tactile (FoTa) dataset, which is aggregated from several open-sourced datasets and it contains over 3 million data points gathered from 13 sensors and 11 tasks. FoTa is the largest and most diverse dataset in tactile sensing to date and it is made publicly available in a unified format. Across various sensors and tasks, experiments show that T3 pre-trained with FoTa achieved zero-shot transferability in certain sensor-task pairings, can be further fine-tuned with small amounts of domain-specific data, and its performance scales with bigger network sizes. T3 is also effective as a tactile encoder for long horizon contact-rich manipulation. Results from sub-millimeter multi-pin electronics insertion tasks show that T3 achieved a task success rate 25% higher than that of policies trained with tactile encoders trained from scratch, or 53% higher than without tactile sensing. Data, code, and model checkpoints are open-sourced at https://t3.alanz.info.
Submitted 19 June, 2024;
originally announced June 2024.
-
DialSim: A Real-Time Simulator for Evaluating Long-Term Dialogue Understanding of Conversational Agents
Authors:
Jiho Kim,
Woosog Chay,
Hyeonji Hwang,
Daeun Kyung,
Hyunseung Chung,
Eunbyeol Cho,
Yohan Jo,
Edward Choi
Abstract:
Recent advancements in Large Language Models (LLMs) have significantly enhanced the capabilities of conversational agents, making them applicable to various fields (e.g., education). Despite their progress, the evaluation of the agents often overlooks the complexities of real-world conversations, such as real-time interactions, multi-party dialogues, and extended contextual dependencies. To bridge this gap, we introduce DialSim, a real-time dialogue simulator. In this simulator, an agent is assigned the role of a character from popular TV shows, requiring it to respond to spontaneous questions using past dialogue information and to distinguish between known and unknown information. Key features of DialSim include evaluating the agent's ability to respond within a reasonable time limit, handling long-term multi-party dialogues, and managing adversarial settings (e.g., swap character names) to challenge the agent's reliance on pre-trained knowledge. We utilized this simulator to evaluate the latest conversational agents and analyze their limitations. Our experiments highlight both the strengths and weaknesses of these agents, providing valuable insights for future improvements in the field of conversational AI. DialSim is available at https://github.com/jiho283/Simulator.
Submitted 18 June, 2024;
originally announced June 2024.
-
The Black Hole Explorer: Motivation and Vision
Authors:
Michael D. Johnson,
Kazunori Akiyama,
Rebecca Baturin,
Bryan Bilyeu,
Lindy Blackburn,
Don Boroson,
Alejandro Cardenas-Avendano,
Andrew Chael,
Chi-kwan Chan,
Dominic Chang,
Peter Cheimets,
Cathy Chou,
Sheperd S. Doeleman,
Joseph Farah,
Peter Galison,
Ronald Gamble,
Charles F. Gammie,
Zachary Gelles,
Jose L. Gomez,
Samuel E. Gralla,
Paul Grimes,
Leonid I. Gurvits,
Shahar Hadar,
Kari Haworth,
Kazuhiro Hada
, et al. (43 additional authors not shown)
Abstract:
We present the Black Hole Explorer (BHEX), a mission that will produce the sharpest images in the history of astronomy by extending submillimeter Very-Long-Baseline Interferometry (VLBI) to space. BHEX will discover and measure the bright and narrow "photon ring" that is predicted to exist in images of black holes, produced from light that has orbited the black hole before escaping. This discovery will expose universal features of a black hole's spacetime that are distinct from the complex astrophysics of the emitting plasma, allowing the first direct measurements of a supermassive black hole's spin. In addition to studying the properties of the nearby supermassive black holes M87* and Sgr A*, BHEX will measure the properties of dozens of additional supermassive black holes, providing crucial insights into the processes that drive their creation and growth. BHEX will also connect these supermassive black holes to their relativistic jets, elucidating the power source for the brightest and most efficient engines in the universe. BHEX will address fundamental open questions in the physics and astrophysics of black holes that cannot be answered without submillimeter space VLBI. The mission is enabled by recent technological breakthroughs, including the development of ultra-high-speed downlink using laser communications, and it leverages billions of dollars of existing ground infrastructure. We present the motivation for BHEX, its science goals and associated requirements, and the pathway to launch within the next decade.
Submitted 13 June, 2024;
originally announced June 2024.
-
Percolating Cosmic String Networks from Kination
Authors:
Joseph P. Conlon,
Edmund J. Copeland,
Edward Hardy,
Noelia Sánchez González
Abstract:
We describe a new mechanism, whose ingredients are realised in string compactifications, for the formation of cosmic (super)string networks. Oscillating string loops grow when their tension $\mu$ decreases with time. If $2H + \dot{\mu}/\mu < 0$, where $H$ is the Hubble parameter, loops grow faster than the scale factor and an initial population of isolated small loops (for example, produced by nucleation) can grow, percolate and form a network. This condition is satisfied for fundamental strings in the background of a kinating volume modulus rolling towards the asymptotic large volume region of moduli space. Such long kination epochs are motivated in string cosmology by both the electroweak hierarchy problem and the need to solve the overshoot problem. The tension of such a network today is set by the final vacuum; for phenomenologically appealing Large Volume Scenario (LVS) vacua, this would lead to a fundamental string network with $G\mu \sim 10^{-10}$.
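The growth condition quoted in this abstract can be motivated by a short worked step. The sketch below assumes that the physical length $L$ of an oscillating loop scales as $L \propto \mu^{-1/2}$ when the tension varies slowly (an adiabatic-invariant argument); that scaling is an assumption made here for illustration and is not stated in the abstract. The scale factor $a$ is defined by $H = \dot{a}/a$.

```latex
% Assumption (not from the abstract): L \propto \mu^{-1/2} for slowly varying tension.
\frac{d}{dt}\ln\frac{L}{a}
  = \frac{\dot{L}}{L} - \frac{\dot{a}}{a}
  = -\frac{1}{2}\frac{\dot{\mu}}{\mu} - H
  = -\frac{1}{2}\left(2H + \frac{\dot{\mu}}{\mu}\right).
% Hence L/a increases with time exactly when 2H + \dot{\mu}/\mu < 0.
```

Under that assumption, the ratio of loop size to scale factor increases precisely when $2H + \dot{\mu}/\mu < 0$, which matches the condition stated above.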
Submitted 18 June, 2024;
originally announced June 2024.
-
WellDunn: On the Robustness and Explainability of Language Models and Large Language Models in Identifying Wellness Dimensions
Authors:
Seyedali Mohammadi,
Edward Raff,
Jinendra Malekar,
Vedant Palit,
Francis Ferraro,
Manas Gaur
Abstract:
Language Models (LMs) are being proposed for mental health applications where the heightened risk of adverse outcomes means predictive performance may not be a sufficient litmus test of a model's utility in clinical practice. A model that can be trusted for practice should have a correspondence between explanation and clinical determination, yet no prior research has examined the attention fidelity of these models and their effect on ground truth explanations. We introduce an evaluation design that focuses on the robustness and explainability of LMs in identifying Wellness Dimensions (WD). We focus on two mental health and well-being datasets: (a) Multi-label Classification-based MultiWD, and (b) WellXplain for evaluating attention mechanism veracity against expert-labeled explanations. The labels are based on Halbert Dunn's theory of wellness, which gives grounding to our evaluation. We reveal four surprising results about LMs/LLMs: (1) Despite their human-like capabilities, GPT-3.5/4 lag behind RoBERTa, and MedAlpaca, a fine-tuned LLM, fails to deliver any remarkable improvements in performance or explanations. (2) Re-examining LMs' predictions based on a confidence-oriented loss function reveals a significant performance drop. (3) Across all LMs/LLMs, the alignment between attention and explanations remains low, with LLMs scoring a dismal 0.0. (4) Most mental health-specific LMs/LLMs overlook domain-specific knowledge and undervalue explanations, causing these discrepancies. This study highlights the need for further research into their consistency and explanations in mental health and well-being.
Submitted 28 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
An IXPE-Led X-ray Spectro-Polarimetric Campaign on the Soft State of Cygnus X-1: X-ray Polarimetric Evidence for Strong Gravitational Lensing
Authors:
James F. Steiner,
Edward Nathan,
Kun Hu,
Henric Krawczynski,
Michal Dovciak,
Alexandra Veledina,
Fabio Muleri,
Jiri Svoboda,
Kevin Alabarta,
Maxime Parra,
Yash Bhargava,
Giorgio Matt,
Juri Poutanen,
Pierre-Olivier Petrucci,
Allyn F. Tennant,
M. Cristina Baglio,
Luca Baldini,
Samuel Barnier,
Sudip Bhattacharyya,
Stefano Bianchi,
Maimouna Brigitte,
Mauricio Cabezas,
Floriane Cangemi,
Fiamma Capitanio,
Jacob Casey
, et al. (112 additional authors not shown)
Abstract:
We present the first X-ray spectropolarimetric results for Cygnus X-1 in its soft state from a campaign of five IXPE observations conducted during 2023 May-June. Companion multiwavelength data during the campaign are likewise shown. The 2-8 keV X-rays exhibit a net polarization degree PD=1.99%+/-0.13% (68% confidence). The polarization signal is found to increase with energy across IXPE's 2-8 keV bandpass. The polarized X-rays exhibit an energy-independent polarization angle of PA=-25.7+/-1.8 deg. East of North (68% confidence). This is consistent with being aligned to Cyg X-1's AU-scale compact radio jet and its pc-scale radio lobes. In comparison to earlier hard-state observations, the soft state exhibits a factor of 2 lower polarization degree, but a similar trend with energy and a similar (also energy-independent) position angle. When scaling by the natural unit of the disk temperature, we find the appearance of a consistent trendline in the polarization degree between soft and hard states. Our favored polarimetric model indicates Cyg X-1's spin is likely high (a* above ~0.96). The substantial X-ray polarization in Cyg X-1's soft state is most readily explained as resulting from a large portion of X-rays emitted from the disk returning and reflecting off the disk surface, generating a high polarization degree and a polarization direction parallel to the black hole spin axis and radio jet. In IXPE's bandpass, the polarization signal is dominated by the returning reflection emission. This constitutes polarimetric evidence for strong gravitational lensing of X-rays close to the black hole.
Submitted 17 June, 2024;
originally announced June 2024.
-
Volume Above Distance Below with Boundary II
Authors:
Brian Allen,
Edward Bryden
Abstract:
It was shown by B. Allen, R. Perales, and C. Sormani that on a closed manifold where the diameter of a sequence of Riemannian metrics is bounded, if the volume converges to the volume of a limit manifold, and the sequence of Riemannian metrics are $C^0$ converging from below then one can conclude volume preserving Sormani-Wenger Intrinsic Flat convergence. The result was extended to manifolds with boundary by B. Allen and R. Perales by a doubling with necks procedure which produced a closed manifold and reduced the case with boundary to the case without boundary. The consequence of the doubling with necks procedure was requiring a stronger condition than necessary on the boundary. Using the estimates for the Sormani-Wenger Intrinsic Flat distance on manifolds with boundary developed by B. Allen and R. Perales, we show that only a bound on the area of the boundary is needed in order to conclude volume preserving intrinsic flat convergence for manifolds with boundary. We also provide an example which shows that one should not expect convergence without a bound on area.
Submitted 17 June, 2024;
originally announced June 2024.
-
Scorecards for Synthetic Medical Data Evaluation and Reporting
Authors:
Ghada Zamzmi,
Adarsh Subbaswamy,
Elena Sizikova,
Edward Margerrison,
Jana Delfino,
Aldo Badano
Abstract:
The growing utilization of synthetic medical data (SMD) in training and testing AI-driven tools in healthcare necessitates a systematic framework for assessing SMD quality. The current lack of a standardized methodology to evaluate SMD, particularly in terms of its applicability in various medical scenarios, is a significant hindrance to its broader acceptance and utilization in healthcare applications. Here, we outline an evaluation framework designed to meet the unique requirements of medical applications, and introduce the concept of SMD scorecards, which can serve as comprehensive reports that accompany artificially generated datasets. This can help standardize evaluation and enable SMD developers to assess and further enhance the quality of SMDs by identifying areas in need of attention and ensuring that the synthetic data more accurately approximate patient data.
Submitted 16 June, 2024;
originally announced June 2024.
-
The Black Hole Explorer: Instrument System Overview
Authors:
Daniel P. Marrone,
Janice Houston,
Kazunori Akiyama,
Bryan Bilyeu,
Don Boroson,
Paul Grimes,
Kari Haworth,
Robert Lehmensiek,
Eliad Peretz,
Hannah Rana,
Laura C. Sinclair,
Sridharan Tirupati Kumara,
Ranjani Srinivasan,
Edward Tong,
Jade Wang,
Jonathan Weintroub,
Michael D. Johnson
Abstract:
The Black Hole Explorer (BHEX) is a space very-long-baseline interferometry (VLBI) mission concept that is currently under development. BHEX will study supermassive black holes at unprecedented resolution, isolating the signature of the "photon ring" - light that has orbited the black hole before escaping - to probe physics at the edge of the observable universe. It will also measure black hole spins, study the energy extraction and acceleration mechanisms for black hole jets, and characterize the black hole mass distribution. BHEX achieves high angular resolution by joining with ground-based millimeter-wavelength VLBI arrays, extending the size, and therefore improving the angular resolution, of the earthbound telescopes. Here we discuss the science instrument concept for BHEX. The science instrument for BHEX is a dual-band, coherent receiver system for 80-320 GHz, coupled to a 3.5-meter antenna. The BHEX receiver front end will observe simultaneously with dual polarizations in two bands, one sampling 80-106 GHz and one sampling 240-320 GHz. An ultra-stable quartz oscillator provides the master frequency reference and ensures coherence for tens of seconds. To achieve the required sensitivity, the front end will instantaneously receive 32 GHz of frequency bandwidth, which will be digitized to 64 Gbits/sec of incompressible raw data. These data will be buffered and transmitted to the ground via laser data link, for correlation with data recorded simultaneously at radio telescopes on the ground. We describe the challenges associated with the instrument concept and the solutions that have been incorporated into the baseline design.
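The quoted figures of 32 GHz of bandwidth and 64 Gbits/sec can be related by a short arithmetic sketch. The abstract does not state the sampling scheme or the bit depth; the sketch below assumes real-valued Nyquist sampling (two samples per hertz of bandwidth) and 1-bit quantization, which happens to reproduce the stated rate. The function name and its parameters are hypothetical.

```python
def nyquist_data_rate_gbps(bandwidth_ghz: float, bits_per_sample: int = 1) -> float:
    """Raw data rate in Gbit/s for real-valued Nyquist sampling.

    Assumption (not stated in the abstract): the signal is sampled at
    exactly twice its bandwidth and each sample is quantized to
    `bits_per_sample` bits.
    """
    gigasamples_per_sec = 2.0 * bandwidth_ghz  # Nyquist rate: 2 samples per Hz
    return gigasamples_per_sec * bits_per_sample


# 32 GHz of total bandwidth with the assumed 1-bit samples
print(nyquist_data_rate_gbps(32.0, bits_per_sample=1))  # 64.0
```

Under the same assumptions, 2-bit samples would double the rate to 128 Gbit/s, so the bit depth is the main lever on the downlink budget in this simplified picture.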
Submitted 14 June, 2024;
originally announced June 2024.
-
The Black Hole Explorer Cryocooling Instrument
Authors:
Hannah Rana,
Kazunori Akiyama,
Edgar Canavan,
Michael DiPirro,
Mark Freeman,
Peter Galison,
Paul Grimes,
Mareki Honma,
Janice Houston,
Michael Johnson,
Mark Kimball,
Daniel Marrone,
Edward Tong
Abstract:
The Black Hole Explorer (BHEX) is a space-based very-long-baseline interferometry (VLBI) mission aimed at precision black hole measurements, detecting the photon ring around black holes, exploring spacetime, spin, and mass properties, and validating predictions of General Relativity. These objectives are achieved using cryogenic receivers with quantum-limited sensitivities across a broad frequency range. Dual-band receivers at 80-106 GHz and 240-320 GHz require operating temperatures of 20 K and 4.5 K, respectively. A cryocooling system with two cold stages will be employed: a 20 K stage handling a 125 mW heat load and a 4.5 K stage handling a 10 mW heat load.
To design the cryocooling system, the mission leverages existing space industry technology at high Technology Readiness Levels (TRLs), informed by missions such as Planck, JEM/SMILES, Hitomi, and XRISM, and advancements from the ACTDP/JWST program. Integrating the cryocooler with the receivers and broader instrument involves careful consideration of thermal challenges, including linking the cold ends of each cooling stage to minimize heat losses and ensuring adequate passive cooling for the cryocooler warm end heat rejection.
Key challenges and trade-offs include sizing the mass and reducing power consumption while meeting the receiver cold temperature requirements, which impact the scientific objectives. This paper addresses efforts to balance the scientific requirements with the limitations of technical cryocooling capabilities within the framework of a small-class (SMEX) space mission, presenting an overview of cooling needs, initial design considerations, a survey of 4 K spaceflight cryocooler developments, and trade-offs.
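To give a sense of scale for the power trade-off, the stated heat loads can be converted into an ideal (Carnot) lower bound on cryocooler input power. This is only a thermodynamic floor under an assumed 300 K warm-end rejection temperature, which the abstract does not specify; real cryocoolers operate well below Carnot efficiency, so actual input power would be considerably higher. The function name and the warm-end value are hypothetical.

```python
def carnot_min_input_power_w(heat_load_w: float, t_cold_k: float,
                             t_warm_k: float = 300.0) -> float:
    """Ideal minimum input power (W) to lift `heat_load_w` from t_cold_k to t_warm_k.

    Uses the Carnot refrigerator relation W = Q * (T_warm - T_cold) / T_cold.
    Assumption: t_warm_k = 300 K is an illustrative value, not from the abstract.
    """
    return heat_load_w * (t_warm_k - t_cold_k) / t_cold_k


# Heat loads and stage temperatures quoted in the abstract
stage_20k = carnot_min_input_power_w(0.125, 20.0)   # 125 mW at 20 K
stage_4k5 = carnot_min_input_power_w(0.010, 4.5)    # 10 mW at 4.5 K

print(f"20 K stage:  {stage_20k:.2f} W")
print(f"4.5 K stage: {stage_4k5:.2f} W")
print(f"Total ideal: {stage_20k + stage_4k5:.2f} W")
```

Even in this idealized bound, the 10 mW load at 4.5 K costs a disproportionate share of input power relative to its size, which illustrates why the coldest stage tends to dominate such trade-offs.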
Submitted 14 June, 2024;
originally announced June 2024.