-
Designing an Evaluation Framework for Large Language Models in Astronomy Research
Authors: John F. Wu, Alina Hyk, Kiera McCormick, Christine Ye, Simone Astarita, Elina Baral, Jo Ciuca, Jesse Cranney, Anjalie Field, Kartheik Iyer, Philipp Koehn, Jenn Kotler, Sandor Kruk, Michelle Ntampaka, Charles O'Neill, Joshua E. G. Peek, Sanjib Sharma, Mikaeel Yunus
Abstract:
Large Language Models (LLMs) are shifting how scientific research is done. It is imperative to understand how researchers interact with these models and how scientific sub-communities like astronomy might benefit from them. However, there is currently no standard for evaluating the use of LLMs in astronomy. Therefore, we present the experimental design for an evaluation study on how astronomy researchers interact with LLMs. We deploy a Slack chatbot that can answer queries from users via Retrieval-Augmented Generation (RAG); these responses are grounded in astronomy papers from arXiv. We record and anonymize user questions and chatbot answers, user upvotes and downvotes to LLM responses, user feedback to the LLM, and retrieved documents and similarity scores with the query. Our data collection method will enable future dynamic evaluations of LLM tools for astronomy.
Submitted 30 May, 2024;
originally announced May 2024.
-
AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Authors: Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciucă, Charlie O'Neill, Ze-Chang Sun, Maja Jabłońska, Sandor Kruk, Ernest Perkowski, Jack Miller, Jason Li, Josh Peek, Kartheik Iyer, Tomasz Różański, Pranav Khetarpal, Sharaf Zaman, David Brodrick, Sergio J. Rodríguez Méndez, Thang Bui, Alyssa Goodman, Alberto Accomazzi, Jill Naiman, Jesse Cranney, Kevin Schawinski, UniverseTBD
Abstract:
Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than LLaMA-2, showing marked domain adaptation. Our model produces more insightful and scientifically relevant text completions and embeddings than state-of-the-art foundation models despite having significantly fewer parameters. AstroLLaMA serves as a robust, domain-specific model with broad fine-tuning potential. Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
Submitted 12 September, 2023;
originally announced September 2023.
-
MAVIS: performance estimation of the adaptive optics module
Authors: Guido Agapito, Daniele Vassallo, Cédric Plantet, Jesse Cranney, Hao Zhang, Valentina Viotto, Enrico Pinna, Francois Rigaut
Abstract:
The MCAO Assisted Visible Imager and Spectrograph (MAVIS) is a new visible instrument for the ESO Very Large Telescope (VLT). Its Adaptive Optics Module (AOM) must provide an extreme level of adaptive optics correction at low galactic latitude and high sky coverage at the galactic pole, over the 30 arcsec field of view (FoV) of its 4k x 4k optical imager and on its monolithic Integral Field Unit, thanks to 3 deformable mirrors (DM), 8 Laser Guide Stars (LGS), up to 3 Natural Guide Stars (NGS) and 11 Wave Front Sensors (WFS). A careful performance estimation is required to drive the design of this module and to assess the fulfillment of the system and subsystem requirements. Here we present the work done on this topic during the last year: we updated the system parameters to account for the phase B design and for more realistic conditions, and we produced a set of results from analytical and end-to-end simulations that should give as complete a view as possible of the performance of the system.
Submitted 4 August, 2022;
originally announced August 2022.
-
On-sky validation of image-based adaptive optics wavefront sensor referencing
Authors: Nour Skaf, Olivier Guyon, Eric Gendron, Kyohoon Ahn, Arielle Bertrou-Cantou, Anthony Boccaletti, Jesse Cranney, Thayne Currie, Vincent Deo, Billy Edwards, Florian Ferreira, Damien Gratadour, Julien Lozi, Barnaby Norris, Arnaud Sevin, Fabrice Vidal, Sebastien Vievard
Abstract:
Differentiating between an exoplanet signal and residual speckle noise is a key challenge in high-contrast imaging. Speckles are due to a combination of fast, slow and static wavefront aberrations introduced by atmospheric turbulence and instrument optics. While wavefront control techniques developed over the last decade have shown promise in minimizing fast atmospheric residuals, slow and static aberrations such as non-common path aberrations (NCPAs) remain a key limiting factor for exoplanet detection. NCPAs are not seen by the wavefront sensor (WFS) of the adaptive optics (AO) loop, hence the difficulty in correcting them. We propose to improve the identification and rejection of those aberrations. The DrWHO algorithm performs frequent compensation of static and quasi-static aberrations to boost image contrast. By changing the WFS reference at every iteration of the algorithm, DrWHO changes the AO point of convergence to lead it towards a compensation of the static and slow aberrations. References are calculated using an iterative lucky-imaging approach, where each iteration updates the WFS reference, ultimately favoring high-quality focal plane images. We validate this concept through numerical simulations and on-sky testing on the SCExAO instrument at the 8.2-m Subaru telescope. Simulations show a rapid convergence towards the correction of 82% of the NCPAs. On-sky tests are performed over a 10-minute run in the visible (750 nm). We introduce a flux concentration (FC) metric to quantify the point spread function (PSF) quality and measure a 15.7% improvement. The DrWHO algorithm is a robust focal-plane wavefront sensing calibration method that has been successfully demonstrated on sky. It neither relies on a model nor requires wavefront sensor calibration or linearity. It is compatible with different wavefront control methods, and can be further optimized for speed and efficiency.
Submitted 28 October, 2021;
originally announced October 2021.
-
Towards Realistic Modeling of the Astrometric Capabilities of MCAO Systems: Detecting an Intermediate Mass Black Hole with MAVIS
Authors: Stephanie Monty, Francois Rigaut, Richard McDermid, Holger Baumgardt, Jesse Cranney, Guido Agapito, J. Trevor Mendel, Cedric Plantet, Davide Greggio, Peter B. Stetson, Giuliana Fiorentino, Dionne Haynes
Abstract:
Accurate astrometry is a key deliverable for the next generation of multi-conjugate adaptive optics (MCAO) systems. The MCAO Visible Imager and Spectrograph (MAVIS) is being designed for the Very Large Telescope Adaptive Optics Facility and must achieve 150 $μ$as astrometric precision (50 $μ$as goal). To test this before going on-sky, we have created MAVISIM, a tool to simulate MAVIS images. MAVISIM accounts for three major sources of astrometric error: high- and low-order point spread function (PSF) spatial variability, tip-tilt residual error, and static field distortion. When exploring the impact of these three error terms alone, we recover an astrometric accuracy of 50 $μ$as for all stars brighter than $m=19$ in a 30 s integration using PSF-fitting photometry. We also assess the feasibility of MAVIS detecting an intermediate mass black hole (IMBH) in a Milky Way globular cluster. We use an N-body simulation of an NGC 3201-like cluster with a central 1500 M$_{\odot}$ IMBH as input to MAVISIM and recover the velocity dispersion profile from proper motion measurements. Under favourable astrometric conditions, the dynamical signature of the IMBH is detected with a precision of ~0.20 km/s in the inner ~4" of the cluster where HST is confusion-limited. This precision is comparable to measurements made by Gaia, HST and MUSE in the outer ~60" of the cluster. This study is the first step towards building a science-driven astrometric error budget for an MCAO system and a prediction of what MAVIS could do once on sky.
Submitted 2 August, 2021; v1 submitted 28 July, 2021;
originally announced July 2021.