Showing 1–2 of 2 results for author: Carr, J C
-
Conditions on Preference Relations that Guarantee the Existence of Optimal Policies
Authors:
Jonathan Colaço Carr,
Prakash Panangaden,
Doina Precup
Abstract:
Learning from Preferential Feedback (LfPF) plays an essential role in training Large Language Models, as well as certain types of interactive learning agents. However, a substantial gap exists between the theory and application of LfPF algorithms. Current results guaranteeing the existence of optimal policies in LfPF problems assume that both the preferences and transition dynamics are determined…
▽ More
Learning from Preferential Feedback (LfPF) plays an essential role in training Large Language Models, as well as certain types of interactive learning agents. However, a substantial gap exists between the theory and application of LfPF algorithms. Current results guaranteeing the existence of optimal policies in LfPF problems assume that both the preferences and transition dynamics are determined by a Markov Decision Process. We introduce the Direct Preference Process, a new framework for analyzing LfPF problems in partially-observable, non-Markovian environments. Within this framework, we establish conditions that guarantee the existence of optimal policies by considering the ordinal structure of the preferences. We show that a decision-making problem can have optimal policies -- that are characterized by recursive optimality equations -- even when no reward function can express the learning goal. These findings underline the need to explore preference-based learning strategies which do not assume that preferences are generated by reward.
△ Less
Submitted 27 March, 2024; v1 submitted 3 November, 2023;
originally announced November 2023.
-
Machine-learning recovery of foreground wedge-removed 21-cm light cones for high-$z$ galaxy map**
Authors:
Jacob Kennedy,
Jonathan Colaço Carr,
Samuel Gagnon-Hartman,
Adrian Liu,
Jordan Mirocha,
Yue Cui
Abstract:
Upcoming experiments will map the spatial distribution of the 21-cm signal over three-dimensional volumes of space during the Epoch of Reionization (EoR). Several methods have been proposed to mitigate the issue of astrophysical foreground contamination in tomographic images of the 21-cm signal, one of which involves the excision of a wedge-shaped region in cylindrical Fourier space. While this re…
▽ More
Upcoming experiments will map the spatial distribution of the 21-cm signal over three-dimensional volumes of space during the Epoch of Reionization (EoR). Several methods have been proposed to mitigate the issue of astrophysical foreground contamination in tomographic images of the 21-cm signal, one of which involves the excision of a wedge-shaped region in cylindrical Fourier space. While this removes the $k$-modes most readily contaminated by foregrounds, the concurrent removal of cosmological information located within the wedge considerably distorts the structure of 21-cm images. In this study, we build upon a U-Net based deep learning algorithm to reconstruct foreground wedge-removed maps of the 21-cm signal, newly incorporating light-cone effects. Adopting the Square Kilometre Array (SKA) as our fiducial instrument, we highlight that our U-Net recovery framework retains a reasonable level of reliability even in the face of instrumental limitations and noise. We subsequently evaluate the efficacy of recovered maps in guiding high-redshift galaxy searches and providing context to existing galaxy catalogues. This will allow for studies of how the high-redshift galaxy luminosity function varies across environments, and ultimately refine our understanding of the connection between the ionization state of the intergalactic medium (IGM) and galaxies during the EoR.
△ Less
Submitted 12 March, 2024; v1 submitted 18 August, 2023;
originally announced August 2023.