arXiv:2406.19370

Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space

Authors: Core Francisco Park, Maya Okawa, Andrew Lee, Ekdeep Singh Lubana, Hidenori Tanaka

Abstract: Modern generative models demonstrate impressive capabilities, likely stemming from an ability to identify and manipulate abstract concepts underlying their training data. However, fundamental questions remain: what determines the concepts a model learns, the order in which it learns them, and its ability to manipulate those concepts? To address these questions, we propose analyzing a model's learning dynamics via a framework we call the concept space, where each axis represents an independent concept underlying the data generating process. By characterizing learning dynamics in this space, we identify how the speed at which a concept is learned, and hence the order of concept learning, is controlled by properties of the data we term concept signal. Further, we observe moments of sudden turns in the direction of a model's learning dynamics in concept space. Surprisingly, these points precisely correspond to the emergence of hidden capabilities, i.e., where latent interventions show the model possesses the capability to manipulate a concept, but these capabilities cannot yet be elicited via naive input prompting. While our results focus on synthetically defined toy datasets, we hypothesize a general claim on emergence of hidden capabilities may hold: generative models possess latent capabilities that emerge suddenly and consistently during training, though a model might not exhibit these capabilities under naive input prompting.

Submitted 27 June, 2024; originally announced June 2024.

Comments: Preprint

arXiv:2403.10648

Debiasing with Diffusion: Probabilistic reconstruction of Dark Matter fields from galaxies with CAMELS

Authors: Victoria Ono, Core Francisco Park, Nayantara Mudur, Yueying Ni, Carolina Cuesta-Lazaro, Francisco Villaescusa-Navarro

Abstract: Galaxies are biased tracers of the underlying cosmic web, which is dominated by dark matter components that cannot be directly observed. Galaxy formation simulations can be used to study the relationship between dark matter density fields and galaxy distributions. However, this relationship can be sensitive to assumptions in cosmology and astrophysical processes embedded in the galaxy formation models, that remain uncertain in many aspects. In this work, we develop a diffusion generative model to reconstruct dark matter fields from galaxies. The diffusion model is trained on the CAMELS simulation suite that contains thousands of state-of-the-art galaxy formation simulations with varying cosmological parameters and sub-grid astrophysics. We demonstrate that the diffusion model can predict the unbiased posterior distribution of the underlying dark matter fields from the given stellar mass fields, while being able to marginalize over uncertainties in cosmological and astrophysical models. Interestingly, the model generalizes to simulation volumes approximately 500 times larger than those it was trained on, and across different galaxy formation models. Code for reproducing these results can be found at https://github.com/victoriaono/variational-diffusion-cdm

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2312.15386

Hyperspectral shadow removal with Iterative Logistic Regression and latent Parametric Linear Combination of Gaussians

Authors: Core Francisco Park, Maya Nasr, Manuel Pérez-Carrasco, Eleanor Walker, Douglas Finkbeiner, Cecilia Garraffo

Abstract: Shadow detection and removal is a challenging problem in the analysis of hyperspectral images. Yet, this step is crucial for analyzing data for remote sensing applications like methane detection. In this work, we develop a shadow detection and removal method only based on the spectrum of each pixel and the overall distribution of spectral values. We first introduce Iterative Logistic Regression (ILR) to learn a spectral basis in which shadows can be linearly classified. We then model the joint distribution of the mean radiance and the projection coefficients of the spectra onto the above basis as a parametric linear combination of Gaussians. We can then extract the maximum likelihood mixing parameter of the Gaussians to estimate the shadow coverage and to correct the shadowed spectra. Our correction scheme reduces correction artefacts at shadow borders. The shadow detection and removal method is applied to hyperspectral images from MethaneAIR, a precursor to the satellite MethaneSAT.

Submitted 23 December, 2023; originally announced December 2023.

arXiv:2311.08558

Probabilistic reconstruction of Dark Matter fields from biased tracers using diffusion models

Authors: Core Francisco Park, Victoria Ono, Nayantara Mudur, Yueying Ni, Carolina Cuesta-Lazaro

Abstract: Galaxies are biased tracers of the underlying cosmic web, which is dominated by dark matter components that cannot be directly observed. The relationship between dark matter density fields and galaxy distributions can be sensitive to assumptions in cosmology and astrophysical processes embedded in the galaxy formation models, that remain uncertain in many aspects. Based on state-of-the-art galaxy formation simulation suites with varied cosmological parameters and sub-grid astrophysics, we develop a diffusion generative model to predict the unbiased posterior distribution of the underlying dark matter fields from the given stellar mass fields, while being able to marginalize over the uncertainties in cosmology and galaxy formation.

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2212.04514

doi 10.3847/1538-4357/acc32c

Stellar Reddening Based Extinction Maps for Cosmological Applications

Authors: Nayantara Mudur, Core Francisco Park, Douglas P Finkbeiner

Abstract: Cosmological surveys must correct their observations for the reddening of extragalactic objects by Galactic dust. Existing dust maps, however, have been found to have spatial correlations with the large-scale structure of the Universe. Errors in extinction maps can propagate systematic biases into samples of dereddened extragalactic objects and into cosmological measurements such as correlation functions between foreground lenses and background objects and the primordial non-gaussianity parameter $f_{NL}$. Emission-based maps are contaminated by the cosmic infrared background, while maps inferred from stellar-reddenings suffer from imperfect removal of quasars and galaxies from stellar catalogs. Thus, stellar-reddening based maps using catalogs without extragalactic objects offer a promising path to making dust maps with minimal correlations with large-scale structure. We present two high-latitude integrated extinction maps based on stellar reddenings, with a point spread function of full-width half-maximum 6.1' and 15'. We employ a strict selection of catalog objects to filter out galaxies and quasars and measure the spatial correlation of our extinction maps with extragalactic structure. Our galactic extinction maps have reduced spatial correlation with large scale structure relative to most existing stellar-reddening based and emission-based extinction maps.

Submitted 8 December, 2022; originally announced December 2022.

Comments: 21 pages, 10 figures

arXiv:2204.05435

doi 10.3847/1538-4357/acbe3b

Quantification of high dimensional non-Gaussianities and its implication to Fisher analysis in cosmology

Authors: Core Francisco Park, Erwan Allys, Francisco Villaescusa-Navarro, Douglas P. Finkbeiner

Abstract: It is well known that the power spectrum is not able to fully characterize the statistical properties of non-Gaussian density fields. Recently, many different statistics have been proposed to extract information from non-Gaussian cosmological fields that perform better than the power spectrum. The Fisher matrix formalism is commonly used to quantify the accuracy with which a given statistic can constrain the value of the cosmological parameters. However, these calculations typically rely on the assumption that the likelihood of the considered statistic follows a multivariate Gaussian distribution. In this work we follow Sellentin & Heavens (2017) and use two different statistical tests to identify non-Gaussianities in different statistics such as the power spectrum, bispectrum, marked power spectrum, and wavelet scattering transform (WST). We remove the non-Gaussian components of the different statistics and perform Fisher matrix calculations with the \textit{Gaussianized} statistics using Quijote simulations. We show that constraints on the parameters can change by a factor of $\sim 2$ in some cases. We show with simple examples how statistics that do not follow a multivariate Gaussian distribution can achieve artificially tight bounds on the cosmological parameters when using the Fisher matrix formalism. We think that the non-Gaussian tests used in this work represent a powerful tool to quantify the robustness of Fisher matrix calculations and their underlying assumptions. We release the code used to compute the power spectra, bispectra, and WST that can be run on both CPUs and GPUs.

Submitted 11 April, 2022; originally announced April 2022.

Comments: 24 pages, 6 figures

arXiv:2110.06421

Revisiting Latent-Space Interpolation via a Quantitative Evaluation Framework

Authors: Lu Mi, Tianxing He, Core Francisco Park, Hao Wang, Yue Wang, Nir Shavit

Abstract: Latent-space interpolation is commonly used to demonstrate the generalization ability of deep latent variable models. Various algorithms have been proposed to calculate the best trajectory between two encodings in the latent space. In this work, we show how data labeled with semantically continuous attributes can be utilized to conduct a quantitative evaluation of latent-space interpolation algorithms, for variational autoencoders. Our framework can be used to complement the standard qualitative comparison, and also enables evaluation for domains (such as graph) in which the visualization is difficult. Interestingly, our experiments reveal that the superiority of interpolation algorithms could be domain-dependent. While normalised interpolation works best for the image domain, spherical linear interpolation achieves the best performance in the graph domain. Next, we propose a simple-yet-effective method to restrict the latent space via a bottleneck structure in the encoder. We find that all interpolation algorithms evaluated in this work can benefit from this restriction. Finally, we conduct interpolation-aware training with the labeled attributes, and show that this explicit supervision can improve the interpolation performance.

Submitted 12 October, 2021; originally announced October 2021.

Comments: 11 pages

Showing 1–7 of 7 results for author: Park, C F