-
The rate and contribution of mergers to mass assembly from NIRCam observations of galaxy candidates up to 13.3 billion years ago
Authors:
Nicolò Dalmasso,
Antonello Calabrò,
Nicha Leethochawalit,
Benedetta Vulcani,
Kristan Boyett,
Michele Trenti,
Tommaso Treu,
Marco Castellano,
Maruša Bradač,
Benjamin Metha,
Paola Santini
Abstract:
We present an analysis of the galaxy merger rate in the redshift range $4.0<z<9.0$ (i.e. about 1.5 to 0.5 Gyr after the Big Bang) based on visually identified galaxy mergers from morphological parameter analysis. Our dataset is based on high-resolution NIRCam JWST data (F150W and F200W broad-band filters) in the low-to-moderate magnification ($\mu<2$) regions of the Abell 2744 cluster field. From a parent set of 675 galaxies $(M_{UV}\in[-26.6,-17.9])$, we identify 64 merger candidates from the Gini, $M_{20}$ and Asymmetry morphological parameters, leading to a merger fraction $f_m=0.11\pm0.04$. There is no evidence of redshift evolution of $f_m$ even at the highest redshift considered, thus extending the constant trend previously seen at $z\lesssim 6$ well into the epoch of reionization. Furthermore, we investigate any potential redshift-dependent differences in the specific star formation rates between mergers and non-mergers. Our analysis reveals no significant correlation in this regard, with deviations in the studied redshift range typically falling within $0.25$ dex, which can be attributed to sample variance and measurement errors. Finally, we also demonstrate that the classification of a merging system is robust with respect to the observed (and equivalently rest-frame) wavelength of the high-quality JWST broad-band images used. This preliminary study highlights the potential for progress in quantifying galaxy assembly through mergers during the epoch of reionization, with significant sample size growth expected from upcoming large JWST infrared imaging datasets.
Submitted 17 March, 2024;
originally announced March 2024.
-
Galaxy clustering at cosmic dawn from JWST/NIRCam observations to redshift z$\sim$11
Authors:
Nicolò Dalmasso,
Nicha Leethochawalit,
Michele Trenti,
Kristan Boyett
Abstract:
We report measurements of the galaxy two-point correlation function at cosmic dawn, using photometrically-selected sources from the JWST Advanced Deep Extragalactic Survey (JADES). The JWST/NIRCam dataset comprises approximately $N_g \simeq 7000$ photometrically-selected Lyman Break Galaxies (LBGs), spanning from $z=5.5$ up to $z=10.6$. The primary objective of this study is to extend clustering measurements to redshift $z>10$, finding a galaxy bias $b=9.6\pm1.7$ for the sample at $\overline{z} = 10.6$. The result suggests that the observed sources are hosted by dark matter halos of approximately $M_{h}\sim 10^{10.5}~\mathrm{M_{\odot}}$, in broad agreement with theoretical and numerical modelling of early galaxy formation during the epoch of reionization. Furthermore, the JWST JADES dataset enables an unprecedented investigation of clustering of dwarf galaxies two orders of magnitude fainter than the characteristic $L_*$ luminosity (i.e. with $M_{F200W}\simeq-14.5$) during the late stages of the epoch of reionization at $z\sim 6$. By measuring clustering versus luminosity, we observe that $b(M_{F200W})$ initially decreases with $M_{F200W}$ as theoretically expected, but a turning point of the relationship is seen at $M_{F200W} \sim -16$. We interpret the rise of clustering of the faintest dwarfs as evidence of multiple halo occupation (i.e. as a one-halo term in bias modelling). These initial results demonstrate the potential for further quantitative characterisation of the interplay between assembly of dark matter and light during cosmic dawn that the growing samples of JWST observations are enabling.
Submitted 28 February, 2024;
originally announced February 2024.
-
Synthetic Data Applications in Finance
Authors:
Vamsi K. Potluru,
Daniel Borrajo,
Andrea Coletta,
Niccolò Dalmasso,
Yousef El-Laham,
Elizabeth Fons,
Mohsen Ghassemi,
Sriram Gopalakrishnan,
Vikesh Gosai,
Eleonora Kreačić,
Ganapathy Mani,
Saheed Obitayo,
Deepak Paramanand,
Natraj Raman,
Mikhail Solonin,
Srijan Sood,
Svitlana Vyetrenko,
Haibei Zhu,
Manuela Veloso,
Tucker Balch
Abstract:
Synthetic data has made tremendous strides in various commercial settings including finance, healthcare, and virtual reality. We present a broad overview of prototypical applications of synthetic data in the financial sector and in particular provide richer details for a few select ones. These cover a wide variety of data modalities including tabular, time-series, event-series, and unstructured data arising from both markets and retail financial applications. Since finance is a highly regulated industry, synthetic data is a potential approach for dealing with issues related to privacy, fairness, and explainability. Various metrics are utilized in evaluating the quality and effectiveness of our approaches in these applications. We conclude with open directions in synthetic data in the context of the financial domain.
Submitted 20 March, 2024; v1 submitted 29 December, 2023;
originally announced January 2024.
-
Galaxy clustering measurements out to redshift z$\sim$8 from Hubble Legacy Fields
Authors:
Nicolò Dalmasso,
Michele Trenti,
Nicha Leethochawalit
Abstract:
We present a novel approach for measuring the two-point correlation function of galaxies in narrow pencil beam surveys with varying depths. Our methodology is utilized to expand high-redshift galaxy clustering investigations up to $z \sim 8$ by analyzing a comprehensive sample consisting of $N_g = 160$ Lyman break galaxy candidates obtained through optical and near-infrared photometric data within the CANDELS GOODS datasets from the Hubble Space Telescope Legacy Fields. For bright sources with $M_{UV} < -19.8$, we determine a galaxy bias of $b = 9.33\pm4.90$ at $\overline{z} = 7.7$ and a correlation length of $r_0 = 10.74\pm7.06~h^{-1}\,\mathrm{Mpc}$. We obtain similar results for the XDF, with a galaxy bias measurement of $b = 8.26\pm3.41$ at the same redshift for a slightly fainter sample with a median luminosity of $M_{UV} = -18.4$. By comparing with dark-matter halo bias and employing abundance matching, we deduce a characteristic halo mass of $M_h \sim 10^{11.5} M_{\odot}$ and a duty cycle close to unity. To validate our approach for variable-depth datasets, we replicate the analysis in a region with near-uniform depth using a standard two-point correlation function estimator, yielding consistent outcomes. Our study not only provides a valuable tool for future utilization in JWST datasets but also suggests that the clustering of early galaxies continues to increase with redshift at $z \gtrsim 8$, potentially contributing to the existence of protocluster structures observed in early JWST imaging and spectroscopic surveys at $z \gtrsim 8$.
Submitted 19 December, 2023;
originally announced December 2023.
-
Fair Wasserstein Coresets
Authors:
Zikai Xiong,
Niccolò Dalmasso,
Shubham Sharma,
Freddy Lecue,
Daniele Magazzeni,
Vamsi K. Potluru,
Tucker Balch,
Manuela Veloso
Abstract:
Data distillation and coresets have emerged as popular approaches to generate a smaller representative set of samples for downstream learning tasks to handle large-scale datasets. At the same time, machine learning is being increasingly applied to decision-making processes at a societal level, making it imperative for modelers to address inherent biases towards subgroups present in the data. While current approaches focus on creating fair synthetic representative samples by optimizing local properties relative to the original samples, their impact on downstream learning processes has yet to be explored. In this work, we present fair Wasserstein coresets (FWC), a novel coreset approach which generates fair synthetic representative samples along with sample-level weights to be used in downstream learning tasks. FWC uses an efficient majorization-minimization algorithm to minimize the Wasserstein distance between the original dataset and the weighted synthetic samples while enforcing demographic parity. We show that an unconstrained version of FWC is equivalent to Lloyd's algorithm for k-medians and k-means clustering. Experiments conducted on both synthetic and real datasets show that FWC: (i) achieves a competitive fairness-utility tradeoff in downstream models compared to existing approaches, (ii) improves downstream fairness when added to the existing training data and (iii) can be used to reduce biases in predictions from large language models (GPT-3.5 and GPT-4).
Submitted 4 June, 2024; v1 submitted 9 November, 2023;
originally announced November 2023.
-
FairWASP: Fast and Optimal Fair Wasserstein Pre-processing
Authors:
Zikai Xiong,
Niccolò Dalmasso,
Alan Mishler,
Vamsi K. Potluru,
Tucker Balch,
Manuela Veloso
Abstract:
Recent years have seen a surge of machine learning approaches aimed at reducing disparities in model outputs across different subgroups. In many settings, training data may be used in multiple downstream applications by different users, which means it may be most effective to intervene on the training data itself. In this work, we present FairWASP, a novel pre-processing approach designed to reduce disparities in classification datasets without modifying the original data. FairWASP returns sample-level weights such that the reweighted dataset minimizes the Wasserstein distance to the original dataset while satisfying (an empirical version of) demographic parity, a popular fairness criterion. We show theoretically that integer weights are optimal, which means our method can be equivalently understood as duplicating or eliminating samples. FairWASP can therefore be used to construct datasets which can be fed into any classification method, not just methods which accept sample weights. Our work is based on reformulating the pre-processing task as a large-scale mixed-integer program (MIP), for which we propose a highly efficient algorithm based on the cutting plane method. Experiments demonstrate that our proposed optimization algorithm significantly outperforms state-of-the-art commercial solvers in solving both the MIP and its linear program relaxation. Further experiments highlight the competitive performance of FairWASP in reducing disparities while preserving accuracy in downstream classification settings.
Submitted 8 February, 2024; v1 submitted 31 October, 2023;
originally announced November 2023.
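The FairWASP abstract above notes that integer sample weights are equivalent to duplicating or eliminating samples, and that the reweighted data should satisfy an empirical version of demographic parity. The sketch below illustrates only those two statements on toy data. It does not implement the Wasserstein minimization or the cutting-plane solver; the function names, the toy labels, and the hand-picked weights are all assumptions made for illustration.

```python
import numpy as np

def weighted_positive_rates(y, group, weights):
    """Weighted fraction of positive labels within each group value."""
    y = np.asarray(y, dtype=float)
    group = np.asarray(group)
    weights = np.asarray(weights, dtype=float)
    rates = {}
    for g in np.unique(group):
        mask = group == g
        rates[int(g)] = float(np.sum(weights[mask] * y[mask]) / np.sum(weights[mask]))
    return rates

def parity_gap(y, group, weights):
    """Largest difference in weighted positive rate between any two groups."""
    rates = weighted_positive_rates(y, group, weights)
    return max(rates.values()) - min(rates.values())

# Toy dataset (hypothetical): group 0 has a 75% positive rate, group 1 has 25%.
y = np.array([1, 1, 1, 0, 1, 0, 0, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
uniform_weights = np.ones(len(y))

# Hand-picked integer weights (NOT the output of any optimizer): weight 0 drops
# a sample, weight 2 duplicates it.
int_weights = np.array([1, 0, 0, 1, 2, 1, 1, 0])

gap_before = parity_gap(y, group, uniform_weights)
gap_after = parity_gap(y, group, int_weights)

# Integer weights can be materialized as an ordinary unweighted dataset.
idx = np.repeat(np.arange(len(y)), int_weights)
gap_duplicated = parity_gap(y[idx], group[idx], np.ones(len(idx)))
```

Because the weights are integers, the materialized dataset `y[idx]`, `group[idx]` can be passed to any classifier, including ones that do not accept sample weights, which is the practical point the abstract makes.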
-
Deep Gaussian Mixture Ensembles
Authors:
Yousef El-Laham,
Niccolò Dalmasso,
Elizabeth Fons,
Svitlana Vyetrenko
Abstract:
This work introduces a novel probabilistic deep learning technique called deep Gaussian mixture ensembles (DGMEs), which enables accurate quantification of both epistemic and aleatoric uncertainty. By assuming the data generating process follows that of a Gaussian mixture, DGMEs are capable of approximating complex probability distributions, such as heavy-tailed or multimodal distributions. Our contributions include the derivation of an expectation-maximization (EM) algorithm used for learning the model parameters, which results in an upper-bound on the log-likelihood of training data over that of standard deep ensembles. Additionally, the proposed EM training procedure allows for learning of mixture weights, which is not commonly done in ensembles. Our experimental results demonstrate that DGMEs outperform state-of-the-art uncertainty quantifying deep learning models in handling complex predictive densities.
Submitted 12 June, 2023;
originally announced June 2023.
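The DGME abstract above refers to separating epistemic from aleatoric uncertainty in a Gaussian mixture prediction. One common way to make that split concrete is the law of total variance for a mixture: the weighted average of component variances plus the weighted spread of component means. The sketch below shows that decomposition only. Reading the two terms as "aleatoric" and "epistemic" is a widespread convention for ensembles and is an assumption here, not a statement of how the paper defines them; the function name and toy numbers are hypothetical.

```python
import numpy as np

def mixture_moments(weights, means, variances):
    """Mean and variance decomposition of a 1-D Gaussian mixture.

    Returns (mean, within, between) where
      within  = sum_k w_k * var_k            (average component variance)
      between = sum_k w_k * (mu_k - mean)^2  (spread of component means)
    and the total mixture variance is within + between.
    """
    w = np.asarray(weights, dtype=float)
    m = np.asarray(means, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = w / w.sum()  # normalize so the weights form a distribution
    mean = float(np.sum(w * m))
    within = float(np.sum(w * v))
    between = float(np.sum(w * (m - mean) ** 2))
    return mean, within, between

# Toy example (hypothetical): two equally weighted components at -1 and +1.
mean, within, between = mixture_moments([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
total_variance = within + between
```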
-
A massive interacting galaxy 510 million years after the Big Bang
Authors:
Kristan Boyett,
Michele Trenti,
Nicha Leethochawalit,
Antonello Calabró,
Benjamin Metha,
Guido Roberts-Borsani,
Nicoló Dalmasso,
Lilan Yang,
Paola Santini,
Tommaso Treu,
Tucker Jones,
Alaina Henry,
Charlotte A. Mason,
Takahiro Morishita,
Themiya Nanayakkara,
Namrata Roy,
Xin Wang,
Adriano Fontana,
Emiliano Merlin,
Marco Castellano,
Diego Paris,
Marusa Bradac,
Danilo Marchesini,
Sara Mascia,
Laura Pentericci
, et al. (2 additional authors not shown)
Abstract:
JWST observations confirm the existence of galaxies as early as 300 Myr and at a higher number density than expected based on galaxy formation models and HST observations. Yet, sources confirmed spectroscopically in the first 500 Myr have estimated stellar masses $<5\times10^8M_\odot$, limiting the signal-to-noise ratio (SNR) for investigating substructure. We present a high-resolution spectroscopic and spatially resolved study of a rare bright galaxy at $z=9.3127\pm0.0002$ with a stellar mass of $(2.5^{+0.7}_{-0.5})\times10^9M_\odot$, forming $25^{+3}_{-4}\,M_\odot\,\mathrm{yr}^{-1}$ and with a metallicity of $\sim0.1Z_\odot$, lower than in the local universe for the stellar mass but in line with expectations of chemical enrichment in galaxies 1-2 Gyr after the Big Bang. The system has a morphology typically associated with two interacting galaxies, with a two-component main clump of very young stars (age $<10$ Myr) surrounded by an extended stellar population ($130\pm20$ Myr old, identified by modeling the NIRSpec spectrum) and an elongated clumpy tidal tail. The spectroscopic observations identify O, Ne and H emission lines, and the Lyman break, where there is evidence of substantial Ly$\alpha$ absorption. The [OII] doublet is resolved spectrally, enabling an estimate of the electron number density and ionization parameter of the interstellar medium and showing higher densities and ionization than in lower redshift analogs. For the first time at $z>8$, we identify evidence of absorption lines (Si, C and Fe), with low-confidence individual detections but SNR $>6$ when stacked. The absorption features suggest that Ly$\alpha$ is damped by the interstellar and circumgalactic medium. Our observations provide evidence of rapid efficient build-up of mass and metals in the immediate aftermath of the Big Bang through mergers, demonstrating that massive galaxies with several billion stars exist earlier than expected.
Submitted 26 February, 2024; v1 submitted 1 March, 2023;
originally announced March 2023.
-
Fast Learning of Multidimensional Hawkes Processes via Frank-Wolfe
Authors:
Renbo Zhao,
Niccolò Dalmasso,
Mohsen Ghassemi,
Vamsi K. Potluru,
Tucker Balch,
Manuela Veloso
Abstract:
Hawkes processes have recently risen to the forefront of tools when it comes to modeling and generating sequential event data. Multidimensional Hawkes processes model both the self- and cross-excitation between different types of events and have been applied successfully in various domains such as finance, epidemiology and personalized recommendations, among others. In this work we present an adaptation of the Frank-Wolfe algorithm for learning multidimensional Hawkes processes. Experimental results show that our approach has better or on par accuracy in terms of parameter estimation than other first order methods, while enjoying a significantly faster runtime.
Submitted 12 December, 2022;
originally announced December 2022.
-
Online Learning for Mixture of Multivariate Hawkes Processes
Authors:
Mohsen Ghassemi,
Niccolò Dalmasso,
Simran Lamba,
Vamsi K. Potluru,
Sameena Shah,
Tucker Balch,
Manuela Veloso
Abstract:
Online learning of Hawkes processes has received increasing attention in the last couple of years especially for modeling a network of actors. However, these works typically either model the rich interaction between the events or the latent cluster of the actors or the network structure between the actors. We propose to model the latent structure of the network of actors as well as their rich interaction across events for real-world settings of medical and financial applications. Experimental results on both synthetic and real-world data showcase the efficacy of our approach.
Submitted 16 August, 2022;
originally announced August 2022.
-
Differentially Private Learning of Hawkes Processes
Authors:
Mohsen Ghassemi,
Eleonora Kreačić,
Niccolò Dalmasso,
Vamsi K. Potluru,
Tucker Balch,
Manuela Veloso
Abstract:
Hawkes processes have recently gained increasing attention from the machine learning community for their versatility in modeling event sequence data. While they have a rich history going back decades, some of their properties, such as sample complexity for learning the parameters and releasing differentially private versions, are yet to be thoroughly analyzed. In this work, we study standard Hawkes processes with background intensity $\mu$ and excitation function $\alpha e^{-\beta t}$. We provide both non-private and differentially private estimators of $\mu$ and $\alpha$, and obtain sample complexity results in both settings to quantify the cost of privacy. Our analysis exploits the strong mixing property of Hawkes processes and classical central limit theorem results for weakly dependent random variables. We validate our theoretical findings on both synthetic and real datasets.
Submitted 27 July, 2022;
originally announced July 2022.
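The abstract above describes a Hawkes process through its background intensity and an exponential excitation function with parameters alpha and beta. For a univariate process with that kernel, the standard conditional intensity is the background rate plus one decaying contribution per past event. The sketch below evaluates that textbook formula. It is not the paper's estimator and involves no privacy mechanism; the function names and the toy parameter values are assumptions.

```python
import math

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity of a univariate exponential-kernel Hawkes process:

        lambda(t) = mu + sum over events t_i < t of alpha * exp(-beta * (t - t_i))
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in event_times if t_i < t
    )
    return mu + excitation

def branching_ratio(alpha, beta):
    """Integral of the kernel alpha * exp(-beta * t) over t >= 0, i.e. alpha / beta.

    A value below 1 is the usual stationarity condition for this kernel.
    """
    return alpha / beta

# Toy parameters (hypothetical): one past event at time 0, evaluated at t = 1.
mu, alpha, beta = 0.2, 0.5, 1.0
rate_no_history = hawkes_intensity(1.0, [], mu, alpha, beta)
rate_after_event = hawkes_intensity(1.0, [0.0], mu, alpha, beta)
```

With no past events the intensity reduces to the background rate, and each event raises it by alpha before the effect decays at rate beta.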
-
Structural Forecasting for Short-term Tropical Cyclone Intensity Guidance
Authors:
Trey McNeely,
Pavel Khokhlov,
Niccolo Dalmasso,
Kimberly M. Wood,
Ann B. Lee
Abstract:
Because geostationary satellite (Geo) imagery provides a high temporal resolution window into tropical cyclone (TC) behavior, we investigate the viability of its application to short-term probabilistic forecasts of TC convective structure to subsequently predict TC intensity. Here, we present a prototype model which is trained solely on two inputs: Geo infrared imagery leading up to the synoptic time of interest and intensity estimates up to 6 hours prior to that time. To estimate future TC structure, we compute cloud-top temperature radial profiles from infrared imagery and then simulate the evolution of an ensemble of those profiles over the subsequent 12 hours by applying a Deep Autoregressive Generative Model (PixelSNAIL). To forecast TC intensities at hours 6 and 12, we input operational intensity estimates up to the current time (0 h) and simulated future radial profiles up to +12 h into a "nowcasting" convolutional neural network. We limit our inputs to demonstrate the viability of our approach and to enable quantification of value added by the observed and simulated future radial profiles beyond operational intensity estimates alone. Our prototype model achieves a marginally higher error than the National Hurricane Center's official forecasts despite excluding environmental factors, such as vertical wind shear and sea surface temperature. We also demonstrate that it is possible to reasonably predict short-term evolution of TC convective structure via radial profiles from Geo infrared imagery, resulting in interpretable structural forecasts that may be valuable for TC operational guidance.
Submitted 8 April, 2023; v1 submitted 31 May, 2022;
originally announced June 2022.
-
Unsigned Distance Field as an Accurate 3D Scene Representation for Neural Scene Completion
Authors:
Jean Pierre Richa,
Jean-Emmanuel Deschaud,
François Goulette,
Nicolas Dalmasso
Abstract:
Scene Completion is the task of completing missing geometry from a partial scan of a scene. Most previous methods compute an implicit representation from range data using a Truncated Signed Distance Function (T-SDF) computed on a 3D grid as input to neural networks. The truncation decreases but does not remove the border errors introduced by the sign of the SDF for open surfaces. As an alternative, we present an Unsigned Distance Function (UDF) as an input representation to scene completion neural networks. The proposed UDF is simple and efficient as a geometry representation, and can be computed on any point cloud. In contrast to usual Signed Distance Functions, our UDF does not require normal computation. To obtain the explicit geometry, we present a method for extracting a point cloud from discretized UDF values on a sparse grid. We compare different SDFs and UDFs for the scene completion task on indoor and outdoor point clouds collected using RGB-D and LiDAR sensors and show improved completion using the proposed UDF function.
Submitted 2 December, 2022; v1 submitted 17 March, 2022;
originally announced March 2022.
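The abstract above states that an unsigned distance function can be computed on any point cloud and needs no normals. In its most basic form, the unsigned distance at a grid location is just the Euclidean distance to the nearest point. The sketch below computes that on a small dense grid by brute force. It is an illustration of the general idea under stated assumptions, not the paper's UDF variant or its sparse-grid pipeline; the function name, grid bounds, and resolution are hypothetical.

```python
import numpy as np

def unsigned_distance_grid(points, grid_min, grid_max, resolution):
    """Brute-force unsigned distance field on a dense cubic grid.

    points:     (N, 3) array of point-cloud coordinates
    grid_min:   length-3 lower corner of the grid
    grid_max:   length-3 upper corner of the grid
    resolution: number of samples along each axis
    Returns a (resolution, resolution, resolution) array whose entries are the
    distance from each grid sample to its nearest point. No normals are used.
    """
    points = np.asarray(points, dtype=float)
    axes = [np.linspace(grid_min[d], grid_max[d], resolution) for d in range(3)]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    samples = np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)
    # Pairwise distances (M, N); acceptable for small inputs only.
    diffs = samples[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return dists.min(axis=1).reshape(resolution, resolution, resolution)

# Toy input (hypothetical): a single point at the origin, grid spanning [-1, 1]^3.
udf = unsigned_distance_grid([[0.0, 0.0, 0.0]], [-1, -1, -1], [1, 1, 1], 3)
```

For real scans the pairwise distance matrix is too large, and a spatial index such as a k-d tree would normally replace the brute-force search.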
-
AdaSplats: Adaptive Splatting of Point Clouds for Accurate 3D Modeling and Real-time High-Fidelity LiDAR Simulation
Authors:
Jean Pierre Richa,
Jean-Emmanuel Deschaud,
François Goulette,
Nicolas Dalmasso
Abstract:
LiDAR sensors provide rich 3D information about their surroundings and are becoming increasingly important for autonomous vehicle tasks such as localization, semantic segmentation, object detection, and tracking. Simulation accelerates the testing, validation, and deployment of autonomous vehicles while also reducing cost and eliminating the risks of testing in real-world scenarios. We address the problem of high-fidelity LiDAR simulation and present a pipeline that leverages real-world point clouds acquired by mobile mapping systems. Point-based geometry representations, more specifically splats (2D oriented disks with normals), have proven their ability to accurately model the underlying surface in large point clouds, mainly with uniform density. We introduce an adaptive splat generation method that accurately models the underlying 3D geometry to handle real-world point clouds with variable densities, especially for thin structures. Moreover, we introduce a fast LiDAR sensor simulator, working in the splatted model, that leverages the GPU parallel architecture with an acceleration structure while focusing on efficiently handling large point clouds. We test our LiDAR simulation in real-world conditions, showing qualitative and quantitative results compared to basic splatting and meshing techniques, demonstrating the interest of our modeling technique.
Submitted 26 December, 2022; v1 submitted 17 March, 2022;
originally announced March 2022.
-
Fair When Trained, Unfair When Deployed: Observable Fairness Measures are Unstable in Performative Prediction Settings
Authors:
Alan Mishler,
Niccolò Dalmasso
Abstract:
Many popular algorithmic fairness measures depend on the joint distribution of predictions, outcomes, and a sensitive feature like race or gender. These measures are sensitive to distribution shift: a predictor which is trained to satisfy one of these fairness definitions may become unfair if the distribution changes. In performative prediction settings, however, predictors are precisely intended to induce distribution shift. For example, in many applications in criminal justice, healthcare, and consumer finance, the purpose of building a predictor is to reduce the rate of adverse outcomes such as recidivism, hospitalization, or default on a loan. We formalize the effect of such predictors as a type of concept shift, a particular variety of distribution shift, and show both theoretically and via simulated examples how this causes predictors which are fair when they are trained to become unfair when they are deployed. We further show how many of these issues can be avoided by using fairness definitions that depend on counterfactual rather than observable outcomes.
Submitted 10 February, 2022;
originally announced February 2022.
-
Likelihood-Free Frequentist Inference: Bridging Classical Statistics and Machine Learning for Reliable Simulator-Based Inference
Authors:
Niccolò Dalmasso,
Luca Masserano,
David Zhao,
Rafael Izbicki,
Ann B. Lee
Abstract:
Many areas of science make extensive use of computer simulators that implicitly encode intractable likelihood functions of complex systems. Classical statistical methods are poorly suited for these so-called likelihood-free inference (LFI) settings, especially outside asymptotic and low-dimensional regimes. At the same time, traditional LFI methods - such as Approximate Bayesian Computation or more recent machine learning techniques - do not guarantee confidence sets with nominal coverage in general settings (i.e., with high-dimensional data, finite sample sizes, and for any parameter value). In addition, there are no diagnostic tools to check the empirical coverage of confidence sets provided by such methods across the entire parameter space. In this work, we propose a unified and modular inference framework that bridges classical statistics and modern machine learning providing (i) a practical approach to the Neyman construction of confidence sets with frequentist finite-sample coverage for any value of the unknown parameters; and (ii) interpretable diagnostics that estimate the empirical coverage across the entire parameter space. We refer to the general framework as likelihood-free frequentist inference (LF2I). Any method that defines a test statistic can leverage LF2I to create valid confidence sets and diagnostics without costly Monte Carlo samples at fixed parameter settings. We study the power of two likelihood-based test statistics (ACORE and BFF) and demonstrate their empirical performance on high-dimensional, complex data. Code is available at https://github.com/lee-group-cmu/lf2i.
Submitted 19 November, 2023; v1 submitted 8 July, 2021;
originally announced July 2021.
-
When the Oracle Misleads: Modeling the Consequences of Using Observable Rather than Potential Outcomes in Risk Assessment Instruments
Authors:
Alan Mishler,
Niccolò Dalmasso
Abstract:
Risk Assessment Instruments (RAIs) are widely used to forecast adverse outcomes in domains such as healthcare and criminal justice. RAIs are commonly trained on observational data and are optimized to predict observable outcomes rather than potential outcomes, which are the outcomes that would occur absent a particular intervention. Examples of relevant potential outcomes include whether a patient's condition would worsen without treatment or whether a defendant would recidivate if released pretrial. We illustrate how RAIs which are trained to predict observable outcomes can lead to worse decision making, causing precisely the types of harm they are intended to prevent. This can occur even when the predictors are Bayes-optimal and there is no unmeasured confounding.
Submitted 5 April, 2021;
originally announced April 2021.
-
Diagnostics for Conditional Density Models and Bayesian Inference Algorithms
Authors:
David Zhao,
Niccolò Dalmasso,
Rafael Izbicki,
Ann B. Lee
Abstract:
There has been growing interest in the AI community for precise uncertainty quantification. Conditional density models f(y|x), where x represents potentially high-dimensional features, are an integral part of uncertainty quantification in prediction and Bayesian inference. However, it is challenging to assess conditional density estimates and gain insight into modes of failure. While existing diagnostic tools can determine whether an approximated conditional density is compatible overall with a data sample, they lack a principled framework for identifying, locating, and interpreting the nature of statistically significant discrepancies over the entire feature space. In this paper, we present rigorous and easy-to-interpret diagnostics such as (i) the "Local Coverage Test" (LCT), which distinguishes an arbitrarily misspecified model from the true conditional density of the sample, and (ii) "Amortized Local P-P plots" (ALP) which can quickly provide interpretable graphical summaries of distributional differences at any location x in the feature space. Our validation procedures scale to high dimensions and can potentially adapt to any type of data at hand. We demonstrate the effectiveness of LCT and ALP through a simulated experiment and applications to prediction and parameter inference for image data.
Submitted 23 July, 2021; v1 submitted 20 February, 2021;
originally announced February 2021.
-
Structural Forecasting for Tropical Cyclone Intensity Prediction: Providing Insight with Deep Learning
Authors:
Trey McNeely,
Niccolò Dalmasso,
Kimberly M. Wood,
Ann B. Lee
Abstract:
Tropical cyclone (TC) intensity forecasts are ultimately issued by human forecasters. The human in-the-loop pipeline requires that any forecasting guidance must be easily digestible by TC experts if it is to be adopted at operational centers like the National Hurricane Center. Our proposed framework leverages deep learning to provide forecasters with something neither end-to-end prediction models nor traditional intensity guidance does: a powerful tool for monitoring high-dimensional time series of key physically relevant predictors and the means to understand how the predictors relate to one another and to short-term intensity changes.
Submitted 7 December, 2020; v1 submitted 7 October, 2020;
originally announced October 2020.
-
HECT: High-Dimensional Ensemble Consistency Testing for Climate Models
Authors:
Niccolò Dalmasso,
Galen Vincent,
Dorit Hammerling,
Ann B. Lee
Abstract:
Climate models play a crucial role in understanding the effect of environmental and man-made changes on climate to help mitigate climate risks and inform governmental decisions. Large global climate models such as the Community Earth System Model (CESM), developed by the National Center for Atmospheric Research, are very complex with millions of lines of code describing interactions of the atmosphere, land, oceans, and ice, among other components. As development of the CESM is constantly ongoing, simulation outputs need to be continuously controlled for quality. To be able to distinguish a "climate-changing" modification of the code base from a true climate-changing physical process or intervention, there needs to be a principled way of assessing statistical reproducibility that can handle both spatial and temporal high-dimensional simulation outputs. Our proposed work uses probabilistic classifiers like tree-based algorithms and deep neural networks to perform a statistically rigorous goodness-of-fit test of high-dimensional spatio-temporal data.
Submitted 30 November, 2020; v1 submitted 8 October, 2020;
originally announced October 2020.
-
Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting
Authors:
Niccolò Dalmasso,
Rafael Izbicki,
Ann B. Lee
Abstract:
Parameter estimation, statistical tests and confidence sets are the cornerstones of classical statistics that allow scientists to make inferences about the underlying process that generated the observed data. A key question is whether one can still construct hypothesis tests and confidence sets with proper coverage and high power in a so-called likelihood-free inference (LFI) setting; that is, a setting where the likelihood is not explicitly known but one can forward-simulate observable data according to a stochastic model. In this paper, we present $\texttt{ACORE}$ (Approximate Computation via Odds Ratio Estimation), a frequentist approach to LFI that first formulates the classical likelihood ratio test (LRT) as a parametrized classification problem, and then uses the equivalence of tests and confidence sets to build confidence regions for parameters of interest. We also present a goodness-of-fit procedure for checking whether the constructed tests and confidence regions are valid. $\texttt{ACORE}$ is based on the key observation that the LRT statistic, the rejection probability of the test, and the coverage of the confidence set are conditional distribution functions which often vary smoothly as a function of the parameters of interest. Hence, instead of relying solely on samples simulated at fixed parameter settings (as is the convention in standard Monte Carlo solutions), one can leverage machine learning tools and data simulated in the neighborhood of a parameter to improve estimates of quantities of interest. We demonstrate the efficacy of $\texttt{ACORE}$ with both theoretical and empirical results. Our implementation is available on GitHub.
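As a rough illustration of turning a likelihood ratio test into a classification problem, the sketch below computes an odds-ratio statistic on a parameter grid. It is a toy under stated assumptions, not the authors' implementation: `exact_odds` is a hypothetical stand-in for the odds that a trained probabilistic classifier would estimate, taken here in closed form for data from N(theta, 1) against a reference N(0, 1) with balanced classes, and all function names are illustrative.

```python
import math


def exact_odds(theta, x):
    """Hypothetical stand-in for classifier odds P(Y=1|x,theta) / P(Y=0|x,theta).

    Toy assumption: class 1 is N(theta, 1), class 0 is a reference N(0, 1),
    with balanced classes, so the odds equal the density ratio.
    """
    return math.exp(theta * x - 0.5 * theta ** 2)


def log_odds_sum(theta, data, odds=exact_odds):
    """Sum of log-odds over the observed sample at a given theta."""
    return sum(math.log(odds(theta, x)) for x in data)


def odds_ratio_statistic(theta0, data, theta_grid, odds=exact_odds):
    """Log odds at theta0 minus the maximum log odds over the grid.

    Values near 0 mean theta0 is about as well supported as the best
    grid point; strongly negative values count as evidence against theta0.
    """
    numerator = log_odds_sum(theta0, data, odds)
    denominator = max(log_odds_sum(t, data, odds) for t in theta_grid)
    return numerator - denominator


data = [0.8, 1.3, 0.9, 1.0]               # toy observations with mean 1.0
grid = [i / 10 for i in range(-30, 31)]   # candidate parameter values
stat_at_1 = odds_ratio_statistic(1.0, data, grid)
stat_at_minus_1 = odds_ratio_statistic(-1.0, data, grid)
```

In practice the odds function would be replaced by a fitted classifier, and the rejection cutoff for the statistic would itself have to be estimated; neither step is shown here.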
Submitted 13 August, 2020; v1 submitted 24 February, 2020;
originally announced February 2020.
-
Explicit Group Sparse Projection with Applications to Deep Learning and NMF
Authors:
Riyasat Ohib,
Nicolas Gillis,
Niccolò Dalmasso,
Sameena Shah,
Vamsi K. Potluru,
Sergey Plis
Abstract:
We design a new sparse projection method for a set of vectors that guarantees a desired average sparsity level measured leveraging the popular Hoyer measure (an affine function of the ratio of the $\ell_1$ and $\ell_2$ norms). Existing approaches either project each vector individually or require the use of a regularization parameter which implicitly maps to the average $\ell_0$-measure of sparsity. Instead, in our approach we set the sparsity level for the whole set explicitly and simultaneously project a group of vectors with the sparsity level of each vector tuned automatically. We show that the computational complexity of our projection operator is linear in the size of the problem. Additionally, we propose a generalization of this projection by replacing the $\ell_1$ norm by its weighted version. We showcase the efficacy of our approach in both supervised and unsupervised learning tasks on image datasets including CIFAR10 and ImageNet. In deep neural network pruning, the sparse models produced by our method on ResNet50 have significantly higher accuracies at corresponding sparsity values compared to existing competitors. In nonnegative matrix factorization, our approach yields competitive reconstruction errors against state-of-the-art algorithms.
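For readers unfamiliar with the Hoyer measure mentioned above, a minimal sketch follows. It uses the standard definition, (sqrt(n) - l1/l2) / (sqrt(n) - 1), which is affine in the ratio of the two norms; the function name `hoyer_sparsity` is illustrative, and the sketch covers only the per-vector measure, not the group projection proposed in the paper.

```python
import math


def hoyer_sparsity(x):
    """Hoyer sparsity of a vector: 1 for a one-hot vector, 0 for a flat one."""
    n = len(x)
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    if n < 2 or l2 == 0:
        raise ValueError("need at least two entries and a nonzero vector")
    # Affine function of the l1/l2 ratio, rescaled to lie in [0, 1].
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)


one_hot = hoyer_sparsity([0.0, 0.0, 3.0, 0.0])   # maximally sparse
flat = hoyer_sparsity([1.0, 1.0, 1.0, 1.0])      # maximally dense
```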
Submitted 18 February, 2022; v1 submitted 9 December, 2019;
originally announced December 2019.
-
Robust Learning Rate Selection for Stochastic Optimization via Splitting Diagnostic
Authors:
Matteo Sordello,
Niccolò Dalmasso,
Hangfeng He,
Weijie Su
Abstract:
This paper proposes SplitSGD, a new dynamic learning rate schedule for stochastic optimization. This method decreases the learning rate for better adaptation to the local geometry of the objective function whenever a stationary phase is detected, that is, when the iterates are likely to bounce around in the vicinity of a local minimum. The detection is performed by splitting the single thread into two and using the inner product of the gradients from the two threads as a measure of stationarity. Owing to this simple yet provably valid stationarity detection, SplitSGD is easy to implement and incurs essentially no additional computational cost compared to standard SGD. Through a series of extensive experiments, we show that this method is appropriate for both convex problems and training (non-convex) neural networks, with performance that compares favorably to other stochastic optimization methods. Importantly, this method is observed to be very robust with a set of default parameters for a wide range of problems and, moreover, can yield better generalization performance than other adaptive gradient methods such as Adam.
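The inner-product check described in the abstract can be sketched as follows. This is only an illustration under assumptions: the abstract does not say how gradients are windowed or what decision rule is used, so the per-window averaged gradients, the `threshold` value, and the function names here are all hypothetical choices.

```python
def inner_product(u, v):
    """Dot product of two gradient vectors given as sequences of floats."""
    return sum(a * b for a, b in zip(u, v))


def looks_stationary(grads_a, grads_b, threshold=0.5):
    """Flag stationarity when the two threads' gradients often disagree.

    grads_a, grads_b: one (averaged) gradient per window from each thread.
    Far from a minimum both threads tend to point downhill together, giving
    positive inner products; near a minimum the gradients are mostly noise,
    so negative inner products become common.
    threshold is an assumed cutoff on the fraction of negative products.
    """
    products = [inner_product(ga, gb) for ga, gb in zip(grads_a, grads_b)]
    negative_fraction = sum(p < 0 for p in products) / len(products)
    return negative_fraction >= threshold


# Both threads still descending in the same direction: not stationary.
descending = looks_stationary([[1.0, 2.0], [0.9, 1.8]], [[1.1, 1.9], [1.0, 2.1]])
# Gradients pointing in opposing directions: treated as stationary.
bouncing = looks_stationary([[0.1, -0.2], [-0.1, 0.1]], [[-0.1, 0.2], [0.2, -0.1]])
```

A schedule built on this check would presumably lower the learning rate whenever it returns True and otherwise continue unchanged.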
Submitted 16 February, 2024; v1 submitted 18 October, 2019;
originally announced October 2019.
-
Conditional Density Estimation Tools in Python and R with Applications to Photometric Redshifts and Likelihood-Free Cosmological Inference
Authors:
Niccolò Dalmasso,
Taylor Pospisil,
Ann B. Lee,
Rafael Izbicki,
Peter E. Freeman,
Alex I. Malz
Abstract:
It is well known in astronomy that propagating non-Gaussian prediction uncertainty in photometric redshift estimates is key to reducing bias in downstream cosmological analyses. Similarly, likelihood-free inference approaches, which are beginning to emerge as a tool for cosmological analysis, require a characterization of the full uncertainty landscape of the parameters of interest given observed data. However, most machine learning (ML) or training-based methods with open-source software target point prediction or classification, and hence fall short in quantifying uncertainty in complex regression and parameter inference settings. As an alternative to methods that focus on predicting the response (or parameters) $\mathbf{y}$ from features $\mathbf{x}$, we provide nonparametric conditional density estimation (CDE) tools for approximating and validating the entire probability density function (PDF) $\mathrm{p}(\mathbf{y}|\mathbf{x})$ of $\mathbf{y}$ given (i.e., conditional on) $\mathbf{x}$. As there is no one-size-fits-all CDE method, the goal of this work is to provide a comprehensive range of statistical tools and open-source software for nonparametric CDE and method assessment which can accommodate different types of settings and be easily fit to the problem at hand. Specifically, we introduce four CDE software packages in $\texttt{Python}$ and $\texttt{R}$ based on ML prediction methods adapted and optimized for CDE: $\texttt{NNKCDE}$, $\texttt{RFCDE}$, $\texttt{FlexCode}$, and $\texttt{DeepCDE}$. Furthermore, we present the $\texttt{cdetools}$ package, which includes functions for computing a CDE loss function for tuning and assessing the quality of individual PDFs, along with diagnostic functions. We provide sample code in $\texttt{Python}$ and $\texttt{R}$ as well as examples of applications to photometric redshift estimation and likelihood-free cosmological inference via CDE.
Submitted 20 December, 2019; v1 submitted 29 August, 2019;
originally announced August 2019.
-
A Flexible Pipeline for Prediction of Tropical Cyclone Paths
Authors:
Niccolò Dalmasso,
Robin Dunn,
Benjamin LeRoy,
Chad Schafer
Abstract:
Hurricanes and, more generally, tropical cyclones (TCs) are rare, complex natural phenomena of both scientific and public interest. The importance of understanding TCs in a changing climate has increased as recent TCs have had devastating impacts on human lives and communities. Moreover, good prediction and understanding about the complex nature of TCs can mitigate some of these human and property losses. Though TCs have been studied from many different angles, more work is needed from a statistical approach of providing prediction regions. The current state-of-the-art in TC prediction bands comes from the National Hurricane Center of the National Oceanic and Atmospheric Administration (NOAA), whose proprietary model provides "cones of uncertainty" for TCs through an analysis of historical forecast errors.
The contribution of this paper is twofold. We introduce a new pipeline that encourages transparent and adaptable prediction band development by streamlining cyclone track simulation and prediction band generation. We also provide updates to existing models and novel statistical methodologies in both areas of the pipeline, respectively.
Submitted 20 June, 2019;
originally announced June 2019.
-
Validation of Approximate Likelihood and Emulator Models for Computationally Intensive Simulations
Authors:
Niccolò Dalmasso,
Ann B. Lee,
Rafael Izbicki,
Taylor Pospisil,
Ilmun Kim,
Chieh-An Lin
Abstract:
Complex phenomena in engineering and the sciences are often modeled with computationally intensive feed-forward simulations for which a tractable analytic likelihood does not exist. In these cases, it is sometimes necessary to estimate an approximate likelihood or fit a fast emulator model for efficient statistical inference; such surrogate models include Gaussian synthetic likelihoods and more recently neural density estimators such as autoregressive models and normalizing flows. To date, however, there is no consistent way of quantifying the quality of such a fit. Here we propose a statistical framework that can distinguish any arbitrary misspecified model from the target likelihood, and that in addition can identify with statistical confidence the regions of parameter as well as feature space where the fit is inadequate. Our validation method applies to settings where simulations are extremely costly and generated in batches or "ensembles" at fixed locations in parameter space. At the heart of our approach is a two-sample test that quantifies the quality of the fit at fixed parameter values, and a global test that assesses goodness-of-fit across simulation parameters. While our general framework can incorporate any test statistic or distance metric, we specifically argue for a new two-sample test that can leverage any regression method to attain high power and provide diagnostics in complex data settings.
Submitted 2 December, 2019; v1 submitted 27 May, 2019;
originally announced May 2019.
-
Clarifying the Hubble constant tension with a Bayesian hierarchical model of the local distance ladder
Authors:
Stephen M. Feeney,
Daniel J. Mortlock,
Niccolò Dalmasso
Abstract:
Estimates of the Hubble constant, $H_0$, from the distance ladder and the cosmic microwave background (CMB) differ at the $\sim$3-$σ$ level, indicating a potential issue with the standard $Λ$CDM cosmology. Interpreting this tension correctly requires a model comparison calculation depending on not only the traditional `$n$-$σ$' mismatch but also the tails of the likelihoods. Determining the form of the tails of the local $H_0$ likelihood is impossible with the standard Gaussian least-squares approximation, as it requires using non-Gaussian distributions to faithfully represent anchor likelihoods and model outliers in the Cepheid and supernova (SN) populations, and simultaneous fitting of the full distance-ladder dataset to correctly propagate uncertainties. We have developed a Bayesian hierarchical model that describes the full distance ladder, from nearby geometric anchors through Cepheids to Hubble-Flow SNe. This model does not rely on any distributions being Gaussian, allowing outliers to be modeled and obviating the need for arbitrary data cuts. Sampling from the $\sim$3000-parameter joint posterior using Hamiltonian Monte Carlo, we find $H_0$ = (72.72 $\pm$ 1.67) ${\rm km\,s^{-1}\,Mpc^{-1}}$ when applied to the outlier-cleaned Riess et al. (2016) data, and ($73.15 \pm 1.78$) ${\rm km\,s^{-1}\,Mpc^{-1}}$ with SN outliers reintroduced. Our high-fidelity sampling of the low-$H_0$ tail of the distance-ladder likelihood allows us to apply Bayesian model comparison to assess the evidence for deviation from $Λ$CDM. We set up this comparison to yield a lower limit on the odds of the underlying model being $Λ$CDM given the distance-ladder and Planck XIII (2016) CMB data. The odds against $Λ$CDM are at worst 10:1 or 7:1, depending on whether the SNe outliers are cut or modeled, or 60:1 if an approximation to the Planck Int. XLVI (2016) likelihood is used.
Submitted 8 November, 2017; v1 submitted 30 June, 2017;
originally announced July 2017.