-
X-Ray Constraints on Dark Photon Tridents
Authors:
Tim Linden,
Thong T. Q. Nguyen,
Tim M. P. Tait
Abstract:
Dark photons that are sufficiently light and/or weakly-interacting represent a compelling vision of dark matter. Dark photon decay into three photons, which we call the dark photon trident, can be the dominant channel when the dark photon mass falls below the electron pair threshold and can produce a significant flux of x-rays. We use 16 years of data from INTEGRAL/SPI to constrain sub-MeV dark photon decay, producing new world's-best constraints on the kinetic mixing parameter for dark photon masses between 61 keV and 1022 keV, and comment on the potential for future x-ray observatories to discover the trident decay process.
Submitted 27 June, 2024;
originally announced June 2024.
-
Multi-target and multi-stage liver lesion segmentation and detection in multi-phase computed tomography scans
Authors:
Abdullah F. Al-Battal,
Soan T. M. Duong,
Van Ha Tang,
Quang Duc Tran,
Steven Q. H. Truong,
Chien Phan,
Truong Q. Nguyen,
Cheolhong An
Abstract:
Multi-phase computed tomography (CT) scans use contrast agents to highlight different anatomical structures within the body to improve the probability of identifying and detecting anatomical structures of interest and abnormalities such as liver lesions. Yet, detecting these lesions remains a challenging task as they vary significantly in their size, shape, texture, and contrast with respect to surrounding tissue. Therefore, radiologists need extensive experience to be able to identify and detect these lesions. Segmentation-based neural networks can assist radiologists with this task. Current state-of-the-art lesion segmentation networks use the encoder-decoder design paradigm based on the UNet architecture where the multi-phase CT scan volume is fed to the network as a multi-channel input. Although this approach utilizes information from all the phases and outperforms single-phase segmentation networks, we demonstrate that its performance is not optimal and can be further improved by incorporating the learning from models trained on each single phase individually. Our approach comprises three stages. The first stage identifies the regions within the liver where there might be lesions at three different scales (4, 8, and 16 mm). The second stage includes the main segmentation model trained using all the phases as well as a segmentation model trained on each of the phases individually. The third stage uses the multi-phase CT volumes together with the predictions from each of the segmentation models to generate the final segmentation map. Overall, our approach improves relative liver lesion segmentation performance by 1.6% while reducing performance variability across subjects by 8% when compared to the current state-of-the-art models.
Submitted 17 April, 2024;
originally announced April 2024.
-
Indirect Searches for Dark Photon-Photon Tridents in Celestial Objects
Authors:
Tim Linden,
Thong T. Q. Nguyen,
Tim M. P. Tait
Abstract:
We model and constrain the unique indirect detection signature produced by dark matter particles that annihilate through a $U(1)$ gauge symmetry into dark photons that subsequently decay into three-photon final states. We focus on scenarios where the dark photon is long-lived, and show that $γ$-ray probes of celestial objects can set strong constraints on the dark matter/baryon scattering cross section that in many cases surpass the power of current direct detection constraints, and in some cases even peer into the neutrino fog.
Submitted 2 February, 2024;
originally announced February 2024.
-
Practical challenges in mediation analysis: A guide for applied researchers
Authors:
Megan S. Schuler,
Donna L. Coffman,
Elizabeth A. Stuart,
Trang Q. Nguyen,
Brian Vegetabile,
Daniel F. McCaffrey
Abstract:
Mediation analysis is a statistical approach that can provide insights regarding the intermediary processes by which an intervention or exposure affects a given outcome. Mediation analysis rose to prominence, particularly in social science research, with the publication of the seminal paper by Baron and Kenny and is now commonly applied in many research disciplines, including health services research. Despite the growth in popularity, applied researchers may still encounter challenges in terms of conducting mediation analyses in practice. In this paper, we provide an overview of conceptual and methodological challenges that researchers face when conducting mediation analyses. Specifically, we discuss the following key challenges: (1) Conceptually differentiating mediators from other third variables, (2) Extending beyond the single mediator context, (3) Identifying appropriate datasets in which measurement and temporal ordering support the hypothesized mediation model, (4) Selecting mediation effects that reflect the scientific question of interest, (5) Assessing the validity of underlying assumptions of no omitted confounders, (6) Addressing measurement error regarding the mediator, and (7) Clearly reporting results from mediation analyses. We discuss each challenge and highlight ways in which the applied researcher can approach these challenges.
Submitted 1 February, 2024;
originally announced February 2024.
-
Celestial Objects as Dark Matter Colliders
Authors:
Thong T. Q. Nguyen
Abstract:
In the vicinity of the Milky Way Galactic Center, celestial bodies, including neutron stars, reside within a dense dark matter environment. This study explores the accumulation of dark matter by neutron stars through dark matter-nucleon interactions, leading to increased internal dark matter density. Consequently, dark matter annihilation produces long-lived mediators that escape and decay into neutrinos. Leveraging experimental limits from IceCube, ANTARES, and future projections from ARIA, we establish constraints on the dark matter-nucleon cross section within a simplified dark $U(1)_{X}$ mediator model. This approach, applicable to various celestial objects and dark matter models, offers insights into the intricate interplay between dark matter and neutron stars near the Galactic Center.
Submitted 19 December, 2023;
originally announced December 2023.
-
Identification of complier and noncomplier average causal effects in the presence of latent missing-at-random (LMAR) outcomes: a unifying view and choices of assumptions
Authors:
Trang Quynh Nguyen,
Michelle C. Carlson,
Elizabeth A. Stuart
Abstract:
The study of treatment effects is often complicated by noncompliance and missing data. In the one-sided noncompliance setting where of interest are the complier and noncomplier average causal effects (CACE and NACE), we address outcome missingness of the \textit{latent missing at random} type (LMAR, also known as \textit{latent ignorability}). That is, conditional on covariates and treatment assigned, the missingness may depend on compliance type. Within the instrumental variable (IV) approach to noncompliance, methods have been proposed for handling LMAR outcome that additionally invoke an exclusion restriction type assumption on missingness, but no solution has been proposed for when a non-IV approach is used. This paper focuses on effect identification in the presence of LMAR outcome, with a view to flexibly accommodate different principal identification approaches. We show that under treatment assignment ignorability and LMAR only, effect nonidentifiability boils down to a set of two connected mixture equations involving unidentified stratum-specific response probabilities and outcome means. This clarifies that (except for a special case) effect identification generally requires two additional assumptions: a \textit{specific missingness mechanism} assumption and a \textit{principal identification} assumption. This provides a template for identifying effects based on separate choices of these assumptions. We consider a range of specific missingness assumptions, including those that have appeared in the literature and some new ones. Incidentally, we find an issue in the existing assumptions, and propose a modification of the assumptions to avoid the issue. Results under different assumptions are illustrated using data from the Baltimore Experience Corps Trial.
Submitted 18 December, 2023;
originally announced December 2023.
-
Self-interacting Vectorial Dark Matter in a SM-like Dark Sector
Authors:
Van Que Tran,
Thong T. Q. Nguyen,
Tzu-Chiang Yuan
Abstract:
An $SU(2)_D \times U(1)_D$ gauge-Higgs sector, an exact dark copy of the Standard Model (SM) one, is proposed. It is demonstrated that the dark gauge bosons ${\cal W}^{(p,m)}$, in analogy to the SM $W^\pm$, can fulfill the role of a self-interacting vector dark matter candidate, solving the core versus cusp and missing satellites problems faced by the conventional paradigm of collisionless weakly interacting massive particles. Constraints from collider, astroparticle and cosmology on such a self-interacting vector dark matter candidate are scrutinized. Implications for the future searches of ${\cal W}^{(p,m)}$ in direct detection experiments are discussed.
Submitted 17 December, 2023;
originally announced December 2023.
-
A multi-institutional pediatric dataset of clinical radiology MRIs by the Children's Brain Tumor Network
Authors:
Ariana M. Familiar,
Anahita Fathi Kazerooni,
Hannah Anderson,
Aliaksandr Lubneuski,
Karthik Viswanathan,
Rocky Breslow,
Nastaran Khalili,
Sina Bagheri,
Debanjan Haldar,
Meen Chul Kim,
Sherjeel Arif,
Rachel Madhogarhia,
Thinh Q. Nguyen,
Elizabeth A. Frenkel,
Zeinab Helili,
Jessica Harrison,
Keyvan Farahani,
Marius George Linguraru,
Ulas Bagci,
Yury Velichko,
Jeffrey Stevens,
Sarah Leary,
Robert M. Lober,
Stephani Campion,
Amy A. Smith
, et al. (15 additional authors not shown)
Abstract:
Pediatric brain and spinal cancers remain the leading cause of cancer-related death in children. Advancements in clinical decision-support in pediatric neuro-oncology utilizing the wealth of radiology imaging data collected through standard care, however, have significantly lagged behind other domains. Such data is ripe for use with predictive analytics such as artificial intelligence (AI) methods, which require large datasets. To address this unmet need, we provide a multi-institutional, large-scale pediatric dataset of 23,101 multi-parametric MRI exams acquired through routine care for 1,526 brain tumor patients, as part of the Children's Brain Tumor Network. This includes longitudinal MRIs across various cancer diagnoses, with associated patient-level clinical information, digital pathology slides, as well as tissue genotype and omics data. To facilitate downstream analysis, treatment-naïve images for 370 subjects were processed and released through the NCI Childhood Cancer Data Initiative via the Cancer Data Service. Through ongoing efforts to continuously build these imaging repositories, our aim is to accelerate discovery and translational AI models with real-world data, to ultimately empower precision medicine for children.
Submitted 2 October, 2023;
originally announced October 2023.
-
Advances in Kidney Biopsy Lesion Assessment through Dense Instance Segmentation
Authors:
Zhan Xiong,
Junling He,
Pieter Valkema,
Tri Q. Nguyen,
Maarten Naesens,
Jesper Kers,
Fons J. Verbeek
Abstract:
Renal biopsies are the gold standard for diagnosis of kidney diseases. Lesion scores made by renal pathologists are semi-quantitative and exhibit high inter-observer variability. Automating lesion classification within segmented anatomical structures can provide decision support in quantification analysis and reduce the inter-observer variability. Nevertheless, classifying lesions in regions-of-interest (ROIs) is clinically challenging due to (a) a large number of densely packed anatomical objects (up to 1000), (b) class imbalance across different compartments (at least 3), (c) significant variation in object scales (i.e. sizes and shapes), and (d) the presence of multi-label lesions per anatomical structure. Existing models lack the capacity to address these complexities efficiently and generically. This paper presents \textbf{a generalized technical solution} for large-scale, multi-source datasets with diverse lesions. Our approach utilizes two sub-networks: dense instance segmentation and lesion classification. We introduce \textbf{DiffRegFormer}, an end-to-end dense instance segmentation model designed for multi-class, multi-scale objects within ROIs. Combining diffusion models, transformers, and RCNNs, DiffRegFormer efficiently recognizes over 500 objects across three anatomical classes (glomeruli, tubuli, arteries) within ROIs on a single NVIDIA GeForce RTX 3090 GPU. On a dataset of 303 ROIs (from 148 Jones' silver-stained renal WSIs), it outperforms state-of-the-art models, achieving an AP of 52.1\% (detection) and 46.8\% (segmentation). Our lesion classification sub-network achieves 89.2\% precision and 64.6\% recall on 21889 object patches (from the 303 ROIs). Importantly, the model demonstrates direct domain transfer to PAS-stained WSIs without fine-tuning.
Submitted 28 March, 2024; v1 submitted 29 September, 2023;
originally announced September 2023.
-
Natural Language Commanding via Program Synthesis
Authors:
Apurva Gandhi,
Thong Q. Nguyen,
Huitian Jiao,
Robert Steen,
Ameya Bhatawdekar
Abstract:
We present Semantic Interpreter, a natural language-friendly AI system for productivity software such as Microsoft Office that leverages large language models (LLMs) to execute user intent across application features. While LLMs are excellent at understanding user intent expressed as natural language, they are not sufficient for fulfilling application-specific user intent that requires more than text-to-text transformations. We therefore introduce the Office Domain Specific Language (ODSL), a concise, high-level language specialized for performing actions in and interacting with entities in Office applications. Semantic Interpreter leverages an Analysis-Retrieval prompt construction method with LLMs for program synthesis, translating natural language user utterances to ODSL programs that can be transpiled to application APIs and then executed. We focus our discussion primarily on a research exploration for Microsoft PowerPoint.
Submitted 6 June, 2023;
originally announced June 2023.
-
Deep learning network to correct axial and coronal eye motion in 3D OCT retinal imaging
Authors:
Yiqian Wang,
Alexandra Warter,
Melina Cavichini,
Varsha Alex,
Dirk-Uwe G. Bartsch,
William R. Freeman,
Truong Q. Nguyen,
Cheolhong An
Abstract:
Optical Coherence Tomography (OCT) is one of the most important retinal imaging techniques. However, involuntary motion artifacts still pose a major challenge in OCT imaging that compromises the quality of downstream analysis, such as retinal layer segmentation and OCT Angiography. We propose deep learning based neural networks to correct axial and coronal motion artifacts in OCT based on a single volumetric scan. The proposed method consists of two fully-convolutional neural networks that predict Z and X dimensional displacement maps sequentially in two stages. The experimental results show that the proposed method can effectively correct motion artifacts and achieve smaller error than other methods. Specifically, the method can recover the overall curvature of the retina, and generalizes well to various diseases and resolutions.
Submitted 26 May, 2023;
originally announced May 2023.
-
Comparison of Methods that Combine Multiple Randomized Trials to Estimate Heterogeneous Treatment Effects
Authors:
Carly Lupton Brantner,
Trang Quynh Nguyen,
Tengjie Tang,
Congwen Zhao,
Hwanhee Hong,
Elizabeth A. Stuart
Abstract:
Individualized treatment decisions can improve health outcomes, but using data to make these decisions in a reliable, precise, and generalizable way is challenging with a single dataset. Leveraging multiple randomized controlled trials allows for the combination of datasets with unconfounded treatment assignment to better estimate heterogeneous treatment effects. This paper discusses several non-parametric approaches for estimating heterogeneous treatment effects using data from multiple trials. We extend single-study methods to a scenario with multiple trials and explore their performance through a simulation study, with data generation scenarios that have differing levels of cross-trial heterogeneity. The simulations demonstrate that methods that directly allow for heterogeneity of the treatment effect across trials perform better than methods that do not, and that the choice of single-study method matters based on the functional form of the treatment effect. Finally, we discuss which methods perform well in each setting and then apply them to four randomized controlled trials to examine effect heterogeneity of treatments for major depressive disorder.
Submitted 15 November, 2023; v1 submitted 28 March, 2023;
originally announced March 2023.
-
3D Facial Imperfection Regeneration: Deep learning approach and 3D printing prototypes
Authors:
Phuong D. Nguyen,
Thinh D. Le,
Duong Q. Nguyen,
Thanh Q. Nguyen,
Li-Wei Chou,
H. Nguyen-Xuan
Abstract:
This study explores the potential of a fully convolutional mesh autoencoder model for regenerating 3D nature faces with the presence of imperfect areas. We utilize deep learning approaches in graph processing and analysis to investigate the model's capabilities in recreating a filling part for facial scars. Our approach to dataset creation is able to generate a facial scar rationally in a virtual space that corresponds to the unique circumstances. In particular, we propose a new method, named 3D Facial Imperfection Regeneration (3D-FaIR), for reproducing a complete face reconstruction based on the remaining features of the patient's face. To further enhance the applicability of the present research, we develop an improved outlier technique to separate the wounds of patients and provide appropriate wound cover models. Also, a Cir3D-FaIR dataset of imperfect faces and open code were released at https://github.com/SIMOGroup/3DFaIR. Our findings demonstrate the potential of the proposed approach to help patients recover more quickly and safely through convenient techniques. We hope that this research can contribute to the development of new products and innovative solutions for facial scar regeneration.
Submitted 25 March, 2023;
originally announced March 2023.
-
Sensitivity analysis for principal ignorability violation in estimating complier and noncomplier average causal effects
Authors:
Trang Quynh Nguyen,
Elizabeth A. Stuart,
Daniel O. Scharfstein,
Elizabeth L. Ogburn
Abstract:
An important strategy for identifying principal causal effects, which are often used in settings with noncompliance, is to invoke the principal ignorability (PI) assumption. As PI is untestable, it is important to gauge how sensitive effect estimates are to its violation. We focus on this task for the common one-sided noncompliance setting where there are two principal strata, compliers and noncompliers. Under PI, compliers and noncompliers share the same outcome-mean-given-covariates function under the control condition. For sensitivity analysis, we allow this function to differ between compliers and noncompliers in several ways, indexed by an odds ratio, a generalized odds ratio, a mean ratio, or a standardized mean difference sensitivity parameter. We tailor sensitivity analysis techniques (with any sensitivity parameter choice) to several types of PI-based main analysis methods, including outcome regression, influence function (IF) based and weighting methods. We illustrate the proposed sensitivity analyses using several outcome types from the JOBS II study. This application estimates nuisance functions parametrically -- for simplicity and accessibility. In addition, we establish rate conditions on nonparametric nuisance estimation for IF-based estimators to be asymptotically normal -- with a view to inform nonparametric inference.
Submitted 28 March, 2024; v1 submitted 8 March, 2023;
originally announced March 2023.
-
Methods for Integrating Trials and Non-Experimental Data to Examine Treatment Effect Heterogeneity
Authors:
Carly Lupton Brantner,
Ting-Hsuan Chang,
Trang Quynh Nguyen,
Hwanhee Hong,
Leon Di Stefano,
Elizabeth A. Stuart
Abstract:
Estimating treatment effects conditional on observed covariates can improve the ability to tailor treatments to particular individuals. Doing so effectively requires dealing with potential confounding, and also enough data to adequately estimate effect moderation. A recent influx of work has looked into estimating treatment effect heterogeneity using data from multiple randomized controlled trials and/or observational datasets. With many new methods available for assessing treatment effect heterogeneity using multiple studies, it is important to understand which methods are best used in which setting, how the methods compare to one another, and what needs to be done to continue progress in this field. This paper reviews these methods broken down by data setting: aggregate-level data, federated learning, and individual participant-level data. We define the conditional average treatment effect and discuss differences between parametric and nonparametric estimators, and we list key assumptions, both those that are required within a single study and those that are necessary for data combination. After describing existing approaches, we compare and contrast them and reveal open areas for future research. This review demonstrates that there are many possible approaches for estimating treatment effect heterogeneity through the combination of datasets, but that there is substantial work to be done to compare these methods through case studies and simulations, extend them to different settings, and refine them to account for various challenges present in real data.
Submitted 28 March, 2023; v1 submitted 26 February, 2023;
originally announced February 2023.
-
Leptoquark search at the Forward Physics Facility
Authors:
Kingman Cheung,
Thong T. Q. Nguyen,
C. J. Ouseph
Abstract:
In this study, we calculate the sensitivity reach on the vector leptoquark (LQ) $U_1$ at the experiments proposed for the Forward Physics Facility (FPF), including FASER$ν$, FASER$\nu2$, FLArE (10 tons), and FLArE (100 tons), using neutrino-nucleon scattering ($νN \rightarrow νN'$ and $νN \rightarrow l N'$). We cover a wide mass range of $10^{-3}$ GeV $\leq M_{LQ}\leq 10^4$ GeV. The new result shows that FLArE (100 tons) offers the best sensitivity to the LQ model. The sensitivity curves for all the experiments follow a similar pattern, with the sensitivity weakening as the LQ mass increases. We combine the sensitivities obtained from the neutral- and charged-current interactions of the neutrinos.
Submitted 15 August, 2023; v1 submitted 10 February, 2023;
originally announced February 2023.
-
Multiple imputation for propensity score analysis with covariates missing at random: some clarity on within and across methods
Authors:
Trang Quynh Nguyen,
Elizabeth A. Stuart
Abstract:
In epidemiology and social sciences, propensity score methods are popular for estimating treatment effects using observational data, and multiple imputation is popular for handling covariate missingness. However, how to appropriately use multiple imputation for propensity score analysis is not completely clear. This paper aims to bring clarity on the consistency (or lack thereof) of methods that have been proposed, focusing on the within approach (where the effect is estimated separately in each imputed dataset and then the multiple estimates are combined) and the across approach (where typically propensity scores are averaged across imputed datasets before being used for effect estimation). We show that the within method is valid and can be used with any causal effect estimator that is consistent in the full-data setting. Existing across methods are inconsistent, but a different across method that averages the inverse probability weights across imputed datasets is consistent for propensity score weighting. We also comment on methods that rely on imputing a function of the missing covariate rather than the covariate itself, including imputation of the propensity score and of the probability weight. Based on consistency results and practical flexibility, we recommend generally using the standard within method. Throughout, we provide intuition to make the results meaningful to the broad audience of applied researchers.
Submitted 28 August, 2023; v1 submitted 17 January, 2023;
originally announced January 2023.
-
Bounds on Long-lived Dark Matter Mediators from Neutron Stars
Authors:
Thong T. Q. Nguyen,
Tim M. P. Tait
Abstract:
Neutron stars close to the Galactic center are expected to swim in a dense background of dark matter. For models in which the dark matter has efficient interactions with neutrons, they are expected to accumulate their own local cloud of dark matter, making them appealing targets for observations seeking signs of dark matter annihilation. For theories with very light mediators, the dark matter may annihilate into pairs of mediators which are sufficiently long-lived to escape the star and decay outside it into neutrinos. We examine the bounds on the parameter space of heavy ($\sim$TeV to $\sim$PeV) dark matter theories with long-lived mediators decaying into neutrinos based on observations of high energy neutrino observatories, and make projections for the future. We find that these observations provide information that is complementary to terrestrial searches, and probe otherwise inaccessible regimes of dark matter parameter space.
Submitted 15 June, 2023; v1 submitted 23 December, 2022;
originally announced December 2022.
-
Obliquely Scrutinizing a Hidden SM-like Gauge Model
Authors:
Van Que Tran,
Thong T. Q. Nguyen,
Tzu-Chiang Yuan
Abstract:
In view of the recent high precision measurement of the Standard Model $W$ boson mass at the CDF II detector, we compute the contributions to the oblique parameters $S$, $T$ and $U$ coming from the two additional Higgs doublets (one inert and one hidden) as well as the hidden neutral dark gauge bosons and extra heavy fermions in the gauged two-Higgs-doublet model (G2HDM). While the effects from the hidden Higgs doublet and new heavy fermions are found to be minuscule, the hidden gauge sector $SU(2)_H \times U(1)_X$ with gauge coupling strength $\gtrsim 10^{-2}$ and gauge boson mass $\gtrsim 100$ GeV can readily explain the $W$ boson mass anomaly but is nevertheless excluded by the dilepton high-mass resonance searches at the Large Hadron Collider. On the other hand, the new global fits to the oblique parameters due to the new $W$ boson mass measurement can have a discernible impact on the mass splitting and mixing angle for the inert Higgs doublet in G2HDM. We also study the impact on the signal strength of the diphoton mode of the 125 GeV Higgs boson $h \to γγ$ and the detectability of the yet-to-be-observed process $h \to Z γ$ at the High Luminosity Large Hadron Collider. Current constraints for the dark matter candidate $W^\prime$ including the dark matter relic density, dark matter direct detection and invisible Higgs decays are also taken into account in this study.
Submitted 23 August, 2022;
originally announced August 2022.
-
Generative Adversarial Network (GAN) and Enhanced Root Mean Square Error (ERMSE): Deep Learning for Stock Price Movement Prediction
Authors:
Ashish Kumar,
Abeer Alsadoon,
P. W. C. Prasad,
Salma Abdullah,
Tarik A. Rashid,
Duong Thu Hang Pham,
Tran Quoc Vinh Nguyen
Abstract:
The prediction of stock price movement direction is significant in both financial circles and academia. Stock prices contain complex, incomplete, and fuzzy information, which makes predicting their development trend extremely difficult. Predicting and analysing financial data is a nonlinear, time-dependent problem. With rapid development in machine learning and deep learning, this task can be performed more effectively by a purposely designed network. This paper aims to improve prediction accuracy and minimize forecasting error loss through a deep learning architecture based on Generative Adversarial Networks. The paper proposes a generic model consisting of a Phase-space Reconstruction (PSR) method for reconstructing price series and a Generative Adversarial Network (GAN) that combines two neural networks, a Long Short-Term Memory (LSTM) network as the generative model and a Convolutional Neural Network (CNN) as the discriminative model, trained adversarially to forecast the stock market. The LSTM generates new instances based on historical basic indicator information, and the CNN then estimates whether the data is predicted by the LSTM or is real. The GAN performed well on the enhanced root mean square error relative to the LSTM: it was 4.35% more accurate in predicting the direction and reduced processing time and RMSE by 78 secs and 0.029, respectively. The proposed system concentrates on minimizing the root mean square error and processing time and improving the direction prediction accuracy, and provides a better result in the accuracy of the stock index.
Submitted 30 November, 2021;
originally announced December 2021.
-
Low-temperature acanthite-like phase of Cu$_{2}$S: A first-principles study on electronic and transport properties
Authors:
Ho Ngoc Nam,
Katsuhiro Suzuki,
Tien Quang Nguyen,
Akira Masago,
Hikari Shinya,
Tetsuya Fukushima,
Kazunori Sato
Abstract:
The mobility and disorder in the lattice of Cu atoms as liquid-like behavior is an important characteristic affecting the thermoelectric properties of Cu$_{2}$S. In this study, using a theoretical model called acanthite-like structure for Cu$_{2}$S at a low-temperature range, we systematically investigate the electronic structure, intrinsic defect formation, and transport properties by first-principles calculations. Thereby, previous experimental reports on the indirect bandgap nature of Cu$_{2}$S were confirmed in this work with an energy gap of about 0.9-0.95 eV. As a result, the optical absorption coefficient estimated from this model also gives a potential value of $α> 10^{4}$ cm$^{-1}$ in the visible spectrum range. According to the bonding analysis and formation energy aspect, Cu vacancy is the most preferred defect to form in Cu$_{2}$S, which primarily affects the conductive behavior as a $p$-type, as experimentally observed. Finally, the transport properties of Cu$_{2}$S system were successfully reproduced using an electron-phonon scattering method, highlighting the important role of relaxation time prediction in conductivity estimation instead of regarding it as a constant.
Submitted 18 October, 2021;
originally announced October 2021.
-
Autoencoders on FPGAs for real-time, unsupervised new physics detection at 40 MHz at the Large Hadron Collider
Authors:
Ekaterina Govorkova,
Ema Puljak,
Thea Aarrestad,
Thomas James,
Vladimir Loncar,
Maurizio Pierini,
Adrian Alan Pol,
Nicolò Ghielmetti,
Maksymilian Graczyk,
Sioni Summers,
Jennifer Ngadiuba,
Thong Q. Nguyen,
Javier Duarte,
Zhenbin Wu
Abstract:
In this paper, we show how to adapt and deploy anomaly detection algorithms based on deep autoencoders for the unsupervised detection of new physics signatures in the extremely challenging environment of a real-time event selection system at the Large Hadron Collider (LHC). We demonstrate that new physics signatures can be enhanced by three orders of magnitude, while staying within the strict latency and resource constraints of a typical LHC event filtering system. This would allow for collecting datasets potentially enriched with high-purity contributions from new physics processes. Through per-layer, highly parallel implementations of network layers, support for autoencoder-specific losses on FPGAs and latent-space-based inference, we demonstrate that anomaly detection can be performed in as little as $80\,$ns using less than 3% of the logic resources in the Xilinx Virtex VU9P FPGA. This opens the way to real-life applications of this idea during the next data-taking campaign of the LHC.
Submitted 12 August, 2021; v1 submitted 9 August, 2021;
originally announced August 2021.
-
VinaFood21: A Novel Dataset for Evaluating Vietnamese Food Recognition
Authors:
Thuan Trong Nguyen,
Thuan Q. Nguyen,
Dung Vo,
Vi Nguyen,
Ngoc Ho,
Nguyen D. Vo,
Kiet Van Nguyen,
Khang Nguyen
Abstract:
Vietnam is an attractive tourist destination with its stunning and pristine landscapes and its top-rated, unique food and drink. Among thousands of Vietnamese dishes, foreigners and native people are interested in easy-to-eat tastes and easy-to-do recipes, along with reasonable prices, mouthwatering flavors, and popularity. Because of the diversity of dishes, the significant similarities among almost all of them, and the lack of quality Vietnamese food datasets, it is hard to implement an automatic system that classifies Vietnamese food and thereby makes it easier for people to discover Vietnamese food. This paper introduces a new Vietnamese food dataset named VinaFood21, which consists of 13,950 images corresponding to 21 dishes. We use 10,044 images for model training and 6,682 test images to classify each food in the VinaFood21 dataset and achieve an average accuracy of 74.81% when fine-tuning CNN EfficientNet-B0. (https://github.com/nguyenvd-uit/uit-together-dataset)
Submitted 5 August, 2021;
originally announced August 2021.
-
A CNN Segmentation-Based Approach to Object Detection and Tracking in Ultrasound Scans with Application to the Vagus Nerve Detection
Authors:
Abdullah F. Al-Battal,
Yan Gong,
Lu Xu,
Timothy Morton,
Chen Du,
Yifeng Bu,
Imanuel R Lerman,
Radhika Madhavan,
Truong Q. Nguyen
Abstract:
Ultrasound scanning is essential in several medical diagnostic and therapeutic applications. It is used to visualize and analyze anatomical features and structures that influence treatment plans. However, it is labor intensive, and its effectiveness is operator dependent. Real-time, accurate, and robust automatic detection and tracking of anatomical structures while scanning would significantly help make diagnostic and therapeutic procedures consistent and efficient. In this paper, we propose a deep learning framework to automatically detect and track a specific anatomical target structure in ultrasound scans. Our framework is designed to be accurate and robust across subjects and imaging devices, to operate in real-time, and to not require a large training set. It maintains a localization precision and recall higher than 90% when trained on training sets that are as small as 20% in size of the original training set. The framework backbone is a weakly trained segmentation neural network based on U-Net. We tested the framework on two different ultrasound datasets with the aim of detecting and tracking the Vagus nerve, where it outperformed current state-of-the-art real-time object detection networks.
Submitted 25 June, 2021;
originally announced June 2021.
-
Data Augmentation by Concatenation for Low-Resource Translation: A Mystery and a Solution
Authors:
Toan Q. Nguyen,
Kenton Murray,
David Chiang
Abstract:
In this paper, we investigate the driving factors behind concatenation, a simple but effective data augmentation method for low-resource neural machine translation. Our experiments suggest that discourse context is unlikely to be the cause of the improvement of about +1 BLEU across four language pairs. Instead, we demonstrate that the improvement comes from three other factors unrelated to discourse: context diversity, length diversity, and (to a lesser extent) position shifting.
Submitted 2 July, 2021; v1 submitted 4 May, 2021;
originally announced May 2021.
-
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
Authors:
Julia Kreutzer,
Isaac Caswell,
Lisa Wang,
Ahsan Wahab,
Daan van Esch,
Nasanbayar Ulzii-Orshikh,
Allahsera Tapo,
Nishant Subramani,
Artem Sokolov,
Claytone Sikasote,
Monang Setyawan,
Supheakmungkol Sarin,
Sokhar Samb,
Benoît Sagot,
Clara Rivera,
Annette Rios,
Isabel Papadimitriou,
Salomey Osei,
Pedro Ortiz Suarez,
Iroro Orife,
Kelechi Ogueji,
Andre Niyongabo Rubungo,
Toan Q. Nguyen,
Mathias Müller,
André Müller
, et al. (27 additional authors not shown)
Abstract:
With the success of large-scale pre-training and multilingual modeling in Natural Language Processing (NLP), recent years have seen a proliferation of large, web-mined text datasets covering hundreds of languages. We manually audit the quality of 205 language-specific corpora released with five major public datasets (CCAligned, ParaCrawl, WikiMatrix, OSCAR, mC4). Lower-resource corpora have systematic issues: At least 15 corpora have no usable text, and a significant fraction contains less than 50% sentences of acceptable quality. In addition, many are mislabeled or use nonstandard/ambiguous language codes. We demonstrate that these issues are easy to detect even for non-proficient speakers, and supplement the human audit with automatic analyses. Finally, we recommend techniques to evaluate and improve multilingual corpora and discuss potential risks that come with low-quality data releases.
Submitted 21 February, 2022; v1 submitted 22 March, 2021;
originally announced March 2021.
-
Termination of Multipartite Graph Series Arising from Complex Network Modelling
Authors:
Matthieu Latapy,
Thi Ha Duong Phan,
Christophe Crespelle,
Thanh Qui Nguyen
Abstract:
An intense activity is nowadays devoted to the definition of models capturing the properties of complex networks. Among the most promising approaches, it has been proposed to model these graphs via their clique incidence bipartite graphs. However, this approach has, until now, severe limitations resulting from its incapacity to reproduce a key property of this object: the overlapping nature of cliques in complex networks. In order to get rid of these limitations, we propose to encode the structure of clique overlaps in a network through a process that iteratively factorises the maximal bicliques between the upper level and the other levels of a multipartite graph. We show that the most natural definition of this factorising process leads to infinite series for some instances. Our main result is to design a restriction of this process that terminates for any arbitrary graph. Moreover, we show that the resulting multipartite graph has remarkable combinatorial properties and is closely related to another fundamental combinatorial object. Finally, we show that, in practice, this multipartite graph is computationally tractable and has a size that makes it suitable for complex network modelling.
Submitted 7 March, 2021;
originally announced March 2021.
-
Causal mediation analysis: From simple to more robust strategies for estimation of marginal natural (in)direct effects
Authors:
Trang Quynh Nguyen,
Elizabeth L. Ogburn,
Ian Schmid,
Elizabeth B. Sarker,
Noah Greifer,
Ina M. Koning,
Elizabeth A. Stuart
Abstract:
This paper aims to provide practitioners of causal mediation analysis with a better understanding of estimation options. We take as inputs two familiar strategies (weighting and model-based prediction) and a simple way of combining them (weighted models), and show how a range of estimators can be generated, with different modeling requirements and robustness properties. The primary goal is to help build intuitive appreciation for robust estimation that is conducive to sound practice. A second goal is to provide a "menu" of estimators that practitioners can choose from for the estimation of marginal natural (in)direct effects. The estimators generated from this exercise include some that coincide or are similar to existing estimators and others that have not previously appeared in the literature. We note several different ways to estimate the weights for cross-world weighting based on three expressions of the weighting function, including one that is novel; and show how to check the resulting covariate and mediator balance. We use a random continuous weights bootstrap to obtain confidence intervals, and also derive general asymptotic variance formulas for the estimators. The estimators are illustrated using data from an adolescent alcohol use prevention study.
Submitted 13 January, 2023; v1 submitted 11 February, 2021;
originally announced February 2021.
-
In Defense of Scene Graphs for Image Captioning
Authors:
Kien Nguyen,
Subarna Tripathi,
Bang Du,
Tanaya Guha,
Truong Q. Nguyen
Abstract:
The mainstream image captioning models rely on Convolutional Neural Network (CNN) image features to generate captions via recurrent models. Recently, image scene graphs have been used to augment captioning models so as to leverage their structural semantics, such as object entities, relationships and attributes. Several studies have noted that the naive use of scene graphs from a black-box scene graph generator harms image captioning performance and that scene graph-based captioning models have to incur the overhead of explicit use of image features to generate decent captions. Addressing these challenges, we propose SG2Caps, a framework that utilizes only the scene graph labels for competitive image captioning performance. The basic idea is to close the semantic gap between the two scene graphs: one derived from the input image and the other from its caption. In order to achieve this, we leverage the spatial location of objects and the Human-Object-Interaction (HOI) labels as an additional HOI graph. SG2Caps outperforms existing scene graph-only captioning models by a large margin, indicating scene graphs as a promising representation for image captioning. Direct utilization of scene graph labels avoids expensive graph convolutions over high-dimensional CNN features, resulting in 49% fewer trainable parameters. Our code is available at: https://github.com/Kien085/SG2Caps
Submitted 17 August, 2021; v1 submitted 9 February, 2021;
originally announced February 2021.
-
Sensitivity analyses for effect modifiers not observed in the target population when generalizing treatment effects from a randomized controlled trial: Assumptions, models, effect scales, data scenarios, and implementation details
Authors:
Trang Quynh Nguyen,
Benjamin Ackerman,
Ian Schmid,
Stephen R. Cole,
Elizabeth A. Stuart
Abstract:
Background: Randomized controlled trials are often used to inform policy and practice for broad populations. The average treatment effect (ATE) for a target population, however, may be different from the ATE observed in a trial if there are effect modifiers whose distribution in the target population differs from that in the trial. Methods exist to use trial data to estimate the target population ATE, provided the distributions of treatment effect modifiers are observed in both the trial and target population -- an assumption that may not hold in practice.
Methods: The proposed sensitivity analyses address the situation where a treatment effect modifier is observed in the trial but not the target population. These methods are based on an outcome model or the combination of such a model and weighting adjustment for observed differences between the trial sample and target population. They accommodate several types of outcome models: linear models (including single time outcome and pre- and post-treatment outcomes) for additive effects, and models with log or logit link for multiplicative effects. We clarify the methods' assumptions and provide detailed implementation instructions.
Illustration: We illustrate the methods using an example generalizing the effects of an HIV treatment regimen from a randomized trial to a relevant target population.
Conclusion: These methods allow researchers and decision-makers to have more appropriate confidence when drawing conclusions about target population effects.
Submitted 25 November, 2020;
originally announced November 2020.
-
Clarifying causal mediation analysis: Effect identification via three assumptions and five potential outcomes
Authors:
Trang Quynh Nguyen,
Ian Schmid,
Elizabeth L. Ogburn,
Elizabeth A. Stuart
Abstract:
Causal mediation analysis is complicated with multiple effect definitions that require different sets of assumptions for identification. This paper provides a systematic explanation of such assumptions. We define five potential outcome types whose means are involved in various effect definitions. We tackle their mean/distribution's identification, starting with the one that requires the weakest assumptions and gradually building up to the one that requires the strongest assumptions. This presentation shows clearly why an assumption is required for one estimand and not another, and provides a succinct table from which an applied researcher could pick out the assumptions required for identifying the causal effects they target. Using a running example, the paper illustrates the assembling and consideration of identifying assumptions for a range of causal contrasts. For several that are commonly encountered in the literature, this exercise clarifies that identification requires weaker assumptions than those often stated in the literature. This attention to the details also draws attention to the differences in the positivity assumption for different estimands, with practical implications. Clarity on the identifying assumptions of these various estimands will help researchers conduct appropriate mediation analyses and interpret the results with appropriate caution given the plausibility of the assumptions.
Submitted 7 July, 2022; v1 submitted 18 November, 2020;
originally announced November 2020.
-
Data Augmentation at the LHC through Analysis-specific Fast Simulation with Deep Learning
Authors:
Cheng Chen,
Olmo Cerri,
Thong Q. Nguyen,
Jean-Roch Vlimant,
Maurizio Pierini
Abstract:
We present a fast simulation application based on a Deep Neural Network, designed to create large analysis-specific datasets. Taking as an example the generation of W+jet events produced in $\sqrt{s} = 13$ TeV proton-proton collisions, we train a neural network to model detector resolution effects as a transfer function acting on an analysis-specific set of relevant features, computed at generation level, i.e., in the absence of detector effects. Based on this model, we propose a novel fast-simulation workflow that starts from a large amount of generator-level events to deliver large analysis-specific samples. The adoption of this approach would result in about an order-of-magnitude reduction in computing and storage requirements for the collision simulation workflow. This strategy could help the high energy physics community to face the computing challenges of the future High-Luminosity LHC.
Submitted 5 October, 2020;
originally announced October 2020.
-
Adversarially Learned Anomaly Detection on CMS Open Data: re-discovering the top quark
Authors:
Oliver Knapp,
Guenther Dissertori,
Olmo Cerri,
Thong Q. Nguyen,
Jean-Roch Vlimant,
Maurizio Pierini
Abstract:
We apply an Adversarially Learned Anomaly Detection (ALAD) algorithm to the problem of detecting new physics processes in proton-proton collisions at the Large Hadron Collider. Anomaly detection based on ALAD matches performances reached by Variational Autoencoders, with a substantial improvement in some cases. Training the ALAD algorithm on 4.4 fb$^{-1}$ of 8 TeV CMS Open Data, we show how a data-driven anomaly detection and characterization would work in real life, re-discovering the top quark by identifying the main features of the $t\bar{t}$ experimental signature at the LHC.
Submitted 3 October, 2020; v1 submitted 4 May, 2020;
originally announced May 2020.
-
First-principles calculation of electronic density of states and Seebeck coefficient in transition-metal-doped Si-Ge alloys
Authors:
Ryo Yamada,
Akira Masago,
Tetsuya Fukushima,
Hikari Shinya,
Tien Quang Nguyen,
Kazunori Sato
Abstract:
A high $ZT$ value and a large Seebeck coefficient have been reported in nanostructured Fe-doped Si-Ge alloys. In this work, the large Seebeck coefficient in Fe-doped Si-Ge systems is qualitatively reproduced from the computed electronic density of states, where a hybrid functional, HSE06, is used for the exchange-correlation functional, as well as a special quasi-random structure (SQS) for the disordered atomic configuration. Furthermore, by replacing Fe with other transition metals, such as Mn, Co, Ni, Cu, Zn, and Au, a better dopant that produces a larger Seebeck coefficient in Si-Ge alloy systems is explored.
Submitted 28 January, 2020;
originally announced January 2020.
-
Particle Generative Adversarial Networks for full-event simulation at the LHC and their application to pileup description
Authors:
Jesus Arjona Martinez,
Thong Q Nguyen,
Maurizio Pierini,
Maria Spiropulu,
Jean-Roch Vlimant
Abstract:
We investigate how a Generative Adversarial Network could be used to generate a list of particle four-momenta from LHC proton collisions, allowing one to define a generative model that could abstract from the irregularities of typical detector geometries. As an example of application, we show how such an architecture could be used as a generator of LHC parasitic collisions (pileup). We present two approaches to generate the events: unconditional generator and generator conditioned on missing transverse energy. We assess generation performances in a realistic LHC data-analysis environment, with a pileup mitigation algorithm applied.
Submitted 5 December, 2019;
originally announced December 2019.
-
Masked Language Model Scoring
Authors:
Julian Salazar,
Davis Liang,
Toan Q. Nguyen,
Katrin Kirchhoff
Abstract:
Pretrained masked language models (MLMs) require finetuning for most NLP tasks. Instead, we evaluate MLMs out of the box via their pseudo-log-likelihood scores (PLLs), which are computed by masking tokens one by one. We show that PLLs outperform scores from autoregressive language models like GPT-2 in a variety of tasks. By rescoring ASR and NMT hypotheses, RoBERTa reduces an end-to-end LibriSpeech model's WER by 30% relative and adds up to +1.7 BLEU on state-of-the-art baselines for low-resource translation pairs, with further gains from domain adaptation. We attribute this success to PLL's unsupervised expression of linguistic acceptability without a left-to-right bias, greatly improving on scores from GPT-2 (+10 points on island effects, NPI licensing in BLiMP). One can finetune MLMs to give scores without masking, enabling computation in a single inference pass. In all, PLLs and their associated pseudo-perplexities (PPPLs) enable plug-and-play use of the growing number of pretrained MLMs; e.g., we use a single cross-lingual model to rescore translations in multiple languages. We release our library for language model scoring at https://github.com/awslabs/mlm-scoring.
Submitted 31 December, 2020; v1 submitted 31 October, 2019;
originally announced October 2019.
-
Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation
Authors:
Kenton Murray,
Jeffery Kinnison,
Toan Q. Nguyen,
Walter Scheirer,
David Chiang
Abstract:
Neural sequence-to-sequence models, particularly the Transformer, are the state of the art in machine translation. Yet these neural networks are very sensitive to architecture and hyperparameter settings. Optimizing these settings by grid or random search is computationally expensive because it requires many training runs. In this paper, we incorporate architecture search into a single training run through auto-sizing, which uses regularization to delete neurons in a network over the course of training. On very low-resource language pairs, we show that auto-sizing can improve BLEU scores by up to 3.9 points while removing one-third of the parameters from the model.
Submitted 1 October, 2019;
originally announced October 2019.
-
Transformers without Tears: Improving the Normalization of Self-Attention
Authors:
Toan Q. Nguyen,
Julian Salazar
Abstract:
We evaluate three simple, normalization-centric changes to improve Transformer training. First, we show that pre-norm residual connections (PreNorm) and smaller initializations enable warmup-free, validation-based training with large learning rates. Second, we propose $\ell_2$ normalization with a single scale parameter (ScaleNorm) for faster training and better performance. Finally, we reaffirm the effectiveness of normalizing word embeddings to a fixed length (FixNorm). On five low-resource translation pairs from TED Talks-based corpora, these changes always converge, giving an average +1.1 BLEU over state-of-the-art bilingual baselines and a new 32.8 BLEU on IWSLT'15 English-Vietnamese. We observe sharper performance curves, more consistent gradient norms, and a linear relationship between activation scaling and decoder depth. Surprisingly, in the high-resource setting (WMT'14 English-German), ScaleNorm and FixNorm remain competitive but PreNorm degrades performance.
Submitted 29 December, 2019; v1 submitted 13 October, 2019;
originally announced October 2019.
-
Partially Pooled Propensity Score Models for Average Treatment Effect Estimation with Multilevel Data
Authors:
Youjin Lee,
Trang Q. Nguyen,
Elizabeth A. Stuart
Abstract:
Causal inference analyses often use existing observational data, which in many cases has some clustering of individuals. In this paper we discuss propensity score weighting methods in a multilevel setting where within clusters individuals share unmeasured confounders that are related to treatment assignment and the potential outcomes. We focus in particular on settings where models with fixed cluster effects are either not feasible or not useful due to the presence of a large number of small clusters. We found, both through numerical experiments and theoretical derivations, that a strategy of grouping clusters with similar treatment prevalence and estimating propensity scores within such cluster groups is effective in reducing bias from unmeasured cluster-level covariates under mild conditions on the outcome model. We apply our proposed method in evaluating the effectiveness of center-based pre-school program participation on children's achievement at kindergarten, using the Early Childhood Longitudinal Study, Kindergarten data.
Submitted 22 December, 2020; v1 submitted 12 October, 2019;
originally announced October 2019.
-
Interaction networks for the identification of boosted $H\to b\overline{b}$ decays
Authors:
Eric A. Moreno,
Thong Q. Nguyen,
Jean-Roch Vlimant,
Olmo Cerri,
Harvey B. Newman,
Avikar Periwal,
Maria Spiropulu,
Javier M. Duarte,
Maurizio Pierini
Abstract:
We develop an algorithm based on an interaction network to identify high-transverse-momentum Higgs bosons decaying to bottom quark-antiquark pairs and distinguish them from ordinary jets that reflect the configurations of quarks and gluons at short distances. The algorithm's inputs are features of the reconstructed charged particles in a jet and the secondary vertices associated with them. Describing the jet shower as a combination of particle-to-particle and particle-to-vertex interactions, the model is trained to learn a jet representation on which the classification problem is optimized. The algorithm is trained on simulated samples of realistic LHC collisions, released by the CMS Collaboration on the CERN Open Data Portal. The interaction network achieves a drastic improvement in the identification performance with respect to state-of-the-art algorithms.
Submitted 28 July, 2020; v1 submitted 26 September, 2019;
originally announced September 2019.
-
JEDI-net: a jet identification algorithm based on interaction networks
Authors:
Eric A. Moreno,
Olmo Cerri,
Javier M. Duarte,
Harvey B. Newman,
Thong Q. Nguyen,
Avikar Periwal,
Maurizio Pierini,
Aidana Serikova,
Maria Spiropulu,
Jean-Roch Vlimant
Abstract:
We investigate the performance of a jet identification algorithm based on interaction networks (JEDI-net) to identify all-hadronic decays of high-momentum heavy particles produced at the LHC and distinguish them from ordinary jets originating from the hadronization of quarks and gluons. The jet dynamics are described as a set of one-to-one interactions between the jet constituents. Based on a representation learned from these interactions, the jet is associated with one of the considered categories. Unlike other architectures, the JEDI-net models achieve their performance without special handling of the sparse input jet representation, extensive pre-processing, particle ordering, or specific assumptions regarding the underlying detector geometry. The presented models give better results with fewer model parameters, offering interesting prospects for LHC applications.
Submitted 27 January, 2020; v1 submitted 14 August, 2019;
originally announced August 2019.
-
Propensity score analysis with latent covariates: Measurement error bias correction using the covariate's posterior mean, aka the inclusive factor score
Authors:
Trang Quynh Nguyen,
Elizabeth A. Stuart
Abstract:
We address measurement error bias in propensity score (PS) analysis due to covariates that are latent variables. In the setting where latent covariate $X$ is measured via multiple error-prone items $\mathbf{W}$, PS analysis using several proxies for $X$ -- the $\mathbf{W}$ items themselves, a summary score (mean/sum of the items), or the conventional factor score (cFS, i.e., predicted value of $X$ based on the measurement model) -- often results in biased estimation of the causal effect, because balancing the proxy (between exposure conditions) does not balance $X$. We propose an improved proxy: the conditional mean of $X$ given the combination of $\mathbf{W}$, the observed covariates $Z$, and exposure $A$, denoted $X_{WZA}$. The theoretical support, which applies whether $X$ is latent or not (but is unobserved), is that balancing $X_{WZA}$ (e.g., via weighting or matching) implies balancing the mean of $X$. For a latent $X$, we estimate $X_{WZA}$ by the inclusive factor score (iFS) -- predicted value of $X$ from a structural equation model that captures the joint distribution of $(X,\mathbf{W},A)$ given $Z$. Simulation shows that PS analysis using the iFS substantially improves balance on the first five moments of $X$ and reduces bias in the estimated causal effect. Hence, within the proxy variables approach, we recommend this proxy over existing ones. We connect this proxy method to known results about weighting/matching functions (Lockwood & McCaffrey, 2016; McCaffrey, Lockwood, & Setodji, 2013). We illustrate the method in handling latent covariates when estimating the effect of out-of-school suspension on risk of later police arrests using Add Health data.
Submitted 11 February, 2020; v1 submitted 29 July, 2019;
originally announced July 2019.
-
Clarifying causal mediation analysis for the applied researcher: Defining effects based on what we want to learn
Authors:
Trang Quynh Nguyen,
Ian Schmid,
Elizabeth A. Stuart
Abstract:
The incorporation of causal inference in mediation analysis has led to theoretical and methodological advancements -- effect definitions with causal interpretation, clarification of assumptions required for effect identification, and an expanding array of options for effect estimation. However, the literature on these results is fast-growing and complex, which may be confusing to researchers unfamiliar with causal inference or unfamiliar with mediation. The goal of this paper is to help ease the understanding and adoption of causal mediation analysis. It starts by highlighting a key difference between the causal inference and traditional approaches to mediation analysis and making a case for the need for explicit causal thinking and the causal inference approach in mediation analysis. It then explains in as-plain-as-possible language existing effect types, paying special attention to motivating these effects with different types of research questions, and using concrete examples for illustration. This presentation differentiates two perspectives (or purposes of analysis): the explanatory perspective (aiming to explain the total effect) and the interventional perspective (asking questions about hypothetical interventions on the exposure and mediator, or hypothetically modified exposures). For the latter perspective, the paper proposes tapping into a general class of interventional effects that contains as special cases most of the usual effect types -- interventional direct and indirect effects, controlled direct effects and also a generalized interventional direct effect type, as well as the total effect and overall effect. This general class allows flexible effect definitions which better match many research questions than the standard interventional direct and indirect effects.
Submitted 15 May, 2020; v1 submitted 17 April, 2019;
originally announced April 2019.
-
Toward Joint Image Generation and Compression using Generative Adversarial Networks
Authors:
Byeongkeun Kang,
Subarna Tripathi,
Truong Q. Nguyen
Abstract:
In this paper, we present a generative adversarial network framework that generates compressed images instead of synthesizing raw RGB images and compressing them separately. In the real world, most images and videos are stored and transferred in a compressed format to save storage capacity and data transfer bandwidth. However, since typical generative adversarial networks generate raw RGB images, those generated images need to be compressed by a post-processing stage to reduce the data size. Among image compression methods, JPEG has been one of the most commonly used lossy compression methods for still images. Hence, we propose a novel framework that generates JPEG compressed images using generative adversarial networks. The novel generator consists of the proposed locally connected layers, chroma subsampling layers, quantization layers, residual blocks, and convolution layers. The locally connected layer is proposed to enable block-based operations. We also discuss training strategies for the proposed architecture including the loss function and the transformation between its generator and its discriminator. The proposed method is evaluated using the publicly available CIFAR-10 dataset and LSUN bedroom dataset. The results demonstrate that the proposed method is able to generate compressed data with competitive qualities. The proposed method is a promising baseline method for joint image generation and compression using generative adversarial networks.
Submitted 23 January, 2019;
originally announced January 2019.
-
Random Forest with Learned Representations for Semantic Segmentation
Authors:
Byeongkeun Kang,
Truong Q. Nguyen
Abstract:
In this work, we present a random forest framework that learns the weights, shapes, and sparsities of feature representations for real-time semantic segmentation. Typical filters (kernels) have predetermined shapes and sparsities and learn only weights. A few feature extraction methods fix weights and learn only shapes and sparsities. These predetermined constraints restrict learning and extracting optimal features. To overcome this limitation, we propose an unconstrained representation that is able to extract optimal features by learning weights, shapes, and sparsities. We, then, present the random forest framework that learns the flexible filters using an iterative optimization algorithm and segments input images using the learned representations. We demonstrate the effectiveness of the proposed method using a hand segmentation dataset for hand-object interaction and using two semantic segmentation datasets. The results show that the proposed method achieves real-time semantic segmentation using limited computational and memory resources.
Submitted 23 January, 2019;
originally announced January 2019.
-
Variational Autoencoders for New Physics Mining at the Large Hadron Collider
Authors:
Olmo Cerri,
Thong Q. Nguyen,
Maurizio Pierini,
Maria Spiropulu,
Jean-Roch Vlimant
Abstract:
Using variational autoencoders trained on known physics processes, we develop a one-sided threshold test to isolate previously unseen processes as outlier events. Since the autoencoder training does not depend on any specific new physics signature, the proposed procedure doesn't make specific assumptions on the nature of new physics. An event selection based on this algorithm would be complementary to classic LHC searches, typically based on model-dependent hypothesis testing. Such an algorithm would deliver a list of anomalous events, that the experimental collaborations could further scrutinize and even release as a catalog, similarly to what is typically done in other scientific domains. Event topologies repeating in this dataset could inspire new-physics model building and new experimental searches. Running in the trigger system of the LHC experiments, such an application could identify anomalous events that would be otherwise lost, extending the scientific reach of the LHC.
Submitted 13 June, 2019; v1 submitted 26 November, 2018;
originally announced November 2018.
-
Topology classification with deep learning to improve real-time event selection at the LHC
Authors:
Thong Q. Nguyen,
Daniel Weitekamp III,
Dustin Anderson,
Roberto Castello,
Olmo Cerri,
Maurizio Pierini,
Maria Spiropulu,
Jean-Roch Vlimant
Abstract:
We show how event topology classification based on deep learning could be used to improve the purity of data samples selected in real time at the Large Hadron Collider. We consider different data representations, on which different kinds of multi-class classifiers are trained. Both raw data and high-level features are utilized. In the considered examples, a filter based on the classifier's score can be trained to retain ~99% of the interesting events and reduce the false-positive rate by as much as one order of magnitude for certain background processes. By operating such a filter as part of the online event selection infrastructure of the LHC experiments, one could benefit from a more flexible and inclusive selection strategy while reducing the amount of downstream resources wasted in processing false positives. The saved resources could be translated into a reduction of the detector operation cost or into an effective increase of storage and processing capabilities, which could be reinvested to extend the physics reach of the LHC experiments.
Submitted 2 September, 2019; v1 submitted 29 June, 2018;
originally announced July 2018.
-
DPW-SDNet: Dual Pixel-Wavelet Domain Deep CNNs for Soft Decoding of JPEG-Compressed Images
Authors:
Honggang Chen,
Xiaohai He,
Linbo Qing,
Shuhua Xiong,
Truong Q. Nguyen
Abstract:
JPEG is one of the widely used lossy compression methods. JPEG-compressed images usually suffer from compression artifacts including blocking and blurring, especially at low bit-rates. Soft decoding is an effective solution to improve the quality of compressed images without changing codec or introducing extra coding bits. Inspired by the excellent performance of the deep convolutional neural networks (CNNs) on both low-level and high-level computer vision problems, we develop a dual pixel-wavelet domain deep CNNs-based soft decoding network for JPEG-compressed images, namely DPW-SDNet. The pixel domain deep network takes the four downsampled versions of the compressed image to form a 4-channel input and outputs a pixel domain prediction, while the wavelet domain deep network uses the 1-level discrete wavelet transformation (DWT) coefficients to form a 4-channel input to produce a DWT domain prediction. The pixel domain and wavelet domain estimates are combined to generate the final soft decoded result. Experimental results demonstrate the superiority of the proposed DPW-SDNet over several state-of-the-art compression artifacts reduction algorithms.
Submitted 26 May, 2018;
originally announced May 2018.
-
Correction by Projection: Denoising Images with Generative Adversarial Networks
Authors:
Subarna Tripathi,
Zachary C. Lipton,
Truong Q. Nguyen
Abstract:
Generative adversarial networks (GANs) transform low-dimensional latent vectors into visually plausible images. If the real dataset contains only clean images, then ostensibly, the manifold learned by the GAN should contain only clean images. In this paper, we propose to denoise corrupted images by finding the nearest point on the GAN manifold, recovering latent vectors by minimizing distances in image space. We first demonstrate that given a corrupted version of an image that truly lies on the GAN manifold, we can approximately recover the latent vector and denoise the image, obtaining significantly higher quality than BM3D. Next, we demonstrate that latent vectors recovered from noisy images exhibit a consistent bias. By subtracting this bias before projecting back to image space, we improve denoising results even further. Finally, even for unseen images, our method denoises better than BM3D. Notably, the basic version of our method (without bias correction) requires no prior knowledge on the noise variance. To achieve the highest possible denoising quality, the best performing signal processing based methods, such as BM3D, require an estimate of the blur kernel.
Submitted 12 March, 2018;
originally announced March 2018.
-
Image denoising with generalized Gaussian mixture model patch priors
Authors:
Charles-Alban Deledalle,
Shibin Parameswaran,
Truong Q. Nguyen
Abstract:
Patch priors have become an important component of image restoration. A powerful approach in this category of restoration algorithms is the popular Expected Patch Log-Likelihood (EPLL) algorithm. EPLL uses a Gaussian mixture model (GMM) prior learned on clean image patches as a way to regularize degraded patches. In this paper, we show that a generalized Gaussian mixture model (GGMM) captures the underlying distribution of patches better than a GMM. Even though GGMM is a powerful prior to combine with EPLL, the non-Gaussianity of its components presents major challenges to be applied to a computationally intensive process of image restoration. Specifically, each patch has to undergo a patch classification step and a shrinkage step. These two steps can be efficiently solved with a GMM prior but are computationally impractical when using a GGMM prior. In this paper, we provide approximations and computational recipes for fast evaluation of these two steps, so that EPLL can embed a GGMM prior on an image with more than tens of thousands of patches. Our main contribution is to analyze the accuracy of our approximations based on thorough theoretical analysis. Our evaluations indicate that the GGMM prior is consistently a better fit for modeling image patch distribution and performs better on average in the image denoising task.
Submitted 11 June, 2018; v1 submitted 5 February, 2018;
originally announced February 2018.