Search | arXiv e-print repository

arXiv:2208.01444 [pdf, other]

Inferring random change point from left-censored longitudinal data by segmented mechanistic nonlinear models, with application in HIV surveillance study

Authors: Hongbin Zhang, McKaylee Robertson, Sarah L. Braunstein, Levi Waldron, Denis Nash

Abstract: The primary goal of public health efforts to control HIV epidemics is to diagnose and treat people with HIV infection as soon as possible after seroconversion. The timing of initiation of antiretroviral therapy (ART) treatment after HIV diagnosis is, therefore, a critical population-level indicator that can be used to measure the effectiveness of public health programs and policies at local and national levels. However, population-based data on ART initiation are unavailable because ART initiation and prescription are typically measured indirectly by public health departments (e.g., with viral suppression as a proxy). In this paper, we present a random change-point model to infer the time of ART initiation utilizing routinely reported individual-level HIV viral load from an HIV surveillance system. To deal with the left-censoring and the nonlinear trajectory of viral load data, we formulate a flexible segmented nonlinear mixed effects model and propose a Stochastic version of EM (StEM) algorithm, coupled with a Gibbs sampler for the inference. We apply the method to a random subset of HIV surveillance data to infer the timing of ART initiation since diagnosis and to gain additional insights into the viral load dynamics. Simulation studies are also performed to evaluate the properties of the proposed method.

Submitted 2 August, 2022; originally announced August 2022.

arXiv:2005.03788 [pdf, other]

ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks

Authors: Xinshao Wang, Yang Hua, Elyor Kodirov, David A. Clifton, Neil M. Robertson

Abstract: To train robust deep neural networks (DNNs), we systematically study several target modification approaches, which include output regularisation, self and non-self label correction (LC). Two key issues are discovered: (1) Self LC is the most appealing as it exploits its own knowledge and requires no extra models. However, how to automatically decide the trust degree of a learner as training proceeds is not well answered in the literature. (2) Some methods penalise while the others reward low-entropy predictions, prompting us to ask which one is better. To resolve the first issue, taking two well-accepted propositions--deep neural networks learn meaningful patterns before fitting noise [3] and minimum entropy regularisation principle [10]--we propose a novel end-to-end method named ProSelfLC, which is designed according to learning time and entropy. Specifically, given a data point, we progressively increase trust in its predicted label distribution versus its annotated one if a model has been trained for enough time and the prediction is of low entropy (high confidence). For the second issue, according to ProSelfLC, we empirically prove that it is better to redefine a meaningful low-entropy status and optimise the learner toward it. This serves as a defence of entropy minimisation. We demonstrate the effectiveness of ProSelfLC through extensive experiments in both clean and noisy settings. The source code is available at https://github.com/XinshaoAmosWang/ProSelfLC-CVPR2021.
Keywords: entropy minimisation, maximum entropy, confidence penalty, self knowledge distillation, label correction, label noise, semi-supervised learning, output regularisation

Submitted 2 June, 2021; v1 submitted 7 May, 2020; originally announced May 2020.

Comments: ProSelfLC is the first method to trust self knowledge progressively and adaptively. ProSelfLC redirects and promotes entropy minimisation, which is in marked contrast to recent practices of confidence penalty [42, 33, 6]

Journal ref: CVPR 2021
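
The ProSelfLC abstract above describes trusting a model's predicted label distribution more as training time grows and as the prediction's entropy falls. A minimal sketch of that idea follows. It is not the authors' implementation: the logistic time schedule, the `sharpness` parameter, and the function names `trust` and `corrected_target` are assumptions made for illustration only.

```python
import math


def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def trust(pred, step, total_steps, sharpness=10.0):
    """Trust in the prediction, in [0, 1).

    Assumed form: a logistic curve over training progress multiplied by
    a confidence term that is 1 for a one-hot prediction and 0 for a
    uniform one.
    """
    time_trust = 1.0 / (1.0 + math.exp(-sharpness * (step / total_steps - 0.5)))
    confidence = 1.0 - entropy(pred) / math.log(len(pred))
    return time_trust * confidence


def corrected_target(annotated, pred, step, total_steps):
    """Convex combination of the annotated label and the prediction."""
    eps = trust(pred, step, total_steps)
    return [(1.0 - eps) * q + eps * p for q, p in zip(annotated, pred)]


if __name__ == "__main__":
    annotated = [0.0, 1.0, 0.0]   # possibly noisy one-hot label
    pred = [0.9, 0.05, 0.05]      # confident prediction that disagrees
    print(corrected_target(annotated, pred, step=0, total_steps=100))
    print(corrected_target(annotated, pred, step=100, total_steps=100))
```

Early in training the target stays close to the annotated label; late in training a confident prediction shifts most of the target mass toward the predicted class.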

arXiv:2001.11103 [pdf, other]

doi 10.24963/ijcai.2021/293

Simulation of electron-proton scattering events by a Feature-Augmented and Transformed Generative Adversarial Network (FAT-GAN)

Authors: Yasir Alanazi, N. Sato, Tianbo Liu, W. Melnitchouk, Pawel Ambrozewicz, Florian Hauenstein, Michelle P. Kuchera, Evan Pritchard, Michael Robertson, Ryan Strauss, Luisa Velasco, Yaohang Li

Abstract: We apply generative adversarial network (GAN) technology to build an event generator that simulates particle production in electron-proton scattering that is free of theoretical assumptions about underlying particle dynamics. The difficulty of efficiently training a GAN event simulator lies in learning the complicated patterns of the distributions of the particles' physical properties. We develop a GAN that selects a set of transformed features from particle momenta that can be generated easily by the generator, and uses these to produce a set of augmented features that improve the sensitivity of the discriminator. The new Feature-Augmented and Transformed GAN (FAT-GAN) is able to faithfully reproduce the distribution of final state electron momenta in inclusive electron scattering, without the need for input derived from domain-based theoretical assumptions. The developed technology can play a significant role in boosting the science of existing and future accelerator facilities, such as the Electron-Ion Collider.

Submitted 27 May, 2021; v1 submitted 29 January, 2020; originally announced January 2020.

Comments: 7 pages, 5 figures, expanded author list, paper accepted in IJCAI21

Report number: JLAB-THY-20-3136

Journal ref: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21) Main Track, p. 2126 (2021)

arXiv:1905.11233 [pdf, other]

Derivative Manipulation for General Example Weighting

Authors: Xinshao Wang, Elyor Kodirov, Yang Hua, Neil M. Robertson

Abstract: Real-world large-scale datasets usually contain noisy labels and are imbalanced. Therefore, we propose derivative manipulation (DM), a novel and general example weighting approach for training robust deep models under these adverse conditions. DM has two main merits. First, loss function and example weighting are common techniques in the literature. DM reveals their connection (a loss function does example weighting) and is a replacement for both. Second, although a loss defines an example weighting scheme by its derivative, in the loss design, we need to consider whether it is differentiable. Instead, DM is more flexible by directly modifying the derivative so that a loss can be a non-elementary format too. Technically, DM defines an emphasis density function by a derivative magnitude function. DM is generic in that diverse weighting schemes can be derived. Extensive experiments on both vision and language tasks prove DM's effectiveness.

Submitted 3 October, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

arXiv:1903.12141 [pdf, other]

IMAE for Noise-Robust Learning: Mean Absolute Error Does Not Treat Examples Equally and Gradient Magnitude's Variance Matters

Authors: Xinshao Wang, Yang Hua, Elyor Kodirov, David A. Clifton, Neil M. Robertson

Abstract: In this work, we study robust deep learning against abnormal training data from the perspective of example weighting built in empirical loss functions, i.e., gradient magnitude with respect to logits, an angle that is not thoroughly studied so far. Consequently, we have two key findings: (1) Mean Absolute Error (MAE) Does Not Treat Examples Equally. We present new observations and insightful analysis about MAE, which is theoretically proved to be noise-robust. First, we reveal its underfitting problem in practice. Second, we analyse that MAE's noise-robustness is from emphasising uncertain examples instead of treating training samples equally, as claimed in prior work. (2) The Variance of Gradient Magnitude Matters. We propose an effective and simple solution to enhance MAE's fitting ability while preserving its noise-robustness. Without changing MAE's overall weighting scheme, i.e., what examples get higher weights, we simply change its weighting variance non-linearly so that the impact ratio between two examples is adjusted. Our solution is termed Improved MAE (IMAE). We prove IMAE's effectiveness using extensive experiments: image classification under clean labels, synthetic label noise, and real-world unknown noise.

Submitted 1 May, 2023; v1 submitted 28 March, 2019; originally announced March 2019.

Comments: ICLR 2023, RTML Workshop paper. For the source code, based on the requests for academic research and kindness to cite our work, we will release and maintain it in https://github.com/XinshaoAmosWang/DeepCriticalLearning
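
The IMAE abstract above treats the gradient magnitude with respect to the logits as an implicit example weight. The sketch below computes that magnitude (L1 norm) for cross-entropy and for MAE under a softmax output. It assumes MAE is the unnormalised sum of absolute differences between the softmax output and a one-hot label; the paper's exact normalisation may differ, and the function names are hypothetical. Under this assumption the weights reduce to 2(1 - p_y) for cross-entropy and 4 p_y (1 - p_y) for MAE, where p_y is the predicted probability of the labelled class, so MAE's weight peaks at p_y = 0.5.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ce_grad_l1(logits, y):
    """L1 norm of d(CE)/d(logits); the gradient is p - onehot(y)."""
    p = softmax(logits)
    return sum(abs(p_j - (1.0 if j == y else 0.0)) for j, p_j in enumerate(p))


def mae_grad_l1(logits, y):
    """L1 norm of d(MAE)/d(logits) for a one-hot label.

    Assumes MAE = sum_k |p_k - q_k| = 2 * (1 - p_y), whose derivative
    with respect to logit j is -2 * p_y * (delta_jy - p_j).
    """
    p = softmax(logits)
    return sum(
        abs(2.0 * p[y] * ((1.0 if j == y else 0.0) - p[j]))
        for j in range(len(p))
    )


if __name__ == "__main__":
    # Labelled class is index 0; its probability rises from low to high.
    for logits in ([-3.0, 0.0], [0.0, 0.0], [3.0, 0.0]):
        p_y = softmax(logits)[0]
        print(round(p_y, 3), ce_grad_l1(logits, 0), mae_grad_l1(logits, 0))
```

Cross-entropy gives the largest weight to the example with the lowest p_y, while MAE gives little weight to both very low and very high p_y, which is consistent with the abstract's remarks on underfitting and on emphasising uncertain examples.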

arXiv:1811.01459 [pdf, other]

Deep Metric Learning by Online Soft Mining and Class-Aware Attention

Authors: Xinshao Wang, Yang Hua, Elyor Kodirov, Guosheng Hu, Neil M. Robertson

Abstract: Deep metric learning aims to learn a deep embedding that can capture the semantic similarity of data points. Given the availability of massive training samples, deep metric learning is known to suffer from slow convergence due to a large fraction of trivial samples. Therefore, most existing methods generally resort to sample mining strategies for selecting nontrivial samples to accelerate convergence and improve performance. In this work, we identify two critical limitations of the sample mining methods, and provide solutions for both of them. First, previous mining methods assign one binary score to each sample, i.e., dropping or keeping it, so they only select a subset of relevant samples in a mini-batch. Therefore, we propose a novel sample mining method, called Online Soft Mining (OSM), which assigns one continuous score to each sample to make use of all samples in the mini-batch. OSM learns extended manifolds that preserve useful intraclass variances by focusing on more similar positives. Second, the existing methods are easily influenced by outliers as they are generally included in the mined subset. To address this, we introduce Class-Aware Attention (CAA) that assigns little attention to abnormal data samples. Furthermore, by combining OSM and CAA, we propose a novel weighted contrastive loss to learn discriminative embeddings. Extensive experiments on two fine-grained visual categorisation datasets and two video-based person re-identification benchmarks show that our method significantly outperforms the state-of-the-art.

Submitted 3 December, 2019; v1 submitted 4 November, 2018; originally announced November 2018.

Comments: Learning Robust Representations, Deep Metric Learning, Person Re-identification (AAAI 2019 Oral) Code: https://github.com/XinshaoAmosWang/OSM_CAA_WeightedContrastiveLoss

Showing 1–6 of 6 results for author: Robertson, M