Search | arXiv e-print repository

Fusing Individualized Treatment Rules Using Secondary Outcomes

Authors: Daiqi Gao, Yuanjia Wang, Donglin Zeng

Abstract: An individualized treatment rule (ITR) is a decision rule that recommends treatments for patients based on their individual feature variables. In many practices, the ideal ITR for the primary outcome is also expected to cause minimal harm to other secondary outcomes. Therefore, our objective is to learn an ITR that not only maximizes the value function for the primary outcome, but also approximate… ▽ More An individualized treatment rule (ITR) is a decision rule that recommends treatments for patients based on their individual feature variables. In many practices, the ideal ITR for the primary outcome is also expected to cause minimal harm to other secondary outcomes. Therefore, our objective is to learn an ITR that not only maximizes the value function for the primary outcome, but also approximates the optimal rule for the secondary outcomes as closely as possible. To achieve this goal, we introduce a fusion penalty to encourage the ITRs based on different outcomes to yield similar recommendations. Two algorithms are proposed to estimate the ITR using surrogate loss functions. We prove that the agreement rate between the estimated ITR of the primary outcome and the optimal ITRs of the secondary outcomes converges to the true agreement rate faster than if the secondary outcomes are not taken into consideration. Furthermore, we derive the non-asymptotic properties of the value function and misclassification rate for the proposed method. Finally, simulation studies and a real data example are used to demonstrate the finite-sample performance of the proposed method. △ Less

Submitted 9 March, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

Journal ref: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:712-720, 2024

arXiv:2301.12553 [pdf, ps, other]

Asymptotic Inference for Multi-Stage Stationary Treatment Policy with High Dimensional Features

Authors: Daiqi Gao, Yufeng Liu, Donglin Zeng

Abstract: Dynamic treatment rules or policies are a sequence of decision functions over multiple stages that are tailored to individual features. One important class of treatment policies for practice, namely multi-stage stationary treatment policies, prescribe treatment assignment probabilities using the same decision function over stages, where the decision is based on the same set of features consisting… ▽ More Dynamic treatment rules or policies are a sequence of decision functions over multiple stages that are tailored to individual features. One important class of treatment policies for practice, namely multi-stage stationary treatment policies, prescribe treatment assignment probabilities using the same decision function over stages, where the decision is based on the same set of features consisting of both baseline variables (e.g., demographics) and time-evolving variables (e.g., routinely collected disease biomarkers). Although there has been extensive literature to construct valid inference for the value function associated with the dynamic treatment policies, little work has been done for the policies themselves, especially in the presence of high dimensional feature variables. We aim to fill in the gap in this work. Specifically, we first estimate the multistage stationary treatment policy based on an augmented inverse probability weighted estimator for the value function to increase the asymptotic efficiency, and further apply a penalty to select important feature variables. We then construct one-step improvement of the policy parameter estimators. Theoretically, we show that the improved estimators are asymptotically normal, even if nuisance parameters are estimated at a slow convergence rate and the dimension of the feature variables increases with the sample size. Our numerical studies demonstrate that the proposed method has satisfactory performance in small samples, and that the performance can be improved with a choice of the augmentation term that approximates the rewards or minimizes the variance of the value function. △ Less

Submitted 22 May, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

arXiv:2009.00647 [pdf, other]

Lifelong Graph Learning

Authors: Chen Wang, Yuheng Qiu, Dasong Gao, Sebastian Scherer

Abstract: Graph neural networks (GNN) are powerful models for many graph-structured tasks. Existing models often assume that the complete structure of the graph is available during training. In practice, however, graph-structured data is usually formed in a streaming fashion so that learning a graph continuously is often necessary. In this paper, we bridge GNN and lifelong learning by converting a continual… ▽ More Graph neural networks (GNN) are powerful models for many graph-structured tasks. Existing models often assume that the complete structure of the graph is available during training. In practice, however, graph-structured data is usually formed in a streaming fashion so that learning a graph continuously is often necessary. In this paper, we bridge GNN and lifelong learning by converting a continual graph learning problem to a regular graph learning problem so GNN can inherit the lifelong learning techniques developed for convolutional neural networks (CNN). We propose a new topology, the feature graph, which takes features as new nodes and turns nodes into independent graphs. This successfully converts the original problem of node classification to graph classification. In the experiments, we demonstrate the efficiency and effectiveness of feature graph networks (FGN) by continuously learning a sequence of classical graph datasets. We also show that FGN achieves superior performance in two applications, i.e., lifelong human action recognition with wearable devices and feature matching. To the best of our knowledge, FGN is the first method to bridge graph learning and lifelong learning via a novel graph topology. Source code is available at https://github.com/wang-chen/LGL △ Less

Submitted 26 March, 2022; v1 submitted 1 September, 2020; originally announced September 2020.

Comments: Accepted to IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022

arXiv:2007.01587 [pdf, other]

Privacy Threats Against Federated Matrix Factorization

Authors: Dashan Gao, Ben Tan, Ce Ju, Vincent W. Zheng, Qiang Yang

Abstract: Matrix Factorization has been very successful in practical recommendation applications and e-commerce. Due to data shortage and stringent regulations, it can be hard to collect sufficient data to build performant recommender systems for a single company. Federated learning provides the possibility to bridge the data silos and build machine learning models without compromising privacy and security.… ▽ More Matrix Factorization has been very successful in practical recommendation applications and e-commerce. Due to data shortage and stringent regulations, it can be hard to collect sufficient data to build performant recommender systems for a single company. Federated learning provides the possibility to bridge the data silos and build machine learning models without compromising privacy and security. Participants sharing common users or items collaboratively build a model over data from all the participants. There have been some works exploring the application of federated learning to recommender systems and the privacy issues in collaborative filtering systems. However, the privacy threats in federated matrix factorization are not studied. In this paper, we categorize federated matrix factorization into three types based on the partition of feature space and analyze privacy threats against each type of federated matrix factorization model. We also discuss privacy-preserving approaches. As far as we are aware, this is the first study of privacy threats of the matrix factorization method in the federated learning framework. △ Less

Submitted 3 July, 2020; originally announced July 2020.

Comments: 6 pages, 2 figures, 1 table, Accepted for Workshop on Federated Learning for Data Privacy and Confidentiality in Conjunction with IJCAI 2020 (FL-IJCAI'20)

arXiv:2004.06846 [pdf, other]

MxPool: Multiplex Pooling for Hierarchical Graph Representation Learning

Authors: Yanyan Liang, Yanfeng Zhang, Dechao Gao, Qian Xu

Abstract: How to utilize deep learning methods for graph classification tasks has attracted considerable research attention in the past few years. Regarding graph classification tasks, the graphs to be classified may have various graph sizes (i.e., different number of nodes and edges) and have various graph properties (e.g., average node degree, diameter, and clustering coefficient). The diverse property of… ▽ More How to utilize deep learning methods for graph classification tasks has attracted considerable research attention in the past few years. Regarding graph classification tasks, the graphs to be classified may have various graph sizes (i.e., different number of nodes and edges) and have various graph properties (e.g., average node degree, diameter, and clustering coefficient). The diverse property of graphs has imposed significant challenges on existing graph learning techniques since diverse graphs have different best-fit hyperparameters. It is difficult to learn graph features from a set of diverse graphs by a unified graph neural network. This motivates us to use a multiplex structure in a diverse way and utilize a priori properties of graphs to guide the learning. In this paper, we propose MxPool, which concurrently uses multiple graph convolution/pooling networks to build a hierarchical learning structure for graph representation learning tasks. Our experiments on numerous graph classification benchmarks show that our MxPool has superiority over other state-of-the-art graph representation learning methods. △ Less

Submitted 14 April, 2020; originally announced April 2020.

arXiv:2004.04631 [pdf, other]

Private Knowledge Transfer via Model Distillation with Generative Adversarial Networks

Authors: Di Gao, Cheng Zhuo

Abstract: The deployment of deep learning applications has to address the growing privacy concerns when using private and sensitive data for training. A conventional deep learning model is prone to privacy attacks that can recover the sensitive information of individuals from either model parameters or accesses to the target model. Recently, differential privacy that offers provable privacy guarantees has b… ▽ More The deployment of deep learning applications has to address the growing privacy concerns when using private and sensitive data for training. A conventional deep learning model is prone to privacy attacks that can recover the sensitive information of individuals from either model parameters or accesses to the target model. Recently, differential privacy that offers provable privacy guarantees has been proposed to train neural networks in a privacy-preserving manner to protect training data. However, many approaches tend to provide the worst case privacy guarantees for model publishing, inevitably impairing the accuracy of the trained models. In this paper, we present a novel private knowledge transfer strategy, where the private teacher trained on sensitive data is not publicly accessible but teaches a student to be publicly released. In particular, a three-player (teacher-student-discriminator) learning framework is proposed to achieve trade-off between utility and privacy, where the student acquires the distilled knowledge from the teacher and is trained with the discriminator to generate similar outputs as the teacher. We then integrate a differential privacy protection mechanism into the learning procedure, which enables a rigorous privacy budget for the training. The framework eventually allows student to be trained with only unlabelled public data and very few epochs, and hence prevents the exposure of sensitive training data, while ensuring model utility with a modest privacy budget. The experiments on MNIST, SVHN and CIFAR-10 datasets show that our students obtain the accuracy losses w.r.t teachers of 0.89%, 2.29%, 5.16%, respectively with the privacy bounds of (1.93, 10^-5), (5.02, 10^-6), (8.81, 10^-6). When compared with the existing works \cite{papernot2016semi,wang2019private}, the proposed work can achieve 5-82% accuracy loss improvement. △ Less

Submitted 5 April, 2020; originally announced April 2020.

Comments: 9 pages, 4 figures, ECAI 2020, the 24th European Conference on Artificial Intelligence

ACM Class: I.2.6

arXiv:2003.10661 [pdf]

doi 10.1121/10.0001125

Training a U-Net based on a random mode-coupling matrix model to recover acoustic interference striations

Authors: Xiaolei Li, Wenhua Song, Dazhi Gao, Wei Gao, Haozhong Wan

Abstract: A U-Net is trained to recover acoustic interference striations (AISs) from distorted ones. A random mode-coupling matrix model is introduced to generate a large number of training data quickly, which are used to train the U-Net. The performance of AIS recovery of the U-Net is tested in range-dependent waveguides with nonlinear internal waves (NLIWs). Although the random mode-coupling matrix model… ▽ More A U-Net is trained to recover acoustic interference striations (AISs) from distorted ones. A random mode-coupling matrix model is introduced to generate a large number of training data quickly, which are used to train the U-Net. The performance of AIS recovery of the U-Net is tested in range-dependent waveguides with nonlinear internal waves (NLIWs). Although the random mode-coupling matrix model is not an accurate physical model, the test results show that the U-Net successfully recovers AISs under different signal-to-noise ratios (SNRs) and different amplitudes and widths of NLIWs for different shapes. △ Less

Submitted 24 March, 2020; originally announced March 2020.

arXiv:1904.00583 [pdf, other]

doi 10.1121/1.5126115

Sound source ranging using a feed-forward neural network with fitting-based early stop**

Authors: **g Chi, Xiaolei Li, Haozhong Wang, Dazhi Gao, Peter Gerstoft

Abstract: When a feed-forward neural network (FNN) is trained for source ranging in an ocean waveguide, it is difficult evaluating the range accuracy of the FNN on unlabeled test data. A fitting-based early stop** (FEAST) method is introduced to evaluate the range error of the FNN on test data where the distance of source is unknown. Based on FEAST, when the evaluated range error of the FNN reaches the mi… ▽ More When a feed-forward neural network (FNN) is trained for source ranging in an ocean waveguide, it is difficult evaluating the range accuracy of the FNN on unlabeled test data. A fitting-based early stop** (FEAST) method is introduced to evaluate the range error of the FNN on test data where the distance of source is unknown. Based on FEAST, when the evaluated range error of the FNN reaches the minimum on test data, stop** training, which will help to improve the ranging accuracy of the FNN on the test data. The FEAST is demonstrated on simulated and experimental data. △ Less

Submitted 1 April, 2019; originally announced April 2019.

arXiv:1810.00139 [pdf, other]

Knowledge-guided Semantic Computing Network

Authors: Guangming Shi, Zhongqiang Zhang, Dahua Gao, Xuemei Xie, Yihao Feng, Xinrui Ma, Danhua Liu

Abstract: It is very useful to integrate human knowledge and experience into traditional neural networks for faster learning speed, fewer training samples and better interpretability. However, due to the obscured and indescribable black box model of neural networks, it is very difficult to design its architecture, interpret its features and predict its performance. Inspired by human visual cognition process… ▽ More It is very useful to integrate human knowledge and experience into traditional neural networks for faster learning speed, fewer training samples and better interpretability. However, due to the obscured and indescribable black box model of neural networks, it is very difficult to design its architecture, interpret its features and predict its performance. Inspired by human visual cognition process, we propose a knowledge-guided semantic computing network which includes two modules: a knowledge-guided semantic tree and a data-driven neural network. The semantic tree is pre-defined to describe the spatial structural relations of different semantics, which just corresponds to the tree-like description of objects based on human knowledge. The object recognition process through the semantic tree only needs simple forward computing without training. Besides, to enhance the recognition ability of the semantic tree in aspects of the diversity, randomicity and variability, we use the traditional neural network to aid the semantic tree to learn some indescribable features. Only in this case, the training process is needed. The experimental results on MNIST and GTSRB datasets show that compared with the traditional data-driven network, our proposed semantic computing network can achieve better performance with fewer training samples and lower computational complexity. Especially, Our model also has better adversarial robustness than traditional neural network with the help of human knowledge. △ Less

Submitted 28 September, 2018; originally announced October 2018.

Comments: 13 pages, 13 figures

arXiv:1302.4141 [pdf, ps, other]

Canonical dual solutions to nonconvex radial basis neural network optimization problem

Authors: Vittorio Latorre, David Yang Gao

Abstract: Radial Basis Functions Neural Networks (RBFNNs) are tools widely used in regression problems. One of their principal drawbacks is that the formulation corresponding to the training with the supervision of both the centers and the weights is a highly non-convex optimization problem, which leads to some fundamentally difficulties for traditional optimization theory and methods. This paper presents… ▽ More Radial Basis Functions Neural Networks (RBFNNs) are tools widely used in regression problems. One of their principal drawbacks is that the formulation corresponding to the training with the supervision of both the centers and the weights is a highly non-convex optimization problem, which leads to some fundamentally difficulties for traditional optimization theory and methods. This paper presents a generalized canonical duality theory for solving this challenging problem. We demonstrate that by sequential canonical dual transformations, the nonconvex optimization problem of the RBFNN can be reformulated as a canonical dual problem (without duality gap). Both global optimal solution and local extrema can be classified. Several applications to one of the most used Radial Basis Functions, the Gaussian function, are illustrated. Our results show that even for one-dimensional case, the global minimizer of the nonconvex problem may not be the best solution to the RBFNNs, and the canonical dual theory is a promising tool for solving general neural networks training problems. △ Less

Submitted 17 February, 2013; originally announced February 2013.

Comments: 10 pages, 7 figures, one table

Showing 1–10 of 10 results for author: Gao, D