Search | arXiv e-print repository

A Generalization Bound of Deep Neural Networks for Dependent Data

Authors: Quan Huu Do, Binh T. Nguyen, Lam Si Tung Ho

Abstract: Existing generalization bounds for deep neural networks require data to be independent and identically distributed (iid). This assumption may not hold in real-life applications such as evolutionary biology, infectious disease epidemiology, and stock price prediction. This work establishes a generalization bound of feed-forward neural networks for non-stationary $φ$-mixing data. Existing generalization bounds for deep neural networks require data to be independent and identically distributed (iid). This assumption may not hold in real-life applications such as evolutionary biology, infectious disease epidemiology, and stock price prediction. This work establishes a generalization bound of feed-forward neural networks for non-stationary $φ$-mixing data. △ Less

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2206.09076 [pdf, other]

Fair Generalized Linear Models with a Convex Penalty

Authors: Hyungrok Do, Preston Putzel, Axel Martin, Padhraic Smyth, Judy Zhong

Abstract: Despite recent advances in algorithmic fairness, methodologies for achieving fairness with generalized linear models (GLMs) have yet to be explored in general, despite GLMs being widely used in practice. In this paper we introduce two fairness criteria for GLMs based on equalizing expected outcomes or log-likelihoods. We prove that for GLMs both criteria can be achieved via a convex penalty term b… ▽ More Despite recent advances in algorithmic fairness, methodologies for achieving fairness with generalized linear models (GLMs) have yet to be explored in general, despite GLMs being widely used in practice. In this paper we introduce two fairness criteria for GLMs based on equalizing expected outcomes or log-likelihoods. We prove that for GLMs both criteria can be achieved via a convex penalty term based solely on the linear components of the GLM, thus permitting efficient optimization. We also derive theoretical properties for the resulting fair GLM estimator. To empirically demonstrate the efficacy of the proposed fair GLM, we compare it with other well-known fair prediction methods on an extensive set of benchmark datasets for binary classification and regression. In addition, we demonstrate that the fair GLM can generate fair predictions for a range of response variables, other than binary and continuous outcomes. △ Less

Submitted 17 June, 2022; originally announced June 2022.

Comments: Accepted for publication in ICML 2022

arXiv:2105.04648 [pdf, other]

doi 10.1111/biom.13632

Joint Fairness Model with Applications to Risk Predictions for Under-represented Populations

Authors: Hyungrok Do, Shin**i Nandi, Preston Putzel, Padhraic Smyth, Judy Zhong

Abstract: In data collection for predictive modeling, under-representation of certain groups, based on gender, race/ethnicity, or age, may yield less-accurate predictions for these groups. Recently, this issue of fairness in predictions has attracted significant attention, as data-driven models are increasingly utilized to perform crucial decision-making tasks. Existing methods to achieve fairness in the ma… ▽ More In data collection for predictive modeling, under-representation of certain groups, based on gender, race/ethnicity, or age, may yield less-accurate predictions for these groups. Recently, this issue of fairness in predictions has attracted significant attention, as data-driven models are increasingly utilized to perform crucial decision-making tasks. Existing methods to achieve fairness in the machine learning literature typically build a single prediction model in a manner that encourages fair prediction performance for all groups. These approaches have two major limitations: i) fairness is often achieved by compromising accuracy for some groups; ii) the underlying relationship between dependent and independent variables may not be the same across groups. We propose a Joint Fairness Model (JFM) approach for logistic regression models for binary outcomes that estimates group-specific classifiers using a joint modeling objective function that incorporates fairness criteria for prediction. We introduce an Accelerated Smoothing Proximal Gradient Algorithm to solve the convex objective function, and present the key asymptotic properties of the JFM estimates. Through simulations, we demonstrate the efficacy of the JFM in achieving good prediction performance and across-group parity, in comparison with the single fairness model, group-separate model, and group-ignorant model, especially when the minority group's sample size is small. Finally, we demonstrate the utility of the JFM method in a real-world example to obtain fair risk predictions for under-represented older patients diagnosed with coronavirus disease 2019 (COVID-19). △ Less

Submitted 23 February, 2022; v1 submitted 10 May, 2021; originally announced May 2021.

Comments: 34 pages, 4 figures, 1 table

arXiv:2008.12578 [pdf, other]

doi 10.1016/j.eswa.2021.114711

Graph Convolutional Neural Networks with Node Transition Probability-based Message Passing and DropNode Regularization

Authors: Tien Huu Do, Duc Minh Nguyen, Giannis Bekoulis, Adrian Munteanu, Nikos Deligiannis

Abstract: Graph convolutional neural networks (GCNNs) have received much attention recently, owing to their capability in handling graph-structured data. Among the existing GCNNs, many methods can be viewed as instances of a neural message passing motif; features of nodes are passed around their neighbors, aggregated and transformed to produce better nodes' representations. Nevertheless, these methods seldo… ▽ More Graph convolutional neural networks (GCNNs) have received much attention recently, owing to their capability in handling graph-structured data. Among the existing GCNNs, many methods can be viewed as instances of a neural message passing motif; features of nodes are passed around their neighbors, aggregated and transformed to produce better nodes' representations. Nevertheless, these methods seldom use node transition probabilities, a measure that has been found useful in exploring graphs. Furthermore, when the transition probabilities are used, their transition direction is often improperly considered in the feature aggregation step, resulting in an inefficient weighting scheme. In addition, although a great number of GCNN models with increasing level of complexity have been introduced, the GCNNs often suffer from over-fitting when being trained on small graphs. Another issue of the GCNNs is over-smoothing, which tends to make nodes' representations indistinguishable. This work presents a new method to improve the message passing process based on node transition probabilities by properly considering the transition direction, leading to a better weighting scheme in nodes' features aggregation compared to the existing counterpart. Moreover, we propose a novel regularization method termed DropNode to address the over-fitting and over-smoothing issues simultaneously. DropNode randomly discards part of a graph, thus it creates multiple deformed versions of the graph, leading to data augmentation regularization effect. Additionally, DropNode lessens the connectivity of the graph, mitigating the effect of over-smoothing in deep GCNNs. Extensive experiments on eight benchmark datasets for node and graph classification tasks demonstrate the effectiveness of the proposed methods in comparison with the state of the art. △ Less

Submitted 18 March, 2021; v1 submitted 28 August, 2020; originally announced August 2020.

Comments: Expert Systems with Applications, graph-based deep learning, graph neural networks, document classification

Journal ref: Expert Systems with Applications, 174 (2021), Elsevier

arXiv:1905.03042 [pdf, other]

Rumour Detection via News Propagation Dynamics and User Representation Learning

Authors: Tien Huu Do, Xiao Luo, Duc Minh Nguyen, Nikos Deligiannis

Abstract: Rumours have existed for a long time and have been known for serious consequences. The rapid growth of social media platforms has multiplied the negative impact of rumours; it thus becomes important to early detect them. Many methods have been introduced to detect rumours using the content or the social context of news. However, most existing methods ignore or do not explore effectively the propag… ▽ More Rumours have existed for a long time and have been known for serious consequences. The rapid growth of social media platforms has multiplied the negative impact of rumours; it thus becomes important to early detect them. Many methods have been introduced to detect rumours using the content or the social context of news. However, most existing methods ignore or do not explore effectively the propagation pattern of news in social media, including the sequence of interactions of social media users with news across time. In this work, we propose a novel method for rumour detection based on deep learning. Our method leverages the propagation process of the news by learning the users' representation and the temporal interrelation of users' responses. Experiments conducted on Twitter and Weibo datasets demonstrate the state-of-the-art performance of the proposed method. △ Less

Submitted 18 April, 2019; originally announced May 2019.

arXiv:1811.01662 [pdf, other]

Matrix Completion With Variational Graph Autoencoders: Application in Hyperlocal Air Quality Inference

Authors: Tien Huu Do, Duc Minh Nguyen, Evaggelia Tsiligianni, Angel Lopez Aguirre, Valerio Panzica La Manna, Frank Pasveer, Wilfried Philips, Nikos Deligiannis

Abstract: Inferring air quality from a limited number of observations is an essential task for monitoring and controlling air pollution. Existing inference methods typically use low spatial resolution data collected by fixed monitoring stations and infer the concentration of air pollutants using additional types of data, e.g., meteorological and traffic information. In this work, we focus on street-level ai… ▽ More Inferring air quality from a limited number of observations is an essential task for monitoring and controlling air pollution. Existing inference methods typically use low spatial resolution data collected by fixed monitoring stations and infer the concentration of air pollutants using additional types of data, e.g., meteorological and traffic information. In this work, we focus on street-level air quality inference by utilizing data collected by mobile stations. We formulate air quality inference in this setting as a graph-based matrix completion problem and propose a novel variational model based on graph convolutional autoencoders. Our model captures effectively the spatio-temporal correlation of the measurements and does not depend on the availability of additional information apart from the street-network topology. Experiments on a real air quality dataset, collected with mobile stations, shows that the proposed model outperforms state-of-the-art approaches. △ Less

Submitted 5 November, 2018; originally announced November 2018.

arXiv:1712.08091 [pdf, other]

Multiview Deep Learning for Predicting Twitter Users' Location

Authors: Tien Huu Do, Duc Minh Nguyen, Evaggelia Tsiligianni, Bruno Cornelis, Nikos Deligiannis

Abstract: The problem of predicting the location of users on large social networks like Twitter has emerged from real-life applications such as social unrest detection and online marketing. Twitter user geolocation is a difficult and active research topic with a vast literature. Most of the proposed methods follow either a content-based or a network-based approach. The former exploits user-generated content… ▽ More The problem of predicting the location of users on large social networks like Twitter has emerged from real-life applications such as social unrest detection and online marketing. Twitter user geolocation is a difficult and active research topic with a vast literature. Most of the proposed methods follow either a content-based or a network-based approach. The former exploits user-generated content while the latter utilizes the connection or interaction between Twitter users. In this paper, we introduce a novel method combining the strength of both approaches. Concretely, we propose a multi-entry neural network architecture named MENET leveraging the advances in deep learning and multiview learning. The generalizability of MENET enables the integration of multiple data representations. In the context of Twitter user geolocation, we realize MENET with textual, network, and metadata features. Considering the natural distribution of Twitter users across the concerned geographical area, we subdivide the surface of the earth into multi-scale cells and train MENET with the labels of the cells. We show that our method outperforms the state of the art by a large margin on three benchmark datasets. △ Less

Submitted 21 December, 2017; originally announced December 2017.

Comments: Submitted to the IEEE Transactions on Big Data

arXiv:1501.07506 [pdf, other]

Accuracy of areal interpolation methods for count data

Authors: Van Huyen Do, Christine Thomas-Agnan, Anne Vanhems

Abstract: The combination of several socio-economic data bases originating from different administrative sources collected on several different partitions of a geographic zone of interest into administrative units induces the so called areal interpolation problem. This problem is that of allocating the data from a set of source spatial units to a set of target spatial units. A particular case of that proble… ▽ More The combination of several socio-economic data bases originating from different administrative sources collected on several different partitions of a geographic zone of interest into administrative units induces the so called areal interpolation problem. This problem is that of allocating the data from a set of source spatial units to a set of target spatial units. A particular case of that problem is the re-allocation to a single target partition which is a regular grid. At the European level for example, the EU directive 'INSPIRE', or INfrastructure for SPatial InfoRmation, encourages the states to provide socio-economic data on a common grid to facilitate economic studies across states. In the literature, there are three main types of such techniques: proportional weighting schemes, smoothing techniques and regression based interpolation. We propose a stochastic model based on Poisson point patterns to study the statistical accuracy of these techniques for regular grid targets in the case of count data. The error depends on the nature of the target variable and its correlation with the auxiliary variable. For simplicity, we restrict attention to proportional weighting schemes and Poisson regression based methods. Our conclusion is that there is no technique which always dominates. △ Less

Submitted 29 January, 2015; originally announced January 2015.

arXiv:1201.4714 [pdf, other]

A metric learning perspective of SVM: on the relation of SVM and LMNN

Authors: Huyen Do, Alexandros Kalousis, Jun Wang, Adam Woznica

Abstract: Support Vector Machines, SVMs, and the Large Margin Nearest Neighbor algorithm, LMNN, are two very popular learning algorithms with quite different learning biases. In this paper we bring them into a unified view and show that they have a much stronger relation than what is commonly thought. We analyze SVMs from a metric learning perspective and cast them as a metric learning problem, a view which… ▽ More Support Vector Machines, SVMs, and the Large Margin Nearest Neighbor algorithm, LMNN, are two very popular learning algorithms with quite different learning biases. In this paper we bring them into a unified view and show that they have a much stronger relation than what is commonly thought. We analyze SVMs from a metric learning perspective and cast them as a metric learning problem, a view which helps us uncover the relations of the two algorithms. We show that LMNN can be seen as learning a set of local SVM-like models in a quadratic space. Along the way and inspired by the metric-based interpretation of SVM s we derive a novel variant of SVMs, epsilon-SVM, to which LMNN is even more similar. We give a unified view of LMNN and the different SVM variants. Finally we provide some preliminary experiments on a number of benchmark datasets in which show that epsilon-SVM compares favorably both with respect to LMNN and SVM. △ Less

Submitted 23 January, 2012; originally announced January 2012.

Comments: To appear in AISTATS 2012

Showing 1–9 of 9 results for author: Do, H